Title:
PROCESSING CANDIDATE ABNORMALITIES IN MEDICAL IMAGERY BASED ON A HIERARCHICAL CLASSIFICATION
Document Type and Number:
WIPO Patent Application WO/2017/011532
Kind Code:
A1
Abstract:
Techniques are provided for automated detection and processing of abnormalities in medical imagery. The techniques include determining voxels involved with each of a plurality of candidate abnormalities in a medical image. The techniques also include determining whether a candidate abnormality belongs in one of multiple abnormality classes based at least on anatomical location context, size, and shape by successively subjecting the candidate abnormality to hierarchical tests for anatomical context, size range based on anatomical context, and shape parameter value range based on anatomical context and size range. The techniques include rejecting a candidate abnormality that does not fall into any one of the abnormality classes. The techniques also include presenting on a display device a property of the candidate abnormality based on the abnormality class to which the candidate abnormality belongs.

Inventors:
LU LIN (US)
ZHAO BINSHENG (US)
SCHWARTZ LAWRENCE H (US)
Application Number:
PCT/US2016/042055
Publication Date:
January 19, 2017
Filing Date:
July 13, 2016
Assignee:
UNIV COLUMBIA (US)
LU LIN (US)
ZHAO BINSHENG (US)
SCHWARTZ LAWRENCE H (US)
International Classes:
A61B6/03; G01N33/574; G06T7/60; G06V10/42; G16B40/20; G16H10/40; G16H30/20; G16H50/20; G16H70/60
Foreign References:
US20090252395A1 2009-10-08
US6760468B1 2004-07-06
US6728334B1 2004-04-27
Attorney, Agent or Firm:
MOLINELLI, Eugene J. et al. (US)
Claims:
CLAIMS

What is claimed is:

1. A non-transitory computer-readable medium carrying one or more sequences of instructions, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:

determining voxels involved with each of a plurality of candidate tissue abnormalities in a medical image;

determining whether a candidate abnormality belongs in one of a plurality of abnormality classes based at least on anatomical location context, size, and shape by successively subjecting the candidate abnormality to hierarchical tests for anatomical context, size range based on anatomical context, and shape parameter value range based on anatomical context;

rejecting a candidate abnormality that does not fall into any one of the plurality of nodule classes; and

presenting on a display device a property of the candidate abnormality based on the abnormality class to which the candidate abnormality belongs.

2. A non-transitory computer-readable medium as recited in claim 1, wherein determining whether the candidate abnormality belongs in one of the plurality of abnormality classes further comprises determining whether the candidate abnormality belongs in one of the plurality of abnormality classes based also on a type of connection with another feature in the medical image.

3. A non-transitory computer-readable medium as recited in claim 1, wherein the candidate abnormalities are candidate nodules in a lung and the anatomical contexts include: mediastinum for candidate nodules attached to pulmonary vasculature; chest wall for candidate nodules attached to a chest wall; and, peripheral for remaining candidate nodules.

4. A non-transitory computer-readable medium as recited in claim 1, wherein the property presented includes a number of voxels of the candidate abnormality that is not rejected or locations of the voxels of the candidate abnormality that is not rejected or an identifier for an abnormality class to which the candidate abnormality that is not rejected belongs or some combination.

A system comprising:

at least one processor; and

at least one memory including one or more sequences of instructions,

the at least one memory and the one or more sequences of instructions configured to, with the at least one processor, cause the apparatus to perform at least the following,

determining voxels involved with each of a plurality of candidate abnormalities in a medical image;

determining whether a candidate abnormality belongs in one of a plurality of abnormality classes based at least on anatomical location context, size, and shape by successively subjecting the candidate abnormality to hierarchical tests for anatomical context, size range based on anatomical context, and shape parameter value range based on anatomical context and size range;

rejecting a candidate abnormality that does not fall into any one of the plurality of abnormality classes; and

presenting on a display a property of the candidate abnormality based on the abnormality class to which the candidate abnormality belongs.

A method comprising:

determining on a processor voxels involved with each of a plurality of candidate abnormalities in a medical image;

determining on a processor whether a candidate abnormality belongs in one of a plurality of abnormality classes based at least on anatomical location context, size, and shape by successively subjecting the candidate abnormality to hierarchical tests for anatomical context, size range based on anatomical context, and shape parameter value range based on anatomical context and size range;

rejecting on a processor a candidate abnormality that does not fall into any one of the plurality of abnormality classes; and

presenting on a display device a property of the candidate abnormality based on the abnormality class to which the candidate abnormality belongs.

Description:
PROCESSING CANDIDATE ABNORMALITIES IN MEDICAL IMAGERY BASED ON A HIERARCHICAL CLASSIFICATION USING ANATOMICAL LOCATION AND CHARACTERISTICS OF CANDIDATES

CROSS-REFERENCE TO RELATED APPLICATION

[0001] This application claims benefit of Provisional Appln. 62/191,834, filed July 13, 2015, the entire contents of which are hereby incorporated by reference as if fully set forth herein, under 35 U.S.C. § 119(e).

BACKGROUND

[0002] Automated detection of nodules in medical images has become critical for cancer screening, early identification of cancer metastasis, and determining the efficacy of cancer treatment. A visible nodule is a structure in an organ that has intensity values different from those of the organ in medical imagery such as magnetic resonance imagery (MRI) and computed tomography (CT) x-ray imagery. Nodule detection by human analysts suffers because a human can become tired and lose focus when screening a large number of images. Additionally, the accuracy of manual nodule detection depends on radiologists' reading skills, which vary among individual radiologists. Automated computer-aided detection methods, once trained by expert radiologists, can tirelessly provide fast and objective nodule detection results.

SUMMARY

[0003] Techniques, including a method, a system and a computer-readable medium, are provided for automated detection and processing of tissue abnormalities, such as nodules, in medical imagery. The techniques include determining voxels involved with each of a plurality of candidate nodules in a medical image. The techniques also include determining whether a candidate nodule belongs in one of multiple nodule classes based at least on anatomical location context, size, and shape by successively subjecting the candidate nodule to hierarchical tests for anatomical context, size range based on anatomical context, and shape parameter value range based on anatomical context and size range. The techniques include rejecting a candidate nodule that does not fall into any one of the nodule classes. The techniques also include presenting a property of the candidate nodule on a display device based on the nodule class to which the candidate nodule belongs.

[0004] Still other aspects, features, and advantages are readily apparent from the following detailed description, simply by illustrating a number of particular embodiments and implementations, including the best mode contemplated for carrying out the invention. Other embodiments are also capable of other and different features and advantages, and their several details can be modified in various obvious respects, all without departing from the spirit and scope of the invention. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.

BRIEF DESCRIPTION OF THE DRAWINGS

[0005] Embodiments are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements and in which:

[0006] FIG. 1A is a block diagram that illustrates an imaging system for tissue detection, according to an embodiment;

[0007] FIG. 1B is a block diagram that illustrates scan elements in a 2D scan, such as one scanned image from a CT scanner;

[0008] FIG. 1C is a block diagram that illustrates scan elements in a 3D scan, such as stacked multiple scanned images from a CT imager or true 3D scan elements from volumetric CT imagers or ultrasound;

[0009] FIG. 2 is a flow chart that illustrates an example method to classify candidate nodules based on a hierarchical series of tests for anatomical location context, size and shape, according to an embodiment;

[0010] FIG. 3 is a block diagram that illustrates a computer system upon which an embodiment of the invention may be implemented;

[0011] FIG. 4 illustrates a chip set upon which an embodiment of the invention may be implemented;

[0012] FIG. 5 is a plot that illustrates an example distribution of lung nodule sizes in a data set used to demonstrate the method of FIG. 2, according to an embodiment;

[0013] FIG. 6A through FIG. 6C are flow charts that illustrate an example of the method of FIG. 2 implemented for lung nodules, according to an embodiment;

[0014] FIG. 7A through FIG. 7F are images that illustrate six example types of nodules processed by a peripheral-nodule detector, according to an embodiment;

[0015] FIG. 8A through FIG. 8C are images that illustrate three example types of nodules processed by a chestwall-nodule detector, according to an embodiment;

[0016] FIG. 9A through FIG. 9D are images that illustrate four example types of nodules processed by a mediastinum-nodule detector, according to an embodiment;

[0017] FIG. 10A through FIG. 10E are images that illustrate example results from various processing steps, according to an embodiment;

[0018] FIG. 11 is an image that illustrates an example scaled geodesic distance map resulting from one step of a classification process, according to an embodiment;

[0019] FIG. 12 is a table that illustrates example node features, according to an embodiment;

[0020] FIG. 13A and FIG. 13B are free-response receiver operating characteristic (FROC) curves that illustrate example performance of the method on a training set and testing set, respectively, according to an embodiment;

[0021] FIG. 14A and FIG. 14B are FROC curves that illustrate example performance of the method on a training set and testing set, respectively, using different diameter thresholds, according to various embodiments; and

[0022] FIG. 15A and FIG. 15B are FROC curves that illustrate example performance of the method on a training set and testing set, respectively, using different sphericity thresholds, according to various embodiments.

DETAILED DESCRIPTION

[0023] A method and apparatus are described for automatic detection and processing of tissue abnormalities, such as nodules, in medical imagery. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.

[0024] Notwithstanding that the numerical ranges and parameters setting forth the broad scope are approximations, the numerical values set forth in specific non-limiting examples are reported as precisely as possible. Any numerical value, however, inherently contains certain errors necessarily resulting from the standard deviation found in their respective testing measurements. Unless otherwise clear from the context, a numerical value presented herein has an implied precision given by the least significant digit. Thus a value 1.1 implies a value from 1.05 to 1.15. The term "about" is used to indicate a broader range centered on the given value, and unless otherwise clear from the context implies a broader range around the least significant digit, such as "about 1.1" implies a range from 1.0 to 1.2. If the least significant digit is unclear, then the term "about" implies a factor of two, e.g., "about X" implies a value in the range from 0.5X to 2X, for example, about 100 implies a value in a range from 50 to 200. Moreover, all ranges disclosed herein are to be understood to encompass any and all sub-ranges subsumed therein. For example, a range of "less than 10" can include any and all sub-ranges between (and including) the minimum value of zero and the maximum value of 10, that is, any and all sub-ranges having a minimum value of equal to or greater than zero and a maximum value of equal to or less than 10, e.g., 1 to 4.
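The precision conventions above can be made concrete with a short sketch. The Python below is illustrative only: the function names implied_range and about_range and the digit_is_clear flag are hypothetical, and values are passed as strings so that the written least significant digit is preserved.

```python
from decimal import Decimal


def _last_digit_unit(value):
    """One unit in the last written decimal place, e.g. 0.1 for Decimal("1.1")."""
    return Decimal(1).scaleb(value.as_tuple().exponent)


def implied_range(text):
    """Range implied by the least significant digit: half a unit either side."""
    value = Decimal(text)
    half = _last_digit_unit(value) / 2
    return value - half, value + half


def about_range(text, digit_is_clear=True):
    """Range implied by "about".

    If the least significant digit is clear, widen by one full unit of that
    digit; otherwise fall back to the factor-of-two rule (0.5X to 2X).
    """
    value = Decimal(text)
    if not digit_is_clear:
        return value / 2, value * 2
    unit = _last_digit_unit(value)
    return value - unit, value + unit
```

With these definitions, implied_range("1.1") spans 1.05 to 1.15, about_range("1.1") spans 1.0 to 1.2, and about_range("100", digit_is_clear=False) spans 50 to 200, matching the examples given in the text.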

[0025] Some embodiments of the invention are described below in the context of lung nodules. However, the invention is not limited to this context. In other embodiments the method is used to detect nodules, or other tissue abnormalities, in other organs and tissues, such as in the brain, liver, intestines, muscles or connective tissue.

1. Structural Overview

[0026] FIG. 1A is a block diagram that illustrates an imaging system 100 for tissue detection, according to an embodiment. The system 100 is designed for determining the spatial arrangement of soft target tissue in a living body. For purposes of illustration, a living body is depicted, but is not part of the system 100. In the illustrated embodiment, a living body is depicted in a first spatial arrangement 132a at one time and includes a target tissue in a corresponding spatial arrangement 134a. At a different time, the same living body is in a second spatial arrangement 132b that includes the same or changed target tissue in a different corresponding spatial arrangement 134b.

[0027] In the illustrated embodiment, system 100 includes a scanning device 140, such as a full dose X-ray computed tomography (CT) scanner, or a magnetic resonance imaging (MRI) scanner, among others, such as a magnetic resonance spectral imaging (MRSI) scanner and an ultrasound scanner. In some embodiments, the scanning device 140 is used at one or more different times. The device 140 is configured to produce scanned images that each represent a cross section of the living body at one of multiple cross sectional (transverse) slices arranged along the axial direction of the body, which is oriented in the long dimension of the body.

[0028] In system 100, data from the imager 140 is received at a computer 160 and stored on storage device 162. Computer systems and storage devices like 160, 162, respectively, are described in more detail below with reference to FIG. 3 and FIG. 4. Scan data 180a, 180b, 190a, 190b based on data measured at imager 140 at one or more different times or axial locations or both are stored on storage device 162. For example, scan data 180a and scan data 180b, which include scanned images at two slices separated in the axial direction, are stored based on measurements from scanning device 140 at one time. Scan data 190a, 190b, which include scanned images at two slices separated in the axial direction, are stored based on measurements from scanning device 140 at a different time.

[0029] In various embodiments, a tissue and abnormality detection process 150 operates on computer 160 to determine a boundary between scan elements of scan data which are inside and outside a particular target tissue or tissue abnormality. The boundary data is stored in boundary data 158 in association with the scan data, e.g., scan data 180a, 180b, 190a, 190b.

[0030] Although processes, equipment, and data structures are depicted in FIG. 1A as integral blocks in a particular arrangement for purposes of illustration, in other embodiments one or more processes or data structures, or portions thereof, are arranged in a different manner, on the same or different hosts, in one or more databases, or are omitted, or one or more different processes or data structures are included on the same or different hosts. For example, although system 100 is depicted with a particular number of scanning devices 140, computers 160, and scan data 180a, 180b, 190a, 190b on storage device 162 for purposes of illustration, in other embodiments more or fewer scanning devices, computers, storage devices and scan data constitute an imaging system for determining spatial arrangement of tissues, including cells.

[0031] FIG. 1B is a block diagram that illustrates scan elements in a 2D scan 110, such as one scanned image from a CT scanner. The two dimensions of the scan 110 are represented by the x direction arrow 102 and the y direction arrow 104. The scan 110 consists of a two dimensional array of 2D scan elements 112 (also called picture elements, pixels, in case the data are displayed as an image on a display device) each with an associated position. Typically, a 2D scan element position is given by a row number in the x direction and a column number in the y direction of a rectangular array of scan elements. A value at each scan element position represents a measured or computed intensity or amplitude that represents a physical property (e.g., X-ray absorption, or resonance frequency of an MRI scanner) at a corresponding position in at least a portion of the spatial arrangement 132a, 132b of the living body. The measured property is called amplitude hereinafter and is treated as a scalar quantity. In some embodiments, two or more properties are measured together at a pixel location and multiple amplitudes are obtained that can be collected into a vector quantity, such as spectral amplitudes in MRSI. Although a particular number and arrangement of equal sized circular scan elements 112 are shown for purposes of illustration, in other embodiments, more elements in the same or different arrangement with the same or different sizes and shapes are included in a 2D scan.

[0032] FIG. 1C is a block diagram that illustrates scan elements in a 3D scan 120, such as stacked multiple scanned images from a CT imager or true 3D scan elements from volumetric CT imagers or MRI or ultrasound (US) imagers. The three dimensions of the scan are represented by the x direction arrow 102, the y direction arrow 104, and the z direction arrow 106. The scan 120 consists of a three dimensional array of 3D scan elements (also called volume elements and abbreviated as voxels) 122 each with an associated position. Typically, a 3D scan element position is given by a row number in the x direction, column number in the y direction and a scanned image number (also called a scan number) in the z (axial) direction of a cubic array of scan elements or a temporal sequence of scanned slices. A value at each scan element position represents a measured or computed intensity that represents a physical property (e.g., X-ray absorption for a CT scanner, or resonance frequency of an MRI scanner) at a corresponding position in at least a portion of the spatial arrangement 132a, 132b of the living body. Although a particular number and arrangement of equal sized spherical scan elements 122 are shown for purposes of illustration, in other embodiments, more elements in the same or different arrangement with the same or different sizes and shapes are included in a 3D scan. A 3D scan repeated in time, as the fourth dimension, provides a 4D scan of a single subject.

[0033] The term voxels is used herein to represent either 2D scan elements (pixels) or 3D scan elements (voxels), or 4D scan elements, or some combination, depending on the context.

[0034] Amplitude is often expressed as one of a series of discrete gray-levels, given in Hounsfield units (HU) for x-ray CT scans.

2. Method Overview

[0035] FIG. 2 is a flow chart that illustrates an example method 200 to classify candidate nodules based on a hierarchical series of tests for anatomical location context, size and shape, according to an embodiment. Although steps are depicted in FIG. 2 as integral steps in a particular order for purposes of illustration, in other embodiments, one or more steps, or portions thereof, are performed in a different order, or overlapping in time, in series or in parallel, or are omitted, or one or more additional steps are added, or the method is changed in some combination of ways. In example embodiments described in a later section, an embodiment of the method 200 is applied to classifying lung nodules in CT scans. A means for performing any or all of these steps includes the scanning device 140 depicted in FIG. 1A or the computer systems or chip set as described below with reference to FIG. 3 and FIG. 4, respectively, alone or in some combination.

[0036] In step 201, data is obtained that indicates a set of images. The set represents a corresponding succession of cross sections of a subject, such as an animal or human patient, e.g., from one or more MRI or CT scans. Any method may be used to obtain the set of images, including collecting the images from an MRI or CT scanner, retrieving the images from data storage either locally or remotely, either in response to a query or unsolicited, from various files or a database. In step 203, each image is tested against one or more criteria for using the images. These criteria ensure that at least some images of the set are suitable for the analyses to follow. If not, control passes back to step 201 to obtain another set, or (not shown) the process ends. For nodule identification and classification it is useful to have three dimensional data with at least two voxels in each dimension for capturing nodules of about 3 millimeter (mm, 1 mm = 10^-3 meters) diameter or larger. In step 203 it is determined if the data set supports this kind of resolution. The criteria for the lung nodules are described in more detail in the later section.
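As a rough illustration of the resolution test in step 203, the sketch below checks whether a nodule of a given diameter would span at least two voxels along every axis. The function name, the spacing_mm argument (voxel spacing per axis in millimeters), and the default values are assumptions made for this example; the criteria actually used for lung nodules are those described in the later section.

```python
def supports_nodule_resolution(spacing_mm, min_diameter_mm=3.0, min_voxels=2):
    """Return True if a nodule of min_diameter_mm spans at least
    min_voxels voxels along every axis of the scan.

    spacing_mm: iterable of voxel sizes in mm, one per axis (hypothetical input).
    """
    return all(min_diameter_mm / spacing >= min_voxels for spacing in spacing_mm)
```

Under these assumptions, a scan with 0.7 mm in-plane spacing and 1.25 mm slice spacing passes, while the same scan with 5 mm slice spacing does not, because a 3 mm nodule would then occupy less than one slice.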

[0037] In some embodiments in which tuning of the parameters that are applied to navigate the hierarchy of tests occurs for a particular organ, a fraction of the data set is separated into a training set and another fraction is separated into a testing set. For the training set and testing set, nodule or other abnormality identification by an expert is used to indicate which abnormalities are to be identified and which are to be rejected.

[0038] In step 205 candidate abnormalities are determined, using any method known in the art to identify voxels in one or more images that are candidates for being classified as an abnormality, such as a nodule. In an illustrated embodiment, during step 205, images of the set for one subject are interpolated to isotropic volumetric data. Then a target organ volume is determined that includes several adjacent interpolated images from the same subject. Within the organ volume, multi-scale enhancement filters of dot and line are applied to the volumetric data. The outputs of the dot and line enhancement filters are normalized and combined into voxel clusters with properties indicated by the parameters Zc_dot and Zc_line. Candidate abnormalities are determined according to half-peak volumes of the voxel clusters. Half-peak volume is defined as the number of voxels that have Zc_dot values larger than half of the local maximal Zc_dot peak value. Half-peak volume is obtained by applying a threshold ThrZc_dot = 0.02 to the output of the dot filter of the target organ volume. Voxels with Zc_dot values smaller than ThrZc_dot are regarded as noise and set as 0; otherwise they are set as 1. The cluster volumes are labeled using a region-growing algorithm. Within each labeled volume, voxels with Zc_dot values smaller than half of the local peak Zc_dot value are removed and the remaining voxels are selected to create the candidate abnormality. In some embodiments, the candidate abnormalities are stored with their characteristics (e.g., voxel coordinates, subject identifier, and cluster parameters) in data storage on one or more computer systems or chip sets, as described in more detail later with reference to FIG. 3 and FIG. 4.
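The thresholding, region-growing and half-peak steps of the illustrated embodiment can be sketched as follows. This is a minimal illustration, not the embodiment itself: it assumes the dot-filter output is already available as a mapping from (x, y, z) coordinates to Zc_dot values, it assumes 6-connectivity for region growing (the text does not state the connectivity), and the function name find_candidates is hypothetical.

```python
from collections import deque

THR_ZC_DOT = 0.02  # threshold value quoted in the text

# Face-adjacent neighbours (6-connectivity is an assumption of this sketch).
NEIGHBOUR_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                     (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def find_candidates(zc_dot, threshold=THR_ZC_DOT):
    """Return a list of candidates, each a set of (x, y, z) voxel coordinates.

    zc_dot maps (x, y, z) -> dot-filter response (hypothetical input layout).
    """
    # Step 1: binarise; voxels below the threshold are treated as noise.
    unvisited = {pos for pos, value in zc_dot.items() if value >= threshold}

    candidates = []
    while unvisited:
        # Step 2: region growing from an arbitrary seed labels one cluster.
        seed = unvisited.pop()
        cluster = {seed}
        queue = deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBOUR_OFFSETS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in unvisited:
                    unvisited.remove(neighbour)
                    cluster.add(neighbour)
                    queue.append(neighbour)

        # Step 3: drop voxels below half of the cluster's peak response.
        peak = max(zc_dot[pos] for pos in cluster)
        candidates.append({pos for pos in cluster if zc_dot[pos] >= peak / 2})
    return candidates
```

A dictionary keyed by coordinates keeps the sketch dependency-free; a production implementation would more likely operate on a dense 3D array with a labeling routine from an image-processing library.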

[0039] The following steps 207 through 293 are repeated for each candidate nodule to test the candidate nodule in a hierarchical set of operations, called nodes of a classification tree. In each node, a node-specific set of conditions is evaluated and tested against node-specific thresholds. A candidate abnormality that satisfies the node-specific thresholds of the node-specific conditions is classified as an abnormality of that node, and node-specific features are extracted from the accepted abnormality. A candidate abnormality that does not satisfy the node-specific threshold is passed on to the next node in the hierarchy. A candidate abnormality that fails all the tests is rejected as not a true abnormality, e.g., not an abnormality of the type being sought, such as a nodule of a particular tissue.

[0040] In step 207, the next candidate abnormality is selected for classification, e.g., retrieved from storage on a computer system or chip set. In step 211, at the first level of the hierarchy of tests, an anatomical location context is determined for the abnormality. For example, it is determined whether a nodule is away from the boundaries of the organ, or, if at a boundary, whether the candidate nodule is associated with or affected by a tissue type or organ that shares the boundary with the candidate nodule. The result of step 211 is an anatomical location context for the current candidate abnormality. For purposes of illustration it is assumed that the anatomical location contexts considered for the current organ are designated context A, context B, among others. For each context, a different set of subsequent steps is taken, all analogous to the steps shown in FIG. 2 for context A. In the example embodiment, three contexts are used for candidate lung nodules: "mediastinum" for candidate nodules attached to pulmonary vasculature; "chest wall" for candidate nodules attached to a chest wall; and, "peripheral" for remaining candidate nodules.
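One way to picture the context assignment of step 211 for the lung example is the sketch below. It assumes that segmentations of the pulmonary vasculature and the chest wall are available as sets of voxel coordinates, and it treats a candidate as attached when any of its voxels overlaps or is face-adjacent to a mask voxel. The mask inputs, the adjacency rule, and the order in which the two attachments are checked are all assumptions of this example and are not specified in the text.

```python
from enum import Enum


class Context(Enum):
    MEDIASTINUM = "mediastinum"
    CHEST_WALL = "chest wall"
    PERIPHERAL = "peripheral"


# Zero offset (overlap) plus the six face-adjacent offsets.
_OFFSETS = [(0, 0, 0), (1, 0, 0), (-1, 0, 0), (0, 1, 0),
            (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def touches(candidate, mask):
    """True if any candidate voxel overlaps or is face-adjacent to the mask."""
    return any(
        (x + dx, y + dy, z + dz) in mask
        for (x, y, z) in candidate
        for (dx, dy, dz) in _OFFSETS
    )


def anatomical_context(candidate, vasculature_mask, chest_wall_mask):
    """Assign one of the three lung contexts to a candidate voxel set."""
    if touches(candidate, vasculature_mask):   # checked first by assumption
        return Context.MEDIASTINUM
    if touches(candidate, chest_wall_mask):
        return Context.CHEST_WALL
    return Context.PERIPHERAL                  # all remaining candidates
```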

[0041] In step 213, it is determined whether the candidate abnormality is considered small in the current context. In different contexts, the methods to determine the size of a candidate abnormality differ and the thresholds to determine the difference between "small" and not small (called "large" here) can also differ. For example, the way to determine the size of a candidate nodule that is separate from a boundary might include the boundary and give a wrong result if the same method is used in a different context at the boundary. In some embodiments, more than two size thresholds are used to distinguish small from mid-sized and mid-sized from large candidate abnormalities. The separation of candidate abnormality processing by size is based on the observation that shape determinations are more difficult with the smaller candidate abnormalities and subtle changes in shape may not be as detectable as with larger candidate abnormalities. Thus the size test performed at step 213 depends on the context, in terms of either the processing of the candidate abnormality to determine a size or in terms of the one or more thresholds that are first order parameters of the method 200, or both. In some embodiments, the values for the first thresholds are based on optimizing the performance of the method 200 using the training set and the testing set.

[0042] If the candidate abnormality is small according to step 213, then control passes to step 215. Otherwise control passes to step 217. Both steps test for the shape of the nodule. In different contexts and size ranges, the methods to determine the shape of a candidate abnormality differ and the thresholds to determine the difference between "ball like" and not ball-like (called "irregular" here) can also differ. A ball-like abnormality is spherical and unattached to other tissues, whereas an irregular abnormality is not spherical or is attached to other tissues, especially vessels, or both. Thus the shape tests performed at steps 215 and 217 depend on the context and size, in terms of either the processing of the candidate abnormality to determine a shape (e.g., sphericity or blobness or both, as described in a later section for candidate lung nodules) or in terms of the one or more shape thresholds that are first order parameters of the method 200, or both. In some embodiments, the values for the thresholds are based on optimizing the performance of the method 200 using the training set and the testing set.

[0043] Although the size test is shown as the second level of the hierarchy, and the shape test as the third level of the hierarchy in FIG. 2, in other embodiments, the order is switched and shape is in the second level and size is the third level, or one or more different characteristics, such as intensity value or intensity gradient, is used as the second or third level or even introduced in another level of the hierarchy above the leaf nodes level. In some embodiments, the order of the properties used in the hierarchy in one location context, e.g., location context A, is different from the order of the properties used in the hierarchy in a different location context, e.g., location context B.

[0044] Based on the results of the size and shape tests, or other tests, particular processes and thresholds are applied in the last level of the hierarchy, called leaf nodes, represented by steps 221, 241, 261, 281. In some embodiments, some leaf nodes are compound nodes made up of several sub-nodes based on other properties than size and shape or other properties used as intermediate levels of the hierarchy, such as type of connection to adjacent tissue, as explained in the example embodiment for periphery node 5 and periphery node 6. Each node in the detection tree, including, in some embodiments, nodes corresponding to steps 213, 215 and 217, corresponds to one type of abnormality.

[0045] For each node, one or two node-specific algorithms were adopted. Each node contains three procedures for testing the current candidate abnormality. The procedures are as follows: 1) compute condition values, 2) compare condition values to thresholds and determine whether the candidate abnormality could be classified by the node, and 3) extract abnormality features (and train classifier). In each node, candidate abnormalities determined as "Yes" (as indicated by the PASS branch steps 223, 243, 263 and 283 in FIG. 2) are regarded as true abnormalities of the class corresponding to the node, as indicated in steps 227, 247, 267 and 277 in FIG. 2. In some embodiments, steps 227, 247, 267 and 273 each includes storing or presenting on a display device the classified abnormality and zero or more of its nodule features.

[0046] Otherwise, candidate abnormalities determined as "No" will continue to the next node until all leaf nodes have been executed. This is indicated by the "ALL?" branch steps 225, 245, 265 and 285. If all leaf nodes have not been executed, the next leaf node is executed, as indicated by the arrows leading from the "ALL?" branch steps 225, 245, 265 and 285 to leaf nodes 241, 261, 281 and 221, respectively. If all leaf nodes for the anatomical location context have been applied and the candidate abnormality is found to belong to none of those classes, then control passes to step 291 where the candidate is rejected as not an abnormality of interest. In some embodiments, step 291 includes storing or presenting on a display device the rejected abnormality and zero or more of its features.

[0047] In some embodiments, both branches of size test node 213 and shape test nodes 215 and 217 are not followed. Instead, one branch of each is followed leading to only one leaf node. If a candidate nodule fails to be classified by the first leaf node, then control passes to the next leaf node, until all leaf nodes have been applied as determined in steps 225, 245, 265 and 285. If all leaf nodes for the anatomical location context have been applied and the candidate abnormality is found to belong to none of those classes, then control passes to step 291 where the candidate is rejected as not an abnormality of interest.

[0048] In various embodiments, control then passes to step 293 to determine whether there is another candidate abnormality to classify. If so, then control passes back to step 207 to repeat the process for the next candidate abnormality. If not, the process ends.
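The leaf-node cascade described in the preceding paragraphs can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the names `LeafNode` and `classify_candidate` and the dictionary representation of a candidate are hypothetical, and the three callables stand in for the three per-node procedures (compute condition values, compare them to thresholds, apply the trained classifier).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class LeafNode:
    """Hypothetical container for one leaf node of the detection tree."""
    name: str
    compute_conditions: Callable[[dict], dict]  # procedure 1: condition values
    accepts: Callable[[dict], bool]             # procedure 2: threshold comparison
    classify: Callable[[dict], bool]            # procedure 3: trained classifier


def classify_candidate(candidate: dict,
                       leaf_nodes: Sequence[LeafNode]) -> Optional[str]:
    """Try each leaf node in order; return the first class that says "Yes".

    Returns None when every leaf node has been applied without success,
    which corresponds to rejecting the candidate (step 291).
    """
    for node in leaf_nodes:
        conditions = node.compute_conditions(candidate)
        if node.accepts(conditions) and node.classify(candidate):
            return node.name
    return None
```

A caller would loop over all candidate abnormalities (step 293) and store or display either the returned class name or the rejection.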

3. Example embodiment for lung nodules as the tissue abnormality

[0049] A more detailed example embodiment for use with candidate nodules in the lung as the tissue abnormality is described here. The diversity of lung nodules poses difficulty for current computer-aided diagnostic (CAD) schemes for lung nodule detection on computed tomography (CT) scan images, especially in large-scale CT screening studies. The hybrid method described here integrates, in a new way, several existing and widely used algorithms in the field of nodule detection, including morphological operation, dot-enhancement based on the Hessian matrix, adaptive thresholding, fuzzy connectedness segmentation, the local density maximum algorithm, the geodesic distance map and regression tree classification, as described in various references listed herein. Therefore, these well-known algorithms are not described in detail herein. In the new combination, all of the adopted algorithms were organized into a novel tree structure with multiple nodes. Each node in the tree structure is tailored to deal with one type of lung nodule.

[0050] Intensity based schemes use the intensity difference of voxels between lung nodules and surrounding background (tissues) for nodule detection. Zhao et al. [15] developed a local density maximum algorithm to detect nodules with local high-density structures. Li et al. [7] incorporated an enhancement procedure into nodule detection by applying a group of selective dot-enhancement filters to the original CT images. The dot-enhancement filters could suppress other normal anatomic structures that may be easily mistaken for nodules, such as small blood vessels and airway walls. Messay et al. [9] combined intensity thresholding with morphological processing to generate candidate nodules for detection.

[0051] Shape based schemes developed statistical models to characterize lung nodules and search for matching objects in the image space. Okada et al. [11] presented a method to fit nodules with an ellipsoid based on anisotropic Gaussian filters. Casio et al. [4] developed a series of stable 3D mass-spring models for nodule matching. Pai et al. [12] proposed to identify initial candidate nodules based on surface normal overlap. Matsumoto et al. [8] developed novel quantized convergence index filters aiming to match round lesions. Ye et al. [14] presented a method that could provide a good description of objects of specific shapes with high spherical elements by employing a volumetric shape index map and the dot-enhancement map. Machine learning schemes characterize candidate nodules as a group of extracted features and use pre-trained classifiers for the discrimination. Golosio et al. [5] proposed a multi-threshold method to construct a classification system for candidate nodules of different intensity levels with the help of artificial neural networks. Murphy et al. [10] employed shape index and curvedness as nodule features and reduced false positives via two successive k-nearest-neighbor classifiers.

[0052] Tan et al. [13] proposed to conduct nodule determination in a newly-defined gauge coordinate system with feature-selective classifiers based on genetic algorithms and artificial neural networks. Li et al. [6] proposed to utilize both the global three-dimensional (3D) information and local two-dimensional (2D) information of CT scan images to train nodule detectors.

[0053] Despite significant progress, there are still no widely accepted CAD schemes, especially for dealing with large datasets. The complications of nodule detection usually stem from five problematic situations recognized herein. First, CT scans in the dataset might be generated by various protocols (e.g., different tube exposure time products, different convolution kernels, and different slice thicknesses) and thus have various noise patterns. Second, nodules might attach to surrounding structures of similar density (e.g., small blood vessels, airway walls and pulmonary scars), leading to the failure of purely intensity-based schemes. Third, high-contrast structures (e.g., pulmonary vasculature and chest wall) might impede the reliability of model-based schemes. Fourth, intensity can vary significantly even in an individual nodule, especially if the nodule contains ground glass opacity (GGO). Fifth, the shape and size of nodules can vary significantly. Machine-learning based schemes have to use many features to characterize heterogeneous candidate nodules, leading to overfitting and low efficiency.

[0054] The method was demonstrated on a data set of actual scans comprising 294 CT scans from the Lung Image Database Consortium (LIDC) dataset. The CT scans were randomly divided into two independent subsets: a training set (196 scans) and a testing set (98 scans). In total, the 294 CT scans contained 631 lung nodules, which were annotated by at least two radiologists participating in the LIDC project. The sensitivity and false positives per scan using the method trained on the training set were 87% and 2.61, respectively, for the training set and 85.2% and 3.13, respectively, for the testing set.

[0055] The database contains lung nodules that were annotated by four experienced radiologists (who participated in the LIDC project) in a two-phase reading procedure. The first reading of CT scan images was blind. Each radiologist identified the locations and radiological characteristics of lung nodules independently. The second reading was unblinded. Each radiologist reviewed the first reading results along with the information provided by the other three radiologists in the second reading phase, and decided whether to change their previous annotations. Final annotations were determined after the second unblinded reading.

[0056] The dataset used in this embodiment was selected from the LIDC database according to eight conditions: 1) the CT scan contained nodules of diameter 3-30 millimeters (mm, 1 mm = 10^-3 meters); 2) the CT scan images were not contrast-enhanced; 3) the slice thickness was 1.25-3 mm; 4) the slice interval was 0.625-3.0 mm; 5) the scan voltage was 12-140 kilovolts, peak (kVp, 1 kV = 10^3 volts); 6) the tube current-time product was 20-320 milliampere-seconds (mAs, 1 mA = 10^-3 amperes); 7) the image size was 512x512 pixels; 8) the image pixel spacing was 0.5-0.8. The eight conditions were designed in reference to the National Lung Screening Trial [21]. FIG. 5 is a plot that illustrates an example distribution of lung nodule sizes in the data set used to demonstrate the method of FIG. 2, according to an embodiment. The size distribution of the selected nodules is shown and approximates a Poisson distribution with a peak at a size of about 6 to 7 mm.

[0057] FIG. 6A through FIG. 6C are flow charts that illustrate an example of the method of FIG. 2 implemented for lung nodules, according to an embodiment. The method of FIG. 6A through FIG. 6C is implemented as three separate modules. The first module generates candidate nodules. The second module classifies them into three anatomical location context categories: peripheral-nodule; chestwall-nodule; and mediastinum-nodule. The third module, responsible for detection, is composed of detection nodes in the form of the tree structure of FIG. 6A. As a result, the candidate nodules are categorized and partitioned in terms of their location, size, and shape. Based on its location, a candidate nodule can be categorized as a peripheral-nodule, chestwall-nodule or mediastinum-nodule. Based on its size, a candidate nodule can be large or small, according to its diameter. And based on its shape, a candidate nodule can be ball-like or irregular according to sphericity. A ball-like nodule is spherical and unattached to other tissues, whereas an irregular nodule is not spherical or is attached to other tissues, especially vessels, or both.

[0058] Based on these three criteria for the different levels of the tree structure, a total of thirteen types of nodules were defined. As shown in FIG. 6B, each leaf node in the detection tree (called simply a node hereinafter) corresponds to one type of nodule.

[0059] For each node, one or two specific algorithms were adopted. Each node contains three procedures for selecting the corresponding nodule (see FIG. 6B and FIG. 6C). The procedures are as follows: 1) compute condition values, 2) compare condition values to thresholds and determine whether the candidate nodule could enter (i.e., be accepted by) the node, and 3) extract image features and train the classifier. In each node, candidate nodules determined as "Yes" (accepted) are regarded as true nodules; otherwise, candidate nodules determined as "No" continue to the next node. To facilitate the illustration, nodes are named according to their order in the detection pipeline. For instance, node "PER_NODE_3" denotes the third node to detect within the peripheral-nodule detector.

[0060] The example implementation involves multiple parameters, which can be put into two groups, primary parameters and secondary parameters. Primary parameters are those used to categorize and partition candidate nodules (e.g. diameter and sphericity thresholds), while secondary parameters are derived from adopted specific algorithms. Experiments were performed with the test sets and training sets to determine the optimal or preferred values for the primary parameters (as described in more detail below). For secondary parameters, the settings were derived from the related literature where such parameters have been widely studied.

[0061] The procedure to generate candidate nodules is composed of the following steps. Step 1 interpolates the CT images to isotropic volumetric data. In this step, all CT images are stacked to form a volume and then interpolated into an isotropic one, that is, volumetric data with the same spacing along the sagittal, coronal and transverse directions. Step 2 generates the target lung area. In step 2, the target lung area is segmented by the lung segmentation algorithm developed by Zhao [15]. Step 3 applies multi-scale enhancement filters of dot and line to the volumetric data as described in more detail below. Step 4 normalizes and combines the output of the dot and line enhancement filters. Step 5 determines candidate nodules according to half-peak volume.

[0062] In step 3, multi-scale enhancement filters of dot and line developed by Li et al. [7] are adopted. Suppose the basic dot and line shapes in the volumetric data could be represented by Gaussian functions as in Equation 1 and Equation 2.

$$d(x, y, z) = \exp\left(-\frac{x^2 + y^2 + z^2}{2\sigma^2}\right) \tag{1}$$

$$l(x, y, z) = \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right) \tag{2}$$

The enhancement dot and line filters are constructed by the three eigenvalues of the Hessian matrix of the above equations to yield Equation 3 and Equation 4.

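$$z_{dot}(\lambda_1, \lambda_2, \lambda_3) = \begin{cases} \dfrac{|\lambda_3|^2}{|\lambda_1|}, & \text{if } \lambda_1 < 0,\ \lambda_2 < 0,\ \lambda_3 < 0 \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

$$z_{line}(\lambda_1, \lambda_2, \lambda_3) = \begin{cases} \dfrac{|\lambda_2|\,(|\lambda_2| - |\lambda_3|)}{|\lambda_1|}, & \text{if } \lambda_1 < 0,\ \lambda_2 < 0 \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

where $\lambda_1$, $\lambda_2$ and $\lambda_3$ are the three eigenvalues that satisfy $|\lambda_1| \geq |\lambda_2| \geq |\lambda_3|$. In the multi-scale enhancement filters, the outputs of the dot and line filters are multiplied by their own Gaussian factor $\sigma^2$, which is used as a scale to modify the filters for the detection of target objects of different sizes. For a voxel, the final filter output is the maximal filter output among all the scales.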
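The per-voxel filter responses can be sketched in a few lines. This is an illustration under stated assumptions, not the code of the embodiment: it assumes the dot response is |λ3|²/|λ1| and the line response is |λ2|(|λ2| - |λ3|)/|λ1| with the eigenvalues ordered by decreasing magnitude, and it assumes the Hessian eigenvalues at each scale have already been computed elsewhere. The function names are hypothetical.

```python
def sort_by_magnitude(eigs):
    """Order eigenvalues so that |l1| >= |l2| >= |l3|."""
    return sorted(eigs, key=abs, reverse=True)


def dot_response(eigs):
    """Dot (blob) enhancement: nonzero only when all eigenvalues are negative."""
    l1, l2, l3 = sort_by_magnitude(eigs)
    if l1 < 0 and l2 < 0 and l3 < 0:
        return abs(l3) ** 2 / abs(l1)
    return 0.0


def line_response(eigs):
    """Line (tube) enhancement: nonzero when the two largest are negative."""
    l1, l2, l3 = sort_by_magnitude(eigs)
    if l1 < 0 and l2 < 0:
        return abs(l2) * (abs(l2) - abs(l3)) / abs(l1)
    return 0.0


def multiscale_response(eigs_per_scale, response):
    """Scale each response by sigma**2 and keep the maximum over all scales.

    eigs_per_scale maps a Gaussian scale sigma to the three Hessian
    eigenvalues of the voxel at that scale.
    """
    return max(sigma ** 2 * response(eigs)
               for sigma, eigs in eigs_per_scale.items())
```

With these definitions, an ideal bright blob (three equal negative eigenvalues) yields a dot response equal to the eigenvalue magnitude and a line response of zero, while an ideal bright tube (one zero eigenvalue) yields the opposite.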

[0063] In step 4, the output of the dot and line filters is first normalized to [0,1] (denoted as z_n_dot and z_n_line), and then the normalized dot and line filters are combined according to Equation 5 through Equation 8.

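$$z_{n\_dot} = \frac{z_{dot}}{\text{DOT\_MAXVALUE}} \tag{5}$$

$$z_{n\_line} = \frac{z_{line}}{\text{LINE\_MAXVALUE}} \tag{6}$$

$$z_{c\_dot} = \left(1 - \exp\left(-\frac{z_{n\_dot}^2}{2\alpha^2}\right)\right)\exp\left(-\frac{z_{n\_line}^2}{2\beta^2}\right) \tag{7}$$

$$z_{c\_line} = \left(1 - \exp\left(-\frac{z_{n\_line}^2}{2\alpha^2}\right)\right)\exp\left(-\frac{z_{n\_dot}^2}{2\beta^2}\right) \tag{8}$$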

DOT_MAXVALUE, LINE_MAXVALUE, α and β are empirically set to 35.0, 100.0 [6, 7], 0.4 and 0.4 [22], respectively.

[0064] In step 5, the concept of "half-peak volume" is used to generate candidate nodules. Half-peak volume is defined as the number of voxels that have z_c_dot values larger than half of the local maximal z_c_dot peak value. Half-peak volume is obtained by the following procedure. A threshold Thr_z_c_dot = 0.02 is applied to the output of the dot filter within the target lung area. Voxels with z_c_dot values smaller than Thr_z_c_dot are regarded as noise and set to 0; otherwise they are set to 1. The candidate volumes are labeled using a region-growing algorithm. Within each labeled candidate volume, voxels with z_c_dot values smaller than half of the local peak z_c_dot value are removed and the remaining voxels are selected to create the candidate nodule.
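The half-peak procedure of step 5 can be sketched as below. The sketch makes several assumptions that the text does not fix: voxels are held in a dictionary keyed by integer (x, y, z) coordinates, region growing uses 6-connectivity, and the function name `half_peak_candidates` is hypothetical.

```python
from collections import deque

# Assumption: face-adjacent (6-connected) neighbors define connectivity.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def half_peak_candidates(z_c_dot, threshold=0.02):
    """Return one set of voxels per candidate nodule.

    z_c_dot maps (x, y, z) -> combined dot-filter response inside the lung.
    """
    # Binarize: responses below the threshold are treated as noise.
    unvisited = {v for v, val in z_c_dot.items() if val >= threshold}
    candidates = []
    while unvisited:
        # Grow one connected region from an arbitrary seed.
        seed = unvisited.pop()
        region, queue = {seed}, deque([seed])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBORS:
                n = (x + dx, y + dy, z + dz)
                if n in unvisited:
                    unvisited.remove(n)
                    region.add(n)
                    queue.append(n)
        # Keep only voxels at or above half of the region's peak response.
        peak = max(z_c_dot[v] for v in region)
        candidates.append({v for v in region if z_c_dot[v] >= 0.5 * peak})
    return candidates
```

The half-peak volume of a candidate is then simply the number of voxels in its returned set.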

[0065] Candidate nodules can be classified as peripheral, chestwall or mediastinum based on their location. Examples of the thirteen types of nodules in the corresponding nodes are presented in FIG. 7 through FIG. 9. FIG. 7A through FIG. 7F are images that illustrate six example types of nodules processed by a peripheral-nodule detector, according to an embodiment. FIG. 8A through FIG. 8C are images that illustrate three example types of nodules processed by a chestwall-nodule detector, according to an embodiment. FIG. 9A through FIG. 9D are images that illustrate four example types of nodules processed by a mediastinum-nodule detector, according to an embodiment.

[0066] FIG. 10A through FIG. 10E are images that illustrate example results from various processing steps, according to an embodiment. Though the images in FIG. 10A through FIG. 10E are 2D, the procedure is conducted in 3D. The first step is to extract the lung region from the CT images by applying a threshold Thr_category = -275 HU [23]. FIG. 10A shows an example slice of a lung image. FIG. 10B shows an example extracted lung region. Candidate nodules attached to the pulmonary vasculature are defined as mediastinum-nodules. Candidate nodules attached to the chest wall are defined as chestwall-nodules. Candidate nodules other than mediastinum-nodules and chestwall-nodules are defined as peripheral-nodules. Second, a morphological opening operation is applied to the extracted region with a ball of radius 60 mm. FIG. 10C shows an example modified region after the morphological opening operation. The surface of the mask in FIG. 10C is defined as the chest wall. Then the extracted region is subtracted from the modified region. FIG. 10D shows an example region after subtracting the extracted region from the modified region. In the subtracted volumes, the largest object is defined as the pulmonary vasculature. FIG. 10E shows an example pulmonary vasculature.

[0067] The example peripheral-nodule detector consists of six nodes, PER_NODE_1 through PER_NODE_6, which produce the example nodules depicted in FIG. 7A through FIG. 7F.

[0068] PER_NODE_1 is for large candidate nodules. As shown in FIG. 10D, a group of objects can be obtained after applying the threshold Thr_category. Then, by performing a morphological opening operation (with a spherical structuring element of radius 3.0 mm) on the objects, one can obtain a group of candidate nodules. Candidate nodules of diameter greater than 9 mm (meaning that, after opening, the potential nodules should be at least 9+3*2 = 15 mm in diameter) are used for classification in PER_NODE_1.

[0069] PER_NODE_2 and PER_NODE_3 are based on the properties of blobness, sphericity and half peak volume. The thresholds of blobness>75.0 [6, 7] and sphericity>0.50 are used to select ball-like candidate nodules. If a candidate nodule has a half peak diameter>3 mm, it is used for classification in PER_NODE_2. If a candidate nodule has a half peak diameter<3 mm, it is used for classification in PER_NODE_3. Blobness and sphericity are calculated by Equations 9 and 10.

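$$\text{blobness} = \frac{\max(z_{c\_dot})}{\max(z_{c\_line})} \times \text{half peak volume} \tag{9}$$

$$\text{sphericity} = \frac{\pi^{1/3}\,\left(6\,\text{Volume}_{nodule}\right)^{2/3}}{\text{Surface}_{nodule}} \tag{10}$$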
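The entering conditions for the two ball-like peripheral nodes can be sketched as follows. Two assumptions are made and should be checked against the original drawings: sphericity is taken to be the standard definition π^(1/3)(6V)^(2/3)/A, which equals 1 for a perfect sphere, and a half peak diameter of exactly 3 mm (a case the text leaves open) is sent to PER_NODE_3. All function names are hypothetical.

```python
import math


def blobness(max_z_c_dot, max_z_c_line, half_peak_volume):
    """Ratio of peak dot to peak line response, weighted by half peak volume.

    Assumes max_z_c_line is nonzero.
    """
    return max_z_c_dot / max_z_c_line * half_peak_volume


def sphericity(volume, surface_area):
    """Standard sphericity (assumed): 1.0 for a sphere, smaller otherwise."""
    return math.pi ** (1 / 3) * (6 * volume) ** (2 / 3) / surface_area


def route_ball_like(blob, spher, half_peak_diameter_mm):
    """Return the ball-like peripheral node a candidate enters, or None."""
    if blob > 75.0 and spher > 0.50:
        # Assumption: the boundary value 3 mm goes to PER_NODE_3.
        return "PER_NODE_2" if half_peak_diameter_mm > 3.0 else "PER_NODE_3"
    return None
```

As a sanity check on the sphericity definition, a sphere of radius 2 (V = 32π/3, A = 16π) gives 1.0 and a unit cube gives about 0.806.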

[0070] PER_NODE_4 targets irregular nodules with no attaching tissue, especially ground glass opacity (GGO) with no vessel going through or surrounding it. Candidate nodules that encounter PER_NODE_4 are modified by an adaptive threshold based on Hounsfield units [16]. First, the voxel with the largest z_c_dot value is selected and labeled as point p_max_c_dot. Then, a Hounsfield unit threshold Thr_adaptive_HU = I - 275 HU [25] is established, where I is the Hounsfield unit value of point p_max_c_dot. Thr_adaptive_HU is then applied to reconstruct the candidate nodule; that is, the voxels with Hounsfield units larger than Thr_adaptive_HU are selected to reconstruct the candidate nodule's volumetric data.
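The adaptive threshold of PER_NODE_4 can be sketched as below. The sketch rests on one reading of the threshold expression, namely that the threshold is the Hounsfield value I at the peak-response voxel minus 275 HU; that reading, the dictionary-based voxel representation, and the function name are assumptions made for illustration.

```python
def adaptive_threshold_region(hu, z_c_dot, offset_hu=275.0):
    """Rebuild a candidate from voxels brighter than an adaptive HU threshold.

    hu maps voxel -> Hounsfield units for the neighborhood under study;
    z_c_dot maps voxel -> combined dot response for the candidate's voxels
    (every key of z_c_dot must also be a key of hu).
    """
    # Voxel with the largest dot response (p_max_c_dot in the text).
    p_max = max(z_c_dot, key=z_c_dot.get)
    # Assumed form of Thr_adaptive_HU: I - 275 HU.
    threshold = hu[p_max] - offset_hu
    region = {v for v, value in hu.items() if value > threshold}
    return p_max, threshold, region
```

Under this reading, a faint ground glass nodule whose peak voxel is at -400 HU gets a threshold of -675 HU, while a solid nodule near 0 HU gets -275 HU.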

[0071] PER_NODE_5 mainly takes advantage of intensity information to refine irregular-shaped candidate nodules. First, candidate nodules that are directed to PER_NODE_5 undergo fuzzy segmentation [17]. The refined result of fuzzy segmentation is used for the following procedures. In fuzzy segmentation, the relation between each pair of voxels (p,q) is represented by fuzzy connectedness strength, computed by Equation 11.

$$\text{fuzzy}(p, q) = \exp\left(-\frac{A(p,q)^2}{2\sigma^2}\right) \tag{11}$$

where A(p,q) = (I_p + I_q)*0.5 - μ, I_p and I_q are the Hounsfield units of the voxels p and q, and μ and σ are the mean and standard deviation of Hounsfield units in the target region. In this work, μ and σ are estimated from the target region determined by the local density maximum algorithm developed by Zhao [15]. The algorithm provides a potential nodule region quickly.
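The pairwise affinity used in fuzzy segmentation can be sketched as follows, assuming the Gaussian form exp(-A(p,q)²/(2σ²)) with A(p,q) = (I_p + I_q)*0.5 - μ. The use of the population standard deviation for σ and the function names are assumptions for illustration only.

```python
import math
import statistics


def region_stats(hu_values):
    """Mean and (population) standard deviation of HU in a target region."""
    return statistics.mean(hu_values), statistics.pstdev(hu_values)


def fuzzy_affinity(i_p, i_q, mu, sigma):
    """Affinity between two voxels with Hounsfield units i_p and i_q.

    Equals 1.0 when the pair's average intensity matches the region mean
    and decays as the average moves away from it.
    """
    a = 0.5 * (i_p + i_q) - mu
    return math.exp(-(a * a) / (2.0 * sigma * sigma))
```

A pair of voxels whose mean intensity lies one standard deviation from μ therefore has affinity exp(-0.5), roughly 0.61.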

Suppose Volume_ori is the original volume of candidate nodules, and Volume_ldm is the volume provided by local density maximum algorithm. Thus, the ratio between the two volumetric data can be computed by Equation 12.

$$\text{LDMRatio}_{ori} = \frac{|\text{Volume}_{ori} \cap \text{Volume}_{ldm}|}{\text{Volume}_{ori}} \tag{12}$$

In this work, candidate nodules with LDMRatio_ori>0.05 are selected for classification in PER_NODE_5; candidate nodules with LDMRatio_ori<0.05 are considered noise.
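The entering test for PER_NODE_5 can be sketched with voxel sets. This assumes the ratio is the number of voxels shared by the original candidate and the local-density-maximum region, divided by the number of voxels in the original candidate; the set representation and function names are hypothetical.

```python
def ldm_ratio(volume_ori, volume_ldm):
    """Fraction of the original candidate covered by the LDM region.

    Both arguments are sets of voxel coordinates.
    """
    if not volume_ori:
        raise ValueError("original candidate volume is empty")
    return len(volume_ori & volume_ldm) / len(volume_ori)


def enters_per_node_5(volume_ori, volume_ldm, min_ratio=0.05):
    """True when the candidate overlaps the LDM region enough to be kept."""
    return ldm_ratio(volume_ori, volume_ldm) > min_ratio
```

A candidate that shares no voxels with the local-density-maximum region has a ratio of 0 and is treated as noise.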

[0072] PER_NODE_6 is similar to PER_NODE_5. When PER_NODE_5 fails to refine irregular-shaped candidate nodules based on intensity, PER_NODE_6 uses geodesic distance transformation, instead of the local density maximum algorithm, to estimate the mean and standard deviation of Hounsfield units for fuzzy segmentation. Geodesic distance transformation is preferable because the local density maximum algorithm might fail if candidate nodules are traversed by more than one vessel. In PER_NODE_6, the adaptive threshold algorithm mentioned above is first used to generate an initial region, to which geodesic distance transformation is then applied. Second, the local maximal geodesic distance G_max_local is searched according to the connectivity between voxels within the initial region. Then, all voxels that are within the initial region, are connected to the point of local maximal geodesic distance, and have a geodesic distance value larger than 0.5*G_max_local [18] are selected to construct the refined candidate nodules for classification in PER_NODE_6.

[0073] The chestwall-nodule detector consists of three nodes, CWL_NODE_1 through CWL_NODE_3, which produce the example nodules depicted in FIG. 8A through FIG. 8C. The first node, CWL_NODE_1, is for large candidate nodules and is similar to PER_NODE_1, while the other two nodes are for un-ball-like nodules. For the last two nodes, candidate nodules are modified by a finer threshold based on Hounsfield units. Furthermore, a scaled geodesic distance map [18] is used to characterize those un-ball-like nodules. Candidate chestwall-nodules are regenerated by using the region-growing algorithm, which has two parameters, the starting seeds and the growing threshold. In this work, the starting seeds are the original candidate nodules. The growing threshold is Thr_wall_refined = -700 HU [23]; that is, the voxels with Hounsfield units larger than Thr_wall_refined are included in the regenerated volumetric data of candidate chestwall-nodules.

[0074] The scaled geodesic distance map is used to obtain the shape property of candidate chestwall-nodules. To calculate the scaled geodesic distance map, a geodesic distance transformation is applied to the regions of candidate nodules starting from the outer surface of the lung wall. The outer surface of the lung wall is the outer surface of the target lung region as described above. Then, a scale factor of 1.0 is used to scale the geodesic distance map. FIG. 11 is an image that illustrates an example scaled geodesic distance map resulting from one step of a classification process, according to an embodiment. As shown in FIG. 11, the scaled geodesic distance map appears as a group of iso-surfaces indicated by an arrow.

[0075] Two parameters were developed to characterize the shape property of candidate chestwall-nodules based on the scaled geodesic distance map: the local range of maximal iso-surface (LRMI) and the average ratio between neighboring iso-surfaces (ARNI). Suppose SGDM_gd represents the volume of an object's iso-surface with geodesic distance gd.

GD_maxSGDM denotes the order of the iso-surface with the maximal volume, Max(SGDM_gd).

GD_local1_maxSGDM and GD_local2_maxSGDM are used to represent the orders of the two iso-surfaces with the closest local minimal volumes around GD_maxSGDM. Since a scale factor of 1.0 is used to scale the geodesic distance map, the values of GD_maxSGDM, GD_local1_maxSGDM and GD_local2_maxSGDM also correspond to the real geodesic distance values in the scaled geodesic distance map. Thus, the LRMI can be computed according to Equation 13.

$$\text{LRMI} = GD_{local2\_maxSGDM} - GD_{local1\_maxSGDM}, \quad GD_{local2\_maxSGDM} \geq GD_{maxSGDM} \geq GD_{local1\_maxSGDM} \tag{13}$$

Suppose N_isosurface represents the total number of iso-surfaces within the scaled geodesic distance map. Thus, the ARNI can be computed according to Equation 14.

$$\text{ARNI} = \frac{1}{N_{isosurface} - 1} \sum_{n=2}^{N_{isosurface}} \frac{SGDM(gd_n)}{SGDM(gd_{n-1})} \tag{14}$$

In this work, candidate chestwall-nodules with LRMI>0 [26] are directed to CWL_NODE_2, while candidate nodules with LRMI=0 [26] are directed to CWL_NODE_3. The two proposed parameters LRMI and ARNI are used in CWL_NODE_2 and CWL_NODE_3, respectively.
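The ARNI feature and the routing between the last two chestwall nodes can be sketched as follows. The sketch assumes ARNI is the mean of the ratios between each iso-surface volume and the preceding one, ordered by increasing geodesic distance, and it takes an already computed LRMI value as input rather than deriving it. Function names are hypothetical.

```python
def arni(sgdm):
    """Average ratio between neighboring iso-surfaces.

    sgdm is the list of iso-surface volumes ordered by geodesic distance;
    all volumes except the last must be nonzero.
    """
    n = len(sgdm)
    if n < 2:
        raise ValueError("at least two iso-surfaces are required")
    return sum(sgdm[i] / sgdm[i - 1] for i in range(1, n)) / (n - 1)


def route_chestwall(lrmi):
    """Choose the chestwall node from a precomputed LRMI value."""
    if lrmi < 0:
        raise ValueError("LRMI is expected to be non-negative")
    return "CWL_NODE_2" if lrmi > 0 else "CWL_NODE_3"
```

An object whose iso-surfaces double in volume at each step has an ARNI of 2.0, while one whose iso-surfaces shrink steadily has an ARNI below 1.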

[0076] The mediastinum-nodule detector consists of four nodes, MED_NODE_1 through MED_NODE_4, which produce the example nodules depicted in FIG. 9A through FIG. 9D.

[0077] The entering condition of MED_NODE_1 is the same as that of PER_NODE_1. Candidate nodules of diameter>15 mm are used for classification in MED_NODE_1.

[0078] The candidate nodules that encounter MED_NODE_2 are first modified by the adaptive threshold algorithm described above. Then, the sphericity of each regenerated candidate nodule is calculated. Those regenerated candidate nodules with sphericity larger than 0.5 are used for classification in MED_NODE_2.

[0079] The entering condition of MED_NODE_3 is the same as that of PER_NODE_5. It is noted that only those candidate nodules with LDMRatio_ori > 0.05 are used for classification in MED_NODE_3.

[0080] The preprocessing procedure for MED_NODE_4 is the same as that of PER_NODE_6.

[0081] Classifiers in the nodes are trained by image features and the regression tree classification algorithm [19] based on two concepts: the detection rate (DR) and the false positive rate (FR). The DR and FR can be calculated according to Equation 15 and Equation 16.

$$\text{Detection rate} = \left(1 - \frac{\text{The number of missing nodules in the node}}{\text{The number of total true nodules in the node}}\right) \times 100\% \tag{15}$$

$$\text{False positive rate} = \frac{\text{The number of false nodules in the node}}{\text{The number of total true nodules in the node}} \tag{16}$$

In the training procedure, two thresholds, DR_thr and FR_thr, are used to control the number of candidates that are processed in the classifier. That is, the training result of DR is advantageously higher than DR_thr, while the training result of FR is advantageously lower than FR_thr. Thus, the numbers of candidates in the nodes are determined by a series of preset DR_thr and FR_thr values. For instance, a low DR_thr and a high FR_thr might make the number of candidates in an individual node too large, while a high DR_thr and a low FR_thr might make it too small. Both situations are harmful to the performance of the tree structure. Thus, proper setting of DR_thr and FR_thr is advantageous. Fortunately, there are many studies [6, 9, 10, 13, 14] to help guide the settings of DR_thr and FR_thr. The algorithms in those studies have been proven useful for the detection of certain types of lung nodules. The performance of the CAD schemes in those studies is reported as sensitivities of 80% to 90% and false positives per scan of more than 4.0. Therefore, DR_thr is set to 75% and FR_thr is set to 4.0 for most of the nodes in the detection system, except the last nodes in the detectors. The DR_thr and FR_thr of the last nodes (PER_NODE_6, CWL_NODE_3 and MED_NODE_4) were set to 50% and 4.0, respectively, because candidate nodules in the last nodes are the most difficult to classify.

[0082] After setting DR_thr and FR_thr, the classifier in each node is trained by the regression tree classifier algorithm. The parameters that achieve the highest DR under the conditions DR>DR_thr and FR<FR_thr were selected to form the classifier. In the implementation, first, a number of false candidate nodules were removed according to the feature value ranges of true positives. Second, a free-response receiver operating characteristic (FROC) [27] curve was trained with regression tree classification. Finally, the local maximum is searched in the window defined by DR_thr and FR_thr. Nodes with DR lower than 75% would be discarded. FIG. 12 is a table that illustrates example node features, according to an embodiment. All of the features used for node training are presented in Table 1 presented in FIG. 12.
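The per-node rates and the selection of an operating point inside the window defined by DR_thr and FR_thr can be sketched as follows. The representation of the FROC curve as a list of (DR, FR) pairs and the function names are assumptions made for illustration; the regression tree training itself is not shown.

```python
def detection_rate(missing, total_true):
    """Per-node detection rate in percent."""
    return (1.0 - missing / total_true) * 100.0


def false_positive_rate(false_nodules, total_true):
    """Per-node false positive rate (false nodules per true nodule)."""
    return false_nodules / total_true


def pick_operating_point(froc_points, dr_thr=75.0, fr_thr=4.0):
    """Pick the (DR, FR) point with the highest DR inside the window.

    Returns None when no point satisfies DR > dr_thr and FR < fr_thr,
    in which case the node would not be usable at these thresholds.
    """
    feasible = [p for p in froc_points if p[0] > dr_thr and p[1] < fr_thr]
    return max(feasible, key=lambda p: p[0]) if feasible else None
```

For the last node of each detector, the same function would be called with dr_thr=50.0, matching the relaxed setting described above.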

[0083] Features listed in Table 1 of FIG. 12 are defined or calculated as follows.

Mean_CT_Value is obtained by computing the average Hounsfield units of volumetric data. Volume is obtained by counting the total voxels of volumetric data. Sphericity is defined by Equation 10. AdaThr_ is obtained by computing the sphericity of volumetric data that is modified by adaptive threshold. Max_dot is defined by Equation 17 and Dotline_Ratio by Equation 18.

$$\text{Max\_dot} = \max(z_{c\_dot}) \tag{17}$$

$$\text{DotLine\_Ratio} = \frac{\max(z_{c\_dot})}{\max(z_{c\_line})} \tag{18}$$

Blobness is defined by Equation 9. GD_R1 is obtained by computing the maximal geodesic distance to the surface of the volumetric data (the distance value inside the volumetric data is positive, otherwise negative). GD_Ratio1 and GD_Ratio2 are obtained as follows. Suppose p_c is the voxel corresponding to the value GD_R1. The geodesic distance transformation is performed starting from p_c, and the maximal geodesic distance GD_R2 and the corresponding voxel p_1 are recorded. The geodesic distance transformation is then performed again starting from p_1, and the maximal geodesic distance GD_R3 and the corresponding voxel p_2 are recorded. The ratios are defined by Equation 19 and Equation 20.

GD Ratiol (19)

_R3

GD Ratio2 = (20)

_R1

Dice is obtained by supposing Volume_ori is the original volume of candidate nodules, and Volume_mod is the volume of candidate nodules that have been modified by the corresponding preprocessing. Then Dice is computed by Equation 21.

$$\text{Dice} = 2 \times \frac{|\text{Volume}_{ori} \cap \text{Volume}_{mod}|}{\text{Volume}_{ori} + \text{Volume}_{mod}} \tag{21}$$

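Ori_LDMRatio and Modified_LDMRatio are obtained using Equation 22 and Equation 23, respectively.

$$\text{Ori\_LDMRatio} = \frac{|\text{Volume}_{ori} \cap \text{Volume}_{ldm}|}{\text{Volume}_{ori}} \tag{22}$$

$$\text{Modified\_LDMRatio} = \frac{|\text{Volume}_{mod} \cap \text{Volume}_{ldm}|}{\text{Volume}_{mod}} \tag{23}$$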

LRMI and ARNI are obtained using Equation 13 and Equation 14, provided above. The final feature, Modified_Ratio, is defined by Equation 24.

$$\text{Modified\_Ratio} = \frac{\text{Volume}_{ori}}{\text{Volume}_{mod}} \tag{24}$$
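The overlap-based features can be sketched with voxel sets. The sketch assumes the standard Dice coefficient (twice the shared voxel count divided by the sum of the two voxel counts) and assumes each LDM ratio is the shared voxel count divided by the size of the first volume. The function names and the dictionary of feature values are hypothetical.

```python
def dice(volume_a, volume_b):
    """Dice overlap between two voxel sets (1.0 for identical sets)."""
    return 2.0 * len(volume_a & volume_b) / (len(volume_a) + len(volume_b))


def overlap_ratio(volume, volume_ldm):
    """Fraction of `volume` that is also inside the LDM region."""
    return len(volume & volume_ldm) / len(volume)


def overlap_features(volume_ori, volume_mod, volume_ldm):
    """Collect the three overlap features for one candidate nodule."""
    return {
        "Dice": dice(volume_ori, volume_mod),
        "Ori_LDMRatio": overlap_ratio(volume_ori, volume_ldm),
        "Modified_LDMRatio": overlap_ratio(volume_mod, volume_ldm),
    }
```

A preprocessing step that leaves the candidate unchanged gives a Dice value of 1.0, so low Dice values flag candidates that were reshaped substantially.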

[0084] Experimental results were obtained as described next. The 294 CT scans were randomly divided into two groups at a ratio of 2:1: 196 training CT scans and 98 testing CT scans. The first group was used as a training set to develop the nodule detection system with leave-one-out cross-validation, while the latter was used as a test set to test the nodule detection system independently. The training set contains 408 annotated nodules, while the testing set contains 233 annotated nodules. Two indicators, sensitivity and false positives per scan, were used for the presentation of performance. The sensitivity and false positives per scan were calculated according to Equation 25 and Equation 26.

$$\text{Sensitivity} = \frac{\text{Number of total truly detected nodules}}{\text{Number of total annotated nodules}} \times 100\% \tag{25}$$

$$\text{False positives per scan} = \frac{\text{Number of total false positives}}{\text{Number of total scans}} \tag{26}$$

[0085] The criterion to differentiate a truly detected nodule from a false positive was based on the comparison between the centroid of a detected nodule and that of the nearest annotated nodule. If the Euclidean distance from the centroid of the detected nodule to that of the nearest annotated nodule did not exceed a distance threshold, the detected nodule was regarded as a truly detected nodule; otherwise it was considered a false positive. For these purposes, if the nearest annotated nodule was smaller than 10 mm, the distance threshold was 6 mm; otherwise the distance threshold was 12 mm. The centroid of an annotated nodule was computed by averaging the readers' delineations, while the centroid of a detected nodule was computed by its corresponding node. The result of the proposed hierarchical classifier method is presented in Table 2. FIG. 13A and FIG. 13B are free-response receiver operating characteristic (FROC) curves that illustrate example performance of the method on a training set and testing set, respectively, according to an embodiment. The proposed hybrid method was coded in C++ and tested on a computer with a 3.60 GHz CPU and 8 GB of memory. On average, the hybrid method takes 628 seconds per scan for detection, excluding I/O.
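The scoring rule can be sketched as follows. Annotations are represented here as dictionaries with a centroid in millimeters and a diameter in millimeters; that representation and the function names are assumptions for illustration, and the counting of detections per annotated nodule is left to the caller.

```python
import math


def distance_threshold_mm(annotated_diameter_mm):
    """6 mm for annotated nodules under 10 mm in diameter, else 12 mm."""
    return 6.0 if annotated_diameter_mm < 10.0 else 12.0


def is_true_detection(detected_centroid, annotations):
    """Compare a detection to the nearest annotated nodule.

    Each annotation is a dict with keys "centroid" (x, y, z in mm) and
    "diameter_mm". With no annotations, every detection is a false positive.
    """
    if not annotations:
        return False
    nearest = min(annotations,
                  key=lambda a: math.dist(detected_centroid, a["centroid"]))
    distance = math.dist(detected_centroid, nearest["centroid"])
    return distance <= distance_threshold_mm(nearest["diameter_mm"])


def sensitivity(truly_detected, annotated):
    """Sensitivity in percent."""
    return 100.0 * truly_detected / annotated


def false_positives_per_scan(false_positives, scans):
    """Average number of false positives per scan."""
    return false_positives / scans
```

For example, a detection 7 mm from an 8 mm annotated nodule counts as a false positive, while the same offset from a 20 mm nodule counts as a true detection.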

Table 2. The result of the proposed hybrid method

[0086] There are a total of thirteen nodes in the hybrid method, each targeting a different type of nodule. In each node, all the candidate nodules that meet the entering conditions are trained based on the preset DR_thr and FR_thr. To evaluate each node, the final DRs and FRs are presented in Table 3.

Table 3. Details of each node for both the training and test set

[0087] While the values of the secondary parameters were obtained from the pertinent literature, experiments described next were performed to determine the optimal values for diameter and sphericity thresholds.

[0088] There are two diameter thresholds. One is used to define large nodules (PER_NODE_1, CWL_NODE_1 and MED_NODE_1), while the other differentiates large ball-like candidate peripheral-nodules (PER_NODE_2) from small ball-like candidate peripheral-nodules (PER_NODE_3). FIG. 14A and FIG. 14B are FROC curves that illustrate example performance of the method on a training set and testing set, respectively, using different diameter thresholds, according to various embodiments. These experiments determine the first threshold. It can be seen that diameter = 15 mm is the best threshold, providing the highest sensitivity for both the training set and the test set. This result is consistent with the diameter distribution shown in FIG. 5, in which most nodules are of diameter < 15 mm. For the second threshold, the purpose was to isolate extremely small peripheral-nodules (PER_NODE_3). Thus, the half peak diameter = 3 mm was set as the threshold, because diameter = 3 mm is the minimal diameter of nodules of interest.

[0089] "Sphericity" is used to differentiate ball-like candidate nodules (PER_NODE_2-3, MED_NODE_2) from irregular candidate nodules (PER_NODE_4-6, MED_NODE_3-4). According to Equation 10, the range of sphericity should be from 0 to 1, so experiments were conducted on sphericity values of 0.00, 0.25, 0.50 and 0.75, respectively. FIG. 15A and FIG. 15B are FROC curves that illustrate example performance of the method on a training set and testing set, respectively, using different sphericity thresholds, according to various embodiments. As shown in FIG. 15A and FIG. 15B, sphericity = 0.50 is the best threshold. Detection performance was the worst at sphericity = 0, which mixes together all nodules regardless of their shape.
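The way the selected diameter and sphericity thresholds separate peripheral candidates can be sketched as follows. This Python fragment is a simplified illustration, not the implementation described herein: the function name `peripheral_group`, the direction and inclusiveness of each comparison, and the hard (non-overlapping) routing are assumptions. Only the threshold values (15 mm, 3 mm, 0.50) and the node names come from the text.

```python
LARGE_DIAMETER_MM = 15.0   # first diameter threshold (large nodules)
SMALL_DIAMETER_MM = 3.0    # second diameter threshold (very small nodules)
SPHERICITY_THR = 0.50      # ball-like versus irregular


def peripheral_group(diameter_mm, sphericity):
    """Map a peripheral candidate to the node group its size and shape suggest.

    Assumes sphericity lies in [0, 1], with higher values meaning more
    ball-like, and that every comparison is inclusive at the threshold.
    """
    if diameter_mm >= LARGE_DIAMETER_MM:
        return "PER_NODE_1"            # large nodules
    if sphericity >= SPHERICITY_THR:   # ball-like candidates
        if diameter_mm >= SMALL_DIAMETER_MM:
            return "PER_NODE_2"        # large ball-like
        return "PER_NODE_3"            # small ball-like
    return "PER_NODE_4-6"              # irregular candidates


print(peripheral_group(8.0, 0.7))
```

With a sphericity threshold of 0, every candidate would fall on the ball-like side of the second test, which is one way to read the observation that this setting mixes all shapes together.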

[0090] To further evaluate the performance of the proposed hierarchical classifier method, experimental results were compared with other existing CAD schemes. However, it is difficult to make a direct comparison for the following reasons: 1. Different CAD schemes usually use different testing databases. Different testing databases might contain different numbers of CT scans. Also, different CT scans might be derived from different CT protocols. 2. Different CAD schemes might be evaluated with different methods, such as N-fold cross-validation, leave-one-out cross-validation and independent test. 3. Different CAD schemes might use different lung segmentation methods, which have an effect on detection performance. It is believed that a relative comparison is helpful despite these limitations. Some recently reported CAD schemes for nodule detection are presented in Table 4.

Table 4. Comparison of CAD schemes

Sen is the sensitivity. FP/s is the false positives per scan.

[0091] The hierarchical classifier method achieved a sensitivity of 87% with 2.61 false positives per scan on the training dataset, and a sensitivity of 85.2% with 3.13 false positives per scan on the testing dataset. This method was compared with six recent CAD schemes, four of which were evaluated on an independent test. As shown in Table 4, only the hierarchical classifier method yielded sensitivity higher than 85% while maintaining fewer than 4.0 false positives per scan. Though Li's scheme had better results than the five others, it was tested only by cross-validation rather than on an independent dataset. Besides, Li's dataset was small, containing no more than one hundred CT scans. Murphy's method was tested on the largest number of CT scans, but it required too many features and its sensitivity was just 80%. The hierarchical classifier scheme and Ye's scheme used the fewest features, but Ye's scheme resulted in many more false positives. Therefore, as indicated by Table 4, the hierarchical classifier achieved better performance than the recently reported CAD schemes.

[0092] This hierarchical classifier method for lung nodules has at least two advantages. The first advantage is that candidate nodules are partitioned in a "soft" way rather than a "hard" way [19, 24]. In practice, it is impossible to partition nodules with rigorous conditions, because nodules vary widely in intensity and shape. For that reason, as many as thirteen nodes were used in the hierarchical lung nodule classifier. Some of the thirteen nodes might overlap. For example, the nodes targeting large candidate nodules (PER_NODE_1, CWL_NODE_1 and MED_NODE_1) are actually similar to each other in terms of size, as shown in FIG. 7A, FIG. 8A and FIG. 9A. Because of such overlap, accurate thresholds were not needed to control entry into nodes, since a true nodule that was rejected by one node might be accepted by another. As indicated in Table 3, only 4.17% and 4.93% of true nodules were entirely missed by the nodes in the training and test datasets, respectively. The second advantage is that specific types of candidate nodules can be processed by specific nodes. To partition nodules, the order of nodes is very useful. In the hierarchical classifier method, after determining the "Location", candidate nodules are partitioned in order from large and regular to small and irregular. Intuitively, large and regular nodules are more easily detected correctly than small and irregular nodules; thus, nodes targeting large and regular nodules yielded better performance than nodes targeting small and irregular nodules. As indicated by Table 3, all six nodes targeting large and regular nodules (PER_NODE_1-2, CWL_NODE_1-2 and MED_NODE_1-2; see FIG. 8A, FIG. 8B, FIG. 9A and FIG. 9B) could achieve a high DR close to 100% while maintaining a very low FR.
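The "soft" partitioning described above, in which a candidate rejected by one node may still be accepted by a later node with overlapping entry conditions, can be illustrated with a small ordered cascade. This Python sketch is an assumption-laden illustration rather than the implementation described herein: the `Node` structure, the feature names, the two example nodes and all threshold values in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Node:
    name: str
    enters: Callable[[dict], bool]   # entry condition (may overlap other nodes)
    accepts: Callable[[dict], bool]  # node-specific classifier decision


def classify(candidate: dict, nodes: Sequence[Node]) -> Optional[str]:
    """Try nodes in order; return the first accepting node's name.

    A candidate that no node accepts is rejected (None).
    """
    for node in nodes:
        if node.enters(candidate) and node.accepts(candidate):
            return node.name
    return None


# Hypothetical nodes, ordered from large/regular to small/irregular.
nodes = [
    Node("LARGE_REGULAR",
         enters=lambda c: c["diameter_mm"] >= 12.0,
         accepts=lambda c: c["sphericity"] >= 0.8),
    Node("SMALL_IRREGULAR",
         enters=lambda c: c["diameter_mm"] >= 3.0,
         accepts=lambda c: c["contrast"] >= 0.5),
]

# Enters the first node but is rejected there; the second node accepts it.
candidate = {"diameter_mm": 14.0, "sphericity": 0.6, "contrast": 0.7}
result = classify(candidate, nodes)
```

Because entry conditions overlap, the exact entry thresholds matter less than the node order, which is the point made in the paragraph above.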

[0093] Nodes targeting small and irregular nodules included PER_NODE_3-6, CWL_NODE_3 and MED_NODE_3-4 (See FIG. 8C through FIG. 8E, FIG. 9C and FIG. 9D). Among the seven nodes, the two nodes targeting irregular-shaped chestwall-nodules (CWL_NODE_3) and irregular peripheral-nodules with no attaching tissues (PER_NODE_4) had the best performance. The good performance of PER_NODE_4 is due to the non-attachment to any other tissues (See FIG. 8D). The high performance of CWL_NODE_3 lies in the anatomical fact that there were only a few small vessels and airways located in the region close to the lung wall. The two nodes targeting irregular-shape peripheral-nodules of different intensity to the attaching tissues (PER_NODE_5) and irregular-shape mediastinum-nodules of different intensity to the attaching tissues (MED_NODE_3) yielded intermediate performance. In PER_NODE_5 and MED_NODE_3, additional intensity information was introduced to separate nodules from adjacent vessels when only small parts of the nodule were attached to vessels (See FIG. 9E and FIG. 10C). The three nodes targeting very small peripheral-nodules (PER_NODE_3), irregular-shape peripheral-nodules of similar intensity to the attaching tissues (PER_NODE_6) and irregular-shape mediastinum-nodules of similar intensity to the attaching tissues (MED_NODE_4) were the least successful. For PER_NODE_3, very small candidate nodules (See FIG. 8C) are difficult to differentiate from pulmonary scars, which usually exist in damaged lungs and the lungs of older patients. For PER_NODE_6 and MED_NODE_4, candidate nodules were similar to small vessels in both shape and intensity, making them extremely difficult to detect. That is why PER_NODE_6 and MED_NODE_4 were placed at the end of the sequence of detectors.

[0094] Improvements in detection speed are achieved in some embodiments. Most of the time-consuming algorithms, such as Hessian-matrix-based enhancement filtering, fuzzy segmentation, and lung segmentation, among others, would be implemented in parallel on a graphics processing unit.

[0095] In summary, the hierarchical classifier method performed well, yielding high sensitivity and low false positives per scan on a large-scale dataset. The present method would be a useful tool for both routine diagnosis and screening studies on a wide variety of CT imaging protocols.

4. Computational Hardware overview

[0096] FIG. 3 is a block diagram that illustrates a computer system 300 upon which an embodiment of the invention may be implemented. Computer system 300 includes a communication mechanism such as a bus 310 for passing information between other internal and external components of the computer system 300. Information is represented as physical signals of a measurable phenomenon, typically electric voltages, but including, in other embodiments, such phenomena as magnetic, electromagnetic, pressure, chemical, molecular, atomic and quantum interactions. For example, north and south magnetic fields, or a zero and non-zero electric voltage, represent two states (0, 1) of a binary digit (bit). Other phenomena can represent digits of a higher base. A superposition of multiple simultaneous quantum states before measurement represents a quantum bit (qubit). A sequence of one or more digits constitutes digital data that is used to represent a number or code for a character. In some embodiments, information called analog data is represented by a near continuum of measurable values within a particular range. Computer system 300, or a portion thereof, constitutes a means for performing one or more steps of one or more methods described herein.

[0097] A sequence of binary digits constitutes digital data that is used to represent a number or code for a character. A bus 310 includes many parallel conductors of information so that information is transferred quickly among devices coupled to the bus 310. One or more processors 302 for processing information are coupled with the bus 310. A processor 302 performs a set of operations on information. The set of operations includes bringing information in from the bus 310 and placing information on the bus 310. The set of operations also typically includes comparing two or more units of information, shifting positions of units of information, and combining two or more units of information, such as by addition or multiplication. A sequence of operations to be executed by the processor 302 constitutes computer instructions.

[0098] Computer system 300 also includes a memory 304 coupled to bus 310. The memory 304, such as a random access memory (RAM) or other dynamic storage device, stores information including computer instructions. Dynamic memory allows information stored therein to be changed by the computer system 300. RAM allows a unit of information stored at a location called a memory address to be stored and retrieved independently of information at neighboring addresses. The memory 304 is also used by the processor 302 to store temporary values during execution of computer instructions. The computer system 300 also includes a read only memory (ROM) 306 or other static storage device coupled to the bus 310 for storing static information, including instructions, that is not changed by the computer system 300. Also coupled to bus 310 is a non-volatile (persistent) storage device 308, such as a magnetic disk or optical disk, for storing information, including instructions, that persists even when the computer system 300 is turned off or otherwise loses power.

[0099] Information, including instructions, is provided to the bus 310 for use by the processor from an external input device 312, such as a keyboard containing alphanumeric keys operated by a human user, or a sensor. A sensor detects conditions in its vicinity and transforms those detections into signals compatible with the signals used to represent information in computer system 300. Other external devices coupled to bus 310, used primarily for interacting with humans, include a display device 314, such as a cathode ray tube (CRT) or a liquid crystal display (LCD), for presenting images, and a pointing device 316, such as a mouse or a trackball or cursor direction keys, for controlling a position of a small cursor image presented on the display 314 and issuing commands associated with graphical elements presented on the display 314.

[0100] In the illustrated embodiment, special purpose hardware, such as an application specific integrated circuit (IC) 320, is coupled to bus 310. The special purpose hardware is configured to perform operations not performed by processor 302 quickly enough for special purposes. Examples of application specific ICs include graphics accelerator cards for generating images for display 314, cryptographic boards for encrypting and decrypting messages sent over a network, speech recognition, and interfaces to special external devices, such as robotic arms and medical scanning equipment that repeatedly perform some complex sequence of operations that are more efficiently implemented in hardware.

[0101] Computer system 300 also includes one or more instances of a communications interface 370 coupled to bus 310. Communications interface 370 provides a two-way communication coupling to a variety of external devices that operate with their own processors, such as printers, scanners and external disks. In general, the coupling is with a network link 378 that is connected to a local network 380 to which a variety of external devices with their own processors are connected. For example, communications interface 370 may be a parallel port or a serial port or a universal serial bus (USB) port on a personal computer. In some embodiments, communications interface 370 is an integrated services digital network (ISDN) card or a digital subscriber line (DSL) card or a telephone modem that provides an information communication connection to a corresponding type of telephone line. In some embodiments, a communications interface 370 is a cable modem that converts signals on bus 310 into signals for a communication connection over a coaxial cable or into optical signals for a communication connection over a fiber optic cable. As another example, communications interface 370 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN, such as Ethernet. Wireless links may also be implemented. Carrier waves, such as acoustic waves and electromagnetic waves, including radio, optical and infrared waves, travel through space without wires or cables. Signals include man-made variations in amplitude, frequency, phase, polarization or other physical properties of carrier waves. For wireless links, the communications interface 370 sends and receives electrical, acoustic or electromagnetic signals, including infrared and optical signals, that carry information streams, such as digital data.

[0102] The term computer-readable medium is used herein to refer to any medium that participates in providing information to processor 302, including instructions for execution. Such a medium may take many forms, including, but not limited to, non-volatile media, volatile media and transmission media. Non-volatile media include, for example, optical or magnetic disks, such as storage device 308. Volatile media include, for example, dynamic memory 304. Transmission media include, for example, coaxial cables, copper wire, fiber optic cables, and waves that travel through space without wires or cables, such as acoustic waves and electromagnetic waves, including radio, optical and infrared waves. The term computer-readable storage medium is used herein to refer to any medium that participates in providing information to processor 302, except for transmission media.

[0103] Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, a magnetic tape, or any other magnetic medium, a compact disk ROM (CD-ROM), a digital video disk (DVD) or any other optical medium, punch cards, paper tape, or any other physical medium with patterns of holes, a RAM, a programmable ROM (PROM), an erasable PROM (EPROM), a FLASH-EPROM, or any other memory chip or cartridge, a carrier wave, or any other medium from which a computer can read. The term non-transitory computer-readable storage medium is used herein to refer to any medium that participates in providing information to processor 302, except for carrier waves and other signals.

[0104] Logic encoded in one or more tangible media includes one or both of processor instructions on a computer-readable storage medium and special purpose hardware, such as ASIC 320.

[0105] Network link 378 typically provides information communication through one or more networks to other devices that use or process the information. For example, network link 378 may provide a connection through local network 380 to a host computer 382 or to equipment 384 operated by an Internet Service Provider (ISP). ISP equipment 384 in turn provides data communication services through the public, world-wide packet-switching communication network of networks now commonly referred to as the Internet 390. A computer called a server 392 connected to the Internet provides a service in response to information received over the Internet. For example, server 392 provides information representing video data for presentation at display 314.

[0106] The invention is related to the use of computer system 300 for implementing the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 300 in response to processor 302 executing one or more sequences of one or more instructions contained in memory 304. Such instructions, also called software and program code, may be read into memory 304 from another computer-readable medium such as storage device 308. Execution of the sequences of instructions contained in memory 304 causes processor 302 to perform the method steps described herein. In alternative embodiments, hardware, such as application specific integrated circuit 320, may be used in place of or in combination with software to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.

[0107] The signals transmitted over network link 378 and other networks through communications interface 370 carry information to and from computer system 300. Computer system 300 can send and receive information, including program code, through the networks 380, 390 among others, through network link 378 and communications interface 370. In an example using the Internet 390, a server 392 transmits program code for a particular application, requested by a message sent from computer 300, through Internet 390, ISP equipment 384, local network 380 and communications interface 370. The received code may be executed by processor 302 as it is received, or may be stored in storage device 308 or other non-volatile storage for later execution, or both. In this manner, computer system 300 may obtain application program code in the form of a signal on a carrier wave.

[0108] Various forms of computer readable media may be involved in carrying one or more sequences of instructions or data or both to processor 302 for execution. For example, instructions and data may initially be carried on a magnetic disk of a remote computer such as host 382. The remote computer loads the instructions and data into its dynamic memory and sends the instructions and data over a telephone line using a modem. A modem local to the computer system 300 receives the instructions and data on a telephone line and uses an infrared transmitter to convert the instructions and data to a signal on an infrared carrier wave serving as the network link 378. An infrared detector serving as communications interface 370 receives the instructions and data carried in the infrared signal and places information representing the instructions and data onto bus 310. Bus 310 carries the information to memory 304 from which processor 302 retrieves and executes the instructions using some of the data sent with the instructions. The instructions and data received in memory 304 may optionally be stored on storage device 308, either before or after execution by the processor 302.

[0109] FIG. 4 illustrates a chip set 400 upon which an embodiment of the invention may be implemented. Chip set 400 is programmed to perform one or more steps of a method described herein and includes, for instance, the processor and memory components described with respect to FIG. 3 incorporated in one or more physical packages (e.g., chips). By way of example, a physical package includes an arrangement of one or more materials, components, and/or wires on a structural assembly (e.g., a baseboard) to provide one or more characteristics such as physical strength, conservation of size, and/or limitation of electrical interaction. It is contemplated that in certain embodiments the chip set can be implemented in a single chip. Chip set 400, or a portion thereof, constitutes a means for performing one or more steps of a method described herein.

[0110] In one embodiment, the chip set 400 includes a communication mechanism such as a bus 401 for passing information among the components of the chip set 400. A processor 403 has connectivity to the bus 401 to execute instructions and process information stored in, for example, a memory 405. The processor 403 may include one or more processing cores with each core configured to perform independently. A multi-core processor enables multiprocessing within a single physical package. Examples of a multi-core processor include two, four, eight, or greater numbers of processing cores. Alternatively or in addition, the processor 403 may include one or more microprocessors configured in tandem via the bus 401 to enable independent execution of instructions, pipelining, and multithreading. The processor 403 may also be accompanied by one or more specialized components to perform certain processing functions and tasks, such as one or more digital signal processors (DSP) 407, or one or more application-specific integrated circuits (ASIC) 409. A DSP 407 typically is configured to process real-world signals (e.g., sound) in real time independently of the processor 403. Similarly, an ASIC 409 can be configured to perform specialized functions not easily performed by a general purpose processor. Other specialized components to aid in performing the inventive functions described herein include one or more field programmable gate arrays (FPGA) (not shown), one or more controllers (not shown), or one or more other special-purpose computer chips.

[0111] The processor 403 and accompanying components have connectivity to the memory 405 via the bus 401. The memory 405 includes both dynamic memory (e.g., RAM, magnetic disk, writable optical disk, etc.) and static memory (e.g., ROM, CD-ROM, etc.) for storing executable instructions that when executed perform one or more steps of a method described herein. The memory 405 also stores the data associated with or generated by the execution of one or more steps of the methods described herein.

5. Alterations, extensions and modifications

[0112] In the foregoing specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense. Throughout this specification and the claims, unless the context requires otherwise, the word "comprise" and its variations, such as "comprises" and "comprising," will be understood to imply the inclusion of a stated item, element or step or group of items, elements or steps but not the exclusion of any other item, element or step or group of items, elements or steps. Furthermore, the indefinite article "a" or "an" is meant to indicate one or more of the item, element or step modified by the article. As used herein, unless otherwise clear from the context, a value is "about" another value if it is within a factor of two (twice or half) of the other value. While example ranges are given, unless otherwise clear from the context, any contained ranges are also intended in various embodiments. Thus, a range from 0 to 10 includes the range 1 to 4 in some embodiments.
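The definitions of "about" and of contained ranges given above can be restated as two small predicates. This Python sketch is only an illustration of the stated wording; the function names `is_about` and `contains_range`, the restriction to positive values, and the inclusive boundaries are assumptions.

```python
def is_about(value: float, other: float) -> bool:
    """True if value is within a factor of two (twice or half) of other.

    Assumes both values are positive and treats the boundaries as inclusive.
    """
    if value <= 0 or other <= 0:
        raise ValueError("this sketch only handles positive values")
    return other / 2.0 <= value <= other * 2.0


def contains_range(outer, inner) -> bool:
    """True if the (low, high) range `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


print(is_about(15.0, 10.0))             # 15 lies between 5 and 20
print(contains_range((0, 10), (1, 4)))  # the example range from the text
```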

6. References.

[0113] The above description makes reference to the following publications.

1. Siegel, R., D. Naishadham, and A. Jemal, Cancer statistics, 2013. CA Cancer J Clin, 2013. 63(1): p. 11-30.

2. Roos, J.E., et al., Computer-aided detection (CAD) of lung nodules in CT scans: radiologist performance and reading time with incremental CAD assistance. Eur Radiol, 2010. 20(3): p. 549-57.

3. Beyer, F., et al., Comparison of sensitivity and reading time for the use of computer-aided detection (CAD) of pulmonary nodules at MDCT as concurrent or second reader. Eur Radiol, 2007. 17(11): p. 2941-7.

4. Cascio, D., et al., Automatic detection of lung nodules in CT datasets based on stable 3D mass-spring models. Comput Biol Med, 2012. 42(11): p. 1098-109.

5. Golosio, B., et al., A novel multithreshold method for nodule detection in lung CT. Med Phys, 2009. 36(8): p. 3607-18.

6. Guo, W. and Q. Li, High performance lung nodule detection schemes in CT using local and global information. Med Phys, 2012. 39(8): p. 5157-68.

7. Li, Q., S. Sone, and K. Doi, Selective enhancement filters for nodules, vessels, and airway walls in two- and three-dimensional CT scans. Med Phys, 2003. 30(8): p. 2040-51.

8. Matsumoto, S., et al., Pulmonary nodule detection in CT images with quantized convergence index filter. Med Image Anal, 2006. 10(3): p. 343-52.

9. Messay, T., R.C. Hardie, and S.K. Rogers, A new computationally efficient CAD system for pulmonary nodule detection in CT imagery. Med Image Anal, 2010. 14(3): p. 390-406.

10. Murphy, K., et al., A large-scale evaluation of automatic pulmonary nodule detection in chest CT using local image features and k-nearest-neighbour classification. Med Image Anal, 2009. 13(5): p. 757-70.

11. Okada, K., D. Comaniciu, and A. Krishnan, Robust anisotropic Gaussian fitting for volumetric characterization of pulmonary nodules in multislice CT. IEEE Trans Med Imaging, 2005. 24(3): p. 409-23.

12. Paik, D.S., et al., Surface normal overlap: a computer-aided detection algorithm with application to colonic polyps and lung nodules in helical CT. IEEE Trans Med Imaging, 2004. 23(6): p. 661-75.

13. Tan, M., et al., A novel computer-aided lung nodule detection system for CT images. Med Phys, 2011. 38(10): p. 5630-45.

14. Ye, X., et al., Shape-based computer-aided detection of lung nodules in thoracic CT images. IEEE Trans Biomed Eng, 2009. 56(7): p. 1810-20.

15. Zhao, B., et al., Automatic detection of small lung nodules on CT utilizing a local density maximum algorithm. J Appl Clin Med Phys, 2003. 4(3): p. 248-60.

16. Franaszek, M., et al., Hybrid segmentation of colon filled with air and opacified fluid for CT colonography. IEEE Trans Med Imaging, 2006. 25(3): p. 358-68.

17. Kobashi, S. and J.K. Udupa, Fuzzy connectedness image segmentation for newborn brain extraction in MR images. Conf Proc IEEE Eng Med Biol Soc, 2013. 2013: p. 7136-9.

18. Kang, D.G., D.C. Suh, and J.B. Ra, Three-dimensional blood vessel quantification via centerline deformation. IEEE Trans Med Imaging, 2009. 28(3): p. 405-14.

19. Breiman, J., et al., Classification and Regression Trees. 1983.

20. Hafner, M., et al., Delaunay triangulation-based pit density estimation for the classification of polyps in high-magnification chromo-colonoscopy. Comput Methods Programs Biomed, 2012. 107(3): p. 565-81.

21. National Lung Screening Trial Research, T., et al., The National Lung Screening Trial: overview and study design. Radiology, 2011. 258(1): p. 243-53.

22. Frangi, A.F., et al. Multiscale Vessel Enhancement Filtering, in MICCAI. 1998. Berlin, Germany: Springer.

23. Bellotti, R., et al., A CAD system for nodule detection in low-dose lung CTs based on region growing and a new active contour model. Med Phys, 2007. 34(12): p. 4901-10.

24. Viola, P. and M.J. Jones, Robust Real-Time Face Detection. International Journal of Computer Vision, 2004. 57(2): p. 137-154.

25. Funama, Y., et al., Detection of nodules showing ground-glass opacity in the lungs at low-dose multidetector computed tomography: phantom and clinical study. J Comput Assist Tomogr, 2009. 33(1): p. 49-53.

26. Lu, L., et al., Fully automated colon segmentation for the computation of complete colon centerline in virtual colonoscopy. IEEE Trans Biomed Eng, 2012. 59(4): p. 996-1004.

27. Metz, C.E., Some practical issues of experimental design and data analysis in radiological ROC studies. Investigative Radiology, 1989. 24(3): p. 234-245.