Login| Sign Up| Help| Contact|

Patent Searching and Data


Title:
PATHOLOGICAL IMAGE ANALYSIS USING NEIGHBOURHOOD GRAPHS AND GEOMETRIC STATISTICS OF NUCLEI
Document Type and Number:
WIPO Patent Application WO/2013/052632
Kind Code:
A1
Abstract:
A method and system for automated pathological image analysis is disclosed. Nuclei are detected in a histological image. A relative neighborhood graph of the detected nuclei is generated. The relative neighborhood graph of the detected nuclei is pruned, and a descriptor of nuclei linkage is extracted from the pruned relative neighborhood graph. The histological image is then classified as benign or malignant based on the descriptor extracted from the pruned relative neighborhood graph, for example using a trained support vector machine (SVM).

Inventors:
BAHLMANN CLAUS (US)
NI JIE (US)
SINGH MANEESH KUMAR (US)
JOHNSON JEFFREY P (US)
CHEKKOURY IDRISSI ANDREI CHAKIB (US)
KHURD PARMESHWAR (US)
Application Number:
PCT/US2012/058704
Publication Date:
April 11, 2013
Filing Date:
October 04, 2012
Export Citation:
Click for automatic bibliography generation   Help
Assignee:
SIEMENS CORP (US)
International Classes:
G06V10/426; G06T7/00
Other References:
BILGIN C ET AL: "Cell-Graph Mining for Breast Tissue Modeling and Classification", 2007 ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY : [EMBC '07] ; LYON, FRANCE, 22 - 26 AUGUST 2007 ; [IN CONJUNCTION WITH THE BIENNIAL CONFERENCE OF THE SOCIÉTÉ FRANÇAISE DE GÉNIE BIOLOGIQUE ET MÉDICAL (SFGB, 22 August 2007 (2007-08-22), pages 5311 - 5314, XP031337419, ISBN: 978-1-4244-0787-3
DIOGO VIEIRA ANDRADE ET AL: "Good Approximations for the Relative Neighbourhood Graph", PROCEEDINGS OF THE 13TH CANADIAN CONFERENCE ON COMPUTATIONAL GEOMETRY, 1 January 2001 (2001-01-01), pages 25 - 28, XP055050700
VIRIYASAKSATHIAN B ET AL: "Automatic counting method using fast radial symmetry transform", ROBOTICS AND BIOMIMETICS, 2008. ROBIO 2009. IEEE INTERNATIONAL CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 22 February 2009 (2009-02-22), pages 2174 - 2177, XP031465929, ISBN: 978-1-4244-2678-2
BAHRAM PARVIN ET AL: "Iterative Voting for Inference of Structural Saliency and Characterization of Subcellular Events", IEEE TRANSACTIONS ON IMAGE PROCESSING, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 16, no. 3, 1 March 2007 (2007-03-01), pages 615 - 623, XP011165360, ISSN: 1057-7149, DOI: 10.1109/TIP.2007.891154
RIGHETTI F ET AL: "2D-CELL: image processing software for extraction and analysis of 2-dimensional cellular structures", COMPUTER PHYSICS COMMUNICATION, ELSEVIER SCIENCE PUBLISHERS B.V., AMSTERDAM, NL, vol. 67, no. 3, 1 January 1992 (1992-01-01), pages 509 - 526, XP024454555, ISSN: 0010-4655, [retrieved on 19920101], DOI: 10.1016/0010-4655(92)90056-5
COSATTO ET AL.: "Grading Nuclear Pleomorphism on Histological Micrographs", ICPR, IEEE, 2008, pages 1 - 4, XP031411705
Attorney, Agent or Firm:
CONOVER, Michele L. et al. (170 Wood Avenue SouthIselin, New Jersey, US)
Download PDF:
Claims:
CLAIMS:

1 . A method for automated analysis of a histological image, comprising:

detecting nuclei in the histological image;

generating a relative neighborhood graph of the detected nuclei;

pruning the relative neighborhood graph of the detected nuclei;

extracting a descriptor of nuclei linkage from the pruned relative neighborhood graph; and

classifying the histological image as benign or malignant based on the descriptor extracted from the pruned relative neighborhood graph.

2. The method of claim 1 , wherein the step of detecting nuclei in the histological image comprises:

detecting ellipses in the histological image using a generalized fast radial symmetry (GFRS) transform, wherein each ellipse is represented as an affine transformation of a unit circle.

3. The method of claim 1 , wherein the step of detecting nuclei in the histological image comprises:

detecting ellipses in the histological image by generating a plurality of response maps of the histological image corresponding to a plurality of ellipses sampled over a constrained set of parameters for orientation and lengths of semi- major and semi-minor axes, wherein each response map represents a confidence for a corresponding one of the plurality of ellipses to be centered at each of a plurality of pixels in the response map; and

creating a list of candidate ellipses from each of the plurality of response maps, by performing spatial non-maximum suppression for each response map; and

detecting all ellipses in the histological image using a sequential selection process which selects a non-overlapping set of ellipses with highest confidence values from the list of candidate ellipses.

4. The method of claim 3, wherein the step of detecting ellipses in the histological image by generating a plurality of response maps of the histological image corresponding to a plurality of ellipses sampled over a constrained set of parameters for orientation and lengths of semi-major and semi-minor axes comprises:

for each of the plurality of ellipses, determining a response to the ellipse centered at each of the plurality of pixels in the response map based on voting directions calculated for each of a plurality of pixels in the histological image.

5. The method of claim 4, wherein the step of determining a response to the ellipse centered at each of the plurality of pixels in the response map based on voting directions calculated for each of a plurality of pixels in the histological image comprises:

for each of the plurality of pixels in the histological image:

calculating a voting direction for each pixel in the histological image; for each pixel for which the calculated voting direction points towards the pixel at which the ellipse is centered, incrementing a vote total for the ellipse being located at that pixel in the corresponding response map; and

for each pixel for which the calculated voting direction points away from the pixel at which the ellipse is centered, decrementing a vote total for the ellipse being located at that pixel in the corresponding response map.

6. The method of claim 5, wherein the step of calculating a voting direction for each pixel in the histological image comprises:

for each pixel in the histological image:

calculating a transformed tangent vector in circle space based on a tangent vector at that pixel and an affine transformation that transforms a unit circle to the ellipse;

calculating a transformed normal vector in the circle space based on the transformed tangent vector in the circle space; and calculating the voting direction based on the transformed normal vector and the affine transformation.

7. The method of claim 3, wherein the step of detecting all ellipses in the histological image using a sequential selection process which selects a non- overlapping set of ellipses with highest confidence values from the list of candidate ellipses comprises:

(a) ranking the detected candidate ellipses from all of the plurality of response maps based on confidence values;

(b) selecting an ellipse having the highest confidence value from a remaining set of the detected candidate ellipses;

(c) if a pixel region of the selected ellipse does not interfere with a pixel region of another ellipse already included in a set of final detection results, adding the selected ellipse to the set of final detection results;

(d) removing the selected ellipse from the remaining set of the detected candidate ellipses; and

(e) repeating steps (b)-(d) until no detected candidate ellipse remains in the remaining set of the detected candidate ellipses.

8. The method of claim 1 , wherein the step of generating a relative

neighborhood graph of the detected nuclei comprises:

generating an approximated relative neighborhood graph of the detected nuclei using an Urquhart graph.

9. The method of claim 1 , wherein the step of generating a relative

neighborhood graph of the detected nuclei comprises:

generating a graph of vertices and edges, each vertex corresponding to a detected nuclei, and each edge connecting a pair of detected nuclei; and

calculating a weight for each edge based on ellipse confidences and measurements of dissimilarity of location, orientation, and scale between the nuclei detected by the edge.

10. The method of claim 9, wherein the step of pruning the relative neighborhood graph of the detected nuclei comprises:

removing edges having weights lower than a threshold;

removing an edge having the lower weight at a junction of two edges, when an angle between the two edges at the junction is within a threshold of π / 2 ;

for junctions of three edges, calculating the angle between each pair of two edges, keeping two of the edges having the angle closest to π , and removing the remaining edge; and

removing isolated edges connecting two nuclei that are not connected to any other nuclei.

1 1 . The method of claim 1 , wherein the step of extracting a descriptor of nuclei linkage from the pruned relative neighborhood graph comprises:

detecting connected components in the pruned relative neighborhood graph;

summing edge weights for each connected component; and

generating a histogram of summed edge weights for the connected components detected in the pruned relative neighborhood graph.

12. The method of claim 1 1 , wherein the step of classifying the histological image as benign or malignant based on the descriptor extracted from the pruned relative neighborhood graph comprises:

classifying the histological image as benign or malignant based on the histogram of summed edge weights for the connected components detected in the pruned relative neighborhood graph using a trained support vector machine (SVM).

13. An apparatus for automated analysis of a histological image, comprising: means for detecting nuclei in the histological image;

means for generating a relative neighborhood graph of the detected nuclei; means for pruning the relative neighborhood graph of the detected nuclei; means for extracting a descriptor of nuclei linkage from the pruned relative neighborhood graph; and

means for classifying the histological image as benign or malignant based on the descriptor extracted from the pruned relative neighborhood graph.

14. The apparatus of claim 13, wherein the means for detecting nuclei in the histological image comprises:

means for detecting ellipses in the histological image using a generalized fast radial symmetry (GFRS) transform, wherein each ellipse is represented as an affine transformation of a unit circle.

15. The apparatus of claim 13, wherein the means for detecting nuclei in the histological image comprises:

means for detecting ellipses in the histological image by generating a plurality of response maps of the histological image corresponding to a plurality of ellipses sampled over a constrained set of parameters for orientation and lengths of semi-major and semi-minor axes, wherein each response map represents a confidence for a corresponding one of the plurality of ellipses to be centered at each of a plurality of pixels in the response map;

means for creating a list of candidate ellipses from each of the plurality of response maps;

means for detecting all ellipses in the histological image from the list of candidate ellipses.

16. The apparatus of claim 15, wherein the means for detecting ellipses in the histological image by generating a plurality of response maps of the histological image corresponding to a plurality of ellipses sampled over a constrained set of parameters for orientation and lengths of semi-major and semi-minor axes comprises:

means for determining a response to the ellipse centered at each of the plurality of pixels in the response map based on voting directions calculated for each of a plurality of pixels in the histological image.

17. The apparatus of claim 13, wherein the means for generating a relative neighborhood graph of the detected nuclei comprises:

means for generating an approximated relative neighborhood graph of the detected nuclei.

18. The apparatus of claim 13, wherein the means for generating a relative neighborhood graph of the detected nuclei comprises:

means for generating a graph of vertices and edges, each vertex corresponding to a detected nuclei, and each edge connecting a pair of detected nuclei; and

means for calculating a weight for each edge based on ellipse confidences and measurements of dissimilarity of location, orientation, and scale between the nuclei detected by the edge.

19. The apparatus of claim 13, wherein the means for extracting a descriptor of nuclei linkage from the pruned relative neighborhood graph comprises:

means for generating a histogram of summed edge weights for connected components in the pruned relative neighborhood graph.

20. The apparatus of claim 19, wherein the means for classifying the histological image as benign or malignant based on the descriptor extracted from the pruned relative neighborhood graph comprises:

means for classifying the histological image as benign or malignant based on the histogram of summed edge weights for connected components in the pruned relative neighborhood graph.

21 . A non-transitory computer readable medium storing computer program instructions for automated analysis of a histological image, the computer program instructions when executed on a processor cause the processor to perform operations comprising:

detecting nuclei in the histological image;

generating a relative neighborhood graph of the detected nuclei;

pruning the relative neighborhood graph of the detected nuclei; extracting a descriptor of nuclei linkage from the pruned relative

neighborhood graph; and

classifying the histological image as benign or malignant based on the descriptor extracted from the pruned relative neighborhood graph.

22. The non-transitory computer readable medium of claim 21 , wherein the operation of detecting nuclei in the histological image comprises:

detecting ellipses in the histological image using a generalized fast radial symmetry (GFRS) transform, wherein each ellipse is represented as an affine transformation of a unit circle.

23. The non-transitory computer readable medium of claim 21 , wherein the operation of detecting nuclei in the histological image comprises:

detecting ellipses in the histological image by generating a plurality of response maps of the histological image corresponding to a plurality of ellipses sampled over a constrained set of parameters for orientation and lengths of semi- major and semi-minor axes, wherein each response map represents a

confidence for a corresponding one of the plurality of ellipses to be centered at each of a plurality of pixels in the response map;

creating a list of candidate ellipses from each of the plurality of response maps, by performing spatial non-maximum suppression for each response map; and

detecting all ellipses in the histological image using a sequential selection process which selects a non-overlapping set of ellipses with highest confidence values from the list of candidate ellipses.

24. The non-transitory computer readable medium of claim 23, wherein the operation of detecting ellipses in the histological image by generating a plurality of response maps of the histological image corresponding to a plurality of ellipses sampled over a constrained set of parameters for orientation and lengths of semi- major and semi-minor axes comprises: for each of the plurality of ellipses, determining a response to the ellipse centered at each of the plurality of pixels in the response map based on voting directions calculated for each of a plurality of pixels in the histological image.

25. The non-transitory computer readable medium of claim 24, wherein the operation of determining a response to the ellipse centered at each of the plurality of pixels in the response map based on voting directions calculated for each of a plurality of pixels in the histological image comprises:

for each of the plurality of pixels in the histological image:

calculating a voting direction for each pixel in the histological image; for each pixel for which the calculated voting direction points towards the pixel at which the ellipse is centered, incrementing a vote total for the ellipse being located at that pixel in the corresponding response map; and

for each pixel for which the calculated voting direction points away from the pixel at which the ellipse is centered, decrementing a vote total for the ellipse being located at that pixel in the corresponding response map.

26. The non-transitory computer readable medium of claim2 5, wherein the operation of calculating a voting direction for each pixel in the histological image comprises:

for each pixel in the histological image:

calculating a transformed tangent vector in circle space based on a tangent vector at that pixel and an affine transformation that transforms a unit circle to the ellipse;

calculating a transformed normal vector in the circle space based on the transformed tangent vector in the circle space; and

calculating the voting direction based on the transformed normal vector and the affine transformation.

27. The non-transitory computer readable medium of claim 24, wherein the operation of detecting all ellipses in the histological image using a sequential selection process which selects a non-overlapping set of ellipses with highest confidence values from the list of candidate ellipses comprises:

(a) ranking the detected candidate ellipses from all of the plurality of response maps based on confidence values;

(b) selecting an ellipse having the highest confidence value from a remaining set of the detected candidate ellipses;

(c) if a pixel region of the selected ellipse does not interfere with a pixel region of another ellipse already included in a set of final detection results, adding the selected ellipse to the set of final detection results;

(d) removing the selected ellipse from the remaining set of the detected candidate ellipses; and

(e) repeating steps (b)-(d) until no detected candidate ellipse remains in the remaining set of the detected candidate ellipses.

28. The non-transitory computer readable medium of claim 21 , wherein the operation of generating a relative neighborhood graph of the detected nuclei comprises:

generating an approximated relative neighborhood graph of the detected nuclei using an Urquhart graph.

29. The non-transitory computer readable medium of claim 21 , wherein the operation of generating a relative neighborhood graph of the detected nuclei comprises:

generating a graph of vertices and edges, each vertex corresponding to a detected nuclei, and each edge connecting a pair of detected nuclei; and

calculating a weight for each edge based on ellipse confidences and measurements of dissimilarity of location, orientation, and scale between the nuclei detected by the edge.

30. The non-transitory computer readable medium of claim 29, wherein the operation of pruning the relative neighborhood graph of the detected nuclei comprises:

removing edges having weights lower than a threshold; removing an edge having the lower weight at a junction of two edges, when an angle between the two edges at the junction is within a threshold of π/ 2 ;

for junctions of three edges, calculating the angle between each pair of two edges, keeping two of the edges having the angle closest to π , and removing the remaining edge; and

removing isolated edges connecting two nuclei that are not connected to any other nuclei.

31 . The non-transitory computer readable medium of claim 21 , wherein the operation of extracting a descriptor of nuclei linkage from the pruned relative neighborhood graph comprises:

detecting connected components in the pruned relative neighborhood graph;

summing edge weights for each connected component; and

generating a histogram of summed edge weights for the connected components detected in the pruned relative neighborhood graph.

32. The non-transitory computer readable medium of claim 31 , wherein the operation of classifying the histological image as benign or malignant based on the descriptor extracted from the pruned relative neighborhood graph comprises: classifying the histological image as benign or malignant based on the histogram of summed edge weights for the connected components detected in the pruned relative neighborhood graph using a trained support vector machine (SVM).

Description:
PATHOLOGICAL IMAGE ANALYSIS USING NEIGHBOURHOOD GRAPHS AND GEOMETRIC

STATISTICS OF NUCLEI

[0001] This application claims the benefit of U.S. Provisional Application No. 61/543,344, filed October 5, 201 1 , the disclosure of which is herein incorporated by reference.

BACKGROUND OF THE INVENTION

[0002] The present invention relates to automated analysis of pathological images, and more particularly to automatically distinguishing between benign and malignant biopsy samples based on corresponding histological images.

[0003] Breast cancer is the second leading cause of cancer-related death in women. Traditionally, histopathological images of biopsy samples are examined by pathologists to determine the presence and malignancy of cancer. The Bloom- Richardson (BR) grading system is commonly used for breast cancer grading. The BR grading system examiners the tubularity, the nuclei pleomorphism, and the mitotic index, and then determined how aggressive the cancer is based on these three characteristics. However, agreement between pathologists regarding cancer grades of biopsy samples using the BR grading system can be quite low due to large variations in personal experience. Accordingly, the development of an automated cancer diagnosis system is desirable.

BRIEF SUMMARY OF THE INVENTION

[0004] The present invention provides a method and system for automated analysis of histological images to distinguish between benign and malignant biopsy samples. Embodiments of the present invention detect nuclei in a histological image using an ellipse detection method built upon the concept of fast radial symmetry transform. Embodiments of the present invention build a relative neighborhood graph using the nuclei centroids as vertices, and then quantify the strength of nuclei linkage based on the spatial distances between the nuclei centroids and the ellipse characteristics of the detected nuclei. Descriptors based on these nuclei linkages are then fed into a statistical (or machine learning based) classifier for classification of cancerous or non-cancerous tissue.

[0005] In one embodiment of the present invention, nuclei are detected in a histological image. A relative neighborhood graph of the detected nuclei is generated. The relative neighborhood graph of the detected nuclei is pruned, and a descriptor of nuclei linkage is extracted from the pruned relative neighborhood graph. The histological image is classified as benign or malignant based on the descriptor extracted from the pruned relative neighborhood graph.

[0006] These and other advantages of the invention will be apparent to those of ordinary skill in the art by reference to the following detailed description and the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] FIG. 1 illustrates a method for automatically analyzing a histological image according to an embodiment of the present invention;

[0008] FIG. 2 illustrates an exemplary color transformed image;

[0009] FIG. 3 illustrates an algorithm for calculating the voting direction for the GFRS transform according to an embodiment of the present invention;

[0010] FIG. 4 illustrates detection results for an exemplary ellipse;

[0011] FIG. 5 illustrates an algorithm for post processing to remove overlapping ellipses according to an embodiment of the present invention;

[0012] FIG. 6 illustrates exemplary nuclei detection results;

[0013] FIG. 7 illustrates exemplary approximated relative neighborhood graphs generated based on detected nuclei;

[0014] FIG. 8 illustrates exemplary junction patterns for which edges are removed from the approximated relative neighborhood graph;

[0015] FIG. 9 illustrates pruned graphs and corresponding histograms for exemplary histological images; and

[0016] FIG. 10 is a high level block diagram of a computer capable of implementing the present invention. DETAILED DESCRIPTION

[0017] The present invention is directed to a method and system for automated analysis of histological images to distinguish between benign and malignant biopsy samples. Embodiments of the present invention are described herein to give a visual understanding of automated histological image analysis method. A digital image is often composed of digital representations of one or more objects (or shapes). The digital representation of an object is often described herein in terms of identifying and manipulating the objects. Such manipulations are virtual manipulations accomplished in the memory or other circuitry / hardware of a computer system. Accordingly, it is to be understood that embodiments of the present invention may be performed within a computer system using data stored within the computer system.

[0018] FIG. 1 illustrates a method for automatically analyzing a histological image according to an embodiment of the present invention. As illustrated in FIG. 1 , at step 102, a histological image is received. The histological image can be an image of a tissue sample, such as a biopsy sample, at a microscopic level. The histological image can be generated by a microscope, such as an electron microscope or a light microscope. The tissue sample can be stained with hematoxylin and eosin (H&E) prior to generating the histological image, such that the histological image is an H&E stained image. The histological image can be pre-processed to retrieve the color vectors H and E in CMY color space using a color transform method. For example, the color transform method described in Cosatto et al., "Grading Nuclear Pleomorphism on Histological Micrographs", in ICPR, IEEE, 2008, pp. 1 -4, which is incorporated herein by reference, can be used to retrieve the H and E color vectors. As embodiments of the present invention are specifically interested in the nuclei, the H color vector, which binds to the nuclear chromatin to enhance the contrast between the nuclei and the background, is used in an advantageous embodiment of the present invention. The received histological image can be a pre-processed histological image, which has been pre-processed to retrieve the H color vector prior to receiving the image. Alternatively, the received histological image can be received directly from a microscope imaging device and the pre-processing performed on the received histological image prior to step 104. FIG. 2 illustrates an exemplary color transformed image. As illustrated in FIG. 2, image (a) is an original color image in which the nuclei are stained blue from the hematoxylin and the cytoplasm is stained pink from the eosin. Image (b) is a H channel image generated by retrieving the H color vector from the original color image shown in image (a).

[0019] Returning to FIG. 1 , at step 104, nuclei are detected in the histological image. In an advantageous embodiment of the present invention, the knowledge that nuclei usually have an elliptical shape is utilized in order to formulate nuclei detection as an ellipse detection problem.

[0020] One way to detect an elliptical shape in an image is to use the generalized Hough transform. As five parameters are typically required to define an ellipse, the generalized Hough transform needs a five dimensional accumulator which is very time consuming and computationally expensive. Related work on Hough-like transform based ellipse detection mainly focus on reducing the accumulator space through defining geometric constraints and finding an efficient parametric representation of the ellipse.

[0021] In an advantageous embodiment of the present invention, an ellipse is represented as an affine transform of a unit circle. Let p be the parameterization of a circle:

ΪΚ ) ::::· { < : V ) ί¾ , SHI {&} + yY , 0 < < 2

J where c = {c x ,c y is the center of the circle. In this case, an ellipse q at the same location c with orientation Θ , and (a, b) as the length of the semi-major and semi- minor axes, respectively, can be represented by a suitable affine transformation G of the circle, such that: (1 ) where R and S are the rotation matrix and scaling matrix, respectively. Let A(2) be the group of affine transformations of the plane, then G can be restricted to A(2) to ensure uniqueness of the affine transformation.

[0022] As dense sampling in the five dimensional parameter space (c x ,c y ,0,a,b) to determine the Hough-like transform cumbersome and time consuming, geometric constraints and efficient voting algorithms are advantageous to reduce the operational complexity. It can be noted that an ellipse has the characteristics of radial symmetry. The fast radial symmetry (FRS) transform is a radial symmetry operator to detect points of high radial symmetry and is effective at locating circular regions. In an advantageous embodiment, the voting scheme of the FRS transform is extended to vote for an ellipse by representing an ellipse as an affine transform of a circle.

[0023] The FRS transform method is summarized as follows. For each radius n , the algorithm uses image gradients to vote for both the positively-affected pixel and negatively-affected pixel, which are defined as:

¾IP)

-«»1 ) · · · · · · · P

p +ve (p) corresponds to a pixel which the gradient g(p) is pointing towards, and _ ve (p) corresponds to a pixel which the gradient g(p) is pointing away from. From these pixels, an orientation projection image O N and a magnitude projection image M n are generated. In particular, for each positively affected pixel, the corresponding point p +ve (p) in O N and M n is increased by 1 and ||g(p , respectively. Similarly, for each negatively affected pixel, the corresponding point p _ ve (p) in O N and M n is increased by 1 and , respectively. The radial symmetry response map is then defined as:

where

(2)

(3)

A n is a Gaussian smoothing function, a is the radial strictness parameter, and k n is a scaling factor across different radii.

[0024] While the FRS transform is very effective at detecting circular symmetries, ambiguity would arise for detecting shapes with geometric variation, such as an ellipse where the gradient direction deviates from the radial vector. This bias can lead to diffusion and dispersion of the locus of symmetry in the object space. In an advantageous embodiment, the framework of the FRS is modified to address the bias in the voting direction using a Generalized Fast Radial Symmetry (GFRS) transform.

[0025] For radially-symmetric structures like ellipses that are not perfectly circular, the image gradient is not an unbiased estimator for the direction of the centroid. For such cases, the GRFS transform can be utilized. The GRFS transform exploits the insight that a unit circle can be transformed to any conic using a parameterized sub-class of affine spatial transformations.

[0026] Let G a be some such transformation for the parameter a . In this case, the image of a given ellipse can be obtained by applying the spatial transformation G a to the image of a unit circle for an appropriate a . Since G a is linear on the spatial domain R 2 , all linear subspaces of this domain are also mapped by (restrictions of) the same transformation G a . This includes the tangents at corresponding points on the circle and ellipse as well as the voting directions. In addition, the voting direction can be estimated from the local differential structure of the image. Further, G a achieves a natural parameterization of the space of ellipses in a manner that FRS can easily be embedded into GFRS by appropriately sampling G a and using it to modify the voting direction. However, the voting direction is now unbiased. Similar to FRS, this results in a set of response images, parameterized by a , which undergo a post-processing step of non-maximum suppression.

[0027] FIG. 3 illustrates an algorithm 300 for calculating the voting direction for the GFRS transform according to an embodiment of the present invention. Let ρ(φ), q{<f) , 0 < φ≤2π be the corresponding points on the circle and ellipse, respectively. T p ^ and r ?w are the tangent vectors at ρ{φ) and q ) , respectively, and Ν ρ ^ and N ?w are the normal vectors at ρ(φ) and q^) , respectively. The voting direction can then be calculated using the steps of the algorithm 300 of FIG. 3. As illustrated in FIG. 3, at step 302, the tangent vector r ?w at a point q^) on an ellipse a point is transformed into the tangent vector T p ^ in the circle space based on the affine transformation G a . At step 304, the transformed normal vector N p ^ in the circle space is calculated based on the affine transformation of the transformed tangent vector T p ^ . At step 306, the voting direction ?w for the centroid of the ellipse is calculated based on the affine transformation G a and the transformed normal vector N p ^ in the circle space. The algorithm 300 of FIG. 3 is performed for each (voting) pixel of the input image. And the responses are accumulated in the response map image. Each pixel of the response map represents the confidence of the ellipse being centered at that (accumulator) pixel. The response map is indexed by ellipse parameters given by G a .

[0028] The results of ellipse detection using the GFRS transform can be illustrated with an example, in which the affine transformation matrix is set to: i.e., θ = π/4, α = 6, ΰ = 4 . FIG. 4 illustrates detection results for the exemplary ellipse with the parameters θ = π/ 4, α = 6, δ = 4 . The present inventors selected this exemplary set of parameters based on the observation that the length of the semi- major and semi-minor axes of detected nuclei are often around 6 and 4 pixels, respectively, however, the choice of orientation of the detected nuclei is only for illustrative purposes. The ellipse detection can be repeated with different affine parameters to detect nuclei of varying sizes and orientations.

[0029] In the above described GFRS ellipse detection process, the five dimensional parameter space of an ellipse is decomposed into a three dimensional subspace of the affine group A(2) and a two dimensional location subspace. For each point sampled from the constrained affine group A(2) , the modified radial symmetry transform is applied to detect an ellipse of a particular shape. Hence, through densely sampling the three dimensional parameter space and applying efficient voting in the two dimensional location space, response maps of ellipses of all defined sizes and orientations can be obtained. Prior knowledge of the size and shape of nuclei can be exploited to quantize the parameter space for ellipse detection. As for an ellipse with orientation Θ which is in the range of [π, 2π] , it can be replaced by α - π without changing the set of image points on the ellipse. Accordingly, the geometric constraint that Θ is in the range of [θ, π] can be added, and in a possible implementation, a number (e.g., 8) discrete sampling points from the orientation space can be sampled to cover the range: {θ ί , ϊ = 0, ...,7, θ ί = ϊ * πΙ%) .

Next, as the length of the semi-major and semi-minor axis should satisfy the constraint: b < a, along with the prior knowledge of the nuclei size, the aspect ratio can be defined to be s = a lb , and in a possible implementation the following quantization is used: b = [4, 6, 8], s = [1 .2, 1 .5, 1 .8, 2.3, 2.7]. Accordingly, in an exemplary implementation, 120 points are sampled in total from the subset of affine group A(2) . [0030] As the length of the major/minor axis of ellipse increases, so does the number of gradient votes from the perimeter of the ellipse. To alleviate this bias for large ellipses, it is advantageous to normalize across scale. For the FRS transform it is advantageous to scale O n and M n by the expected maximum value of O n and saturate O n at this value, and the expected maximum value of O n can be determined experimentally. Embodiments of the present invention can take a similar route and determine the normalizing factor k n for ellipses across different lengths of semi-minor axis.

[0031] Once response maps are detected for each of the ellipses sampled from the affine group A(2) , a non-maximum suppression like post processing is performed to remove overlaps among the detected ellipses. FIG. 5 illustrates an algorithm 500 for post processing to remove overlapping ellipses according to an embodiment of the present invention. For each response map obtained by sampling the subspace of affine transformation, an initial non-maximal suppression is performed in the 2D location space. After the initial non-maximal suppression, overlaps still exist among detected ellipses from different response maps. To deal with this, the detected ellipses ( Ep t , i = \,... ,n ) from all the response maps are ranked based on the confidence value { conf t ), and the pixel region ( α, ) associated with each ellipse is calculated. The confidence values are the response values normalized by the response value expected from a perfect ellipse for each response map. As illustrated in FIG. 5, starting (at 502) from an empty set (· ' $.), an ellipse with the highest confidence value is selected from the remaining ellipse set ( U ) (504). At each iteration, the selected ellipse is added to the set A if the associated pixel region of the selected ellipse does not overlap with associated pixel regions of the ellipses in set A (506), and then the selected ellipse is removed from the set U of ellipses to be considered (508). FIG. 6 illustrates exemplary nuclei detection results. As illustrated in FIG. 6 nuclei are detected in a histological image by detecting ellipses of different size, shape, and orientation.

[0032] Returning to FIG. 1 , at step 106, a relative neighborhood graph of the detected nuclei is generated. The spatial arrangement of nuclei can often be an indicator of cancer malignancy. In an advantageous embodiment of the present invention, a relative neighborhood graph (RNG) is generated in order to model the cell architecture. The RNG is defined by connecting two points p l and p 2 by an edge whenever there does not exist a third point p 3 that is closer to both p l and p 2 than they are to each other. In a possible implementation, the RNG can be efficiently generated using an Urquhart graph, which is an approximation to an RNG. The Urquhart graph is obtained by removing a longest edge from each triangle in the Delaunay triangulation. The detected nuclei are used as the vertex set to generate the approximated RNG. FIG. 7 illustrates exemplary approximated relative neighborhood graphs generated based on detected nuclei. As illustrated in FIG. 7, image (a) shows an exemplary approximated relative neighborhood graph generated based on detected nuclei in a histological image of a benign tissue sample, and image (b) shows an exemplary approximated relative neighborhood graph generated based on detected nuclei in a histological image of a malignant tissue sample.

[0033] From the viewpoint of a pathologist, a set of close-by nuclei, with pairwise similarity in size and orientation typically describe a healthy chain of nuclei along a tubule and, hence, are an indication of benignity. This knowledge, together with the information extracted from the ellipse detection, can be exploited to quantize the amount of nuclei group affinity. Let G = {V,E) denote the approximated RNG, where V = v l ,... ,v n is the set of nodes of nuclei, and each node is characterized by the features of its associated ellipse: location (x, y) , orientation θ , and semi-major and semi-minor axes lengths (a,b) . W = W tj , i,j = \, 2,... ,n is the set of weights for edges connected in the approximated RNG. The edge weights can be defined as a multivariate RBF kernel, as follows:

¾ ί ^ ex pi - X \ d:

The variables d t j , g t j , and s t j describe the dissimilarity of two neighboring nuclei i and j in terms of location, orientation, and scale, respectively, i.e., -- ¾ i } '

Here, for normalization along the multi-variate dimensions, L is the length of the longest edge in the graph, and m a and m b are the length of the longest semi-major and semi-minor axis of the detected ellipses, respectively. , λ 2 , and , are parameters that balance the importance of these three features. In this formulation, edges that connect parallel and consistent nuclei have high weights.

[0034] Returning to FIG. 1 , at step 108, the RNG of the detected nuclei is pruned. Once the approximated RNG is generated, the approximated RNG can be pruned using a plurality of pruning operations. In a first post-processing operation, low-weight edges (edges with weights below a particular threshold) are removed from the graph. In a second post-processing operation, certain junction patterns, which are not consistent with nuclei link features, are removed. FIG. 8 illustrates exemplary junction patterns for which edges are removed from the approximated RNG. The angle between two edges can be defined as:

< i -

UK !

p-1 where p l is the intersection point of two edges, and p 2 and p 3 are the other vertexes of the two edges. For two intersecting edges which have an angle close (e.g., within a certain threshold) to π 12 , the edge with the smaller weight is removed. Image (a) of FIG. 8 illustrates two intersecting edges 802 and 804 with an angle close to π/ 2. For three intersecting edges at the same node, the angle is calculated between every pair of two edges and the two edges whose angle is closest to π are kept, while the third edge is removed. Image (b) of FIG. 8 shows three intersecting edges 806, 808, and 810. In the example of image (b), edge 810 would be removed, as the angle between edges 806 and 808 is closest io π . In a third post-processing operation, isolated edges are removed. Isolated edges are edges that connect two vertices that do not connect to any other vertex.

[0035] Returning to FIG. 1 , at step 1 10, a descriptor is extracted from the pruned graph, based on the nuclei linkages in the pruned graph. From the pruned graph, the connected components, which correspond to the structure of nuclei links, can be detected. The present inventors have observed that benign cases usually have more connected components and longer connected components. To quantitize this "connectivity strength", edge weights of each connected component are summed up and a histogram of the sum of edge weights values for connected components over a fixed sized region (e.g., a 256 x 256 image) is calculated. This histogram is used as the descriptor for the nuclei linkages in the histological images. FIG. 9 illustrates pruned graphs and corresponding histograms for exemplary histological images. In particular, FIG. 9 shows pruned graphs 900 and 910 and corresponding histograms 905 and 915, respectively, for histological images of benign tissue samples, and pruned graphs 920 and 930 and corresponding histograms 925 and 935, respectively, for histological images of malignant tissue samples. As shown in FIG. 9, it can be observed that the benign images have significantly more connected components with high accumulated edge weights.

[0036] Returning to FIG. 1 , at step 1 12, the histological image is classified based on the extracted descriptor. In particular, the histological image can be classified as benign or malignant based on the extracted descriptor using statistical (or machine learning based) classification. In an advantageous embodiment, the histogram of accumulated edge weights of the connected components of the pruned graph is used as a descriptor and fed into a trained Support Vector Machine (SVM). The SVM is trained offline based on a set of training data. In particular, the SVM is trained based on histograms extracted from pruned graphs of histological images having a known classification. The trained SVM automatically classifies the histological image as benign or malignant based on the histogram extracted from the pruned graph in the current histological image. In a possible implementation, the SVM uses a radial basis function (RBF) kernel.

[0037] The above-described methods for automated analysis of a histological image may be implemented on a computer using well-known computer processors, memory units, storage devices, computer software, and other components. A high level block diagram of such a computer is illustrated in FIG. 10. Computer 1002 contains a processor 1004 which controls the overall operation of the computer 1002 by executing computer program instructions which define such operation. The computer program instructions may be stored in a storage device 1012, or other computer readable medium (e.g., magnetic disk, CD ROM, etc.) and loaded into memory 1010 when execution of the computer program instructions is desired. Thus, the steps of the method of FIGS. 1 , 3, and 5 may be defined by the computer program instructions stored in the memory 1010 and/or storage 1012 and controlled by the processor 1004 executing the computer program instructions. An image acquisition device 1020, such as an electron microscope or a light microscope, can be connected to the computer 1002 to input histological images to the computer 1002. It is possible to implement the image acquisition device 1020 and the computer 1002 as one device. It is also possible that the image acquisition device 1020 and the computer 1002 communicate wirelessly through a network. The computer 1002 also includes one or more network interfaces 1006 for communicating with other devices via a network. The computer 1002 also includes other input/output devices 1008 that enable user interaction with the computer 1002 (e.g., display, keyboard, mouse, speakers, buttons, etc.). One skilled in the art will recognize that an implementation of an actual computer could contain other components as well, and that FIG. 10 is a high level representation of some of the components of such a computer for illustrative purposes.

[0038] The foregoing Detailed Description is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description, but rather from the claims as interpreted according to the full breadth permitted by the patent laws. It is to be understood that the embodiments shown and described herein are only illustrative of the principles of the present invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention. Those skilled in the art could implement various other feature combinations without departing from the scope and spirit of the invention.