Title:
METHODS, SYSTEMS, AND DEVICES RELATING TO SHADOW DETECTION FOR REAL-TIME OBJECT IDENTIFICATION
Document Type and Number:
WIPO Patent Application WO/2017/177284
Kind Code:
A1
Abstract:
The various embodiments herein relate to shadow identification systems and methods for use in object identification. The embodiments include applying a smoothing filter to an original image, determining a threshold in the smooth image, and creating a mask based on the threshold.

Inventors:
REES STEVEN (AU)
TSCHARKE MATTHEW (AU)
Application Number:
PCT/AU2017/050340
Publication Date:
October 19, 2017
Filing Date:
April 14, 2017
Assignee:
UNIV OF SOUTHERN QUEENSLAND (AU)
International Classes:
G06T5/40; G06V10/34
Domestic Patent References:
WO2007145654A1 2007-12-21
WO2014165787A1 2014-10-09
Foreign References:
US20120219218A1 2012-08-30
US20070110309A1 2007-05-17
US8619151B2 2013-12-31
Claims:
What is claimed is:

1. A real-time and real-world environment method of shadow detection, the method comprising:

applying a smoothing filter to an original image to create a smoothed image;

determining an intensity threshold in the smoothed image; and

creating a mask based on the intensity threshold.

2. The method of claim 1, wherein the applying the smoothing filter comprises replacing a pixel intensity value of each pixel of interest in the image with a mean intensity value of neighboring pixels.

3. The method of claim 2, wherein the neighboring pixels comprise a window of pixels surrounding the pixel of interest.

4. The method of claim 3, wherein the window of pixels comprises a 3 x 3 area of pixels, a 10 x 10 area of pixels, or a 20 x 20 area of pixels.

5. The method of claim 1, wherein the determining the intensity threshold in the smoothed image comprises:

creating a graphical summary of pixel intensities in the smoothed image; and

determining the intensity threshold based on the graphical summary.

6. The method of claim 5, wherein the graphical summary is a histogram, and further wherein the determining the intensity threshold comprises identifying the minima between peaks in the histogram.

7. The method of claim 1, wherein the providing the original image comprises providing the original image in a grayscale image, a color image, a depth image, a fluorescence image, a thermal image, or an infrared image.

8. The method of claim 7, wherein the providing the original image in the grayscale image comprises capturing the original image in the grayscale image or converting the original image to the grayscale image.

9. The method of claim 1, wherein the mask comprises a binary mask image.

10. The method of claim 1, further comprising identifying shadowed regions in the original image by overlaying the mask over a color version of the original image.

11. The method of claim 1, further comprising applying a color correction method to the mask.

12. A real-time and real-world environment object identification system, the system comprising:

(a) a central controller component comprising a processor;

(b) a vision system operably coupled to the central controller component, the vision system configured to capture at least one original image of a target area; and

(c) a shadow detection module associated with the central controller component, the shadow detection module configured to:

(i) apply a smoothing filter to the at least one original image to create a smoothed image;

(ii) determine a threshold in the smoothed image;

(iii) create a mask based on the threshold; and

(iv) identify shadowed regions in the at least one original image by overlaying the mask over the at least one original image.

13. The system of claim 12, wherein the vision system is further configured to capture the at least one original image as a grayscale image.

14. The system of claim 12, wherein the shadow detection module is further configured to convert the at least one original image to a grayscale image.

15. The system of claim 12, wherein the mask comprises a binary mask image.

16. The system of claim 12, further comprising a color correction module associated with the central controller component, wherein the color correction module is configured to determine an amount of color correction based on the mask.

17. A real-time and real-world environment method of shadow detection, the method comprising:

applying a smoothing filter to an original image to create a smoothed image;

creating a graphical summary of pixel intensities in the smoothed image;

identifying a minima between peaks in the graphical summary to determine an intensity threshold in the smoothed image; and

creating a mask based on the intensity threshold.

18. The method of claim 17, wherein the graphical summary is a histogram.

Description:
METHODS, SYSTEMS, AND DEVICES RELATING TO SHADOW DETECTION FOR REAL-TIME OBJECT IDENTIFICATION

Cross-Reference to Related Application(s)

[001] This application claims priority to U.S. Provisional Application 62/323,173, filed April 15, 2016 and entitled "Methods, Systems, and Devices Relating to Shadow Detection for Real-Time Object Identification," which is hereby incorporated herein by reference in its entirety.

Field of the Invention

[002] The various embodiments disclosed herein relate to systems and methods for shadow detection used in object identification, including real-time object identification. Some specific exemplary embodiments include systems and methods for automated object selection, driverless vehicles, or plant identification.

Background of the Invention

[003] Machine vision technologies are used in a variety of different systems and methods, including, for example, driverless vehicles, automated object selection, and various other vision-aided robotic or automated systems. Several methods have been used for segmenting objects in the various machine vision technologies. In these various systems, varied lighting conditions can impact the effectiveness of the known machine vision technology and segmentation processes. One method for addressing the varied lighting conditions is a shadow detection process. However, known shadow detection processes have limitations, including incorrect segmentation caused by variation in the color of the light source. That is, the color variation in the light source degrades the segmentation quality so that either plant material captured in the image is missed or portions of non-plant material are incorrectly categorized as plant material. Another problem with the known process of real-time shadow detection and correction is processing time. That is, the known methods do not have the processing speed necessary for real-time systems. Processing time is limited in real-time systems such as automated spot spraying or vision guidance systems where the frame rate may need to be, for example, 30 frames per second ("FPS") or faster. In such exemplary known systems, a frame rate of 30 FPS or faster leaves less than 33 milliseconds ("ms") to compensate for shadows and daylight, identify the presence of the target object (such as a weed or crop row, for example), and determine the action required. The known methods cannot operate at that speed.

[004] There is a need in the art for improved systems and methods for shadow detection in real-time object identification.

Brief Summary of the Invention

[005] Discussed herein are various real-time and real-world environment shadow detection systems and methods for use in object identification.

[006] In Example 1, a real-time and real-world environment method of shadow detection comprises applying a smoothing filter to an original image to create a smoothed image, determining an intensity threshold in the smoothed image, and creating a mask based on the intensity threshold.

[007] Example 2 relates to the method according to Example 1, wherein the applying the smoothing filter comprises replacing a pixel intensity value of each pixel of interest in the image with a mean intensity value of neighboring pixels.

[008] Example 3 relates to the method according to Example 2, wherein the neighboring pixels comprise a window of pixels surrounding the pixel of interest.

[009] Example 4 relates to the method according to Example 3, wherein the window of pixels comprises a 3 x 3 area of pixels, a 10 x 10 area of pixels, or a 20 x 20 area of pixels.

[010] Example 5 relates to the method according to Example 1, wherein the determining the intensity threshold in the smoothed image comprises creating a graphical summary of pixel intensities in the smoothed image and determining the intensity threshold based on the graphical summary.

[011] Example 6 relates to the method according to Example 5, wherein the graphical summary is a histogram, and further wherein the determining the intensity threshold comprises identifying the minima between peaks in the histogram.

[012] Example 7 relates to the method according to Example 1, wherein the providing the original image comprises providing the original image in a grayscale image, a color image, a depth image, a fluorescence image, a thermal image, or an infrared image.

[013] Example 8 relates to the method according to Example 7, wherein the providing the original image in the grayscale image comprises capturing the original image in the grayscale image or converting the original image to the grayscale image.

[014] Example 9 relates to the method according to Example 1, wherein the mask comprises a binary mask image.

[015] Example 10 relates to the method according to Example 1, further comprising identifying shadowed regions in the original image by overlaying the mask over a color version of the original image.

[016] Example 11 relates to the method according to Example 1, further comprising applying a color correction method to the mask.

[017] In Example 12, a real-time and real-world environment object identification system comprises a central controller component comprising a processor, a vision system operably coupled to the central controller component, the vision system configured to capture at least one original image of a target area, and a shadow detection module associated with the central controller component. The shadow detection module is configured to apply a smoothing filter to the at least one original image to create a smoothed image, determine a threshold in the smoothed image, create a mask based on the threshold, and identify shadowed regions in the at least one original image by overlaying the mask over the at least one original image.

[018] Example 13 relates to the system according to Example 12, wherein the vision system is further configured to capture the at least one original image as a grayscale image.

[019] Example 14 relates to the system according to Example 12, wherein the shadow detection module is further configured to convert the at least one original image to a grayscale image.

[020] Example 15 relates to the system according to Example 12, wherein the mask comprises a binary mask image.

[021] Example 16 relates to the system according to Example 12, further comprising a color correction module associated with the central controller component, wherein the color correction module is configured to determine an amount of color correction based on the mask.

[022] In Example 17, a real-time and real-world environment method of shadow detection comprises applying a smoothing filter to an original image to create a smoothed image, creating a graphical summary of pixel intensities in the smoothed image, identifying a minima between peaks in the graphical summary to determine an intensity threshold in the smoothed image, and creating a mask based on the intensity threshold.

[023] Example 18 relates to the method according to Example 17, wherein the graphical summary is a histogram.

[024] While multiple embodiments are disclosed, still other embodiments of the present invention will become apparent to those skilled in the art from the following detailed description, which shows and describes illustrative embodiments of the invention. As will be realized, the invention is capable of modifications in various obvious aspects, all without departing from the spirit and scope of the present invention. Accordingly, the drawings and detailed description are to be regarded as illustrative in nature and not restrictive.

Brief Description of the Drawings

[025] FIG. 1 is a photographic image of a real-world environment in which shadowed and non-shadowed regions are to be identified.

[026] FIG. 2 is a version of the image of FIG. 1 that has been smoothed or blurred via a smoothing process, according to one embodiment.

[027] FIG. 3 is a histogram depicting a summary of the pixel intensities of the image of FIG. 2, according to one embodiment.

[028] FIG. 4 is a diagram depicting a specific smoothing process: a pixel intensity averaging process, according to one embodiment.

Detailed Description

[029] The various embodiments disclosed herein relate to real-time systems for identifying specific objects amongst several different objects in real-world environments utilizing an improved method of shadow detection. In other implementations, the various systems can use the shadow detection method in combination with a color correction method. Some implementations relate to various vision-aided automated systems such as vision guidance, automated object selection, or collision avoidance systems. Other specific embodiments relate to systems and methods for identifying specific plant species amongst several plant species utilizing the shadow detection process and, in some cases, the color correction process as well. In all the various systems and applications disclosed or contemplated herein, it is understood that the improved shadow detection methods and systems address varied lighting conditions to remove shadows from images and thereby process the images correctly for further use depending on the system or application.

[030] The automated identification of one or more specific objects amongst others utilizes machine vision technology. For purposes of this application, "machine vision" is the analysis of images to extract data for controlling a process or activity - it can be used to automate tasks typically performed by human visual inspection. In the various implementations herein, the machine vision technology is configured to identify specific objects.

[031] Certain system, method, and device embodiments described and contemplated herein relate to shadow detection in advanced driver assistance systems such as vision guidance systems, collision avoidance systems, lane changing systems, and autonomous vehicle vision systems. Alternatively, the various embodiments relate to shadow detection in security surveillance, industrial vision control systems (such as, for example, timber volume estimation and cut optimisation systems), vision-based sorting machines for food products and industrial components, and vision-based earth work systems (such as, for example, land levelling and fill systems). In other implementations, the systems, methods, and devices relate to shadow detection in any vision-aided robotic systems in which shadows may be encountered.

[032] Alternatively, various system, method, and device embodiments described herein relate to shadow detection in real-time identification of weed plants amongst crop plants and selectively spraying those weed plants with a pesticide in real world (as opposed to testing or lab) conditions. Alternative embodiments relate to selectively killing those weed plants by any other known means. Further implementations relate to incorporation of the various systems, methods, and devices disclosed and contemplated herein into either ground-based or aerial platforms.

[033] For purposes of this application, the term "real-time" describes a system that produces a correct result within a specified time, and more specifically for purposes of this application describes a machine vision system that is able to identify objects as the system progresses at an effective working speed. The systems, devices, and methods can be used in real world conditions that include a myriad of variations during use.

[034] It is also understood that the various systems, methods, and embodiments disclosed and contemplated herein can be used for any purpose that relates to identification of one or more specific objects amongst several different objects in situations in which shadows may be encountered. Various other identification implementations include medical applications such as identification of specific objects in a magnetic resonance image ("MRI") for diagnosis purposes or other such applications. Alternatively, other exemplary applications include any type of object sorting, including high-speed object sorting, such as the type of sorting necessary for conveyor-based operations relating to mining, foodstuff production, or packaging or package transport.

[035] The various embodiments disclosed and contemplated herein relate to systems, devices, and methods that utilize shadow detection in robust object detection systems that can operate in real-time and in real world conditions to successfully identify the target objects amongst several objects and then selectively take action based on that identification. More specifically, the various embodiments are configured to detect the bright and shadowed regions in an image scene for detecting the presence of a target object, such as, for example, plant material. In certain embodiments in which the target is plant material, the systems, devices, and methods enable the application of algorithms capable of segmenting plant material from background in real-time, real-world, no-till and traditional tillage situations. In a specific example, the various shadow detection embodiments disclosed or contemplated herein can be used in combination with the object identification systems, methods, and devices disclosed in pending International Application PCT/US15/29261, which was filed on May 5, 2015 and is entitled "Methods, Systems, and Devices Relating to Real-Time Object Identification," which is hereby incorporated herein by reference in its entirety. It is understood that the various shadow detection systems and methods can be incorporated as a module or method into any of the object identification embodiments disclosed in the '261 Application.

[036] It is understood that color correction can also be utilized in the object detection process and/or to further enhance the shadow identification. The various embodiments disclosed or contemplated herein can be used in conjunction with a color correction method that determines the amount of color correction in an image scene by finding the intensity and color of the light source and identifying the bright and shadowed regions and the intensity of the image scene. The system and method, which are disclosed in pending International Application PCT/IB2017/050719, which was filed on February 9, 2017 and is entitled "Imaging Device with White Balance Compensation and Related Systems and Methods," which is hereby incorporated herein by reference in its entirety, use a camera directed toward the light source to identify the intensity and color of that source.

[037] The shadow detection process, according to one embodiment, includes the following steps, as explained in further detail below. First, an image is captured. Next, a smoothing filter is applied to the image which "blurs" the image. After the smoothing filter is applied, a histogram is created of the pixel intensities in the smoothed image. The histogram is then used to identify any shadow regions by finding the minima between the shadow peaks (reflecting lower intensities) and bright peaks (reflecting higher intensities) in the histogram and thereby determining an intensity threshold that can then be applied to the smoothed image, thus creating a mask. Finally, the mask is then overlaid on the color image of the same scene to identify the shadowed regions in the scene.

[038] This shadow detection process, according to one embodiment, will be explained in additional detail below.

[039] FIG. 1 depicts an original image scene that will be analyzed using the shadow detection process, according to one embodiment, to identify any shadowed regions in the image. It should be noted that the image in FIG. 1 is a scene having a bright region 10 and a shadowed region 12. In accordance with one implementation, once the original image (like FIG. 1) is captured in color, it is converted into a grayscale image, and an averaging process is applied to the grayscale image. Alternatively, the image is captured in grayscale. In a further alternative, the averaging process can be applied to the image (including, for example, a color image) without the image first being converted into a grayscale image. The original captured image can be any known type of image, including, for example, a color, depth, thermal, infrared, or fluorescence image. Further, the shadow to be detected can originate from any known radiating energy source, including, for example, sunlight, heat, or artificial light.
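By way of non-limiting illustration, the optional color-to-grayscale conversion might be sketched in Python as follows. The function name `to_grayscale`, the representation of an image as rows of (R, G, B) tuples, and the use of the ITU-R BT.601 luma weights are assumptions made for this sketch; the disclosure does not specify a conversion formula.

```python
def to_grayscale(color_image, weights=(0.299, 0.587, 0.114)):
    """Convert rows of (R, G, B) tuples to rows of single intensity values.

    The default weights are the ITU-R BT.601 luma coefficients, chosen here
    as an assumption; the disclosure does not name a conversion formula.
    """
    wr, wg, wb = weights
    return [
        [round(wr * r + wg * g + wb * b) for (r, g, b) in row]
        for row in color_image
    ]
```

Any other weighting, or a single channel of the captured image, could be substituted without changing the later steps.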

[040] The smoothing (also referred to as "averaging" or "blurring") process (which can be accomplished using a known averaging filter, mean filter, Gaussian blur, or median filter, for example) is a known process used to "smooth" or "blur" an image. In one exemplary embodiment, a known averaging filter is used to produce the blur in the following fashion. The main idea of the averaging process is to run through the image pixel by pixel, replacing the specific pixel intensity value of each pixel of interest with the mean intensity value of neighboring pixels based upon a "window" of pixels surrounding that pixel of interest. As an example, FIG. 4 depicts a 3 x 3 window (a window that is sized to capture an area that is 3 pixels wide and 3 pixels high) with the pixels around the pixel of interest numbered as shown. It is understood that any other known window size can be used, such as a 10 x 10 or 20 x 20 window. The pixel intensity value inserted in the pixel of interest is the value resulting from the sum of the intensity values of pixels 1 to 8, which is then divided by 8 to arrive at the mean intensity value. It is understood that the pixel intensity can be representative of the color, saturation, heat, or other actual features of the image, depending on the source of the radiating energy. In the context of images, the end result of this process is that the intensities of the pixels are averaged, thereby lowering the contrast (range of pixel intensities) within the bright areas and dark areas. Alternatively, instead of smoothing, a known process of reducing the quantization levels of the image could be used.
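By way of non-limiting illustration, the neighbor-averaging step described above might be sketched in Python as follows. The function name `smooth_neighbors_mean`, the list-of-rows image representation, the restriction to odd window sizes (expressed as a radius), and the treatment of border pixels (averaging only the neighbors that fall inside the image) are assumptions of this sketch and are not stated in the disclosure.

```python
def smooth_neighbors_mean(image, radius=1):
    """Replace each pixel with the mean intensity of its neighbors.

    image: list of rows of intensity values (assumed grayscale).
    radius: 1 gives the 3 x 3 window of FIG. 4 (8 neighbors).
    Border handling is an assumption: only neighbors inside the image count.
    """
    height, width = len(image), len(image[0])
    smoothed = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the pixel of interest itself is excluded
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total += image[ny][nx]
                        count += 1
            # A 1 x 1 image has no neighbors; keep the original value.
            smoothed[y][x] = total / count if count else float(image[y][x])
    return smoothed
```

For an interior pixel with a radius of 1, this sums the eight surrounding intensities and divides by 8, matching the FIG. 4 example. A real-time system would more likely use an optimized library filter; this loop only shows the arithmetic.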

[041] FIG. 2 depicts a smoothed or "blurred" version of the image of FIG. 1 after going through the averaging process described above. As can be seen in the figure, the brightest bright spots of the bright region 10 have been lowered in intensity, while the dark areas of the shadowed region 12 have been made more homogenous (been "smoothed" such that specific objects are more difficult to identify).

[042] In certain implementations, the amount of smoothing or blurring required can vary depending on the resolution and dynamic range of the image in question. For example, for an image captured with a 640 x 480 resolution camera with a dynamic range of 50 dB, an averaging over a 10 x 10 window to a 20 x 20 window is satisfactory. If the amount of smoothing is too much, the image will be smoothed to the point where the bright and dark portions are less distinguishable (less separable) and smaller shadows can go undetected. If the amount of smoothing isn't enough, the data has too much resolution and no peaks are produced in the histogram.

[043] As mentioned above, once the image has been smoothed, a histogram of the pixel intensities is created with a bin size that depends on the resolution and dynamic range in the image. A bin is what the pixel values are sorted into. For example, if a bin size of 128 was used, there would only be 2 bins, because the pixel intensity range is between 0 and 255. That is, all pixels with a value of 128 or less would be sorted into Bin 1, and all pixels with a value greater than 128 would be sorted into Bin 2. Hence, if there were 2,500 pixel values of 128 or lower, then Bin 1 would have the value of 2,500 in it, while Bin 2 would hold the number of pixels above 128.
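By way of non-limiting illustration, the binning described above might be sketched in Python as follows. The function name `intensity_histogram`, the integer-division bin assignment, and the folding of the topmost intensities into the last bin are assumptions of this sketch. Under this convention a value of exactly 128 falls in the upper bin when the bin size is 128, which may differ from the boundary treatment described above.

```python
def intensity_histogram(image, bin_size, max_value=255):
    """Count pixels per intensity bin.

    The bin count is ceil(max_value / bin_size), an assumption chosen so
    that a bin size of 128 yields 2 bins and a bin size of 15 yields 17.
    """
    num_bins = (max_value + bin_size - 1) // bin_size
    counts = [0] * num_bins
    for row in image:
        for value in row:
            # Values that would overflow the last bin are folded into it.
            index = min(int(value) // bin_size, num_bins - 1)
            counts[index] += 1
    return counts
```

Each entry of the returned list is the number of pixels whose intensity falls in the corresponding bin.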

[044] In the exemplary image depicted in FIG. 2, a bin size of 15 was used (with a total of 17 bins). The histogram can be used to identify the bright regions and shadow regions in an image such that no shadow regions go undetected. FIG. 3 depicts the histogram summarizing the pixel intensities of the averaged image of FIG. 2. In a histogram, the bright and dark areas are seen as separate peaks in the histogram with the size of the peaks relative to the size of the shadow and bright regions. The histogram of FIG. 3 has two peaks 20, 22 because, as discussed above, there are two regions in the image of FIG. 1: one bright region 10 and one shadow region 12. The more regions in the image, the greater the number of peaks. The original image of FIG. 1 captures a scene generated from a camera pointing down at the ground - it doesn't capture any skyline. If the skyline were included in the image, there would potentially be three peaks in the histogram of FIG. 3. In contrast, if the scene had only one region (either entirely a bright region or entirely a shadow region), the histogram would have only one peak.

[045] The histogram is then used to identify the shadowed regions in the original image. More specifically, an intensity threshold is determined by identifying the minima between two peaks in the histogram. For example, with respect to the histogram in FIG. 3, the minima 24 is the lowest point between the two peaks 20, 22. The minima is used to determine the intensity threshold by obtaining the bin number of the minima (which is 7 in the example set forth in FIG. 3) and multiplying by the bin size (which is 15 in the example). Therefore, the intensity threshold value in this specific example would be 105 (7 x 15).
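By way of non-limiting illustration, locating the minima between two peaks and converting it to an intensity threshold might be sketched in Python as follows. The function name `threshold_from_histogram`, the simple local-maximum test used to find peaks, the choice of the two tallest peaks, and the zero-based bin numbering are assumptions of this sketch. The bin counts in the usage example are invented for illustration and are not the data of FIG. 3.

```python
def threshold_from_histogram(counts, bin_size):
    """Return bin_number * bin_size for the lowest bin between two peaks.

    Returns None when fewer than two peaks are found, as in a scene that
    is entirely bright or entirely shadowed.
    """
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else -1
        right = counts[i + 1] if i < len(counts) - 1 else -1
        # Local maximum; a flat run of equal bins counts once.
        if c > 0 and c > left and c >= right:
            peaks.append(i)
    if len(peaks) < 2:
        return None
    # Keep the two tallest peaks, ordered from dark to bright.
    tallest = sorted(peaks, key=lambda i: counts[i], reverse=True)[:2]
    first, second = sorted(tallest)
    # Lowest bin strictly between the two peaks.
    valley = min(range(first + 1, second), key=lambda i: counts[i])
    return valley * bin_size


# Hypothetical 17-bin histogram with a shadow peak and a bright peak.
example_counts = [0, 2, 9, 20, 12, 6, 3, 1, 4, 10, 18, 25, 14, 5, 2, 0, 0]
example_threshold = threshold_from_histogram(example_counts, 15)
```

In this hypothetical histogram the lowest bin between the peaks is bin 7, so `example_threshold` is 7 x 15 = 105.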

[046] The threshold can then be applied to the blurred image to create a mask according to a known process and thereby identify the shadowed regions as white and the well-lit areas as black. For example, in one embodiment, the mask is a binary mask image, which is an image in which the image pixels are in one of two states: a zero for pixel areas of no interest and a 1 for pixel areas of interest (alternatively, the two states can be 0 and 255 so that all areas of interest appear white and all areas of no interest are black). Alternatively, the mask can be any known image, map, array, or vector of shadow and non-shadow regions associated with the original image. Subsequently, the mask can then be overlaid on the original image of FIG. 1 to identify the shadowed region(s).
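By way of non-limiting illustration, applying the threshold to create a binary mask and overlaying that mask on the original image might be sketched in Python as follows. The function names `shadow_mask` and `overlay`, the strict less-than comparison at the threshold, and the blanking of non-shadow pixels in the overlay are assumptions of this sketch.

```python
def shadow_mask(smoothed, threshold, shadow_value=255):
    """Binary mask: shadow_value where intensity is below the threshold.

    Whether a pixel exactly at the threshold counts as shadow is an
    assumption; here it does not.
    """
    return [
        [shadow_value if value < threshold else 0 for value in row]
        for row in smoothed
    ]


def overlay(mask, original, fill=(0, 0, 0)):
    """Keep original pixels under the mask and blank everything else."""
    return [
        [pixel if m else fill for m, pixel in zip(mask_row, pixel_row)]
        for mask_row, pixel_row in zip(mask, original)
    ]
```

With `shadow_value=255` the shadowed regions appear white and the well-lit areas black, as described above; passing `shadow_value=1` gives the 0 and 1 form of the mask.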

[047] According to one embodiment, the correction technique mentioned above that employs a camera aimed at the light source can be employed at this point to compensate for any inconsistent lighting. That is, the technique provides for adjustment of the white balance of a resulting image based on detection of the white balance of the light source using an algorithm.

[048] Further, according to one implementation, the shadow region can further be corrected according to the following process. First, the average intensity values for red (R), green (G) and blue (B) are determined for the light area.

[049] The average light values are calculated in the following fashion. The original mask of the shadow region is dilated and stored as the minimum light image and maximum light image according to a known process. The maximum light image is then dilated and the difference between the minimum and maximum light images is used as a mask to determine the average RGB light values of the scene image.

[050] Further, the average dark values are calculated as follows. The original mask of the shadow region is eroded and stored as the minimum dark image and the maximum dark image in accordance with a known process. The maximum dark image is then eroded further. The difference between the minimum and maximum dark images is then used as a mask to determine the average RGB dark values of the scene image.
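By way of non-limiting illustration, the dilation and erosion steps used to obtain the average light and dark values might be sketched in Python as follows. The 3 x 3 structuring element, the single-step dilation and erosion, the clipping of the window at the image border, and all function names are assumptions of this sketch; the disclosure refers only to "a known process."

```python
def _morph(mask, use_any):
    """3 x 3 binary dilation (use_any=True) or erosion (use_any=False)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            hit = any(window) if use_any else all(window)
            out[y][x] = 1 if hit else 0
    return out


def dilate(mask):
    return _morph(mask, True)


def erode(mask):
    return _morph(mask, False)


def difference(outer, inner):
    """Pixels set in outer but not in inner."""
    return [
        [1 if o and not i else 0 for o, i in zip(orow, irow)]
        for orow, irow in zip(outer, inner)
    ]


def light_and_dark_rings(shadow):
    """Ring just outside the shadow (light) and just inside it (dark)."""
    min_light = dilate(shadow)
    max_light = dilate(min_light)
    min_dark = erode(shadow)
    max_dark = erode(min_dark)
    return difference(max_light, min_light), difference(min_dark, max_dark)


def mean_rgb(color_image, region):
    """Average (R, G, B) over the pixels selected by region, or None."""
    totals, count = [0, 0, 0], 0
    for pixel_row, region_row in zip(color_image, region):
        for pixel, selected in zip(pixel_row, region_row):
            if selected:
                count += 1
                for c in range(3):
                    totals[c] += pixel[c]
    return tuple(t / count for t in totals) if count else None
```

The light ring samples well-lit pixels close to the shadow boundary, and the dark ring samples shadowed pixels close to the same boundary, so the two averages are drawn from neighboring parts of the scene.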

[051] The gain required for each channel in the dark area to increase the RGB values to a level similar to the light area is calculated as follows: average light value/average dark value for RGB. This value is then applied to the area of the image under the shadow mask region. This corrects for color.
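By way of non-limiting illustration, the per-channel gain calculation and its application under the shadow mask might be sketched in Python as follows. The function names `channel_gains` and `apply_gains`, the fallback gain of 1.0 when an average dark value is zero, and the clamping of corrected values to 255 are assumptions of this sketch.

```python
def channel_gains(avg_light, avg_dark):
    """Gain per channel: average light value / average dark value.

    A dark average of zero falls back to a gain of 1.0 (an assumption).
    """
    return tuple(l / d if d else 1.0 for l, d in zip(avg_light, avg_dark))


def apply_gains(color_image, mask, gains, max_value=255):
    """Multiply pixels under the shadow mask by the per-channel gains."""
    corrected = []
    for pixel_row, mask_row in zip(color_image, mask):
        new_row = []
        for pixel, in_shadow in zip(pixel_row, mask_row):
            if in_shadow:
                # Clamp so a corrected channel never exceeds max_value.
                pixel = tuple(
                    min(max_value, round(c * g)) for c, g in zip(pixel, gains)
                )
            new_row.append(pixel)
        corrected.append(new_row)
    return corrected
```

For example, average light values of (200, 180, 160) and average dark values of (100, 90, 40) give gains of (2.0, 2.0, 4.0), so a shadowed pixel of (100, 90, 40) is raised to (200, 180, 160).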

[052] There is a known issue for those images that capture scenes in which there is stubble present that has been pushed over (stubble that is bent over so that a substantial length of each piece of stubble is horizontal to the ground). The issue is that it is possible that the gains calculated above will not be large enough to compensate for the intensity drop in the pixels capturing the dirt/ground between the pieces of pushed over stubble, because they will remain dark. However, this is typically not a problem as the plant material is usually found above the "pushed over" stubble.

[053] According to a further embodiment, any plant material in the image can be identified in the following manner. The image can be binarised such that black represents non-plant material and white represents plant material with a pixel comparison of plant material. It is understood that there are numerous machine techniques that can be used to identify plant material in a color image, such as: G>R and G>B or a modified version of this formula to enhance green. If there are binarised contiguous white areas greater than a predetermined size, then there is plant material present. The predetermined size should be large enough to cut out noise but small enough to keep small plants.
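By way of non-limiting illustration, the G>R and G>B comparison and the contiguous-area test might be sketched in Python as follows. The function names `plant_mask` and `has_plant_material`, the use of 4-connectivity to define contiguous areas, and the interpretation of "size" as a pixel count are assumptions of this sketch.

```python
def plant_mask(color_image):
    """Binarise: 1 (white) where G > R and G > B, otherwise 0 (black)."""
    return [
        [1 if g > r and g > b else 0 for (r, g, b) in row]
        for row in color_image
    ]


def has_plant_material(binary, min_area):
    """True if any 4-connected white area is larger than min_area pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            # Iterative flood fill to measure this contiguous area.
            stack, area = [(y, x)], 0
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                area += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if area > min_area:
                return True
    return False
```

Raising `min_area` rejects more noise at the cost of missing small plants, which is the trade-off noted above for the predetermined size.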

[054] Although the present invention has been described with reference to preferred embodiments, persons skilled in the art will recognize that changes may be made in form and detail without departing from the spirit and scope of the invention.