Login| Sign Up| Help| Contact|

Patent Searching and Data


Title:
FAST REPEATED INTEGRAL IMAGES
Document Type and Number:
WIPO Patent Application WO/2012/067844
Kind Code:
A1
Abstract:
A repeated integral images method filters image data in only two passes, e.g., the first pass filters horizontal rows of pixels and a second pass filters vertical columns of pixels, or in a single pass. The filter performs at least one infinite impulse response (IIR) filter and at least one finite impulse response (FIR) filter on the image data. A plurality of IIR filters and FIR filters maybe performed to approximate a Gaussian filter. By minimizing the number of passes, the data flow between the processing unit and the storage unit is greatly reduced compared to conventional repeated integral images method thereby improving computation time.

Inventors:
TSAI MING-CHANG (US)
BAHETI PAWAN K (US)
CHARI MURALI R (US)
KRISHNAMOORTHI RAGHURAMAN (US)
Application Number:
PCT/US2011/059288
Publication Date:
May 24, 2012
Filing Date:
November 04, 2011
Export Citation:
Click for automatic bibliography generation   Help
Assignee:
QUALCOMM INC (US)
TSAI MING-CHANG (US)
BAHETI PAWAN K (US)
CHARI MURALI R (US)
KRISHNAMOORTHI RAGHURAMAN (US)
International Classes:
G06T5/20
Foreign References:
US7006704B22006-02-28
Other References:
BRANISLAV KISACANIN ED: "Integral Image Optimizations for Embedded Vision Applications", IMAGE ANALYSIS AND INTERPRETATION, 2008. SSIAI 2008. IEEE SOUTHWEST SYMPOSIUM ON, IEEE, PISCATAWAY, NJ, USA, 24 March 2008 (2008-03-24), pages 181 - 184, XP031249234, ISBN: 978-1-4244-2296-8
AMIT BHATIA ET AL: "Stacked Integral Image", 2010 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION : ICRA 2010 ; ANCHORAGE, ALASKA, USA, 3 - 8 MAY 2010, IEEE, PISCATAWAY, NJ, USA, 3 May 2010 (2010-05-03), pages 1530 - 1535, XP031743335, ISBN: 978-1-4244-5038-1
SHOAIB EHSAN ET AL: "Novel Hardware Algorithms for Row-Parallel Integral Image Calculation", DIGITAL IMAGE COMPUTING: TECHNIQUES AND APPLICATIONS, 2009. DICTA '09, IEEE, PISCATAWAY, NJ, USA, 1 December 2009 (2009-12-01), pages 61 - 65, XP031613839, ISBN: 978-1-4244-5297-2
Attorney, Agent or Firm:
HALBERT, Michael, J. (4010 Moorpark Avenue Suite 21, San Jose California, US)
Download PDF:
Claims:
CLAIMS

What is claimed is:

1. A method of filtering an image comprising:

receiving image data comprising an array of pixels;

performing at least one infinite impulse response (IIR) filter and at least one finite impulse response (FIR) filter on a first plurality of lines of pixels in the image data that extends in a first direction;

performing at least one IIR filter and at least one FIR filter on a second plurality of lines of pixels that extends in a second direction that is different than the first direction to produce a filtered image data; and

storing the filtered image data.

2. The method of claim 1, wherein the filtered image data approximates a Gaussian filtered image data.

3. The method of claim 1, wherein performing the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels in the image data produces a partially filtered image data, the method further comprising storing the partially filtered image data, and wherein performing the at least one IIR filter and the at least one FIR filter on the second plurality of lines of pixels is on the partially filtered image data.

4. The method of claim 3, wherein:

performing the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels comprises performing the at least one IIR filter and the at least one FIR filter on all lines of pixels in the image data that extend in the first direction before storing the partially filtered image data; and

performing the at least one IIR filter and the at least one FIR filter on the second plurality of lines of pixels comprises performing the at least one IIR filter and the at least one FIR filter on all lines of pixels in the filtered image data that extend in the second direction before storing the filtered image data.

5. The method of claim 1, wherein: performing the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels comprises performing a plurality of IIR filters and a plurality of FIR filters on the first plurality of lines of pixels; and

performing the at least one IIR filter and the at least one FIR filter on the second plurality of lines of pixels comprises performing a plurality of IIR filters and a plurality of FIR filters on the second plurality of lines of pixels.

6. The method of claim 5, further comprising locally storing an output of each of the plurality of IIR filters and each of the plurality of FIR filters in a local storage unit that is different than a storage unit for storing the filtered image data.

7. The method of claim 5, wherein:

performing the plurality of IIR filters and the plurality of FIR filters on the first plurality of lines of pixels comprises performing an equal number of IIR filters and FIR filters; and

performing the plurality of IIR filters and the plurality of FIR filters on the second plurality of lines of pixels comprises performing a second equal number of IIR filters and FIR filters.

8. The method of claim 5, wherein:

performing the plurality of IIR filters and the plurality of FIR filters on the first plurality of lines of pixels comprises performing the plurality of IIR filters and the plurality of FIR filters serially; and

performing the plurality of IIR filters and the plurality of FIR filters on the second plurality of lines of pixels comprises performing the plurality of IIR filters and the plurality of FIR filters serially.

9. The method of claim 5, wherein:

performing the plurality of IIR filters and the plurality of FIR filters on the first plurality of lines of pixels comprises combining a plurality of parallel IIR filters into a single higher-order IIR filter and combining a plurality of parallel FIR filters into a single higher-order FIR filter and performing the single higher-order IIR filter and the single higher-order FIR filter serially.

10. The method of claim 9, wherein the single higher-order IIR filter is performed before the single higher-order FIR filter.

11. The method of claim 1, wherein performing the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels comprises performing the at least one IIR filter before performing the at least one FIR filter.

12. The method of claim 1, wherein performing the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels in the image data and performing the at least one IIR filter and the at least one FIR filter on the second plurality of lines of pixels is performed in a single pass.

13. The method of claim 12, wherein the image data comprises a first number of rows of pixels and a second number of columns of pixels, the method further comprising:

dividing the image data into a plurality of stripes, each stripe comprising the first number of rows of pixels and a third number of columns of pixels that is less than the second number of columns, wherein the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels in the image data and the at least one IIR filter and the at least one FIR filter on the second plurality of lines of pixels are performed on each of the plurality of stripes sequentially.

14. The method of claim 5, wherein performing the plurality of IIR filters and the plurality of FIR filters on the first plurality of lines of pixels and performing the plurality of IIR filters and the plurality of FIR filters on the second plurality of lines of pixels comprises performing the plurality of IIR filters on the first plurality of lines of pixels and the plurality of IIR filters on the second plurality of lines of pixels in a common IIR stage before performing the plurality of FIR filters on the first plurality of lines of pixels and performing the plurality of FIR filters on the second plurality of lines of pixels.

15. The method of claim 14, wherein performing the plurality of FIR filters on the second plurality of lines of pixels occurs before performing the plurality of FIR filters on the first plurality of lines of pixels.

16. The method of claim 14, further comprising performing a second plurality of FIR filters on the first plurality of lines of pixels and performing a second plurality of FIR filters on the second plurality of lines of pixels after the common IIR stage.

17. An apparatus comprising:

storage unit for storing image data;

a data bus coupled to the storage unit;

a fast repeated integral images filter coupled to the storage unit through the data bus, the fast repeated integral images filter comprising:

at least one infinite impulse response (IIR) filter; and

at least one finite impulse response (FIR) filter coupled to the at least one

IIR filter;

wherein the fast repeated integral images filter is adapted to receive the image data from the storage unit through the data bus, and to filter a plurality of lines of pixels in the image data that extends in a first direction with the at least one IIR filter and the at least one FIR filter and to filter a second plurality of lines of pixels that extends in a second direction that is different than the first direction to produce a filtered image data and to output the filtered image data to the storage unit through the data bus.

18. The apparatus of claim 17, wherein the filtered image data approximates a Gaussian filtered image data.

19. The apparatus of claim 17, wherein the fast repeated integral images filter is adapted to filter the plurality of lines of pixels in the image data that extends in the first direction with the at least one IIR filter and the at least one FIR filter to produce a partially filtered image data and to output the partially filtered image data to the storage unit through the data bus and wherein the fast repeated integral images filter is adapted to receive the partially filtered image data from the storage unit and to filter the second plurality of lines of pixels in the partially filtered image data that extends in the second direction.

20. The apparatus of claim 19, wherein the fast repeated integral images filter is adapted to filter all of the plurality of lines of pixels in the image data that extend in the first direction to produce the partially filtered image data and to filter all of the plurality of lines of pixels in the partially filtered image data that extend in the second direction to produce the filtered image data.

21. The apparatus of claim 17, wherein the fast repeated integral images filter comprises:

a plurality of IIR filters; and

a plurality of FIR filters coupled to the plurality of IIR filters.

22. The apparatus of claim 21, further comprising a local storage unit, wherein the plurality of IIR filters are directly coupled to the local storage unit and the plurality of FIR filters are directly coupled to the local storage unit.

23. The apparatus of claim 21, wherein there are an equal number of IIF filters and FIR filters in the plurality of IIR filters and the plurality of FIR filters.

24. The apparatus of claim 21, wherein the plurality of IIR filters and the plurality of FIR filters are coupled in series.

25. The apparatus of claim 21, wherein the plurality of IIR filters comprises a plurality of parallel IIR filters combined into a single higher-order IIR filter and the plurality of FIR filters comprises a plurality of parallel FIR filters combined into a single higher-order FIR filter, wherein the single higher-order IIR filter is coupled in series with the single higher-order FIR filter.

26. The apparatus of claim 25, wherein the single higher-order IIR filter is coupled to receive the image data stored in the storage unit through the data bus and the single higher-order FIR filter is coupled to output a partially filtered image data to the storage unit through the data bus.

27. The apparatus of claim 17, wherein the at least one IIR filter is coupled to receive the image data stored in the storage unit through the data bus and the at least one FIR filter is coupled to output a partially filtered image data to the storage unit through the data bus.

28. The apparatus of claim 21, wherein the fast repeated integral images filter is adapted to filter the plurality of lines of pixels that extends in the first direction and to filter the second plurality of lines of pixels that extends in the second direction with the plurality of IIR filters and the plurality of FIR filters in a single pass.

29. The apparatus of claim 28, wherein the image data comprises a first number of rows of pixels and a second number of columns of pixels, the image data being divided into a plurality of stripes, each stripe comprising the first number of rows of pixels and a third number of columns of pixels that is less than the second number of columns, wherein the fast repeated integral images filter is adapted to filter the plurality of lines of pixels that extends in the first direction and to filter the second plurality of lines of pixels that extends in the second direction in each of the plurality of stripes sequentially.

30. The apparatus of claim 21, wherein the plurality of IIR filters are grouped in a common IIR stage before the plurality of FIR filters.

31. The apparatus of claim 30, further comprising a vertical FIR stage including the at least one FIR filter adapted to filter the second plurality of lines of pixels that extends in the second direction and a horizontal FIR stage including the at least one FIR filter adapted to filter the plurality of lines of pixels that extends in the first direction, wherein the vertical FIR stage is before the horizontal FIR stage.

32. The apparatus of claim 30, wherein the plurality of FIR filters are coupled to receive an output from the common IIR stage, the fast repeated integral images filter further comprising a second plurality of FIR filters coupled to receive the output from the common IIR stage.

33. An apparatus comprising:

means for receiving image data comprising an array of pixels;

means for performing at least one infinite impulse response (IIR) filter and at least one finite impulse response (FIR) filter on a first plurality of lines of pixels in the image data that extends in a first direction;

means for performing at least one IIR filter and at least one FIR filter on a second plurality of lines of pixels that extends in a second direction that is different than the first direction to produce a filtered image data; and

means for storing the filtered image data.

34. The apparatus of claim 33, wherein the means for performing the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels in the image data produces a partially filtered image data, the apparatus further comprising means for storing the partially filtered image data, and wherein the second plurality of lines of pixels is from the partially filtered image data.

35. The apparatus of claim 33, wherein:

the means for performing the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels comprises means for performing a plurality of IIR filters and a plurality of FIR filters on the first plurality of lines of pixels; and

the means for performing the at least one IIR filter and the at least one FIR filter on the second plurality of lines of pixels comprises means for performing a plurality of IIR filters and a plurality of FIR filters on the second plurality of lines of pixels.

36. The apparatus of claim 35, further comprising means for locally storing an output of each of the plurality of IIR filters and each of the plurality of FIR filters, the means for locally storing is different than the means for storing the filtered image data.

37. The apparatus of claim 35, wherein: the means for performing the plurality of IIR filters and the plurality of FIR filters on the first plurality of lines of pixels performs the plurality of IIR filters and the plurality of FIR filters serially; and

the means for performing the plurality of IIR filters and the plurality of FIR filters on the second plurality of lines of pixels performs the plurality of IIR filters and the plurality of FIR filters serially.

38. The apparatus of claim 35, wherein:

the means for performing the plurality of IIR filters and the plurality of FIR filters on the first plurality of lines of pixels comprises a plurality of parallel IIR filters combined into a single higher-order IIR filter and a plurality of parallel FIR filters combined into a single higher-order FIR filter, wherein the single higher-order IIR filter is performed in series with the single higher-order FIR filter.

39. The apparatus of claim 33, wherein the means for performing the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels in the image data and the means for performing the at least one IIR filter and the at least one FIR filter on the second plurality of lines of pixels use a single pass for the image data.

40. The apparatus of claim 39, wherein the image data comprises a first number of rows of pixels and a second number of columns of pixels, the apparatus further comprising means for dividing the image data into a plurality of stripes, each stripe comprising the first number of rows of pixels and a third number of columns of pixels that is less than the second number of columns, wherein the means for performing the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels and the means for performing the at least one IIR filter and the at least one FIR filter on the second plurality of lines of pixels operate on each of the plurality of stripes sequentially.

41. The apparatus of claim 35, wherein the means for performing the plurality of IIR filters and the plurality of FIR filters on the first plurality of lines of pixels and the means for performing the plurality of IIR filters and the plurality of FIR filters on the second plurality of lines of pixels comprises means for performing the plurality of IIR filters on the first plurality of lines of pixels and the plurality of IIR filters on the second plurality of lines of pixels in a common IIR stage that is before the means for performing the plurality of FIR filters on the first plurality of lines of pixels and the means for performing the plurality of FIR filters on the second plurality of lines of pixels.

42. The apparatus of claim 41, wherein the means for performing the plurality of FIR filters on the second plurality of lines of pixels is before the means for performing the plurality of FIR filters on the first plurality of lines of pixels.

43. The apparatus of claim 41, further comprising a means for performing a second plurality of FIR filters on the first plurality of lines of pixels and means for performing a second plurality of FIR filters on the second plurality of lines of pixels that is after the common IIR stage.

44. A computer-readable medium including program code stored thereon,

comprising:

program code to receive image data comprising an array of pixels;

program code to perform at least one infinite impulse response (IIR) filter and at least one finite impulse response (FIR) filter on a first plurality of lines of pixels in image data that extends in a first direction;

program code to perform at least one IIR filter and at least one FIR filter on a second plurality of lines of pixels that extends in a second direction to produce a filtered image data, the second direction being different than the first direction; and

program code to store the filtered image data.

45. The computer-readable medium of claim 44, wherein the filtered image data approximates a Gaussian filtered image data.

46. The computer-readable medium of claim 44, wherein the program code to perform the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels produces a partially filtered image data, further comprising program code to output the partially filtered image data to a storage unit and wherein the at least one IIR filter and the at least one FIR filter is performed on the second plurality of lines of pixels from the partially filtered image data.

47. The computer-readable medium of claim 44, wherein:

the program code to perform performing the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels comprises program code to perform a plurality of IIR filters and a plurality of FIR filters on the first plurality of lines of pixels; and

the program code to perform the at least one IIR filter and the at least one FIR filter on the second plurality of lines of pixels comprises program code to perform a plurality of IIR filters and a plurality of FIR filters on the second plurality of lines of pixels.

48. The computer-readable medium of claim 47, further comprising program code to locally store an output of each of the plurality of IIR filters and each of the plurality of FIR filters.

49. The computer-readable medium of claim 47, wherein:

the program code to perform the plurality of IIR filters and the plurality of FIR filters on the first plurality of lines of pixels comprises program code to perform the plurality of IIR filters and the plurality of FIR filters serially; and

the program code to perform the plurality of IIR filters and the plurality of FIR filters on the second plurality of lines of pixels comprises the program code to perform the plurality of IIR filters and the plurality of FIR filters serially.

50. The computer-readable medium of claim 47, wherein:

the program code to perform the plurality of IIR filters and the plurality of FIR filters on the first plurality of lines of pixels comprises program code to combine a plurality of parallel IIR filters into a single higher-order IIR filter and to combine a plurality of parallel FIR filters into a single higher-order FIR filter and to perform the single higher-order IIR filter and the single higher-order FIR filter serially.

51. The computer-readable medium of claim 44, further comprising program code to perform the at least one IIR filter and the at least one FIR filter on the first plurality of lines of pixels and to perform the at least one IIR filter and the at least one FIR filter on the second plurality of lines of pixels comprises program code in a single pass.

52. The computer-readable medium of claim 47, wherein the program code to perform the plurality of IIR filters and the plurality of FIR filters on the first plurality of lines of pixels and the program code to perform the plurality of IIR filters and the plurality of FIR filters on the second plurality of lines of pixels comprises program code to perform the plurality of IIR filters on the first plurality of lines of pixels and to perform the plurality of IIR filters on the second plurality of lines of pixels in a common IIR stage and to perform the plurality of FIR filters on the second plurality of lines of pixels before performing the plurality of FIR filters on the first plurality of lines of pixels.

Description:
FAST REPEATED INTEGRAL IMAGES

Ming-Chang Tsai

Pawan K. Baheti

Murali R. Chari

Raghuraman Krishnamoorthi

[0001] This application claims priority to U.S. Provisional Application No. 61/413,814, filed November 15, 2010 and entitled "Fast Repeated Integral Images", and to U.S. Serial No. 13/230,775, filed September 12, 2011, and entitled "Fast Repeated Integral Images, both of which are assigned to the assignee hereof and are incorporated herein by reference.

BACKGROUND

[0002] Digital images are typically processed prior to displaying or otherwise using the images in various applications. Image processing is a form of signal processing in which the input is an image, such as a photograph or frame of view and the output is an image or parameters related to the image, which may then be used in various applications, such as computer vision, computer graphics etc. Image processing may be used to enhance the image, e.g., by removing noise, improving contrast or sharpness, etc., or to extract data from the image.

[0003] Computing a moving average is common in many applications including one- dimensional signal processing problems such as real-time acoustic echo cancellation. The moving average operation is typically a simple rectangular or boxcar filtering that is performed by adding newer samples coming into one end of the window, and subtracting older samples leaving from the other end at a step size required by the resolution of the moving average computation. The two-dimensional (2D) equivalent is typically performed by taking 2D differential among four corners of the intended spatial area, e.g., if rectangular, in an integral image. Variations of integral image exist to support rotation and non-rectangular areas when necessary. For sophistication beyond uniform 2D filters, multiple integral images can be stacked to mimic a desired filter shape with the trade-off between the cost of computational complexity and metrics such as the accuracy of filter approximation or other depending on the application. For some of the filter shapes, such as a Gaussian function, a more effective approximation is possible by repeated boxcar filtering, using integral images for computational efficiency, commonly referred to as repeated integral images method.

[0004] However, despite its accuracy, the cost of the data I/O bandwidth that results from the conventional architecture of repeated integral images method sometimes render the method unsuitable for many applications. Thus, an improved architecture for repeated integral images is desired.

SUMMARY

[0005] A repeated integral images method filters image data in only two passes, e.g., the first pass filters horizontal rows of pixels and a second pass filters vertical columns of pixels, or in a single pass. The filter performs at least one infinite impulse response (IIR) filter and at least one finite impulse response (FIR) filter on the image data. A plurality of IIR filters and FIR filters maybe performed to approximate a Gaussian filter. By minimizing the number of passes, the data flow between the processing unit and the storage unit is greatly reduced compared to conventional repeated integral images method thereby improving computation time.

[0006] In one implementation, a method of filtering an image includes receiving image data comprising an array of pixels, performing at least one infinite impulse response (IIR) filter and at least one finite impulse response (FIR) filter on a first plurality of lines of pixels in the image data that extends in a first direction and performing at least one IIR filter and at least one FIR filter on a second plurality of lines of pixels that extends in a second direction that is different than the first direction to produce a filtered image data. The filtered image data is then stored.

[0007] In another implementation, an apparatus includes a storage unit for storing image data, a data bus coupled to the storage unit, and a fast repeated integral images filter coupled to the storage unit through the data bus, the fast repeated integral images filter includes at least one infinite impulse response (IIR) filter; and at least one finite impulse response (FIR) filter coupled to the at least IIR filter. The fast repeated integral images filter is adapted to receive image data from the storage unit through the data bus, and to filter a plurality of lines of pixels in the image data that extends in a first direction with the at least one IIR filter and at least FIR filter and to filter a second plurality of lines of pixels that extends in a second direction that is different than the first direction to produce a filtered image data and to output the filtered image data to the storage unit through the data bus.

[0008] In another implementation, an apparatus includes means for receiving image data comprising an array of pixels, means for performing at least one infinite impulse response (IIR) filter and at least one finite impulse response (FIR) filter on a first plurality of lines of pixels in the image data that extends in a first direction and means for performing at least one IIR filter and at least one FIR filter on a second plurality of lines of pixels that extends in a second direction that is different than the first direction to produce a filtered image data. The apparatus further includes means for storing the filtered image data.

[0009] In yet another implementation, a computer-readable medium including program code stored thereon, includes program code to receive image data comprising an array of pixels; program code to perform at least one infinite impulse response (IIR) filter and at least one finite impulse response (FIR) filter on a first plurality of lines of pixels in image data that extends in a first direction; program code to perform at least one IIR filter and at least one FIR filter on a second plurality of lines of pixels that extends in a second direction to produce a filtered image data, the second direction being different than the first direction; and program code to store the filtered image data.

BRIEF DESCRIPTION OF THE DRAWING

[0010] Fig. 1 illustrates a direct filtering implementation using a conventional 2D boxcar filter.

[0011] Fig. 2 illustrates a conventional filtering implementation using integral images.

[0012] Figs. 3 A and 3B illustrate a top view and side view, respectively, of

conventional ID stacked boxcar filters.

[0013] Fig. 4 illustrates the convolution of ID boxcar filters, illustrating that with each repetition a better approximation of a Gaussian is produced. [0014] Fig. 5 is a block diagram illustrating the architecture for performing a conventional repeated integral images method.

[0015] Fig. 6 illustrates one implementation of the fast repeated integral images filter.

[0016] Fig. 7 illustrates another implementation of the fast repeated integral images filter.

[0017] Fig. 8 is a block diagram illustrating the architecture for the fast repeated integral images filter shown in Fig. 6.

[0018] Fig. 9 illustrates an implementation of a two dimensional fast repeated integral images filter.

[0019] Fig. 10 illustrates another implementation of a two dimensional fast repeated integral images filter.

[0020] Fig. 11 is a block diagram illustrating the architecture for a 2D fast repeated integral images filter, such as that shown in Fig. 10.

[0021] Fig. 12 illustrates another implementation of a two dimensional fast repeated integral images filter.

[0022] Fig. 13 illustrates another implementation of a two dimensional fast repeated integral images filter with an expanded FIR filter.

[0023] Fig. 14 illustrates an image divided into multiple stripes that may be separately processed by the two dimensional fast repeated integral images filter.

[0024] Fig. 15 is a flow chart illustrating the operation of the fast repeated integral images filter.

[0025] Fig. 16 illustrates a block diagram of a mobile platform which is capable performing the fast repeated integral images method.

DETAILED DESCRIPTION

[0026] An efficient implementation of a fast repeated integral images method, e.g., for approximating a Gaussian filter, may be used for filtering images in applications such as computer vision, computer graphics, etc. The proposed approach minimizes

computational complexity, data input/output (I/O) bandwidth, as well as cache requirement while maintaining maximum flexibility for implementation in comparison to known techniques, such as boxcar filtering, stacked integral images, and repeated integral images method.

[0027] Fig. 1 illustrates a direct filtering implementation using a conventional 2D boxcar filter 10, which is commonly used by many applications either stand-alone or as foundation for more sophisticated functions. Conventional 2D boxcar filter 10 provides a moving average of an image 12. The 2D boxcar filter 10 has a width of n pixel and a height of m pixels, that moves across the image 12, in both the X and Y directions, e.g., in a raster scan, as illustrated by arrows. As the boxcar filter 10 moves across the image 12, a moving average of the pixels within the boxcar filter 10 is generated. However, computational complexity as well as memory access bandwidth for 2D boxcar filtering could be unreasonably high if implemented directly as shown in Fig. 1 given large number of redundant computation as well as memory accesses.

[0028] Fig. 2 illustrates a conventional filtering implementation using integral images. Integral images is a more efficient implementation of 2D boxcar filtering that uses a simple differential among the four corners of the spatial area 20 of the integral image. The spatial area 20 is typically rectangular, but can be other shapes. The 2D

accumulation of pixel values is generated only once. In other words, the pixels of image 22 are accumulated horizontally, as illustrated by the triangle 24, and then accumulated vertically as illustrated by triangle 26. As illustrated by the + and - signs in Fig. 2, the boxcar filtering is produced as a 2D differential among the four rectangular areas, which is moved along the horizontal axis, then along the vertical axis.

[0029] A plurality of stacked 2D boxcar filters of different sizes may be used to implement non-uniform filters, such as staircase functions, or even to approximate more sophisticated filters. To improve efficiency, stacked boxcar filtering can be extended to the integral image method shown in Fig. 2 using multiple sets of four corners, one set for each boxcar filter, referred to as stacked integral images. Figs. 3 A and 3B illustrate a top view and side view, respectively, of conventional 2D stacked boxcar filters, in which there are stacked 1, 2, 3, 4, and 5 boxcar filters, and which may be used with stacked integral images to approximate a Gaussian filter. Variations are possible, such as spatial rotation of the stack of boxcar filters, e.g., by 45°, for finer granularity in filter widths over the pixel grid, or a combination of rotation and scaling of individual boxcar filters for desired optimization depending on the requirements of applications. The approximation of the desired filter may be improved using a sufficiently large number of boxcar filters.

[0030] For non-uniform filter functions, such as a Gaussian function, a more accurate approximation may be more efficiently achieved by repeatedly convolving, instead of stacking, boxcar filters as suggested by the law of large number from probability theory. Fig. 4 illustrates the convolution of ID boxcar filters, where it can be seen with each repetition a better approximation of a Gaussian is produced.

[0031] The repeatedly convolving approach may also be used to improve efficiency of integral image method, which is commonly referred to as the repeated integral images method. Repeated integral images method differs from the stacked integral images method, in that the repeated integral images method requires one image integral per boxcar filtering. Fig. 5 is a block diagram illustrating the architecture 40 for performing a repeated integral images method for Gaussian approximation. As can be seen, the architecture 40 includes a processing unit 42 and storage unit 44, as well as a data bus 46. The processing unit 42 performs a 2D image integral 48 followed by 2D image differential 50, as described in Fig. 2, sequentially multiple times, illustrated by the dotted arrow between a plurality of 2D image integrals 48 and 2D image differentials 50. The input and output each time the 2D image integral 48 is performed and each time the 2D image differential 50 is retrieved from and provided to the storage unit 44 through data bus 46.

[0032] Using a Gaussian function approximation as a common and practical use case for comparison of direct boxcar filtering, stacked integral images and repeated integral images, the complexity of these methods is shown below in Table 1, including a computation in number of accumulations per pixel, data I/O bandwidth in number of passes of which the entire resulted image must be written out to memory for subsequent processing, and cache requirement in number of pixels that it must hold in order to fully support generation of single output pixel within any particular pass. Method Stages Accumulations/Pixel Passes Cache

Direct boxcar M ∑(Wi*Li-l), i=l..M M > Max(Wi*Li), i=l..M filtering

Stacked M 3*M+2 2 > Max(Wi*Li), i=l..M integral

images

Repeated N 5*N 2*N > Max(Wi*Li), i=l..N integral

images

Table 1 - Comp exity of three approaches for Gaussian approximation

[0033] Notice in Table 1 that the stacked integral images and the direct boxcar filtering methods have the same number of stages M as they are mathematically equivalent. Additionally, repeated integral images method may use fewer stages N<M, while achieving the same accuracy of approximation. For example, empirically, M is at least 6 or more if N = 3. The cache requirement is the same for all three methods, where Wi and Li are the width and length of individual boxcar filters.

[0034] Based on analysis in Table 1, stacked integral images method is a better choice over direct boxcar filtering method for Gaussian approximation given the efficiency advantages in number of accumulations per pixel and the number of passes. However, the choice between stacked integral images method and repeated integral images method is not as clear given that the repeated integral images method has a moderate advantage in computation, (5*N)=(5*3)=15, over that of stacked integral images method, (3*M+2)=(3*6+2)=20, but requires a significantly greater number of passes, 2*N=2*3=6, than that of stacked integral images method, only 2. Thus, the choice between the stacked integral images method and the repeated integral images method is dependent on the cost trade-off of particular platforms, unless M » N.

[0035] The proposed approach for fast repeated integral images method of filtering retains the nature of repeated integral images method from the viewpoint of signal processing, but utilizes an improved architecture for much more efficient operation, which requires modestly less computation and significantly less flow of data between processing unit and storage unit, as well as significantly less local storage within processing unit.

[0036] The fast repeated integral images method recognizes that the image integral is 2D infinite impulse response (IIR) filtering, which is separable as two one-dimensional (ID) IIRs. Moreover, the summation or average of a 2D rectangular region by addition and subtractions, or differential in general, among the four corners of the desired region in the integral image is a 2D finite impulse response (FIR) filtering, which is also separable as two ID FIRs. Further, it is to be noted that a system of cascaded linear filters allow the stages of linear filters to be re-ordered, while retaining functional equivalence end-to-end. Additionally, all cascaded horizontal linear filtering stages can be performed in one single pass and all cascaded vertical linear filtering stages can also be performed in one single pass separately.

[0037] The operation of 2D image integral can be expressed as: out(x, y) = in(x, y) + out(x— l,y) + out(x, y— l)— out(x - 1, y - 1) eq.1 Out(z x ,z ) = In(z x ,z ) + Out(z x ,z )z ~l + Out(z x ,z )z ~l -Out(z x ,z )z ~l z ~l eq.2

[0038] The operation of a 2D differential (e.g., spatial area 20 in Fig.2) over an integral image can be expressed as: out(x, y) = in(x,y)-in(x-W,y)-in(x,y-L) + in(x-W,y-L) eq.4

Out(z x ,z y ) = In(z x ,z y )-In(z x ,z y )z ~w +In(z x ,z y )z y ~L -In(z x ,z y )z x ~w z y ~L eq.5

Out(z x ,z y ) _ w _ L

d z x ,z y ) =-— y - = (l ~ z x )(l-z y ) eq.6

lfl(Z x ,Z )

[0039] Where W is the width of the 2D region and L is the length of the 2D region.

[0040] A single stage box filter can be expressed as: eqs. 7

[0041] Repeated integral images method can be expressed as:

R epeated(z x , z y ) = Y[j(z x , z y ' $ i (z x , z y ),i = l...N

i

= n[( 1 - ¾ 1 ) "1 ( 1 - ¾ w )][( 1 - 1 ) "1 ( 1 - £i )]. i = 1 "^ eqs. 8

= {all horizontal operations} {(followed by) all vertical operations}

[0042] Fig. 6 illustrates one implementation of the fast repeated integral images filter 100, which may be used for Gaussian approximation. The fast repeated integral images filter 100 includes at least one ID infinite impulse response (IIR) filter 110 and at least one ID finite impulse response (FIR) filter 120 coupled together. The IIR filter 110 includes, e.g., a 1-sample delay flip-flop 112 and an adder 114. The FIR filter 120 includes a plurality of 1-sample delay flip-flops 112, 122, and 124 and an adder 114 that receives the output of adder 114 from the IIR filter 110 and the inverted output of the last flip-flip 124. The number of flip-flops 112, 122, and 124 used in the FIR filter 120 is determined by the desired level of resemblance to the target filter, such as a Gaussian filter. In Fig. 6, the IIR filter 110 and FIR filter 120 set is shown repeated twice, but it should be understood that the IIR filter 110 and FIR filter 120 may be repeated as many times as desired (including zero repeats), where each repetition improves the

approximation of the Gaussian. Moreover, if desired, the order of the IIR filter 110 and FIR filter 120 may be switched, i.e., the FIR filter 120 may precede the IIR filter 110. Further, the order of the filters 110 and 120 may vary between different repeating sets of filters. The use of cascaded linear filters allows the stages of linear filters to be reordered while retaining functional equivalence end-to-end. In operation, the horizontal lines are each passed through the fast repeated integral images filter 100 in a single pass and separately the vertical lines are each passed through the fast repeated integral images filter 100 in a single pass, assuming the differential region has the same width and length. If desired, separate cascaded linear filters may be used for the vertical lines and the horizontal lines.

[0043] Fig. 7 illustrates another implementation of the fast repeated integral images filter 150, which may be used for Gaussian approximation. As discussed above in reference to Fig. 6, the use of cascaded linear filters allows the stages of linear filters to be re-ordered while retaining functional equivalence end-to-end. Thus, filter 150 in Fig. 7 is functionally equivalent end-to-end with filter 100 shown in Fig. 6, but the order of the linear filters is varied. As illustrated in Fig. 7, an IIR filter 160 is produced by combining two IIR filters including two 1 -sample delay flip-flops 162, 164, and two adders 166, 168, resulting in a higher-order IIR filter 160. The IIR filter 160 is in series with a single higher order FIR filter 180, which is produced by two FIR filters, including 1-sample delay flip-flops 162, 164, 182, 184, and 186, and adders 188, 190 and 192. It should be understood that if desired, an image may be filtered vertically before horizontally. Other functionally equivalent configurations of n linear IIR filters coupled to n linear FIR filters may be used.

[0044] Fig. 8 is a block diagram illustrating the architecture 200 for a multi stage fast repeated integral images filter 210, similar to that shown in Fig. 6. The architecture 200 includes a processing unit 220, which is illustrated as performing the fast repeated integral images filter 210, a storage unit 230, as well as a data bus 240. In operation, an image is input to the fast repeated integral images filter 210 from the storage unit 230, through data bus 240, only once and a filtered image is output from the fast repeated integral images filter 210 to the storage unit 230, through data bus 240, only once. The fast repeated integral images filter 210 is used twice, once to filter an image in the horizontal dimension and a second time to filter the image in the vertical dimension. If desired, two separate fast repeated integral images filters 210 may be used to process the horizontal dimension and the vertical dimension sequentially. The fast repeated integral images filter 210 includes a ID IIR image integral (sometimes referred to as IIR filter) 212 and by a ID FIR image differential (sometimes referred to as FIR filter) 214, as described in Figs. 6 and 7, which sequentially filter the image multiple times, as illustrated by the dotted arrow between the plurality of IIR filters 212 and FIR filters 214. Each IIR filter 212 and FIR filter 214 are coupled to a local storage 222 in the processing unit 220. The dotted arrows between the IIR filters 212 and FIR filters 214 illustrate the flow of the process as opposed to a direct connection. As described above, the order as well as the number of IIR filters 212 and FIR filters 214 may be varied if desired.

[0045] The architecture 200 and method 300 for the fast repeated integral images filter provides a more efficient operation for repeated integral images, as illustrated in Table 2 below which shows the complexity of repeated integral images and fast repeated integral images for Gaussian approximation.

Table 2 - Complexity of four approaches for Gaussian approximation

[0046] As can be seen, in comparison to a conventional repeated integral images method, the computational complexity reduction with the fast repeated integral images method is about 25% going from (5*N) to (4*N), but there is significant reduction in the number of passes from (2*N) to 2. Thus, the fast repeated integral images method is clearly a better choice than the stacked integral images method (shown in Table 1) for advantages in computational complexity, while requiring no additional passes or greater cache requirement. [0047] The advantages of fast repeated integral images method and architecture over conventional methods and architecture for repeated integral images method include (1) lower computational complexity, (2) much lower requirement for data flow between processing unit and storage unit, (3) much lower requirement for local data storage within processing unit, (4) fundamentally better choice instead of by making different trade-off among computation, data flow, and local storage, (5) easily extendible to operation in higher dimensional space or for higher order of repeated integral images.

[0048] If desired, the fast repeated integral images filter may be performed in a single pass, i.e., filtering in two dimensions, as opposed to two passes as required by a one dimension filter. Fig. 9 illustrates an implementation of a 2D fast repeated integral images filter 300. The fast repeated integral images filter 300 is illustrated as including a cascade of two stages 310 and 330, but fewer or additional stages may be used if desired. As illustrated, the first stage 310 is divided into a horizontal portion 311, which filters rows of pixels in the x direction, and a vertical portion 321 that filters columns of pixels in the y direction. The horizontal portion 311 of the first stage 310 includes an IIR filter 312 coupled to an FIR filter 314, which has a filter width of m. The vertical portion 321 of the first stage 321 includes an IIR filter 322 coupled to an FIR filter 324, which has a filter width of n. The second stage 330 similarly includes IIR filters 332 and 342 for the rows and columns of pixels, respectively, and FIR filters 334 and 344 for the rows and columns of pixels, respectively.

[0049] As discussed above, the order of the IIR filters and FIR filters may be varied. For example, as illustrated in Fig. 10, a modified 2D fast repeated integral images filter 300A cascades all the IIR filters 312, 322, 332, 342 together in a common IIR stage 302, as these are independent of the FIR operations. The FIR filters may be grouped into a vertical stage 304, including FIR filters 324 and 344, and a horizontal stage 306, including FIR filters 314 and 334.

[0050] Fig. 11 is a block diagram illustrating the architecture 200' for a 2D fast repeated integral images filter 300, such as that shown in Fig. 10. Similar to the architecture 200 described in Fig. 8, the architecture 200' includes a processing unit 220, which is illustrated as performing the 2D fast repeated integral images filter 300, a storage unit 230, as well as a data bus 240. In operation, an image is input to the 2D fast repeated integral images filter 300 from the storage unit 230, through data bus 240, only once and a filtered image is output from the 2D fast repeated integral images filter 300 to the storage unit 230, through data bus 240 only once. Additionally, the 2D fast repeated integral images filter 300 is used in one pass to filter the image in both the horizontal dimension and the vertical dimension. The fast repeated integral images filter 300 is illustrated as including the common IIR stage 262, as well as the FIR vertical stage 264 and FIR horizontal stage 266, as describe in Fig. 10, and which sequentially filter the image one or more times, as illustrated by the dotted arrow which show the flow of the process as opposed to a direct connection. Each IIR filter and FIR filter in the common IIR stage 262, FIR vertical stage 264 and FIR horizontal stage 266 is coupled to a local storage unit 222 in the processing unit 220. As described above, the order as well as the number of IIR filters and FIR filters may be varied if desired and thus, may not be grouped in the common IIR stage 262, FIR vertical stage 264 and FIR horizontal stage 266 as illustrated in Fig. 11.

[0051] The implementation of the 2D fast repeated integral images filter reduces complexity and requires 7x-8x less operations than conventional implementation of convolution with a 2D Gaussian filter. The implementation of the 2D fast repeated integral images filter, however, requires additional local storage relative to the ID fast repeated integral images filter as described in Table 2 above. The amount of local storage for implementation of the 2D fast repeated integral images filter with K stages and FIRs of length Μ 1; M 2 , ... M K includes 1 row for each IIR filter for a total of K rows, where the bit width of the IIRs is determined by the maximum length of the FIRs that follow, and Μ 1; M 2 , ... Μκ rows for intermediate output storage of the FIRs.

[0052] As illustrated in Fig. 12, the 2D fast repeated integral images filter 300B, similar to that shown in Fig. 10, may include a single common IIR stage 302 and multiple FIR filter banks 305 and 307. With multiple FIR filter banks sharing the same common IIR stage 302, different types of box filters may be implemented without requiring repeating of the IIR filter process, thereby reducing processing time. Moreover, as illustrated in Fig. 13, another implementation of the 2D fast repeated integral images filter 300C, similar to that shown in Fig. 12, includes within FIR filter banks 313 and 315 a single longer FIR filter, i.e., expanded FIR Y, which is used in place of multiple FIR Y filters, shown in Fig. 12. The use of expanded FIR filters within the FIR filter banks 313 and 315 is advantageous as it reduces the local storage requirements.

[0053] Additionally, the 2D fast repeated integral images filter may divide an image with a plurality of stripes so that the local storage requirements are independent of image size and there is a fixed cost per pixel in terms of arithmetic complexity. Fig. 14 illustrates a striped image 450 and a 2D fast repeated integral images filter 300D capable of processing the striped image 450.

[0054] As illustrated in Fig. 14, the striped image is divided into multiple vertically oriented stripes 452, 454, and 456, so that each tile has the same number of rows of pixels but each tile has fewer columns of pixels than are present in the full image 450. As illustrated, the stripes 452, 454, and 456 have a width of P columns and a height H. A portion of the width of columns 452 and 454 is identified as overhead N, which is equal to the maximum filter length in the IIR stage 302.

[0055] The 2D fast repeated integral images filter 300D shown in Fig. 14 includes an IIR Output Buffer 303, but otherwise may be similar to previously described 2D fast repeated integral images filters. Fig. 14 also shows the inclusion of an additional FIR bank 317, but fewer or additional FIR banks may be used. The 2D fast repeated integral images filter 300D processes each stripe separately, i.e., both the horizontal width and height of the stripe, before moving to the next stripe, which is advantageous as it reduces the required size of the local storage unit 322 (shown in Fig. 11). The required cache size is the product of the stripe width P plus the overhead N from any previous stripe, and the height required for the FIR computation, i.e., maximum filter length * (P+N). In order to filter subsequent stripes, e.g., stripes 454 and 456, the 2D fast repeated integral images filter 300D requires information from the preceding stripe, which is the overhead N. Thus, additional memory accesses in the amount of N samples/row/block is provided by the IIR Output Buffer 303. The 2D fast repeated integral images filter 300D is considered a single pass filter even though multiple stripes are sequentially filtered as the data is from the image is read once from each location of the image. The size of P is chosen so that N/P is small. For example, choosing P=80 and N=25, the extra overhead in memory access is 30%, i.e., execution time is increased by 30%. With P=80, however, the required cache size of the local storage unit 372, is less than 20kB and is independent of the image size.

[0056] Fig. 15 is a flow chart illustrating the method 500 of the fast repeated integral images filter. As illustrated, image data that includes an array of pixels is received by the fast repeated integral images filter (502). The image data is filtered by the fast repeated integral images filter to generate filtered image data (504). The filtered image data is generated by performing at least one IIR filter and at least one FIR filter on a plurality of lines of pixels (e.g., rows of pixels) in the image data that extend in the first direction (506). The filtered image data is further generated by performing at least one at least one IIR filter and at least one FIR filter on a plurality of the lines of pixels (e.g., columns of pixels) that extend in the second direction thereby producing the filtered image data (508). The filtered image data is output and stored, e.g., in the storage unit, e.g., storage unit 230 (510). The generation of the filtered image data (504) may be performed in two passes, e.g., using ID fast repeated integral images filter 210. For example, performing at least one at least one IIR filter and at least one FIR filter on the lines of pixels in the first direction (506), produces partially filtered image data, which is provided to and stored, e.g., in the storage unit 230 (507) and retrieved prior to performing the IIR filter and FIR filter on the lines of pixels in the second direction in the partially filtered image data (508). All of the lines may be filtered in the first direction before filtering all of the lines in the second direction and, thus,

communication with the storage unit 230 is minimized. If desired, however, if desired, a subset of lines may be filtered at a time, which will require additional passes.

[0057] Alternatively, the generation of the filtered image data (504) may be performed in a single pass, e.g., using the 2D fast repeated integral images filters described above, in which case step 507 is unnecessary. Additionally, if the 2D fast repeated integral images filter is used to generate the filtered image data, the image data may be striped so that multiple portions of the image are sequentially filtered, as discussed above, in which case certain portions of the striped image (e.g., overhead N) are filtered twice resulting in additional overhead.

[0058] The fast repeated integral images method may be performed by any apparatus with the appropriate architecture. By way of example, Fig. 16 illustrates a block diagram of a mobile platform 600 which is capable performing the fast repeated integral images method. A mobile platform refers to any portable electronic device such as a cellular or other wireless communication device, personal communication system (PCS) device, personal navigation device (PND), Personal Information Manager (PIM), Personal Digital Assistant (PDA), or other suitable mobile device. The fast repeated integral images method may be performed in devices that stationary or partially stationary as well.

[0059] The mobile platform 600 is illustrated as including a camera 602, which is capable of capturing images of video frames to be filtered. The mobile platform 600 also includes a user interface 610 that includes the display 612, e.g., capable of displaying images captured by the camera 602. The user interface 610 may also include a keypad 618 or other input device through which the user can input information into the mobile platform 600. If desired, the keypad 618 may be obviated by integrating a virtual keypad into the display 612 with a touch sensor. The user interface 610 may also include a microphone 616 and speaker 614, e.g., if the mobile platform is a cellular telephone. Of course, mobile platform 600 may include other elements unrelated to the present disclosure.

[0060] The mobile platform 600 also includes a control unit 620 that is connected to and communicates with the camera 602 and user interface 610, along with other features. The control unit 620 may be provided by a processor 622 and associated memory/storage 624, which may include software 626, as well as hardware 628, and firmware 630. The mobile platform 600 is illustrated as including a fast repeated integral images filter 632, which may be any of the ID fast repeated integral images filters or any of the 2D fast repeated integral images filters described above, and which is coupled to local storage 222 and coupled to storage unit 230 through the data bus 240. The fast repeated integral images filter 632 is illustrated separately from processor 622 for clarity, but may be implemented in the processor 622 based on instructions in the software 626 which is run in the processor 622. It will be understood as used herein that the processor 622, as well as the fast repeated integral images filter 632 can, but need not necessarily include, one or more microprocessors, embedded processors, controllers, application specific integrated circuits (ASICs), digital signal processors (DSPs), and the like. The term processor is intended to describe the functions implemented by the system rather than specific hardware. Moreover, as used herein the terms "memory" and "storage" refers to any type of computer storage medium, including long term, short term, or other memory associated with the mobile platform, and is not to be limited to any particular type of memory or number of memories, or type of media upon which memory is stored.

[0061] The methodologies described herein may be implemented by various means depending upon the application. For example, these methodologies may be

implemented in hardware 628, firmware 630, software 626, or any combination thereof. For a hardware implementation, the fast repeated integral images filter 632 may be implemented within one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, electronic devices, other electronic units designed to perform the functions described herein, or a combination thereof.

[0062] For a firmware and/or software implementation, the methodologies may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. Any machine-readable medium tangibly embodying instructions may be used in implementing the methodologies described herein. For example, software codes may be stored in memory 624 and executed by the processor 622. Memory may be implemented within or external to the processor 622.

[0063] If implemented in firmware and/or software, the functions may be stored as one or more instructions or code on a computer-readable medium. Examples include non- transitory computer-readable media encoded with a data structure and computer- readable media encoded with a computer program. Computer-readable media includes physical computer storage media. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer- readable media can comprise RAM, ROM, Flash Memory, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer; disk and disc, as used herein, includes compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and blu-ray disc where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

[0064] Although the present invention is illustrated in connection with specific embodiments for instructional purposes, the present invention is not limited thereto. Various adaptations and modifications may be made without departing from the scope of the invention. Therefore, the spirit and scope of the appended claims should not be limited to the foregoing description.