


Title:
OBJECT DETECTION BASED ON ANALYSIS OF A SEQUENCE OF IMAGES
Document Type and Number:
WIPO Patent Application WO/2020/035524
Kind Code:
A1
Abstract:
The proposal concerns a method, an apparatus, and a computer program comprising program code for detecting a wanted object in a sequence of images captured by an image sensor. Information about a bounding rectangle of a moving object in the sequence of images and information about a contour area of the moving object are obtained (10). Geometrical coefficients are then determined (11) for the moving object using the information about the bounding rectangle and the information about the contour area. Based on at least the geometrical coefficients it is determined (12) whether the moving object is a wanted moving object.

Inventors:
CHMIELEWSKI INGO (DE)
SIEMENS EDUARD (DE)
MATVEEV IVAN (DE)
Application Number:
PCT/EP2019/071797
Publication Date:
February 20, 2020
Filing Date:
August 14, 2019
Assignee:
HOCHSCHULE ANHALT (DE)
International Classes:
G06K9/00
Domestic Patent References:
WO 03/098977 A1 (2003-11-27)
Foreign References:
US 2015/0289337 A1 (2015-10-08)
Other References:
ANTONIO FERNÁNDEZ-CABALLERO ET AL: "Thermal-Infrared Pedestrian ROI Extraction through Thermal and Motion Information Fusion", SENSORS, vol. 14, no. 4, 10 April 2014 (2014-04-10), pages 6666 - 6676, XP055558461, DOI: 10.3390/s140406666
EUN JEON ET AL: "Human Detection Based on the Generation of a Background Image by Using a Far-Infrared Light Camera", SENSORS, vol. 15, no. 3, 19 March 2015 (2015-03-19), pages 6763 - 6788, XP055558462, DOI: 10.3390/s150306763
BERTOZZI M ET AL: "Pedestrian Detection for Driver Assistance Using Multiresolution Infrared Vision", IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 53, no. 6, November 2004 (2004-11-01), pages 1666 - 1678, XP011122457, ISSN: 0018-9545, DOI: 10.1109/TVT.2004.834878
ZHAO XINYUE ET AL: "Robust pedestrian detection in thermal infrared imagery using a shape distribution histogram feature and modified sparse representation classification", PATTERN RECOGNITION, vol. 48, no. 6, 24 December 2014 (2014-12-24), pages 1947 - 1960, XP029200786, ISSN: 0031-3203, DOI: 10.1016/J.PATCOG.2014.12.013
T. TRNOVSZKY ET AL.: "Comparison of Background Subtraction Methods on Near Infra-Red Spectrum Video Sequences", 2017
Attorney, Agent or Firm:
SCHMIDT-UHLIG, Thomas (DE)
Claims:

1. A method of detecting a wanted moving object (60) in a sequence of near-infrared images captured by a low-resolution image sensor (3), the method comprising:

- obtaining (10) information about a bounding rectangle (61) of a moving object (60) in the sequence of near-infrared images and information about a contour area of the moving object (60);

- determining (11, 51) geometrical coefficients for the moving object (60) using the information about the bounding rectangle (61) and the information about the contour area; and

- determining (12, 55) whether the moving object (60) is a wanted moving object based on at least the geometrical coefficients.

2. The method according to claim 1, wherein the sequence of images is preprocessed (41) by a background subtraction algorithm and a movement detection algorithm.

3. The method according to claim 1 or 2, wherein a filtering is applied to the moving object (60) by determining whether the bounding rectangle overlaps with a margin area (62) of the image sensor (3).

4. The method according to any of the preceding claims, wherein the geometrical coefficients determined (11, 51) for the moving object (60) include an area coefficient, which is determined from a height and a width of the bounding rectangle (61) and the contour area of the moving object (60), and wherein the moving object (60) is determined (12, 55) not to be a wanted moving object if the area coefficient does not fall within a first range of values.

5. The method according to claim 4, wherein for determining the area coefficient a ratio of the height of the bounding rectangle (61) to the width of the bounding rectangle (61) is compared with boundary values.

6. The method according to claim 4 or 5, wherein the geometrical coefficients determined (11, 51) for the moving object (60) include an extent coefficient, which is determined from a ratio of the contour area of the moving object (60) to an area of the bounding rectangle (61), and wherein the moving object (60) is determined (12, 55) not to be a wanted moving object if the extent coefficient is smaller than a threshold.

7. The method according to claim 6, wherein the moving object is split (52) into two or more sub-objects when the extent coefficient is smaller than a threshold and the area coefficient falls within a second range of values.

8. The method according to any of the preceding claims, wherein a brightness coefficient is determined (53) for the moving object (60) from a ratio of bright areas within the contour area of the moving object (60) to the contour area of the moving object (60), and wherein the moving object (60) is determined (12, 55) not to be a wanted moving object if the brightness coefficient is larger than a threshold.

9. The method according to any of the preceding claims, wherein an object number value is determined (54), which indicates the number of objects found in an image of the sequence of images, wherein the moving object (60) is determined (12, 55) not to be a wanted moving object if the object number value is larger than a threshold.

10. The method according to any of the preceding claims, wherein the low-resolution image sensor (3) is configured to capture the sequence of near-infrared images in a wavelength range from 700 nm to 1200 nm.

11. The method according to any of the preceding claims, wherein the wanted moving object (60) is a pedestrian or a cyclist.

12. The method according to any of the preceding claims, wherein the low-resolution image sensor (3) has a resolution of 100x75 pixels.

13. An apparatus (20, 30) for detecting a wanted object in a sequence of near-infrared images captured by a low-resolution image sensor (3), characterized in that the apparatus (20, 30) is configured to perform the steps of a method according to one of claims 1 to 12.

14. Computer program comprising program code which, when executed by a computing system, causes the computing system to perform the steps of a method according to one of claims 1 to 12 for detecting a wanted object in a sequence of near-infrared images captured by a low-resolution image sensor (3).

15. A lighting unit (1) comprising:

- a light source (2);

- a low-resolution image sensor (3);

- an apparatus (20, 30) according to claim 13 for detecting a wanted moving object in a sequence of near-infrared images captured by the low-resolution image sensor (3); and

- a device (4) for switching on and off the light source (2) responsive to detecting a wanted moving object.

Description:
Object detection based on analysis of a sequence of images

The present disclosure is related to the field of object detection in computer vision and image processing. More particularly, the present disclosure is related to outdoor detection of moving objects to control lighting of sidewalks at nighttime.

In order to provide intelligent or adaptive lighting of sidewalks, it is necessary to detect moving objects, in particular pedestrians or cyclists. Lighting is dimmed when no activity is detected, but brightened when movement is detected. Motion detection typically uses dedicated motion sensors or computer vision approaches.

For example, WO 03/098977 A1 discloses a device for switching on and off a light source in a lighting unit that is designed to form part of a group of lighting units arranged along a stretch of road, track or the like. The device comprises a timing circuit designed to generate a signal corresponding to an activation period for a light source comprised in the lighting unit and a movement detector, which is arranged to detect moving objects such as persons and vehicles in its detection area and, when that occurs, to generate an activation signal.

Two widely used kinds of motion sensors are passive infrared and ultrasonic sensors. Passive infrared sensors are compact and low-cost and have low power consumption. However, their operation depends heavily on the ambient temperature and the ambient brightness level. In addition, outdoor passive infrared sensors are not able to detect fast-moving objects; the detection threshold is typically limited to a speed of about 3 m/s.

Ultrasonic technology enables obtaining distance information relative to the moving object. The distance information can be used to calculate the velocity of the object's motion. Unfortunately, multipath reception of the signal distorts the measurement of the distance between the emitter and the receiver. In addition, these sensors are sensitive to temperature changes, as the temperature has a significant impact on the speed of sound. A further disadvantage is the typically quite small opening angle of operation of about 30-40°.

Another type of motion sensor is the radio wave sensor, which is commonly used for object detection in security or surveillance systems. Radio wave sensors have a high sensitivity, which can lead to incorrect operation of a detector, e.g. due to vibrating equipment or small animals. The efficiency of radio wave sensors also depends on the ambient conditions.

Hybrid systems based on combinations of ultrasonic sensors and passive infrared sensors, or of radio wave sensors and passive infrared sensors, focus on indoor object detection and localization. Such systems are thus hardly suitable for applications related to street lighting.

One widely used computer vision approach for object detection uses algorithms based on histograms of oriented gradients combined with a support vector machine. This approach detects an object's shape using a feature descriptor to represent it. The advantage of the method is the possibility to detect an object independently of the conditions under which an image was taken, e.g. the angle of the camera view, the image resolution, the distance between the object and the camera, or the lighting conditions. In addition, histograms of oriented gradients allow achieving a high accuracy and detecting objects based on a single frame in outdoor scenes. However, the use of histograms of oriented gradients requires high computational effort due to the complexity of the algorithm, which makes the method not applicable for deployment on low-performance SoCs (Systems on a Chip).

Another approach in computer vision is object detection by means of frame differencing or background subtraction. These methods can be used to detect moving objects by calculating differences between reference images and newly captured images. Background subtraction methods build an average background model, which is regularly updated to adapt to varying lighting conditions. The most widespread methods are based on Gaussian mixture models, such as GMG, KNN, MOG, or MOG2 [1]. These methods are convenient for movement detection using sequences of images in the near-infrared spectrum and are thus suitable for application to smart lighting systems. A further advantage of this approach is that it does not require high computational power and can be implemented on low-performance SoCs. However, an optimization for one-channel images is required in order to achieve feasible results. Also, such methods do not provide functionality for differentiating between types of objects. Therefore, they do not allow allocating object types to the detected objects. In addition, the approach is prone to a high false-positive detection error rate due to weak robustness against dynamic changes in illumination.
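
The background subtraction idea described above can be sketched in a few lines. The following is a deliberately simplified running-average model on one-channel images, standing in for the Gaussian mixture models (e.g. MOG2) named in the text; the blending factor and threshold are illustrative assumptions, not values from this disclosure.

```python
# Simplified one-channel background subtraction: a running-average background
# model plus a fixed difference threshold. Illustrative stand-in for Gaussian
# mixture models such as MOG2; alpha and threshold are assumed values.

def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background model (running average)."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]

def foreground_mask(background, frame, threshold=30):
    """Binary mask: 1 where the frame deviates strongly from the background."""
    return [[1 if abs(f - b) > threshold else 0 for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]

# Tiny example: a 3x3 dark scene with one bright moving pixel.
bg = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
frame = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
mask = foreground_mask(bg, frame)  # only the center pixel is foreground
```

In practice a Gaussian mixture model keeps several intensity modes per pixel rather than a single average, which is what gives MOG2 its robustness to gradual lighting changes.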

In summary, existing detection means that can be used in an outdoor environment for detecting pedestrians, cyclists or other moving objects have a number of significant disadvantages, in most cases a high detection error rate and a low detection range. Their detection reliability heavily depends on the operating conditions. Also, these detection means have limitations regarding the type of object to be detected.

Widely used computer vision methods have achieved significant success in the object detection task. However, these methods either require high computational power of the processing unit or do not support separation of object types.

There is, therefore, a need for an improved, less computing intensive approach for detection of a wanted moving object in a sequence of images.

This object is achieved by a method according to claim 1, a corresponding apparatus according to claim 13 for performing the method, and a corresponding computer program according to claim 14. The dependent claims include advantageous further developments and improvements of the present principles as described below.

In a first aspect, a method of detecting a wanted moving object in a sequence of images captured by an image sensor is described. The method comprises:

- obtaining information about a bounding rectangle of a moving object in the sequence of images and information about a contour area of the moving object;

- determining geometrical coefficients for the moving object using the information about the bounding rectangle and the information about the contour area; and

- determining whether the moving object is a wanted moving object based on at least the geometrical coefficients.

In a second aspect, an apparatus for detecting a wanted moving object in a sequence of images captured by an image sensor is described. The apparatus comprises:

- means for obtaining information about a bounding rectangle of a moving object in the sequence of images and information about a contour area of the moving object;

- means for determining geometrical coefficients for the moving object using the information about the bounding rectangle and the information about the contour area; and

- means for determining whether the moving object is a wanted moving object based on at least the geometrical coefficients.

In a third aspect, a computer program comprises program code, which, when executed by a computing system, causes the computing system to:

- obtain information about a bounding rectangle of a moving object in the sequence of images captured by an image sensor and information about a contour area of the moving object;

- determine geometrical coefficients for the moving object using the information about the bounding rectangle and the information about the contour area; and

- determine whether the moving object is a wanted moving object based on at least the geometrical coefficients.

The proposed solution concerns an approach for robust detection of objects in a sequence of images, e.g. infrared images, in a dark street environment. For instance, the objects may be pedestrians and cyclists. The approach focuses on detecting the type of moving object within a control area via analysis and evaluation of geometrical parameters of the object. As input data, information about a bounding rectangle and information about a contour area of the moving object are used. This information may be obtained by processing the sequence of images, by receiving results of a dedicated preprocessing stage, or by retrieving available processing results from a memory. The proposed solution is especially advantageous for object detection and classification in dark environments.

In one advantageous embodiment, the sequence of images is preprocessed by a background subtraction algorithm and a movement detection algorithm. This allows generating binary masks containing a moving object, which greatly simplifies further processing of the information. In addition, a wide range of implementations of background subtraction algorithms is readily available.

In one advantageous embodiment, a filtering is applied to the moving object by determining whether the bounding rectangle overlaps with a margin area of the image sensor. When the object enters or leaves the camera's field of vision, the object is not fully visible in the image. This can lead to an incorrect determination of its geometrical parameters. To detect crossing of the image borders, margins are used. Only moving objects that are fully located in the image, i.e. that do not intersect the image border, are considered. In this way, an erroneous evaluation is avoided.

In one advantageous embodiment, the geometrical coefficients determined for the moving object include an area coefficient, which is determined from a height and a width of the bounding rectangle and the contour area of the moving object. The moving object is determined not to be a wanted moving object if the area coefficient does not fall within a first range of values. The area coefficient allows unifying the geometrical parameters, such as the bounding rectangle area, the contour area and the orientation, into a scalar value. In addition, this coefficient has been found to fall into typical value ranges for pedestrians and cyclists. In case the area coefficient is outside those ranges, it can be concluded that the moving object is neither a pedestrian nor a cyclist.

In one advantageous embodiment, for determining the area coefficient a ratio of the height of the bounding rectangle to the width of the bounding rectangle is compared with boundary values. By evaluating this ratio an orientation coefficient can be determined, which can be used to filter out very elongated horizontal and vertical objects. Such objects have a low probability of belonging to the desired class of objects, in this case pedestrians and cyclists. Preferably, the ratio is compared with minimal and maximal values of the height-to-width ratio of the bounding rectangle that are most probable and typical for a pedestrian or a cyclist. For example, the value of the orientation coefficient can be negative when the object does not have the elongated shape typical of a pedestrian or a cyclist. In such a case, the area coefficient is also negative, which enables an easy evaluation.

In one advantageous embodiment, the geometrical coefficients determined for the moving object include an extent coefficient, which is determined from a ratio of the contour area of the moving object to an area of the bounding rectangle. The moving object is determined not to be a wanted moving object if the extent coefficient is smaller than a threshold. The extent can be interpreted as a coefficient indicating how much of the bounding rectangle is filled by the object contour. This parameter allows eliminating insignificant fluctuations of background objects or lighting spots. Such artifacts are frequently present in the images as disproportionate and non-uniform objects.
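
The extent coefficient has a concrete definition in the text (contour area divided by bounding rectangle area); the orientation check compares the height-to-width ratio against boundary values. The following sketch implements both; the specific boundary values are assumptions, since the disclosure leaves them to empirical tuning.

```python
# Extent coefficient and orientation check, as described above.
# The min/max ratio boundaries are assumed illustrative values.

def extent_coefficient(contour_area, rect_w, rect_h):
    """Fraction of the bounding rectangle filled by the object contour."""
    return contour_area / float(rect_w * rect_h)

def plausible_orientation(rect_w, rect_h, min_ratio=1.0, max_ratio=4.0):
    """True if the height/width ratio lies in a range typical for an
    upright pedestrian or cyclist (boundary values are assumptions)."""
    ratio = rect_h / float(rect_w)
    return min_ratio <= ratio <= max_ratio

# A compact upright object (20 wide, 40 tall, contour area 600) fills
# three quarters of its rectangle and has a plausible orientation.
ec = extent_coefficient(600, 20, 40)
ok = plausible_orientation(20, 40)
```

An object failing either test (a low extent, or a very flat or very thin rectangle) would be rejected before the remaining coefficients are evaluated.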

In one advantageous embodiment, the moving object is split into two or more sub-objects in dependence on the geometrical coefficients. There are cases in which the shape of a moving object and the lighting spot of its lamps are presented as one object. This can lead to a false-negative detection, because the shape of the object and, as a consequence, the area coefficient are not typical for the specific type of object. To reduce the probability of non-detected objects, depending on the geometrical coefficients an object may be split into two or more sub-objects, e.g. with equal widths of the respective bounding rectangles along a vector of the supposed movement direction. After the splitting procedure, the basic coefficients are calculated for these sub-objects. Preferably, the moving object is only split into two or more sub-objects when the extent coefficient is smaller than a threshold and the area coefficient falls within a second range of values. This ensures that only objects whose bounding rectangle is insufficiently filled by the object contour area and whose area coefficient is within specific limits are split.
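
The splitting step itself is a simple partition of the bounding rectangle. The sketch below splits along the horizontal axis, assuming (as the text suggests for a supposed movement direction) that the object moves horizontally; the rectangle format (x, y, w, h) is an assumption, and any remainder pixels from integer division are simply dropped.

```python
# Hedged sketch of object splitting: partition a bounding rectangle into
# n sub-rectangles of equal width along the assumed movement direction.
# Remainder pixels from integer division are ignored for simplicity.

def split_object(rect, n=2):
    """Return n sub-rectangles of equal width covering the input rect."""
    x, y, w, h = rect
    sub_w = w // n
    return [(x + i * sub_w, y, sub_w, h) for i in range(n)]

# A wide, poorly filled object is split in two; the basic coefficients
# would then be recomputed for each half.
halves = split_object((10, 5, 40, 20))
```

After splitting, each sub-rectangle is analyzed independently, so a pedestrian merged with a lamp's lighting spot can still be recognized in one of the halves.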

In one advantageous embodiment, a brightness coefficient is determined for the moving object from a ratio of bright areas within the contour area of the moving object to the contour area of the moving object. The moving object is determined not to be a wanted moving object if the brightness coefficient is larger than a threshold. Sudden lighting changes produced by the lamps of moving vehicles can significantly influence the detection performance. For example, a lighting spot of a vehicle can take the geometrical shape of an object, which leads to a false-positive detection.

Evaluation of the brightness coefficient allows estimating the presence of lighting spots on objects that were produced by background subtraction.

In one advantageous embodiment, an object number value is determined, which indicates the number of objects found in an image of the sequence of images. The moving object is determined not to be a wanted moving object if the object number value is larger than a threshold. The movement of one pedestrian or cyclist typically produces a rather small number of objects in the background mask. In contrast, car movement causes many more objects, mostly light reflections. When the object number value is larger than the empirically determined threshold, this is an indication of movement of a vehicle in the vicinity of the control area.

Advantageously, an apparatus for detecting a wanted moving object in a sequence of images is used by a lighting unit. To this end, the lighting unit comprises a light source and an image sensor. In addition, the lighting unit comprises a device for switching the light source on and off responsive to the detection of a wanted moving object by the apparatus.
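
Both checks above reduce to small ratio and count comparisons. The sketch below assumes a binary contour mask and a one-channel grayscale image; the brightness threshold and object-count limit are illustrative assumed values, since the disclosure determines them empirically.

```python
# Hedged sketches of the brightness coefficient and object-count checks.
# bright_threshold and the object-count limit are assumed values.

def brightness_coefficient(mask, gray, contour_area, bright_threshold=200):
    """Ratio of bright pixels inside the object's contour to the contour
    area. `mask` marks contour pixels; `gray` holds pixel intensities."""
    bright = sum(1 for mrow, grow in zip(mask, gray)
                 for m, g in zip(mrow, grow) if m and g > bright_threshold)
    return bright / float(contour_area)

def exceeds_object_limit(num_objects, threshold=5):
    """True if more objects were found in the frame than a single
    pedestrian or cyclist would typically produce (assumed threshold)."""
    return num_objects > threshold
```

A high brightness coefficient flags a probable headlight spot; a high object count flags vehicle reflections, and in either case the candidate is rejected as not wanted.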

In the following, the invention will be described by means of advantageous embodiments with reference to the figures of a number of drawings. Herein:

Fig. 1 illustrates a flowchart of an exemplary method of detecting a wanted moving object in a sequence of images captured by an image sensor in accordance with an embodiment of the present disclosure;

Fig. 2 illustrates an exemplary apparatus for detecting a wanted moving object in a sequence of images captured by an image sensor in accordance with an embodiment of the present disclosure;

Fig. 3 illustrates an exemplary apparatus for detecting a wanted moving object in a sequence of images captured by an image sensor in accordance with a further embodiment of the present disclosure;

Fig. 4 schematically shows a lighting unit comprising an apparatus for detecting a wanted moving object in a sequence of images captured by an image sensor;

Fig. 5 illustrates a flowchart of an embodiment of a complete detection process including preprocessing and object analysis;

Fig. 6 illustrates margins used for margin crossing filtering;

Fig. 7 shows a filtered background subtraction mask;

Fig. 8 shows an image with originally detected bounding rectangles;

Fig. 9 shows a split background subtraction mask; and

Fig. 10 shows an image with bounding rectangles after object splitting.

The present description illustrates the principles of the present disclosure. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the disclosure.

All examples and conditional language recited herein are intended for educational purposes to aid the reader in understanding the principles of the disclosure and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.

Moreover, all statements herein reciting principles, aspects, and embodiments of the disclosure, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure. Thus, for example, it will be appreciated by those skilled in the art that the diagrams presented herein represent conceptual views of illustrative circuitry embodying the principles of the disclosure.

The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, read only memory (ROM) for storing software, random access memory (RAM), and nonvolatile storage.

Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.

In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The disclosure as defined by such claims resides in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.

In Fig. 1 a flowchart of an exemplary method of detecting a wanted moving object in a sequence of images captured by an image sensor is shown. In a first step, information about a bounding rectangle of a moving object in the sequence of images and information about a contour area of the moving object are obtained 10. This can be done, for example, by processing the sequence of images, by receiving results of a dedicated preprocessing stage, or by retrieving available processing results from a memory. Geometrical coefficients are then determined 11 for the moving object using the information about the bounding rectangle and the information about the contour area. Based on at least the geometrical coefficients it is determined 12 whether the moving object is a wanted moving object.

A block diagram of a first embodiment of an apparatus 20 for detecting a wanted moving object in a sequence of images captured by an image sensor is illustrated in Fig. 2.

The apparatus 20 has means for obtaining information about a bounding rectangle of a moving object in the sequence of images and information about a contour area of the moving object. For example, the means may comprise an input 21 for receiving results from a dedicated preprocessing system (not shown). The means may likewise comprise a processing unit 22 for processing the sequence of images. Alternatively, available processing results may be retrieved from a local storage unit 26 or an external storage unit (not shown) via the input 21. The apparatus 20 further has means 23 for determining geometrical coefficients for the moving object using the information about the bounding rectangle and the information about the contour area. For example, the means 23 may comprise a processor or an SoC. In addition, the apparatus has means 24 for determining whether the moving object is a wanted moving object based on at least the geometrical coefficients. For example, the means 24 may comprise a logic unit.

The various units 22, 23, 24 of the apparatus 20 may be controlled by a controller 25. The object detection result is preferably made available via an output 27. It may also be stored on the local storage unit 26. The output 27 may also be combined with the input 21 into a single bidirectional interface. A user interface 28 may be provided for enabling a user to modify settings of the various units 22, 23, 24 or the controller 25. The different units 22, 23, 24 and the controller 25 can be embodied as dedicated hardware units. Of course, they may likewise be fully or partially combined into a single unit or implemented as software running on a processor.

A block diagram of a second embodiment of an apparatus 30 for detecting a wanted moving object in a sequence of images captured by an image sensor is illustrated in Fig. 3.

The apparatus 30 comprises a processing device 31 and a memory device 32 storing instructions that, when executed, cause the apparatus to perform steps according to one of the described methods. For example, the processing device 31 can be a processor adapted to perform the steps according to one of the described methods. In an embodiment according to the present principles, said adaptation comprises that the processor is configured, e.g. programmed, to perform steps according to one of the described methods.

A processor as used herein may include one or more processing units, such as microprocessors, digital signal processors, or a combination thereof.

The local storage unit 26 and the memory device 32 may include volatile and/or non-volatile memory regions and storage devices such as hard disk drives and DVD drives. A part of the memory is a non-transitory program storage device readable by the processing device 31, tangibly embodying a program of instructions executable by the processing device 31 to perform program steps as described herein according to the present principles.

In an embodiment, a computer program comprises program code, which, when executed by a computing system, causes the computing system to perform the method according to the present principles.

Fig. 4 schematically shows a lighting unit 1 implementing object detection according to the present principles. The lighting unit 1 has a light source 2 and an image sensor 3, e.g. an infrared camera. An apparatus 20, 30 as depicted in Fig. 2 or Fig. 3 is provided for detecting a wanted moving object in a sequence of images captured by the image sensor 3. Responsive to detecting a wanted moving object by the apparatus 20, 30 a switching device 4 switches the light source 2 on and off.

In the following, an embodiment of the present principles shall be described in greater detail.

In this embodiment, a low-resolution static camera with an objective sensitive to near-infrared wavelengths (typically 700-1200 nm) is used as an image sensor. The camera is equipped with an array of infrared LEDs. The interface between the camera and the Linux-based SoC, which is used as a processing device, can be implemented via CSI (Camera Serial Interface) or USB.

The overall detection process is shown in Fig. 5. The detection process can be divided into two stages, namely an optional preprocessing stage 40 and an object analysis stage 50. The preprocessing stage 40 provides the base for the subsequent object analysis procedure. It includes the following steps:

- Preprocessing and filtering 41;

- Finding primary data 42.

In the object analysis stage 50 a sequence of procedures is executed, namely:

- Calculation of basic coefficients 51;

- Object splitting 52 (optional);

- Finding brightness coefficient 53;

- Estimation of number of objects 54;

- Overall detection (estimation of coefficients) 55.

In the preprocessing and filtering step 41, background subtraction methods with a movement detection approach as mentioned further above are used for primary image processing. Images captured by the static camera, i.e. a camera mounted on a fixed object with a relatively constant background scene, are processed by a background subtraction algorithm. The processing result is a binary mask containing a moving object, which is to be processed further. Preferably, an adaptive Gaussian mixture model (MOG2 [1]) is used for background subtraction.

The filtering applies morphological operations to an image produced by the previous background subtraction stage. Erosion is performed to remove white noise caused by insignificant fluctuations and gradual lighting changes. Dilation is used for increasing the region of an object to join broken parts, because erosion tends to shrink objects.
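The erosion and dilation steps above can be sketched in a few lines of NumPy. This is a minimal illustration with a 3x3 structuring element and a hypothetical toy mask; a production implementation would instead use cv2.createBackgroundSubtractorMOG2 for the subtraction and cv2.erode/cv2.dilate for the morphology.

```python
import numpy as np

def erode(mask, iterations=1):
    """Binary erosion with a 3x3 structuring element: a pixel survives
    only if its whole 3x3 neighbourhood is set. Removes small white noise."""
    for _ in range(iterations):
        padded = np.pad(mask, 1, mode="constant")
        # Minimum over the 3x3 neighbourhood of every pixel
        shifted = [padded[i:i + mask.shape[0], j:j + mask.shape[1]]
                   for i in range(3) for j in range(3)]
        mask = np.min(shifted, axis=0)
    return mask

def dilate(mask, iterations=1):
    """Binary dilation with a 3x3 structuring element: grows object
    regions to re-join parts broken apart by erosion."""
    for _ in range(iterations):
        padded = np.pad(mask, 1, mode="constant")
        shifted = [padded[i:i + mask.shape[0], j:j + mask.shape[1]]
                   for i in range(3) for j in range(3)]
        mask = np.max(shifted, axis=0)
    return mask

# Hypothetical 8x8 foreground mask: a 3x3 object plus one isolated noise pixel
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:5, 2:5] = 1   # object
mask[0, 7] = 1       # noise pixel
cleaned = dilate(erode(mask))
```

After erosion the isolated noise pixel disappears, and the subsequent dilation restores the eroded object to roughly its original footprint.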

The step 42 of finding primary data takes a filtered binary mask as input data. In case an object is present in the binary mask, the following geometrical data of the object are determined:

- The contour area (represented by a number);

- A bounding rectangle (represented by an array with corner coordinates and side lengths).

This data may be determined using common methods of the Open Source Computer Vision Library as well as Green's theorem.
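With the Open Source Computer Vision Library, this primary data would typically be obtained via cv2.findContours, cv2.contourArea (which applies Green's theorem) and cv2.boundingRect. The sketch below, written in plain NumPy so it runs without OpenCV, approximates the same data for a single object in a binary mask; counting foreground pixels as the contour area is a simplifying assumption.

```python
import numpy as np

def primary_data(mask):
    """Return (contour_area, (x, y, w, h)) for the single object in a
    binary mask. The area is approximated by the foreground pixel count;
    OpenCV would instead integrate along the contour (Green's theorem)."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return 0, None                    # no object present
    x, y = int(xs.min()), int(ys.min())
    w = int(xs.max()) - x + 1             # bounding-rectangle width
    h = int(ys.max()) - y + 1             # bounding-rectangle height
    return int(mask.sum()), (x, y, w, h)

# Hypothetical 6x8 mask with one 2x3 object
mask = np.zeros((6, 8), dtype=np.uint8)
mask[1:3, 2:5] = 1
area, rect = primary_data(mask)
```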

The object analysis stage 50 includes a calculation of basic coefficients, a resolution of problematic cases, i.e. ambiguous scenes that can cause false detection, and a final evaluation of the obtained parameters with the aim of detecting wanted objects and avoiding the influence of obstructing factors.

During the calculation 51 of basic coefficients, parameters of the moving object are determined using the previously obtained geometrical data. However, first a margin crossing filtering is performed. When the object enters or leaves the camera's field of vision, the object's body is not completely present in the image. This can lead to an incorrect determination of its geometrical parameters. For instance, an appearing car can, at some moment of entering the image, have geometrical parameters corresponding to a pedestrian or a cyclist. To detect crossing of image borders, margins are used.

Margins are shown as hashed regions in Fig. 6. The status of the margin crossing by an object allows considering only those moving objects whose body is fully located in the image, i.e. only objects that do not have intersections with the image border, to avoid an erroneous evaluation.

Two margin areas are preferably adjoined to the left and right image borders, as crossing of these borders is the most likely. The borders of the margin areas are located at the coordinates x_0 and x_1 and are shifted relative to the image borders by about 3 to 4 pixels.

The status of the margin crossing is True if the object has no intersections with the image borders, otherwise the status is False:

s_cross_status = True if f_0c > x_0 ∧ f_1c < x_1, and False otherwise,   (1)

where f_0c and f_1c are the coordinates of the left side and the right side of the bounding rectangle of the object, respectively.
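The margin check of formula (1) reduces to a one-line predicate. The sketch below assumes pixel coordinates, with x0 and x1 as the inner borders of the left and right margin areas:

```python
def margin_cross_status(f0c, f1c, x0, x1):
    """True if the bounding rectangle lies fully between the margins,
    i.e. the object does not intersect the image borders (formula (1))."""
    return f0c > x0 and f1c < x1

# Hypothetical 100-pixel-wide image with 4-pixel margins: x0 = 4, x1 = 95
inside = margin_cross_status(10, 60, 4, 95)    # object fully inside
crossing = margin_cross_status(0, 30, 4, 95)   # object touches left border
```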

To unify the geometrical parameters, such as bounding rectangle area, contour area and orientation, into a scalar value, the area coefficient c_rect is used. This geometrical coefficient is calculated as:

c_rect = A_obj · k · (h² + 2·h·w + w²) / (4·h·w)   (2)

where A_obj is the object contour area, h and w are the height and the width of the bounding rectangle, respectively, and k is an orientation coefficient:

k = 1 if t_min ≤ h/w ≤ t_max, and k = −1 otherwise,   (3)

where t_min and t_max are the minimal and maximal values of the ratio h/w for the bounding rectangle which are the most probable and typical for a pedestrian or a cyclist. The values of t_min and t_max do not depend on the image resolution. Suitable values, which have been found empirically, are t_min = 0.7 and t_max = 2.5.

The coefficient k is used to filter out very elongated horizontal and vertical objects, which have a low probability of belonging to the desired class of objects. In other words, in this implementation the value of the area coefficient is negative when the object does not have the typical elongation of the shape of a pedestrian or a cyclist.
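Formulas (2) and (3) combine into a single scalar. The sketch below is a minimal illustration; the ±1 form of k is inferred from the statement that the area coefficient turns negative for atypical elongations:

```python
def area_coefficient(a_obj, h, w, t_min=0.7, t_max=2.5):
    """Area coefficient c_rect of formula (2). The orientation
    coefficient k (formula (3)) is +1 for typical pedestrian/cyclist
    elongations of h/w and -1 for very flat or very elongated objects."""
    k = 1.0 if t_min <= h / w <= t_max else -1.0
    return a_obj * k * (h**2 + 2*h*w + w**2) / (4*h*w)

# Upright object (h/w = 2.0) vs. a flat horizontal streak (h/w = 0.1)
c_upright = area_coefficient(a_obj=300.0, h=20, w=10)
c_flat = area_coefficient(a_obj=300.0, h=2, w=20)
```

For the upright object the coefficient is positive; for the flat streak, k flips the sign and marks it as atypical.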

The extent e is the ratio of the object contour area to the bounding rectangle area:

e = A_obj / A_rect   (4)

where A_obj is the object contour area and A_rect is the bounding rectangle area.

The extent is interpreted as a coefficient indicating how much of the bounding rectangle is filled by the object contour. The parameter allows eliminating insignificant fluctuations of background objects or lighting spots, which are frequently present in the images as disproportionate and non-uniform objects.

There are some cases in which the shape of a moving object and the lighting spot of its lamps are presented as one object, e.g. for a cyclist. This can lead to a false-negative detection, because the shape of the object and, consequently, the area coefficient are not typical for a cyclist. To reduce the probability of non-detected objects, depending on the area coefficient an object may be split into two or three sub-objects with equal width of the respective bounding rectangle along a vector of a supposed movement direction. After the splitting procedure, the basic coefficients for these sub-objects are calculated. The splitting is preferably performed only for objects that satisfy a splitting condition:

(e < e_thr) ∧ (c_s_min ≤ c_rect ≤ c_s_max)   (5)

where e is the extent value of the object in accordance with formula (4) and e_thr is an extent threshold indicating that the bounding rectangle of the object is insufficiently filled by the object contour area. The value of the extent threshold does not depend on the image resolution. A suitable value, which has been found empirically, is e_thr = 0.5. c_s_min and c_s_max are boundary values of a geometrical coefficient interval which correspond to values of problematic objects. For an image resolution of 100x75 pixels, these values have been experimentally determined as c_s_min = −5000 and c_s_max = −2500.

An example of object splitting shall now be described with reference to Figs. 7 to 10.
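The splitting test and the equal-width split itself can be sketched as follows; the conjunction of the two conditions and the split along the horizontal axis are assumptions consistent with the description above:

```python
def needs_split(e, c_rect, e_thr=0.5, cs_min=-5000.0, cs_max=-2500.0):
    """Splitting condition: the bounding rectangle is insufficiently
    filled AND the area coefficient lies in the empirically determined
    interval of problematic objects."""
    return e < e_thr and cs_min <= c_rect <= cs_max

def split_rect(x, y, w, h, parts=2):
    """Split a bounding rectangle (x, y, w, h) into sub-rectangles of
    equal width along the assumed (horizontal) movement direction."""
    step = w // parts
    return [(x + i * step, y, step, h) for i in range(parts)]

# Poorly filled rectangle with a problematic area coefficient -> split it
subrects = split_rect(10, 5, 40, 20, parts=2)
```

After splitting, the basic coefficients are recalculated for each sub-rectangle.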

Fig. 7 shows a filtered background subtraction mask, which is the basis for object detection.

Fig. 8 shows an image with the originally detected bounding rectangles. As can be seen, four objects have been detected based on the background subtraction mask. Objects O1, O2 and O3 can be estimated as noise, i.e. as unwanted objects. Object O4 includes the cyclist and the lighting spot of the cyclist. This object will undergo a splitting operation.

Fig. 9 and Fig. 10 show a split background subtraction mask and a corresponding image with bounding rectangles after object splitting, respectively. As can be seen, three objects have been detected based on the background subtraction mask. Object O1 is the lighting spot of the cyclist. This object will be considered an unwanted object. Object O2 is noise and thus also an unwanted object. Object O3 is the cyclist.

The above approach is based on the assumption that the extent value and the area coefficient of a lighting spot differ from the extent values and area coefficients of desired objects. Object splitting thus allows separating a desired object from a lighting spot and recalculating the basic coefficients for each object for further estimation.

Sudden lighting changes produced by the lamps of moving vehicles can significantly influence the detection performance. A lighting spot of a vehicle can take the geometrical shape of an object, which leads to a false-positive detection. A brightness coefficient c_bright allows estimating the presence of lighting spots on objects that were produced by background subtraction.

The coefficient includes two variables:

- A_interbright, the area of intersections between the contour area of an object and the contour areas of bright regions. A_interbright is the sum of all contour areas of bright regions that are located within the contour area of the object.

- A_obj, the contour area of the detected object.

The brightness coefficient can be calculated as:

c_bright = A_interbright / A_obj   (6)

If the resulting brightness coefficient is larger than a threshold, the object is considered a vehicle's lamp or a light-caused object. Bright regions are found using the original images. A pixel is considered bright if its value exceeds an empirically determined threshold.
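A sketch of the brightness coefficient and its thresholding, assuming the bright-region areas have already been measured (in a full pipeline they would come from thresholding the original frame and intersecting the bright-region contours with the object contour):

```python
def brightness_coefficient(a_interbright, a_obj):
    """Brightness coefficient c_bright of formula (6): the fraction of
    the object's contour area covered by bright regions."""
    return a_interbright / a_obj if a_obj > 0 else 0.0

def is_light_object(a_interbright, a_obj, c_thr=0.2):
    """True if the object is likely a vehicle's lamp or a light-caused
    object, i.e. its brightness coefficient exceeds the empirical
    threshold c_thr."""
    return brightness_coefficient(a_interbright, a_obj) > c_thr

# Hypothetical areas: a lamp-dominated object vs. a pedestrian-like object
lamp_like = is_light_object(a_interbright=80.0, a_obj=100.0)
pedestrian_like = is_light_object(a_interbright=5.0, a_obj=100.0)
```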

The difference between human movement and vehicle movement can be expressed by a variable c_number indicating the number of objects registered as a result of background subtraction. Typically, for an image resolution of 100x75 pixels, the movement of one pedestrian or cyclist produces up to three objects in the background mask. In contrast, car movement causes up to 70 objects, mostly light reflections. On this basis, car movement can be detected in advance, typically 5-10 m before the car appears in the camera's field of vision, since the lighting spot enters the camera's field of vision earlier. The resulting coefficient can be represented by a Boolean value, which indicates movement of a vehicle near the control area.

The evaluation of the various parameters is performed using the following conditions:

- Geometrical coefficient status:

c_rect_status = True if r_min ≤ c_rect ≤ r_max, and False otherwise,   (7)

where c_rect is the geometrical coefficient calculated for an object under consideration and r_min and r_max are boundary values of an interval which is typical for an object to be detected. For a pedestrian moving at a distance of 2-13 meters from the camera and an image resolution of 100x75 pixels, suitable values have been empirically determined as r_min = 166 and r_max = 800. For a cyclist with the same image scene and resolution parameters, suitable values have been empirically determined as r_min = 850 and r_max = 1660.

- Extent status:

e_status = True if e > e_thr, and False otherwise,   (8)

where e is the extent value determined for an object under consideration and e_thr is a threshold separating wanted and unwanted objects. The value of the extent threshold does not depend on the image resolution. A suitable value for pedestrians or cyclists, which has been found empirically, is e_thr = 0.5.

- Brightness coefficient status:

c_bright_status = True if c_bright < c_thr, and False otherwise,   (9)

where c_bright is the value of the brightness coefficient calculated for an object under consideration and c_thr is a threshold separating acceptable coefficient values of bright regions on objects from vehicle lamp objects. The value of the threshold does not depend on the image resolution. A suitable value, which has been found empirically, is c_thr = 0.2.

- Object number status:

c_number_status = True if c_number < c_number_car, and False otherwise,   (10)

where c_number is the number of registered objects and c_number_car is a typical number of objects to be registered when car movement occurs. For an image resolution of 100x75 pixels, a suitable value has been empirically determined as c_number_car = 30.

In a final evaluation, the detection status of an object is True when all of its coefficients have the status True:

detection_status = True if s_cross_status ∧ c_rect_status ∧ e_status ∧ c_bright_status ∧ c_number_status, and False otherwise.   (11)

The value True is interpreted as the presence of a wanted object, i.e. an object of the target detection group, in the control area.
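The conditions (7) to (11) combine into a single Boolean decision. A compact sketch, using the empirical pedestrian interval r_min = 166 and r_max = 800 from the text as default values:

```python
def detection_status(s_cross, c_rect, e, c_bright, c_number,
                     r_min=166.0, r_max=800.0, e_thr=0.5,
                     c_thr=0.2, c_number_car=30):
    """Final evaluation (11): an object is a wanted object only if the
    margin, geometry, extent, brightness and object-number statuses
    (formulas (1) and (7)-(10)) are all True."""
    return (s_cross
            and r_min <= c_rect <= r_max          # (7) geometrical status
            and e > e_thr                         # (8) extent status
            and c_bright < c_thr                  # (9) brightness status
            and c_number < c_number_car)          # (10) object number status

# Hypothetical pedestrian-like object vs. a lamp-dominated object
wanted = detection_status(True, 400.0, 0.6, 0.05, 3)
lamp = detection_status(True, 400.0, 0.6, 0.8, 3)
```

A single failing status, such as the high brightness coefficient of the second object, is enough to reject the candidate.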

It is to be understood that the proposed method and apparatus may be implemented in various forms of hardware, software, firmware, special purpose processors, or a combination thereof. Special purpose processors may include application specific integrated circuits (ASICs), reduced instruction set computers (RISCs) and/or field programmable gate arrays (FPGAs). Preferably, the proposed method and apparatus are implemented as a combination of hardware and software. Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage device. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (CPU), a random access memory (RAM), and input/output (I/O) interface(s). The computer platform also includes an operating system and microinstruction code. The various processes and functions described herein may either be part of the microinstruction code or part of the application program (or a combination thereof), which is executed via the operating system. In addition, various other peripheral devices may be connected to the computer platform, such as an additional data storage device and a printing device.

It should be understood that the elements shown in the figures may be implemented in various forms of hardware, software or combinations thereof. Preferably, these elements are implemented in a combination of hardware and software on one or more appropriately programmed general-purpose devices, which may include a processor, memory and input/output interfaces. Herein, the phrase "coupled" is defined to mean directly connected to or indirectly connected with through one or more intermediate components. Such intermediate components may include both hardware and software based components.

It is to be further understood that, because some of the constituent system components and method steps depicted in the accompanying figures are preferably implemented in software, the actual connections between the system components (or the process steps) may differ depending upon the manner in which the proposed method and apparatus is programmed. Given the teachings herein, one of ordinary skill in the related art will be able to contemplate these and similar implementations or configurations of the proposed method and apparatus.

References

[1] T. Trnovszky et al.: "Comparison of Background Subtraction Methods on Near Infra-Red Spectrum Video Sequences", presented at TRANSCOM 2017, 2017.