

Title:
SYSTEMS AND METHODS FOR IDENTIFYING AND/OR TRACKING OBJECTS
Document Type and Number:
WIPO Patent Application WO/2021/056050
Kind Code:
A1
Abstract:
Systems and methods for use in an object tracking system for tracking movement of objects are disclosed. The system is configured to: receive object metadata (including object identifiers) corresponding to objects detected by two or more sensors over a period of time; identify at least one pair of identifiers that correspond to the same physical object, the identification of the at least one pair of identifiers being based on at least the objects being identified at similar times and/or at similar locations; and generate an output indicative of the identified at least one pair of identifiers that corresponds to the same physical object, for identifying movement of the physical object represented by said pair of identifiers in the received metadata.

Inventors:
PERERA HIDELLAGE KUSHAR UDDIKA (AU)
WHYATT MICHAEL DAVID (AU)
Application Number:
PCT/AU2020/050983
Publication Date:
April 01, 2021
Filing Date:
September 17, 2020
Assignee:
NETGENIQ PTY LTD (AU)
International Classes:
G06T7/292; G06K9/00; G08B13/196
Foreign References:
US20140347475A12014-11-27
US20140050455A12014-02-20
US20140184803A12014-07-03
US20140037150A12014-02-06
US5973732A1999-10-26
Attorney, Agent or Firm:
FPA PATENT ATTORNEYS PTY LTD (AU)
Claims:
CLAIMS

1. An object identification system for use in an object tracking system for tracking movement of objects in a physical location over a period of time, the physical location being monitored by two or more sensors configured to detect objects in the physical location, the object identification system comprising a processor and a memory, the memory storing instructions, which when executed by the processor cause the object identification system to: receive object metadata corresponding to objects detected by the two or more sensors over the period of time, wherein the object metadata comprises identifiers of a plurality of identified objects, each identifier associated with data indicating one or more times at which the corresponding object was identified and one or more locations of the identified object at the corresponding one or more times; identify at least one pair of identifiers from the identifiers that corresponds to the same physical object in the physical location, the identification of the at least one pair of identifiers based on at least one of the one or more times at which the corresponding objects were identified and the one or more locations of the identified objects at the corresponding one or more times; and generate an output indicative of the identified at least one pair of identifiers that corresponds to the same physical object, for identifying movement of the physical object represented by said pair of identifiers in the received metadata.

2. The object identification system of claim 1, wherein the two or more sensors are two or more cameras configured to capture images of the physical location and the object metadata include image metadata.

3. The object identification system of claim 2, wherein identifying the at least one pair of identifiers comprises: for each camera of the two or more cameras: splicing the received image metadata into multiple predetermined intervals of time; for each interval of time: retrieving a list of object identifiers and first and last known locations of the objects associated with the list of object identifiers; comparing pairs of object identifiers from the list of identifiers to determine whether the first or last known locations of the pair of object identifiers are within a threshold distance of one another and detected within a threshold period of time; and determining that a pair of object identifiers corresponds to the same object if the first or last known locations of the pair of object identifiers are within the threshold distance and are detected within a threshold period of time.

4. The object identification system of claim 2, wherein identifying the at least one pair of object identifiers comprises: splicing the received image metadata into multiple intervals of time, such that each interval includes image metadata records corresponding to images captured within that interval; for each interval of time: comparing pairs of object identifiers from the list of identifiers to determine whether the location of the pair of object identifiers is within a threshold distance of one another and detected within a threshold period of time of each other; and determining that a pair of object identifiers corresponds to the same object if the location of the pair of object identifiers is within the threshold distance and is detected within a threshold period of time of each other; and recording the determined pair of object identifiers that correspond to the same object.

5. The object identification system of claim 4, wherein identifying the at least one pair of object identifiers comprises: splicing the received image metadata into multiple intervals of time, such that each interval includes image metadata records corresponding to images captured within that interval; for each interval of time: comparing pairs of object identifiers from the list of identifiers to determine whether the location of the pair of object identifiers is outside a threshold distance of one another and the pair of object identifiers are detected within a threshold period of time of each other; and determining that a pair of object identifiers does not correspond to the same physical object if the relative location of the pair of object identifiers is greater than the threshold distance and the pair of object identifiers is detected within a threshold period of time of each other; recording the determined pairs of object identifiers that do not correspond to the same physical object.

6. The object identification system of claim 5, wherein generating an output indicative of the identified at least one pair of identifiers that corresponds to the same object comprises comparing the determined pairs of object identifiers that correspond to the same object with the determined pairs of object identifiers that do not correspond to the same object and removing any object identifiers from the determined pairs of object identifiers that correspond to the same object if the same object identifier is present in the determined pairs of object identifiers that do not correspond to the same object.

7. The object identification system of claim 1, wherein the object tracking system is scheduled to repeat the method of claim 1 at a predetermined frequency.

8. The object identification system of claim 1, wherein the objects in the physical location are customers in a retail location.

9. The object identification system of claim 1, wherein the location is a Cartesian location of the object with respect to the physical location.

10. The object identification system of any one of claims 2 to 9, wherein the received image metadata includes metadata from a first camera and metadata from a second camera and wherein the identified at least one pair of identifiers from the identifiers that corresponds to the same object includes a pair consisting of an identifier from the first camera and an identifier from the second camera.

11. An object identification system comprising a processor configured to: receive first object metadata comprising a first set of object identifiers and associated time and position information for tracking movement of each of the identified objects, wherein the first set of object identifiers comprises a plurality of distinct object identifiers; determine, based on the time and position information, that two or more of the plurality of object identifiers satisfy time and position criteria indicative that they relate to a common physical object; generate second object metadata comprising a second set of object identifiers, the second set of object identifiers indicating that the two or more object identifiers relate to a common physical object.

12. The object identification system of claim 11, wherein the first object metadata is metadata from a plurality of video cameras, including a first video camera and a second video camera, wherein the position information of the first video camera and the second video camera has a common frame of reference.

13. A method for object tracking at a location, the method comprising: establishing a camera with a view at the location, configured to generate first image metadata comprising a first set of object identifiers and associated time and position information for tracking movement of each of the identified objects; receiving, at a computational processing system, the first image metadata; determining, by the computational processing system, based on the time and position information that two or more of the plurality of object identifiers satisfy time and position criteria indicative that they relate to a common physical object; generating second image metadata comprising a third set of object identifiers, the third set of object identifiers indicating that the two or more object identifiers relate to a common physical object.

14. A method of object tracking in a location, the method comprising: establishing a plurality of cameras, each with a view at the location, a first camera configured to generate first image metadata comprising a first set of object identifiers and associated time and position information for tracking movement of each of the identified objects and a second camera configured to generate second image metadata comprising a second set of object identifiers and associated time and position information for tracking movement of each of the identified objects; receiving, at a computational processing system, the first image metadata and the second image metadata; determining, by the computational processing system, based on the time and position information, that two or more of the plurality of object identifiers satisfy time and position criteria indicative that they relate to a common physical object, the two or more object identifiers comprising at least one object identifier of the first image metadata and at least one object identifier of the second image metadata; generating third image metadata comprising a third set of object identifiers, the third set of object identifiers indicating that the two or more object identifiers relate to a common physical object.

15. A method for object identification in a physical location over a period of time, the physical location being monitored by two or more sensors configured to detect objects in the physical location, the method comprising: receiving object metadata corresponding to objects detected by the two or more sensors over the period of time, wherein the object metadata comprises identifiers of a plurality of identified objects, each identifier associated with data indicating one or more times at which the corresponding object was identified and one or more locations of the identified object at the corresponding one or more times; identifying at least one pair of identifiers from the identifiers that corresponds to the same physical object in the physical location, the identification of the at least one pair of identifiers based on at least one of the one or more times at which the corresponding objects were identified and the one or more locations of the identified objects at the corresponding one or more times; and generating an output indicative of the identified at least one pair of identifiers that corresponds to the same physical object, for identifying movement of the physical object represented by said pair of identifiers in the received metadata.

16. The method of claim 15, wherein the two or more sensors are two or more cameras configured to capture images of the physical location and the object metadata include image metadata.

17. The method of claim 16, wherein identifying the at least one pair of identifiers comprises: for each camera of the two or more cameras: splicing the received image metadata into multiple predetermined intervals of time; for each interval of time: retrieving a list of object identifiers and first and last known locations of the objects associated with the list of object identifiers; comparing pairs of object identifiers from the list of identifiers to determine whether the first or last known locations of the pair of object identifiers are within a threshold distance of one another and detected within a threshold period of time; and determining that a pair of object identifiers corresponds to the same object if the first or last known locations of the pair of object identifiers are within the threshold distance and are detected within a threshold period of time.

18. The method of claim 16, wherein identifying the at least one pair of object identifiers comprises: splicing the received image metadata into multiple intervals of time, such that each interval includes image metadata records corresponding to images captured within that interval; for each interval of time: comparing pairs of object identifiers from the list of identifiers to determine whether the location of the pair of object identifiers is within a threshold distance of one another and detected within a threshold period of time of each other; and determining that a pair of object identifiers corresponds to the same object if the location of the pair of object identifiers is within the threshold distance and is detected within a threshold period of time of each other; and recording the determined pair of object identifiers that correspond to the same object.

19. The method of claim 16, wherein identifying the at least one pair of object identifiers comprises: splicing the received image metadata into multiple intervals of time, such that each interval includes image metadata records corresponding to images captured within that interval; for each interval of time: comparing pairs of object identifiers from the list of identifiers to determine whether the location of the pair of object identifiers is outside a threshold distance of one another and the pair of object identifiers are detected within a threshold period of time of each other; and determining that a pair of object identifiers does not correspond to the same physical object if the relative location of the pair of object identifiers is greater than the threshold distance and the pair of objects are detected within a threshold period of time of each other; recording the determined pairs of object identifiers that do not correspond to the same physical object.

20. The method of claim 19, wherein generating an output indicative of the identified at least one pair of identifiers that corresponds to the same object comprises comparing the determined pairs of object identifiers that correspond to the same object with the determined pairs of object identifiers that do not correspond to the same object and removing any object identifiers from the determined pairs of object identifiers that correspond to the same object if the same object identifier is present in the determined pairs of object identifiers that do not correspond to the same object.

21. The method of any one of claims 16 to 20, wherein the received image metadata includes metadata from a first camera and metadata from a second camera and wherein the identified at least one pair of identifiers from the identifiers that corresponds to the same object includes a pair consisting of an identifier from the first camera and an identifier from the second camera.

22. A method comprising: receiving first object metadata comprising a first set of object identifiers and associated time and position information for tracking movement of each of the identified objects, wherein the first set of object identifiers comprises a plurality of distinct object identifiers; determining, based on the time and position information, that two or more of the plurality of object identifiers satisfy time and position criteria indicative that they relate to a common physical object; generating second object metadata comprising a second set of object identifiers, the second set of object identifiers indicating that the two or more object identifiers relate to a common physical object.

23. The method of claim 22, wherein the first object metadata is metadata from a plurality of video cameras, including a first video camera and a second video camera, wherein the position information of the first video camera and the second video camera has a common frame of reference.

Description:
SYSTEMS AND METHODS FOR IDENTIFYING AND/OR TRACKING OBJECTS

Technical Field

[0001] Aspects of the present disclosure relate to systems and methods for identifying and/or tracking objects.

Background

[0002] Object tracking is an image processing technique for tracking movement of an object (e.g., an animate object such as a person or an animal, or an inanimate object such as a vehicle) in an area. There are multiple applications of object tracking including, for example, surveillance of people, monitoring an area for suspicious behaviour, monitoring traffic, tracking the movement of objects in manufacturing plants, monitoring customer/employee behaviour in physical locations such as retail stores, etc.

[0003] There exists a need for improved object tracking systems or methods.

Summary

[0004] According to a first aspect, there is provided an object identification system for use in an object tracking system for tracking movement of objects in a physical location over a period of time, the physical location being monitored by two or more sensors configured to identify objects in the physical location, the object identification system comprising a processor and a memory, the memory storing instructions, which when executed by the processor cause the object identification system to: receive object metadata corresponding to objects identified by the two or more sensors over the period of time, wherein the object metadata comprises identifiers of the objects identified in the physical location, each identifier associated with data indicating one or more times at which the corresponding object was identified and one or more locations of the identified object at the corresponding one or more times; identify at least one pair of object identifiers from the object identifiers that corresponds to the same physical object in the physical location, the identification of the at least one pair of identifiers based on at least one of the one or more times at which the objects were identified and the one or more locations of the identified objects at the corresponding one or more times; and generate an output indicative of the identified at least one pair of object identifiers that corresponds to the same physical object, for identifying movement of the physical object represented by said pair of object identifiers in the received object metadata.

[0005] According to a second aspect, there is provided an object identification system comprising a processor configured to: receive first object metadata comprising a first set of object identifiers and associated time and position information for tracking movement of each of the identified objects, wherein the first set of object identifiers comprises a plurality of distinct object identifiers; determine, based on the time and position information, that two or more of the plurality of object identifiers satisfy time and position criteria indicative that they relate to a common physical object; generate second object metadata comprising a second set of object identifiers, the second set of object identifiers indicating that the two or more object identifiers relate to a common physical object.

[0006] According to a third aspect, there is provided a method for object tracking at a location, the method comprising: establishing a camera with a view at the location, configured to generate first image metadata comprising a first set of object identifiers and associated time and position information for tracking movement of each of the identified objects; receiving, at an object tracking system, the first image metadata; determining, by the object tracking system, based on the time and position information, that two or more of the plurality of object identifiers satisfy time and position criteria indicative that they relate to a common physical object; generating second image metadata comprising a third set of object identifiers, the third set of object identifiers indicating that the two or more object identifiers relate to a common physical object.

[0007] According to a fourth aspect, there is provided a method of object tracking in a location, the method comprising: establishing a plurality of cameras, each with a view at the location, a first camera configured to generate first image metadata comprising a first set of object identifiers and associated time and position information for tracking movement of each of the identified objects and a second camera configured to generate second image metadata comprising a second set of object identifiers and associated time and position information for tracking movement of each of the identified objects; receiving, at a computational processing system, the first image metadata and the second image metadata; determining, by the computational processing system, based on the time and position information, that two or more of the plurality of object identifiers satisfy time and position criteria indicative that they relate to a common physical object, the two or more object identifiers comprising at least one object identifier of the first image metadata and at least one object identifier of the second image metadata; generating third image metadata comprising a third set of object identifiers, the third set of object identifiers indicating that the two or more object identifiers relate to a common physical object.

[0008] According to a fifth aspect, there is provided a method for object identification in a physical location over a period of time, the physical location being monitored by two or more sensors configured to detect objects in the physical location, the method comprising: receiving object metadata corresponding to objects detected by the two or more sensors over the period of time, wherein the object metadata comprises identifiers of a plurality of identified objects, each identifier associated with data indicating one or more times at which the corresponding object was identified and one or more locations of the identified object at the corresponding one or more times; identifying at least one pair of identifiers from the identifiers that corresponds to the same physical object in the physical location, the identification of the at least one pair of identifiers based on at least one of the one or more times at which the corresponding objects were identified and the one or more locations of the identified objects at the corresponding one or more times; and generating an output indicative of the identified at least one pair of identifiers that corresponds to the same physical object, for identifying movement of the physical object represented by said pair of identifiers in the received metadata.

[0009] According to a sixth aspect, there is provided a method comprising: receiving first object metadata comprising a first set of object identifiers and associated time and position information for tracking movement of each of the identified objects, wherein the first set of object identifiers comprises a plurality of distinct object identifiers; determining, based on the time and position information, that two or more of the plurality of object identifiers satisfy time and position criteria indicative that they relate to a common physical object; generating second object metadata comprising a second set of object identifiers, the second set of object identifiers indicating that the two or more object identifiers relate to a common physical object.

[0010] Further aspects of the present invention and further embodiments of the aspects described in the preceding paragraphs will become apparent from the following description, given by way of example and with reference to the accompanying drawings.

Brief description of the drawings

[0011] In the drawings:

[0012] Fig. 1 illustrates an example environment in which an object tracking system according to the present disclosure may be implemented.

[0013] Fig. 2 is a block diagram of an object tracking system according to aspects of the present disclosure.

[0014] Fig. 3 is a flowchart illustrating an example method for identifying objects misidentified by a particular camera according to aspects of the present disclosure.

[0015] Fig. 4 is a flowchart illustrating an example method for identifying the same object identified by multiple cameras according to aspects of the present disclosure.

[0016] Fig. 5 is a flowchart illustrating an example method for identifying mismatched objects according to some embodiments of the present disclosure.

[0017] Fig. 6 is a flowchart illustrating an example method for identifying and tracking objects according to some embodiments of the present disclosure.

[0018] While the invention is amenable to various modifications and alternative forms, specific embodiments are shown by way of example in the drawings and are described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.

Detailed Description

[0019] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the claimed invention. It will be apparent, however, that the claimed invention may be practiced without these specific details. In some instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the description.

General overview and Environment

[0020] Object tracking has typically been performed using image data (e.g., images captured by one or more cameras in an area) and analysing the image data to identify objects and then track these objects across multiple frames. For example, typical object tracking systems receive image data from one or more cameras, recognize objects in the images (e.g., by identifying objects, drawing bounding boxes around the identified objects, and classifying the objects). Thereafter, typical systems determine the centroids of the bounding boxes and track these centroids to track objects.

[0021] These systems typically rely on trained neural networks that can recognize object traits, e.g., the size or colour of an object, and where the objects are people, they may distinguish between objects based on human traits such as facial features, height, ethnicity, clothes, etc. However, these typical systems may face a number of issues. For example, because these systems rely on processing image data, they generally require high-bandwidth communication systems to communicate image data from one device to another (e.g., from a camera to a backend processing system) and generally require very high processing or computing power. Further still, if the shape of the object suddenly changes (e.g., if a standing person suddenly crouches), the system may misidentify the object as two separate objects. Similarly, if the trait that is being recognized is facial features, the system may fail to recognize a person if the person turns away from the camera. Further still, when people are recognized and tracked based on human traits, there is a privacy concern as personal data about people is recorded and used for tracking. There could also be privacy implications in communicating image data containing people's faces from one system to another, and an increased risk of the image data being breached.

[0022] Aspects of the present disclosure are directed to systems and methods for tracking objects in an area. The area may be an enclosed physical environment, e.g., a room, an office, a shopping centre, a retail store, a building, an airport, a train station, a hospital, a parking lot, etc., or an open physical environment, e.g., a section of a road, a portion of the sky, or an area in the sea. In particular, the disclosed systems and methods continually or persistently track objects based on object metadata, specifically the locations of the identified objects, rather than on image data captured by a camera.

[0023] Fig. 1 illustrates an example environment 100 in which the various operations and techniques described herein can be performed. In particular, Fig. 1 illustrates a plan view of an example environment 100 in which an object tracking system according to some aspects of the present disclosure may be implemented. The environment 100 includes an enclosed physical location 102. The enclosed physical location 102 may be, for example, a physical store such as a supermarket, a hardware store, a warehouse, a cafeteria, or any other type of physical location such as an airport, a bank branch, an office, a residential location, etc. In this example, the physical location is a bank branch that offers standard teller and personal banking services. To this end, the physical location includes a teller area 104, an information helpdesk area 106, a waiting area 108, and one or more cabins for personal banking services.

[0024] Further, at any given time, one or more people (such as employees and/or customers) may be present in the physical location 102. These people are commonly referred to as objects 112 in the remainder of this disclosure.

[0025] One or more sensors 114 may be installed in the physical location to detect objects within the physical location. Any appropriate sensors may be utilized including, but not limited to, thermal sensors, infrared sensors, etc. In one embodiment, the sensors may be still or video cameras. The remainder of the disclosure is described with respect to cameras being the sensors. However, it will be appreciated that this is merely an example and that any other type of sensor that can identify objects may be utilized without departing from the scope of the present disclosure.

[0026] Returning to Fig. 1, one or more cameras 114 may be installed in the physical location 102 to monitor a desired area within the physical location 102. It will be appreciated that the desired area being monitored by the cameras 114 may vary depending on the type of physical location, the type of objects being monitored, and/or the required application. For example, if the physical location 102 is a bank branch, the type of object being monitored may be humans, the aim of object tracking may be to identify areas within the bank branch where people typically aggregate, the path taken by customers within the branch, customer dwell time, etc., and the desired area may be the entire customer and/or employee areas of the bank branch 102. Areas of the physical location where people may not be present (e.g., dead corners, display areas) and/or private areas (such as powder rooms, toilets, kitchens, etc.) may be considered undesired areas.

[0027] The cameras 114 are connected to an object tracking system 118 via a communication network 116. The object tracking system 118 is described further below.

[0028] The cameras 114 may be stationary wide field-of-view cameras, Pan Tilt Zoom (PTZ) cameras, and/or thermal cameras. Further still, the cameras may function independently of each other (such that each camera independently monitors a particular fixed region of the desired area) or they may communicate with one another (e.g., via a central controller system) such that at any given time the cameras can adjust their fields of vision so that the entire desired area is monitored. Further, the cameras 114 may be configured to automatically switch off if no movement is detected (e.g., by incorporating one or more movement sensors).

[0029] In certain embodiments, one or more of the cameras 114 include an in-built image processing system that can analyse image data to identify one or more objects in the images. The cameras 114 can also determine additional information about the identified objects, such as the exact time at which an object was detected, the object's size, the object's height, whether the object is a thermal object or not, and the location of the object. The location information may be determined in global positioning coordinates or in Cartesian coordinates. For example, once a camera 114 is calibrated and its GPS coordinates configured, it may identify the latitude and longitude coordinates of an object with respect to the earth, or it may identify the position of an object in the physical location 102 with X, Y and Z coordinates, where X and Y describe the ground plane and Z the height of the identified object with respect to the ground. In both coordinate systems, the direction in which the camera looks is described by the azimuth angle. In one example, in the Cartesian system, the azimuth angle is defined as zero toward the east or along the X-axis, and a positive azimuth angle means that the camera is turned counter-clockwise when viewed from above, resulting in north at 90° or along the Y-axis, west at 180° and south at 270° or -90°.
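As a minimal illustration of the Cartesian azimuth convention described above (zero along the X-axis toward the east, increasing counter-clockwise), the following sketch converts an azimuth angle into a unit direction vector in the ground plane; the function name and structure are illustrative only and are not taken from the specification.

```python
import math

def azimuth_to_direction(azimuth_deg: float) -> tuple[float, float]:
    """Unit direction vector in the ground (X, Y) plane for a given azimuth.

    Follows the convention described above: 0 deg points east (along +X),
    angles increase counter-clockwise, so 90 deg points north (along +Y).
    """
    rad = math.radians(azimuth_deg)
    return (math.cos(rad), math.sin(rad))

# Example: a camera looking west (azimuth 180 deg) faces along -X.
print(azimuth_to_direction(180))  # approximately (-1.0, 0.0)
```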

[0030] When the cameras 114 are first installed in the physical location 102, they are calibrated to include the height at which the camera is positioned, the azimuth angle of the camera, and its tilt angle, that is, the angle between the horizontal plane and the camera. A tilt angle of 0° may mean that the camera is mounted parallel to the ground; a tilt angle of 90° may mean that the camera is mounted top-down in a bird's-eye view perspective. Further, the multiple cameras 114 installed in the physical location 102 may be calibrated to predefined X and Y-axis locations in the physical location 102 with one or more reference points common between two or more of the multiple cameras. Once the cameras are calibrated, they can assign a position value to identified objects depending on where the identified object is located in the Cartesian plane. If two cameras have overlapping fields of vision and both identify an object in the overlap region, then, provided they are calibrated properly, both cameras should assign the same physical-location Cartesian coordinates to that object.

[0031] Generally speaking, during operation, the cameras 114 monitor the particular regions in their field of vision. Then, for each image captured by a camera, the camera may identify one or more objects. The identified objects are assigned unique object identifiers. By "unique" it is meant that the identifiers are sufficiently differentiated to enable the processes of the present disclosure to be useful. If the camera determines that the same object is identified in multiple images, the camera assigns the same object identifier to that object in all the image frames that include the identified object. For each object identified in each image frame, the cameras may also be configured to attach a timestamp for when the image was captured and a location of the object (e.g., global coordinates and/or Cartesian coordinates within the physical location 102).

[0032] Although each camera may identify and track movement of objects within its frame of vision, there are some limitations. For instance, when an object is first identified in the vision of a first camera, the first camera may assign the object a unique identifier and may associate this identifier with the object for as long as the object remains in that camera's vision. However, when the object moves out of the vision of the first camera and into the vision of a second camera, the second camera recognizes the object as a new object and assigns a new (different) object identifier to that object for the duration the object remains within the vision of the second camera. Accordingly, as an object moves between the ranges of multiple cameras, the object is associated with multiple different identifiers and it is not possible to seamlessly and continuously track the movement of the object within the physical location.

[0033] Another issue that may arise is when an object remains stationary for an extended period of time. The cameras 114 are typically programmed to detect certain types of objects - e.g., in the physical location 102 the cameras may be programmed to detect people. Accordingly, one criterion for differentiating between background objects such as chairs, desks, etc., and people may be movement. If any movement is detected by the camera between two image frames, the camera identifies the moving object (e.g., a person) to be the type of object it should monitor and assigns this object an object identifier. However, once the person has been identified, if the person stops moving, the camera may determine that the person is a background object and stop monitoring that person. Thereafter, when the person moves again, the camera may detect the movement and identify the person again, but this time it may assign the person a new object identifier and may classify the person as a new object. Accordingly, the cameras may misidentify the same person as two people.

[0034] Another issue may arise where a camera may misclassify a background object as an object it should monitor. For example, if a person moves a piece of furniture or a garbage bin from one location to another, the camera may incorrectly determine that the piece of furniture or garbage bin is a person and may monitor the object until it stops moving (and assign this object an object identifier).

[0035] To seamlessly identify and track objects between multiple cameras and correct the other issues with camera analytics, the object tracking system 118 is employed.

Object tracking system

[0036] The object tracking system 118 described herein is implemented by one or more special-purpose computing systems or devices. A special-purpose computing system may be hard-wired to perform the relevant operations. Alternatively, a special-purpose computing system may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the relevant operations. Further, alternatively, a special-purpose computing system may include one or more general-purpose hardware processors programmed to perform the relevant operations pursuant to program instructions stored in firmware, memory, other storage, or a combination.

[0037] A special-purpose computing system may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the relevant operations described herein. A special-purpose computing system may be a desktop computer system, a portable computer system, a handheld device, a networking device or any other device that incorporates hard-wired and/or program logic to implement relevant operations.

[0038] By way of example, Fig. 2 provides a block diagram that illustrates one example of the object tracking system 118. Object tracking system 118 includes a bus 202 or other communication mechanism for communicating information, and a hardware processor 204 coupled with bus 202 for processing information. Hardware processor 204 may be, for example, a general-purpose microprocessor, or other processing unit.

[0039] Object tracking system 118 also includes a main memory 206, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 202 for storing information and instructions to be executed by processor 204. Main memory 206 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 204. Such instructions, when stored in non-transitory storage media accessible to processor 204, render the object tracking system 118 into a special-purpose machine that is customized to perform the operations specified in the instructions.

[0040] Object tracking system 118 further includes a read only memory (ROM) 208 or other static storage device coupled to bus 202 for storing static information and instructions for processor 204. A storage device 210, such as a magnetic disk or optical disk, is provided and coupled to bus 202 for storing information and instructions.

[0041] According to one embodiment, the techniques herein are performed by the object tracking system 118 in response to processor 204 executing one or more sequences of one or more instructions contained in main memory 206. Such instructions may be read into main memory 206 from another storage medium, such as a remote database. Execution of the sequences of instructions contained in main memory 206 causes processor 204 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.

[0042] The term “storage media” as used herein refers to any non-transitory media that stores data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 210. Volatile media includes dynamic memory, such as main memory 206. Common forms of storage media include, for example, a floppy disk, a flexible disk, a hard disk, a solid state drive, magnetic tape or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, or any other memory chip or cartridge.

[0043] Object tracking system 118 also includes a communication interface 218 coupled to bus 202. Communication interface 218 provides a two-way data communication coupling to a communication network, for example communication network 116 of environment 100. For example, communication interface 218 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, etc. As another example, communication interface 218 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 218 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.

[0044] The object tracking system 118 can receive image metadata from the cameras 114 and may send processed object tracking data through the network(s) 116, network link 220 and communication interface 218.

[0045] At a high level, the object tracking system 118 retrieves object metadata for a particular period of time (e.g., 1 hour, 4 hours, 8 hours, 12 hours, or 24 hours) from multiple cameras with fields of view in the physical location 102 (via the communication interface 218), analyses this metadata, and performs a number of intelligent functions on this image metadata to identify distinct objects in the physical location over that period of time and track the movements of these identified distinct objects while they are in the physical location 102.

[0046] In some embodiments, the object tracking system 118 may receive image metadata from the cameras 114 in real time or near real time. For example, the cameras 114 may stream image metadata once the cameras have performed their analytics on the image data. Alternatively, the cameras 114 may store the image metadata in in-camera memory and the object tracking system 118 may be configured to retrieve the image metadata from the cameras 114 on a scheduled basis (e.g., every few seconds, minutes, or hours). In either case, the object tracking system 118 is configured to store the received image metadata in its own storage device 210.

[0047] As described previously, the cameras may record a number of different metadata items for identified objects. For example, the cameras may record the width, height, and size of an object. The cameras may also record whether the identified object is thermal or not, what class the identified object belongs to, the speed and direction with which the object is moving, the shape of the object, the geolocation of the object and the Cartesian location of the object. The object tracking system 118 may not be interested in all these metadata fields for a given object. Instead, the object tracking system 118 may only be interested in the object identifier, the time at which the object was detected, and the geolocation and/or the Cartesian location of the object. Accordingly, in one embodiment, the object tracking system 118 may filter the image metadata received from the cameras to remove any unwanted fields before storing the image metadata in the storage device 210. Alternatively, the entire image metadata may be stored in the memory without any pre-filtering.
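A minimal sketch of this filtering step is shown below; the field names are assumptions for illustration and do not reflect the specification's actual metadata schema.

```python
# Fields the object tracking system is interested in, per the paragraph above;
# the exact key names are hypothetical.
KEPT_FIELDS = {"camera_id", "object_id", "utc", "geolocation", "cartesian_location"}

def filter_metadata(records: list[dict]) -> list[dict]:
    """Drop unwanted fields (width, height, speed, class, ...) from each
    image-metadata record before it is stored."""
    return [{key: value for key, value in record.items() if key in KEPT_FIELDS}
            for record in records]
```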

[0048] Table A illustrates an example batch of image metadata retrieved from a camera in physical location 102.

Table A - example batch of image metadata from a camera

[0049] For each record, the example image metadata batch depicted in table A includes:

• UTC, which is the Coordinated Universal Time at which the image containing the object associated with that record was captured.

• Object ID, which is a unique identifier allocated by the camera to the identified object.

• Cartesian location, which indicates the X, Y and Z axis coordinates of the identified object at that time.
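The body of Table A is not reproduced here. Purely by way of illustration, and with hypothetical values, a batch of records in the shape described by these three fields might look as follows (two records for object ID 500, consistent with the example discussed below):

```python
# Hypothetical batch of image-metadata records in the shape described for Table A.
example_batch = [
    {"utc": "2020-09-17T09:15:02Z", "object_id": 500, "cartesian": (3.2, 7.1, 1.7)},
    {"utc": "2020-09-17T09:15:04Z", "object_id": 500, "cartesian": (3.6, 6.8, 1.7)},
    {"utc": "2020-09-17T09:15:04Z", "object_id": 501, "cartesian": (9.4, 2.0, 1.6)},
]
```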

[0050] It will be appreciated that the object tracking system 118 stores similar image metadata for all the cameras in the physical location 102. Further, as new batches of image metadata are received from the cameras 114, the object tracking system 118 appends the new image metadata to the metadata already stored in storage device 210. A camera identifier is associated with the metadata received from each camera to distinguish the metadata received from different cameras.

[0051] At a high level, the functions performed by the object tracking system 118 include identifying the same object across multiple cameras, identifying the same objects in data from one camera, identifying mismatched objects, merging object metadata for objects identified as being the same object across multiple cameras and tracking movement based on the retrieved and stored image metadata.

[0052] Each of these functions will be described in detail in the following sections.

Example processes

[0053] Fig. 3 illustrates an example method 300 performed by the object tracking system 118 to identify the same object from a single camera's metadata. As described previously, one issue with camera-identified objects is that the same object may be assigned two different object identifiers if the object remains stationary for a period of time and then begins moving again. Method 300 identifies such occurrences and corrects for them. The method 300 is performed for image metadata received from each individual camera separately - i.e., the process 300 is performed iteratively for all the cameras in the physical location 102.

[0054] The method begins at step 302 where the object tracking system 118 extracts image metadata for an unprocessed camera 114. When the method first commences, all the cameras may be unprocessed, so the object tracking system 118 may extract image metadata from any one of the cameras at random or based on a predetermined order. In certain embodiments, the object tracking system 118 may extract image metadata corresponding to a particular period of time (e.g., a 10 minute interval).

[0055] At step 304, the object tracking system 118 retrieves all objects within this interval along with their first and last known locations. It will be appreciated that if an object is identified within the range of a particular camera, the object may be assigned an object identifier and thereafter, as long as the object remains within the visual range of that camera and continues to be classified as an object of interest, the camera continues to monitor the object along with the object's location. Accordingly, the image metadata from the camera may include multiple records for a particular object identifier with corresponding location information. For example, in Table A depicted above, the image metadata includes two records for the object with the object ID ‘500’. At method step 304, the object tracking system 118 determines the first or earliest location identified for each object and the last or latest location identified for that object from the image metadata. Which recorded locations of an object correspond to the first and last locations may be determined based on the UTC field in the image metadata. At the end of method step 304, the object tracking system 118 has a list of unique object identifiers and their corresponding first and last known locations and times.
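A minimal sketch of step 304 follows, assuming records shaped like the hypothetical batch shown earlier; because ISO-8601 UTC strings in a common format compare correctly in lexicographic order, taking the minimum and maximum of the UTC field suffices.

```python
from collections import defaultdict

def first_and_last_sightings(records):
    """Step 304 (sketch): return {object_id: (first_record, last_record)} for
    one camera's metadata within the current interval, ordered by the UTC field."""
    by_object = defaultdict(list)
    for rec in records:
        by_object[rec["object_id"]].append(rec)
    return {obj_id: (min(recs, key=lambda r: r["utc"]),
                     max(recs, key=lambda r: r["utc"]))
            for obj_id, recs in by_object.items()}
```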

[0056] Next, at step 306, the object tracking system gets an unprocessed pair of object identifiers A and B, where A represents a first object identifier and B represents a second object identifier. It will be appreciated that in the first iteration, none of the pairs of object identifiers in the list would have been processed, so the object tracking system 118 may select any pair of object identifiers. Thereafter, if a particular pair has already been examined, the pair may be ignored and another unprocessed pair of object identifiers may be selected.

[0057] At step 308 a determination is made whether object identifiers A and B correspond to the same object. As described previously, if an object stops moving completely for a period of time, the camera may drop the object identification as the object may blend with the environment. If the object starts moving again, the camera identifies this as movement of a new object (and assigns a new object identifier to the object). In the application of people tracking, a person cannot simply disappear or appear out of thin air in a physical location. Accordingly, if there is a situation where an object was last seen in a location that is the same as the location at which a new object is first seen, within an acceptable time period, the object tracking system 118 can infer that it is the same object.

[0058] Accordingly, at this step, the object tracking system 118 compares the first location of object A with the last location of object B. If these locations (i.e., the first location of A and the last location of B) are within a threshold distance of each other (e.g., within 1 meter) and object A was first seen within a threshold period of time after B was last seen, the object tracking system 118 infers that objects A and B are the same object.

[0059] Alternatively, at this step, the object tracking system 118 compares the last location of object A with the first location of object B. If these locations (i.e., the last location of A and the first location of B) are within a threshold distance of each other and object A was last seen within a threshold period of time before object B was first seen, the object tracking system 118 may infer that objects A and B are the same object.
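The comparison of steps 306 to 308 might be sketched as below. The 1 metre distance threshold is the example given above, while the time threshold, the helper name and the (time, x, y) tuple shape are assumptions for illustration only.

```python
import math

def same_object(a_first, a_last, b_first, b_last,
                dist_threshold_m=1.0, time_threshold_s=5.0):
    """Step 308 (sketch): object identifiers A and B are treated as the same
    physical object if A's first sighting is near (in space and time) B's last
    sighting, or B's first sighting is near A's last sighting.
    Each sighting is a (utc_seconds, x, y) tuple; the 5 s value is hypothetical."""
    def close(first_seen, last_seen):
        near = math.hypot(first_seen[1] - last_seen[1],
                          first_seen[2] - last_seen[2]) <= dist_threshold_m
        soon = 0 <= first_seen[0] - last_seen[0] <= time_threshold_s
        return near and soon

    return close(a_first, b_last) or close(b_first, a_last)
```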

[0060] If at step 308, it is determined that object identifiers A and B correspond to the same object, the object tracking system 118 records the match between object identifiers A and B, stores the matched pairs of object identifiers at step 310 and the method 300 proceeds to step 312.

[0061] Alternatively, if at step 308, it is determined that object identifiers A and B do not correspond to the same object, the object tracking system 118 directly proceeds to step 312.

[0062] At step 312 a determination is made whether further unprocessed object pairs are present. If the object tracking system 118 has not compared all the distinct object identifiers retrieved at step 304 with every other distinct object identifier in that list, the method returns to step 306 where the object tracking system 118 selects an unprocessed pair of object identifiers and repeats steps 308-312.

[0063] On the other hand, if at step 312, it is determined that all the distinct object identifiers retrieved at step 304 have been already compared with every other distinct object identifier in that list, the method proceeds to step 314 where a determination is made whether further unprocessed cameras exist.

[0064] If the object tracking system 118 has not processed the image metadata for all the cameras in the physical location 102, the method returns to step 304. Otherwise, method 300 ends.

[0065] It will be appreciated that method 300 may be performed on a scheduled basis. For example, each time a new batch of image data is retrieved from the cameras 114, process 300 may be scheduled to execute. Alternatively, method 300 may be scheduled to execute independently of when new image metadata is received from the cameras. For example, it may be scheduled to execute every 10 minutes. In this case, the object tracking system 118 may retrieve image metadata for the latest 10 minute interval each time it executes. Further, the list of matched object identifiers may be updated each time method 300 is executed.

[0066] Table B illustrates an example list of matched object identifiers obtained after one cycle of method 300.

Table B: example dataset of matched object identifiers

[0067] Each item in table B includes a pair of matched object IDs (i.e., OBJ1 and OBJ2) that were matched in one cycle of method 300 and the camera IDs of the cameras from which the corresponding raw metadata was received. In this example, camera IDs are depicted as IP addresses. However, it will be appreciated that this is merely an example, and in other examples, any other unique identifiers may be employed to distinguish between cameras. Similarly, any notation may be utilized to uniquely identify objects. In some cases, the cameras in the physical location may utilize the same object identifier notations and therefore two cameras may generate the same object identifiers. To distinguish between these object identifiers, the camera and object identifiers may be utilized in conjunction.
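The body of Table B is likewise not reproduced here. The sketch below shows, with hypothetical IP addresses and object IDs, one way such matched pairs might be recorded; qualifying each identifier with a (camera ID, object ID) tuple keeps identifiers distinct even when cameras reuse the same object ID notation.

```python
# Hypothetical matched-pair records in the shape described for Table B.
# Each object identifier is qualified by the camera it came from, so two
# cameras that happen to reuse the same object ID remain distinguishable.
matched_pairs = [
    {"obj1": ("192.168.1.21", 500), "obj2": ("192.168.1.21", 517)},
    {"obj1": ("192.168.1.22", 314), "obj2": ("192.168.1.22", 322)},
]
```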

[0068] Fig. 3 depicts one method for identifying the same object from the image metadata retrieved from each of multiple cameras individually. In addition to the raw metadata depicted in table A, the cameras 114 may also be configured to send “deleted object” messages to the object tracking system 118 when a camera stops tracking an object (e.g., due to the object moving out of view or being lost in the background). A deleted object message includes the object identifier and, in addition, may include a timestamp of when the object was deleted. In an alternative method for identifying the same object from image metadata, the object tracking system 118 utilizes these deleted object messages instead of comparing each detected object with each other detected object in a particular cycle of the method.

[0069] In particular, for each cycle of method 300, the object tracking system identifies any deleted object messages received within that timeframe. The object tracking system 118 then reviews the historical image metadata from that camera to identify the first and last occurrences of that deleted object. This object identifier along with first and last occurrences of the deleted object may be stored in an intermediate results file and this file is updated with each cycle of method 300.
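The per-cycle handling of deleted-object messages might be sketched as follows; the message fields and the representation of the intermediate results file are assumptions rather than the specification's exact format.

```python
def record_deleted_objects(deleted_messages, camera_history, intermediate_results):
    """Sketch: for each 'deleted object' message received in this cycle, look up
    the deleted object's first and last occurrences in the camera's historical
    metadata and append them to the intermediate results store."""
    for msg in deleted_messages:            # assumed to carry an 'object_id' field
        recs = [r for r in camera_history if r["object_id"] == msg["object_id"]]
        if not recs:
            continue
        intermediate_results.append({
            "object_id": msg["object_id"],
            "first": min(recs, key=lambda r: r["utc"]),
            "last": max(recs, key=lambda r: r["utc"]),
        })
```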

[0070] Further, in each cycle of method 300, the first and/or last occurrences of these deleted objects are compared with the first occurrences of newly identified objects in that cycle. The remainder of the method steps remain the same as those described with respect to Fig. 3.

[0071] In yet another example, in each cycle, the object tracking system 118 may simply identify any deleted objects from the deleted object messages received in that cycle. It may then review the historical image metadata from the corresponding camera to identify the first and last occurrences of the deleted objects. The object identifiers along with their first and last occurrences may be stored in an intermediate results file and this file may be updated with each cycle of method 300. Subsequently, (either in each subsequent cycle or at the end of the day), the first and last occurrences of the deleted objects in the intermediate results file may be compared with the first and last occurrences of the other deleted objects in the file to determine possible matches. If matches are found, these may be added to the dataset of matched identifiers (e.g., as depicted in Table B).

[0072] With this arrangement, as the object tracking system only retrieves the first and last known occurrences of deleted objects, the processing power required to perform this method is significantly reduced as the object tracking system does not have to retrieve the first and last known occurrences for each detected object in each cycle of the method.

[0073] To further improve processing efficiency, another alternative method may be employed. In this alternative method, for a first iteration of method 300 (i.e., a first time splice), the first and last known time and location for all objects identified in the time splice are extracted. Further, the object tracking system 118 identifies any “deleted object” messages received within that time splice, retrieves the object identifiers of the objects identified in the deleted object messages and marks the deleted objects as ‘old objects’. Objects in the current time splice that are not ‘old objects’ are considered ‘active objects’. The identifiers of all active objects in the time splice may be saved to a temporary file that can be updated for each cycle of method 300.

[0074] Once the objects in the time splice are categorized into deleted and active objects, for each camera, the first known location of each object identified in the time splice is compared with the last known location of each of the deleted objects to determine if the compared object identifiers correspond to the same physical object. If, for example, an object A in the time splice is compared with a deleted object B such that object A was first seen within a threshold time after object B was last seen and object A’s first known location is within a threshold distance from object B’s last known location, the object tracking system 118 determines that these object identifiers correspond to the same physical object. This step is repeated until all the objects identified in the time splice have been compared with the deleted objects for each camera. Any identified matches may be recorded in a dataset for matched object identifiers (e.g., as shown in table B).

[0075] Next (optionally), each identified deleted object is compared with all other objects identified in the time splice to determine if the compared object identifiers correspond to the same physical object. If, for example, a deleted object A is compared with an object B identified in the time splice such that object A was last seen within a threshold time before object B was first seen and object A’s last known location is within a threshold distance from object B’s first known location, the object tracking system 118 determines that these object identifiers correspond to the same physical object. This step may be repeated until all the deleted objects have been compared with the objects identified in the time splice for each camera. Any identified matches may be recorded in a dataset for matched object identifiers (e.g., as shown in table B).
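The comparison described above can be expressed as a simple predicate over first and last occurrences. The sketch below is illustrative only; the threshold values and record fields are assumptions, not values prescribed by the disclosure.

```python
import math

TIME_THRESHOLD = 5.0      # seconds (assumed example value)
DISTANCE_THRESHOLD = 1.0  # metres (assumed example value)

def likely_same_object(deleted_last, new_first,
                       time_threshold=TIME_THRESHOLD,
                       distance_threshold=DISTANCE_THRESHOLD):
    """True if a newly seen object first appears shortly after, and close to,
    the last known occurrence of a deleted object on the same camera."""
    dt = new_first["time"] - deleted_last["time"]
    dist = math.hypot(new_first["x"] - deleted_last["x"],
                      new_first["y"] - deleted_last["y"])
    return 0 <= dt <= time_threshold and dist <= distance_threshold
```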

[0076] The identified matches for each camera may be consolidated as object identifier pairs in the dataset for matched object identifiers.

[0077] In the next iteration (i.e., the next time splice), the first and last known time and location of all objects identified in that time splice may be extracted. This first and last known time is the first and last time the object is detected within that time splice. Next, the system may compare the object identifiers of the objects identified in that time splice with the active objects identified in the previous iteration and saved in the temporary file, to determine if any objects in the present time splice were also identified in the previous time splice. If any such active objects are identified, the object tracking system may replace the ‘first seen’ time and location of such active objects with the ‘first seen’ time and location recorded in the temporary file. Thereafter, the other steps of the first iteration are repeated.
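A minimal sketch of carrying the ‘first seen’ data across time splices is shown below. The in-memory dictionary standing in for the temporary file, and the field names, are illustrative assumptions.

```python
# Sketch: carry the 'first seen' time/location of active objects across
# time splices so it need not be re-extracted from raw metadata each cycle.
def reconcile_first_seen(splice_first_seen, active):
    """splice_first_seen maps (camera_id, object_id) -> first-seen record for
    the current splice; active holds the same mapping from earlier splices.
    Objects already active keep their original first-seen record."""
    for key, first_rec in splice_first_seen.items():
        if key in active:
            splice_first_seen[key] = active[key]   # keep the earlier first-seen
        else:
            active[key] = first_rec                # register newly active object
    return splice_first_seen
```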

[0078] This way, the first seen location and time of active objects may be maintained in a current file for the duration of method 300 and the object tracking system 118 does not have to extract this information from the raw image metadata for each time splice, saving processing power.

[0079] Fig. 4 illustrates an example process for matching objects across multiple cameras. As described previously, one issue with camera-identified objects is that, as an object moves between the ranges of multiple cameras, the object may be associated with multiple different identifiers. Method 400 identifies such occurrences and corrects for them. The method 400 is performed using image metadata received from all the cameras in physical location 102.

[0080] The method begins at step 402 where the object tracking system 118 extracts unprocessed image metadata corresponding to all the cameras in physical location 102. This retrieved image metadata may include, for each record, the object identifier, the corresponding time, the Cartesian location and an identifier of the camera that recorded that object. The metadata may be ordered according to the time of the records.

[0081] At step 404, the object tracking system 118 splices the extracted metadata into smaller time intervals. For example, the object tracking system 118 splices the extracted metadata into 5 second intervals, where each five second interval includes image metadata (i.e., list of records) from all the cameras 114 that recorded objects within that interval.
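One way the splicing at step 404 could be implemented is to bucket records by timestamp, as in the sketch below. The 5-second interval follows the example above; the record layout is an illustrative assumption.

```python
from collections import defaultdict

def splice_metadata(records, interval=5.0):
    """Group records from all cameras into fixed-length time splices.
    Each record is assumed to carry a numeric 'time' field (seconds)."""
    splices = defaultdict(list)
    for rec in records:
        splices[int(rec["time"] // interval)].append(rec)
    # Return the splices in time order, each as a list of records.
    return [splices[k] for k in sorted(splices)]
```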

[0082] Next (at step 406), an unprocessed splice of image metadata is selected. When the method first commences, all the metadata splices may be unprocessed, so the object tracking system 118 may select image metadata from any splice at random or based on a predetermined order.

[0083] At step 408, the object tracking system 118 selects an unprocessed pair of records A and B, where A represents an instance recorded by a first camera and B represents an instance recorded by a second camera. It will be appreciated that in the first iteration, none of the pairs of records are processed, so the object tracking system 118 may select any pair of records. Thereafter, if a particular pair of records has already been examined, the pair may be ignored and another unprocessed pair of records may be selected.

[0084] At step 410, the object tracking system 118 determines whether the records A and B correspond to the same object. When an object moves to a region that is within the field of view of two (or more) cameras, both cameras detect the object and create one or more metadata records for that object (albeit with different object identifiers). If the cameras 114 are calibrated properly, the locations of the object recorded by both cameras should theoretically be the same. Accordingly, the two objects identified by the two cameras are essentially the same and can be identified using a common object identifier. Alternatively, if the cameras are not calibrated, or have been determined not to be calibrated for use with the object tracking system 118, the object tracking system 118 may perform a step of adjusting the location data, for example by using a look-up table or translation rule, so that the location data across cameras has a common frame of reference.

[0085] To make this determination, at step 410, the object tracking system 118 compares the time and location for record A with the time and location for record B. If the difference between record A’s time and record B’s time is within a threshold value (e.g., a few milliseconds) and the difference between record A’s location and record B’s location is within a threshold distance (e.g., a radius of 25 centimetres or one metre), the object tracking system 118 determines that the records (and associated object identifiers) correspond to the same object. Alternatively, if the difference between record A’s time and record B’s time is not within the threshold value and/or the difference between record A’s location and record B’s location is not within the threshold distance, the object tracking system 118 determines that the records (and associated object identifiers) do not correspond to the same object.
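The step 410 check could be written as the following predicate. This is a sketch only: the concrete threshold values and record fields are illustrative assumptions consistent with the examples above.

```python
import math

def same_object_across_cameras(rec_a, rec_b,
                               time_threshold=0.005,     # seconds (~a few ms)
                               distance_threshold=1.0):  # metres (example)
    """Step 410 style check (sketch): two records from different cameras are
    treated as the same object if their times and locations nearly coincide."""
    if rec_a["camera_id"] == rec_b["camera_id"]:
        return False  # only compare records from different cameras
    close_in_time = abs(rec_a["time"] - rec_b["time"]) <= time_threshold
    close_in_space = math.hypot(rec_a["x"] - rec_b["x"],
                                rec_a["y"] - rec_b["y"]) <= distance_threshold
    return close_in_time and close_in_space
```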

[0086] If at step 410, it is determined that the object identifiers of records A and B correspond to the same object, the object tracking system 118 records the match between those object identifiers, stores the matched object identifiers at step 412 and the method 400 proceeds to step 414.

[0087] Alternatively, if at step 410, it is determined that object identifiers of records A and B do not correspond to the same object, the object tracking system 118 directly proceeds to step 414.

[0088] At step 414 a determination is made whether further unprocessed record pairs are present. If the object tracking system 118 has not compared all the distinct record pairs from different cameras, the method returns to step 408 where the object tracking system 118 selects an unprocessed pair of records and repeats steps 410-414.

[0089] On the other hand, if at step 414, it is determined that all the distinct record pairs have already been compared, the method proceeds to step 416 where a determination is made whether further unprocessed metadata splices are present.

[0090] If the object tracking system 118 has not processed the image metadata from all the time splices, the method returns to step 406. Otherwise, method 400 proceeds to step 418 where the object identifiers of all the matched instances are consolidated. This includes aggregating the matched records by matched object identifier pairs with a count of matched records per pair of object identifiers. In certain embodiments, any pairs of object identifiers with a count less than a threshold number (e.g., 3) may be discarded. The remaining pairs of object identifiers may be aggregated by unique object identifier, together with all object identifiers that match that given object.

[0091] Table C illustrates an example list of matched object identifiers obtained after one cycle of method 400.

Count OBJ1 OBJ2
3 42061 2679
3 42096 17004
4 42113 17164
16 42116 2688
20 42116 17176
48 42116 17195
23 42116 17201
24 42116 17206
4 42144 2688
25 42144 17209
3 2679 42061
15 2688 42116
5 2688 42144
3 17004 42096
4 17164 42113
20 17176 42116
49 17195 42116
26 17201 42116
23 17206 42116
24 17209 42144

Table C: example matched object identifiers

[0092] Each item in table C includes a pair of matched object IDs (i.e., OBJ1 and OBJ2) and a count of the number of times the object tracking system 118 determined that the two object identifiers were matched in one cycle of method 400. The list of matched identifiers may also include the camera IDs of the corresponding cameras.
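A sketch of the step 418 consolidation, including the count threshold, is given below. The input structure (an iterable of matched identifier pairs) and the threshold default are illustrative assumptions.

```python
from collections import Counter

def consolidate_matches(match_events, min_count=3):
    """Aggregate matched identifier pairs and drop pairs seen fewer than
    min_count times (step 418 style consolidation; structure assumed)."""
    counts = Counter((a, b) for a, b in match_events)
    return {pair: n for pair, n in counts.items() if n >= min_count}

# Example: a pair such as (42116, 17195) seen 48 times would be kept,
# while a pair seen only twice would be discarded.
```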

[0093] It will be appreciated that similar to method 300, method 400 may be performed on a scheduled basis. For example, each time a new batch of image data is retrieved from the cameras 114, process 400 may be scheduled to execute. Alternatively, method 400 may be scheduled to execute independently of when new image metadata is received from the cameras. For example, it may be scheduled to execute every 10 minutes. In this case, the object tracking module 118 may retrieve image metadata for the latest 10 minute interval each time it executes. Further, the list of matched object identifiers may be updated each time method 400 is executed.

[0094] Fig. 5 illustrates an example process for identifying objects that cannot be the same. In some cases, method 400 may incorrectly identify two different objects as the same object. For example, consider a situation where two or more people are walking together and are monitored by two overlapping cameras. Each camera may detect two objects and identify their locations as being very close to each other. When method 400 is performed, because the locations of the two objects are within the threshold distance (e.g., within 25 cm or 1 metre of each other), the object tracking system 118 matches the object IDs of the two objects. Accordingly, method 500 is executed to negate any false matches that may be picked up by the matching process. In particular, method 500 identifies any definite mismatches so that these can be discarded from consideration when determining whether two object identifiers correspond to the same object. For example, in the scenario above, while the two people are walking together, the matching method may match their object identifiers. However, when, at some other point in time, the two people start to walk apart, method 500 can determine that these objects are definitely not the same object and can record this. The method 500 is performed using image metadata received from all the cameras in physical location 102.

[0095] The method begins at step 502 where the object tracking system 118 extracts unprocessed image metadata corresponding to all the cameras in physical location 102. This retrieved image metadata may include, for each record, the object identifier, the corresponding time, the Cartesian location and an identifier of the camera that recorded that object. The metadata may be ordered according to the time of the records. As with the process of Figure 4, the location data may already be calibrated when received by the object tracking system 118 or may be calibrated by the object tracking system 118.

[0096] At step 504, the object tracking system 118 splices the extracted metadata into smaller time intervals. The time intervals may be the same as those used for method 400. For example, the object tracking system 118 may splice the extracted metadata into 5 second intervals, where each five second interval includes image metadata (i.e., list of records) from all the cameras 114 that recorded objects within that interval.

[0097] Next (at step 506), an unprocessed splice of image metadata is selected.

[0098] At step 508, the object tracking system 118 selects an unprocessed pair of records A and B. It will be appreciated that in the first iteration, none of the pairs of records are processed, so the object tracking system 118 may select any pair of records. Thereafter, if a particular pair of records has already been examined, the pair may be ignored and another unprocessed pair of records may be selected.

[0099] At step 510, the object tracking system 118 determines whether the records A and B correspond to completely different objects. If the locations of any two objects in physical location 102 are detected to be more than a threshold distance apart at the same time by one or more cameras, it can be safely determined that the two objects cannot be the same object (e.g., person).

[00100] To make this determination, at step 510, the object tracking system 118 compares the time and location for record A with the time and location for record B. If the difference in record A’s time and record B’s time is within a threshold value (e.g., a few milliseconds) and the difference in record A’s location and record B’s location is more than a threshold value (e.g., a radius of 25 meters), the object tracking system 118 determines that the records (and associated object identifiers) correspond to different objects. Alternatively, if the difference in record A’s time and record B’s time is not within the threshold value and/or the difference in record A’s location and record B’s location is not more than the threshold value, the object tracking system 118 cannot safely determine that the records (and associated object identifiers) do not correspond to the same object.
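The step 510 check is the mirror image of the step 410 check: simultaneous records that are far apart cannot belong to one object. A sketch follows; the thresholds and field names are illustrative assumptions consistent with the examples above.

```python
import math

def definitely_different(rec_a, rec_b,
                         time_threshold=0.005,        # seconds (~a few ms)
                         separation_threshold=25.0):  # metres (example value)
    """Step 510 style check (sketch): if two records are effectively
    simultaneous but far apart, the identifiers cannot be the same object."""
    simultaneous = abs(rec_a["time"] - rec_b["time"]) <= time_threshold
    far_apart = math.hypot(rec_a["x"] - rec_b["x"],
                           rec_a["y"] - rec_b["y"]) > separation_threshold
    return simultaneous and far_apart
```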

[00101] If at step 510, it is determined that the object identifiers of records A and B cannot correspond to the same object, the object tracking system 118 records the mismatch between those object identifiers at step 512 and the method 500 proceeds to step 514.

[00102] Alternatively, if at step 510, it is determined that it is not possible to safely conclude that object identifiers of records A and B do not correspond to the same object, the object tracking system 118 directly proceeds to step 514.

[00103] At step 514 a determination is made whether further unprocessed record pairs are present. If the object tracking system 118 has not compared all the distinct record pairs from different cameras, the method returns to step 508 where the object tracking system 118 selects an unprocessed pair of records and repeats steps 510-514.

[00104] On the other hand, if at step 514, it is determined that all the distinct record pairs have already been compared, the method proceeds to step 516 where a determination is made whether further unprocessed metadata splices are present.

[00105] If the object tracking system 118 has not processed the image metadata from all the time splices, the method returns to step 506. Otherwise, method 500 proceeds to step 518 where the object identifiers of all the mismatched instances are consolidated. This includes aggregating the mismatched records by mismatched object identifier pairs with a count of mismatched records per pair of object identifiers. In certain embodiments, any pairs of object identifiers with a count less than a threshold number (e.g., 3) may be discarded. The remaining pairs of mismatched object identifiers may be aggregated by unique object identifier, together with all object identifiers that mismatch with that given object.

[00106] Table D illustrates an example list of mismatched object identifiers obtained after one cycle of method 500.

Count OBJ1 OBJ2
7 42061 42111
5 42061 2669
7 42061 17164
6 42096 42111
2 42096 42113
148 42096 42116
5 42096 42144
8 42096 17164
5 42096 17169
45 42096 17176
76 42096 17195
19 42096 17196
6 42096 17200
12 42096 17201
20 42096 17206
4 42096 17209
7 42111 42061
6 42111 42096
5 42111 17004
2 42113 42096
148 42116 42096
7 42116 17004
3 42116 17196
3 42116 17200
5 42144 42096
4 2669 42061
5 17004 42111
7 17004 42116

Table D: example mismatched object identifiers

[00107] Each item in table D includes a pair of mismatched object IDs (i.e., OBJ1 and OBJ2) and a count of the number of times the object tracking system 118 determined that the two object identifiers were mismatched in one cycle of method 500. The list of mismatched identifiers may also include the camera IDs of the corresponding cameras.

[00108] It will be appreciated that method 500 may also be performed on a scheduled basis. For example, each time a new batch of image data is retrieved from the cameras 114, process 500 may be scheduled to execute. Alternatively, method 500 may be scheduled to execute independently of when new image metadata is received from the cameras. For example, it may be scheduled to execute every 10 minutes. In this case, the object tracking module 118 may retrieve image metadata for the latest 10 minute interval each time it executes. Further, the list of mismatched object identifiers may be updated each time method 500 is executed.

[00109] Fig. 6 is a flowchart illustrating an example method 600 for identifying and tracking objects in the physical location 102.

[00110] The method begins at step 602 where the outputs of methods 300 and 400 are combined. In certain embodiments, the data in tables B and C may be combined into a single dataset such that the combined dataset includes the list of all object matches. This may also include aggregating exact matching pairs. For example, if object identifier A and object identifier B were considered to correspond to the same object in one time splice of method 400 and the same object identifiers were also considered to correspond to the same object in another time splice of method 400, at this step, the two items in the combined list may be combined such that only one record remains. However, the count for the matched object identifiers is the sum of the count for each individual listed pair. A similar process is carried out for the output of method 500 - i.e., exact mismatching pairs are aggregated.

[00111] At step 604, the matched object identifier pairs in the combined list are re-aggregated based on unique object identifiers. For example, if one matched pair is object identifier A = object identifier B and another match pair is object identifier B = object identifier C, the object tracking system 118 aggregates these object identifiers as follows -

A = [B,C]

A similar process is carried out on the output of method 500 at this stage such that mismatched object identifier pairs are re-aggregated based on unique object identifiers. For example, if two records in the mismatched list show that object identifier A is not matched with object identifier F and object identifier A is not matched with object identifier G, the aggregated mismatched object identifiers for these two records may be as follows -

A ≠ [F, G]

[00112] In one embodiment, these aggregated matched and mismatched object identifiers may be stored as tuples.

[00113] At step 606, the aggregated matched object identifiers after step 604 are compared with the aggregated mismatched object identifiers obtained at the end of step 604. If any mismatched object identifiers are identified in the matched object identifiers list, the object identifier is removed. For example, if the matched object identifiers list indicates that object identifiers A and E are associated with the same object and the mismatched object identifiers list indicates that A and E are mismatched objects, object identifier E may be removed from the aggregated matched identifiers corresponding to object identifier A in the list of matched object identifiers obtained at the end of method step 604. In one embodiment, the object tracking system 118 iteratively compares each pair of mismatched object identifiers with each pair of matched object identifiers at this step.
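A sketch of the step 606 filtering is shown below. The data structures (dictionaries mapping a primary identifier to its matched group and to its mismatched identifiers) are illustrative assumptions.

```python
def remove_mismatches(matched, mismatched):
    """Drop any identifier from a matched group if the mismatch list says it
    cannot be the same object as the group's primary identifier (step 606)."""
    cleaned = {}
    for primary, candidates in matched.items():
        excluded = mismatched.get(primary, set())
        cleaned[primary] = [c for c in candidates if c not in excluded]
    return cleaned

# Example: matched {'A': ['B', 'C', 'E']} and mismatched {'A': {'E'}}
# would yield {'A': ['B', 'C']}.
```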

[00114] Next (at step 608), the primary object identifier of the aggregated matched object identifiers after step 606 is copied into the matched object identifier array to form a tuple. For example, the aggregated matched object identifiers A = [B, C] would be changed into A = [A, B, C].

[00115] At this stage there may be a number of items in the list that have common object identifiers. For example, there may be two items in the list such as below -

A = [A, B, C]

S = [S, A, N, L]

At this step, the object tracking system 118 is configured to merge these two items as all the object identifiers in these two items must correspond to the same object such that the two items become

A = [A, B, C, S, N, L]

In one embodiment, the object tracking system 118 iteratively determines if any common object identifiers are present in two or more items in the list and merges them until no more common object identifiers exist between any two items in the list. In another example, more efficient techniques such as graph logic may be employed to merge the tuples at this stage. In other words, once step 608 is completed, no two tuples include the same object identifier. This is an output of method 600.
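The merging described above is equivalent to computing connected components over identifiers. A minimal union-find sketch is given below; it is one possible realisation of the “graph logic” mentioned, not the specification's implementation, and the function name and input format are illustrative.

```python
from collections import defaultdict

def merge_groups(groups):
    """Merge identifier groups that share any member (step 608 style merge)
    so that no two resulting groups contain the same identifier."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for group in groups:
        for member in group[1:]:
            union(group[0], member)

    merged = defaultdict(set)
    for x in parent:
        merged[find(x)].add(x)
    return [sorted(members, key=str) for members in merged.values()]

# Example: merge_groups([['A', 'B', 'C'], ['S', 'A', 'N', 'L']])
# yields a single group containing A, B, C, L, N and S.
```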

[00116] Table E illustrates an example output of one cycle of method 600.

Unique ID Matched object identifiers
1 42061, 2679
2 42096, 17004
3 42113, 42116, 42144, 2688, 17164, 17176, 17195, 17201, 17206, 17209
4 42111, 42121
5 17196, 17200

Table E: example consolidated matched object identifiers

[00117] Each item in table E includes a unique ID for the corresponding tuple (which may be assigned based on any naming convention) and the corresponding tuple of matched object identifiers such that each row in table E corresponds to one object/person in the physical location 102.

[00118] Each item in the dataset represents a distinct object in the physical location 102 and the tuple of object identifiers for each item represents all the object identifiers by which that object was recorded by the cameras 114 in the physical location. If the records for each of those object identifiers are retrieved from the original image metadata and arranged in a time order, that object’s movement in the physical location can be determined.

[00119] In some embodiments, the output of method 600 may be forwarded to an output system that is configured to further process the output to display the output on a display of a computing device, e.g., a user computing device. Users can then analyse the object tracking data to determine, e.g., the number of people in the physical location at any given time, the number of people that visited the physical location in a given period, the dwell times of people in the physical location, the paths taken by different people within the physical location and/or the density of people in any given region of the physical location, etc. This information may be utilized to optimize the layout of the physical location, manage manpower, etc.

[00120] The output may be displayed in any suitable fashion. In one example, the output may be displayed in the form of charts or graphs (such as heat maps, density maps, etc.). In other examples, the output may be displayed in the form of tabular or text data.

[00121] It will be appreciated that method 600 is performed after methods 300-500 are performed. In some embodiments, method 600 may be performed immediately after methods 300-500 are performed. In other embodiments, this method may be performed at a later stage (as required). Further still, two or more of methods 300-500 may be performed sequentially or in parallel without departing from the scope of the present disclosure. For example, methods 300, 400 and 500 may be executed in realtime/near-realtime as independent processes producing the three intermediate result sets in parallel. Method 600 may be executed in a scheduled manner (with a frequency based on requirements) to process the intermediate results available up to that scheduled execution and produce an updated final result set.

[00122] Further still, as described above, Figs. 3 to 6 illustrate exemplary methods 300-600 as collections of steps in logical flowcharts. The flowcharts represent sequences of operations that can be implemented in hardware, software, or a combination thereof. When implemented in software, computer readable instructions are stored and executed by the object tracking system 118 to cause the system to perform the described operations. The software instructions may be formed as one or more code modules, each for performing one or more particular tasks. The order in which a given process is described is not necessarily intended to be construed as a limitation. In certain cases, the steps can be performed in a different order to that shown. Further, in certain cases multiple steps can be combined into a single step, a single step can be performed as multiple separate steps, additional steps can be included, and/or described steps can be omitted.

[00123] In the present disclosure, it is assumed that cameras 114 include an analytics processor that is configured to analyse image data to identify objects and generate the object-based image metadata. However, this may not always be the case. In some embodiments, the analytics processor may be a separate, independent system that is connected to the cameras 114. In these embodiments, the analytics processor may receive image data from the cameras and analyse this data to identify objects and generate object-based image metadata. In this case, the object tracking system 118 may not be in direct communication with the cameras 114. Instead, it may be in communication with the analytics processor.

[00124] The operations of the object tracking system are described with respect to the location and time image metadata to identify and track objects. In other embodiments, one or more other image metadata such as object speed, direction, size, and/or height may be utilized in addition to location or instead of location to identify and track objects. For example, objects having the same/similar size that are recorded by two cameras at about the same time may be considered matching objects. Similarly, two objects with significantly different sizes recorded at the same time may be considered to be mismatched objects. Similarly, speed and direction of motion can be utilized to match or mismatch object identifier pairs. The particular combination of image metadata types utilized for identification and tracking may depend on the particular application. For example, speed and direction matching may be appropriate characteristics to identify and track vehicles.

[00125] Further, in the present disclosure, systems and methods are described that utilize in-camera matches (as described in method 300) and inter-camera matches (as described in method 400) in combination. However, in other implementations, method 300 or method 400 may be implemented without the other method. For example, if an environment is monitored by a single camera or sensor, method 400 may be skipped. Similarly, if a camera system does not detect objects based on movement or lack thereof, the method of identifying and/or tracking objects may be implemented without the need for method 300.

[00126] Further still, in the present disclosure, the metadata associated with detected objects is referred to as “image metadata”. This applies when the sensors in the physical location are cameras. However, when other types of sensors are utilized, the metadata associated with the detected objects may simply be referred to as “object metadata”. Accordingly, in this disclosure, the terms “image metadata” and “object metadata” may be interchangeably used.

[00127] In addition, it will be appreciated that the systems and methods described herein employ object metadata to identify the same physical objects detected by the same sensor or multiple sensors. Enhancements to this method may be contemplated. For example, in addition to object metadata, the systems and methods described herein may identify the same physical objects based on rule-based triggers. The rules may vary depending on the type of environment being monitored. For example, in case the location is a bank branch, which has assigned teller booth locations, rules may be configured to determine that an object that enters a teller booth must be the same object that exits the teller booth. Similarly, if a person enters a private area, the system may be configured to determine that the person that subsequently steps out of the private area is the same person that entered the private area.

[00128] In the foregoing specification, embodiments of the present disclosure are described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

[00129] It will be understood that the invention disclosed and defined in this specification extends to all alternative combinations of two or more of the individual features mentioned or evident from the text or drawings. All of these different combinations constitute various alternative aspects of the invention.