Title:
AUTOMATIC LABELLING OF OBJECTS IN IMAGES
Document Type and Number:
WIPO Patent Application WO/2021/199024
Kind Code:
A1
Abstract:
A method which includes, by a processor and memory circuitry, providing a label for an image of a scene, the label being representative of whether the scene includes a given type of object, the method including: - obtaining, from a first data source, metadata of the image of the scene, the metadata including position and time data of the scene, - obtaining, from a second data source, data indicative of position over time and type of object of each object of a plurality of objects, - determining whether one or more objects have a position matching the position data of the metadata at a time matching the time data of the metadata, and - providing the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

Inventors:
PENSO ROEE (IL)
Application Number:
PCT/IL2021/050296
Publication Date:
October 07, 2021
Filing Date:
March 18, 2021
Assignee:
ISRAEL AEROSPACE IND LTD (IL)
International Classes:
G06F16/43; G06V10/764
Domestic Patent References:
WO2019171227A12019-09-12
Foreign References:
US20130129142A12013-05-23
US20090190797A12009-07-30
US20120300089A12012-11-29
Attorney, Agent or Firm:
HAUSMAN, Ehud (IL)
Claims:
CLAIMS

1. A method including, by a processor and memory circuitry (PMC), providing a label for an image of a scene, the label being representative of whether the scene includes a given type of object, including:

- obtaining, from a first data source, metadata of the image of the scene, the metadata including: o time data representative of a time at which the image has been acquired, and o position data representative of a position of the scene;

- obtaining, from a second data source, data indicative of position over time and type of object of each object of a plurality of objects,

- determining whether one or more objects have a position matching the position data of the metadata at a time matching the time data of the metadata, and

- providing the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

2. The method of claim 1, wherein the image includes a plurality of subsets including one or more pixels, wherein each subset is associated with different time data representative of a time at which the subset has been acquired.

3. The method of any of claims 1 or 2, wherein if an object of the one or more objects is of the given type, the method includes selecting a limited area of the image including the object of the given type, based on the position of the object.

4. The method of any of claims 1 to 3, wherein:

- if an object of the one or more objects is of the given type, the label indicates presence of the given type of object in the scene;

- if no object of the one or more objects is of the given type, the label includes a prospect that presence of the given type of object in the scene is below a threshold.

5. The method of any of claims 1 to 4, wherein the second data source includes, for each object of the plurality of objects, a type of the object, wherein the method includes determining whether an object of the plurality of objects meets (a) and (b):

(a) position of the object matches the position data of the metadata at a time matching the time data of the metadata, and

(b) a type of the object is the given type.

6. The method of any of claims 1 to 5, wherein the second data source includes a library storing position over time of a plurality of objects each of the given type, wherein the method includes:

- determining whether an object of the plurality of objects has a position matching the position data of the metadata at a time matching the time data of the metadata.

7. The method of any of claims 1 to 6, wherein

- upon obtaining a given library out of a plurality of different libraries, wherein each object of the plurality of objects of the given library is of the given type, wherein, for each library: o the library stores, for each object of a plurality of objects, data representative of a position of the object over time, o each object of the plurality of objects of the library is of a same type which differs from a type of objects stored in other libraries of the plurality of libraries, the method includes: o determining, based on the given library, whether a given object of the plurality of objects of the given library has a position matching the position data of the metadata at a time matching the time data of the metadata, and o providing the label, wherein the label depends on whether the given object has been determined.

8. The method of any of claims 1 to 7, further including training a software module to detect presence of the given type of object in an image, based on the image of the scene and the label.

9. The method of any of claims 1 to 8, wherein the determining and the providing are performed automatically by the PMC.

10. The method of any of claims 1 to 9, wherein at least one of (a) and (b) is met:

(a) the given type of object is present in the scene and the given type of object is invisible in the image of the scene for a human, and

(b) the image is a satellite image.

11. The method of any of claims 1 to 10, including providing a label for an image of a scene, the label being representative of whether the scene includes a second given type of object which is a static object, including:

- obtaining, from a first data source, metadata of the image, the metadata including position data representative of a position of the scene,

- obtaining, from a second data source, data indicative of position and type of object of each object of a plurality of objects,

- determining whether one or more objects have a position matching the position data of the metadata, and

- providing the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

12. The method of any of claims 1 to 11, wherein:

- the image of the scene has been acquired by a sensor of a first type,

- data indicative of position over time and type of object of each object of a plurality of objects is obtained based on images acquired by a sensor of a second type, different from the first type.

13. The method of claim 12, wherein, for a scene including an object of the given type, the object of the given type is better identifiable in an image of the scene acquired by a sensor of the second type than in an image of the scene acquired by a sensor of the first type.

14. A method including, by a processor and memory circuitry (PMC), providing a label for an image of a scene, the label being representative of whether the scene includes a given type of object, including:

- obtaining, from a first data source, metadata of the image, the metadata including position data representative of a position of the scene,

- obtaining, from a second data source, data indicative of position and type of object of each object of a plurality of objects,

- determining whether one or more objects have a position matching the position data of the metadata, and

- providing the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

15. The method of claim 14, wherein the given type of object is a static object.

16. The method of any of claims 14 or 15, including:

- upon obtaining a given library out of a plurality of different libraries, wherein each object of the plurality of objects of the given library is of the given type, wherein, for each library: o the library stores, for each object of a plurality of objects, data representative of a position of the object, o each object of the plurality of objects of the library is of a same type which differs from a type of objects stored in other libraries of the plurality of libraries,

- determining, based on the given library, whether a given object of the plurality of objects of the given library has a position matching the position data of the metadata, and

- providing the label, wherein the label depends on whether the given object has been determined.

17. The method of any of claims 14 to 16, wherein: - the image of the scene has been acquired by a sensor of a first type,

- data indicative of position and type of object of each object of a plurality of objects is obtained based on images acquired by a sensor of a second type, different from the first type.

18. The method of claim 17, wherein, for a scene including an object of the given type, the object of the given type is better identifiable in an image of the scene acquired by a sensor of the second type than in an image of the scene acquired by a sensor of the first type.

19. A method including, by a processor and memory circuitry (PMC):

- training a software module to detect presence of a given type of object in an image, the training including feeding to the software module an image of a scene and a label representative of whether the scene includes the given type of object, wherein generation of the label includes: o obtaining, from a first data source, metadata of the image of the scene, the metadata including time data representative of a time at which the image has been acquired, and position data representative of a position of the scene, and o obtaining, from a second data source, data indicative of position over time and type of object of each object of a plurality of objects, o determining whether one or more objects have a position matching the position data of the metadata at a time matching the time data of the metadata, wherein the label depends on whether an object of the one or more objects is of the given type.

20. A system including a processor and memory circuitry (PMC), configured to provide a label for an image of a scene, the label being representative of whether the scene includes a given type of object, the system being configured to:

- obtain, from a first data source, metadata of the image of the scene, the metadata including: o time data representative of a time at which the image has been acquired, and o position data representative of a position of the scene;

- obtain, from a second data source, data indicative of position over time and type of object of each object of a plurality of objects,

- determine whether one or more objects have a position matching the position data of the metadata at a time matching the time data of the metadata, and

- provide the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

21. The system of claim 20, wherein the image includes a plurality of subsets including one or more pixels, wherein each subset is associated with different time data representative of a time at which the subset has been acquired.

22. The system of any of claims 20 or 21, wherein if an object of the one or more objects is of the given type, the system is configured to select a limited area of the image including the object of the given type, based on the position of the object.

23. The system of any of claims 20 to 22, wherein:

- if an object of the one or more objects is of the given type, the label indicates presence of the given type of object in the scene;

- if no object of the one or more objects is of the given type, the label indicates absence of the given type of object in the scene.

24. The system of any of claims 20 to 23, wherein the second data source includes, for each object of the plurality of objects, a type of the object, wherein the system is configured to determine whether an object of the plurality of objects meets (a) and (b):

(a) position of the object matches the position data of the metadata at a time matching the time data of the metadata, and

(b) a type of the object is the given type.

25. The system of any of claims 20 to 24, wherein the second data source includes a library storing position over time of a plurality of objects each of the given type, wherein the system is configured to determine whether an object of the plurality of objects has a position matching the position data of the metadata at a time matching the time data of the metadata.

26. The system of any of claims 20 to 25, wherein, upon obtaining a given library out of a plurality of different libraries, wherein each object of the plurality of objects of the given library is of the given type, wherein, for each library the library stores, for each object of a plurality of objects, data representative of a position of the object over time and each object of the plurality of objects of the library is of a same type which differs from a type of objects stored in other libraries of the plurality of libraries, the system is configured to:

- determine, based on the given library, whether a given object of the plurality of objects of the given library has a position matching the position data of the metadata at a time matching the time data of the metadata, and

- provide the label, wherein the label depends on whether the given object has been determined.

27. The system of any of claims 20 to 26, configured to train a software module to detect presence of the given type of object in an image, based on the image of the scene and the label.

28. The system of any of claims 20 to 27, wherein the determining and the providing are performed automatically by the PMC.

29. The system of any of claims 20 to 28, wherein at least one of (a) and (b) is met:

(a) the given type of object is present in the scene and the given type of object is invisible in the image of the scene for a human, and

(b) the image is a satellite image.

30. The system of any of claims 20 to 29, configured to provide a label for an image of a scene, the label being representative of whether the scene includes a second given type of object which is a static object, the system being configured to:

- obtain, from a first data source, metadata of the image, the metadata including position data representative of a position of the scene,

- obtain, from a second data source, data indicative of position and type of object of each object of a plurality of objects,

- determine whether one or more objects have a position matching the position data of the metadata, and

- provide the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

31. The system of any of claims 20 to 30, wherein:

- the image of the scene has been acquired by a sensor of a first type,

- data indicative of position over time and type of object of each object of a plurality of objects is obtained based on images acquired by a sensor of a second type, different from the first type.

32. The system of claim 31, wherein, for a scene including an object of the given type, the object of the given type is better identifiable in an image of the scene acquired by a sensor of the second type than in an image of the scene acquired by a sensor of the first type.

33. A system including a processor and memory circuitry (PMC), configured to provide a label for an image of a scene, the label being representative of whether the scene includes a given type of object, the system being configured to:

- obtain, from a first data source, metadata of the image, the metadata including position data representative of a position of the scene,

- obtain, from a second data source, data indicative of position and type of object of each object of a plurality of objects,

- determine whether one or more objects have a position matching the position data of the metadata, and provide the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

34. The system of claim 33, wherein the given type of object is a static object.

35. The system of any of claims 33 or 34, wherein, upon obtaining a given library out of a plurality of different libraries, wherein each object of the plurality of objects of the given library is of the given type, wherein, for each library the library stores, for each object of a plurality of objects, data representative of a position of the object, and each object of the plurality of objects of the library is of a same type which differs from a type of objects stored in other libraries of the plurality of libraries, the system is configured to:

- determine, based on the given library, whether a given object of the plurality of objects of the given library has a position matching the position data of the metadata, and

- provide the label, wherein the label depends on whether the given object has been determined.

36. The system of any of claims 33 to 35, wherein:

- the image of the scene has been acquired by a sensor of a first type, and

- data indicative of position and type of object of each object of a plurality of objects is obtained based on images acquired by a sensor of a second type, different from the first type.

37. The system of claim 36, wherein, for a scene including an object of the given type, the object of the given type is better identifiable in an image of the scene acquired by a sensor of the second type than in an image of the scene acquired by a sensor of the first type.

38. A non-transitory storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform operations including providing a label for an image of a scene, the label being representative of whether the scene includes a given type of object, the operations including:

- obtaining, from a first data source, metadata of the image of the scene, the metadata including: o time data representative of a time at which the image has been acquired, and o position data representative of a position of the scene;

- obtaining, from a second data source, data indicative of position over time and type of object of each object of a plurality of objects,

- determining whether one or more objects have a position matching the position data of the metadata at a time matching the time data of the metadata, and

- providing the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

39. A non-transitory storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform operations including providing a label for an image of a scene, the label being representative of whether the scene includes a given type of object, the operations further including:

- obtaining, from a first data source, metadata of the image, the metadata including position data representative of a position of the scene,

- obtaining, from a second data source, data indicative of position and type of object of each object of a plurality of objects,

- determining whether one or more objects have a position matching the position data of the metadata, and

- providing the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

40. A non-transitory storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform operations including: - training a software module to detect presence of a given type of object in an image, the training including feeding to the software module an image of a scene and a label representative of whether the scene includes the given type of object, wherein generation of the label includes: o obtaining, from a first data source, metadata of the image of the scene, the metadata including time data representative of a time at which the image has been acquired, and position data representative of a position of the scene, and o obtaining, from a second data source, data indicative of position over time and type of object of each object of a plurality of objects, o determining whether one or more objects have a position matching the position data of the metadata at a time matching the time data of the metadata, wherein the label depends on whether an object of the one or more objects is of the given type.

Description:
AUTOMATIC LABELLING OF OBJECTS IN IMAGES

REFERENCE TO RELATED APPLICATIONS

Priority is claimed from Israeli Patent Application No. 273722 entitled "AUTOMATIC LABELLING OF OBJECTS IN IMAGES" and filed March 31, 2020, the disclosure of which application is hereby incorporated by reference.

TECHNICAL FIELD

The presently disclosed subject matter relates to methods and systems for labelling images, in particular satellite images.

BACKGROUND

In various technical domains, it is required to label images. This task is challenging: prior art methods are time-consuming, costly, and not always applicable.

There is now a need to provide new systems and methods for automatic labelling of images, in particular satellite images.

GENERAL DESCRIPTION

In accordance with certain aspects of the presently disclosed subject matter, there is provided a method including, by a processor and memory circuitry (PMC), providing a label for an image of a scene, the label being representative of whether the scene includes a given type of object, including: obtaining, from a first data source, metadata of the image of the scene, the metadata including time data representative of a time at which the image has been acquired, and position data representative of a position of the scene; obtaining, from a second data source, data indicative of position over time and type of object of each object of a plurality of objects; determining whether one or more objects have a position matching the position data of the metadata at a time matching the time data of the metadata; and providing the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

In addition to the above features, the method according to this aspect of the presently disclosed subject matter can optionally comprise one or more of features (i) to (xii) below, in any technically possible combination or permutation:

i. the image includes a plurality of subsets including one or more pixels, wherein each subset is associated with different time data representative of a time at which the subset has been acquired;

ii. if an object of the one or more objects is of the given type, the method includes selecting a limited area of the image including the object of the given type, based on the position of the object;

iii. if an object of the one or more objects is of the given type, the label indicates presence of the given type of object in the scene; if no object of the one or more objects is of the given type, the label includes a prospect that presence of the given type of object in the scene is below a threshold;

iv. the second data source includes, for each object of the plurality of objects, a type of the object, wherein the method includes determining whether an object of the plurality of objects meets (a) and (b): (a) position of the object matches the position data of the metadata at a time matching the time data of the metadata, and (b) a type of the object is the given type;

v. the second data source includes a library storing position over time of a plurality of objects each of the given type, wherein the method includes determining whether an object of the plurality of objects has a position matching the position data of the metadata at a time matching the time data of the metadata;

vi. upon obtaining a given library out of a plurality of different libraries, wherein each object of the plurality of objects of the given library is of the given type, wherein, for each library, the library stores, for each object of a plurality of objects, data representative of a position of the object over time, and each object of the plurality of objects of the library is of a same type which differs from a type of objects stored in other libraries of the plurality of libraries, the method includes determining, based on the given library, whether a given object of the plurality of objects of the given library has a position matching the position data of the metadata at a time matching the time data of the metadata, and providing the label, wherein the label depends on whether the given object has been determined;

vii. the method includes training a software module to detect presence of the given type of object in an image, based on the image of the scene and the label;

viii. the determining and the providing are performed automatically by the PMC;

ix. at least one of (a) and (b) is met: (a) the given type of object is present in the scene and the given type of object is invisible in the image of the scene for a human, and (b) the image is a satellite image;

x. the method includes providing a label for an image of a scene, the label being representative of whether the scene includes a second given type of object which is a static object, including: obtaining, from a first data source, metadata of the image, the metadata including position data representative of a position of the scene; obtaining, from a second data source, data indicative of position and type of object of each object of a plurality of objects; determining whether one or more objects have a position matching the position data of the metadata; and providing the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type;

xi. the image of the scene has been acquired by a sensor of a first type, and data indicative of position over time and type of object of each object of a plurality of objects is obtained based on images acquired by a sensor of a second type, different from the first type; and

xii. for a scene including an object of the given type, the object of the given type is better identifiable in an image of the scene acquired by a sensor of the second type than in an image of the scene acquired by a sensor of the first type.

According to another aspect of the presently disclosed subject matter there is provided a non-transitory storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform operations as described above. In some embodiments, this can include one or more of features (i) to (xii), in any technically possible combination or permutation.

According to another aspect of the presently disclosed subject matter there is provided a method including, by a processor and memory circuitry (PMC), providing a label for an image of a scene, the label being representative of whether the scene includes a given type of object, including obtaining, from a first data source, metadata of the image, the metadata including position data representative of a position of the scene, obtaining, from a second data source, data indicative of position and type of object of each object of a plurality of objects, determining whether one or more objects have a position matching the position data of the metadata, and providing the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

In addition to the above features, the method according to this aspect of the presently disclosed subject matter can optionally comprise one or more of features (xiii) to (xvi) below, in any technically possible combination or permutation:

xiii. the given type of object is a static object;

xiv. upon obtaining a given library out of a plurality of different libraries, wherein each object of the plurality of objects of the given library is of the given type, wherein, for each library, the library stores, for each object of a plurality of objects, data representative of a position of the object, and each object of the plurality of objects of the library is of a same type which differs from a type of objects stored in other libraries of the plurality of libraries, determining, based on the given library, whether a given object of the plurality of objects of the given library has a position matching the position data of the metadata, and providing the label, wherein the label depends on whether the given object has been determined;

xv. the image of the scene has been acquired by a sensor of a first type, and data indicative of position and type of object of each object of a plurality of objects is obtained based on images acquired by a sensor of a second type, different from the first type; and

xvi. for a scene including an object of the given type, the object of the given type is better identifiable in an image of the scene acquired by a sensor of the second type than in an image of the scene acquired by a sensor of the first type.

According to another aspect of the presently disclosed subject matter there is provided a non-transitory storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform operations as described above. In some embodiments, this can include one or more of features (xiii) to (xvi), in any technically possible combination or permutation.

According to another aspect of the presently disclosed subject matter there is provided a method including, by a processor and memory circuitry (PMC), training a software module to detect presence of a given type of object in an image, the training including feeding to the software module an image of a scene and a label representative of whether the scene includes the given type of object, wherein generation of the label includes obtaining, from a first data source, metadata of the image of the scene, the metadata including time data representative of a time at which the image has been acquired, and position data representative of a position of the scene, and obtaining, from a second data source, data indicative of position over time and type of object of each object of a plurality of objects, determining whether one or more objects have a position matching the position data of the metadata at a time matching the time data of the metadata, wherein the label depends on whether an object of the one or more objects is of the given type.

According to another aspect of the presently disclosed subject matter there is provided a non-transitory storage device readable by a machine, tangibly embodying a program of instructions executable by the machine to perform operations as described above.

According to another aspect of the presently disclosed subject matter there is provided a system including a processor and memory circuitry (PMC), configured to provide a label for an image of a scene, the label being representative of whether the scene includes a given type of object, the system being configured to obtain, from a first data source, metadata of the image of the scene, the metadata including time data representative of a time at which the image has been acquired, and position data representative of a position of the scene, obtain, from a second data source, data indicative of position over time and type of object of each object of a plurality of objects, determine whether one or more objects have a position matching the position data of the metadata at a time matching the time data of the metadata, and provide the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

In addition to the above features, the system according to this aspect of the presently disclosed subject matter can optionally comprise one or more of features (xvii) to (xxix) below, in any technically possible combination or permutation:

xvii. the image includes a plurality of subsets including one or more pixels, wherein each subset is associated with different time data representative of a time at which the subset has been acquired;

xviii. if an object of the one or more objects is of the given type, the system is configured to select a limited area of the image including the object of the given type, based on the position of the object;

xix. if an object of the one or more objects is of the given type, the label indicates presence of the given type of object in the scene;

xx. if no object of the one or more objects is of the given type, the label indicates absence of the given type of object in the scene;

xxi. the second data source includes, for each object of the plurality of objects, a type of the object, wherein the system is configured to determine whether an object of the plurality of objects meets (a) and (b): (a) position of the object matches the position data of the metadata at a time matching the time data of the metadata, and (b) a type of the object is the given type;

xxii. the second data source includes a library storing position over time of a plurality of objects each of the given type, wherein the system is configured to determine whether an object of the plurality of objects has a position matching the position data of the metadata at a time matching the time data of the metadata;

xxiii. upon obtaining a given library out of a plurality of different libraries, wherein each object of the plurality of objects of the given library is of the given type, wherein, for each library, the library stores, for each object of a plurality of objects, data representative of a position of the object over time, and each object of the plurality of objects of the library is of a same type which differs from a type of objects stored in other libraries of the plurality of libraries, the system is configured to determine, based on the given library, whether a given object of the plurality of objects of the given library has a position matching the position data of the metadata at a time matching the time data of the metadata, and provide the label, wherein the label depends on whether the given object has been determined;

xxiv. the system is configured to train a software module to detect presence of the given type of object in an image, based on the image of the scene and the label;

xxv. the determining and the providing are performed automatically by the PMC;

xxvi. at least one of (a) and (b) is met: (a) the given type of object is present in the scene and the given type of object is invisible in the image of the scene for a human, and (b) the image is a satellite image;

xxvii. the system is configured to provide a label for an image of a scene, the label being representative of whether the scene includes a second given type of object which is a static object, the system being configured to obtain, from a first data source, metadata of the image, the metadata including position data representative of a position of the scene, obtain, from a second data source, data indicative of position and type of object of each object of a plurality of objects, determine whether one or more objects have a position matching the position data of the metadata, and provide the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type;

xxviii. the image of the scene has been acquired by a sensor of a first type, and data indicative of position over time and type of object of each object of a plurality of objects is obtained based on images acquired by a sensor of a second type, different from the first type; and

xxix. for a scene including an object of the given type, the object of the given type is better identifiable in an image of the scene acquired by a sensor of the second type than in an image of the scene acquired by a sensor of the first type.

According to another aspect of the presently disclosed subject matter there is provided a system including a processor and memory circuitry (PMC), configured to provide a label for an image of a scene, the label being representative of whether the scene includes a given type of object, the system being configured to obtain, from a first data source, metadata of the image, the metadata including position data representative of a position of the scene, obtain, from a second data source, data indicative of position and type of object of each object of a plurality of objects, determine whether one or more objects have a position matching the position data of the metadata, and provide the label for the image, wherein the label depends on whether an object of the one or more objects is of the given type.

In addition to the above features, the system according to this aspect of the presently disclosed subject matter can optionally comprise one or more of features (xxx) to (xxxiii) below, in any technically possible combination or permutation:

xxx. the given type of object is a static object;

xxxi. upon obtaining a given library out of a plurality of different libraries, wherein each object of the plurality of objects of the given library is of the given type, wherein, for each library, the library stores, for each object of a plurality of objects, data representative of a position of the object, and each object of the plurality of objects of the library is of a same type which differs from a type of objects stored in other libraries of the plurality of libraries, the system is configured to determine, based on the given library, whether a given object of the plurality of objects of the given library has a position matching the position data of the metadata, and provide the label, wherein the label depends on whether the given object has been determined;

xxxii. the image of the scene has been acquired by a sensor of a first type, and data indicative of position and type of object of each object of a plurality of objects is obtained based on images acquired by a sensor of a second type, different from the first type; and

xxxiii. for a scene including an object of the given type, the object of the given type is better identifiable in an image of the scene acquired by a sensor of the second type than in an image of the scene acquired by a sensor of the first type.

According to some embodiments, the proposed solution is able to automatically label images. In particular, according to some embodiments, involvement of human operators is reduced, or even eliminated.

According to some embodiments, the proposed solution is able to generate a large amount of labelled images. As a consequence, quality and efficiency of model training can be increased. Object detection is therefore improved.

According to some embodiments, the proposed solution reduces the time and cost required to label images.

According to some embodiments, the proposed solution provides labelling of satellite images which can be of low resolution.

According to some embodiments, the proposed solution provides labelling of images which could not be performed by a human operator.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to understand the invention and to see how it can be carried out in practice, embodiments will be described, by way of non-limiting examples, with reference to the accompanying drawings, in which:

Fig. 1 illustrates an embodiment of a system operable to perform one or more methods as described hereinafter;

Fig. 2 depicts an embodiment of a method of automatic labelling of an image, for a given type of object;

Fig. 3 illustrates an image that can be processed according to the method of Fig. 2;

Fig. 3A illustrates a limited area extracted from the image of Fig. 3 and including the given type of object;

Fig. 3B illustrates an image that can be processed according to the method of Fig. 2, in which different pixel bands are associated with different time and/or position data;

Fig. 4 illustrates an embodiment of the method of Fig. 2;

Fig. 5 illustrates another embodiment of the method of Fig. 2;

Fig. 6 illustrates a method of automatic labelling of an image, for a given type of object which is static over time;

Fig. 7 illustrates an image that can be processed according to the method of Fig. 6;

Fig. 8 illustrates a variant of the method of Fig. 6;

Fig. 9 illustrates a possible embodiment of the method of Fig. 2 and of the method of Fig. 6;

Fig. 10 illustrates an embodiment of a method which combines the method of Fig. 2 and the method of Fig. 6; and

Fig. 11 illustrates an embodiment of a method of training a software module using a labelled image obtained according to the various embodiments previously illustrated.

DETAILED DESCRIPTION

In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the presently disclosed subject matter may be practiced without these specific details. In other instances, well-known methods have not been described in detail so as not to obscure the presently disclosed subject matter.

Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as "obtaining", "training", “providing”, “detecting”, “determining”, “selecting” or the like, refer to the action(s) and/or process(es) of a processor and memory circuitry (PMC) that manipulates and/or transforms data into other data, said data represented as physical, such as electronic, quantities and/or said data representing the physical objects.

The term “processor and memory circuitry” covers any computing unit or electronic unit with data processing circuitry that may perform tasks based on instructions stored in a memory, such as a computer, a server, a chip, a processor, a hardware processor, etc. It encompasses a single processor or multiple processors, which may be located in the same geographical zone or may, at least partially, be located in different zones and may be able to communicate together.

The term “memory” as used herein should be expansively construed to cover any volatile or non-volatile computer memory suitable to the presently disclosed subject matter.

Embodiments of the presently disclosed subject matter are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the presently disclosed subject matter as described herein.

The invention contemplates a computer program being readable by a computer for executing one or more methods of the invention. The invention further contemplates a machine-readable memory tangibly embodying a program of instructions executable by the machine for executing one or more methods of the invention.

Attention is drawn to Fig. 1.

A system 100 includes a processor and memory circuitry (PMC - see processor 130 and memory 140).

The system 100 can communicate e.g. with a first data source 110, which can store one or more images to be labelled, and/or metadata of one or more images to be labelled. The first data source 110 can include e.g. at least one memory (e.g. non-transitory memory) to store the relevant data. As schematically illustrated in Fig. 1, according to some embodiments, the first data source 110 stores satellite images and/or metadata of satellite images. This is however not limitative. The system 100 can communicate with a second data source 150, different from the first data source. In particular, the first data source generally stores data which is not present in the second data source (metadata of the first data source are not present in the second data source), and conversely (position over time and type of object for each object are not present in the first data source). As explained hereinafter, according to some embodiments, the second data source 150 stores data associated with objects, and in particular position over time (or in some embodiments only position) and type of the objects. According to some embodiments, the second data source 150 can include at least one memory (e.g. non-transitory memory) which stores the relevant data.

According to some embodiments, the second data source 150 can include (see the second data source 150 illustrated in the right side of Fig. 1) e.g. one or more libraries 150₁, 150₂, ..., 150ₙ. As explained hereinafter, according to some embodiments, each library 150₁, 150₂, ..., 150ₙ can store data (e.g. position over time, or only position in some embodiments) representative of a plurality of objects of a (single) given type of object specific to the library (the type of object is different between the libraries). This is not limitative, and in other embodiments, the second data source can include data (position or position over time and type of objects) for a plurality of different objects which is stored in a common database.

Data stored in the second data source 150 can be obtained from various sources, which can include e.g. network(s) 160 (e.g. Internet, Intranet, etc.), devices and/or sensors 170₁, 170₂, ..., 170ₘ (e.g. surveillance cameras, IoT devices, GPS sensors, smartphones, control towers, etc.), from which relevant data is extracted, as explained hereinafter, and can be stored in the second data source 150.

Communication between system 100 and the first and/or second data sources 110, 150 can rely e.g. on wire and/or wireless communication.

According to some embodiments, data stored in the second data source 150 can be obtained based on data provided by GPS navigation software (such as Waze), from which it is possible to obtain position and time data of various vehicles (cars, trucks, motorbikes, etc.). Concerning the position of planes, real-time or offline flight tracking services (see e.g. www.flightradar24.com) can be used. These examples are not limitative.

According to some embodiments, data stored in the second data source 150 can be obtained based on data provided by cameras (surveillance cameras, traffic cameras, etc.). For example, for each given type of object for which position and time data is to be obtained, a deep neural network can be used to detect whether the image provided by the cameras includes the given type of object (this can be applied both to physical objects and to humans).

According to some embodiments, data stored in the second data source 150 can be obtained based on usage of smartphones. Indeed, each smartphone can be located by a telecom operator and this data can be used to obtain position and time data of humans.

According to some embodiments, data stored in the second data source 150 can be obtained based on GPS watches worn by people.

According to some embodiments, data stored in the second data source 150 can be obtained based on data provided by IoT devices. Non-limitative examples include air-quality sensors, security cameras, solar panels, weather sensors, Bluetooth beacons that can track locations of people and/or objects, etc.

According to some embodiments, some data stored in the second data source 150 can be obtained based on data provided by a user.

Data of the second data source 150 can be stored using any suitable representation, such as an indexed database (e.g. per object, or per position, or per time data, or per type of object).
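For concreteness, one possible in-memory sketch of such a per-object indexed store is given below, in Python. The record fields and names (ObjectRecord, obj_types, track) are illustrative assumptions made for the sketch, not structures mandated by the present disclosure:

from dataclasses import dataclass, field

@dataclass
class ObjectRecord:
    obj_id: str
    obj_types: list[str]            # e.g. ["vehicle", "car", "Toyota"]
    track: list[tuple[float, float, float]] = field(default_factory=list)
    # each track entry is (unix_time_s, latitude_deg, longitude_deg)

# An indexed database can then be as simple as a dict keyed per object:
second_data_source: dict[str, ObjectRecord] = {
    "car-0001": ObjectRecord("car-0001", ["vehicle", "car"],
                             [(1585656000.0, 32.0853, 34.7818)]),  # illustrative sample
}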

According to some embodiments, the second data source 150 can store position data (without time data) and type of object for static objects (for which time data is not required), such as buildings, bridges, rivers, etc.

According to some embodiments, this data can be extracted from computerized maps, such as Google Maps or Bing Maps, which store positions and types of objects, or from other sources as explained above (e.g. GPS navigation software, which also provides data of static objects, data provided by users, etc.).

Computerized maps can include information (position and type) relative to different kinds of objects such as buildings, lakes, piers, volcanos, cemeteries, flea markets, parking lots, etc.

According to some embodiments, a library including position data for a given type of object can be built based on data extracted from data mining. For example, if the given type of object is “gas station”, then a request can be performed on an Internet search engine, and data mining can be applied to the results of the request in order to extract position data, which can be stored in the second data source 150.

According to some embodiments, system 100 is configured to perform various methods as described hereinafter, which can include in particular automatic labelling of images.

According to some embodiments, system 100 can provide labelled images to another system 180. For example, system 180 (which can include e.g. a processor and memory circuitry) is configured to train one or more models to detect objects in an image (in particular a given type of object), and can receive labelled images from system 100. This is however not limitative.

Attention is now drawn to Fig. 2.

Assume that an image 300 (see Fig. 3) has to be labelled. In particular, assume that the image 300 has to be labelled with a label indicative of whether the image 300 includes a given type of object. The given type of object can be defined e.g. by a user depending on the needs of the user (e.g. for training purposes, or other purposes). Non-limitative examples of types of object include e.g. a car, a plane, a boat, a car of a particular brand, a football field, etc.

According to some embodiments, the image can be a satellite image. This is however not limitative.

The method includes obtaining (operation 200) metadata of the image of the scene, from a first source of data (see reference 110). According to some embodiments, metadata can include time data 305 and position data 310.

Time data can be representative of a time at which the image has been acquired. According to some embodiments, time data is provided by the sensor which has acquired the image. For example, a camera provides time data associated with the image. This is not limitative and according to some embodiments, time data associated with the image can be provided by an operator, or by another device which is not necessarily the sensor which has acquired the image.

The metadata can further include position data. Position data is representative of a position of the scene present in the image. For example, position data can include the position of the scene on Earth, which can be characterized e.g. by latitude and longitude. Since the scene generally covers an area, the position of the scene can include e.g. the position of a central point of the scene and/or the position of one or more extremities of the area covered by the scene and/or positions of a plurality of points of the scene. Satellite images provided by satellite operators are generally tagged with a position of each extremity of the image. Satellite operators compute this data using position and orientation of the satellite, position and orientation of the sensor with respect to the satellite, distance from the sensor to the scene (which can be computed using e.g. a digital terrain model) and a model of the sensor (the model describes parameters of the sensor, such as focal length, aperture, etc.).
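By way of a hedged illustration, such image metadata could be held in a small structure like the following; the corner-based footprint and all field names are assumptions made for the sketch:

from dataclasses import dataclass

@dataclass
class ImageMetadata:
    acquisition_time: float                # unix time at which the image was acquired
    corners: list[tuple[float, float]]     # (lat, lon) of each extremity of the image

    def bounds(self) -> tuple[float, float, float, float]:
        """Bounding box (lat_min, lat_max, lon_min, lon_max) of the scene."""
        lats = [lat for lat, _ in self.corners]
        lons = [lon for _, lon in self.corners]
        return min(lats), max(lats), min(lons), max(lons)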

The method further includes obtaining (205), from a second data source (see e.g. 150 in Fig. 1) different from the first data source, data indicative of position over time and type of object of each object of a plurality of objects. The second data source does not include the metadata of the image. According to some embodiments, a given object can be associated with a plurality of different types. For example, an object is tagged with the following types: “vehicle”, “car” and “Toyota”.

The method further includes determining (210) whether one or more objects of the second data source have a position matching the position data of the metadata at a time matching the time data of the metadata.

This can include comparing position over time of the one or more objects of the second data source with the position and time data of the metadata associated with the image 300.

Assume that a subset S of objects has been identified for which a match has been determined (e.g. according to a matching criterion). In some embodiments, the matching criterion can define a threshold for the comparison for which it can be considered that there is a match in position and time. In some embodiments, time data of the metadata stored in the first data source and time data stored for the objects in the second data source are measured using synchronized clocks (for example, time data has been measured using GPS for both the first data source and the second data source - this is not limitative), to avoid clock drift between the measurements.

The label can be generated (operation 220) based on the subset S of objects.

In particular, if the subset S of objects includes at least one object which is of the given type, then the label can indicate that the image includes the given type of object.

If the subset S of objects does not include any object which is of the given type, then the label can include a prospect (e.g. a probability) that the presence of the given type of object in the image is below a threshold (low probability). In some embodiments, the label can indicate that the image does not include the given type of object.
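A minimal sketch of operations 210 and 220 follows, reusing the hypothetical ObjectRecord and ImageMetadata structures sketched earlier; the matching criterion is reduced here to a bounding-box test plus a fixed time tolerance, standing in for whatever threshold a concrete implementation would use:

TIME_TOLERANCE_S = 5.0   # assumed matching threshold; application-specific

def position_at(record: ObjectRecord, t: float) -> tuple[float, float] | None:
    """Position of the object at the track sample closest to time t, if close enough."""
    candidates = [p for p in record.track if abs(p[0] - t) <= TIME_TOLERANCE_S]
    if not candidates:
        return None
    _, lat, lon = min(candidates, key=lambda p: abs(p[0] - t))
    return lat, lon

def label_image(meta: ImageMetadata, source: dict[str, ObjectRecord],
                given_type: str) -> bool:
    """Operations 210 and 220: True if an object matching the scene in
    position and time is of the given type."""
    lat_min, lat_max, lon_min, lon_max = meta.bounds()
    subset_s = []                          # the subset S of matching objects
    for record in source.values():
        pos = position_at(record, meta.acquisition_time)
        if pos and lat_min <= pos[0] <= lat_max and lon_min <= pos[1] <= lon_max:
            subset_s.append(record)
    return any(given_type in r.obj_types for r in subset_s)

A False result would, per the text above, be reported as a prospect that presence of the given type of object is below a threshold, rather than as certainty of absence.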

According to some embodiments, the method can include, if the label indicates that the given type of object is present in the image, selecting (operation 230) a limited area of the image which includes the given type of object. This selection can include e.g. cutting the limited area from the image and outputting this limited area as a new image. Selection of the limited area can be performed e.g. based on the position of the given type of object obtained from the second data source at operation 205. As a consequence, a labelled image including the given type of object is obtained, for which there is a focus on the given type of object. This new image is therefore highly useful for training a deep learning network or other similar machine learning algorithms to detect the given type of object.
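Operation 230 can then be a simple window cut around the object's geolocated position. The sketch below assumes a linear mapping from latitude/longitude to pixel coordinates over a non-degenerate footprint, whereas a real implementation would use the full sensor model mentioned earlier:

import numpy as np

def crop_around(image: np.ndarray, meta: ImageMetadata,
                obj_lat: float, obj_lon: float, half_size_px: int = 64) -> np.ndarray:
    """Select a limited area of the image centred on the object's position."""
    lat_min, lat_max, lon_min, lon_max = meta.bounds()
    h, w = image.shape[:2]
    row = int((lat_max - obj_lat) / (lat_max - lat_min) * (h - 1))  # north at row 0
    col = int((obj_lon - lon_min) / (lon_max - lon_min) * (w - 1))
    r0, r1 = max(0, row - half_size_px), min(h, row + half_size_px)
    c0, c1 = max(0, col - half_size_px), min(w, col + half_size_px)
    return image[r0:r1, c0:c1]             # new, focused labelled image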

According to some embodiments, the method of Fig. 2 can be used for an image which is such that, even if the given type of object is present in the scene, the given type of object is invisible in the image of the scene for a human. This can be due to the fact that the image has a low resolution which prevents a human from being able to recognize the given type of object, or to the fact that the image is not a classical optical image but, for example, a radar image which cannot be interpreted by a human as such.

In a non-limitative example illustrated in Fig. 3, assume that it is desired to label an image 300 of a scene with a label indicating whether a truck is present in the scene.

Assume that the second data source includes position over time of various objects, and type of each object.

Assume that the scene is associated with metadata indicating a position “X” (which can include a range for the latitude and a range for the longitude, for defining the whole area covered by the scene) and an acquisition time “Y”. Comparison of this metadata with the content of the second data source indicates that a plurality of cars 350, a plurality of humans 360 and a truck 370 have position and time data matching the metadata (output of operation 210).

As a consequence, since a type of at least one object identified as being present in the scene includes a truck, a label is generated (operation 220) which indicates that the scene includes a truck.

At operation 230, based on position data of the truck, it is possible to select a limited area 380 of the image which includes the truck, in order to generate a new labelled image 385 (see Fig. 3A) including the truck.

Attention is drawn to Fig. 3B. According to some embodiments, the image can include a plurality of subsets including one or more pixels, wherein each subset is associated with different time data representative of a time at which the subset has been acquired. According to some embodiments, each subset of one or more pixels can be associated with different position data, representative of the position of the portion of the scene present in the subset of one or more pixels. According to some embodiments, time data and position data can be provided e.g. by the system which acquires the image (e.g. a camera of the satellite, based e.g. on GPS data of the satellite, and attitude of the camera over time).

For example, when a satellite acquires an image, in some cases, acquisition can rely on techniques such as push-broom acquisition. As a consequence, there can be a significant difference in the time data of different pixel bands of the image. This is shown in Fig. 3B, in which each pixel band (370₁ to 370₃) is associated with different time data (305₁ to 305₃) representative of a time at which the pixel band has been acquired. In addition, each pixel band (370₁ to 370₃) can be associated with different position data (310₁ to 310₃), representative of the position of the portion of the scene present in the pixel band.

According to some embodiments, position data of each pixel band of the image can be determined by using software such as (but not limited to) SOCET SET (developed by BAE Systems), which receives as input the image (which is, as mentioned above, tagged with a position of each extremity of the image), a model of the sensor which acquired the image (this model is generally provided by the manufacturer of the sensor) and a digital terrain model (DTM), and outputs position data of each pixel of the image.
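A full photogrammetric solution such as SOCET SET is out of scope here, but the idea of attaching a position to each pixel can be illustrated by the following crude, flat-terrain approximation, which merely interpolates the tagged corner coordinates of the image and ignores the sensor model and the DTM; it is an assumption-laden stand-in for illustration, not the actual tool.

```python
# Crude stand-in for per-pixel geolocation: bilinear interpolation of the
# (lat, lon) tags of the four image extremities. Ignores terrain relief
# and sensor geometry; for illustration only.
def pixel_latlon(corners, width, height, px, py):
    """corners: (lat, lon) of the top-left, top-right, bottom-left and
    bottom-right extremities of the image."""
    tl, tr, bl, br = corners
    u, v = px / (width - 1), py / (height - 1)
    top = (tl[0] + u * (tr[0] - tl[0]), tl[1] + u * (tr[1] - tl[1]))
    bot = (bl[0] + u * (br[0] - bl[0]), bl[1] + u * (br[1] - bl[1]))
    return (top[0] + v * (bot[0] - top[0]), top[1] + v * (bot[1] - top[1]))
```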

As a consequence, at operation 210, position and time data of each subset of pixels (e.g. each pixel band) are compared with the position over time of one or more objects, in order to identify whether the given type of object is present in one or more subsets of pixels.
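Continuing the sketch above, per-band matching can be illustrated as follows; the PixelBand fields are assumptions of this sketch, and objects_in_scene() is the helper sketched earlier.

```python
# Sketch: apply the matching of operation 210 per pixel band (Fig. 3B),
# each band carrying its own time stamp and footprint.
from dataclasses import dataclass

@dataclass
class PixelBand:
    lat_range: tuple   # footprint of this band (cf. position data 310₁..310₃)
    lon_range: tuple
    time: float        # acquisition time of this band (cf. time data 305₁..305₃)

def bands_containing(objects, bands, given_type):
    """Return the bands in which at least one matching object is of the given type."""
    hits = []
    for band in bands:
        found = objects_in_scene(objects, band.lat_range, band.lon_range, band.time)
        if any(o.obj_type == given_type for o in found):
            hits.append(band)
    return hits
```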

Attention is now drawn to Fig. 4, which depicts another particular embodiment of the method of Fig. 2.

Assume that it is intended to label an image of a scene with a label indicating whether the scene includes a given type of object.

The method includes obtaining (operation 400) metadata of the image. Operation 400 is similar to operation 200 and therefore is not described again.

Assume that the second data source includes a library storing position over time of a plurality of objects, each of the given type. In other words, the library stores position and time data only of objects which are of the given type (and for which image labelling is required). In this non-limitative embodiment, since the library stores data only for a single given type of object, it is not necessary to store the type of each object; it is sufficient that data associated with the library (e.g. a pointer, or a title) indicates that this library is devoted to this given type of object. Operation 405 illustrates that this library is obtained (this can include building the library and/or accessing a library which is already available).

In a non-limitative example, the given type of object is a car. Therefore, the library stores position over time only for cars. This kind of library can be built e.g. by accessing data provided by GPS navigation software such as Waze, the data then being stored in a database per object.
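By way of illustration only, such a single-type library could be assembled as sketched below; the CSV input format (object id, timestamp, latitude, longitude) is an assumption of this sketch, and real navigation feeds expose their data differently.

```python
# Hypothetical sketch: build the single-type (cars-only) library of Fig. 4
# from position traces. Input columns are an assumption for this example.
import csv
from collections import defaultdict

def build_car_library(trace_csv_path):
    """Returns {object_id: {timestamp: (lat, lon)}}, cars only."""
    library = defaultdict(dict)
    with open(trace_csv_path, newline="") as f:
        for row in csv.DictReader(f):
            library[row["object_id"]][float(row["timestamp"])] = (
                float(row["lat"]), float(row["lon"]))
    return dict(library)
```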

The method includes determining (operation 410) whether an object of the plurality of objects for which data is stored in the library has a position matching the position data of the metadata at a time matching the time data of the metadata.

Based on the output of operation 410, a label can be generated for the image of the scene.

If operation 410 indicates that at least one object of the plurality of objects for which data is stored in the library has a position matching the position data of the metadata at a time matching the time data of the metadata, then the label can indicate that the image does include the given type of object.

If operation 410 indicates that no object of the plurality of objects for which data is stored in the library has a position matching the position data of the metadata at a time matching the time data of the metadata, then the label can include a prospect (e.g. a probability) that the presence of the given type of object in the image is below a threshold (low probability).

In a non-limitative example, assume that it is intended to label image 300 (illustrated in Fig. 3) with a label indicating whether the scene includes a car. Assume that a library stores position and time data only for cars. The method can therefore include comparing (at operation 410) metadata 305, 310 of the image with position and time data stored in the library. If there is a match for at least one object, then the label of the image can indicate presence of a car in the scene. If there is no match, then the label can include a prospect (e.g. a probability) that the presence of a car in the image is below a threshold (low probability).
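Using the library layout sketched above, operations 410/420 for this example can be illustrated as follows; the 0.05 prospect is an arbitrary placeholder for "below a threshold", not a value from the disclosure.

```python
# Sketch of operations 410/420 for a single-type (cars-only) library.
def label_from_library(library, lat_range, lon_range, t, given_type, max_dt=5.0):
    for track in library.values():
        t_near = min(track, key=lambda ts: abs(ts - t))
        if abs(t_near - t) > max_dt:
            continue
        lat, lon = track[t_near]
        if lat_range[0] <= lat <= lat_range[1] and lon_range[0] <= lon <= lon_range[1]:
            return {"type": given_type, "present": True}
    # No match: presence is only asserted to be unlikely, not impossible.
    return {"type": given_type, "present": False, "prospect": 0.05}
```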

Attention is now drawn to Fig. 5, which describes another embodiment of the method of Fig. 2. According to some embodiments, it can be necessary to label a plurality of images, each with a label indicative of a different type of object. For example, a first subset of the images has to be labelled with a label indicative of the presence of a car, and a second subset of the images has to be labelled with a label indicative of the presence of a plane. According to some embodiments, it can be necessary to label an image with a plurality of labels (e.g. the label indicates whether the scene includes both a car and a truck).

The method can include obtaining (operation 500) metadata of an image of a scene. Operation 500 is similar to operation 200 and therefore is not described again.

Assume that this image has to be labelled with a label indicative of a presence of a first given type of object.

According to some embodiments, the second data source can include a plurality of libraries, wherein each library is devoted to a specific type of object (different from the types of object of the other libraries - see references 150₁ to 150N). For example, a first library stores position over time of objects of a first given type, and a second library stores position over time of objects of a second given type (different from the first given type), etc.

The method can include selecting (operation 505) a library (out of the plurality of libraries) which is associated with the relevant type of object for which a label has to be provided.

The method can further include determining (operation 510) whether an object of the plurality of objects for which data is stored in the selected library has a position matching the position data of the metadata at a time matching the time data of the metadata.

Based on the output of operation 510, a label can be generated (operation 520) for the image of the scene.

If operation 510 indicates that at least one object of the plurality of objects for which data is stored in the selected library has a position matching the position data of the metadata at a time matching the time data of the metadata, then the label can indicate that the image does include the given type of object.

If operation 510 indicates that no object of the plurality of objects for which data is stored in the selected library has a position matching the position data of the metadata at a time matching the time data of the metadata, then the label can indicate that the image does not include the given type of object. As shown in Fig. 5, the method can be repeated for another image, which can be labelled with a label indicative of a different type of object (different from the previous iteration).
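A minimal sketch of operations 505 to 520 follows, assuming that the libraries are held in a dictionary keyed by object type and use the {object_id: {timestamp: (lat, lon)}} layout assumed earlier; note that, per Fig. 5, a non-match here yields a definite negative label.

```python
# Sketch of Fig. 5: select the library devoted to the required type,
# then run the same position/time match.
def label_image(libraries, required_type, lat_range, lon_range, t, max_dt=5.0):
    library = libraries[required_type]                        # operation 505
    for track in library.values():                            # operation 510
        t_near = min(track, key=lambda ts: abs(ts - t))
        if abs(t_near - t) > max_dt:
            continue
        lat, lon = track[t_near]
        if lat_range[0] <= lat <= lat_range[1] and lon_range[0] <= lon <= lon_range[1]:
            return {"type": required_type, "present": True}   # operation 520
    return {"type": required_type, "present": False}          # no match: negative label
```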

Attention is now drawn to Fig. 6, which describes another method of labelling images.

According to some embodiments, it can be necessary to label an image of a scene (see e.g. image 700 in Fig. 7), wherein the label is representative of whether the scene includes a given type of object which is a static object. Non-limitative examples of static objects include roads 711, buildings (e.g. apartment buildings 712, towers 713, houses 714), gas stations, train stations, airports, bridges 715, seas, rivers, etc.

The method can include obtaining (operation 600), from a first data source, metadata of an image (see e.g. image 700) of a scene to be labelled with respect to a given type of object. In this embodiment, the metadata includes position data 705. Position data 705 is similar to position data 310 described above. In this embodiment, it is not required to obtain the acquisition time of the image, since it is intended to label static objects. This metadata can be obtained from a first data source (e.g. 110).

The method further includes obtaining (operation 605) position and type of a plurality of objects. This data can be obtained e.g. from a second data source (see 150 in Fig. 1), which is different from the first data source. According to some embodiments, a given object can be associated with a plurality of different types. For example, a static object is tagged with the following types: “building”, “private house” and “two-floor building”.

The method can include determining (operation 610), based on the second data source, whether a given object of the plurality of objects has a position matching the position data of the metadata and is of the given type.

The method can include providing (operation 620) the label for the image, wherein the label depends on whether such an object has been found.

According to some embodiments, the method of Fig. 6 can be used for an image which is such that, even if the given type of object is present in the scene, the given type of object is invisible to a human in the image of the scene. This can be because the image has a resolution too low for a human to recognize the given type of object, or because the image is not a classical optical image but, for example, a radar image which cannot be interpreted by a human as such. According to some embodiments, the second data source can include, for each given type of object, a library storing position data only of objects of the given type (e.g. a first library for buildings, a second library for bridges, a third library for rivers).

Attention is drawn to Fig. 8, which describes a variant of the method of Fig. 6.

The method can include obtaining (operation 800) metadata of an image (see e.g. image 700) of a scene to be labelled with respect to a given type of object. In this embodiment, the metadata includes position data of the scene.

According to some embodiments, a plurality of libraries is available, each library being devoted to a different type of object and storing position data for objects of this type. For example, a first library stores data for small buildings (height below a threshold), a second library stores data for high buildings (height above a threshold), and a third library stores data for gas stations.

The method further includes selecting (operation 805) a library storing position data for objects which are of the given type.

The method includes determining (based on the library - operation 810) whether one or more objects of the library (which are all of the given type) have position data matching the position data of the metadata.

The method further includes providing (operation 820) a label based on the determination of operation 810.

If operation 810 indicates that at least one object of the plurality of objects for which data is stored in the library has a position matching the position data of the metadata, then the label can indicate that the image does include the given type of object.

If operation 810 indicates that no object of the plurality of objects for which data is stored in the library has a position matching the position data of the metadata, then the label can indicate that the image does not include the given type of object.
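For static objects the match reduces to a position test, as sketched below; the {object_id: (lat, lon)} library layout is an assumption of this sketch.

```python
# Sketch of operations 805-820: position-only matching for static objects.
def label_static(libraries, given_type, lat_range, lon_range):
    library = libraries[given_type]                    # operation 805
    for lat, lon in library.values():                  # operation 810
        if lat_range[0] <= lat <= lat_range[1] and lon_range[0] <= lon <= lon_range[1]:
            return {"type": given_type, "present": True}
    return {"type": given_type, "present": False}      # operation 820
```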

Attention is now drawn to Fig. 9, which depicts a particular embodiment of the method of Fig. 2.

Assume that an image, which has been acquired by a sensor of a first type, has to be labelled with a label indicative of whether the image includes a given type of object.

The method includes obtaining (operation 900) metadata of the image of the scene acquired by the first type of sensor. According to some embodiments, the metadata can include position data, or position and time data. The method further includes obtaining (operation 905) the type and position data (or position and time data) of objects present in images acquired by one or more other sensors of a second type, which is different from the first type.

According to some embodiments, the first type and the second type are different types of acquisition by the sensor.

In a non-limitative example, the first type of sensor is an electro-optic sensor, and the second type of sensor includes e.g. at least one of a SAR sensor and a multi-spectral sensor.

According to some embodiments, the first type and the second type are different resolutions of the sensors (for example a sensor of the second type has a higher resolution than a sensor of the first type).

Identification of the objects in the images acquired by the sensors of the second type can rely, in some embodiments, on the use of a machine learning algorithm (e.g. DNN) trained to detect specific types of objects in images acquired by a sensor of the second type.

In other embodiments, identification of the objects in the images acquired by the sensors of the second type has been performed by an operator.

The method further includes determining (910) whether one or more objects (present in images acquired by a sensor of the second type) have position data (for static objects) or position and time data (for mobile objects) matching the metadata of the image of the scene acquired by a sensor of the first type.

This can include:

- comparing position of one or more objects present in images acquired by sensor(s) of the second type with position data associated with the image of the scene acquired by a sensor of the first type; and/or

- comparing position and time data of one or more objects present in images acquired by sensor(s) of the second type with position and time data associated with the image of the scene acquired by a sensor of the first type.

Assume that a subset S of objects has been identified for which a match has been determined (e.g. according to a matching criterion). In some embodiments, the matching criterion can define thresholds on the comparison below which a match in position and time is considered to exist. The label can be generated (operation 920) based on the subset S of objects.
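One possible matching criterion, given by way of illustration only, combines a distance threshold with a time threshold; the threshold values and the haversine helper are choices of this sketch, not values from the disclosure.

```python
# Sketch of a matching criterion for operation 910: a detection matches if
# it is within max_dist_m of the scene position and (for mobile objects)
# within max_dt_s of the acquisition time.
import math

def haversine_m(lat1, lon1, lat2, lon2):
    r = 6_371_000.0  # mean Earth radius, metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_match(det, scene, max_dist_m=30.0, max_dt_s=2.0):
    close = haversine_m(det["lat"], det["lon"], scene["lat"], scene["lon"]) <= max_dist_m
    if det.get("time") is None:          # static object: position only
        return close
    return close and abs(det["time"] - scene["time"]) <= max_dt_s
```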

In particular, if the subset S of objects includes at least one object which is of the given type, then the label can indicate that the image includes the given type of object.

If the subset S of objects does not include any object which is of the given type, then the label can include a prospect (e.g. a probability) that the presence of the given type of object in the image is below a threshold (low probability). In some embodiments, the label can indicate that the image does not include the given type of object.

According to some embodiments, the method can include, if the label indicates that the given type of object is present in the image, selecting (operation 930) a limited area of the image which includes the given type of object, as explained with reference to operation 230 above.

According to some embodiments, for a scene including an object of the given type, the object of the given type is better identifiable in an image of the scene acquired by a sensor of the second type than in an image of the scene acquired by a sensor of the first type. In other words, a sensor of the second type is better suited to the detection of an object of the given type than a sensor of the first type.

For example, if a first machine learning algorithm is trained to detect the given type of object in an image of the scene acquired by a sensor of the first type, and a second machine learning algorithm is trained to detect the given type of object in an image of the scene acquired by a sensor of the second type, the second machine learning algorithm will provide better results.

The method of Fig. 9 takes advantage of the fact that, depending on the type of object for which a label is required, the performance of sensors of different types can vary. For example, it is easier to detect a plane using a SAR sensor than using a multi-spectral sensor. Therefore, if a plane is to be labelled in an image of a scene acquired by a multi-spectral sensor, and a similar image of the scene (e.g. with the same acquisition time, for mobile objects) is available from a SAR sensor, the method of Fig. 9 can take advantage of this situation and label the plane based on the fact that it was detected in the image of the SAR sensor. This example is not limitative.

According to some embodiments, the various embodiments described above can be combined. An example is provided with reference to Fig. 10. Assume that an image of a scene has to be labelled with respect to a given type of object.

If the given type of object is not a static object, the method can include obtaining (operation 1000, similar to operation 200) metadata (position and time data) of the image from a first source of data, obtaining position over time and type of a plurality of objects (operation 1005, similar to operation 205), determining whether one or more objects have position and time data matching the metadata (operation 1010, similar to operation 210), and providing (operation 1020, similar to operation 220) the label based on the determination.

If the given type of object is a static object, the method can include obtaining (operation 1000₁, similar to operation 600) metadata (position data) of the image, obtaining position and type of a plurality of (static) objects from a second source of data (operation 1005₁, similar to operation 605), determining whether one or more objects have position data matching the metadata (operation 1010₁, similar to operation 610), and providing (operation 1020₁, similar to operation 620) the label based on the determination.
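The combination of Fig. 10 can be sketched as a simple dispatch on the object type; the STATIC_TYPES set and the two branch helpers (reused from the earlier sketches) are assumptions of this illustration.

```python
# Sketch of Fig. 10: route static types through the position-only branch
# and mobile types through the position-and-time branch.
STATIC_TYPES = {"road", "building", "bridge", "gas station", "river"}

def label_any(libraries, given_type, lat_range, lon_range, t=None):
    if given_type in STATIC_TYPES:
        return label_static(libraries, given_type, lat_range, lon_range)  # static branch
    return label_image(libraries, given_type, lat_range, lon_range, t)    # mobile branch
```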

The method can be repeated for another image of a scene, until a complete training set is obtained.

Attention is now drawn to Fig. 11.

As explained in the various embodiments above, a label is assigned to an image of a scene, which indicates whether a given type of object is present in the scene. According to some embodiments, the image, together with the label, can be used for training a software module to detect presence of the given type of object in images (see operations 1100 and 1110 in Fig. 11).

A software module generally includes a list of instructions stored in a non-transitory memory, the instructions being such that, when executed by a processor and memory circuitry (such as processor 130 and memory 140), they cause the processor and memory circuitry to provide prospect(s) that a given type of object is present in a scene. As explained hereinafter, according to some embodiments, each software module is associated with a given type of object and is specifically trained to detect this given type of object.

The instructions encode the operation of a model, such as a machine learning algorithm, a decision tree algorithm, a deep neural network (DNN), or other adapted models. By way of non-limiting example, the layers of a DNN can be organized in accordance with a Convolutional Neural Network (CNN) architecture, Recurrent Neural Network architecture, Recursive Neural Network architecture, Generative Adversarial Network (GAN) architecture, or otherwise. Optionally, at least some of the layers can be organized in a plurality of DNN sub-networks. Each layer of the ML network can include multiple basic computational elements (CE), typically referred to in the art as dimensions, neurons, or nodes.

This is not limitative, and in some embodiments the software module is implemented (after training of the software module has been performed, as explained hereinafter) using hardware components, e.g. an FPGA, which are configured to execute the operation of the model (without requiring storage of the instructions).

The method includes obtaining (operation 1100) an image of the scene and a label (computed using the various embodiments above). The label indicates whether a given type of object is present in the scene.

The method includes training (operation 1110) a software module to detect the given type of object in an image of a scene. Training can include e.g. feeding the image and the label to the software module, computing a loss function (based on the difference between the output of the software module and the label), and updating weights of the model used by the software module (e.g. using backpropagation techniques if the model is a deep neural network).
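A minimal PyTorch-style sketch of such a training step is given below; the toy architecture, the binary cross-entropy loss and the hyper-parameters are placeholders chosen for this illustration and are not part of the disclosed method.

```python
# Sketch of operation 1110: one training step of a binary
# "given type present?" classifier on a labelled image.
import torch
import torch.nn as nn

model = nn.Sequential(                 # toy CNN, illustration only
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 1),
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(image: torch.Tensor, label: float) -> float:
    """image: (3, H, W) tensor; label: 1.0 if the given type is present."""
    optimizer.zero_grad()
    logit = model(image.unsqueeze(0)).squeeze()
    loss = loss_fn(logit, torch.tensor(label))   # difference between output and label
    loss.backward()                              # backpropagation
    optimizer.step()                             # weight update
    return loss.item()
```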

It is to be noted that the various features described in the various embodiments may be combined according to all possible technical combinations.

It is to be understood that the invention is not limited in its application to the details set forth in the description contained herein or illustrated in the drawings. The invention is capable of other embodiments and of being practiced and carried out in various ways. Hence, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting. As such, those skilled in the art will appreciate that the conception upon which this disclosure is based may readily be utilized as a basis for designing other structures, methods, and systems for carrying out the several purposes of the presently disclosed subject matter.

Those skilled in the art will readily appreciate that various modifications and changes can be applied to the embodiments of the invention as hereinbefore described without departing from its scope, defined in and by the appended claims.