Title:
IMAGE PROCESSING SYSTEM, METHOD AND DEVICE, AND COMPUTER-READABLE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2022/159371
Kind Code:
A1
Abstract:
An image processing system (100) comprises: a video stream processing device (10), configured to receive a video stream, segment the video stream into multiple frames of pictures arranged in chronological order, and distribute the multiple frames of pictures to edge computing devices (201) in a connected edge computing device group (20); the edge computing devices (201) in the edge computing device group (20), configured to subject the received pictures to target identification, and send, to a connected picture collecting device (30), the pictures marked with a region in which an identified target is located; and the picture collecting device (30), configured to restore, in chronological order as a video stream, the received pictures marked with target identification results.

Inventors:
YU YUE (CN)
LOH CHANG WEI (CN)
CHEN WEI YU (CN)
PAN TIAN HUA (CN)
HU SHENG BO (CN)
Application Number:
PCT/US2022/012734
Publication Date:
July 28, 2022
Filing Date:
January 18, 2022
Assignee:
SIEMENS AG (DE)
SIEMENS CORP (US)
International Classes:
G06V20/52
Domestic Patent References:
WO 2012/141574 A1, 18 October 2012
Foreign References:
CN 110175549 A, 27 August 2019
Other References:
ZHANG, Tan et al.: "The Design and Implementation of a Wireless Video Surveillance System", User Interface Software and Technology, ACM, New York, NY, USA, 7 September 2015, pages 426-438, XP058522848, ISBN: 978-1-4503-4531-6, DOI: 10.1145/2789168.2790123
TERANISHI, Yuuichi et al.: "Dynamic Data Flow Processing in Edge Computing Environments", 2017 IEEE 41st Annual Computer Software and Applications Conference (COMPSAC), IEEE, vol. 1, 4 July 2017, pages 935-944, XP033151148, ISSN: 0730-3157, ISBN: 978-1-5386-2667-2, DOI: 10.1109/COMPSAC.2017.113
Attorney, Agent or Firm:
WAXMAN, Andrew M. (US)
Claims:
What is claimed is:

1. An image processing system (100), comprising: a video stream processing device (10) configured to receive a video stream, segment the video stream into multiple frames of pictures arranged in chronological order, and distribute the multiple frames of pictures to edge computing devices (201) in a connected edge computing device group (20); wherein the edge computing devices (201) in the connected edge computing device group (20) are configured to subject received pictures to target identification, and send, to a connected picture collecting device (30), pictures marked with a region in which an identified target is located; and wherein the connected picture collecting device (30) is configured to restore, in chronological order as a video stream, pictures marked with target identification results.

2. The system (100) as claimed in claim 1, wherein the video stream processing device (10) is configured to: monitor a state of each edge computing device (201) in the connected edge computing device group (20) and a state of each connected edge computing device (202) outside the connected edge computing device group (20); remove a first edge computing device (201) from the connected edge computing device group (20) when the state of the first edge computing device (201) in the connected edge computing device group (20) changes to unavailable; and add a second edge computing device (202) to the connected edge computing device group (20) when the state of the second edge computing device (202) outside the connected edge computing device group (20) changes to a state in which the second edge computing device (202) can be added to the connected edge computing device group (20).

3. The system (100) as claimed in claim 1, wherein the system (100) is used for monitoring a region of interest, and the edge computing devices (201) in the connected edge computing device group (20) are configured to mark the region of interest in the received pictures, and the connected picture collecting device (30) is configured to judge whether there is an overlap between the region in which the identified target is located and the region of interest marked in each frame of picture received, in response to there being an overlap, judge the degree to which the proportion of the area of the region of interest taken up by a region of overlap changes within a time period, and output an alert in response to the degree exceeding a change degree threshold.

4. The system (100) as claimed in claim 3, wherein the connected picture collecting device (30) is configured to compare information of pre-stored targets and the target identified in the received pictures, judge whether the target identified conforms to information of a pre-stored target, and if the target identified conforms to information of the pre-stored target, then mark, in the alert, target information of the pre-stored target.

5. The system (100) as claimed in claim 4, wherein the connected picture collecting device (30) is configured to, if the target identified does not conform to information of any pre-stored target: extract a feature of a key part of the target identified in each frame of picture in the video stream in the time period, the key part being used to distinguish between different targets; cluster the frames of pictures according to the extracted feature; acquire a feature located at the cluster center point; obtain a key part of a target in the picture that is most similar to the feature located at the cluster center point; and mark the key part in the alert.

6. An image processing method (400), comprising: receiving (S401) a video stream; segmenting (S402) the video stream into multiple frames of pictures arranged in chronological order; distributing (S403) the multiple frames of pictures to edge computing devices in a connected edge computing device group, to enable the edge computing devices in the connected edge computing device group to subject the pictures to target identification, wherein the pictures marked with a region in which an identified target is located are restored in chronological order as a video stream; monitoring (S404) a state of each edge computing device in the connected edge computing device group and a state of each connected edge computing device outside the connected edge computing device group; removing (S405) a first edge computing device from the connected edge computing device group when the state of the first edge computing device in the connected edge computing device group changes to unavailable; and adding (S406) a second edge computing device to the edge computing device group when the state of the second edge computing device outside the connected edge computing device group changes to a state in which the second edge computing device can be added to the connected edge computing device group.

7. An image processing method (500), for monitoring a region of interest, the image processing method comprising: receiving (S501) multiple frames of pictures arranged in chronological order, wherein each frame of picture is marked with a region in which a target identified by target identification is located and the region of interest; judging (S502) whether there is an overlap between the region in which the target identified is located and the region of interest marked in each frame of picture received; in response to there being an overlap, judging (S503) the degree to which the proportion of the area of the region of interest taken up by a region of overlap changes within a time period; and outputting (S504) an alert in response to the degree exceeding a change degree threshold.

8. The method (500) as claimed in claim 7, further comprising: comparing (S505) information of pre-stored targets and the target identified in the received pictures, judging whether the target identified conforms to information of a pre-stored target; and marking (S506), in the alert, target information of the pre-stored target if the target identified conforms to information of the pre-stored target.

9. The method (500) as claimed in claim 8, wherein if the target identified does not conform to information of any pre-stored target, then the method further comprises: extracting (S507) a feature of a key part of the target identified in each frame of picture in a video stream in the time period, the key part being used to distinguish between different targets; clustering (S508) the frames of pictures according to the extracted feature; acquiring (S509) a feature located at the cluster center point; obtaining (S510) a key part of a target in the picture that is most similar to the feature located at the cluster center point; and marking (S511) the key part in the alert.

10. A video stream processing device (10), comprising: a receiving module (111), configured to receive a video stream; a segmenting module (112), configured to segment the video stream into multiple frames of pictures arranged in chronological order; a distributing module (113), configured to distribute the multiple frames of pictures to edge computing devices in a connected edge computing device group, to enable the edge computing devices in the connected edge computing device group to subject the received pictures to target identification, wherein the pictures marked with a region in which an identified target is located are restored in chronological order as a video stream; and a monitoring module (114), configured to monitor a state of each edge computing device in the connected edge computing device group and a state of each connected edge computing device outside the connected edge computing device group, remove a first edge computing device from the connected edge computing device group when the state of the first edge computing device in the connected edge computing device group changes to unavailable, and add a second edge computing device to the connected edge computing device group when the state of the second edge computing device outside the connected edge computing device group changes to a state in which the second edge computing device can be added to the connected edge computing device group.

11. A picture collecting device (30), for monitoring a region of interest, the picture collecting device (30) comprising: a receiving module (211), configured to receive multiple frames of pictures arranged in chronological order, wherein each frame of picture is marked with a region in which a target identified by target identification is located and the region of interest; and an alerting module (212), configured to judge whether there is an overlap between the region in which the target identified is located and the region of interest marked in each frame of picture received, in response to there being an overlap, judge the degree to which the proportion of the area of the region of interest taken up by a region of overlap changes within a time period, and output an alert in response to the degree exceeding a change degree threshold.

12. The device (30) as claimed in claim 11, wherein the alerting module (212) is configured to: compare information of pre-stored targets and the target identified in the pictures received by the receiving module (211), judge whether the target identified conforms to information of a pre-stored target; and mark, in the alert, target information of the pre-stored target in response to the target identified conforming to information of the pre-stored target.

13. The device (30) as claimed in claim 12, wherein, if the target identified does not conform to information of any pre-stored target, the alerting module (212) is configured to: extract a feature of a key part of the target identified in each frame of picture in a video stream in the time period, the key part being used to distinguish between different targets; cluster the frames of pictures according to the extracted feature; acquire a feature located at the cluster center point; obtain a key part of a target in the picture that is most similar to the feature located at the cluster center point; and mark the key part in the alert.

14. An image processing apparatus (10, 30), comprising: at least one memory, configured to store computer-readable code; and at least one processor, configured to call the computer-readable code, and perform the method as claimed in any one of claims 6 - 9.

15. A computer-readable medium, wherein computer-readable instructions are stored on the computer-readable medium, and the computer-readable instructions, when executed by a processor, cause the processor to perform the method as claimed in any one of claims 6 - 9.

Description:
IMAGE PROCESSING SYSTEM, METHOD AND DEVICE, AND COMPUTER-READABLE MEDIUM

Cross-reference to Related Application(s)

[0001] This application claims priority to Chinese Patent Application No. 202110076088.5, filed January 20, 2021, the entire contents of which are incorporated herein by reference.

Field

[0002] At least some example embodiments relate to the field of computer vision, for example to image processing systems, methods, devices, and/or non-transitory computer-readable mediums.

Background

[0003] In the field of computer vision, target identification processing speed and accuracy are two important indices. In some application scenarios, the processing speed of target identification is particularly important. However, if a target identification task is completed at an edge side, the processing speed requirements are often unattainable due to the low computing power of most edge computing devices at present. For example, many edge devices are only able to achieve a processing speed of 3-4 frames per second when performing video monitoring, but application scenarios such as video monitoring generally require a processing speed of at least 24 frames per second. In such application scenarios, it may become relatively important to increase the processing speed of edge computing devices.

[0004] At present, a common method is to use a server or workstation with a high-performance graphics processing unit (GPU), or to use a specially customized edge computing device, but both types of device have relatively high costs. Another method is to transmit the video stream to the cloud for further analysis, but this may result in a long network delay as well as a security risk.

Summary

[0005] At least some example embodiments provide image processing methods, apparatuses and non-transitory computer-readable mediums, wherein a video stream to be processed is segmented into multiple frames of pictures arranged in chronological order, which are separately distributed to multiple edge computing devices of modest computing power for image processing such as target identification, thereby increasing the overall computing power and the processing speed of target identification. The states of the multiple edge computing devices are identifiable, and it is possible to flexibly add new edge computing devices to, and remove unavailable edge computing devices from, a group formed by the edge computing devices (referred to as an "edge computing device group" hereinbelow), so that a fault in an individual edge computing device does not prevent a video stream from being processed normally.

[0006] For a video monitoring application scenario, an optional example embodiment can monitor whether a person enters or abnormally leaves a region of interest, and make a judgment by comparing trends of change in the relationship between the region in which a target is located and the region of interest, which effectively avoids misjudgments and increases the accuracy of judgment. In the process of presenting a monitoring result, an identifier or head portrait of an identified person can be presented in alert information, and a clustering method can be used to find the part best able to display a target feature for presentation, thus increasing the degree of recognition.

[0007] According to at least one example embodiment, an image processing system is provided, comprising:

- a video stream processing device, configured to receive a video stream, segment the video stream into multiple frames of pictures arranged in chronological order, and distribute the multiple frames of pictures to edge computing devices in a connected edge computing device group;

- the edge computing devices in the edge computing device group, configured to subject the received pictures to target identification, and send, to a connected picture collecting device, the pictures marked with a region in which an identified target is located;

- the picture collecting device, configured to restore in chronological order as a video stream the received pictures marked with target identification results.

[0008] According to at least one example embodiment, an image processing method is provided that can be performed by the video stream processing device in an image processing system, such as that discussed herein, the method comprising: receiving a video stream; segmenting the video stream into multiple frames of pictures arranged in chronological order; distributing the multiple frames of pictures to edge computing devices in a connected edge computing device group, to enable the edge computing devices in the edge computing device group to subject the received pictures to target identification, wherein the pictures marked with a region in which an identified target is located are restored in chronological order as a video stream; monitoring a state of each edge computing device in the edge computing device group and a state of each connected edge computing device outside the edge computing device group; when the state of a first edge computing device in the edge computing device group changes to "unavailable", removing the first edge computing device from the edge computing device group; when the state of a connected second edge computing device outside the edge computing device group changes to "can be added to the edge computing device group", adding the second edge computing device to the edge computing device group.

[0009] In at least one other example embodiment, an image processing method for monitoring a region of interest is provided, wherein the method can be performed by the picture collecting device in an image processing system, such as that discussed herein, the method comprising: receiving multiple frames of pictures arranged in chronological order, wherein each frame of picture is marked with a region in which a target identified by target identification is located and the region of interest; judging whether there is an overlap between the region in which the identified target is located and the region of interest marked in each frame of picture received; if there is an overlap, then judging the degree to which the proportion of the area of the region of interest taken up by a region of overlap changes within a preset time period; if the degree of change exceeds a preset change degree threshold, then outputting an alert.

[0010] In at least one other example embodiment, a video stream processing device is provided, comprising:

- a receiving module, configured to receive a video stream;

- a segmenting module, configured to segment the video stream into multiple frames of pictures arranged in chronological order;

- a distributing module, configured to distribute the multiple frames of pictures to edge computing devices in a connected edge computing device group, to enable the edge computing devices in the edge computing device group to subject the received pictures to target identification, wherein the pictures marked with a region in which an identified target is located are restored in chronological order as a video stream;

- a monitoring module, configured to: monitor a state of each edge computing device in the edge computing device group and a state of each connected edge computing device outside the edge computing device group; when the state of a first edge computing device in the edge computing device group changes to "unavailable", remove the first edge computing device from the edge computing device group; when the state of a connected second edge computing device outside the edge computing device group changes to "can be added to the edge computing device group", add the second edge computing device to the edge computing device group.

[0011] In at least one other example embodiment, a picture collecting device for monitoring a region of interest is provided, comprising:

- a receiving module, configured to receive multiple frames of pictures arranged in chronological order, wherein each frame of picture is marked with a region in which a target identified by target identification is located and the region of interest;

- an alerting module, configured to: judge whether there is an overlap between the region in which the identified target is located and the region of interest marked in each frame of picture received; if there is an overlap, then judge the degree to which the proportion of the area of the region of interest taken up by a region of overlap changes within a preset time period; if the degree of change exceeds a preset change degree threshold, then output an alert.

[0012] In at least one other example embodiment, an image processing apparatus is provided, the image processing apparatus comprising: at least one memory, configured to store computer-readable code; at least one processor, configured to call the computer-readable code, and perform the steps provided in one or more of the methods discussed herein.

[0013] In at least one other example embodiment, a non-transitory computer-readable medium is provided, wherein computer-readable instructions are stored on the computer-readable medium, and the computer-readable instructions, when executed by a processor, cause the processor to perform the steps provided in one or more of the methods discussed herein.

[0014] In one or more example embodiments, a video stream to be processed is segmented into multiple frames of pictures arranged in chronological order, which are separately distributed to multiple edge computing devices of modest computing power for image processing such as target identification, thereby increasing the overall computing power and the processing speed of target identification.

[0015] In any one or more of the example embodiments, optionally, the video stream processing device may further monitor a state of each edge computing device in the edge computing device group and a state of each connected edge computing device outside the edge computing device group; when the state of a first edge computing device in the edge computing device group changes to "unavailable", remove the first edge computing device from the edge computing device group; and when the state of a connected second edge computing device outside the edge computing device group changes to "can be added to the edge computing device group", add the second edge computing device to the edge computing device group. The states of the multiple edge computing devices are identifiable, and it is possible to flexibly add new edge computing devices to, and remove unavailable edge computing devices from, the group formed by the edge computing devices, so that a fault in an individual edge computing device does not prevent a video stream from being processed normally.

[0016] In any one or more of the example embodiments, optionally, the example embodiment(s) may be used for monitoring a region of interest, wherein the edge computing devices in the edge computing device group mark the region of interest in the received pictures; the picture collecting device judges whether there is an overlap between the region in which the identified target is located and the region of interest marked in each frame of picture received; if there is an overlap, it then judges the degree to which the proportion of the area of the region of interest taken up by a region of overlap changes within a preset (or defined) time period; and if the degree of change exceeds a preset change degree threshold, it then outputs an alert. For a video monitoring application scenario, it is possible to monitor whether a person enters or abnormally leaves a region of interest, and make a judgment by comparing trends of change in the relationship between a region in which a target is located and the region of interest. It is thus possible to more effectively avoid misjudgments, increasing the accuracy of judgment.

[0017] In any one or more of the example embodiments, optionally, the picture collecting device compares information of pre-stored targets and the target identified in the received pictures, and judges whether the identified target conforms to information of one pre-stored target; if the identified target conforms to information of one pre-stored target, then marks, in the outputted alert, target information of the pre-stored identified target. If the identified target does not conform to information of any pre-stored target, then it is possible to extract a feature of a key part of the target identified in each frame of picture in the video stream in the preset time period, the key part being used to distinguish between different targets; cluster the frames of pictures according to the extracted feature; acquire a feature located at the cluster center point; obtain the key part of the target in the picture that is most similar to the feature located at the cluster center point; mark the key part so obtained in the outputted alert. For example: in the process of presenting a monitoring result, an identifier or head portrait of an identified person can be presented in alert information, and a clustering method can be used to find the part best able to display a target feature for presentation, thus increasing the degree of recognition.

Brief Description of the Drawings

[0018] Fig. 1 is a structural schematic diagram of an image processing system provided in an example embodiment.

[0019] Fig. 2 is a structural schematic diagram of a video stream processing device provided in an example embodiment.

[0020] Fig. 3 is a structural schematic diagram of a picture collecting device provided in an example embodiment.

[0021] Fig. 4 is a flow chart of an image processing method at the video stream processing device side as provided in an example embodiment.

[0022] Fig. 5 is a flow chart of an image processing method at the picture collecting device side as provided in an example embodiment.

[0023] Fig. 6 is a schematic diagram of judging whether a target has entered or abnormally left a region of interest in an example embodiment.

[0024] Fig. 7 is a schematic diagram of a possible application scenario in an example embodiment.

Detailed Description

[0025] The subject matter described herein is now discussed with reference to example embodiments. It should be understood that these embodiments are discussed purely in order to enable those skilled in the art to better understand and thus implement the subject matter described herein, without limiting the protection scope, applicability or examples expounded in the claims. The functions and arrangement of the elements discussed can be changed without departing from the protection scope of the content of the example embodiments. Various processes or components can be omitted from, replaced in or added to each example as required. For example, the method described may be performed in a different order from that described, and steps may be added, omitted or combined. In addition, features described in relation to some examples may also be combined in other examples.

[0026] The drawings are to be regarded as being schematic representations and elements illustrated in the drawings are not necessarily shown to scale. Rather, the various elements are represented such that their function and general purpose become apparent to a person skilled in the art. Any connection or coupling between functional blocks, devices, components, or other physical or functional units shown in the drawings or described herein may also be implemented by an indirect connection or coupling. A coupling between components may also be established over a wireless connection. Functional blocks may be implemented in hardware, firmware, software, or a combination thereof.

[0027] Specific structural and functional details disclosed herein are merely representative for purposes of describing example embodiments. Example embodiments, however, may be embodied in various different forms, and should not be construed as being limited to only the illustrated embodiments. Rather, the illustrated embodiments are provided as examples so that this disclosure will be thorough and complete, and will fully convey the concepts of this disclosure to those skilled in the art. Accordingly, known processes, elements, and techniques may not be described with respect to some example embodiments. Unless otherwise noted, like reference characters denote like elements throughout the attached drawings and written description, and thus descriptions will not be repeated. Example embodiments, however, may be embodied in many alternate forms and should not be construed as limited to only the example embodiments set forth herein.

[0028] It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, components, regions, layers, and/or sections, these elements, components, regions, layers, and/or sections should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items. The phrase "at least one of" has the same meaning as "and/or".

[0029] Spatially relative terms, such as "beneath," "below," "lower," "under," "above," "upper," and the like, may be used herein for ease of description to describe one element or feature's relationship to another element(s) or feature(s) as illustrated in the figures. It will be understood that the spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as "below," "beneath," or "under" other elements or features would then be oriented "above" the other elements or features. Thus, the example terms "below" and "under" may encompass both an orientation of above and below. The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly. In addition, when an element is referred to as being "between" two elements, the element may be the only element between the two elements, or one or more other intervening elements may be present.

[0030] Spatial and functional relationships between elements (for example, between modules) are described using various terms, including "connected," "engaged," "interfaced," and "coupled." Unless explicitly described as being "direct," when a relationship between first and second elements is described in the above disclosure, that relationship encompasses a direct relationship where no other intervening elements are present between the first and second elements, and also an indirect relationship where one or more intervening elements are present (either spatially or functionally) between the first and second elements. In contrast, when an element is referred to as being "directly" connected, engaged, interfaced, or coupled to another element, there are no intervening elements present. Other words used to describe the relationship between elements should be interpreted in a like fashion (e.g., "between," versus "directly between," "adjacent," versus "directly adjacent," etc.).

[0031] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. As used herein, the terms "and/or" and "at least one of" include any and all combinations of one or more of the associated listed items. It will be further understood that the terms "comprises," "comprising," "includes," and/or "including," when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Expressions such as "at least one of," when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list. Also, the term "example" is intended to refer to an example or illustration.

[0032] Additionally, as used herein, the term "comprises" and variants thereof denote open terms, meaning "including but not limited to". The term "based on" means "at least partly based on". The terms "one embodiment" or "one example embodiment" and "an embodiment" or "an example embodiment" mean "at least one embodiment" or "at least one example embodiment". The term "another embodiment" means "at least one other embodiment". The terms "first", "second", etc. may denote different or identical objects. Other definitions may be included below, either explicit or implicit. Unless clearly indicated in the context, the definition of a term is the same throughout the specification.

[0033] When an element is referred to as being "on," "connected to," "coupled to," or "adjacent to" another element, the element may be directly on, connected to, coupled to, or adjacent to the other element, or one or more other intervening elements may be present. In contrast, when an element is referred to as being "directly on," "directly connected to," "directly coupled to," or "immediately adjacent to" another element, there are no intervening elements present.

[0034] Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which example embodiments belong. It will be further understood that terms, e.g., those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

[0035] It should be borne in mind that terms should be associated with the appropriate physical quantities and may merely be convenient labels applied to these quantities. Unless specifically stated otherwise, or as is apparent from the discussion, terms such as "processing" or "computing" or "calculating" or "determining" or "displaying" or the like refer to the action and processes of a computer system, or similar electronic computing device/hardware, that manipulates and transforms data represented as physical, electronic quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.

[0036] Although described with reference to specific examples and drawings, modifications, additions and substitutions of example embodiments may be variously made according to the description by those of ordinary skill in the art. For example, the described techniques may be performed in an order different from that of the methods described, and/or components such as the described system, architecture, devices, circuits, and the like may be connected or combined differently from the above-described methods, or results may be appropriately achieved by other components or equivalents.

[0037] In accordance with at least some example embodiments, processors, memories and/or example algorithms, encoded as computer program code, may serve as means for providing or causing performance of operations discussed herein.

[0038] Provided in example embodiments are image processing systems, methods, devices and/or non-transitory, tangible or other computer-readable mediums, for implementing fast image processing at an edge side.

[0039] Fig. 1 is a structural schematic diagram of an image processing system 100 provided in an example embodiment. As shown in fig. 1, the image processing system 100 may comprise:

- a video stream processing device 10, for receiving a video stream 41 acquired by a camera, segmenting the video stream 41 into multiple frames of pictures arranged in chronological order, and distributing the multiple frames of pictures to edge computing devices 201 in a connected edge computing device group 20;

- the edge computing device group 20, comprising the edge computing devices 201, which receive the pictures from the video stream processing device 10, subject the pictures to target identification, and send, to a picture collecting device 30, the pictures marked with a region in which an identified target is located (e.g. a partial region enclosed by a bounding box);

- the picture collecting device 30, which restores, in chronological order as a video stream 42, the pictures marked with the target identification results and received from the edge computing devices 201.

[0040] A target identification task is distributed to multiple edge computing devices to be handled separately, instead of being handled on only one device; this parallel processing considerably increases the processing speed for the entire video stream. Multiple low-cost edge computing devices can thus match the performance of a high-performance edge computing device with a GPU, at a lower manufacturing cost.
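
By way of illustration only, the following is a minimal Python sketch of the segmentation and distribution step described above, assuming OpenCV for frame extraction and a hypothetical HTTP endpoint ("/identify") exposed by each edge computing device 201; the device addresses, endpoint name and frame-index header are illustrative assumptions, not part of the described system.

```python
# Sketch: segment a video stream into frames and distribute them
# round-robin to the edge computing device group (illustrative only).
import itertools

import cv2
import requests

EDGE_DEVICES = ["http://192.168.1.11:8080", "http://192.168.1.12:8080"]  # example group


def distribute_video(path: str) -> None:
    """Segment the video stream and dispatch frames to edge devices."""
    capture = cv2.VideoCapture(path)
    targets = itertools.cycle(EDGE_DEVICES)
    frame_index = 0
    while True:
        ok, frame = capture.read()
        if not ok:  # end of stream
            break
        # The frame index preserves chronological order so the picture
        # collecting device can restore the video stream later.
        _, jpeg = cv2.imencode(".jpg", frame)
        requests.post(
            f"{next(targets)}/identify",
            data=jpeg.tobytes(),
            headers={"X-Frame-Index": str(frame_index)},
            timeout=5,
        )
        frame_index += 1
    capture.release()
```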

[0041] The video stream processing device 10 and the picture collecting device 30 may themselves also be edge computing devices 201 in the edge computing device group 20, in which case they perform picture distribution and picture collection in addition to the function of target identification.

[0042] Optionally, in addition to the edge computing devices 201 in the edge computing device group 20, the image processing system 100 may further comprise other edge computing devices 202. The video stream processing device 10 can monitor a state of each edge computing device 201 in the edge computing device group 20 and a state of each connected edge computing device 202 outside the edge computing device group 20. Specifically, when the state of a first edge computing device 201 in the edge computing device group 20 changes to "unavailable", the first edge computing device 201 is removed from the edge computing device group 20; and when the state of a connected second edge computing device 202 outside the edge computing device group 20 changes to "can be added to the edge computing device group 20", the second edge computing device 202 is added to the edge computing device group 20.

[0043] This optional form of implementation achieves hot backup. In a conventional method, it is necessary to configure a batch size and a portion of hyper-parameters; moreover, limited by computing power and device internal memory restrictions, the batch size will generally have a small upper limit. This optional form of implementation can alter the way in which the edge computing device group is configured and can significantly increase the upper limit of the batch size, thus achieving greater flexibility, and also guaranteeing reliable and timely processing of each picture in the video stream. The video stream processing device 10 can detect all the available edge computing devices and send pictures to them, and if it is desired to add a new edge computing device to the edge computing device group 20, all that needs to be done is to connect the video stream processing device 10 to the edge computing device; for example, the new edge computing device can be added to a local area network in which the video stream processing device 10 is located. Furthermore, it is also possible to flexibly alter the configuration of the edge computing device group 20, to conveniently increase or decrease the number of edge computing devices without affecting the image processing task currently being performed. If the state of an edge computing device 201 changes to "unavailable", for example if the device experiences a fault or a power cut, then the video stream processing device 10 will no longer send pictures to that edge computing device for processing. As another example, if a new edge computing device 202 is added to the local area network in which the video stream processing device 10 is located, the video stream processing device 10 can then detect that the state of this new edge computing device 202 has changed to "available", add it to the edge computing device group 20, and send pictures thereto for processing.
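
The following Python sketch illustrates (without being the patented implementation) the hot-backup behaviour just described: group membership is recomputed from periodic reachability checks. The "/health" endpoint, the device list and the polling interval are assumptions made for this example; a push or heartbeat scheme would serve equally well.

```python
# Sketch: remove unavailable devices from, and add newly available
# devices to, the edge computing device group (illustrative only).
import time

import requests

KNOWN_DEVICES = [
    "http://192.168.1.11:8080",  # edge computing devices on the LAN
    "http://192.168.1.12:8080",
    "http://192.168.1.13:8080",
]


def poll_group(group: set[str]) -> set[str]:
    """Update the group according to each device's current state."""
    for device in KNOWN_DEVICES:
        try:
            available = requests.get(f"{device}/health", timeout=1).ok
        except requests.RequestException:
            available = False
        if device in group and not available:
            group.discard(device)   # state changed to "unavailable": remove
        elif device not in group and available:
            group.add(device)       # state changed to "available": add
    return group


if __name__ == "__main__":
    group: set[str] = set()
    while True:
        group = poll_group(group)   # pictures are only sent to devices in `group`
        time.sleep(2)               # polling interval (illustrative)
```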

[0044] In addition, the video stream processing device 10 can also monitor the state of processing resources of each edge computing device 201 in the edge computing device group 20, distribute more pictures to edge computing devices 201 with ample remaining processing resources, and distribute fewer pictures to edge computing devices 201 with insufficient remaining processing resources, or temporarily send them no pictures at all. This achieves processing load balancing among the edge computing devices 201.
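
A small sketch of this load-balancing idea follows: frames go to the device reporting the most spare capacity, and saturated devices are skipped. How "remaining capacity" is reported by a device is an assumption made for the example.

```python
# Sketch: choose a dispatch target by remaining capacity (illustrative).
def pick_device(capacities: dict[str, int]) -> str | None:
    """Return the device with the most remaining processing slots, or None."""
    available = {device: slots for device, slots in capacities.items() if slots > 0}
    if not available:
        return None  # every device is saturated: hold the frame briefly
    return max(available, key=available.get)


# Example: pick_device({"edge-a": 3, "edge-b": 0, "edge-c": 7}) returns
# "edge-c"; "edge-b" temporarily receives no pictures.
```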

[0045] One possibility is that an edge computing device 201 does not complete a target identification task in time, and so is unable to send the processed picture to the picture collecting device 30 in time; in one optional form of implementation, the picture collecting device 30 then acquires the original, unprocessed picture together with the other frames of pictures and restores these as the video stream 42.
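
A sketch of this collector-side fallback is given below: frames are emitted strictly in chronological order, and if an edge device is too slow the unprocessed original frame is substituted. The buffer shapes and the deadline value are illustrative assumptions.

```python
# Sketch: restore the video stream in order, falling back to the
# original picture when a processed frame is late (illustrative).
import time
from typing import Iterator


def restore_stream(processed: dict[int, bytes], originals: dict[int, bytes],
                   total_frames: int, deadline_s: float = 0.5) -> Iterator[bytes]:
    """Yield frames 0..total_frames-1 in chronological order."""
    for index in range(total_frames):
        waited = 0.0
        while index not in processed and waited < deadline_s:
            time.sleep(0.01)       # wait briefly for the marked picture
            waited += 0.01
        # Use the original picture if the marked one did not arrive in
        # time, so the restored video stream stays complete and ordered.
        yield processed.get(index, originals[index])
```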

[0046] The image processing system 100 may be used in various scenarios to complete image processing tasks, and can effectively increase the processing speed. One possible application scenario is shown in Fig. 7; the image processing system 100 is used to monitor a region of interest (ROI) 62. In Fig. 7, the ROI 62 is an entrance. The region captured by the camera is 60, and each edge computing device 201 in the edge computing device group 20 marks the ROI 62 in each frame of picture received. In scenarios in which the camera position and capture angle are fixed, it is necessary to record the positional relationship between the ROI 62 and the field of view 60 of the camera when the camera is installed, and to notify each edge computing device 201 of this positional relationship; the edge computing devices 201 can then mark the ROI 62 in the pictures according to the positional relationship. When a pedestrian or a vehicle moves into the entrance, the pedestrian or vehicle will appear in the field of view of the camera. The edge computing devices 201 identify the pedestrian or vehicle by target identification, and use bounding boxes to mark the region 61 in which the pedestrian or vehicle is located in the pictures.

[0047] Thus, both the region 61 in which the identified target is located and the ROI 62 are marked in the pictures collected by the picture collecting device 30. Optionally, in order to judge whether a target has entered or abnormally left the monitored ROI 62, for each frame of picture, the picture collecting device 30 can judge whether there is an overlap between the ROI 62 and the region 61 in which the identified target is located; if there is an overlap, then the degree to which the proportion of the area of the ROI taken up by a region of overlap changes within a preset (or defined) time period (e.g. 10 s) is judged. If the degree of change exceeds a preset (or defined) change degree threshold, then an alert is outputted. The degree of change within a preset time period is used for the judgment in order to avoid misjudgments caused by errors in the results of target identification in individual frames.

[0048] The above-described judgment process of the picture collecting device 30 is explained below with reference to Fig. 6. As can be seen from Fig. 6, three targets have been identified in the current frame of picture; the regions in which these are located are all marked 61, while the ROI is 62. In the figure, the shaded parts marked with oblique lines are the regions of overlap between the ROI 62 and the regions 61 in which the identified targets are located. If the change in the proportion of the area of the ROI 62 taken up by the regions of overlap exceeds a preset (or defined) threshold within a preset (or defined) time period, then it is concluded that a target has entered the ROI 62 (if the proportion taken up increases by an amount exceeding the preset threshold), or that a target has abnormally left the ROI 62 (if the proportion taken up decreases by an amount exceeding the preset threshold). Of course, the two thresholds used to judge that a target has entered and that a target has abnormally left may be different. Misjudgments can also be avoided by suitably setting the preset threshold.
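
The following Python sketch illustrates this judgment: per frame, compute the fraction of the ROI area covered by target bounding boxes, then compare the change of that fraction over the preset time period against entry/exit thresholds. The (x1, y1, x2, y2) box format and the threshold values are assumptions made for the example.

```python
# Sketch: ROI coverage ratio and change-based alert decision (illustrative).
def roi_coverage(roi: tuple, boxes: list) -> float:
    """Fraction of the ROI covered by target boxes (overlaps between boxes
    are each counted, so this is an approximation)."""
    rx1, ry1, rx2, ry2 = roi
    covered = 0.0
    for x1, y1, x2, y2 in boxes:
        w = min(rx2, x2) - max(rx1, x1)
        h = min(ry2, y2) - max(ry1, y1)
        if w > 0 and h > 0:
            covered += w * h
    return covered / ((rx2 - rx1) * (ry2 - ry1))


def judge(window: list, enter_thr: float = 0.2, leave_thr: float = 0.2):
    """window holds per-frame coverage ratios over the preset time period
    (e.g. 10 s of frames); the two thresholds may differ, as noted above."""
    change = window[-1] - window[0]
    if change > enter_thr:
        return "target entered ROI"
    if change < -leave_thr:
        return "target abnormally left ROI"
    return None  # no alert
```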

[0049] The picture collecting device 30 can judge whether a target has entered or abnormally left the ROI 62 based on the received pictures; furthermore, the picture collecting device 30 can mark target-related information in the outputted alert in the following optional fashion.

[0050] The picture collecting device 30 can compare information of pre-stored targets and the targets identified in the received pictures, and judge whether an identified target conforms to information of one pre-stored target; if the identified target conforms to information of one pre-stored target, then target information of the pre-stored target is marked in the outputted alert, e.g. an identifier (ID) of the target. If the target is a person, then information such as the person's name and job title can be marked; if the target is a vehicle, then information such as the vehicle's license plate number can be marked. If the identified targets do not conform to information of any pre-stored (or defined) target, the picture collecting device 30 can extract features of key parts of the targets identified in each frame of picture in the video stream in a preset time period, the key parts being used to distinguish between different targets. The frames of pictures are clustered according to the extracted features, a feature located at the cluster center point is acquired, the key part of the target in the picture that is most similar to the feature located at the cluster center point is obtained, and the key part so obtained is marked in the outputted alert. For example: if the target is a person, then the person's face can be marked; if the target is a vehicle, then the vehicle's license plate region can be marked, and so on. The most identifiable key parts can be found through this clustering method.
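
A sketch of this clustering step follows, using k-means from scikit-learn as one possible clustering method. The per-frame key-part feature vectors (e.g. face embeddings) are assumed to be given; extracting them is outside this sketch, and the cluster count is an illustrative assumption.

```python
# Sketch: pick the frame whose key-part feature best represents the
# target, via clustering (illustrative only).
import numpy as np
from sklearn.cluster import KMeans


def most_representative_frame(features: np.ndarray, n_clusters: int = 3) -> int:
    """Cluster the frames' key-part features, take the centre of the largest
    cluster, and return the index of the frame whose feature is most similar
    to that centre; that frame's key part is then marked in the alert."""
    kmeans = KMeans(n_clusters=n_clusters, n_init=10).fit(features)
    labels, counts = np.unique(kmeans.labels_, return_counts=True)
    dominant = labels[np.argmax(counts)]                 # largest cluster
    centre = kmeans.cluster_centers_[dominant]
    members = np.flatnonzero(kmeans.labels_ == dominant)
    distances = np.linalg.norm(features[members] - centre, axis=1)
    return int(members[np.argmin(distances)])
```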

[0051] The solution shown in Fig. 1 can be deployed quickly and easily, to achieve the same processing speed and accuracy as a high-performance GPU, and the cost will be greatly reduced. In addition, using the optional implementation method of hot backup mentioned above, edge computing devices can be added easily without the need for additional configuration, to expand the edge computing device group.

[0052] The solution shown in Fig. 1 can be used for monitoring of regions of interest, for example: registration and identification of visitors in a building, monitoring of region entrances, and so on. The edge computing devices are generally small in size, and relatively easy to deploy at front desks, main doors, etc. With the solution provided in one or more example embodiments, the computing power provided by the edge computing device group is sufficient to perform the corresponding image processing, and is capable of rapid target identification, e.g. face recognition.

[0053] The video stream processing device 10 and picture collecting device 30 provided in one or more example embodiments are presented below; both of these devices can be regarded as image processing apparatuses.

[0054] The video stream processing device 10 provided in one or more example embodiments may be implemented as a network of computer processors, in order to perform the image processing method 400 in one or more example embodiments. The video stream processing device 10 may also be a single computer as shown in Fig. 2, comprising at least one memory 101, which comprises a computer-readable medium, e.g. random access memory (RAM). The video stream processing device 10 further comprises at least one processor 102 coupled to the at least one memory 101. Computer-executable instructions are stored in the at least one memory 101, and, when executed by the at least one processor 102, can cause the at least one processor 102 to perform the steps described herein. The at least one processor 102 may comprise a microprocessor, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a central processing unit (CPU), a graphics processing unit (GPU), a state machine, etc. Examples of computer-readable media include, but are not limited to, floppy disks, CD-ROM, magnetic disks, memory chips, ROM, RAM, ASICs, configured processors, optical media, magnetic tapes or other magnetic media, or any other media from which instructions can be read by a computer processor. In addition, various other forms of computer-readable media may send or carry instructions to a computer, including routers, dedicated or public networks, or other wired and wireless transmission devices or channels. The instructions may include code in any computer programming language, including C, C++, Visual Basic, Java and JavaScript.

[0055] The at least one memory 101 shown in Fig. 2 may contain an image processing program 11 which, when executed by the at least one processor 102, causes the at least one processor 102 to perform the image processing method 400 described in one or more example embodiments. The image processing program 11 may comprise:

- a receiving module 111, configured to receive a video stream;

- a segmenting module 112, configured to segment the video stream into multiple frames of pictures arranged in chronological order;

- a distributing module 113, configured to distribute the multiple frames of pictures to edge computing devices in a connected edge computing device group, to enable the edge computing devices in the edge computing device group to subject the received pictures to target identification, wherein the pictures marked with a region in which an identified target is located are restored in chronological order as a video stream;

- a monitoring module 114, configured to: monitor a state of each edge computing device in the edge computing device group and a state of each connected edge computing device outside the edge computing device group; when a first edge computing device in the edge computing device group experiences a fault, remove the first edge computing device from the edge computing device group; and when the state of a connected second edge computing device outside the edge computing device group changes to "can be added to the edge computing device group", add the second edge computing device to the edge computing device group.

[0056] Optionally, the video stream processing device 10 may further comprise a communication module 103, connected to the at least one processor 102 and the at least one memory 101 via a bus, and used for communication between the video stream processing device 10 and an external device.

[0057] It should be mentioned that example embodiments may comprise an apparatus having an architecture different from that shown in Fig. 2. The architecture described above is merely an example, and serves to explain the method 400 provided in example embodiments.

In addition, the modules described above may also be regarded as functional modules realized by hardware, for performing the various functions which are involved when the video stream processing device 10 performs the image processing method. For example, control logic of the processes involved in the method is burned into field-programmable gate array (FPGA) chips or complex programmable logic devices (CPLD) etc. in advance, with these chips or devices executing the functions of the modules described above. The particular manner of implementation may be determined according to engineering practice.

[0058] The picture collecting device 30 provided in one or more example embodiments may be implemented as a network of computer processors, in order to perform the image processing method 500 in one or more example embodiments. The picture collecting device 30 may also be a single computer as shown in Fig. 3, comprising at least one memory 201, which comprises a computer-readable medium, e.g. random access memory (RAM). The picture collecting device 30 further comprises at least one processor 202 coupled to the at least one memory 201. Computer-executable instructions are stored in the at least one memory 201, and, when executed by the at least one processor 202, can cause the at least one processor 202 to perform the steps described herein. The at least one processor 202 may comprise a microprocessor, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a central processing unit (CPU), a graphics processing unit (GPU), a state machine, etc. Examples of computer-readable media include, but are not limited to, floppy disks, CD-ROM, magnetic disks, memory chips, ROM, RAM, ASICs, configured processors, optical media, magnetic tapes or other magnetic media, or any other media from which instructions can be read by a computer processor. In addition, various other forms of computer-readable media may send or carry instructions to a computer, including routers, dedicated or public networks, or other wired and wireless transmission devices or channels. The instructions may include code in any computer programming language, including C, C++, Visual Basic, Java and JavaScript.

[0059] The at least one memory 201 shown in Fig. 3 may contain an image processing program 21 which, when executed by the at least one processor 202, causes the at least one processor 202 to perform the image processing method 500 described in one or more example embodiments. The image processing program 21 may comprise:

- a receiving module 211, configured to receive multiple frames of pictures arranged in chronological order, wherein each frame of picture is marked with a region in which a target identified by target identification is located and a region of interest;

- an alerting module 212, configured to: judge whether there is an overlap between the region in which the identified target is located and the region of interest marked in each frame of picture received; if there is an overlap, then judge the degree to which the proportion of the area of the region of interest taken up by a region of overlap changes within a preset time period; and if the degree of change exceeds a preset change degree threshold, then output an alert.

[0060] Optionally, the alerting module 212 is further configured to: compare information of pre-stored targets and the target identified in the pictures received by the receiving module 211, and judge whether the identified target conforms to information of one pre-stored target; and if the identified target conforms to information of one pre-stored target, then mark, in the outputted alert, target information of the pre-stored identified target.

[0061] Optionally, the alerting module 212 may also be configured to: if the identified target does not conform to information of any pre-stored target, then extract a feature of a key part of the target identified in each frame of picture in the video stream in a preset time period, the key part being used to distinguish between different targets; cluster the frames of pictures according to the extracted feature; acquire a feature located at the cluster center point; obtain the key part of the target in the picture that is most similar to the feature located at the cluster center point; and mark the key part so obtained in the outputted alert.

[0062] Optionally, the picture collecting device 30 may further comprise a communication module 203, connected to the at least one processor 202 and the at least one memory 201 via a bus, and used for communication between the picture collecting device 30 and an external device.

[0063] It should be mentioned that one or more example embodiments may comprise an apparatus having an architecture different from that shown in Fig. 3. The architecture described above is merely an example, and serves to explain the method 500 provided in one or more example embodiments.

[0064] In addition, the modules described above may also be regarded as functional modules realized by hardware, for performing the various functions which are involved when the picture collecting device 30 performs the image processing method. For example, control logic of the processes involved in the method is burned into field-programmable gate array (FPGA) chips or complex programmable logic devices (CPLD) etc. in advance, with these chips or devices executing the functions of the modules described above. The particular manner of implementation may be determined according to engineering practice.

The image processing method 400 provided in one or more example embodiments is explained below with reference to Fig. 4. As shown in Fig. 4, the method may comprise the following steps:

- S401: receiving a video stream;

- S402: segmenting the video stream into multiple frames of pictures arranged in chronological order;

- S403: distributing the multiple frames of pictures to edge computing devices in a connected edge computing device group, to enable the edge computing devices in the edge computing device group to subject the received pictures to target identification, wherein the pictures marked with a region in which an identified target is located are restored in chronological order as a video stream;

- S404: monitoring a state of each edge computing device in the edge computing device group and a state of each connected edge computing device outside the edge computing device group;

- S405: when a first edge computing device in the edge computing device group experiences a fault, removing the first edge computing device from the edge computing device group;

- S406: when the state of a connected second edge computing device outside the edge computing device group changes to "can be added to the edge computing device group", adding the second edge computing device to the edge computing device group.

[0065] The image processing method 500 provided in one or more example embodiments is explained below with reference to Fig. 5. As Fig. 5 shows, the method may comprise the following steps:

- S501: receiving multiple frames of pictures arranged in chronological order, wherein each frame of picture is marked with a region in which a target identified by target identification is located and a region of interest;

- S502: judging whether there is an overlap between the region in which the identified target is located and the region of interest marked in each frame of picture received; if there is an overlap, then performing step S503;

- S503: judging the degree to which the proportion of the area of the region of interest taken up by a region of overlap changes within a preset time period; and if the degree of change exceeds a preset change degree threshold, then performing step S504;

- S504: outputting an alert.

Optionally, the method 500 may further comprise the following steps:

- S505: comparing information of pre-stored targets and the target identified in the received pictures, and judging whether the identified target conforms to information of one pre-stored target; if the identified target conforms to information of one pre-stored target, then performing step S506; if the identified target does not conform to information of any pre-stored target, then performing steps S507-S511;

- S506: marking, in the outputted alert, target information of the pre-stored identified target;

- S507: extracting a feature of a key part of the target identified in each frame of picture in the video stream in a preset time period, the key part being used to distinguish between different targets;

- S508: clustering the frames of pictures according to the extracted feature;

- S509: acquiring a feature located at the cluster center point;

- S510: obtaining the key part of the target in the picture that is most similar to the feature located at the cluster center point;

- S511: marking the key part so obtained in the outputted alert.

[0066] In addition, a non-transitory, tangible or other computer-readable medium is also provided in one or more example embodiments; computer-readable instructions are stored on the computer-readable medium, and the computer-readable instructions, when executed by a processor, cause the processor to perform the image processing method described above. Examples of computer-readable media include floppy disks, hard disks, magneto-optical disks, optical disks (such as CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), magnetic tapes, non-volatile memory cards and ROM. Optionally, the computer-readable instructions may be downloaded from a server computer or a cloud via a communication network. It must be explained that not all of the steps and modules in the flows and system structure diagrams above are necessary; certain steps or modules may be omitted according to actual requirements. The order in which steps are executed is not fixed, but may be adjusted as required. The system structures described in the example embodiments above may be physical structures, and may also be logical structures; i.e., some modules might be realized by the same physical entity, some modules might be realized by multiple physical entities, and some might be realized jointly by certain components in multiple independent devices.