Title:
SAFETY MECHANISMS FOR ARTIFICIAL INTELLIGENCE UNITS USED IN SAFETY CRITICAL APPLICATIONS
Document Type and Number:
WIPO Patent Application WO/2022/123197
Kind Code:
A1
Abstract:
A method for safety monitoring of an artificial intelligence, AI, processing unit, the method comprises the steps of: providing a safety unit for sending a first signal to the AI processing unit; determining, by the AI processing unit, a second signal in response to the first signal; transmitting, by the AI processing unit to the safety unit the second signal and a heartbeat signal of the AI processing unit, the heartbeat signal representing a time sequence of the first and second signals; determining, by the safety unit, whether the second signal is correct and whether the time sequence is correct, wherein if either the second signal or the time sequence is incorrect, the method further comprises the step of sending a signal to shut down the AI processing unit either by resetting the AI processing unit or shutting off a power supply of the AI processing unit.

Inventors:
WU HAO (GB)
LU KE (GB)
FREER DANIEL (GB)
MAGI MART (GB)
AN HAO (GB)
Application Number:
PCT/GB2021/052831
Publication Date:
June 16, 2022
Filing Date:
November 02, 2021
Assignee:
ANTOBOT LTD (GB)
International Classes:
G01R31/317; G01R31/3193; G05B19/042; G05B19/05
Foreign References:
EP2037340A1 2009-03-18
DE102008034150A1 2010-01-28
US20040172580A1 2004-09-02
US20150178617A1 2015-06-25
Attorney, Agent or Firm:
IP21 LTD (GB)
Claims:
CLAIMS

1. A method for safety monitoring of an artificial intelligence, AI, processing unit, the method comprising the steps of: providing a safety unit for sending a first signal to the AI processing unit; determining, by the AI processing unit, a second signal in response to the first signal; transmitting, by the AI processing unit to the safety unit the second signal and a heartbeat signal of the AI processing unit, the heartbeat signal representing a time sequence of the first and second signals; determining, by the safety unit, whether the second signal is correct and whether the time sequence is correct, wherein if either the second signal or the time sequence is incorrect, the method further comprises the step of sending a signal to shut down the AI processing unit either by resetting the AI processing unit or shutting off a power supply of the AI processing unit.

2. A method according to claim 1, wherein the first signal represents a question to the AI processing unit, wherein the question is one or more from the group of: a task identifier, Task_id, corresponding to a task of the AI processing unit; an operation corresponding to the question; an operand of an operation corresponding to the question; the number of times, live_count, a second signal is transmitted in the response to a first signal, wherein the method steps are repeated; and a data check code.

3. A method according to any preceding claim, wherein determining, by the AI processing unit, a second signal comprises: providing a software module in the AI processing unit for storing the first signal in a question storage area of the AI processing unit and reading the second signal, the second signal being stored in an answer storage area of the AI processing unit, and for sending the second signal to the safety unit.

4. A method according to any preceding claim, wherein the second signal represents an answer to a question comprised in the first signal, wherein the answer is one or more from the group of: a task identifier corresponding to a task of the AI processing unit answering the question; a calculation result; the number of times, live_count, a second signal is transmitted in the response to a first signal, wherein the method steps are repeated; a time stamp of the time when the second signal is generated; a data check code.

5. A method according to any preceding claim, wherein the step of determining, by the safety unit, whether the second signal is correct and whether the time sequence is correct comprises the steps of: providing a software module of the AI processing unit to control the heartbeat signal, the software module being adapted to send a falling edge signal to the safety unit, the falling edge indicating that the AI processing unit has prepared the second signal in response to the first signal and can receive the next first signal, wherein, after the falling edge is generated for a predetermined period of time, the AI processing unit generates a rising edge of the heartbeat signal; wherein, after detecting the falling edge received from the software module, the safety unit sends the next first signal to the AI processing unit and reads the previous second signal; and determining, by the safety unit, a time period between the falling edge and the rising edge of the heartbeat signal to verify that a clock of the AI processing unit is correct.

6. The method according to any of claims 2 to 5, wherein determining whether the second signal is correct and whether the time sequence is correct comprises determining one or more from the group of: whether the Task_id and live_count correspond to each other; whether the calculated answer of the second signal in response to the first signal is correct; whether the data check code is correct; whether the timing of the heartbeat signal is correct; whether the time interval between the rising edge and the next falling edge is within a certain predefined time period.

7. A system for safety monitoring of an artificial intelligence, AI, processing unit, the system comprising: a safety unit for sending a first signal to the AI processing unit; the AI processing unit being adapted to determine a second signal in response to the first signal and transmit to the safety unit the second signal and a heartbeat signal of the AI processing unit, the heartbeat signal representing a time sequence of the first and second signals; the safety unit being adapted to determine whether the second signal is correct and whether the time sequence is correct, the system further comprising a power supply for powering the AI processing unit, wherein the safety unit is further adapted to send a signal to shut down the AI processing unit either by resetting the AI processing unit or shutting off the power supply.

8. A system according to claim 7, wherein the AI processing unit further comprises a software module in the AI processing unit for storing the first signal in a question storage area of the AI processing unit and reading the second signal, the second signal being stored in an answer storage area of the AI processing unit, and for sending the second signal to the safety unit.

Description:

SAFETY MECHANISMS FOR ARTIFICIAL INTELLIGENCE UNITS USED IN SAFETY CRITICAL APPLICATIONS

Technical field

Aspects of the present invention generally relate to safety monitoring of artificial intelligence (AI) processing units. In particular, aspects of the present invention relate to methods and systems using a safety unit to perform safety monitoring on AI processing units with low safety levels.

Background

Functional safety is of great significance to electronic, electrical, and programmable electronic safety control systems. Functional safety focuses on avoiding unacceptable risks due to system functional failures. Internationally, the IEC 61508, IEC 61511 and other series of standards issued by the International Electrotechnical Commission (IEC) have gradually become basic functional safety standards now widely recognized by various countries and industries globally. Other industries are gradually forming national industry standards with reference to basic standards.

When designing the functional safety of a system, an important early step is to conduct a hazard analysis and risk assessment: identify the hazards of the system and evaluate the risk level of each hazard. IEC 61508 defines four safety integrity levels (Safety Integrity Level, SIL), with level 4 representing the highest integrity. To meet these requirements, a number of safety mechanisms are typically integrated inside a chip, including safety mechanisms for the internal modules of the chip and safety mechanisms at the system level. When a fault occurs and is detected by the corresponding safety mechanism, the mechanism needs to report the fault in time, so that the system can respond according to the type and degree of the fault and avoid hidden faults or a function failure directly caused by the fault.

Over recent years, artificial intelligence (AI) technology has gradually become a focus of industry, being widely used in robotics, economic and political decision-making systems, and control systems, amongst others. The core hardware in AI technology may include a wide variety of chips. However, the current functional safety standards lack consideration of AI technology, and the design of functional safety standards does not consider the applicability of AI and other technologies.

Aspects of the present invention are aimed at addressing the above safety problems, amongst others.

Summary

The present invention provides methods and systems to monitor an AI processing unit such as an AI computing chip and discover computing abnormalities in time, thereby ensuring system safety.

In a broad independent aspect, there is provided a method according to claim 1.

Rather than simply communicating values between the various parts of the system, the AI processing unit is sent questions and works out the answers, for example via numerical or logical operations. In addition, the heartbeat signal representing the physical clock of the AI processing unit is checked, for example to determine the time period between a falling and a rising edge of the physical signal, which may be predefined at a fixed value (e.g. 10 ms). Accordingly, if the answer (represented by the second signal) is correct, the AI processing unit is considered to be working normally. Otherwise, the system is regarded as abnormal. The safety unit also determines whether the timing of sending and answering the question (posed by the first signal) is correct. This timing is checked by monitoring the heartbeat signal. Behaviours that violate the timing are also regarded as system abnormalities. If the system is abnormal, the AI processing unit may be reset or the power supply of the AI processing unit may be cut off.

In a dependent aspect, the first signal represents a question to the AI processing unit, wherein the question is one or more from the group of: a task identifier, Task_id, corresponding to a task of the AI processing unit; an operation corresponding to the question; an operand of an operation corresponding to the question; the number of times, live_count, a second signal is transmitted in the response to a first signal, wherein the method steps are repeated; and a data check code, such as Cyclic Redundancy Check (CRC).

Advantageously, multiple tasks in the AI processing unit may be monitored simultaneously.

In a dependent aspect, determining, by the AI processing unit, a second signal comprises: providing a software module in the AI processing unit for storing the first signal in a question storage area of the AI processing unit and reading the second signal, the second signal being stored in an answer storage area of the AI processing unit, and for sending the second signal to the safety unit. For example, a checkpoint for generating answers may be set in the TASK that needs to be monitored in the AI processing unit. The checkpoint reads the questions from the pre-set question area and writes the answers into a pre-set answer area of the AI processing unit.

In a dependent aspect, the second signal represents an answer to a question comprised in the first signal, wherein the answer is one or more from the group of: a task identifier corresponding to a task of the AI processing unit answering the question; a calculation result; the number of times, live_count, a second signal is transmitted in the response to a first signal, wherein the method steps are repeated; a time stamp of the time when the second signal is generated; a data check code, such as Cyclic Redundancy Check (CRC).

In a dependent aspect, the step of determining, by the safety unit, whether the second signal is correct and whether the time sequence is correct comprises the steps of: providing a software module of the AI processing unit to control the heartbeat signal, the software module being adapted to send a falling edge signal to the safety unit, the falling edge indicating that the AI processing unit has prepared the second signal in response to the first signal and can receive a next first signal, wherein, after the falling edge is generated for a predetermined period of time, the AI processing unit generates a rising edge of the heartbeat signal; wherein, after detecting the falling edge received from the software module, the safety unit sends the next first signal to the AI processing unit and reads the previous second signal; and determining, by the safety unit, a time period between the falling edge and the rising edge of the heartbeat signal to verify that a clock of the AI processing unit is correct.

In a dependent aspect, determining whether the second signal is correct and whether the time sequence is correct comprises determining one or more from the group of: whether the Task_id and live_count correspond to each other; whether the calculated answer of the second signal in response to the first signal is correct; whether the data check code is correct; whether the timing of the heartbeat signal is correct; whether the time interval between the rising edge and the next falling edge is within a certain predefined time period.

In a second, broad independent aspect, there is provided a system according to claim 7.

To overcome the deficiencies of prior art AI systems in relation to functional safety, aspects of the present invention use a safety monitoring unit, referred to as a ‘safety unit’ or ‘safety chip’. This may be a safety monitoring unit composed of a single-chip microcomputer with a higher safety level, which performs safety monitoring on an artificial intelligence computing unit, such as a ‘system on a chip’ (SOC) or a field-programmable gate array (FPGA).

The monitoring function may include the following:

1. Communication between the safety monitoring unit and the AI processing unit (also referred to as an AI unit, AI calculation unit, or AI computing unit).

2. A ‘heartbeat’ signal used to indicate the operating status of the AI processing unit. It will be understood that the heartbeat or heartbeat signal represents a periodic signal indicative of normal operation of the processing unit, which may also be used to synchronise it with other parts of the system (i.e. the AI processing unit clock). The heartbeat signal may also be referred to as a "pulse signal", or "logic level signal".

3. A signal used to safely shut down the AI processing unit, such as a reset signal or a power control signal.

Advantageously, the safety monitoring unit sends the question to the AI processing unit, the AI processing unit feeds back the calculation result and the heartbeat signal corresponding to the question to the safety chip, and the safety unit verifies the answer. If the answer is correct, the AI processing unit is working normally. Otherwise, the AI processing unit is considered not to be working normally, and this is regarded as a system abnormality. Furthermore, the sending and answering of questions need to follow a certain time sequence. Advantageously, this sequence is checked by monitoring the heartbeat signal. Behaviours that violate the sequence are also regarded as system abnormalities. If the system is abnormal, the AI processing unit is reset or the power supply of the AI processing unit is cut off.

In this manner, aspects of the invention can be used to effectively monitor whether the AI processing unit is operating normally and take reasonable fault response measures in good time. This increases safety and robustness of the overall system.

Brief Description of the Drawings

Exemplary embodiments of the invention will now be described in reference to the accompanying drawings, in which:

Fig. 1 shows a data flowchart of a method for safety monitoring.

Fig. 2 shows a detailed data flowchart for step S2 in Fig. 1.

Fig. 3 is a schematic diagram of a safety monitoring system.

Fig. 4 schematically illustrates an example of a ‘heartbeat’ signal.

Detailed Description

The following examples are used to illustrate the implementation of the present invention.

Turning first to Fig. 3, an example safety monitoring system 1 is illustrated schematically. The system 1 for safety monitoring of an AI processing unit includes: a safety chip (module) 11, a power supply (module) 12, and an AI processing unit (module) 13, i.e. a processor 13.

Fig. 1 shows a schematic diagram of a method used to monitor the safety of the AI processing unit 13. The method includes the following steps:

Step S1

The safety chip 11 sends a question to the AI processing unit 13. In this example, questions sent to the AI processing unit 13 include the following:

(1) Task_id corresponding to the monitored TASK. The AI processing unit may carry out a number of tasks simultaneously, each identifiable by its Task_id. These tasks and Task_ids are known in advance by both the AI processing unit and the safety chip. The tasks are running and monitored by the AI processing unit.

(2) The operation corresponding to the question;

(3) The operand of the operation corresponding to the question;

(4) live_count;

(5) Data check code, such as Cyclic Redundancy Check (CRC).
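
By way of illustration only, the question fields listed above could be packed into a frame as sketched below. The field widths, the byte order, the operation codes, the use of two operands and the choice of CRC-32 are all assumptions made for this sketch; the source does not specify an encoding.

```python
import struct
import zlib
from dataclasses import dataclass

# Hypothetical operation codes; the source does not define an encoding.
OP_ADD, OP_XOR = 0, 1


@dataclass(frozen=True)
class Question:
    task_id: int     # identifies the monitored task
    operation: int   # which operation the task must perform
    operand_a: int   # two operands are an assumption of this sketch
    operand_b: int
    live_count: int  # incremented by 1 for each Q&A round

    def encode(self) -> bytes:
        # Assumed layout: five little-endian unsigned 32-bit fields,
        # followed by a CRC-32 of those fields as the data check code.
        body = struct.pack("<5I", self.task_id, self.operation,
                           self.operand_a, self.operand_b, self.live_count)
        return body + struct.pack("<I", zlib.crc32(body))


def decode_question(frame: bytes) -> Question:
    """Verify the check code and rebuild the question."""
    body = frame[:-4]
    (crc,) = struct.unpack("<I", frame[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("CRC mismatch")
    return Question(*struct.unpack("<5I", body))
```

In this sketch a corrupted frame is rejected before any field is interpreted, which mirrors the role of the data check code in the list above.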

In this example, the Task_id may represent the serial number corresponding to each task in the AI processing unit 13, and live_count represents the number of questions and answers (Q&A) conducted, with 1 added for each Q&A.

Step S2

The AI processing unit 13 reads and calculates the questions received from the safety chip 11, and then feeds back the results and the heartbeat signal to the safety chip 11. Fig. 2 shows a detailed flow chart of an example step S2. More specifically, step S2 comprises the following steps:

S21. A dedicated software module (DSM) in the AI processing unit 13 stores the question in the pre-set question area.

S22. The checkpoint used to generate the answer in the TASK that needs to be monitored in the AI processing unit 13 reads the question from the pre-set question area, the monitored TASK calculates the answer and writes the answer into the pre-set answer area, and the AI processing unit 13 feeds the answer back to the safety chip 11. In this example, the answer includes:

(1) Task_id corresponding to the TASK answering the question;

(2) Calculation answer;

(3) live_count;

(4) The time stamp of the time when the answer was generated;

(5) Data check code, such as CRC.

S23. The DSM in the AI processing unit 13 reads the answer from the pre-set answer area and sends it to the safety chip 11.

S24. The DSM in the AI processing unit 13 sends a falling edge of the heartbeat signal, indicating the answer is ready. After detecting the falling edge generated by the AI processing unit, the safety chip immediately reads the answer to the previous question and sends the next question to the AI processing unit 13.

S25. The DSM in the AI processing unit 13 sends a rising edge of the heartbeat signal after a predefined period of time (e.g. 10 ms) following the falling edge to verify the time sequence.
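
The sub-steps of step S2 may be sketched, purely as an example, as follows. The dictionary-based question and answer areas, the operation table and the callback names (`send_answer`, `set_heartbeat`) are hypothetical and stand in for whatever transport and GPIO interface a real system would use; the check code is omitted for brevity, and the checkpoint is called inline although in practice the monitored task would run independently of the DSM.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

HEARTBEAT_LOW_S = 0.010  # 10 ms, the example value given in the text

# Hypothetical operation table; the source does not enumerate operations.
OPERATIONS = {"add": lambda a, b: a + b, "xor": lambda a, b: a ^ b}


@dataclass
class SharedAreas:
    question: Optional[dict] = None  # pre-set question area
    answer: Optional[dict] = None    # pre-set answer area


def task_checkpoint(areas: SharedAreas) -> None:
    """S22: the monitored task reads the question and writes its answer."""
    q = areas.question
    areas.answer = {
        "task_id": q["task_id"],
        "result": OPERATIONS[q["operation"]](*q["operands"]),
        "live_count": q["live_count"],
        "timestamp": time.monotonic(),
    }


def dsm_round(question: dict, areas: SharedAreas,
              send_answer: Callable[[dict], None],
              set_heartbeat: Callable[[int], None],
              sleep: Callable[[float], None] = time.sleep) -> None:
    """One Q&A round as seen from the dedicated software module (DSM)."""
    areas.question = question   # S21: store the question
    task_checkpoint(areas)      # S22: called inline here for simplicity
    send_answer(areas.answer)   # S23: send the answer to the safety chip
    set_heartbeat(0)            # S24: falling edge, answer is ready
    sleep(HEARTBEAT_LOW_S)      # hold the line low for the predefined time
    set_heartbeat(1)            # S25: rising edge
```

Passing `sleep` as a parameter is a design choice of this sketch so that the timing can be replaced in a test or by a hardware timer.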

S3. The safety chip 11 checks the information in the question and answer (Q&A). Specific inspection contents in this example include:

(1) Whether Task_id and live_count correspond correctly;

(2) Whether the answer of the operation is correct;

(3) Whether the CRC check is correct.
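
The three checks of step S3 might, for example, be expressed as below. This is a minimal sketch: it interprets "correspond correctly" as the answer echoing the Task_id and live_count of the question, which is an assumption, and the dictionary layout, the operation table and the helper `answer_crc` are hypothetical.

```python
import zlib

# Hypothetical operation table shared by the safety chip and the AI unit.
OPERATIONS = {"add": lambda a, b: a + b, "xor": lambda a, b: a ^ b}


def answer_crc(task_id: int, result: int, live_count: int) -> int:
    """Assumed check code: CRC-32 over a simple text rendering of the fields."""
    return zlib.crc32(f"{task_id},{result},{live_count}".encode())


def check_answer(question: dict, answer: dict) -> list:
    """Return the list of failed checks; an empty list means the answer passes."""
    failures = []
    # (1) Task_id and live_count must match the question that was asked.
    if (answer["task_id"], answer["live_count"]) != (
            question["task_id"], question["live_count"]):
        failures.append("task_id/live_count mismatch")
    # (2) The safety chip recomputes the operation independently.
    expected = OPERATIONS[question["operation"]](*question["operands"])
    if answer["result"] != expected:
        failures.append("wrong result")
    # (3) The check code must match the fields actually received.
    if answer["crc"] != answer_crc(
            answer["task_id"], answer["result"], answer["live_count"]):
        failures.append("bad CRC")
    return failures
```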

S4. The safety chip 11 verifies the heartbeat signal and measures the time between the falling edges and the rising edges to verify whether the clock of the AI processing unit 13 is correct. With reference to Fig. 4, if the interval between the falling edge and the rising edge does not equal the predefined value (e.g. 10 ms in Fig. 4), the system 1 is abnormal; otherwise, it is normal. In addition, if the time interval between the rising edge and the next falling edge is within certain predefined values (Y<t<X, where X and Y are predefined), the question and answer has not timed out, and the system 1 is assessed to be working normally. If the time interval between the rising edge and the next falling edge falls outside the predefined values, the system 1 is assessed to be abnormal.
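
The two timing rules of step S4 can be illustrated with the following sketch. The edge list format, the tolerance `tol_s` around the 10 ms value (a practical measurement cannot test exact equality) and the numeric defaults chosen for Y and X are assumptions of this example, not values from the source.

```python
def check_heartbeat(edges, low_s=0.010, tol_s=0.001, y_s=0.005, x_s=0.100):
    """Check a time-ordered list of (timestamp_seconds, kind) edges.

    kind is "fall" or "rise". Returns True when every interval is valid.
    """
    for (t0, k0), (t1, k1) in zip(edges, edges[1:]):
        dt = t1 - t0
        if k0 == "fall" and k1 == "rise":
            # Falling to rising: must match the predefined low time.
            if abs(dt - low_s) > tol_s:
                return False
        elif k0 == "rise" and k1 == "fall":
            # Rising to next falling: must satisfy Y < t < X.
            if not (y_s < dt < x_s):
                return False
        else:
            # Two edges of the same kind in a row violate the sequence.
            return False
    return True
```

A low time of 20 ms, for instance, would fail the first rule and suggests that the clock of the AI processing unit is running slow, while a long gap before the next falling edge fails the second rule and indicates a Q&A timeout.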

S5. If it is determined that the system 1 is abnormal after S3 and S4, reset the AI processing unit 13 or cut off the power supply 12 of the AI processing unit 13.

S6. If it is determined that the AI processing unit 13 is working normally after S3 and S4, keep monitoring and continue to the next question and answer (Q&A).
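
Steps S5 and S6 amount to a simple decision loop, sketched below as one possible arrangement. The names `FaultResponse`, `monitor` and `shut_down` are hypothetical; how the safety chip chooses between a reset and a power cut is not stated in the source, so the choice is left as a parameter.

```python
from enum import Enum
from typing import Callable, Iterable, Tuple


class FaultResponse(Enum):
    RESET = "reset"          # reset the AI processing unit
    POWER_OFF = "power_off"  # cut off its power supply


def monitor(rounds: Iterable[Tuple[bool, bool]],
            shut_down: Callable[[FaultResponse], None],
            response: FaultResponse = FaultResponse.RESET) -> int:
    """Consume (answer_ok, timing_ok) results, one pair per Q&A round.

    Returns the number of rounds completed normally before a fault,
    or the total number of rounds if no fault occurred.
    """
    completed = 0
    for answer_ok, timing_ok in rounds:
        if not (answer_ok and timing_ok):
            shut_down(response)  # S5: abnormal, trigger the fault response
            break
        completed += 1           # S6: normal, continue with the next Q&A
    return completed
```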




 