Title:
METHOD FOR AUTOMATICALLY EXPLORING STATES AND TRANSITIONS OF A HUMAN MACHINE INTERFACE (HMI) DEVICE
Document Type and Number:
WIPO Patent Application WO/2023/110478
Kind Code:
A1
Abstract:
In order to improve software exploration of HMI devices (14), the invention suggests a method comprising an AI agent (28, 30) determining an HMI action to be performed by the HMI device (14). The HMI action is selected from an allowed action set of HMI actions that are allowed to be performed by the AI agent (28, 30) based on a set of predetermined test conditions. The testing environment (12) captures image data (38) from the HMI device (14), the image data (38) being indicative of a current screen associated with the HMI state, and compares the current screen with a set of all known screens. If the current screen is not part of the set of all known screens, the AI agent (28, 30) performs a semantic analysis on the current screen, determines a semantic description of the current screen, combines the current screen and the semantic description into an annotated screen, and adds the annotated screen to the set of all known screens. Finally, the AI agent (28, 30) uses the annotated screen corresponding to the current screen to update the allowed action set based on the annotated screen.

Inventors:
TEO YON SHIN (SG)
CAO YUSHI (SG)
TOH YUXUAN (SG)
ADIGA VINAY VISHNUMURTHY (SG)
WAI AUNG PHYO (SG)
CHONG GA XIANG (SG)
AYE HTUN SU NANDAR (SG)
LIN SHANG-WEI (SG)
LIU YANG (SG)
Application Number:
PCT/EP2022/084343
Publication Date:
June 22, 2023
Filing Date:
December 05, 2022
Assignee:
CONTINENTAL AUTOMOTIVE TECH GMBH (DE)
UNIV NANYANG TECH (SG)
International Classes:
G06F11/36; G06N3/006
Foreign References:
US20210073110A12021-03-11
US20190384699A12019-12-19
DE102021115031A
Other References:
CAO, YUSHI et al.: "Automatic HMI structure exploration via curiosity-based reinforcement learning", Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering, ACM, New York, NY, USA, 15 November 2021, pages 1151-1155, XP058907572, ISBN: 978-1-4503-9442-0, DOI: 10.1109/ASE51524.2021.9678703
MURTHY, VENKATESH N. et al.: "Automatic Image Annotation using Deep Learning Representations", Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies, CoNEXT '14, ACM Press, New York, New York, USA, 22 June 2015, pages 603-606, XP058504980, ISBN: 978-1-4503-3279-8, DOI: 10.1145/2671188.2749391
TAKANEN, A.; DEMOTT, J.D.; MILLER, C.; KETTUNEN, A.: "Fuzzing for software security testing and quality assurance", Artech House, 2018
SUTTON, M.; GREENE, A.; AMINI, P.: "Fuzzing: brute force vulnerability discovery", Pearson Education, 2007
ADAMO, D.; KHAN, M.K.; KOPPULA, S.; BRYCE, R.: "Reinforcement learning for android gui testing", Proceedings of the 9th ACM SIGSOFT International Workshop on Automating TEST Case Design, Selection, and Evaluation, November 2018, pages 2-8
MNIH, V.; KAVUKCUOGLU, K.; SILVER, D.; RUSU, A.A.; VENESS, J.; BELLEMARE, M.G.; GRAVES, A.; RIEDMILLER, M.; FIDJELAND, A.K.; OSTROVSKI, G. et al.: "Human-level control through deep reinforcement learning", Nature, vol. 518, no. 7540, 2015, pages 529-533, XP037437579, DOI: 10.1038/nature14236
ZHENG, Y.; XIE, X.; SU, T.; MA, L.; HAO, J.; MENG, Z.; LIU, Y.; SHEN, R.; CHEN, Y.; FAN, C.: "Wuji: Automatic online combat game testing using evolutionary deep reinforcement learning", 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE), IEEE, November 2019, pages 772-784
VUONG, T.A.T.; TAKADA, S.: "A reinforcement learning based approach to automated testing of android applications", Proceedings of the 9th ACM SIGSOFT International Workshop on Automating TEST Case Design, Selection, and Evaluation, November 2018, pages 31-37, XP055688509, DOI: 10.1145/3278186.3278191
STILL, S.; PRECUP, D.: "An information-theoretic approach to curiosity-driven reinforcement learning", Theory in Biosciences, vol. 131, no. 3, 2012, pages 139-148, XP035096381, DOI: 10.1007/s12064-011-0142-z
TANG, H.; HOUTHOOFT, R.; FOOTE, D.; STOOKE, A.; CHEN, X.; DUAN, Y.; SCHULMAN, J.; DE TURCK, F.; ABBEEL, P.: "#Exploration: A study of count-based exploration for deep reinforcement learning", 31st Conference on Neural Information Processing Systems (NIPS), vol. 30, 2017, pages 1-18
PATHAK, D.; AGRAWAL, P.; EFROS, A.A.; DARRELL, T.: "Curiosity-driven exploration by self-supervised prediction", International Conference on Machine Learning, July 2017, pages 2778-2787
SUN, X.; LI, T.; XU, J.: "UI Components Recognition System Based On Image Understanding", 2020 IEEE 20th International Conference on Software Quality, Reliability and Security Companion (QRS-C), IEEE, December 2020, pages 65-71
YNION, J.C.: "Using AI in Automated UI Localization Testing of a Mobile App", 2020
SINGH, M.K.; FERNANDES, W.M.; RASHID, M.S.: "Robust UI Automation Using Deep Learning and Optical Character Recognition (OCR)", Proceedings of International Conference on Recent Trends in Machine Learning, IoT, Smart Cities and Applications, Springer, 2021, pages 33-44
Attorney, Agent or Firm:
KASTEL PATENTANWÄLTE PARTG MBB (DE)
Claims:
CLAIMS

1. A computer implemented method for automatically exploring HMI states and HMI transitions between said HMI states of an HMI device (14) that is arranged in a testing environment (12), wherein each HMI transition connects a first HMI state with a second HMI state via an HMI action, the method comprising: a) an AI agent (28, 30) using a reinforcement learning method for determining an HMI action to be performed by the HMI device (14), wherein the HMI action is selected from an allowed action set of HMI actions that are allowed to be performed by the AI agent (28, 30) based on a set of predetermined test conditions; b) the AI agent (28, 30) communicating the HMI action determined in step a) to the testing environment (12), wherein the testing environment (12): c) generates an HMI control signal that causes the HMI device (14) to change from the first HMI state to the second HMI state; and d) captures image data (38) from the HMI device (14), the image data (38) being indicative of a current screen associated with the second HMI state and comparing the current screen with a set of all known screens, wherein, if the current screen is not part of the set of all known screens, step e) is performed, otherwise step f) is performed; e) the AI agent (28, 30) performing semantic analysis on the current screen that was captured in step d), determining a semantic description of the current screen, combining the current screen and the semantic description into an annotated screen, and adding the annotated screen to the set of all known screens; f) the AI agent (28, 30) using the annotated screen previously obtained in step e) or selecting the annotated screen corresponding to the current screen from the set of all known screens and updating the allowed action set based on the annotated screen.

2. The method according to claim 1, characterized by a step g) of repeating the method from step a) until a predetermined abort condition is met and/or stopping the execution of the method, if the abort condition is met.

3. The method according to claim 2, characterized in that the abort condition is selected from a group consisting of a number of HMI transitions; allocated time; and failing to detect new HMI states within a predetermined time limit.

4. The method according to any of the preceding claims, characterized by a step of the AI agent (28, 30) outputting the set of all known screens having the annotated screens and/or a step of storing and outputting the encountered HMI transitions.

5. The method according to any of the preceding claims, characterized in that, in step a) the set of allowed actions and/or the set of test conditions are determined using natural language processing.

6. The method according to any of the preceding claims, characterized in that, in step e) the image data (38) of the current screen is broken down into text image data (44), that only include text-like elements, and graphics image data (46), that only include graphics-like elements, the text image data (44) and the graphics image data (46) are semantically analyzed separately, and the semantic description is combined from the separately analyzed text and graphics image data (44, 46).

7. The method according to claim 6, characterized in that, in the semantic analysis of the text image data (44) and the graphics image data (46), a set of candidate words (48) is generated for each of these data, wherein the candidate words (48) are indicative of the current screen.

8. The method according to claim 7, characterized in that, the sets of candidate words (48) are narrowed to a set of representative words (50), that is smaller than the set of candidate words (48), by vectorizing the sets of candidate words (48) using word embedding.

9. The method according to any of the preceding claims, characterized in that, in step a) the reinforcement learning method is a Q-learning method, wherein a pair of HMI state and HMI action has associated with it a Q-value, that is defined to memorize and capture temporal relations among HMI states and HMI actions from which the HMI action is generated.

10. A system (10) comprising a testing environment (12) with an HMI device (14) to be tested and a control unit (22) that is operatively coupled to the HMI device (14) and an agent environment (24) that is operatively coupled to the testing environment (12), wherein the system (10) is configured to perform a method according to any of the preceding claims, so as to explore the HMI states and/or HMI transitions of the HMI device.

11. A computer program, a machine readable storage medium, or a data signal that comprises instructions that, upon execution on a data processing device (27) and/or control unit (22), cause the device to perform one, some, or all of the steps of a method according to any of the preceding claims 1 to 10.

Description:
DESCRIPTION

Method for automatically exploring states and transitions of a human machine interface (HMI) device

TECHNICAL FIELD

The invention relates to a computer implemented method for automatically exploring HMI states and HMI transitions of an HMI device.

BACKGROUND

The prior art has proposed the use of computer vision and optical character recognition (OCR) techniques to detect user interface (UI) elements on human machine interface (HMI) software screens. Usually, human intervention is required to interact with the software under test (SUT), either in the form of manual input during testing or through automation scripts using picture templates and screen description databases, which are also prepared by human testers. Although this approach can detect faults on the HMI screens with high accuracy, it is not deemed scalable, as the human testers have to manually create the templates, descriptions, etc. for each screen to be tested.

An artificial neural network-based approach has the potential to generalize better, but additional overhead in the form of model training, maintenance and continuous deployment may be needed to ensure that the model can keep up with a shift in design principles of HMI devices/software over time. Moreover, even simple icon sets may also look different across projects even though they carry similar meanings.

When it comes to automating the exploration of the underlying structure of the HMI graphical user interface (GUI) software at the system level, the industry has been relying heavily on manual control by testers. Heuristic approaches such as random fuzzing have been proposed to improve testing. Although a random fuzzing exploration approach can be deployed out of the box without any prior knowledge of the SUT or model training, it can suffer from a lack of efficiency and does not necessarily guarantee total coverage.

Without the intelligence to understand the elements of UI screens in order to apply the correct set of action inputs, fuzzing-based testing is usually incapable of extracting and verifying the structure and information displayed on an HMI device like a human software testing engineer would, while adhering to the constraints imposed by the testing conditions.

Reference is made to the following documents:

[1] Takanen, A., Demott, J.D., Miller, C. and Kettunen, A., 2018. Fuzzing for software security testing and quality assurance. Artech House.

[2] Sutton, M., Greene, A. and Amini, P., 2007. Fuzzing: brute force vulnerability discovery. Pearson Education.

[3] Adamo, D., Khan, M.K., Koppula, S. and Bryce, R., 2018, November. Reinforcement learning for android gui testing. In Proceedings of the 9th ACM SIGSOFT International Workshop on Automating TEST Case Design, Selection, and Evaluation (pp. 2-8).

[4] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G. and Petersen, S., 2015. Human-level control through deep reinforcement learning. Nature, 518(7540), pp.529-533.

[5] Zheng, Y., Xie, X., Su, T., Ma, L., Hao, J., Meng, Z., Liu, Y., Shen, R., Chen, Y. and Fan, C., 2019, November. Wuji: Automatic online combat game testing using evolutionary deep reinforcement learning. In 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE) (pp. 772-784). IEEE.

[6] Vuong, T.A.T. and Takada, S., 2018, November. A reinforcement learning based approach to automated testing of android applications. In Proceedings of the 9th ACM SIGSOFT International Workshop on Automating TEST Case Design, Selection, and Evaluation (pp. 31-37).

[7] Still, S. and Precup, D., 2012. An information-theoretic approach to curiosity-driven reinforcement learning. Theory in Biosciences, 131(3), pp.139-148.

[8] Tang, H., Houthooft, R., Foote, D., Stooke, A., Chen, X., Duan, Y., Schulman, J., De Turck, F. and Abbeel, P., 2017. # exploration: A study of count-based exploration for deep reinforcement learning. In 31st Conference on Neural Information Processing Systems (NIPS) (Vol. 30, pp. 1-18).

[9] Pathak, D., Agrawal, P., Efros, A.A. and Darrell, T., 2017, July. Curiosity-driven exploration by self-supervised prediction. In International Conference on Machine Learning (pp. 2778-2787). PMLR.

[10] Sun, X., Li, T. and Xu, J., 2020, December. UI Components Recognition System Based On Image Understanding. In 2020 IEEE 20th International Conference on Software Quality, Reliability and Security Companion (QRS-C) (pp. 65-71). IEEE.

[11] Ynion, J.C., 2020. Using AI in Automated UI Localization Testing of a Mobile App.

[12] Singh, M.K., Fernandes, W.M. and Rashid, M.S., 2021. Robust UI Automation Using Deep Learning and Optical Character Recognition (OCR). In Proceedings of International Conference on Recent Trends in Machine Learning, IoT, Smart Cities and Applications (pp. 33-44). Springer, Singapore.

[13] CN 106201898 A

[14] US 2019012254 A1

[15] WO 2020086773 A1

[16] US 5542043 A

[17] US 2007022407 A1

[18] US 2019179732 A1

[19] US 2003126517 A1

[20] US 20180157386 A1

[21] US 2003229825 A1

[22] US 20150378876 A1

[23] US 20150339213 A1

Furthermore, reference is made to unpublished German patent application 10 2021 115 031.0, the disclosure of which is hereby incorporated by reference, in particular for purposes of enabling disclosure.

SUMMARY OF THE INVENTION

It is the object of the invention to improve testing methods for HMI devices, such as a vehicle dashboard, preferably with regard to efficiency, coverage, and time requirements.

The invention provides a computer implemented method for automatically exploring HMI states and HMI transitions between said HMI states of an HMI device that is arranged in a testing environment, wherein each HMI transition connects a first HMI state with a second HMI state via an HMI action, the method comprising: a) an AI agent determining an HMI action to be performed by the HMI device with a reinforcement learning method, wherein the HMI action is selected from an allowed action set of HMI actions that are allowed to be performed by the AI agent based on a set of predetermined test conditions; b) the AI agent communicating the HMI action determined in step a) to the testing environment, wherein the testing environment: c) generates an HMI control signal that causes the HMI device to change from the first HMI state to the second HMI state; and d) captures image data from the HMI device, the image data being indicative of a current screen associated with the second HMI state and comparing the current screen with a set of all known screens, wherein, if the current screen is not part of the set of all known screens, step e) is performed, otherwise step f) is performed; e) the AI agent performing semantic analysis on the current screen that was captured in step d), determining a semantic description of the current screen, combining the current screen and the semantic description into an annotated screen, and adding the annotated screen to the set of all known screens; f) the AI agent using the annotated screen previously obtained in step e) or selecting the annotated screen corresponding to the current screen from the set of all known screens and updating the allowed action set based on the annotated screen.

Preferably, step a) includes detecting image data displayed by the HMI device, the image data being indicative of the HMI state. Preferably, step a) includes hashing the image data in order to obtain a hash representation of the HMI state. Preferably, if a previously unencountered hash representation is encountered within a predetermined time interval or a predetermined number of HMI actions, the AI agent generates an HMI action that is determined by a curiosity-based reinforcement learning method that includes at least one curiosity measure that is defined for each pair of HMI state and HMI action. Preferably, step a) includes compiling a sequence of HMI actions that is determined by a deterministic finite automaton (DFA). Preferably, the AI agent sends the HMI action or a sequence of HMI actions to the testing environment.
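Purely as an illustration of the hashing mentioned above (the application does not specify an implementation), a minimal Python sketch could compare a hash of the captured image data against the hashes of previously seen screens; the names `screen_hash`, `is_new_screen`, and `known_screen_hashes` are hypothetical.

```python
import hashlib

# Hashes of all HMI screens encountered so far (hypothetical bookkeeping structure).
known_screen_hashes: set[str] = set()

def screen_hash(image_bytes: bytes) -> str:
    """Return a compact hash representation of a captured HMI screen."""
    return hashlib.sha256(image_bytes).hexdigest()

def is_new_screen(image_bytes: bytes) -> bool:
    """Return True and record the hash if this screen has not been encountered before."""
    h = screen_hash(image_bytes)
    if h in known_screen_hashes:
        return False
    known_screen_hashes.add(h)
    return True
```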

Preferably, a transition function of the DFA is updated to include a previously unencountered HMI transition from a first HMI state to a second HMI state, if the second HMI state was previously unencountered. Preferably, the reinforcement learning method is a Q-learning method, wherein each pair of HMI state and HMI action has associated with it a Q-value, that is defined to memorize and capture temporal relations among HMI states and HMI actions from which the HMI action to be sent is generated. Preferably, upon performing a particular HMI action, a corresponding curiosity measure is decreased.

Preferably, the Q-value is updated according to the following equation

Q_new(s, a) = Q_current(s, a) + α [β · curiosity(s, a) + γ · max_a Q(s', a) − Q_current(s, a)]

wherein Q_new is the updated Q-value, Q_current is the current Q-value, α is the learning rate, β is the curiosity coefficient, γ is the discount factor, curiosity(s, a) is the curiosity measure associated with HMI state s and HMI action a, and s' denotes a newly reached HMI state. Preferably, the HMI action is generated using an ε-greedy method, wherein the ε-greedy method chooses, with a predetermined probability of 1−ε, the HMI action that has the maximum Q-value, or chooses, with a predetermined probability of ε, the HMI action that has the maximum curiosity measure.
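As a sketch only, the curiosity-augmented Q-update and the ε-greedy selection described above could be written in Python as follows; the tabular storage, the coefficient values, and the curiosity decay factor are assumptions for illustration, not taken from the application.

```python
import random
from collections import defaultdict

# Learning rate, curiosity coefficient, discount factor, exploration probability (assumed values).
ALPHA, BETA, GAMMA, EPSILON = 0.1, 0.5, 0.9, 0.2

Q = defaultdict(float)                 # Q[(state, action)] -> Q-value
curiosity = defaultdict(lambda: 1.0)   # curiosity[(state, action)] -> curiosity measure

def update_q(s, a, s_next, actions_next):
    """Curiosity-augmented Q-learning update for the state-action pair (s, a)."""
    best_next = max((Q[(s_next, a2)] for a2 in actions_next), default=0.0)
    Q[(s, a)] += ALPHA * (BETA * curiosity[(s, a)] + GAMMA * best_next - Q[(s, a)])
    curiosity[(s, a)] *= 0.9           # performing the action decreases its curiosity measure (decay factor assumed)

def select_action(s, allowed_actions):
    """ε-greedy: with probability 1-ε exploit the maximum Q-value, otherwise pick the most curious action."""
    if random.random() < EPSILON:
        return max(allowed_actions, key=lambda a: curiosity[(s, a)])
    return max(allowed_actions, key=lambda a: Q[(s, a)])
```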

Preferably, the DFA determines the HMI transition that has the highest curiosity measure, wherein the DFA further identifies the shortest sequence of HMI actions that result in the HMI transition with the highest curiosity measure and outputs said sequence of HMI actions for sending.

Preferably the method comprises a step of storing encountered HMI transitions and encountered HMI states for further processing.

Preferably, the method comprises a step g) of repeating the method from step a) until a predetermined abort condition is met and/or stopping the execution of the method, if the abort condition is met.

Preferably, the abort condition is selected from a group consisting of a number of HMI transitions; allocated time; and failing to detect new HMI states within a predetermined time limit.

Preferably, the method comprises a step of the Al agent outputting the set of all known screens having the annotated screens and/or a step of storing and outputting the encountered the HMI transitions.

Preferably, in step a) the set of allowed actions and/or the set of test conditions are determined using natural language processing.

Preferably, in step e) the image data of the current screen is broken down into text image data, that only include text-like elements, and graphics image data, that only include graphics-like elements, the text image data and the graphics image data are semantically analyzed separately, and the semantic description is combined from the separately analyzed text and graphics image data. Preferably, in the semantic analysis of the text image data and the graphics image data, a set of candidate words is generated for each of these data, wherein the candidate words are indicative of the current screen.

Preferably, the sets of candidate words are narrowed to a set of representative words by vectorizing the sets of candidate words using word embedding.

The invention provides a system comprising a testing environment with an HMI device to be tested and a control unit that is operatively coupled to the HMI device and an agent environment that is operatively coupled to the testing environment, wherein the system is configured to perform a previously described method, so as to explore the HMI states and/or HMI transitions of the HMI device.

The invention provides a computer program, a machine readable storage medium, or a data signal that comprises instructions that, upon execution on a data processing device and/or control unit, cause the device to perform one, some, or all of the steps of a previously described method.

A fully automated HMI GUI software testing solution should have any of the following: i) Understanding of UI elements on the screens; ii) Decision making on allowed action inputs; iii) Mimicry of human input for full exploration of the software to detect potential defects and bugs.

For automotive applications, detection of HMI device/software design flaws is paramount, as it may impact user experience and may also have safety implications. Overall, the idea is directed at an AI agent that can conduct HMI GUI software exploration in an intelligent, autonomous manner without any human intervention, augmented by the cognitive ability to adjust its behavior during exploration on different parts of the software, which may serve different functional purposes such as user settings, entertainment, monitoring of the device under test (DUT), and so on. Previously, the unpublished German patent application 10 2021 115 031.0 disclosed a curiosity-based approach, which treats the software as a black box and does not incorporate any prior knowledge or guidance from domain experts. Other than using a DFA (deterministic finite automaton) to record the state-action transitions, the agent only marks the screen as visited/not yet visited by comparing the images but does not extract any information from the screen.

In actual scenarios software testing is typically done under a set of constraints, which may limit or expand the allowed actions of the agents. For example, the function of certain inputs may be different depending on which screen the software is displaying.

Certain test cases may also require the agent to have the cognitive ability, or at least visual perception, of each screen, for example to check whether all the icons are being displayed correctly or to avoid triggering certain functions (e.g. factory reset) that might disrupt the testing workflow.

Previously, only the exploration step was considered. Here, this idea is further developed and improved by describing how such an agent can effectively interact with and extract information from the testing environment, which typically includes hardware and software components for HMI device testing at the system level. With the ability to control both the hardware and software components in the test system, as well as to adjust its behavior according to the information displayed on the HMI device, the agent is able to automate the software path exploration and visual element identification parts of the testing workflow in a self-contained manner.

As a result, compared to traditional heuristic software exploration approaches (such as fuzzing), efficiency, coverage, (human) labor, time requirements, and variance in performance (“lottery effect”) due to randomness and the non-systematic nature of such approaches can be improved.

While the detection of UI elements on HMI devices with a template-based approach is the current industry standard and has a high design flaw detection rate, this approach is not scalable as it usually requires a significant amount of manual overhead from the testers to prepare the templates. Furthermore, the output cannot be readily used to support automation in software exploration as the template-based approach does not provide any semantic information regarding the screens that were explored.

Here, with AI it is possible to effectively extract information from the HMI DUT by fully integrating it into the testing environment. It is therefore possible to make use of the information to augment the AI’s decision making and generate additional value such as AI-based screen annotation and detection of design flaws on the UI screens of the HMI device.

The AI agent as disclosed here has the cognitive ability to “see” the HMI device states to be tested and can adapt accordingly to different testing conditions. To do so, certain behaviors or predefined rules can be hardwired into the agent to handle different scenarios.

The usually significant overhead of mapping the states discovered by AI-based software exploration to actual screens in the software can be avoided, since the novel AI agent understands the semantic meanings of the screens. As a result, human testers are no longer necessary for the testing process itself.

It should be noted that, while the invention is described with reference to a dashboard as an HMI device, the invention is not limited thereto. It can be readily applied to any type of HMI device with well-defined transitions among different GUI screens.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention are described in more detail with reference to the accompanying schematic drawings.

Fig. 1 depicts an embodiment of a system for automatically exploring states and transitions of a human machine interface (HMI) device;

Fig. 2 and

Fig. 3 depict an embodiment of a semantic analysis; and

Fig. 4 depicts a table with results comparing different methods.

DETAILED DESCRIPTION OF EMBODIMENTS

Referring to Fig. 1, a system 10 for automatically exploring HMI states and HMI transitions is depicted. The system 10 comprises a testing environment 12.

The testing environment 12 includes at least one HMI device 14 that is under test. The HMI device 14 may be a vehicle dashboard for an automobile, for example.

The testing environment 12 includes a framegrabber 16 that is operatively coupled to the HMI device 14 for grabbing image data of the screens that are being output by the HMI device 14.

The testing environment 12 further includes a test box 18 that is operatively coupled to the HMI device 14 and emulates a situation as if the HMI device 14 was installed in the vehicle.

The testing environment 12 includes a CAN signal simulator 20 that is configured to simulate a CAN bus. The CAN signal simulator 20 is operatively coupled to the test box 18, which allows sending the CAN signals to the HMI device 14 as control signals.

The testing environment 12 includes a control unit 22 that controls the testing environment 12. In particular the control unit 22 directly controls the framegrabber 16 and the CAN signal simulator 20.

The system 10 includes an agent environment 24. The agent environment 24 may comprise an API 26 that allows for communication with other devices (not shown).

The agent environment 24 has a data processing device 27, such as a general purpose computer, that is preferably operatively coupled to the API 26.

The agent environment 24 further includes an AI agent having an AI pathfinder module 28 and an AI visual element understanding module 30. The system 10 also has an output environment 32. The output environment 32 may comprise a data storage device for storing output data determined by the AI agent. The output environment 32 preferably includes a UI screen transition matrix 34. The UI screen transition matrix 34 includes all detected transitions from a first HMI state to a second HMI state and the corresponding HMI action that causes the HMI device 14 to transition from one to the other.
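For illustration only, the UI screen transition matrix 34 could be held as a mapping from a (first HMI state, HMI action) pair to the resulting second HMI state; the structure and names below are assumptions, not taken from the application.

```python
from typing import Dict, Tuple

# (first HMI state, HMI action) -> second HMI state (hypothetical representation of matrix 34)
transition_matrix: Dict[Tuple[str, str], str] = {}

def record_transition(first_state: str, action: str, second_state: str) -> None:
    """Store a detected HMI transition together with the HMI action that caused it."""
    transition_matrix[(first_state, action)] = second_state

# Hypothetical example: pressing "MENU" on the home screen leads to the settings screen.
record_transition("home_screen", "PRESS_MENU", "settings_screen")
```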

In the preparation stage, a set of allowed actions and test conditions are provided to the AI agent, specifically to the AI pathfinder module 28, preferably via the API 26. The test conditions may be prepared by the testers in natural language form, and an NLP module can be employed to interpret them and to trigger the modification of the allowed action set.

A set of icon files from the design document may also be provided to the AI visual element understanding module 30 to help the AI agent identify the visual elements on UI screens. The icons can be prelabelled by human testers or via object recognition methods using pretrained computer vision models.

The method includes a traversal decision-making step in which the decision making of the AI agent is powered by reinforcement learning (RL) methods. RL methods aim at learning optimal policies for certain desirable tasks by maximizing reward signals from the environment. However, clear reward signals extrinsic to the agents can be extremely sparse. In these scenarios, curiosity provides a mechanism to motivate agents to explore novel states which are not tied to immediate rewards, thereby allowing the agents to acquire new skills or knowledge that might reward them in the future. From the software exploration perspective, we formulate curiosity as an intrinsic motivation to explore areas less known to the agent. This formalism adaptively encourages the agent to explore less executed actions, which leads to a better understanding of the possible states and paths in the software.

As not all states are equally connected, to further explore hard-to-reach states, we propose a deterministic finite automaton (DFA) guided exploration strategy that provides high-level guidance for the RL agent to efficiently explore the HMI dashboard. In particular, the DFA records all the states and paths taken during the exploration. When the RL agent is trapped (i.e., it cannot discover new states within a given time budget or after a fixed number of operations), a path will be chosen from the DFA based on curiosity to further continue its exploration. A DFA can be described as a 5-tuple (S, A, δ, s0, F), where S is a finite set of states, A is a finite set of actions (the action space), δ is a transition function that maps a current HMI state s and an HMI action a to a new HMI state s', s0 is the initial state, and F is a finite set of states that cannot transit to other states. In particular, A refers to the action space, which can be modified depending on which screen the agent is currently at (this information is provided by the visual element understanding step in the previous cycle).
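As a minimal sketch of the 5-tuple just described, assuming states and actions are represented as strings (the class name and fields are illustrative, not part of the application):

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

@dataclass
class HmiDfa:
    states: Set[str] = field(default_factory=set)                     # S: finite set of HMI states
    actions: Set[str] = field(default_factory=set)                    # A: action space
    delta: Dict[Tuple[str, str], str] = field(default_factory=dict)   # δ: maps (s, a) to s'
    initial_state: str = "s0"                                         # s0: initial state
    final_states: Set[str] = field(default_factory=set)               # F: states without outgoing transitions

    def add_transition(self, s: str, a: str, s_next: str) -> None:
        """Record a newly explored transition, i.e. δ := δ ∪ {(s, a)}."""
        self.states.update({s, s_next})
        self.actions.add(a)
        self.delta[(s, a)] = s_next
```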

During the exploration, once a new transition (s, a) is explored, the DFA will be updated: δ := δ ∪ {(s, a)}. When the AI agent is trapped at certain states, the DFA can help identify the state with the highest curiosity value and guide the RL agent to that state directly by identifying the shortest path via all the transitions.
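The shortest action sequence to the most curious state can then be found with a breadth-first search over the recorded transitions; the sketch below assumes the transition function is stored as a (state, action) -> state dictionary, as in the hypothetical `HmiDfa` above.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

def shortest_action_sequence(delta: Dict[Tuple[str, str], str],
                             start: str, target: str) -> Optional[List[str]]:
    """Return the shortest sequence of HMI actions from start to target, or None if unreachable."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, path = queue.popleft()
        if state == target:
            return path
        for (s, action), s_next in delta.items():
            if s == state and s_next not in visited:
                visited.add(s_next)
                queue.append((s_next, path + [action]))
    return None
```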

For a more comprehensive description, reference is made to the unpublished German patent application 10 2021 115 031.0, which describes this in more detail.

The AI agent, specifically the AI pathfinder module 28, determines an HMI action from the set of allowed actions A.

In the interaction and information retrieval step, the HMI action determined by the AI agent is communicated via an API wrapper to the control unit 22, e.g. an execution server, of the testing environment 12, which comprises a testing automation platform that controls the signaling toolboxes (such as the test box 18), the HMI device 14 and other testing hardware. The HMI action is first sent to the data processing device 27. The data processing device 27 sends a command request to a TCP/IP server 36 as an intermediary, which in turn establishes a connection with the control unit 22.
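As an illustrative sketch only (host, port, and message format are assumptions, not the actual protocol of the testing platform), the command request from the data processing device 27 to the TCP/IP server 36 could look like this:

```python
import json
import socket

def send_hmi_action(action: str, host: str = "127.0.0.1", port: int = 5000) -> str:
    """Send a command request describing the selected HMI action and return the server's reply."""
    request = json.dumps({"command": "perform_hmi_action", "action": action}).encode("utf-8")
    with socket.create_connection((host, port)) as conn:
        conn.sendall(request)
        return conn.recv(4096).decode("utf-8")
```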

Subsequently, the control unit 22 interprets the HMI action selected by the AI agent and simulates the corresponding control signals (such as CAN signals) with the CAN signal simulator 20 in order to generate a response on the dashboard cluster, i.e. one or more HMI devices 14. The HMI action, or rather the control signal based on the HMI action, causes the HMI device 14 to perform an HMI transition from a first HMI state to a second HMI state. Each of the HMI states has associated with it a UI screen that includes text elements and/or visual elements.

Furthermore, upon receiving the HMI action, the control unit 22 initializes the framegrabber 16, which retrieves image data 38 of the current screen that is associated with the second HMI state after the HMI transition has occurred. The image data 38 may instead be retrieved in the form of a snapshot via a mounted camera as an alternative to the framegrabber 16. The image data 38 is then sent to the AI agent for visual element understanding.

Upon receiving the image data 38, the AI agent first performs a check on the current screen obtained from the HMI device 14 against all known screens. If the current screen is already known as an annotated screen, the UI screen transition matrix 34 is updated.

If the current screen is not yet identified, the image data 38 are sent to the AI visual element understanding module 30. The AI visual element understanding module 30 analyzes the current screen with the aim of understanding its semantic meanings, to make sure that its subsequent decision-making obeys the boundary imposed by the test conditions.

As depicted in more detail in Fig. 2 and Fig. 3, visual elements can be text elements 40 or graphic elements 42. Scene text detection techniques (such as “EAST” - An Efficient and Accurate Scene Text Detector), which are known per se, are used to identify regions of the screen with text, mask those areas, and produce two images. One image purely includes text image data 44, the other purely includes graphic image data 46.
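Assuming the scene text detector has already produced bounding boxes for the text regions, the split into text image data and graphic image data can be sketched with simple masking (illustrative NumPy code; the function name and box format are assumptions):

```python
import numpy as np

def split_text_and_graphics(image: np.ndarray, text_boxes):
    """Split a screen image into a text-only image and a graphics-only image.

    text_boxes: iterable of (x, y, width, height) rectangles reported by the text detector.
    """
    text_image = np.zeros_like(image)   # keeps only the detected text regions
    graphics_image = image.copy()       # keeps everything except the text regions
    for x, y, w, h in text_boxes:
        text_image[y:y + h, x:x + w] = image[y:y + h, x:x + w]
        graphics_image[y:y + h, x:x + w] = 0
    return text_image, graphics_image
```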

The textual information on the text image data 44 is extracted by OCR techniques (such as PyTesseract) while the graphical information on the graphic image data 46 is extracted by computer vision techniques, for example pattern recognition using the previously provided icon set. Finally, the semantic descriptions coming from the two separate pipelines (i.e. textual and graphical) are pooled together to produce a set of candidate words 48 to describe the current screen. By leveraging a word embedding method, the AI visual element understanding module 30 can vectorize the candidate words 48 to narrow them down to a few representative words 50, using methods such as finding the centroids and finding the word with an embedding closest to the vector averaged from the candidate words 48. This step can also be customized according to requirements, e.g. if a specific image labeling strategy is desired by the testers. The set of all known screens is updated with an annotated screen based on the current screen that was just analyzed. The annotated screen includes the associated current screen and a semantic description of the elements contained therein. The semantic description is formed from one or more of the representative words 50.
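One possible way to narrow the candidate words 48 down to representative words 50, following the centroid idea described above, is sketched below; `embed` stands for any pretrained word-embedding lookup and is a placeholder, not a specific library call.

```python
import numpy as np

def representative_words(candidate_words, embed, k: int = 3):
    """Return the k candidate words whose embeddings lie closest to the centroid of all candidates.

    embed: callable mapping a word to a NumPy vector (e.g. a pretrained word-embedding lookup).
    """
    vectors = {w: embed(w) for w in candidate_words}
    centroid = np.mean(list(vectors.values()), axis=0)
    ranked = sorted(candidate_words, key=lambda w: np.linalg.norm(vectors[w] - centroid))
    return ranked[:k]
```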

It should be noted that the image data 38 may be preprocessed in a known manner to obtain preprocessed image data 39 that are particularly suitable for the scene text detection techniques.

Using the information regarding the current screen that the AI agent is at, the AI agent updates its allowed action set A for the next exploration cycle accordingly. For example, if the test condition does not warrant changing of any user preferences (such as metric system, font size, font color, etc.), the AI agent will limit itself from triggering these settings should it find itself to be on the user preference screens.

Hence, during exploration the AI agent adaptively selects a subset of HMI actions from the allowed action space, according to the information the AI agent receives from the HMI device. With this, the AI agent only exhibits permissible behaviors at all times as it traverses/explores the various HMI states of the GUI, for example “avoid triggering system settings” or “do not change any user preferences”.
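A simple, purely illustrative way to express such a restriction of the allowed action set is a filter over the annotated screen and the test conditions; the action names and condition flags below are invented for the example.

```python
def update_allowed_actions(all_actions, annotated_screen, test_conditions):
    """Remove actions that the test conditions forbid on the current (annotated) screen.

    annotated_screen: dict with a 'description' list of representative words.
    test_conditions: dict of flags, e.g. {"change_user_preferences": False}.
    """
    allowed = set(all_actions)
    on_preferences_screen = "preferences" in annotated_screen.get("description", [])
    if on_preferences_screen and not test_conditions.get("change_user_preferences", True):
        # Hypothetical preference-changing actions excluded under this test condition.
        allowed -= {"SET_METRIC_SYSTEM", "SET_FONT_SIZE", "SET_FONT_COLOR"}
    return allowed
```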

The AI agent terminates the exploration phase once certain conditions are met, such as the total number of steps, the allocated time budget, or no more new states (screens) being found from all the known edge states. Subsequently, the AI agent produces two forms of output, namely the collection or set of annotated screens 52 with included semantic meanings, and the UI screen transition matrix 34 showing all the possible transition paths between the annotated screens 52. These outputs can be readily used for debugging by matching to the requirement document or for generating test cases.

Advantageously, the inventive method is capable of exploring, interacting, and collecting information from the DUT end to end with minimal prior assumptions, i.e. it is agnostic with respect to the DUT. In other words, this framework can be readily extended to other HMI devices in other functional domains with different design principles.

Using a reinforcement learning approach, the agent accumulates its knowledge of the DUT on the fly as it explores - no training or re-training is required. This eliminates the need for training data and reduces maintenance of the model, allowing a “plug and play” feature and thus easier usage and deployment for testers who may not have the relevant AI background.

The ideas presented herein combine identifying the visual elements on the screens and using a heuristic approach to explore software paths into a self-contained AI agent with the cognitive ability to adapt to different test conditions in an intelligent manner.

This approach enables the collection of required information for software testing, such as connecting paths, texts, visual designs, etc., via AI independently (i.e. without the assistance of a human tester during the process; it can be left running on the machine overnight), which is otherwise carried out manually in the current industry standard.

Connecting paths between screens and UI designs are required inputs for test case generation, which may go through frequent updates during iterations of the software development cycle. By automating the update of this information using AI, human testers can be relieved of these laborious efforts, thus improving their productivity and the software quality. With the measures described herein, one can achieve the highest state and path coverage as compared to other fuzzing and AI approaches.

As shown in Fig. 4, the AI agent can achieve 95% state and path coverage by running continuously for 12 hours without human intervention, as compared to 100% coverage, which requires 50 hours of manual labor by human testers. In other words, the invention makes it possible to cut 50 man-hours of labor for the testing of each release.

Although not explicitly described, the invention has the potential to be extended to other fields that involve path navigation between two nodes in a graph with well-defined discrete state transitions, such as auto-routing in the design of circuit boards, wiring of sensory devices such as traffic signaling networks, interconnection of power grids, and graphical object search in geospatial information systems (GIS).

REFERENCE SIGNS

10 system

12 testing environment

14 HMI device

16 framegrabber

18 test box

20 CAN signal simulator

22 control unit

24 agent environment

26 API

27 data processing device

28 AI pathfinder module

30 AI visual element understanding module

32 output environment

34 UI screen transition matrix

36 TCP/IP server

38 image data

39 preprocessed image data

40 text elements

42 graphic elements

44 text image data

46 graphic image data

48 candidate word

50 representative word

52 annotated screens