Title:
METHOD FOR MONITORING AN ANALYTICAL SYSTEM FOR STREAM DATA
Document Type and Number:
WIPO Patent Application WO/2019/004859
Kind Code:
A1
Abstract:
The invention relates to a method for monitoring intended operation of an analytical system (10) for stream data, wherein the analytical system (10) comprises software components for processing stream data that are received by the analytical system (10). A test component (20) is integrated in the analytical system (10). The method performs the following steps at the runtime of the analytical system: transmission of at least one test datum (TD) having a prescribed, respectively explicit content to the analytical system (10) for processing by the analytical system (10), wherein the test datum (TD) corresponds to the stream data processed by the analytical system (10) in terms of type and/or in terms of format; reception of the at least one test datum (TD) and reading of the content of the at least one test datum (TD); evaluation of the content of the at least one received test datum (TD) by means of a comparison with the prescribed content during transmission of the at least one test datum (TD).

Inventors:
PYAYT ALEXANDER LEONIDOVICH (RU)
VINOGRADOV SERGEY VALERIEVICH (RU)
Application Number:
PCT/RU2017/000476
Publication Date:
January 03, 2019
Filing Date:
June 30, 2017
Assignee:
SIEMENS AG (DE)
International Classes:
G06F11/36
Domestic Patent References:
WO2000007312A1 2000-02-10
Foreign References:
US20050251298A1 2005-11-10
Other References:
None
Claims:
PATENT CLAIMS

1. A method for monitoring intended operation of an analytical system (10) for stream data, wherein the analytical system (10) comprises software components for processing stream data that are received by the analytical system (10), wherein the method involves a test component (20) being integrated into the analytical system (10), which test component performs the following steps at the runtime of the analytical system: transmission of at least one test datum (TD) having a prescribed, respectively explicit content to the analytical system (10) for processing by the analytical system (10), wherein the test datum (TD) corresponds to the stream data processed by the analytical system (10) in terms of type and/or in terms of format; reception of the at least one test datum (TD') and reading of the content of the at least one test datum (TD'); evaluation of the content of the at least one received test datum (TD') by means of a comparison with the prescribed content during transmission of the at least one test datum (TD).

2. The method as claimed in claim 1, in which the at least one test datum (TD) is transmitted at periodic or irregular intervals of time.

3. The method as claimed in claim 1 or 2, in which a log of the content of the at least one received test datum (TD') and/or of the prescribed content of the at least one test datum (TD) for transmission is stored for a later evaluation.

4. The method as claimed in one of the preceding claims, in which the evaluation of the content comprises whether a change in the prescribed, respectively explicit content that results from the processing by the analytical system (10) matches a result, ascertained by the test component (20), from the prescribed, respectively explicit content.

5. The method as claimed in one of the preceding claims, in which a hash value for the prescribed, explicit content of the at least one test datum (TD) for transmission is ascertained by means of a prescribed hash function, and the ascertained hash value is stored together with an identifier of the test datum (TD) for transmission.

6. The method as claimed in claim 5, in which a further hash value is ascertained for the read content of the at least one received test datum (TD') and is compared with the stored hash value whose associated identifier matches the identifier of the received test datum (TD').

7. The method as claimed in claim 6, in which the presence of a loss of data is inferred if the hash value and the further hash value do not match.

8. The method as claimed in one of the preceding claims, in which a first timestamp or a first sequence number is coded into the prescribed, explicit content of the at least one test datum (TD) or is stored together with an identifier of the at least one test datum (TD) before the at least one test datum (TD) is transmitted.

9. The method as claimed in claim 8, in which reception of the at least one test datum (TD') prompts the read content of the at least one test datum (TD') to be augmented by a second timestamp or a second sequence number and stored or prompts the second timestamp or the second sequence number to be stored together with the identifier of the at least one received test datum (TD').

10. The method as claimed in claim 9, in which the order of the first timestamps or first sequence numbers of a plurality of test data (TD) sent in succession is compared with the order of the second timestamps or second sequence numbers of the test data received in chronological succession.

11. The method as claimed in claim 9, in which an absence of data consistency is inferred if the orders do not match.

12. A computer program product that can be loaded directly into the internal memory of a signal processing unit and comprises software code sections that are used to carry out the steps according to one of the preceding claims when the product runs on the signal processing unit.

Description:
METHOD FOR MONITORING AN ANALYTICAL SYSTEM FOR STREAM DATA

The invention relates to a method for monitoring an analytical system for stream data, wherein the analytical system comprises software components for processing the stream data that are received by the analytical system.

Uninterrupted monitoring of a technical system, e.g. the vibration monitoring of a machine, during operation of said machine produces a continuous data stream that represents a time-dependent signal. The continuous data stream, which comprises what are known as stream data, is processed by means of what is known as an analytical system. The processing of the continuous data stream allows monitoring of the functions of the technical system in real time. Furthermore, it is expedient for the intended operation of the analytical system used for evaluating the stream data to be ensured too.

Such an analytical system consists of a multiplicity of different software components that are each developed and tested on their own. These are then combined with one another in the analytical system in a manner provided for in advance. Problems frequently arise in this case, such as e.g. poor connections, erroneous processing in real time or a loss of data. To be able to ensure reliable operation of the analytical system, it is therefore also necessary to monitor operation thereof.

It is known practice to initially test such an analytical system on startup by supplying test data having a prescribed content to the analytical system instead of the stream data to be monitored and analyzing the processing result provided by the analytical system. However, on account of the multiplicity of different software components, some of which are proprietary components and some of which are standardized (what are known as "off-the-shelf") components, an analytical system that is already operating can also produce erroneous processing results.

It is an object of the invention to specify a method that allows intended operation of an analytical system to be monitored during ongoing operation too.

This object is achieved by a method according to the features of patent claim 1. Advantageous configurations arise from the dependent patent claims.

A method for monitoring intended operation of an analytical system for stream data is proposed. The analytical system comprises software components for processing stream data that are received by the analytical system. The method involves a test component being integrated into the analytical system. The test component can be integrated as a one-off. The test component can remain in the analytical system for the life thereof. The test component performs the following steps at the runtime of the analytical system: transmission of at least one test datum having a prescribed, respectively explicit content to the analytical system for processing by the analytical system. In this case, the test datum corresponds to the stream data processed by the analytical system in terms of type and/or in terms of format. The at least one test datum is integrated into the stream data, i.e. introduced into the data stream of the stream data, that are supplied to the analytical system during intended operation of the analytical system. In a next step, the at least one test datum is received by the test component and the content of the at least one test datum is read. Subsequently, the content of the at least one received test datum is evaluated by means of a comparison with the prescribed content during transmission of the at least one test datum.

The steps of the method according to the invention that are indicated above can be performed in real time during the runtime of the analytical system. This allows the analytical system to be monitored during realtime operation too. Such monitoring particularly allows errors that are attributable to cross-component processing operations on the stream data to be detected.

The method is suitable regardless of whether the analytical system is compiled from proprietary software components or standardized components.

The approach of the method according to the invention is based on having at least one test datum transmitted by the test component and processed by the analytical system, particularly at periodic or irregular intervals of time. The result of the processing of the analytical system is evaluated by the test component. On the basis of the evaluation result, it is then possible to infer the presence of a problem for the analytical system. This approach is comparable with what are known as "health checks", as a result of which the monitoring can be provided online, continually and in a manner integrated into the analytical system. In particular, the method is distinguished in that despite the provision of the test component the operating sequence of the analytical system is not impaired during the monitoring.

The method particularly allows the monitoring of losses of data and of data consistency. According to one expedient configuration, a log of the content of the at least one received test datum and/or of the prescribed content of the at least one test datum for transmission is stored for a later evaluation. This configuration variant allows the test data to be evaluated at a time that is independent of their reception by the test component.
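The logging variant described above could be sketched as follows. This is only an illustration under stated assumptions: the class `MonitoringLog`, its methods and the byte-string payloads are hypothetical and not part of the application, and the sketch assumes that the analytical system is expected to pass the content through unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class MonitoringLog:
    """Hypothetical log kept by the test component for deferred evaluation."""
    sent: dict = field(default_factory=dict)      # identifier -> prescribed content
    received: dict = field(default_factory=dict)  # identifier -> content read on reception

    def record_sent(self, identifier, content):
        self.sent[identifier] = content

    def record_received(self, identifier, content):
        self.received[identifier] = content

    def evaluate(self):
        """Return identifiers whose received content differs from what was sent.

        Test data that were sent but never received are reported as well.
        """
        return sorted(
            identifier
            for identifier, content in self.sent.items()
            if self.received.get(identifier) != content
        )


log = MonitoringLog()
log.record_sent("td-1", b"vibration:0.42")
log.record_sent("td-2", b"vibration:0.43")
log.record_received("td-1", b"vibration:0.42")
log.record_received("td-2", b"vibration:0.4")  # content truncated during processing
mismatches = log.evaluate()  # can run at any later time, independently of reception
```

Because both sides are stored by identifier, the evaluation can be deferred or repeated without involving the running analytical system.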

According to a further expedient configuration, the evaluation of the content comprises whether a change in the prescribed, respectively explicit content that results from the processing by the analytical system matches a result, ascertained by the test component, from the prescribed, respectively explicit content. The result ascertained by the test component can be ascertained thereby prior to the transmission in a manner determined in advance, for example. This particularly easily allows the behavior of the analytical system to be monitored, since it is merely necessary to perform a comparison of the content contained in the received test datum with the result ascertained by the test component from the prescribed, respectively explicit content.
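A minimal sketch of this comparison, assuming purely for illustration that the analytical system is supposed to convert a reading from millimetres to metres. The field names and the functions `expected_result` and `matches_expected` are hypothetical.

```python
def expected_result(prescribed):
    """Result the test component computes in advance from the prescribed content.

    Assumption: the pipeline is meant to convert millimetres to metres.
    """
    return {"id": prescribed["id"], "value_m": prescribed["value_mm"] / 1000}


def matches_expected(prescribed, received):
    """Compare the content of the received test datum with the precomputed result."""
    return received == expected_result(prescribed)


td = {"id": "td-7", "value_mm": 2500}
ok = matches_expected(td, {"id": "td-7", "value_m": 2.5})       # processed as intended
faulty = matches_expected(td, {"id": "td-7", "value_m": 25.0})  # erroneous processing
```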

According to a further expedient configuration, a hash value for the prescribed, explicit content of the at least one test datum for transmission is ascertained by means of a prescribed hash function, and the ascertained hash value is stored together with an identifier of the test datum for transmission. The use of a hash function and the ascertainment of a hash value on the basis thereof allows ascertainment of whether a piece of information from the transmitted test datum has been lost during or by the processing by the analytical system.

In particular, a further hash value is ascertained for the read content of the at least one received test datum and is compared with the stored hash value whose associated identifier matches the identifier of the received test datum. If the hash value and the further hash value do not match, then the presence of a loss of data is inferred. In this case, the loss of data consists not in the complete loss of the test datum but rather in a loss of at least one portion of the content with which the test datum has been transmitted.
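The hash comparison can be illustrated with a short sketch. The application only speaks of a "prescribed hash function"; the choice of SHA-256 here, as well as the names `stored_hashes`, `register` and `data_lost`, are assumptions made for the example.

```python
import hashlib

stored_hashes = {}  # identifier -> hash value of the prescribed content


def register(identifier, content: bytes):
    """Before transmission: hash the prescribed content and store it by identifier."""
    stored_hashes[identifier] = hashlib.sha256(content).hexdigest()


def data_lost(identifier, received_content: bytes) -> bool:
    """After reception: hash the read content and compare with the stored value."""
    further_hash = hashlib.sha256(received_content).hexdigest()
    return further_hash != stored_hashes[identifier]


register("td-1", b"sensor=7;value=0.42")
intact = data_lost("td-1", b"sensor=7;value=0.42")  # content arrived complete
partial = data_lost("td-1", b"sensor=7;value=")     # part of the content is missing
```

Storing only the hash rather than the full content keeps the bookkeeping small. The sketch presupposes that the analytical system is not expected to alter the content of the test datum.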

According to a further expedient configuration, a first timestamp or a first sequence number is coded into the prescribed, explicit content of the at least one test datum or is stored together with an identifier of the at least one test datum before the at least one test datum is transmitted. This makes it possible to establish whether a plurality of test data transmitted by the test component are also received in the correct order. The order is correct when the received test data are received in the order in which they were previously transmitted. If this is not the case, then this can indicate erroneous operation of the analytical system.

In particular, there may be provision for reception of the at least one test datum to prompt the read content of the at least one test datum now to be augmented by a second timestamp or a second sequence number and stored or to prompt the second timestamp or the second sequence number to be stored together with the identifier of the at least one test datum. This allows the received test data to be evaluated for the correct order in a simple manner.

To this end, there may particularly be provision for the order of the first timestamps or first sequence numbers of a plurality of test data sent in succession to be compared with the order of the second timestamps or second sequence numbers of the test data received in chronological succession. This allows a consistency check on the analytical system with regard to the processing of the test data and generally the stream data in the correct order. By way of example, when the analytical system is operating correctly, the period of time that results for a test datum from the difference between the second and first timestamps must always be constant for all test data.

Hence, an absence of data consistency is inferred if the orders do not match.

The invention further proposes a computer program product that can be loaded directly into the internal memory of a signal processing unit and comprises software code sections that are used to carry out the steps of the method described herein when the product runs on the signal processing unit. The computer program product may be realized in the shape of a DVD, a CD-ROM, a USB memory stick and the like. The computer program product may also be present as a stored program, however, that can be loaded wirelessly or via a wired connection.

The invention for monitoring intended operation of an analytical system for stream data will now be described with reference to the enclosed drawings. The drawings and the embodiments described below are used to describe an exemplary embodiment but are not restricted thereto. In the figures below, like elements are provided with like reference symbols. In the drawings:

Fig. 1 shows a schematic depiction of an analytical system that is augmented in a manner according to the invention by a test component for monitoring the intended operation of the analytical system;

Fig. 2 shows a flowchart for identifying a loss of data; and

Fig. 3 shows a flowchart for identifying a data inconsistency.

An analytical system 10 shown in fig. 1 comprises a data access layer 11, a transport layer 12, one or more stream data service components 13 and one or more processing engines 14. Said components of the analytical system 10 are software components for processing stream data that are provided continuously e.g. by a monitored technical system.

The stream data supplied to the analytical system 10 are processed in a manner prescribed by the configuration of the components of the analytical system 10. The processing in this case is dependent, by way of example, on the design of the stream data service component(s) 13 and/or the processing engine 14 or generally the components 11-14. The respective properties of the individual components 11-14, regardless of whether proprietary or standardized (i.e. "off-the-shelf") components are involved, are known on the basis of preceding individual tests on the components 11-14. After the components 11-14 are combined in the analytical system 10, transfer of the stream data from one component 11-14 to the next, for example, can cause an unforeseen behavior to occur, such as e.g. a (partial) loss of data or a data inconsistency. The latter is caused by temporally delayed processing of individual data packets of the stream data, for example, as a result of which the chronological order of the stream data for processing may be undesirably modified.

Components known to a person skilled in the art in such an analytical system 10 are Apache Kafka as a message queue component or Apache Spark or Apache Storm as processing engines 14, for example. These components are used, as is known to a person skilled in the art, for load equalization and for distribution in order to ensure realtime data processing. The exact design of the analytical system 10 described in this respect is of secondary significance to the method according to the invention described below, since said method can be used in an analytical system of any form.

In a manner according to the invention, the analytical system 10 further comprises a test component 20. The test component 20, like the components of the analytical system 10, is a software component that is integrated in the analytical system 10 as a separate component. In this case, there may be provision for the test component 20 to be integrated into the analytical system 10 during the actual design of the analytical system 10.

The task of the test component 20 is to produce test data TD at periodic or irregular intervals of time and to transmit them to the components 11-14 at the runtime of the analytical system 10, so that said test data are processed as conventional stream data. The test data processed by the components 11-14 of the analytical system 10, which can be monitored for operation as intended using the test component 20, are then in turn received by the test component 20 as processed test data TD' and the content thereof is read. The processed test data TD' correspond to the received test data TD'. Subsequently, the content of the received test data TD' is evaluated by means of a comparison with the prescribed, respectively explicit content of the test data TD transmitted to the analytical system 10. In this case, the comparison is each time effected between a transmitted test datum TD and a received test datum TD' that have a matching identifier (e.g. in the header).

The test data TD transmitted by the test component 20 correspond in terms of type and/or in terms of format to the stream data produced by the system to be monitored. The test data TD can, to this end, be channeled into the stream data of the technical system, i.e. integrated into the data stream.
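Channeling test data into the data stream might look like the following sketch. The record layout (a dictionary with `id` and `value`), the `td-` identifier prefix and the fixed interval of three records are all assumptions chosen for the example, as are the function names.

```python
from itertools import count


def inject_test_data(stream, make_test_datum, every=3):
    """Yield the stream records, adding one test datum after every `every` records."""
    serial = count(1)
    for position, record in enumerate(stream, start=1):
        yield record
        if position % every == 0:
            yield make_test_datum(next(serial))


def make_test_datum(n):
    # Same keys and value types as a regular record, so the pipeline treats it
    # like ordinary stream data; the "td-" prefix serves as the identifier.
    return {"id": f"td-{n}", "value": 0.0}


stream = [{"id": f"m-{i}", "value": i * 0.1} for i in range(1, 7)]
mixed = list(inject_test_data(stream, make_test_datum))
ids = [record["id"] for record in mixed]
```

Here `ids` lists the six measurement records with a test datum interleaved after the third and the sixth. An irregular interval could be obtained by replacing the modulo test with any other trigger condition.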

Each test datum from the test data TD transmitted by the test component 20 comprises a prescribed, respectively explicit content. This prescribed, respectively explicit content is produced or provided by the test component 20. The monitoring is generally based on the evaluation of the content of the test data TD transmitted by the test component 20 and of the content of the (modified) test data TD' processed and received by the components of the analytical system 10. In general, the evaluation of the content comprises whether a change in the prescribed, respectively explicit content that results from the processing by the components of the analytical system 10 matches a result from the prescribed, respectively explicit content that is ascertained by the test component 20. In this regard, it is possible to store a log of the content of the at least one received test datum TD' and/or of the prescribed content of the transmitted test data TD for a later evaluation. Fundamentally, this is not necessary, since the evaluation can take place directly after reception of the previously transmitted and then received test data TD'.

Figs. 2 and 3 describe flowcharts for two aspects of the monitoring performed by the method.

The cycle shown in fig. 2 is used to check whether a loss of data occurs as a result of the components 11-14 of the analytical system 10. A loss of data is intended to be understood to mean the loss of a portion of the information sent in the test datum. When such a loss of data occurs in a test datum, it can be assumed that a loss of data takes place when the stream data are processed too. In this case, the cycle described in fig. 2 is repeated at periodic intervals. The description is provided with reference to a single test datum TD.

In a step 200, a test datum TD having a prescribed, respectively explicit content is provided by the test component 20. In step 210, the test datum TD is transmitted to the data access layer 11 of the analytical system 10 in order to be integrated into the stream data and processed by the further components 12 to 14 of the analytical system 10. In a step 220, a processing logic unit of the test component 20 ascertains and stores a hash value for the content of the transmitted test datum TD by means of a prescribed hash function. The hash value can be stored particularly together with an identifier of the transmitted test datum TD in order to be able to associate the read content with the transmitted test datum TD on reception of the test datum TD' from the components 11-14 of the analytical system 10.

Following reception of the test datum TD' processed by the components 11-14 of the analytical system 10, the content of the received test datum TD' is read in step 230. A further hash value is ascertained for the content of the received test datum TD', this being effected by the processing logic unit of the test component 20. Subsequently, there is a comparison between the stored hash value of the transmitted test datum TD and the now ascertained further hash value of the received test datum TD'.

If the two hash values do not match, the presence of a loss of data is inferred.

The cycle described in fig. 3 is used to check data consistency. Data consistency comprises particularly the check to determine whether the order of a plurality of transmitted test data matches the order of the test data TD' processed by the components 11-14 and received by the test component 20.

In step 300, the test data are prepared by the coding of a first timestamp or of a first sequence number in the prescribed, explicit content of the test data for transmission, this information being able to be stored together with an identifier of the test data for transmission before the test data TD are transmitted.

In step 310, the test data TD are transmitted to the components 11-14 of the analytical system 10, and processed by said components, sequentially, at regular or irregular intervals of time.

In step 320, test data TD' that were processed by the components 11-14 of the analytical system 10 are received. On reception of a respective received test datum TD', the read content is augmented by a second timestamp or a second sequence number and stored. Alternatively, the second timestamp or the second sequence number can be stored together with the identifier of the test datum just received.

In step 330, the order of the first timestamps or first sequence numbers of the plurality of successively sent test data TD is now compared with the order of the second timestamps or second sequence numbers of the test data received in chronological succession.

If the orders do not match, then an absence of data consistency can be inferred.
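Steps 300 to 330 can be condensed into a small sketch that uses sequence numbers stored together with the identifiers. The function name `orders_match` and the dictionary layout are hypothetical; timestamps could be used in the same way as long as they are comparable.

```python
def orders_match(first_seq, second_seq):
    """Check that test data were received in the order in which they were sent.

    first_seq:  identifier -> first sequence number (assigned before transmission)
    second_seq: identifier -> second sequence number (assigned on reception)
    """
    sent_order = sorted(first_seq, key=first_seq.get)
    received_order = sorted(second_seq, key=second_seq.get)
    return sent_order == received_order


first = {"td-1": 1, "td-2": 2, "td-3": 3}
in_order = orders_match(first, {"td-1": 1, "td-2": 2, "td-3": 3})
swapped = orders_match(first, {"td-1": 1, "td-3": 2, "td-2": 3})  # td-3 overtook td-2
```

In this sketch a test datum that never arrives also makes the two orders differ, so a complete loss would be reported as an inconsistency as well.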

The proposed method creates a high level of confidence in the correct operation of the analytical system 10 at runtime. This is achieved by the regular transmission of prescribed test data that are integrated into the stream data. On the basis of the identifiers, the test data processed by the components of the analytical system can be identified by the test component and picked out from the stream data for further analysis.

Hash values and timestamps can therefore be used to establish losses of data and data inconsistencies during operation of the analytical system. This monitoring can take place at the runtime of the analytical system.
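Picking the processed test data out of the stream data on the basis of their identifiers, as mentioned above, could be sketched like this. The `td-` prefix convention and the function `split_stream` are assumptions for the example; the application only states that an identifier is used, for instance in the header.

```python
def split_stream(records, is_test_datum):
    """Separate processed test data from regular stream data by identifier."""
    test_data, stream_data = [], []
    for record in records:
        (test_data if is_test_datum(record) else stream_data).append(record)
    return test_data, stream_data


processed = [
    {"id": "m-1", "value": 0.1},
    {"id": "td-1", "value": 0.0},
    {"id": "m-2", "value": 0.2},
]
test_data, stream_data = split_stream(
    processed, lambda record: record["id"].startswith("td-")
)
```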

List of reference symbols

10 Analytical system

11 Data access layer

12 Transport layer

13 Stream data service component

14 Processing engine

20 Test component

TD Transmitted test datum

TD' Received test datum

200 Method step

210 Method step

220 Method step

230 Method step

300 Method step

310 Method step

320 Method step

330 Method step