Title:
SYSTEM AND METHOD FOR PLANT WIDE ASSET MANAGEMENT
Document Type and Number:
WIPO Patent Application WO/2013/041440
Kind Code:
A1
Abstract:
A system for plant wide asset management of a large scale production plant (20) comprises a processing unit (PU) for generating output data (O_1, O_2) from input data (I_1, I_2), a data source unit (DS, A/D) for providing the input data (I_1, I_2) based on at least one signal relating to a production process of the production plant (20), a data visualization unit (M) and/or a command signal generation unit (D/A) and at least one data interface (I/O) for transmitting the input data (I_1, I_2) from the data source (DS, A/D) to the processing unit (PU) and for transmitting the output data (O_1, O_2) from the processing unit (PU) to the data visualization unit (M) and/or to the command signal generation unit (D/A). The processing unit (PU) is adapted to determine a plant performance indicator PPI from the input data (I_1, I_2), to detect an undesired variation in the PPI (4), to isolate a root cause section (9, 19) of the production plant (20) based on characteristics of the PPI (5), to perform root cause analysis (6) on the isolated root cause section (9, 19) for isolating at least one root cause asset (10, 18) of the production plant (20), to diagnose the at least one root cause asset (10, 18) for determining a cause for the undesired variation in the PPI (7) and to determine as output data (O_1, O_2) at least one corrective action (11), where the at least one corrective action is to be performed on the production plant (20) for eliminating the cause.

Inventors:
SAND GUIDO (DE)
CHIOUA MONCEF (DE)
XU CHAOJUN (DE)
SCHMIDT WERNER A (DE)
SCHLAKE JAN CHRISTOPH (DE)
HORCH ALEXANDER (DE)
Application Number:
PCT/EP2012/067958
Publication Date:
March 28, 2013
Filing Date:
September 13, 2012
Assignee:
ABB TECHNOLOGY AG (CH)
SAND GUIDO (DE)
CHIOUA MONCEF (DE)
XU CHAOJUN (DE)
SCHMIDT WERNER A (DE)
SCHLAKE JAN CHRISTOPH (DE)
HORCH ALEXANDER (DE)
International Classes:
G05B23/02
Domestic Patent References:
WO1998014300A1, 1998-04-09
Foreign References:
US20060073013A1, 2006-04-06
US20090012653A1, 2009-01-08
US20100223500A1, 2010-09-02
Other References:
THORNHILL, N.F.; HUANG, B.; ZHANG, H.: "Detection of multiple oscillations in control loops", JOURNAL OF PROCESS CONTROL, vol. 13, 2003, pages 91 - 100, XP002554541, DOI: doi:10.1016/S0959-1524(02)00007-0
THORNHILL, N.F.; SHAH, S.L.; HUANG, B.; VISHNUBHOTLA, A.: "Spectral principal component analysis of dynamic process data", CONTROL ENGINEERING PRACTICE, vol. 10, 2002, pages 833 - 846, XP055045322, DOI: doi:10.1016/S0967-0661(02)00035-7
THORNHILL, N.F.: "Finding the source of nonlinearity in a process with plant-wide oscillation", IEEE TRANSACTIONS ON CONTROL SYSTEM TECHNOLOGY, vol. 13, 2005, pages 434 - 443, XP011130701, DOI: doi:10.1109/TCST.2004.839570
BAUER, M.: "Data driven methods for process analysis", PHD THESIS UNIVERSITY OF LONDON, 2005
Attorney, Agent or Firm:
KOCK, Ina (GF-IP, Wallstadter Str. 59, Ladenburg, DE)
Claims:
Claims

1. System for plant wide asset management of a large scale production plant (20) comprising

• a processing unit (PU) for generating output data (O_1, O_2) from input data (I_1, I_2),

• a data source unit (DS, A/D) for providing the input data (I_1, I_2) based on at least one signal relating to a production process of the production plant (20),

• a data visualization unit (M) and/or a command signal generation unit (D/A),

• at least one data interface (I/O) for transmitting the input data (I_1, I_2) from the data source (DS, A/D) to the processing unit (PU) and for transmitting the output data (O_1, O_2) from the processing unit (PU) to the data visualization unit (M) and/or to the command signal generation unit (D/A),

• where the processing unit (PU) is adapted to determine a plant performance indicator PPI from the input data (I_1, I_2),

characterized in that

• the processing unit (PU) is further adapted to

i. detect an undesired variation in the PPI (4),

ii. isolate a root cause section (9, 19) of the production plant (20) based on characteristics of the PPI (5),

iii. perform root cause analysis (6) on the isolated root cause section (9, 19) for isolating at least one root cause asset (10, 18) of the production plant (20),

iv. diagnose the at least one root cause asset (10, 18) for determining a cause for the undesired variation in the PPI (7),

v. determine as output data (O_1, O_2) at least one corrective action (11), where the at least one corrective action (11) is to be performed on the production plant (20) for eliminating the cause.

2. System according to claim 1, where the at least one corrective action (11) is one of changing an operating parameter of the at least one root cause asset (18), adjusting control parameters of a control loop containing the at least one root cause asset (18), switching the production process over to use an alternative asset instead of the at least one root cause asset (18) or raising a flag indicating that the at least one root cause asset (18) requires maintenance.

3. System according to claim 1 or 2, where the processing unit (PU) is adapted to isolate the root cause section (9, 19) by analyzing the time dependent behavior of at least two measured signals of the production process belonging to at least two different possible root cause sections (9, 19).

4. System according to claim 3, where the processing unit (PU) is adapted to isolate the root cause section (9, 19) based on the degree of similarity of the time dependent behavior of the at least two signals to the time dependent behavior of the PPI.

5. System according to any of the previous claims, where the processing unit (PU) is adapted to diagnose the at least one root cause asset (10, 18) for additionally determining the extent to which the at least one root cause asset (10, 18) causes the undesired variation in the PPI.

6. System according to claim 5, where the processing unit (PU) is adapted to label the at least one root cause asset (10, 18) as a critical asset and to store it in a data storage unit together with a quantitative relationship between a variation in the critical asset and a resulting variation in the PPI.

7. System according to claim 6, where the processing unit (PU) is adapted to predict future variations in the PPI from input data relating to the current behavior of at least one critical asset and/or from a predicted trend in the behavior of at least one critical asset.

8. System according to any of claims 5 to 7, where the processing unit (PU) is adapted to isolate the at least one root cause asset (10, 18) in the form of a list of root cause assets ranked according to the extent to which they cause the undesired variation in the PPI.

9. System according to any of the previous claims, where the processing unit (PU) is adapted to determine the at least one corrective action (11) in the form of a list of corrective actions ranked according to an expected success rate in reducing the undesired variation in the PPI.

10. System according to any of the previous claims, where the processing unit (PU) is adapted to analyze the PPI after the at least one corrective action (11) has been output with respect to a reduction of the undesired variation, to determine a level of success of the corresponding corrective action and to store the level of success in a data storage unit.

11. Method for plant wide asset management of a large scale production plant (20) comprising the steps:

• generating output data (O_1, O_2) from input data (I_1, I_2),

• providing the input data based on at least one signal relating to a production process of the production plant (20),

• determining a plant performance indicator PPI from the input data (I_1, I_2),

characterized by the steps:

i. detecting an undesired variation in the PPI (4),

ii. isolating a root cause section (9) of the production plant (20) based on characteristics of the PPI (5),

iii. performing root cause analysis on the isolated root cause section (10) for isolating at least one root cause asset of the production plant (6),

iv. diagnosing the at least one root cause asset (10) for determining a cause for the undesired variation of the PPI (7),

v. determining as output data (O_1, O_2) at least one corrective action (11), where the at least one corrective action (11) is to be performed on the production plant (20) for eliminating the cause.

Description:
System and method for plant wide asset management

Description

The invention relates to a system and a method for plant wide asset management of a large scale production plant, in particular in the discrete manufacturing and processing industry, where the system comprises a processing unit for generating output data from input data, a data source unit for providing the input data based on at least one signal relating to a production process of the production plant, a data visualization unit and/or a command signal generation unit, and at least one data interface for transmitting the input data from the data source to the processing unit and for transmitting the output data from the processing unit to the data visualization unit and/or to the command signal generation unit. In the system, the processing unit is adapted to determine a plant performance indicator PPI from the input data. Accordingly, the method comprises the steps of generating output data from input data, providing the input data based on at least one signal relating to a production process and determining a plant performance indicator PPI from the input data.

Plant asset management activities may be applied to a large variety of different application areas which contain among others the process industry, discrete manufacturing, pulp and paper, metals, chemical industry, electronics industry, energy industry, food and beverage, pharmaceuticals, cement etc. The production plants of all these industries have in common that they can be treated as so called large scale systems or large scale production plants.

Their common specific characteristics are that

• they are too complex to be modeled physically,

• they contain hundreds or thousands of control loops, both at high level and low level, and

• they provide thousands of measured signals.

The production plants under consideration are often automated to a high degree, which requires and finally leads to a large amount of measurement and control devices. Common automation systems collect and process data coming from different types of sensors and generate control instructions as well as signals or information as input for actuators or human decision makers.

Plant assets may be all kinds of plant components such as apparatuses, vessels, machines, pipes, pumps, valves as well as sensors and process control and communication devices. Plant assets are investments and means with a finite lifetime, which are used for the production of intermediate or final products. During their whole lifecycle plant assets are subject to performance degradation and/or complete breakdown. That means their "health", i.e. their performance level (taking both quantity and quality aspects into account) and/or their availability, is not static but changes over time, and accordingly follows a more or less dynamic process.

Diagnosis systems are used to measure and monitor the health of plant assets; they detect faults and identify their root causes using data-based and/or model-based methods. In contrast to defects of a plant asset, performance degradation can only be detected online while the plant is in operation. Apart from statistical evaluation of historical data, detecting performance degradation during an inspection is usually not possible.

Several aspects are to be involved in a holistic "Plant Asset Management" approach:

• Maintenance,

• Control loop monitoring,

• Fault detection and diagnosis,

• Operation of the system.

Diagnosing a given or specific plant asset health and deriving and implementing appropriate therapy actions require that the diagnosis systems and the process and control systems are integrated with each other and are cooperating. Applications and methodologies for linking diagnosis functionality with process control and optimization functionality are only known and available at the lower layers of the plant hierarchy, such as the layers for field components or field devices. The research in the area of FDI/FTC (fault detection and isolation/fault tolerant control) is limited to small and medium sized systems and concentrates rather on the health than on the performance of the system or system components. The main development effort in PAM (plant asset management) concentrates on the diagnostics of individual components rather than on taking a system-wide view.

An efficient plant asset management system has to provide a large variety of different aspects and functionalities, some of which are explained below:

• Data of different kinds is gathered from control loops, process measurements and the manufacturing execution system, where the data can be maintenance related, operation related, product related or business related. The data is stored in a central database or in various databases, and it is visualized in a structured and non-structured way.

• Thresholds for abnormal events are to be configured, and alarms are to be created accordingly.

• Additional information may be attached to a visualized signal, e.g. comments by the operator.

• Flags may be used, for example a 4 digit system, e.g. NAMUR Recommendation NE 107 / VDI 2650.

• The maintenance may be executed according to different philosophies, such as run-to-failure, preventive, condition-based or pro-active maintenance. Maintenance tasks need to be scheduled.

• Depending on the system status, production levels need to be adjusted to prevent break downs, for example via a change of set points for control loops.

• Performance Monitoring is performed by executing the following steps:

o PPIs (plant performance indicators) are defined or generated and visualized continuously, where a PPI is an indication for an overall equipment effectiveness (OEE) or an overall quality measure of the performance of the plant production. PPIs can for example relate to the quality of the final product, such as the moisture of paper in a paper mill, or to the time needed for the overall production process, the energy used for the overall production process or the waste which was produced,

o Finger prints (documentation of the current status) are created,

o Control loops are monitored and their performance is measured.

• Fault detection and diagnosis can be performed component-based (failure modes), model-based (for small, well modeled systems) or data-based (for medium sized systems).

• Visualization can be performed in a hierarchical way for systems, sub-systems and components, or according to the location in the plant, for example by showing on a map where to find what and by visualizing corresponding signals, data etc. Further, arrows may symbolize that one system follows another system (for example in a flowchart), and an impact may be visualized to indicate that, if a given event happens, a certain other event will happen with a given probability.

The health of plant components may be predicted, with the aim to reduce costs and efforts by replacing or repairing a particular component before it affects the whole process and in particular the product quality and/or quantity. During prediction, trends are generated model-based or data-based, using extrapolation. A traffic light system may be used to indicate with green that the system/component works fine, with yellow that the system/component might not be fine and with red that the system/component is not performing well.

Suggested therapy actions can be based on statistical data, i.e. by using failure probabilities, or on historical data, creating a list of actions to be taken in case of a given or specific failure, or on an expert system, which is fed with available data to decide what to do. Possible actions are to change a control strategy or to trigger maintenance. Currently, efforts are oriented towards managing the large amount of data derived from a large scale system and towards developing methods that make this data usable for plant asset management.

Available methods for plant asset management follow a bottom-up approach. Failure detection and diagnosis methods are implemented, if possible, on a component level, the lowest level possible. These methods work mainly component-based. So, for each asset a tailored method is applied, resulting in a vast number of different methods for a large scale system. Due to the large number of assets in such a large scale system, a lot of mostly unnecessary alarms are generated by these methods whenever the corresponding low level components exceed a predefined threshold. In most cases, the end user, e.g. the plant operator, has to decide manually whether such a failure is of relevance or not. Sometimes an automated expert system can also assist this decision process. Hence, in the worst case, all low level component failures may be treated by the operator with the same importance and in an equal manner, so that really important alarms may never be brought to the attention of the operator due to the flood of alarms. The state-of-the-art FDI/FTC and PAM methods particularly lack

• aggregation methodologies for large scale systems which condense and combine the distributed field data to plant asset health information on higher levels, e.g. on the layer of unit processes or the entire plant,

• models and metrics which enable an assessment of the actual plant asset health and its future evolution (prognosis), whereas in the case where a model is used it is generated bottom-up instead of top-down, and

• methodologies to systematically identify, select and implement beneficial therapy actions on various layers of the automation hierarchy.

Components very often behave much differently in a real application than in the lab or in a test environment/scenario, where diagnostic functions are evaluated and tested.

According to commonly known systems or methods a system-wide view on component monitoring in situ is still missing in reality since it needs to involve the physics around the component in question. This aspect is highly non-trivial.

Accordingly the object of the invention is to provide means for an improved plant asset management for a large scale production plant.

This object is achieved by a system and a method for plant wide asset management of a large scale production plant according to the independent claims. Further embodiments and developments of the invention are disclosed in the dependent claims and the following description.

In the system according to the invention the processing unit is adapted to detect an undesired variation in the PPI, to isolate a root cause section of the production plant based on characteristics of the PPI, to perform root cause analysis on the isolated root cause section for isolating at least one root cause asset of the production plant, to diagnose the at least one root cause asset for determining a cause for the undesired variation of the PPI, to determine as output data at least one corrective action, where the at least one corrective action is to be performed on the production plant for eliminating the cause.

The corresponding method performed by the system according to the invention comprises the steps of detecting an undesired variation in the PPI, isolating a root cause section of the production plant based on characteristics of the PPI, performing root cause analysis on the isolated root cause section for isolating at least one root cause asset of the production plant, diagnosing the at least one root cause asset for determining a cause for the undesired variation of the PPI, and determining as output data at least one corrective action, where the at least one corrective action is to be performed on the production plant for eliminating the cause.

The central aspect according to the invention is the top-down instead of bottom-up approach when dealing with large scale systems. As described above, large scale systems consist of a large number of assets: on component level, for example, valves, pumps and motors, and on sub-system level, for example, control loops and sections of the system. The behavior of these assets and the behavior of the large scale system itself are described by a huge amount of continuous data coming from the distributed control systems (DCS), quality control system (QCS) and maintenance planning system. In most cases the large scale system has more than 1000 signals. According to the invention, instead of monitoring each and every single asset in the large scale production plant, which requires considerable processing as well as human resources, one or several plant performance indicators, PPIs, are monitored and, if an undesired behavior is detected, corrective actions are determined automatically. This simplifies the task of a human operator considerably. During determination of the corrective action, a root cause analysis is performed on a reduced number of assets by identifying root cause sections. By this means, the processing effort is considerably reduced, resulting in a fast and direct response to a PPI failure.

The system and method according to the invention are generally applicable on a plant level, for example a whole pulp and paper plant, or on a system/sub-system level, for example a drying section of a pulp and paper plant, which means that the processing unit determines the PPI either for the overall plant or for the sub-system, respectively. The term root cause section is used for a sub-unit of the part of the overall production process for which the PPI is determined. In other words, if the PPI is defined on plant level, then a root cause section may be any part of the production plant, such as a production line, sub-system or control loop of the production plant, for example a dryer section or a wire section. If the PPI is defined for a sub-system of the plant, a corresponding root cause section may be any sub-part of this sub-system, such as a pre-defined group of assets.

With respect to the invention, the term asset is not limited to single plant components such as apparatuses, vessels, machines, pipes, pumps, valves, sensors and process control and communication devices, but also covers small logical groups of these components, such as control loops.

The predefinition of the PPI or PPIs may be performed by the plant owner and the respective operators of the plant. After that, almost all actions and measures taken automatically by the system and method for plant wide asset management have the goal to achieve a certain level of the PPIs. As a result, the size of the problem of managing the assets of the production plant is considerably reduced.

An undesired variation in the PPI can also be regarded as a PPI failure, which can for example manifest itself by an abnormal degradation or an oscillation of the PPI. An undesired variation may for example be recognized as soon as the amplitude or the frequency of an oscillation of the PPI exceeds a corresponding pre-defined threshold. There can be different thresholds for one PPI, such as an alarm threshold, with which a PPI failure is clearly detected, and a warning threshold, which may indicate that there will most probably be a future PPI problem.
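The two-threshold check described above can be sketched as follows. This is a minimal illustration only: the names PpiStatus, oscillation_amplitude and classify_ppi are hypothetical, and estimating the amplitude as half the peak-to-peak range of a sample window is an assumption made for the sketch, not a method taken from the description.

```python
from enum import Enum


class PpiStatus(Enum):
    OK = "ok"
    WARNING = "warning"
    ALARM = "alarm"


def oscillation_amplitude(samples):
    """Crude amplitude estimate (assumption): half the peak-to-peak range."""
    return (max(samples) - min(samples)) / 2.0


def classify_ppi(samples, warning_threshold, alarm_threshold):
    """Compare the amplitude of a window of PPI samples against two thresholds.

    The alarm threshold is checked first so that a clear PPI failure is
    never reported as a mere warning.
    """
    amplitude = oscillation_amplitude(samples)
    if amplitude > alarm_threshold:
        return PpiStatus.ALARM
    if amplitude > warning_threshold:
        return PpiStatus.WARNING
    return PpiStatus.OK
```

A frequency threshold could be handled in the same way, with a frequency estimate of the oscillation in place of the amplitude.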

The data source unit for providing the input data based on which the processing unit determines the PPI may be either a signal interface to a sensor in the production plant or a data storage unit, where measurements and other data are stored for later creation and calculation of the PPI.

The production plant may comprise at least one database or may have access to at least one external database, wherein all relevant and necessary data or information are accessibly stored and can be retrieved, in particular data and information which relates to operation measurements and/or distributed control systems (DCS) and/or quality control system (QCS) and/or maintenance data and/or business data, including time series of identified PPIs.

In the following, various embodiments of the invention are described with respect to the system. The steps performed by the system form corresponding embodiments of the above described method.

The at least one corrective action is one of changing an operating parameter of the at least one root cause asset, adjusting control parameters of a control loop containing the at least one root cause asset, switching the production process over to use an alternative asset instead of the at least one root cause asset, which could for example be achieved by changing the type of product to be produced, or raising a flag indicating that the at least one root cause asset requires maintenance.

The kind of corrective action may be chosen by the processing unit depending, among others, on at least one of the following items:

• failure type of the root cause asset,

• magnitude of the undesired variation in the PPI and/or of the asset failure,

• status of the asset failure, whether it is a predicted one or one that has in fact occurred,

• status of the root cause section, e.g. production unit running, stopped, under maintenance or a maintenance action being scheduled.

According to an embodiment, the processing unit is adapted to isolate the root cause section by analyzing the time dependent behavior of at least two measured signals of the production process belonging to at least two different possible root cause sections.

In particular, the processing unit may be adapted to isolate the root cause section based on the degree of similarity of the time dependent behavior of the at least two signals to the time dependent behavior of the PPI. For example, known correlation analysis methods may be applied to determine the degree of dependency between the PPI signal and the at least two signals.
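As an illustration of such a similarity-based isolation, the following sketch scores each candidate section by the absolute Pearson correlation between one of its signals and the PPI and selects the highest score. The Pearson coefficient is only one of the known correlation methods that could be applied here, and the function names and the one-signal-per-section layout are assumptions of the sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sample sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no information about the PPI
    return cov / math.sqrt(var_x * var_y)


def isolate_root_cause_section(ppi, section_signals):
    """Return the section whose signal is most similar to the PPI.

    section_signals maps a (hypothetical) section name to one measured
    signal of that section, sampled at the same instants as the PPI.
    """
    scores = {name: abs(pearson(ppi, signal))
              for name, signal in section_signals.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

In this sketch a section with several measured signals would need one score per signal, for example the maximum over its signals; how to aggregate them is left open.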

In a further embodiment, the processing unit is adapted to diagnose the at least one root cause asset for additionally determining the extent to which the at least one root cause asset causes the undesired variation in the PPI. So, the processing unit is not only able to detect which asset of the production plant causes at least part of the undesired behavior in the PPI but also to which extent it does that.

Even further, the processing unit may be adapted to label the at least one root cause asset as a critical asset and to store it in a data storage unit together with a quantitative relationship between a variation in the critical asset and a resulting variation in the PPI.

From this stored knowledge, the processing unit may advantageously predict future variations in the PPI from input data relating to the current behavior of at least one of the critical assets and/or from a predicted trend in the behavior of at least one critical asset. Accordingly, future PPI failures could be prevented by early intervention, for example by initiating maintenance of the critical asset before its deterioration in functionality affects the overall quality of the production process.
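A minimal sketch of such a prediction is given below. It assumes that the stored quantitative relationship is a single linear gain and that the trend of the asset is extrapolated linearly; both are simplifying assumptions, and the names CriticalAsset, predict_ppi_variation and extrapolate_trend are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CriticalAsset:
    name: str
    gain: float  # assumed linear: PPI variation per unit variation of the asset


def predict_ppi_variation(asset, asset_variation):
    """Predicted PPI variation for a given variation of the critical asset."""
    return asset.gain * asset_variation


def extrapolate_trend(samples, steps_ahead):
    """Extend the average slope of a sample window by steps_ahead samples."""
    if len(samples) < 2:
        raise ValueError("at least two samples are needed for a trend")
    slope = (samples[-1] - samples[0]) / (len(samples) - 1)
    return samples[-1] + slope * steps_ahead
```

Feeding the extrapolated asset behavior into predict_ppi_variation gives an estimate of a future PPI variation, which could then be compared with a warning threshold.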

The processing unit may further be adapted to isolate the at least one root cause asset in the form of a list of root cause assets ranked according to the extent to which they cause the undesired variation in the PPI. In that way, the most critical assets are dealt with first resulting in as fast an improvement in the PPI as possible.

The processing unit may even further be adapted to determine the at least one corrective action in the form of a list of corrective actions ranked according to an expected success rate in reducing the undesired variation in the PPI. This provides additional help in the decision process of an operator. The expected success rate is based mainly on past experience with a given type of corrective action, but may further depend on at least one of the following items:

• type of PPI where the undesired variation occurred,

• estimated influence of the specific corrective action on the PPI,

• availability of resources for the specific corrective action,

and it may be expressed in terms of a so called PPI-objective function.
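The ranking of corrective actions can be sketched as a simple sort. The sketch assumes that the expected success rate is already available as a single number per action and that actions without available resources are dropped from the list; both choices, as well as the names CorrectiveAction and rank_corrective_actions, are assumptions and not prescribed by the description.

```python
from dataclasses import dataclass


@dataclass
class CorrectiveAction:
    description: str
    expected_success_rate: float  # e.g. fraction of past successful applications
    resources_available: bool = True


def rank_corrective_actions(actions):
    """Return feasible actions, highest expected success rate first."""
    feasible = [a for a in actions if a.resources_available]
    return sorted(feasible, key=lambda a: a.expected_success_rate, reverse=True)
```

A PPI-objective function as mentioned above could replace the plain success rate as the sort key.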

In order to learn from previous actions, the processing unit may be adapted to analyze the PPI after the at least one corrective action has been output with respect to a reduction of the undesired variation in the PPI and thereby to determine a level of success of the corresponding corrective action and to store the level of success in a data storage unit. During future handling of PPI failures, the stored level of success may then for example be used to correct the value which is determined for the extent to which the corresponding root cause asset causes the undesired variation in the PPI as well as to adjust the ranking in the resulting lists of root cause assets and/or corrective actions accordingly.
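One possible way to compute and store a level of success is sketched below. The level of success is assumed here to be the relative reduction of the undesired variation, clipped to the range 0 to 1, and the data storage unit is represented by an in-memory class; the names level_of_success and SuccessStore are hypothetical.

```python
def level_of_success(variation_before, variation_after):
    """Relative reduction of the undesired variation, clipped to [0, 1]."""
    if variation_before <= 0:
        return 0.0  # nothing to reduce
    reduction = (variation_before - variation_after) / variation_before
    return max(0.0, min(1.0, reduction))


class SuccessStore:
    """In-memory stand-in for the data storage unit (assumption)."""

    def __init__(self):
        self._history = {}  # action type -> list of recorded success levels

    def record(self, action_type, success):
        self._history.setdefault(action_type, []).append(success)

    def expected_success(self, action_type, default=0.5):
        """Mean of the recorded levels, or a default if nothing is stored."""
        levels = self._history.get(action_type)
        if not levels:
            return default
        return sum(levels) / len(levels)
```

The mean returned by expected_success could then serve as the expected success rate when the list of corrective actions is ranked the next time a PPI failure is handled.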

The invention and its further embodiments will become apparent from the examples described below in connection with the appended drawings which illustrate:

Fig. 1 an overview of a system for plant wide asset management of a large scale production plant,

Fig. 2 an illustration of the workflow of a method for plant wide asset management of a large scale production plant,

Fig. 3 method steps corresponding to Fig. 2,

Fig. 4 a time dependent PPI signal with its thresholds,

Fig. 5 an example of a PPI monitoring screen,

Fig. 6 an example for how to isolate a root cause section,

Fig. 7 an example of a screen representation of root cause asset determination,

Fig. 8 elements of a large scale production plant.

As is shown in Fig. 1, the system for plant wide asset management of a large scale production plant comprises a processing unit (PU) for generating output data, here shown as a first output data signal O_1 and a second output data signal O_2, from input data, shown as a first input data signal I_1 and a second input data signal I_2. A data source unit for providing the input data can be a signal interface unit, such as an analogue-to-digital converter A/D, which receives a signal measured directly in the production plant by a sensor and which transforms it into the first input data signal I_1. In the alternative, the data source unit may also be a data storage unit DS where input data derived from signals relating to a production process of the production plant and/or from measured signals are stored, at least temporarily, and from where the second input data signal I_2 can then be provided to the processing unit PU as needed. The first input data signal I_1 may for example be a direct digital representation of a sensor signal, such as in a pulp and paper plant the measured moisture of the final paper or paper board which is output from the overall production process. The moisture signal may represent a PPI in itself, since it is an essential indicator of the quality of the end product. The second input data signal I_2 may for example be an indication of the speed of production derived from several signals such as start time and end time of a certain production process. Other examples for PPIs measured or determined in a paper mill are the basis weight, which, like the moisture, is also an indicator for the quality of the end product, or the overall equipment effectiveness (OEE).

From these input signals, the processing unit detects a possible failure in the production process by observing at least one of the PPIs of the production plant and proposes at least one corrective action which, in the form of the first and/or second output data signals O_1, O_2, is then provided to a data visualization unit M and to a command signal generation unit D/A, respectively. The digital-to-analogue converter in this example transforms the second output data signal O_2 into an electrical signal which may be sent directly to an actuating unit of the production plant, such as a motor or pump. In the alternative, the command signal generation unit may be implemented as a processing unit belonging for example to the distributed control system DCS of the production plant, where the processing unit then applies the corrective action proposed via the second output data signal O_2 to the DCS in an appropriate manner. The first and second output data signals O_1 and O_2 may either represent different corrective actions or they may stand for two different informational realizations of one and the same corrective action, one in the form of a visualization signal and the other in the form of a data communication signal. A data interface I/O is provided in the system for transmitting the first and second input data I_1, I_2 from the data sources A/D and DS to the processing unit PU and for transmitting the first and second output data O_1, O_2 from the processing unit PU to the data visualization unit and/or to the command signal generation unit. Instead of a combined data input and output interface, two separate units could also be provided, one for transmitting input data and the other for transmitting output data only.

How the processing unit PU may determine the corrective actions will now be explained with respect to the other figures. Fig. 2 shows an illustration of the workflow of a method for plant wide asset management of a large scale production plant performed by the system of Fig. 1, where the pyramid in the middle represents the production plant 20 with its different levels of abstraction of monitoring the operation of the production plant. Level 1 is the plant level, level 2 the section level, where sub-systems or sub-units of the overall production process are regarded, and level 3 is the asset level, relating to individual field components, in particular singular actuating elements, of the production process. A different graphical representation of the structure of a production plant, and in particular a large scale production plant, is shown in Fig. 8, which is explained here briefly in comparison to the representation of Fig. 2. In Fig. 8, the small squares represent assets 18 of the plant, the larger rectangles represent sections 19 of the plant and the outermost rectangle surrounding all assets 18 and sections 19 is the plant 20 itself. In production plant 20 at least one end product is produced from at least one input product or material. A section 19 is a sub-system of the plant which produces one or more intermediate products or which produces the end product from at least one intermediate product. The sections 19 are interconnected. For at least some of the assets 18, so called asset performance indicators 22 (API) may be defined. The plant 20 may be regarded as a large scale production plant when the assets 18 are numerous, e.g. more than a thousand, when they deliver numerous measurements and probably also APIs, and when there is at least one plant performance indicator 21 (PPI) used to indicate the overall performance of the plant. Traditional asset management focuses on the asset level 3 (Fig. 2), i.e. the assets and their APIs are monitored individually in order to detect and diagnose a fault in the plant. This is also called the bottom-up approach.

In contrast, Fig. 2 illustrates the top-down approach of the invention. The method starts on the plant level 1 with the detection of an undesired variation in a PPI in a step referenced by 4. In the next step, at reference 5, a root cause section is isolated based on characteristics of the PPI, which may then be validated either by interaction with a human validator, in a step referenced by 8, or by automated validation, in a step referenced by 12. The result is a validated root cause section, indicated by reference 9.

The method has now moved from the uppermost plant level 1 to the section level 2. Here, a root cause analysis is performed on the validated root cause section 9 in a step referenced by 6 in order to isolate at least one possible root cause asset. The at least one root cause asset may then again be validated automatically or by human interaction, resulting in at least one validated root cause asset 10.

Thereby, the method has moved one level further down towards the asset level 3. The at least one validated root cause asset 10 is then diagnosed in order to determine a cause for the undesired variation in the PPI and at least one corrective action is determined to eliminate the cause, both performed during a step referenced by 7. The at least one corrective action may be validated as well, in the steps referenced by 8 or 12, respectively, and the output of the method is at least one validated corrective action 11 which is then applied to the production plant in order to improve the PPI again.

The essential method steps performed mainly by the PU of Fig. 1 are in addition summarized in Fig. 3. There, in a first step, Step 1, a PPI is determined based on measured signals from the production process. In a second step, Step 2, an undesired variation in the PPI is detected, and in a third step, Step 3, a root cause section is isolated based on characteristics of the PPI. In a fourth step, Step 4, known root cause analysis algorithms are applied to the isolated root cause section in order to identify a root cause asset. During a fifth step, Step 5, the identified root cause asset is diagnosed, so that in a sixth step, Step 6, a corrective action may be proposed. In a seventh step, Step 7, the corrective action is applied to the production plant, in particular to those parts of the production process to which the PPI belongs.

In Fig. 4, a time dependent PPI signal 24 is shown together with its thresholds 23. The PPI is the moisture of the end product of a paper mill, which oscillates slightly around a set point but stays within the boundaries of thresholds 23. As a result, a monitoring signal 25, which would indicate an undesired variation of the PPI, remains unchanged at its non-critical level.
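The threshold monitoring of Fig. 4 can be illustrated with a minimal sketch. The function name `monitoring_signal`, the threshold values and the moisture samples are assumptions made for illustration only; the description does not prescribe how the monitoring signal 25 is computed.

```python
def monitoring_signal(ppi_values, lower, upper):
    """Return 0 (non-critical) or 1 (undesired variation) for each PPI sample.

    lower/upper play the role of the thresholds 23; the 0/1 encoding of
    the monitoring signal 25 is an assumption of this sketch.
    """
    return [0 if lower <= value <= upper else 1 for value in ppi_values]


# Hypothetical moisture readings (percent) oscillating slightly around 7.0.
moisture = [7.0, 7.1, 6.9, 7.2, 6.8, 7.0]
flags = monitoring_signal(moisture, lower=6.5, upper=7.5)
```

As long as every sample stays between the two thresholds, all flags remain at the non-critical level.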

With the help of Figs. 5 to 7, it shall now be explained how the isolation of the root cause section and of at least one root cause asset could be performed. In Fig. 5, an example of a PPI monitoring screen is shown where again the moisture of the end product of a paper mill is observed. As can be seen, the PPI now oscillates more heavily. The processing unit PU of the system of Fig. 1 analyses the PPI signal 24 and determines characteristics of the behavior of the PPI such as the oscillation period in seconds, the oscillation amplitude in % of operating range, various oscillation indices and a value for the oscillation severity. These characteristics are shown on the PPI monitoring screen, which is for example visualized on data visualization unit M. As can be seen from Fig. 5, the oscillation index SP-PV as well as the value for the oscillation severity both indicate that the PPI signal 24 exhibits an undesired variation in the PPI moisture, since they both have left the ranges which would indicate good behavior. The index SP-PV stands for a "high variation around set point".
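Two of the characteristics named above, the oscillation period in seconds and the oscillation amplitude in % of operating range, could for instance be estimated as in the following sketch. The zero-crossing method, the function name `oscillation_characteristics` and all numeric values are assumptions; the description does not state which estimation method the processing unit PU uses.

```python
from math import pi, sin


def oscillation_characteristics(values, set_point, sample_time_s, operating_range):
    """Estimate period (s) and amplitude (% of operating range) of a signal."""
    deviation = [v - set_point for v in values]
    # Sample indices where the deviation from set point crosses zero upwards.
    crossings = [k for k in range(1, len(deviation))
                 if deviation[k - 1] < 0 <= deviation[k]]
    if len(crossings) < 2:
        period_s = None  # no complete oscillation cycle observed
    else:
        cycles = len(crossings) - 1
        period_s = (crossings[-1] - crossings[0]) / cycles * sample_time_s
    # Half the peak-to-peak swing, expressed relative to the operating range.
    amplitude_pct = (max(values) - min(values)) / 2 / operating_range * 100
    return period_s, amplitude_pct


# Synthetic moisture signal: set point 7.0, period 20 samples, 1 s sampling.
samples = [7.0 + sin(2 * pi * (k + 0.5) / 20) for k in range(100)]
period_s, amplitude_pct = oscillation_characteristics(samples, 7.0, 1.0, 10.0)
```

The oscillation indices and the severity value shown in Fig. 5 are not reproduced here, since their definitions are not given in the text.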

In order to now identify a root cause section within the paper mill, the processing unit PU may perform a so called contribution plot analysis. To this end, the time dependent behavior of at least two measured signals of the production process, belonging to at least two different possible root cause sections, is analyzed. The aim is to reduce the number of measured signals which have to be analyzed in detail. In Fig. 6, each of these signals is represented by one column or bar in the upper part of the figure, where each signal belongs to exactly one section and multiple signals may belong to one and the same section. As an example, a total of 1000 signals may be analyzed, representing 170 different sections in the paper mill. For each of these signals, the processing unit PU determines the degree of similarity of the time dependent behavior of the signal with the time dependent behavior of the PPI. This could for example be achieved via correlation analysis. Those signals which have a similar frequency and/or amplitude pattern as the PPI 24 show the highest degree of similarity. If many signals of a section show high similarity with the PPI 24, the probability of the root cause being in this particular section is high. In the example of Fig. 6, the degree of similarity is depicted for each of the 1000 signals in the form of the height of the corresponding bar. Those signals where the degree of similarity exceeds a predetermined threshold 26 are selected, which are here the signals number 1, 3, 4, 100 and so on. For these signals, the corresponding sections are either directly labeled as identified root cause sections, or they are labeled as possible root cause sections and are transmitted for further validation. In the example of the paper mill, one root cause section is identified, which is the drying section of the mill. By being identified as root cause section, the drying section is identified to contain at least one root cause asset, i.e. at least one malfunctioning plant component, such as an actuating or sensing element, a processing device or a control loop, which caused at least in part the undesired variation in the PPI moisture signal 24.
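The contribution plot analysis can be sketched as follows, using correlation analysis as the similarity measure mentioned above. The use of the absolute Pearson coefficient, the names `pearson` and `candidate_sections`, the section names other than the drying section, and all sample data are assumptions of this illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either signal is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def candidate_sections(ppi, signals, section_of, threshold):
    """Select signals similar to the PPI and rank their sections.

    signals:    signal number -> list of samples
    section_of: signal number -> section name (one section per signal)
    threshold:  plays the role of the predetermined threshold 26
    """
    similarity = {num: abs(pearson(ppi, s)) for num, s in signals.items()}
    selected = [num for num, value in similarity.items() if value > threshold]
    counts = {}
    for num in selected:
        counts[section_of[num]] = counts.get(section_of[num], 0) + 1
    # Sections with the most selected signals come first.
    return similarity, sorted(counts, key=counts.get, reverse=True)


# Hypothetical data: signals 1 and 3 follow the PPI, signal 2 is constant.
ppi = [0, 1, 0, -1, 0, 1, 0, -1]
signals = {
    1: [0, 2, 0, -2, 0, 2, 0, -2],
    2: [1, 1, 1, 1, 1, 1, 1, 1],
    3: [0.1, 0.9, 0.0, -1.1, 0.1, 1.0, -0.1, -0.9],
}
section_of = {1: "drying", 2: "press", 3: "drying"}
similarity, sections = candidate_sections(ppi, signals, section_of, 0.8)
```

Counting the selected signals per section reflects the statement that many similar signals within one section make that section a likely location of the root cause.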

Either directly or after the further validation of the root cause sections, the root cause assets are isolated from each of the isolated root cause sections and are further analyzed in order to identify a possible cause of the undesired variation in the PPI. This is achieved by applying known disturbance analysis methods, also called plant-wide disturbance analysis (PDA), where the applied methods are for example:

• oscillation detection as described in Thornhill, N.F., Huang, B., and Zhang, H., 2003, "Detection of multiple oscillations in control loops", Journal of Process Control, 13, 91-100,

• spectral principal components analysis as known from Thornhill, N.F., Shah, S.L., Huang, B., and Vishnubhotla, A., 2002, "Spectral principal component analysis of dynamic process data", Control Engineering Practice, 10, 833-846,

• nonlinearity diagnosis as suggested in Thornhill, N.F., 2005, "Finding the source of nonlinearity in a process with plant-wide oscillation", IEEE Transactions on Control Systems Technology, 13, 434-443, and

• causality analysis as explained in Bauer, M., 2005, "Data driven methods for process analysis", PhD thesis University of London.

In order to identify the root cause asset, the PDA analysis is performed on all signals belonging to the isolated root cause section or sections, and the signals are ranked according to how closely located they are to the root of the PPI failure, in terms of geographical closeness. Each of the signals can be associated with one or more assets. As a result, the asset or assets associated with the signal with the highest ranking are determined to be at least one identified root cause asset. More candidates for root cause assets could be defined by the signal or signals with the next highest ranking.
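The mapping from ranked signals to root cause assets can be sketched as below. The ranking scores are taken as given, since they would come from the PDA methods cited above; the signal tags, the score values, the asset names other than "steam pressure control 1" and the function name `root_cause_assets` are hypothetical.

```python
def root_cause_assets(scores, assets_of, n_signals=1):
    """Return the assets associated with the n highest-ranked signals.

    scores:    signal tag -> ranking score (higher = closer to the root)
    assets_of: signal tag -> list of associated assets
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    result = []
    for tag in ranked[:n_signals]:
        for asset in assets_of[tag]:
            if asset not in result:  # an asset may belong to several signals
                result.append(asset)
    return result


# Hypothetical PDA ranking of three signals of the drying section.
scores = {"PC1.PV": 0.9, "FC2.PV": 0.6, "TC4.PV": 0.4}
assets_of = {
    "PC1.PV": ["steam pressure control 1"],
    "FC2.PV": ["flow control 2"],
    "TC4.PV": ["temperature control 4"],
}
top = root_cause_assets(scores, assets_of)
more = root_cause_assets(scores, assets_of, n_signals=2)
```

Raising `n_signals` yields the additional candidates defined by the signals with the next highest ranking.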

In the example described here, the PDA analysis of the signals belonging to the drying section as the identified root cause section yields that at least one of the root cause assets is a control loop of the drying section called "steam pressure control 1". In another scenario, not a group of components, such as a control loop, but a single component, such as a pump or valve, may have been recognized as root cause asset.

The next step is the diagnosis of the identified root cause asset. Exemplary results of the diagnosis of control loop "steam pressure control 1" are shown in the "Asset oscillation monitoring" window of Fig. 7. The control loop is analyzed and diagnosed based on three signals, which are the signals for the set point, the controller output and the process value of the control loop. The latter signal is shown here as the steam pressure signal 27. The results of the signal diagnosis are APIs, such as oscillation index SP-PV, the oscillation period and the oscillation amplitude. Based on these APIs, the control loop "steam pressure control 1" is further diagnosed, where exemplary results can be seen in the "Asset diagnosis" window of Fig. 7. From the APIs of steam pressure signal 27 it is recognized that a loop oscillation occurs which is due to a significant nonlinearity in the form of valve stiction. Hence, it is stated that valve stiction occurs in a particular valve of control loop "steam pressure control 1". Accordingly, the valve and its stiction are identified to be a probable root cause for the undesired oscillation in PPI moisture signal 24.

As corrective actions, processing unit PU may then suggest to repair or even replace the valve and/or to adjust and retune the controller of the control loop "steam pressure control 1" so that the stiction no longer takes effect. The possible corrective actions may be given as a ranked list, the ranking being decided upon based on historic experience, known effectiveness, availability, maintenance plans and other characteristics of the production plant. Then, either an operator may choose which corrective action to apply or an automated validation algorithm may result in a decision and automatic application of the respective corrective action.

The results of the above described disturbance analysis are qualitative relationships between the PPI moisture and the identified root cause assets of control loop steam pressure control 1. The identified root cause assets may be labeled as critical or risky assets and their qualitative relationship with the PPI moisture can be used for early detection and prediction of future variations in the moisture. Generation of a future trend of the behavior of these critical assets will indicate a PPI failure before it actually happens. For example, if the loop performance index, which is an API for the asset control loop steam pressure control 1, exceeds a predefined threshold and thereby indicates a future degradation of the control loop behavior, the stored relationship between steam pressure control 1 and PPI moisture may help to indicate that this will cause a variation in the PPI moisture in the future.

Based on the identified relationship between a PPI and a risky or critical asset, the impact which the critical asset has on the PPI can be quantified, provided that an additional quantitative formula or relationship is identified and stored. The impact indicates the importance of a certain asset to the chosen PPI. An example is to express the quantitative relationship in the form of a weighting factor, i.e. a variation in the steam pressure signal 27 could contribute with a weighting factor of 40 to a variation in the PPI moisture signal 24. Future trend generation for steam pressure signal 27 and other signals belonging to critical assets can then be used in an equation or formula to predict a future variation in the PPI moisture signal 24. In general, the quantitative relationships can be used, among others, to predict the PPI trend in the future, in particular based on a prediction of trends for each signal, i.e. each critical asset, influencing the PPI.
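One possible form of such a quantitative relationship is a weighted sum of the predicted asset signal variations, sketched below. The linear form, the function name `predict_ppi_variation`, the second signal, the trend values and the weights (given here as fractions) are assumptions; the description leaves the exact equation or formula open.

```python
def predict_ppi_variation(asset_trends, weights):
    """Predict the PPI variation as a weighted sum of asset signal trends.

    asset_trends: signal name -> predicted variation per future time step
    weights:      signal name -> weighting factor of that signal for the PPI
    """
    horizon = min(len(trend) for trend in asset_trends.values())
    return [
        sum(weights[name] * asset_trends[name][k] for name in asset_trends)
        for k in range(horizon)
    ]


# Hypothetical future trends of two signals belonging to critical assets.
asset_trends = {
    "steam pressure signal 27": [0.0, 1.0, 2.0],
    "other critical signal": [1.0, 1.0, 1.0],
}
weights = {"steam pressure signal 27": 0.4, "other critical signal": 0.1}
predicted_variation = predict_ppi_variation(asset_trends, weights)
```

A growing predicted variation of the steam pressure signal then translates directly into a growing predicted variation of the PPI moisture.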

Furthermore, in some cases it might also be useful to define thresholds for each critical asset or signal, respectively.

In addition to predicting the future behavior of the PPI and the related assets, in a further embodiment the trend generation may also be used if a signal is not available, e.g. due to measurement transmission errors or, if the signal is measured via a sensor network, due to different and time varying sampling times.

The identification of an asset as critical asset may also play a role during future PPI analysis in the sense that for an isolated root cause section, at first the critical assets are analyzed further before the other assets of the same root cause section are looked at. For example, if a significant variation in PPI moisture signal 24 is observed via exceeding a warning threshold, a fault diagnosis analysis of all control loops may have shown a correlating degradation of the control loop performance indicators in the control loops steam pressure control 1 and steam pressure control 100. In a next step, only the control loop steam pressure control 1 is selected as a starting point for root cause analysis, because control loop steam pressure control 100 is not a critical asset for this PPI.
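The prioritization of critical assets can be sketched in a few lines. The function name `analysis_order` and the representation of the critical asset labels as a set are assumptions of this illustration.

```python
def analysis_order(degraded_assets, critical_assets):
    """Order assets for root cause analysis, critical assets first."""
    critical_first = [a for a in degraded_assets if a in critical_assets]
    others = [a for a in degraded_assets if a not in critical_assets]
    return critical_first + others


# Both loops show degraded performance, but only one is critical for the PPI.
degraded = ["steam pressure control 100", "steam pressure control 1"]
critical_for_moisture = {"steam pressure control 1"}
order = analysis_order(degraded, critical_for_moisture)
```

The first element of the returned list is the starting point for root cause analysis; the remaining assets are only looked at afterwards.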

It may occur that, in the example, the application of the corrective action to retune the controller of control loop steam pressure control 1 shows no effect on the variation of PPI moisture signal 24. In that case processing unit PU may be adapted to change a so called objective function related to PPI moisture signal 24 accordingly in order to reflect the lack of impact of this particular corrective action. On the other hand, in case that the repairing or replacing of the valve, which was the second suggested corrective action, was successful, this could be detected from control loop steam pressure control 1 operating in its normal range of operation and the variation of the PPI moisture having vanished. Then, processing unit PU may change the related objective function to reflect this positive impact of the corrective action. Accordingly, at the next occurrence of this particular PPI failure, the ranking of the list of corrective actions will be based upon the updated PPI-objective function.

This functionality may be implemented in a learning module. The learning module evaluates the effectiveness of a corrective action. If the corrective action had a beneficial effect on the PPI, the learning module increases the weighting of the corrective action in the PPI-objective function; if the corrective action did not have an effect on the PPI, the learning module decreases the corresponding weighting.
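A minimal sketch of such a learning module is given below. Representing the PPI-objective function as a dictionary of weights per corrective action, the fixed step size, the clamping to the range 0 to 1 and the function names are all assumptions; the description only states that weightings are increased or decreased.

```python
def update_weight(weights, action, ppi_improved, step=0.1):
    """Raise or lower the weighting of a corrective action after its use."""
    current = weights.get(action, 0.5)
    changed = current + step if ppi_improved else current - step
    weights[action] = min(1.0, max(0.0, changed))  # keep within [0, 1]
    return weights


def ranked_actions(weights):
    """Corrective actions ordered by their current weighting."""
    return sorted(weights, key=weights.get, reverse=True)


# Hypothetical starting weights for the two actions of the example.
weights = {"retune controller": 0.5, "repair or replace valve": 0.5}
update_weight(weights, "retune controller", ppi_improved=False)
update_weight(weights, "repair or replace valve", ppi_improved=True)
ranking = ranked_actions(weights)
```

At the next occurrence of the same PPI failure, the action that previously removed the variation is then listed first.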

In a further embodiment the list of possible corrective actions and the corresponding PPI-objective function may be stored as a list for every PPI-fault scenario. Further, it could be advantageous to generate and store, for every known asset failure, a so called asset therapy action table. The asset therapy action table would contain all possible known corrective actions for a certain asset failure. The PPI-objective functions are then stored together with the corrective actions in the asset therapy action table, i.e. a link is made between every asset failure and a resulting PPI variation. Further, it may occur that an operator observes that a positive change in the overall behavior of a PPI is due to a replacement of a corresponding critical asset. In that case, the processing unit PU is adapted to erase the labeling as a critical asset and/or to neutralize all identified quantified relationships between this asset and the PPI.
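One conceivable layout of an asset therapy action table is sketched below. The class name `TherapyEntry`, the key format, the weights and the function `proposed_actions` are hypothetical; the description does not define a storage format.

```python
from dataclasses import dataclass, field


@dataclass
class TherapyEntry:
    """Known corrective actions for one asset failure and the linked PPI."""
    ppi: str
    actions: dict = field(default_factory=dict)  # action -> objective weight


# Hypothetical table with one known asset failure.
therapy_table = {
    "valve stiction in steam pressure control 1": TherapyEntry(
        ppi="moisture",
        actions={"repair or replace valve": 0.6, "retune controller": 0.4},
    )
}


def proposed_actions(table, asset_failure):
    """Return the stored actions for a failure, highest weight first."""
    entry = table.get(asset_failure)
    if entry is None:
        return []  # unknown failure: no stored therapy
    return sorted(entry.actions, key=entry.actions.get, reverse=True)


actions = proposed_actions(therapy_table, "valve stiction in steam pressure control 1")
```

Storing the affected PPI alongside the actions provides the link between an asset failure and the resulting PPI variation.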

Further possible embodiments of the invention are described in the following. A visualization system may be provided which accesses the data storage unit DS comprising a database and which visualizes historical and actual data of the PPIs in a PPI screen or frame. The visualization may be done in the form of a graph showing the behavior of the PPI over time. In addition, future trends of the PPI may be visualized, generated based on future trends of corresponding critical assets. Thresholds can be visualized together with the historical and actual PPI in the PPI screen. If one PPI exceeds one of its thresholds, an alarm or notification may be given on the display of the operator automatically. The design of the PPI screen is predefined, reusable and applicable for different systems. The relationships between critical assets and corresponding PPI may be stored in the database together with the PPI failure scenarios, in particular in a relationship-table. Additionally, these relationships can be visualized in a relationship screen. The PPI-thresholds can be transformed into asset-thresholds for the critical assets. These asset-thresholds will be stored in a failure-table and visualized together with the time series of the critical assets in a relationship screen.

Moreover, results of a PPI prediction may be visualized via at least one trend diagram showing the amplitude over time and/or a traffic light report, where the red light stands for "PPI failure detected", yellow for "PPI failure predicted" and green for "Systems operate within the thresholds". As a result, the asset-prediction screen and the PPI-prediction screen each contain a trend diagram showing the historical, the actual and the predicted value over time, together with the corresponding thresholds.
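The traffic light report can be sketched as a simple classification. The function name `traffic_light`, the use of a lower and an upper threshold, and the numeric values are assumptions of this illustration.

```python
def traffic_light(actual, predicted, lower, upper):
    """Classify the PPI state for the traffic light report.

    red:    PPI failure detected (actual value outside the thresholds)
    yellow: PPI failure predicted (a predicted value outside the thresholds)
    green:  systems operate within the thresholds
    """
    def outside(value):
        return value < lower or value > upper

    if outside(actual):
        return "red"
    if any(outside(value) for value in predicted):
        return "yellow"
    return "green"


# Hypothetical moisture thresholds of 6.5 and 7.5 percent.
state_ok = traffic_light(7.0, [7.1, 7.2], 6.5, 7.5)
state_predicted = traffic_light(7.0, [7.3, 7.6], 6.5, 7.5)
state_detected = traffic_light(7.8, [7.0, 7.0], 6.5, 7.5)
```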

If a historical value and/or an actual value is not available, for example due to transmission errors, measurement errors etc., the trend generation is also used to generate estimations for these missing values. The stored quantified relationships between PPI-failures and critical assets, which are for example stored in a failure table or an impact network, are used for trend generation of the PPI based on the future trends of the critical assets.
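How missing values might be estimated is sketched below with linear interpolation between the nearest available samples. This particular estimation method, the marking of missing samples with `None` and the function name `fill_missing` are assumptions; the description does not specify the trend generation algorithm.

```python
def fill_missing(samples):
    """Estimate missing samples (None) from the nearest known neighbours."""
    known = [i for i, value in enumerate(samples) if value is not None]
    if not known:
        return list(samples)  # nothing to estimate from
    filled = list(samples)
    for i, value in enumerate(samples):
        if value is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = samples[right]  # gap at the start: hold first value
        elif right is None:
            filled[i] = samples[left]   # gap at the end: hold last value
        else:
            fraction = (i - left) / (right - left)
            filled[i] = samples[left] + fraction * (samples[right] - samples[left])
    return filled


# Hypothetical series with two samples lost to transmission errors.
series = fill_missing([1.0, None, 3.0, None])
```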