

Title:
A METHOD, APPARATUS AND COMPUTER PROGRAM FOR DIAGNOSING A MODE OF OPERATION OF A MACHINE
Document Type and Number:
WIPO Patent Application WO/2011/073613
Kind Code:
A1
Abstract:
A method, apparatus and computer program for diagnosing a mode of operation of a machine by collecting operating measurements (50) and extracting features (52) from those measurements to characterise operation of the machine in each of a number of known operating modes (54). Further measurements are collected and further features are extracted, and those features are then compared with the previously extracted features to identify a confidence level for each known operating mode (58). By using the confidence levels it is possible to identify the operating mode of the machine.

Inventors:
FAHIMI FARSHAD (GB)
BROWN DAVID JOHN (GB)
Application Number:
PCT/GB2010/002263
Publication Date:
June 23, 2011
Filing Date:
December 13, 2010
Assignee:
UNIV PORTSMOUTH (GB)
FAHIMI FARSHAD (GB)
BROWN DAVID JOHN (GB)
International Classes:
G05B23/02
Other References:
BROWN D ET AL: "Particle filter based anomaly detection for aircraft actuator systems", AEROSPACE CONFERENCE, 2009 IEEE, IEEE, PISCATAWAY, NJ, USA, 7 March 2009 (2009-03-07), pages 1 - 13, XP031450244, ISBN: 978-1-4244-2621-8
BYUNGCHUL PARK ET AL: "Fault detection in IP-based process control networks using data mining", INTEGRATED NETWORK MANAGEMENT, 2009. IM '09. IFIP/IEEE INTERNATIONAL SYMPOSIUM ON, IEEE, PISCATAWAY, NJ, USA, 1 June 2009 (2009-06-01), pages 211 - 217, XP031499100, ISBN: 978-1-4244-3486-2
FARSHAD FAHIMI ET AL: "Feature set evaluation and fusion for motor fault diagnosis", INDUSTRIAL ELECTRONICS&APPLICATIONS (ISIEA), 2010 IEEE SYMPOSIUM ON, IEEE, PISCATAWAY, NJ, USA, 3 October 2010 (2010-10-03), pages 634 - 639, XP031843107, ISBN: 978-1-4244-7645-9
Attorney, Agent or Firm:
WALLIN, Nicholas et al. (Goldings House, 2 Hays Lane, London SE1 2HW, GB)
Claims:
CLAIMS

1. A method of configuring a method of diagnosing a mode of operation of a machine, said method comprising:

a. collecting measurements relating to the operation of a machine in each of a plurality of different known operating modes;

b. extracting features from the collected measurements; and,

c. arranging extracted features in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature.

2. The method according to claim 1, wherein features are extracted from the collected measurements using a sliding window technique.

3. The method according to claim 1 or 2, wherein features are extracted from the collected measurements in at least one of the following domains: time, frequency, wavelet.

4. The method according to any preceding claim, wherein sets of features are formed by clustering extracted features together according to operating mode and feature type.

5. A method of diagnosing a mode of operation of a machine, using features extracted from measurements relating to the operation of the machine in each of a plurality of different known operating modes, said extracted features being arranged in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature, said method comprising:

a. collecting measurements relating to the operation of the machine in an operating mode;

b. for each set of features, determining a probability that the machine is operating in the known mode corresponding to that set of features using the collected measurements; and,

c. combining each probability relating to the same known mode to generate a confidence level that the machine is operating in that known mode.

6. The method of claim 5, further comprising scaling each probability according to a corresponding first weighting prior to combining each probability relating to the same known mode to generate the confidence level that the machine is operating in that known mode.

7. The method of claim 6, further comprising:

d. identifying similar confidence levels;

e. for each identified confidence level, scaling each corresponding probability according to a corresponding second weighting; and,

f. combining each scaled probability relating to the same confidence level to generate a second confidence level.

8. The method of claim 7, wherein the following expression is used to identify similar confidence levels:

$C_{CG} = \{ C_j \mid (C_{max} - \sigma_c) < C_j < C_{max} \}$

where $C_{CG}$ is a group of similar confidence levels, $C_j$ is a confidence level in a j-th mode as calculated in step c, $C_{max}$ is a maximum confidence level calculated in step c, and $\sigma_c$ is a variance of the confidence levels calculated in step c.

9. The method of any one of claims 5 to 8, wherein each weighting is defined according to the accuracy with which the set of features, corresponding to the probability which corresponds to the weighting, identifies different modes.

10. The method of claim 9, wherein each weighting comprises a score which relates to the set of features corresponding to the probability which corresponds with the weighting.

11. The method of claim 10, wherein the score relating to each first weighting is calculated using as a positive data-set the set of features corresponding to the probability which corresponds with the first weighting, and using as a negative data-set any other sets of the same features.

12. The method of claim 10 when dependent on claim 7, wherein the score relating to each second weighting comprises a collection of scores, wherein each score is calculated using as a positive data-set the set of features corresponding to the probability which corresponds with the second weighting, and using as a negative data-set a different one of any other sets of the same features.

13. The method of any one of claims 10 to 12, wherein the score comprises an F-Score.

14. The method of any one of claims 5 to 13, wherein in step b. the probabilities are determined by calculating a mean and variance for each set of features, and for each set, identifying a probability that the collected measurements are part of the set using the set's mean and variance.

15. The method of any one of claims 5 to 14, wherein in step b. the same features are extracted from the collected measurements as are extracted prior to step a., and the features extracted from the collected measurements are used to generate the probabilities in step b.

16. The method according to any preceding claim, wherein the measurements and/or collected measurements comprise voltage and/or current measurements.

17. The method according to any preceding claim, wherein the features comprise at least one of: geometric mean, harmonic mean, arithmetic mean, median, mode, trimmed mean, variance, mean absolute deviation, range, standard deviation.

18. The method according to any preceding claim, wherein the machine is a three-phase motor and the plurality of different known operating modes includes at least one of the following modes: normal, overload, disconnection of phase 1, disconnection of phase 2, disconnection of phase 3.

19. The method according to any one of claims 5 to 18 when configured according to the method of any one of claims 1 to 4.

20. The method of any of claims 5 to 19, wherein step b. is performed only for sets of features which characterise the operation of the machine in a subset of the plurality of different known operating modes.

21. The method of claim 20, wherein the subset comprises only fault modes.

22. An apparatus for configuring an apparatus for diagnosing a mode of operation of a machine, said apparatus comprising:

at least one processor; and,

at least one memory including computer program code

the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following:

collect measurements relating to the operation of a machine in each of a plurality of different known operating modes;

extract features from the collected measurements; and,

arrange extracted features in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature.

23. An apparatus for diagnosing a mode of operation of a machine, using features extracted from measurements relating to the operation of the machine in each of a plurality of different known operating modes, said extracted features being arranged in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature, said apparatus comprising:

at least one processor; and,

at least one memory including computer program code

the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following:

collect measurements relating to the operation of the machine in an operating mode;

for each set of features, determine a probability that the machine is operating in the known mode corresponding to that set of features using the collected measurements; and

combine each probability relating to the same known mode to generate a confidence level that the machine is operating in that known mode.

24. The apparatus of claim 23, the at least one processor and the at least one memory comprising a first and a second processing unit configured to communicate with each other;

the first processing unit being configured to collect the measurements relating to the operation of the machine in an operating mode, and transmit data relating to the collected measurements to the second processing unit; and

the second processing unit being configured to, for each set of features, determine a probability that the machine is operating in the known mode corresponding to that set of features using data received from the first processing unit, and combine each probability relating to the same known mode to generate a confidence level that the machine is operating in that known mode.

25. The apparatus of claim 24, wherein the second processing unit is configured to determine a probability only for sets of features which characterise the operation of the machine in a subset of the plurality of different known operating modes, and the first processing unit is configured to transmit only data corresponding to the subset.

26. The apparatus of claim 25, wherein the subset comprises only fault modes.

27. The apparatus of claim 25 or 26, wherein the first processing unit is configured to transmit data to the second processing unit in dependence on instructions received from the second processing unit.

28. The apparatus of any one of claims 25 to 27, wherein the first processing unit is configured to use the collected measurements to identify if the machine is operating in one of the subset of the plurality of different known operating modes, and only transmit data to the second processing unit if it is identified that the machine is operating in one of the subset of the plurality of different known operating modes.

29. A computer program for configuring a computer program for diagnosing a mode of operation of a machine, said computer program comprising:

code for collecting measurements relating to the operation of a machine in each of a plurality of different known operating modes;

code for extracting features from the collected measurements; and,

code for arranging extracted features in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature.

30. A computer program for diagnosing a mode of operation of a machine, using features extracted from measurements relating to the operation of the machine in each of a plurality of different known operating modes, said extracted features being arranged in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature, said computer program comprising:

code for collecting measurements relating to the operation of the machine in an operating mode;

code for, for each set of features, determining a probability that the machine is operating in the known mode corresponding to that set of features using the collected measurements; and

code for combining each probability relating to the same known mode to generate a confidence level that the machine is operating in that known mode.

31. A method of configuring a method of diagnosing a mode of operation of a machine substantially as hereinbefore described with reference to the accompanying figures.

32. A method of diagnosing a mode of operation of a machine substantially as hereinbefore described with reference to the accompanying figures.

33. An apparatus for configuring an apparatus for diagnosing a mode of operation of a machine substantially as hereinbefore described with reference to the accompanying figures.

34. An apparatus for diagnosing a mode of operation of a machine substantially as hereinbefore described with reference to the accompanying figures.

35. A computer program for configuring a computer program for diagnosing a mode of operation of a machine substantially as hereinbefore described with reference to the accompanying figures.

36. A computer program for diagnosing a mode of operation of a machine substantially as hereinbefore described with reference to the accompanying figures.

Description:
A METHOD, APPARATUS AND COMPUTER PROGRAM FOR DIAGNOSING A MODE OF OPERATION OF A MACHINE

TECHNICAL FIELD OF THE INVENTION

The present invention relates to a method of diagnosing a mode of operation of a machine and more particularly, to a method for diagnosing a fault in operating machinery. The present invention also relates to a method of configuring said method of diagnosing.

BACKGROUND TO THE INVENTION

Fault diagnosis of malfunctioning machinery, such as, for example, a DC motor, involves collecting measurements, such as, for example, current or voltage measurements, from indicators installed on the machinery while it operates. The collected measurements are analysed to identify the likely cause of the malfunction. Once the cause has been identified the machinery can be repaired.

Expert systems are one way in which to perform fault diagnosis on faulty machinery. Measurements are collected from the machinery while it operates using several indicators installed on the machinery. The indicators measure a wide range of different operational characteristics, such as, sound, vibration, voltage, current, etc. Before the expert system is operated, measurements are stored relating to the operational characteristics of the machine while operating in various known failure states. During fault diagnosis, live measurements are collected from the faulty machinery while it operates and are compared to the stored measurements. The likely fault is diagnosed by identifying the failure state which corresponds best with the live measurements.

There are several disadvantages of expert systems. Firstly, several indicators must be installed on the machinery and this can: increase the cost of diagnosis, damage the machinery, endanger the installer of the indicators, and reduce the productivity of the machinery during installation of the indicators. Secondly, there is a trade-off between the efficiency and accuracy of the expert system. The more known failure states that are stored and the more operational characteristics that are stored for each failure state the more accurately the system can diagnose a fault. However, increasing the number of failure states and characteristics stored increases the processing necessary to correlate the live measurements with the stored measurements. This in turn increases the time taken to perform a diagnosis. Real-time performance is very important because the faster a fault is diagnosed the less effect it has on machinery productivity and the less chance there is it will develop into a larger, more serious and more costly fault. Thirdly, expert systems perform analysis based on historical fault data stored before the system is operated. Therefore, the system is not able to detect new or unknown faults.

It is also known to use neural networks to identify faults. In such systems, the live measurements are input into a predefined neural network. The network then outputs possible faults in dependence on the input live measurements. An advantage of this method is that different measurements can be given different weightings in dependence on how good an indicator they are of the machine's operational mode. This in turn improves the accuracy of the fault diagnoses generated.

There are several disadvantages of neural network systems. Firstly, they are closed systems and therefore, operators or engineers are not aware of how the diagnoses are calculated. Stated differently, neural network systems leave no audit trails to justify their predictions. Secondly, it is almost impossible to adjust and modify the neural network for different applications, such as, for example, different machinery, and this inhibits any future improvements.

SUMMARY OF EMBODIMENTS OF THE INVENTION

A first aspect of the invention provides a method of configuring a method of diagnosing a mode of operation of a machine, said method comprising: collecting measurements relating to the operation of a machine in each of a plurality of different known operating modes; extracting features from the collected measurements; and, arranging extracted features in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature. Preferably, features are extracted from the collected measurements using a sliding window technique. Preferably, features are extracted from the collected measurements in at least one of the following domains: time, frequency, wavelet. Preferably, sets of features are formed by clustering extracted features together according to operating mode and feature type.

A second aspect of the invention provides a method of diagnosing a mode of operation of a machine, using features extracted from measurements relating to the operation of the machine in each of a plurality of different known operating modes, said extracted features being arranged in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature, said method comprising: collecting measurements relating to the operation of the machine in an operating mode; for each set of features, determining a probability that the machine is operating in the known mode corresponding to that set of features using the collected measurements; combining each probability relating to the same known mode to generate a confidence level that the machine is operating in that known mode.

Preferably, the method further comprises scaling each probability according to a corresponding first weighting prior to combining each probability relating to the same known mode to generate the confidence level that the machine is operating in that known mode.

Preferably, the method further comprises: identifying similar confidence levels; for each identified confidence level, scaling each corresponding probability according to a corresponding second weighting; and, combining each scaled probability relating to the same confidence level to generate a second confidence level.

Preferably, the following expression is used to identify similar confidence levels:

$C_{CG} = \{ C_j \mid (C_{max} - \sigma_c) < C_j < C_{max} \}$

where $C_{CG}$ is a group of similar confidence levels, $C_j$ is a confidence level in a j-th mode as calculated in the second aspect, $C_{max}$ is a maximum confidence level calculated in the second aspect, and $\sigma_c$ is a variance of the confidence levels calculated in the second aspect.

Preferably, each weighting is defined according to the accuracy with which the set of features, corresponding to the probability which corresponds to the weighting, identifies different modes. Preferably, each weighting comprises a score which relates to the set of features corresponding to the probability which corresponds with the weighting.

Preferably, the score relating to each first weighting is calculated using as a positive data-set the set of features corresponding to the probability which corresponds with the first weighting, and using as a negative data-set any other sets of the same features. Preferably, the score relating to each second weighting comprises a collection of scores, wherein each score is calculated using as a positive data-set the set of features corresponding to the probability which corresponds with the second weighting, and using as a negative data-set a different one of any other sets of the same features. Preferably, the score comprises an F-Score.

Preferably, the probabilities are determined by calculating a mean and variance for each set of features, and for each set, identifying a probability that the collected measurements are part of the set using the set's mean and variance.

Preferably, the same features are extracted from the collected measurements as are extracted prior to collecting measurements, and the features extracted from the collected measurements are used to generate the probabilities. Preferably, the measurements and/or collected measurements comprise voltage and/or current measurements. Preferably, the features comprise at least one of: geometric mean, harmonic mean, arithmetic mean, median, mode, trimmed mean, variance, mean absolute deviation, range, standard deviation. Preferably, the machine is a three-phase motor and the plurality of different known operating modes includes at least one of the following modes: normal, overload, disconnection of phase 1, disconnection of phase 2, disconnection of phase 3.

Preferably, determining a probability, for each set of features, that the machine is operating in the known mode corresponding to that set of features using the collected measurements is performed only for sets of features which characterise the operation of the machine in a subset of the plurality of different known operating modes. Preferably, the subset comprises only fault modes.

A third aspect of the invention provides an apparatus for configuring an apparatus for diagnosing a mode of operation of a machine, said apparatus comprising: at least one processor; and, at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following: collect measurements relating to the operation of a machine in each of a plurality of different known operating modes; extract features from the collected measurements; and, arrange extracted features in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature.

A fourth aspect of the invention provides an apparatus for diagnosing a mode of operation of a machine, using features extracted from measurements relating to the operation of the machine in each of a plurality of different known operating modes, said extracted features being arranged in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature, said apparatus comprising: at least one processor; and, at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following: collect measurements relating to the operation of the machine in an operating mode; for each set of features, determine a probability that the machine is operating in the known mode corresponding to that set of features using the collected measurements; and combine each probability relating to the same known mode to generate a confidence level that the machine is operating in that known mode.

Preferably, the at least one processor and the at least one memory comprise a first and a second processing unit configured to communicate with each other; the first processing unit being configured to collect the measurements relating to the operation of the machine in an operating mode, and transmit data relating to the collected measurements to the second processing unit; and the second processing unit being configured to, for each set of features, determine a probability that the machine is operating in the known mode corresponding to that set of features using data received from the first processing unit, and combine each probability relating to the same known mode to generate a confidence level that the machine is operating in that known mode.

Preferably, the second processing unit is configured to determine a probability only for sets of features which characterise the operation of the machine in a subset of the plurality of different known operating modes, and the first processing unit is configured to transmit only data corresponding to the subset. Preferably, the subset comprises only fault modes.

Preferably, the first processing unit is configured to transmit data to the second processing unit in dependence on instructions received from the second processing unit.

Preferably, the first processing unit is configured to use the collected measurements to identify if the machine is operating in one of the subset of the plurality of different known operating modes, and only transmit data to the second processing unit if it is identified that the machine is operating in one of the subset of the plurality of different known operating modes.

A fifth aspect of the invention provides a computer program for configuring a computer program for diagnosing a mode of operation of a machine, said computer program comprising: code for collecting measurements relating to the operation of a machine in each of a plurality of different known operating modes; code for extracting features from the collected measurements; and, code for arranging extracted features in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature.

A sixth aspect of the invention provides a computer program for diagnosing a mode of operation of a machine, using features extracted from measurements relating to the operation of the machine in each of a plurality of different known operating modes, said extracted features being arranged in sets, each set of features characterising the operation of the machine in a particular known mode using a particular feature, said computer program comprising: code for collecting measurements relating to the operation of the machine in an operating mode; code for, for each set of features, determining a probability that the machine is operating in the known mode corresponding to that set of features using the collected measurements; and code for combining each probability relating to the same known mode to generate a confidence level that the machine is operating in that known mode.

The further features mentioned above which relate to the first aspect apply equally to the third and fifth aspects. The further features mentioned above which relate to the second aspect apply equally to the fourth and sixth aspects.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention will now be described by way of example only and by reference to the accompanying drawings, in which:

Figure 1 illustrates an example environment in which a first embodiment of the invention operates;

Figure 2 provides a flow diagram of the operation of the first embodiment of Figure 1;

Figures 3, 4a and 4b illustrate the feature selection block of Figure 2;

Figure 5 illustrates an apparatus according to a first alternative embodiment of the invention; and,

Figure 6 illustrates an apparatus according to a second alternative embodiment of the invention.

DESCRIPTION OF THE EMBODIMENTS

FIG 1 illustrates an environment in which an embodiment of the invention is intended to operate. A three-phase DC motor 2 is installed with a current and voltage sensor or indicator 4. The indicator 4 is capable of measuring the current through the motor 2 and the voltage across the motor 2 while it operates. Also provided are a data collection board 6 and a computer 8. The board 6 is in communication with the indicator 4 via connector 10 and with the computer 8 via connector 12. When the motor 2 operates, the indicator 4 takes voltage and current measurements from the motor 2 and transmits them to the board 6. The board 6 comprises a TCP/IP port which is used to stream the collected current and voltage measurements to the computer 8 in real-time. The computer 8 processes the streamed measurements to diagnose an operating mode of the motor 2, as will be described in detail below.

The method of diagnosis performed by the present embodiment will be described with reference to FIG 2. The operation at block 50 (data collection) is as described above with reference to FIG 1, i.e. a stream of measurements is collected from the motor 2 and transmitted to the computer 8. In block 52 (feature extraction) the computer 8 processes the streamed measurements to calculate a number of features. The calculated or extracted features fall into three main categories (or domains): amplitude features, frequency features and wavelet features. Feature extraction comprises calculating for each category one or more of the following features: geometric mean, harmonic mean, arithmetic mean, median, mode, trimmed mean, variance, mean absolute deviation, range, and standard deviation.

A sliding window technique is used to extract features from the stream of measurements. More specifically, a discrete slice (or window) of the most recent data in the time, frequency or wavelet domain is used to calculate a value for each of the features. This process is then repeated for the two remaining domains until a value has been calculated for all features in each of the three domains. A new window is then sliced off from the stream of measurements and the process is repeated. The size of the window is variable and can be chosen in dependence on expert analysis performed during a configuration phase. Windowing is used because it is only possible to judge the behaviour or mode of operation of the motor with confidence based on a number of data points rather than a single data point.
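The windowing described above can be sketched as follows. This is a minimal illustration only, assuming time-domain samples and a handful of the listed features; the function names `extract_features` and `sliding_window_features` and the `step` parameter are hypothetical and do not appear in the specification, and the frequency and wavelet domains (which would need a transform of each window first) are not shown.

```python
import statistics
from collections import deque


def extract_features(window):
    """Compute a few of the listed statistical features for one window."""
    values = list(window)
    return {
        "arithmetic_mean": statistics.fmean(values),
        "median": statistics.median(values),
        "variance": statistics.pvariance(values),
        "range": max(values) - min(values),
        "standard_deviation": statistics.pstdev(values),
    }


def sliding_window_features(stream, window_size, step):
    """Yield one feature dictionary per window sliced from a sample stream.

    window_size and step are illustrative parameters (assumptions); the
    specification only says the window size is chosen during configuration.
    """
    window = deque(maxlen=window_size)  # keeps only the most recent samples
    since_last = 0
    for sample in stream:
        window.append(sample)
        since_last += 1
        if len(window) == window_size and since_last >= step:
            since_last = 0
            yield extract_features(window)


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 5, 6]
    for features in sliding_window_features(samples, window_size=4, step=2):
        print(features)
```

Using a bounded `deque` means that only the most recent `window_size` samples are held in memory, which suits a measurement stream of unknown length.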

At block 54 (Gaussian clustering) the features extracted from the data stream in block 52 are clustered, i.e. arranged in sets. It is important in this example to note that there are two clustering phases. The first clustering phase is performed during configuration of the diagnosis system before diagnosis is performed. The second clustering phase is performed after configuration of the diagnosis system while diagnosis is being performed.

In the first clustering phase, values relating to a particular feature are measured while the motor is intentionally operated in a particular mode and then grouped into a subset called a cluster. Clustering the features in this way enables them to be characterised by statistical characteristics of the cluster, such as, for example, the cluster mean or variance.

During the configuration phase, the motor 2 is intentionally operated in a normal mode and four fault modes: motor overload, disconnection of phase 1, disconnection of phase 2 and disconnection of phase 3. For each of the five modes of operation data is collected and features are extracted as described above. The values calculated for each individual feature during each mode of operation are clustered together. The mean and variance of each cluster are then calculated. The generation of means and variances in this way represents the first clustering phase. Actual diagnosis can be performed after this phase has been completed.

In the second clustering phase, a live measurement stream is collected from the motor 2 and features are extracted as described above. Based on the first clustering phase, a mean and variance have been calculated for each feature in each of the five modes. In the second clustering phase, for each feature extracted from the live stream a probability is calculated of that feature being part of each one of the five modes. This calculation involves using each of the five mean and variance pairs with the live feature value to identify the probability that the live feature is part of each cluster. The result of this operation is a set of five probabilities for each feature, wherein each probability indicates the likelihood of the motor 2 operating in a different one of the five modes based on one feature (e.g. median, mode, geometric mean, etc.) in one domain (i.e. time, frequency or wavelet).
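The two clustering phases can be illustrated with the following sketch. It is not the method as specified: the text does not state how a mean and variance pair is turned into a probability, so the sketch assumes a Gaussian density evaluated at the live feature value and normalised across modes. The names `fit_clusters`, `gaussian_density` and `mode_probabilities` are hypothetical, and each cluster is assumed to have non-zero variance.

```python
import math
import statistics


def fit_clusters(training):
    """First phase: map (mode, feature) -> (mean, variance) of its cluster.

    training maps (mode, feature) -> list of feature values recorded while
    the machine was deliberately run in that mode.
    """
    return {
        key: (statistics.fmean(values), statistics.pvariance(values))
        for key, values in training.items()
    }


def gaussian_density(x, mean, variance):
    """Gaussian probability density at x (assumes variance > 0)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


def mode_probabilities(clusters, feature, live_value):
    """Second phase: one probability per mode for a single live feature value.

    Assumption: densities are normalised so the per-mode values sum to 1.
    """
    densities = {
        mode: gaussian_density(live_value, mean, variance)
        for (mode, feat), (mean, variance) in clusters.items()
        if feat == feature
    }
    total = sum(densities.values())
    if total == 0.0:
        return {mode: 0.0 for mode in densities}
    return {mode: d / total for mode, d in densities.items()}


if __name__ == "__main__":
    # Made-up illustrative values, not measured data.
    training = {
        ("normal", "median"): [1.0, 1.2, 0.8],
        ("overload", "median"): [3.0, 3.2, 2.8],
    }
    clusters = fit_clusters(training)
    print(mode_probabilities(clusters, "median", 1.1))
```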

The above operation is performed for each feature in each domain. Therefore, the overall outcome of the second clustering phase is a series of sets of probabilities. As there are five possible modes, ten different features and three different domains the result of this operation yields a total of one hundred and fifty probability values.

The preferred output of the diagnosis method of the present embodiment is a single probability for each of the five modes. Therefore, the probabilities calculated during Gaussian clustering (block 54) need to be combined to produce a probability or confidence level that the motor 2 is operating in each of the five modes.

At block 56 (feature selection) the probabilities calculated during Gaussian clustering (block 54) are combined intelligently. More specifically, each feature may have a different level of importance depending on the motor's mode of operation. For example, one feature may provide a clear indicator for the motor 2 operating in normal mode but that same feature may not provide a clear indicator for an overload fault. However, a different feature which does not provide a clear indicator that the motor is operating in normal mode may provide a particularly clear indicator for an overload fault. Accordingly, when calculating an overall probability for a particular mode the probability of an individual feature should be given a level of importance in dependence on how good an indicator it is of that particular mode. Considering the above example, the probability of the first feature should be considered more important when calculating the overall probability that the motor is operating in normal mode, whereas the probability of the second feature should be considered more important when calculating the overall probability that the motor is operating with an overload fault.

Intelligent feature selection is performed using the F-Score algorithm. The F-Score algorithm measures the discrimination between two sets of data. One data set is called the 'positive' set and the other data set is called the 'negative' set. F-Score measures how separate and distinct the two sets of data are. The F-Score algorithm is used to generate weightings for each of the one hundred and fifty probabilities mentioned above. Each probability can be combined with its corresponding weighting to attenuate the effect of the individual probability on the overall probability for a mode, the amount of attenuation being dependent on how good an indicator that probability is for that mode. The weightings are calculated using the clustered data generated during the first clustering phase. Once the weightings have been calculated they are used with the probabilities generated during the second clustering phase. The F-Score algorithm is defined as follows.

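The positive set of values relating to a feature is defined as $x_k^{(+)}$, $k = 1, 2, \ldots, n_+$, where $n_+$ equals the number of values in the positive set. The negative set of values relating to the feature is defined as $x_k^{(-)}$, $k = 1, 2, \ldots, n_-$, where $n_-$ equals the number of values in the negative set. The F-Score for the feature $i$ is defined as:

$$F(i) = \frac{\left(\bar{x}_i^{(+)} - \bar{x}_i\right)^2 + \left(\bar{x}_i^{(-)} - \bar{x}_i\right)^2}{\frac{1}{n_+ - 1}\sum_{k=1}^{n_+}\left(x_{k,i}^{(+)} - \bar{x}_i^{(+)}\right)^2 + \frac{1}{n_- - 1}\sum_{k=1}^{n_-}\left(x_{k,i}^{(-)} - \bar{x}_i^{(-)}\right)^2}$$

where $\bar{x}_i$, $\bar{x}_i^{(+)}$ and $\bar{x}_i^{(-)}$ are the averages of the feature of the whole (combined positive and negative), positive, and negative data sets, respectively; $x_{k,i}^{(+)}$ is the $k$th value of feature $i$ from the positive set, and $x_{k,i}^{(-)}$ is the $k$th value of feature $i$ from the negative set. The F-Score provides a good feature selection criterion for the following reasons.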
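As a minimal sketch, the F-Score of one feature can be computed as below. The sketch assumes the commonly used form of the F-Score: the squared distances of the positive and negative means from the overall mean, divided by the sum of the sample variances of the two sets. The function name and the example values are hypothetical.

```python
def f_score(positive, negative):
    """F-Score of one feature given its positive and negative sets of values.

    Assumes each set holds at least two values and that the sets are not both
    constant, since otherwise the denominator would be zero.
    """
    n_pos, n_neg = len(positive), len(negative)
    mean_pos = sum(positive) / n_pos
    mean_neg = sum(negative) / n_neg
    mean_all = (sum(positive) + sum(negative)) / (n_pos + n_neg)

    # Numerator: how far each set's mean lies from the mean of the combined data.
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2

    # Denominator: spread inside each set (sample variance, divisor n - 1).
    within = (sum((x - mean_pos) ** 2 for x in positive) / (n_pos - 1)
              + sum((x - mean_neg) ** 2 for x in negative) / (n_neg - 1))
    return between / within


# Illustrative values only: well-separated sets score far higher than overlapping ones.
well_separated = f_score([1.0, 1.1, 0.9], [3.0, 3.1, 2.9])
overlapping = f_score([1.0, 1.1, 0.9], [1.05, 1.15, 0.95])
```

With these values the well-separated pair scores about 100 and the overlapping pair well below one.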

FIG 3 shows two exemplary clusters 100 and 102. The values in both clusters 100 and 102 relate to feature i; however, the values in cluster 100 relate to a different mode of operation of the motor than the values of cluster 102. For example, the cluster 100 relates to normal mode whereas the cluster 102 relates to overload mode. Clusters 100 and 102 represent the positive and negative data sets, respectively. Using the F-Score equation it is possible to calculate an F-Score for the feature i with the values of clusters 100 and 102. A property of the F-Score is that its value is higher the larger the Euclidean distance between the data sets (i.e. clusters). The Euclidean distance is defined as the distance between the mean value of each cluster normalised by the disparity of each cluster. The Euclidean distance is indicated on FIG 3 by line 104. Using the mean and variance values of each of clusters 100 and 102 (which were calculated during the first clustering phase), it is possible to identify the probability that a new measurement value of the feature illustrated in FIG 3 falls into either one of cluster 100 or 102. For example, a new measurement 106 lies inside cluster 100 but outside cluster 102. Therefore, the probability that the motor is in the mode of operation modelled by cluster 100 is higher than the probability that the motor is in the mode modelled by cluster 102.

In addition to the above, the probability of correctly detecting a particular mode of operation depends on the Euclidean distance between the clusters 100 and 102. The shorter the Euclidean distance the less certain one can be that a new measurement lying inside cluster 100 but outside cluster 102 indicates the mode of operation represented by cluster 100. Conversely, the longer the Euclidean distance the more confident one can be that a point lying inside cluster 100 but outside cluster 102 indicates the mode of operation represented by cluster 100. The F-Score provides an indication of the Euclidean distance between the clusters. Therefore, using the F-Score algorithm it is possible to identify features that define clusters for particular modes of operation which are spaced apart from each other. By giving those clusters a higher weighting when calculating an overall probability for a given mode it is possible to obtain a more accurate probability. According to the present embodiment, features which have a higher F-Score are given a higher weighting whereas features which have a lower F-Score are given a lower weighting.

The present embodiment operates by looking at the negative and positive data sets of each F-Score calculation from two different perspectives. Firstly, as shown in FIG 4a, a single cluster 110 is compared to a collection 120 of all other clusters 112 to 118. Secondly, as shown in FIG 4b, the cluster 110 is compared to each other cluster 112 to 118 in turn. Based on this processing, two sets of scoring matrices are formed which are used to provide weightings. Each scoring matrix is made up of F-Score values. The following describes in more detail how the scoring matrices are formed.

FIG 4a shows five clusters 110 to 118 which all relate to the same feature, such as, for example, the arithmetic mean calculated in the time domain. Each of the clusters 110 to 118 models the value of the feature in a different one of the five operational modes of the motor 2, i.e. normal, overload, disconnection of phase 1, disconnection of phase 2, and disconnection of phase 3. The first scoring matrices are calculated by comparing each cluster against a collection of all the other clusters. In this case, the F-Score values are calculated using the single cluster as the positive data set and all data in all other clusters as the negative data set. Therefore, for the five cluster situation as shown in FIG 4a, five first scoring matrices are formed, wherein each matrix has a different one of the five clusters as the positive data set. The dimension of each matrix is 1 x n where n is the number of feature spaces for the data.

The second scoring matrices are calculated by comparing each cluster against each other cluster in turn. Therefore, for the five cluster situation as shown in FIG 4b, five second scoring matrices are formed, wherein each matrix has a different one of the five clusters as the positive data set. The dimension of each matrix is (n - 1) x n, where n is the number of feature spaces for the data. Once the first and second scoring matrices have been formed all F-Score values contained within each matrix are normalised to one. This is so that each F-Score can be used as a weighting factor for probabilities. Therefore, the outcome of the feature selection stage is a first and a second set of scoring matrices containing weightings for combining with probabilities calculated based on live measurements. As the feature selection process operates on cluster data established during the configuration phase (i.e. the first clustering phase), some embodiments of the invention perform feature selection (block 56) during the configuration phase before diagnosis is performed and then store the matrices generated so that they can be used during diagnosis.
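The formation of the two sets of scoring matrices can be sketched as follows, under stated assumptions. The F-Score is taken in its commonly used form (separation of the set means over the sum of the sample variances). "Normalised to one" is read here as scaling each row so that its weightings add up to one; dividing by the largest value would be another possible reading. The data layout, function names and values are hypothetical, and only three modes and two features are used to keep the example short.

```python
def f_score(positive, negative):
    """F-Score of one feature for a positive and a negative set of values."""
    n_pos, n_neg = len(positive), len(negative)
    mean_pos = sum(positive) / n_pos
    mean_neg = sum(negative) / n_neg
    mean_all = (sum(positive) + sum(negative)) / (n_pos + n_neg)
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    within = (sum((x - mean_pos) ** 2 for x in positive) / (n_pos - 1)
              + sum((x - mean_neg) ** 2 for x in negative) / (n_neg - 1))
    return between / within


def normalise(row):
    """Scale a row of F-Scores so the weightings add up to one (assumed reading)."""
    total = sum(row)
    return [value / total for value in row]


def one_vs_all(data, mode):
    """First scoring matrix (1 x n): the cluster of `mode` against all others pooled."""
    row = []
    for i in range(len(data[mode])):
        pooled = [x for other in data if other != mode for x in data[other][i]]
        row.append(f_score(data[mode][i], pooled))
    return normalise(row)


def one_vs_one(data, mode):
    """Second scoring matrix: one row per other mode, each compared in turn."""
    n_features = len(data[mode])
    return {
        other: normalise([f_score(data[mode][i], data[other][i])
                          for i in range(n_features)])
        for other in data if other != mode
    }


# Hypothetical configuration data: data[mode][feature_index] -> clustered values.
data = {
    "normal":   [[1.0, 1.1, 0.9], [5.0, 5.2, 4.8]],
    "overload": [[3.0, 3.1, 2.9], [5.1, 5.3, 4.9]],
    "phase_1":  [[1.0, 1.2, 0.8], [9.0, 9.2, 8.8]],
}

first = one_vs_all(data, "overload")   # weightings used in the first fusion stage
second = one_vs_one(data, "overload")  # weightings used in the second fusion stage
```

In this example the second feature separates overload from the phase 1 fault but barely separates it from normal operation, so it receives a large weighting in one row of the second matrix and a small weighting in the other.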

At block 58 (data fusion) the scoring matrices calculated during feature selection (block 56) are used as weightings to combine with probability values calculated using live measurements received from the motor 2. In particular, the features having a higher F-Score define more spaced out clusters for each operational mode and therefore, those features are given a relatively high weighting because they identify different modes more accurately. Conversely, features having a lower F-Score define more closely packed clusters and so are given a relatively low weighting because they do not identify different modes particularly accurately. It is noted that the value of F-Score is normalised between zero and one, and can take any value in between in dependence on cluster spacing. The overall output of the data fusion stage is a confidence level for each of the five modes, wherein each indicates the probability that the motor is operating in a different one of the five modes.

Data fusion is split into two stages. In the first stage, a confidence level for each mode of operation is calculated using the first scoring matrices. In the second stage, those confidence levels calculated in the first stage which are close to each other are compared using the second scoring matrices to identify which one should prevail over the others.

In the first stage, the confidence levels are calculated as follows. A live stream of measurements from the motor is collected and features are extracted as described above. For each feature value extracted, a probability is calculated of that value being part of each cluster relating to that feature which was calculated during the configuration phase. As mentioned above this operation is defined as the second clustering phase. The result of this operation is a set of five probabilities for each feature, wherein the total number of probabilities is equal to one hundred and fifty.

These probabilities are then grouped according to the mode of operation that they relate to rather than the feature that they relate to. Each probability relating to a particular mode of operation is then combined with the first scoring matrix. More specifically, each probability relates to a particular feature and a particular mode of operation. The step of combining multiplies that probability with the F-Score from the first scoring matrix which relates to that same mode and feature. During the feature selection stage the F-Scores were normalised to one and therefore, combining the probability with its corresponding F-Score in this way applies a weighting to that probability. Stated differently, the probability is increasingly attenuated the closer the cluster spacing is.

The above operation is performed for each probability relating to the particular mode. The combinations of probabilities and F-Scores are then combined into a single overall probability or confidence value for a particular mode. This operation is then repeated for each other mode to obtain five confidence values, one for each of the five modes of operation. This operation can be described by the following equation.

If the probability of a new live value relating to the $i$th feature and in the $j$th mode is $P_i^j$ and $i = 1, 2, \ldots, n$, wherein $n$ is the number of features, the probabilities for a single mode $j$ are combined using the following equation:

$$C^j = (P_1^j, P_2^j, \ldots, P_n^j)(fs_1^j, fs_2^j, \ldots, fs_n^j)$$

where $C^j$ is the overall confidence level in the $j$th mode, and $fs_i^j$ is the F-Score from the one-to-all comparison matrices (i.e. the first scoring matrices) relating to the $i$th feature in the $j$th mode. In the present embodiment there are five modes and therefore, five confidence values are created: $C^1$, $C^2$, $C^3$, $C^4$, $C^5$. The value of $n$ will equate to the number of features extracted during the configuration while the motor operates in a given mode. This will include features calculated in each of the three domains, i.e. time, frequency and wavelet.
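The first stage of data fusion can be sketched as below. The sketch reads the combination of probabilities and F-Scores as an inner product, i.e. each probability is multiplied by its weighting and the products are summed; this is an assumption based on the description of multiplying and then combining. The function names and all numerical values are hypothetical, and only two modes and three features are shown.

```python
def confidence(probabilities, weights):
    """Weighted sum of per-feature probabilities for one mode.

    `weights` are the normalised F-Scores from that mode's first scoring matrix.
    """
    if len(probabilities) != len(weights):
        raise ValueError("one weighting is needed per probability")
    return sum(p * w for p, w in zip(probabilities, weights))


def first_stage(prob_by_mode, weights_by_mode):
    """One confidence level per mode, each from that mode's own probabilities."""
    return {mode: confidence(prob_by_mode[mode], weights_by_mode[mode])
            for mode in prob_by_mode}


# Hypothetical live probabilities, grouped by mode, for three features.
prob_by_mode = {
    "normal":   [0.9, 0.2, 0.6],
    "overload": [0.1, 0.7, 0.5],
}
# Hypothetical normalised F-Scores (each row adds up to one).
weights_by_mode = {
    "normal":   [0.7, 0.1, 0.2],
    "overload": [0.2, 0.6, 0.2],
}

levels = first_stage(prob_by_mode, weights_by_mode)
```

Because the weightings in each row add up to one, each confidence level stays between zero and one whenever the probabilities do.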

The result of the first stage is a set of confidence levels, wherein each confidence level defines a probability that the motor is operating in a different one of the five modes of operation. For example, the result of the first stage may be as follows:

| Confidence value | Motor mode of operation | Stage 1 |
|---|---|---|
| $C^1$ | Normal | 18% |
| $C^2$ | Overload | 96% |
| $C^3$ | Disconnection of Phase 1 | 19% |
| $C^4$ | Disconnection of Phase 2 | 17% |
| $C^5$ | Disconnection of Phase 3 | 4% |

For the outlying values, such as 96%, it is possible to be reasonably confident of accuracy as the probability is sufficiently distinct from any other. In the present example, $C^2$ and $C^5$ are sufficiently distinguished to be reasonably accurate. However, for values which are close together (i.e. $C^1$, $C^3$ and $C^4$) it is more difficult to be confident which is the most probable. To be able to discriminate more accurately between confidence levels which are close together the second scoring matrix is used as follows.

Firstly, the maximum confidence level is identified and the variance of all the confidence levels is calculated. The following expression is then used to identify the set of values which are too close to be distinguishable from each other.

$$C_{CG} = \{\, C^j \mid (C_{max} - v_c) < C^j < C_{max} \,\}$$

where $C_{CG}$ is the confidence level final comparison group, $C^j$ is the confidence level in the $j$th mode, $C_{max}$ is the maximum confidence level in any mode, and $v_c$ is the variance of the confidence levels. In the present example, the confidence level final comparison group ($C_{CG}$) comprises $C^1$, $C^3$ and $C^4$. For each confidence level in the final comparison group (i.e. $C^1$, $C^3$ and $C^4$) a second confidence value is calculated using the second scoring matrices. The same expression as used above with respect to the first scoring matrices is used but this time with the second scoring matrices. In particular, each second value is created using the following expression:

$$C_{II}^{j} = (P_1^j, P_2^j, \ldots, P_n^j)(fs_{II,1}^{j}, fs_{II,2}^{j}, \ldots, fs_{II,n}^{j})$$

where $C_{II}^{j}$ is the second value for the confidence level in the $j$th mode, $P_i^j$ is the probability of the new live value relating to the $i$th feature and in the $j$th mode (as before), and $fs_{II,i}^{j}$ is the F-Score from the one-to-one comparison matrices (i.e. the second scoring matrices) relating to the $i$th feature in the $j$th mode.

Once second values have been calculated for the confidence levels in the final comparison group they are fed into a winner-takes-all network to identify which second value is the largest. It is noted that the largest value calculated in the second stage may be different from the largest value calculated in the first stage. For example, in the first stage the largest of the values in the final comparison group was $C^3$ with 19%. However, in the second stage second values are calculated using the second scoring matrices and therefore, these matrices may place a higher weighting on a particular feature than the first scoring matrices. In other words, $C^1$ or $C^4$ may prevail once the second confidence values are calculated. Considering the case where the new values of $C^1$, $C^3$ and $C^4$ are 12%, 29% and 51%, respectively, $C^4$ is distinguished enough from $C^1$ and $C^3$ to be reasonably confident that it is the most likely occurrence. The result of the second stage of data fusion is therefore as follows:

| Confidence value | Motor mode of operation | Stage 1 | Stage 2 |
|---|---|---|---|
| $C^1$ | Normal | 18% | 12% |
| $C^2$ | Overload | 96% | - |
| $C^3$ | Disconnection of Phase 1 | 19% | 29% |
| $C^4$ | Disconnection of Phase 2 | 17% | 51% |
| $C^5$ | Disconnection of Phase 3 | 4% | - |

Taking into consideration the results of the data fusion stage, it is possible to identify with reasonable confidence that the motor is experiencing an overload fault. It is very unlikely that the motor is experiencing a disconnection of phase 3 fault. However, there is some possibility that the motor is experiencing either normal operation, a disconnection of phase 1 fault or a disconnection of phase 2 fault. Within these three possible modes of operation, it is most likely that the motor is experiencing a disconnection of phase 2 fault.
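The second stage of data fusion can be sketched as follows. The comparison group is taken to be every confidence level lying strictly between the maximum minus the variance and the maximum; the source does not say whether the population or the sample variance is meant, so the population variance is assumed here. The first-stage values used to form the group are hypothetical, whereas the second values passed to the winner-takes-all step (12%, 29% and 51%) are those quoted in the example above. The function names are hypothetical.

```python
from statistics import pvariance


def comparison_group(levels):
    """Modes whose confidence lies strictly between (max - variance) and max."""
    values = list(levels.values())
    c_max = max(values)
    lower = c_max - pvariance(values)  # population variance: an assumption
    return [mode for mode, c in levels.items() if lower < c < c_max]


def winner_takes_all(second_values):
    """Mode with the largest second-stage confidence value."""
    return max(second_values, key=second_values.get)


# Hypothetical first-stage confidence levels (as fractions, not percentages).
levels = {
    "normal": 0.60,
    "overload": 0.62,
    "phase_1": 0.10,
    "phase_2": 0.61,
    "phase_3": 0.07,
}
group = comparison_group(levels)

# Second values quoted in the example above for C1, C3 and C4.
second_values = {"normal": 0.12, "phase_1": 0.29, "phase_2": 0.51}
winner = winner_takes_all(second_values)
```

In a fuller implementation the second values would themselves be computed from the live probabilities and the second scoring matrices, in the same way as the first-stage confidence levels.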

Armed with this information an engineer can devise a repair schedule based on the likelihood of a particular fault. For example, the engineer can first inspect the components of the motor which require attention in the event of an overload fault. If and when components are inspected which require repair, the engineer can replace or repair said components. The engineer can systematically work through the components of the motor in an order defined by the above results. In other words, after inspecting the aspects related to an overload fault the engineer can then look at aspects related to a disconnection of phase 2 fault, then those aspects related to a disconnection of phase 1 fault, and so on. It is an advantage of this embodiment that repair of a motor which is experiencing a fault can be performed more efficiently because a schedule of repair can be set around the likelihood that the motor is experiencing a particular fault, as described above.

It is an advantage of data collection that diagnosis can be performed using any available measurements collected. In particular, only a small number of sensors or indicators need be installed on the machinery because many features can be extracted from just one sensor. This limits increases in the cost of diagnosis, damage to the machinery, danger to the installer of sensors, and loss of productivity of the machinery during installation of the sensors.

The measurements streamed from the motor during data collection are prolific. It is an advantage of feature extraction that this large quantity of data is reduced to a number of features. Firstly, as the quantity of data is reduced the computing power necessary to process the data is reduced so that the diagnosis as described above can operate on hardware (e.g. computer 8) having lower processing power and in real-time. Secondly, reducing the stream of measurements to a finite number of features removes redundancy and results in noise reduction.

It is an advantage of feature extraction that the window size used to split up the stream of measurements collected from the motor can be varied. In particular, the window size can be varied in dependence on the resolution of the measurement stream in any particular domain, such as time, frequency and wavelet. Further, the window size can be varied in dependence on any other variable, such as wavelet depth.

It is an advantage of Gaussian clustering that variations in feature values calculated in any single mode can be modelled using the Gaussian distribution. In particular, by calculating the mean and variance of each cluster defined during the first clustering phase it is possible to identify the probability of a new measurement falling in that cluster. The accuracy achieved using this technique is greater than having a single stored feature value and relying on that one value to identify modes of operation. It is also possible to use other statistical characteristics of a cluster to determine a probability that a new measurement falls within the cluster. For example, the range, mode or median of a cluster could be used instead of or in addition to the cluster mean and variance. It is to be understood that in some example embodiments clustering may not be Gaussian clustering. In particular, parametric or non-parametric clustering techniques may be used.

It is an advantage of feature selection that multiple probabilities formed using each cluster are combined into a single probability or confidence level for each mode. Further, it is an advantage that the probabilities are combined in an intelligent way rather than, for example, simply taking an average. In particular, it is an advantage that the probabilities are combined according to their respective F-Score. Accordingly, probabilities relating to features which define spaced apart clusters are given a higher weighting than probabilities relating to features which define close together clusters. This approach corresponds to how an expert engineer might approach fault diagnosis. More specifically, the expert would use their experience to identify certain features which provide a particularly clear indication of a particular mode of operation and therefore, they would prioritise use of those features when diagnosing that particular mode.

It is a further advantage of feature selection that two sets of scoring matrices are formed. Therefore, when the first set generates confidence levels which are difficult to distinguish from one another, the second set may be used to analyse those confidence levels further to identify which is the most probable. This is made possible by considering the positive and negative data sets of the F-Score algorithm from two different perspectives. The two sets of scoring matrices help to identify particular modes of operation with higher accuracy than when just one set of scoring matrices is used. This is important in maintaining accuracy of fault diagnosis using only a limited number of measurement types, i.e. just current and voltage measurements. Accordingly, the diagnosis method is stable and consistently shows the correct mode of operation without alternating between different modes of operation due to incorrect diagnoses.

It is an advantage of data fusion that an individual confidence level is calculated for each mode using a plurality of features relating to the operation of the motor in that mode. This is because it is only possible to judge the behaviour of machinery with confidence based on a number of features or data points rather than a single feature or data point. However, it is most useful to an engineer to output a single confidence level rather than an array of values.

It is an advantage of the embodiment described above that unknown faults can be detected. It is possible to identify a mode of operation which is different from any of those for which clusters are defined during the configuration phase (i.e. during the first clustering phase). In particular, because the normal mode of operation is defined by feature clusters it is possible to identify new measurements which are unlikely to be part of those clusters but which are also unlikely to be part of any other mode's clusters. In such circumstances it is possible to say with reasonable certainty that the motor is undergoing a new fault of some kind. Given that an engineer will be able to review the confidence levels generated, it will be possible for them to design a repair schedule around the most likely known fault. This will likely assist the engineer in homing in on the cause of the new fault more efficiently than if no confidence levels were provided. Once the new fault has been characterised it is possible to incorporate the new clusters with those which were formed during the configuration phase.
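One simple way to flag an unknown fault is sketched below. The source does not specify how "unlikely to be part of any cluster" is decided; this sketch assumes that the confidence levels are not rescaled to add up to one and that a fixed cut-off is applied. The threshold value of 0.2 and the function name are hypothetical.

```python
def diagnose(levels, threshold=0.2):
    """Return the most likely known mode, or "unknown" if no mode is plausible.

    `threshold` is a hypothetical cut-off below which even the best-fitting
    known mode is treated as too unlikely to be accepted.
    """
    best_mode = max(levels, key=levels.get)
    if levels[best_mode] < threshold:
        return "unknown"
    return best_mode


known = diagnose({"normal": 0.90, "overload": 0.10})
novel = diagnose({"normal": 0.05, "overload": 0.08})
```

Even when the result is "unknown", the individual confidence levels remain available, so the closest known fault can still guide the order of inspection.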

It is an advantage of the embodiment described above that a running diagnosis of the motor's performance can be provided during any operational mode. While the motor is operating in a normal mode, the confidence levels will indicate that this is the case. However, when the motor begins to develop a fault, the confidence levels will highlight this. By tracking the changes in confidence levels it is possible for an engineer to anticipate a fault before it occurs. Also, it is possible for an engineer to catch a fault in an infant stage before it develops into a much more costly and time consuming problem. Further, as the engineer can predict a fault before it occurs, it is possible for a repair schedule to be designed around the times when the machinery is not in use, such as, for example, during night time or in-between cycles of use. Accordingly, the diagnosis method described above can be an online method because it can be operational and deliver confidence levels whenever the machinery is operational.

It is an advantage of the embodiment described above that input from an engineer is not required in-between collecting measurements from the machinery and generating confidence levels. Accordingly, the method is self-contained and can operate without supervision. It is also an advantage that an audit trail is available to justify any confidence level generated. Therefore, an engineer can confirm confidence levels which appear unusual or relate to faults which require a great deal of time and money to repair. In particular, the precise probabilities which lead to each confidence level in combination with their respective F-Scores can be identified.

It is an advantage of the embodiment described above that the method is portable across different machinery and different types of machinery. For example, the diagnosis does not rely on particular characteristics of a motor nor does it rely on expert knowledge of a motor. Instead, provided a stream of measurements can be obtained from a machine and it can be operated in a number of different modes during a configuration phase, the method described above can accurately diagnose what mode the machine is operating in.

Alternative embodiments of the present invention will now be described with reference to FIG 5 and FIG 6, each of which illustrates an alternative apparatus for implementing the above-described method of the first embodiment.

The apparatus of FIG 5 is suitable for use in an electric vehicle, such as, for example, an electric motor car. Specifically, the apparatus may be used with embedded in-wheel sensors and processors of the motor car to diagnose a mode of operation of the various components of the motor car, such as, for example, the brakes. The apparatus comprises an embedded processing unit (EPU) 200 in communication with a central processing unit (CPU) 202. In the present example embodiment, the EPU comprises a first processing unit and the CPU comprises a second processing unit. In the present example embodiment, the EPU 200 communicates with the CPU 202 via a controller-area network (CAN) bus 204. It is to be understood that some other example embodiments may use a communications link which is different to a CAN bus. The EPU 200 includes a sensing and digitising unit 206, a memory 208 and a feature extraction unit 210. The sensing and digitising unit 206 is in communication with the memory 208 and one or more embedded in-wheel sensors and/or processors (not shown). The memory 208 is also in communication with the feature extraction unit 210. The CPU 202 includes a predictive intelligence unit 212, a feature selection unit 214, a query scheduler 216 and a feature query unit 218. Each component of the CPU 202 is in communication with each other unit of the CPU.

In the present example embodiment, the CPU 202 is a more powerful processing unit than the EPU 200 and, therefore, the CPU takes control of general processing tasks. In some other example embodiments, however, the EPU could be a more powerful processing unit. The EPU 200 provides an intermediary unit which transfers data between the one or more embedded in-wheel sensors and/or processors and the CPU 202, and temporarily records data from the sensors and/or processors. Specifically, the sensing and digitising unit 206 receives a stream of measurements from the one or more embedded in-wheel sensors and/or processors, and digitises those measurements. The digitised values are then transferred to the memory 208 for storage. Accordingly, the sensing and digitising unit 206 performs the data collecting step of the above described method of the first embodiment.

The feature extraction unit 210 compresses the stored, digitised stream of measurements into statistical metrics that are representative of the signal's characteristics or patterns. Accordingly, the feature extraction unit 210 performs the feature extraction step of the method of the first embodiment. Therefore, a role of the feature extraction unit 210 is to extract statistical features from the stored, digitised measurements in order to characterise the operation of the motor car elements monitored by the one or more embedded in-wheel sensors and/or processors.

The feature extraction unit 210 communicates with the feature query unit 218 and the query scheduler 216 of the CPU 202, via the CAN bus 204. The query scheduler 216 knows the urgency (or priority) of fault conditions of the motor car elements monitored by the one or more sensors and/or processors. In other words, the query scheduler is programmed with the relative severity of each fault condition. Also, the query scheduler 216 knows which extracted features are relevant for identifying each fault condition. Specifically, the query scheduler 216 queries the feature selection unit 214 in order to identify which features best identify particular fault conditions. It is to be understood that the feature selection unit 214 performs the feature selection step of the method of the first embodiment. Additionally, the predictive intelligence unit 212 performs the Gaussian clustering step of the method of the first embodiment. Accordingly, the query scheduler 216 is capable of instructing the feature query unit 218 to request features from the feature extraction unit 210 in dependence on which faults are most severe and which features most effectively identify those faults. This allows the CPU 202 to query the EPU 200 for features based on the urgency of fault conditions and with a frequency that is required for analysis. According to this operation, the quality of the data transferred between the EPU 200 and the CPU 202 can be optimised when the amount of data transferred between the EPU 200 and the CPU 202 is reduced, for example, because the communication means between the EPU and CPU can only handle a relatively low data rate. The predictive intelligence unit 212 is then arranged to perform the data fusion step of the method of the first embodiment. Accordingly, the unit 212 generates confidence levels based on the measurements taken by the one or more embedded in-wheel sensors and/or processors.
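The behaviour of the query scheduler can be sketched as follows, purely as an illustration. The sketch assumes that severities are stored as numbers, that feature relevance is given by normalised F-Scores, and that the restricted link is modelled as a budget on the number of feature requests. The function name, the `per_fault` limit, the fault names and all values are hypothetical.

```python
def plan_queries(severity, feature_scores, budget, per_fault=2):
    """List (fault, feature) requests: most severe fault first, best features first.

    severity:       fault name -> relative severity (higher is more severe)
    feature_scores: fault name -> {feature name -> F-Score weighting}
    budget:         maximum number of requests the link can carry (assumption)
    per_fault:      number of top-ranked features requested per fault (assumption)
    """
    requests = []
    for fault in sorted(severity, key=severity.get, reverse=True):
        scores = feature_scores[fault]
        ranked = sorted(scores, key=scores.get, reverse=True)
        for feature in ranked[:per_fault]:
            if len(requests) >= budget:
                return requests
            requests.append((fault, feature))
    return requests


# Hypothetical faults and feature weightings for an in-wheel unit.
severity = {"brake_failure": 3, "sensor_drift": 1}
feature_scores = {
    "brake_failure": {"time_mean": 0.10, "freq_peak": 0.70, "wavelet_energy": 0.20},
    "sensor_drift": {"time_mean": 0.80, "freq_peak": 0.15, "wavelet_energy": 0.05},
}

plan = plan_queries(severity, feature_scores, budget=3)
```

With a budget of three requests, two are spent on the features that best identify the more severe fault and the remaining one on the best feature for the less severe fault.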

It is noted that the query scheduler 216 can be adaptive to new machinery and environments. Specifically, the programming of the query scheduler 216 can be updated so that the relative severities of fault conditions, or the features associated with each fault condition, are modified. This modification can be performed manually, i.e. by a human user of the CPU 202 physically modifying the programming, or automatically by using the F-Score method described with reference to the first example embodiment.

It is also noted that the EPU memory may store data relating to faults which are not deemed to be severe faults. According to the above operation, this data is not necessarily transmitted to the CPU. Therefore, storing this data means that it may be analysed at a different time in order to identify minor or non-severe faults.

It is an advantage of the present embodiment that the CPU 202 can query the EPU 200 for features in dependence on which faults are most severe, and which features most effectively identify those faults. Stated differently, the CPU can instruct the EPU to send it data corresponding to particular modes of operation, such as serious failure states. This advantage may be illustrated by considering an exemplary scenario in which a high frequency sampled signal is obtained by the EPU 200 and needs to be transferred to the CPU 202. However, the signal's sampling rate may need to be reduced before transmission to the CPU. For example, this may be required because of limitations in the communications link between the EPU 200 and CPU 202, or because of limitations in the CPU 202 itself. The inventors of the present embodiment have surprisingly found that high frequency changes in the signal provide information which is important for the prediction of fault conditions or patterns. The present embodiment is advantageous because it optimises the use of bandwidth and, therefore, when bandwidth is restricted, the use of that restricted bandwidth is optimised. Specifically, the restricted bandwidth is used to transfer data relating to the most severe fault conditions. In particular, the restricted bandwidth is used to transmit data relating to features which are effective at identifying the most severe fault conditions. Therefore, high frequency change data relating to serious faults can be transmitted in preference to transmitting data relating to less serious or non-serious faults.

A further advantage of the operation of the present embodiment is achieved because less data is transferred from the EPU 200 to the CPU 202, i.e. fewer features are transferred because only those relevant to a subset of the recognisable operational modes are transmitted. For example, only those relevant to severe fault conditions are transmitted, rather than transmitting those relevant to all fault conditions. This operation is advantageous because the EPU 200 consumes less energy as there is less data transfer. Such an advantage is particularly significant in cases in which there is a need to minimise power consumption, for example, when the EPU is part of a mobile unit, such as an electric motorcar.

The apparatus of FIG 6 is the same as the apparatus of FIG 5, except that the query scheduler 216 is part of the EPU 200 rather than part of the CPU 202. According to the arrangement of FIG 6, more fault diagnosis may be performed by the EPU 200 than in the arrangement of FIG 5. Specifically, in the arrangement of FIG 6 the query scheduler 216 is capable of monitoring the stream of measurements received by the sensing and digitizing unit 206, to identify particular fault conditions. For example, the query scheduler can identify fault conditions based on conditions programmed into the query scheduler by a human user.

In operation, the query scheduler 216 activates a transfer of information from the EPU 200 to the CPU 202 in response to a change in the sensory values which signifies a possible fault condition. In such circumstances, the operation of the arrangement of FIG 6 then mimics the arrangement of FIG 5. However, during times when the query scheduler 216 has not identified a possible fault condition, the information collected by the sensing and digitizing unit 206 is recorded in the memory 208 and is not transferred to the CPU 202 at that time. Instead, the stored data is transferred to the CPU 202 to be analysed at a later stage, for example, when communication is less expensive or easier. This transfer of data may be initiated by the EPU or the CPU.
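By way of illustration only, the operation described above may be sketched as follows. The sketch assumes that a possible fault condition is recognised by a user-supplied test applied to each measurement, and that the communications link is represented by a callable. The names EpuScheduler, is_possible_fault, send and flush are hypothetical and do not appear in the embodiment.

```python
from collections import deque


class EpuScheduler:
    """Sketch of a query scheduler located in the embedded unit.

    A measurement that signifies a possible fault is transmitted at once.
    All other measurements are held in local memory until a deferred
    transfer is requested.
    """

    def __init__(self, is_possible_fault, send):
        self.is_possible_fault = is_possible_fault  # assumed trigger test
        self.send = send                            # assumed link to the central unit
        self.memory = deque()                       # stands in for the local memory

    def on_measurement(self, sample):
        """Handle one measurement; return True if it was transmitted."""
        if self.is_possible_fault(sample):
            self.send([sample])
            return True
        self.memory.append(sample)
        return False

    def flush(self):
        """Transfer the stored measurements, e.g. once communication is cheaper."""
        if self.memory:
            self.send(list(self.memory))
            self.memory.clear()


if __name__ == "__main__":
    transmitted = []
    scheduler = EpuScheduler(lambda value: value > 10.0, transmitted.append)
    for value in [1.0, 2.0, 12.0, 3.0]:
        scheduler.on_measurement(value)
    scheduler.flush()
    print(transmitted)
```

In this sketch the deferred transfer is started by calling flush, which may be done by either side, consistent with the transfer being initiated by the EPU or the CPU.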

The operation of the present embodiment is advantageous in cases when communication between the EPU 200 and the CPU 202 is restricted, for example, due to an interruption or because it is too costly. Communication may be too expensive if, for example, it is over a cellular communication network. An exemplary scenario relates to monitoring highly valuable goods during transportation. Due to the fact that only limited information can be sent in transit, the query scheduler 216 is placed inside the EPU 200 within the transit container. Accordingly, only information that is urgently required by the CPU 202 (for example, a server in a monitoring room) is sent, in order to minimise the transfer of data between the EPU and CPU. According to this operation, the EPU 200 only transmits data to the CPU 202 when a serious fault condition is about to occur. Otherwise, the EPU 200 stores the measurement data, so that less severe faults can be diagnosed once communication is less restricted, for example, once transportation is complete.

It is to be understood that in the above embodiments, the intelligent feature extraction, feature selection, planned feature query and diagnostics based on fault priority result in lower power consumption relative to traditional systems. This advantage becomes even more beneficial in cases where mobile battery-powered units are used for embedded analysis or even central processing, for example, in transit situations.

It is to be understood that the advantages associated with the method of the first example embodiment may also be achieved by the apparatus of the embodiments of FIG 5 and FIG 6.

Various additions and modifications may become apparent to the skilled person when reading the above description of embodiments of the invention, any and all of which are intended to fall within the scope of the appended claims. For example, in the above embodiments a three-phase DC motor provides the machinery on which diagnosis is performed. However, another type of electro-mechanical machine could be used instead of, or in addition to, the motor, such as a power generator, an assembly-line robot, a printing press, or any other type of machinery. Additionally or alternatively, rather than using a voltage and current sensor or indicator, other types of measurement device may be used, such as vibration sensors, resistance sensors, thermal sensors, speed sensors, power sensors, or any other type of sensor for measuring a physical or electrical characteristic of machinery. Additionally or alternatively, other features could be extracted from measurements streamed from the machinery in addition to, or instead of, those features mentioned above. Additionally or alternatively, probability distributions other than the Gaussian probability distribution could be used to calculate the probability that a new measurement falls within an existing cluster.
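By way of illustration only, the substitution of one probability distribution for another may be sketched as follows. The sketch assumes that each existing cluster is summarised by a mean and a standard deviation of a single feature, and that the confidence for a new measurement is taken as the two-sided tail probability of lying at least as far from the cluster mean. This confidence measure, the Laplace alternative and the names gaussian_tail, laplace_tail and cluster_confidences are assumptions made for the purpose of the sketch and are not the calculation of the first example embodiment.

```python
import math


def gaussian_tail(x, mean, std):
    """Two-sided Gaussian tail probability P(|X - mean| >= |x - mean|)."""
    z = abs(x - mean) / std
    return math.erfc(z / math.sqrt(2.0))


def laplace_tail(x, mean, std):
    """The same tail probability for a Laplace distribution of equal variance."""
    scale = std / math.sqrt(2.0)  # Laplace variance is 2 * scale ** 2
    return math.exp(-abs(x - mean) / scale)


def cluster_confidences(x, clusters, tail=gaussian_tail):
    """Return a confidence for each operating mode.

    `clusters` maps a mode name to an assumed (mean, std) summary of the
    cluster; `tail` selects the probability distribution to be used.
    """
    return {mode: tail(x, mean, std) for mode, (mean, std) in clusters.items()}


if __name__ == "__main__":
    clusters = {"healthy": (10.0, 1.0), "fault": (14.0, 1.0)}
    for tail in (gaussian_tail, laplace_tail):
        scores = cluster_confidences(10.5, clusters, tail=tail)
        print(tail.__name__, max(scores, key=scores.get))
```

Because the distribution is passed as a parameter, a different distribution can be substituted without altering the remainder of the comparison.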