Login| Sign Up| Help| Contact|

Patent Searching and Data


Title:
METHOD AND PRODUCT FOR CREATING RULES
Document Type and Number:
WIPO Patent Application WO/2006/084945
Kind Code:
A1
Abstract:
A method for creating rules (44, 45) to build up a rule hierarchy (50) for dividing a feature space, wherein that at most two features (Fx, Fy) are chosen, the chosen features (Fx, Fy) are used to span a coordinate system, a view of the coordinate system is presented on a display device, at least one rule (44, 45) is generated into the view, the rule (44, 45) is connected in cascade (56) to another rule and the connected rules are included in the rule hierarchy (50).

Inventors:
REUNANEN JUHA (FI)
SAARELA ANTTI (FI)
Application Number:
PCT/FI2006/000041
Publication Date:
August 17, 2006
Filing Date:
February 10, 2006
Export Citation:
Click for automatic bibliography generation   Help
Assignee:
ABB OY (FI)
REUNANEN JUHA (FI)
SAARELA ANTTI (FI)
International Classes:
G06K9/62; G06T7/00; G06F
Foreign References:
US20040218806A12004-11-04
US20040007878A12004-01-15
Other References:
R. DUDA ET AL: "Pattern Classification", 2001, JOHN WILEY AND SONS, USA, XP002374469
SOMOL P ET AL: "Feature selection toolbox", PATTERN RECOGNITION, ELSEVIER, KIDLINGTON, GB, vol. 35, no. 12, December 2002 (2002-12-01), pages 2749 - 2759, XP004379645, ISSN: 0031-3203
Attorney, Agent or Firm:
Heikkinen-keinanen, Merja (Legal Affairs/Patents P.O. Box 210, Helsinki, FI)
Download PDF:
Claims:
What is claimed is:
1. A method for creating rules (44, 45) to build up a rule hierarchy (50) for dividing a feature space, characterized in that at most two features (Fx, Fy) are chosen, the chosen features (Fx, Fy) are used to span a coordinate system, a view of the coordinate system is presented on a display device, at least one rule (44, 45) is generated into the view, the rule (44, 45) is connected in cascade (56) to another rule and the connected rules are included in the rule hierarchy (50).
2. A method according to Claim 1, characterized in that at most two features (Fx, Fy) are chosen from a multitude of features, which multitude of features includes the previously chosen features and at most two newly chosen features.
3. A method according to Claim 1, characterized in that more than one rule (44, 45) is generated into a single view.
4. A method according to Claim 1, characterized in that a rule (44, 45) is generated into the view by drawing on a display device by a pointing device.
5. A method according to Claim 1, characterized in that a rule hierarchy (50) is utilized as classification criteria for classifying defects on a sheet like material.
6. A method according to Claim 1, characterized in that a rule hierarchy (50) is utilized for dividing training data for at least one trainable classifier (19).
7. A method according to Claim 1, characterized in that a rule hierarchy (50) is utilized for a trainable classifier (19) having at least one classification result for rejecting at least one classification result.
8. A method according to Claim 1, characterized in that multidimensional data (40) is introduced to the view before generating at least one rule (44, 45) into the view.
9. A computer program product stored on computer operable media for creating rules to build up a rule hierarchy (50) for dividing a feature space, characterized in that the computer program product comprises: means for choosing at most two features (Fx, Fy), means for spanning a coordinate system using the chosen features (Fx, Fy), a display means for presenting a view of the coordinate system, means for generating at least one rule (44, 45) into the view, means for connecting the rule (44, 45) in cascade (56) to another rule and means for including the connected rules in the rule hierarchy (50).
Description:
METHOD AND PRODUCT FOR CREATING RULES

TECHNICAL FIELD The present invention relates to a method for creating rules to build up a rule hierarchy for dividing a feature space, and to a computer program product stored on a computer operable media for creating rules to build up a rule hierarchy for dividing a feature space. Especially the invention is applicable in the area of classifying defects on a sheet like material, as paper, metal, glass fiber, non-woven, plastic etc.

BACKGROUND ART

The design of data mining applications has received much attention in recent years. Examples of such applications include similarity determination and classification. In the context of data mining, it is assumed that we are dealing with a data set containing N objects in a dimensionality of d. Thus, in this data space, each object X can be represented by the d coordinates (x(l), . . . x(d)). These d coordinates are also referred to as the features in the data. The data space is also referred to as the feature space, which may reveal interesting characteristics of the data.

One problem in data mining is the prediction of particular class labels from the features. In this problem, there is a set of features, and a special variable called the class variable. The class variable typically draws its value out of a discrete set of classes C(I), . . . C(k). A test sample is defined to be a data example for which only the feature values are known, but the class variable is unknown. A training sample is defined to be a data example for which both the feature variables and the class label are known. Training data is defined to be a set of training samples. Training data is used in order to construct a model, which relates the features in the training data to the class variable. This model can then be used in order to predict the class behaviour of individual test samples, also referred to as sample labelling or classification.

However, as sophisticated and, in some cases, complex as these classification techniques may be, these conventional automated techniques lack benefits that may be derived from human interaction during their design and application stages. Any rule-based classifier must be configured with selected rule variables and limit values. Manual classification of example data, the training data, can be used to validate the rule hierarchy of a rule-based classifier. In a typical multidimensional classification problem, the number of available features can exceed twenty or even hundred features. Where the features and their scale are challenging and the rule hierarchies grow complex, the act of creating sensible and desired rule hierarchies is tedious without a good user interface. Therefore, techniques are needed that effectively employ human interaction in order to design and/or perform data mining applications such as classification.

Conventionally, the rules for a multidimensional classification task have been defined as a series of conditions. The condition parameters have been defined directly as machine-readable code or by using some type of a parser to generate the needed logic. In present applications, the rule hierarchy has been visualized for example as a tree structure showing the nodes and edges of the hierarchy. The visualization of only the hierarchy, however, does not utilize the data at hand. Also shortcomings in the visualization of the created rules makes it hard to check the validity of parameters. As an example, the user may, unintentionally, create two rules where one excludes the other, thus creating a situation where the excluded rule branch is never reached. In ordinary solutions the verification of the rules is tedious and error prone and typically several passes are needed before reaching the intended solution. This is especially true for multidimensional data with more than a few features. The plotting of the data helps to understand the statistics of the data in respect of the features selected for plotting the data. In general, such visualization techniques have been used to present a certain projection of the data to the user. Also, there are methods, where one selected projection of the data is visualized and the user has the possibility to define and visualize any number of rules to that projection. Such method is described in US patent application publication 2004/0078378.

For many types of classifiers a huge amount of samples may be available when creating the classification rules. Using the available data helps to create classification rules applicable and suitable to that particular data. The data may be labelled or unlabelled. Conventionally, the visualization of training or test data is not incorporated in the creation process of the classification rule hierarchies.

Powerful human perceptual abilities can be utilized for detecting interesting features in low dimensions (from ID to 3D). This is crucial for understanding the nature of a multidimensional data set. Graphic displays such as histograms or scatter-plots are effective ways of visualizing a data set for human interpretation. While there have been approaches utilizing such visual displays for visualizing the data, they lack the ability to directly draw rules onto the visualization surface and the ability to create rule hierarchies from these rules. As the rule hierarchies get more complex, the rule creation process gets error prone. Any unnecessary step between defining correct rules and implementing them raises the risk for making errors.

DISCLOSURE OF INVENTION

The purpose of this invention is to solve the problems described above and to create a solution for creating rules to build up a rule hierarchy for dividing a feature space.

This is achieved with a method in which at most two features are chosen, the chosen features are used to span a coordinate system, a view of the coordinate system is presented on a display device, at least one rule is generated into the view, the rule is connected in cascade to another rule and the connected rules are included in the rule hierarchy. The computer program product according to the invention comprises: means for choosing at most two features, means for spanning a coordinate system using the chosen features, a display means for presenting a view of the coordinate system, means for generating at least one rule into the view, means for connecting the rule in cascade to another rule and means for including the connected rules in the rule hierarchy.

The new method creates multidimensional rule hierarchies for a feature space efficiently, even when the number of rules in parallel and in series is high. For all rules created in one level a successive rule branch can be generated. This can be made by choosing one or two features from a multitude of features, which multitude of features includes the previously chosen features and one or two newly chosen features. The new feature or the new feature pair is then used to create a next rule level.

The solution provides visual manners for defining individual rules and their limit values. A low-dimensional space is spanned by one or two features which are selected from the feature space. Then the low dimensional space is displayed on a display device. One or more rules can be generated directly to the view for instance by drawing with a pointing device. Limit values for a rule are defined by the position and the geometry of the drawn object. The created rule is connected in cascade to another rule and the connected rules are included in the rule hierarchy. The rules form a rule hierarchy where any number of subsequent rules can be applied after any rule. The solution imposes no restrictions to the amount of parallel rules in any node of the rule hierarchy. The solution sets no restrictions to the amount of serial rules in any branch of the rule hierarchy. The solution allows also the linking of rules in a serial manner. The solution implies no restrictions to the length of the feature vectors to be categorized by the rule hierarchy. The solution also visualizes the created rule hierarchy providing an easy access to any node of the structure.

The invention is particularly advantageous when multidimensional data is introduced to the feature space. One or two features can be chosen to span a coordinate system on a visualization surface and when introducing a multidimensional data to the view it leads to a visual presentation of the data, where data points can be drawn as a so called histogram or scatter-plot, respectively. By drawing boundaries to the visualized data and using them as rules the user can see which data comply with a rule and which data depart from a rule immediately, thus making the work less error prone. After creating several rules to the rule hierarchy it

is also possible to indicate the data points which comply with the whole rule hierarchy, i.e. track what part of the original data is left in the current phase. This feature of the invention is of special value when analyzing the operation of the rule hierarchy.

The rules and definitions in the built rule hierarchy may change many times during the life cycle of the rule hierarchy. It is one of the advantages of this solution that the created rules can easily be modified and verified against available test and training data.

Implementing the rules automatically as they are defined minimizes risk for making errors as unnecessary steps between defining correct rules and implementing them is avoided. The invented method allowing easy visualization and navigation of the rule hierarchy further reduces the errors.

In the invention the ability to design a multidimensional rule hierarchy is combined with the ability to create trainable classifiers with a restricted set of training data. In this context, the rule hierarchy can be used to divide the training data at hand to smaller subsets, where each subset can be used to train a particular trainable classifier. Also at the same time, the rule hierarchy can be used to restrict the outcome of any trainable classifier linked to the rule hierarchy. It can reject one or more classification results of a trainable classifier. Any type of external classifier can be linked to any number of the leaf nodes of the hierarchy.

Furthermore the rule hierarchy can also be used to classify test samples directly without trainable classifiers. This combination of human driven rule generation using a priori knowledge and utilization of trainable classification techniques presents a powerful method for any type of classification task. When the knowledge of underlying features is incomplete a trainable classifier can be used, whereas explicit rules are applicable when the user knows the interpretation and scale of the features used in the rales as well as the relation of the features to the classes.

The method for creating rules to build up a rule hierarchy for dividing a feature space is advantageous in the area of classifying defects on a sheet like material (paper, metal, glass fiber, non-woven, plastic, etc.), where the created rule hierarchy can be utilized as classification criteria for classifying defects.

In an advantageous embodiment the method for creating rules to build up a rule hierarchy for dividing a feature space is performed using a computer. The programs to be used are stored in the memory of the computer or on computer readable media, for example a CDROM, which can be loaded on a computing device. These computer readable media have instructions for causing the computer to execute a method.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a schematic overview of a system in context of which the invention can be used;

HG. 2 presents a digitized image of paper web with a defective area;

HG. 3 presents a flow chart describing the main steps of the invention;

HG. 4 presents rule definition to a scatter-plotted feature data;

HG. 5 presents a visualized rule hierarchy.

DETAILED DESCRIPTION OF THE INVENTION

HG.1 illustrates an industrial application of a visual inspection system 10, in connection of which the method for creating rules to build up a rule hierarchy for dividing feature space can be used. This is an example where the visual inspection system represents any visual system, which takes and collects electronic images of various materials or objects for classifying different characteristics in them. Visual inspection system 10 may be applied for various continuous and discontinuous manufacturing lines. Fig. 1 illustrates a web inspection system 1 in which the visual inspection system 10 is inspecting a moving and continuous web 11 manufactured on a process line such as a paper machine.

The moving web 11 is viewed by one or several cameras 13 from one side of the web 11. The cameras 13 are mounted on a suitable mechanical support such as a camera beam 12. The web 11 is illuminated from underneath by a light source 14. The light source may also be located above the web 11. Transmitting light, as illustrated in FIG.l is favorably used for translucent materials. Reflecting light is suitable especially for other types of materials. With the reflecting light the illumination angle may either be specular or diffuse in respect to the camera- viewing angle.

The cameras 13 may be any types of electronic cameras, which can be directly or indirectly coupled to image processing unit 15. Functions of the image-processing unit

15 may also be integrated with the camera 13, in which case the camera 13 is a more complicated and self-contained image-processing unit. Image data output of an analog camera, for example an analog CCD line scan or matrix camera, has to first be converted to digital format. Digital camera output is typically more ready for digital processing in the image processing unit 15. The image-processing unit 15 receives from the cameras

13 a digital representation of the view imaged by the cameras 13. The representation is in the form of a series of digital numbers. Image processing unit 15 interprets this data as an electronic image, which is elsewhere referred to as an image, on the basis of the information it has about the properties of the camera 13. For example, the image processing unit 15 combines the successive series of data sent by a camera of line scan type to form a matrix that represents an image of the web 11.

The image-processing unit 15 is a separate, typically programmable, hardware unit. It can be partially or totally integrated with the camera, as depicted in Fig. 1. It can be also a personal computer or any other type of universal computer. One computer may take care of image data processing of one or several cameras. The outcome of this processing stage is a set of electronic images representing selected parts of the web, the images being manipulated electronically to meet requirements of the application at hand.

The images are forwarded to the next processing step, which is image analysis. This step can be done in a separate computer, which may be a part of an operator station 16 of the visual inspection system 10 and it is typically common to all the cameras 13. Image

analysis comprises, for example, segmentation for finding the interesting areas, such as defects, in the image. After segmentation, features describing properties of the regions found by segmentation can be extracted. The features are numeric values that will be used in recognizing the areas, i.e. in classifying them.

Operator station 16 contains the user interface of the visual inspection system 10. It is used for entering various tuning parameters and selecting desired displays and reports, which for example show the status of the system and the quality of the inspected products. Naturally the visual inspection system 10 requires separate means for supplying power to the system and devices for interfacing with the external systems such as the process itself. These means, which are well known to those of ordinary skill in the art, can be located in an electronic cabinet 17. In addition to operator station 16, external devices 18 can be used for alerting the operator.

The image data is stored in an image database. The image collection of the database consists of different types of digitized paper web defects. The defects are detected and their images are digitized from a running paper web. An example of the paper web defects is shown in Fig. 2. The defective area 2 is the object of interest in the image and is marked with a dash line. Digital line-scan cameras acquire the defect images with transmission lighting and the images are stored to an image database together with a set of calculated features associated with certain areas of the image. A plurality of such defect images with a varying number of defects and associated features in each image form a defect image collection. The associated features can be used to classify the defects appropriately.

For classifying the defects a classifier 19 is used. There are two main categories in the automatic classification methods: trainable classifiers and rule-based classifiers. Any trainable classifier can be trained with examples of the objects to be classified. To train a trainable classifier a number of examples must be manually labeled. The labeled examples form a so-called training material. A trainable classifier adapts to the features and given class labels of the training material and can then be used to classify unlabeled samples based on their features. On the contrary, a rule-based classifier doesn't require

any training material. Instead, the classification rules are defined based on a priori knowledge of the features and the data to be classified. The rules can be simple and easily understandable or very complex and difficult to interpret. The rules are easier to interpret, if the features used for the rules are understandable. Also, the low complexity of the rule hierarchy helps in figuring out what the rules are for. In many challenging classification tasks this is not the case.

HG. 3 presents a flow chart describing the main steps of the invention. A method for creating rules to build up a rule hierarchy for dividing a feature space comprises of several steps numbered from 31 to 38. The step 31 is to load the stored feature data. In the step 32 the appropriate feature or two features are selected. These features are used for rule visualization in the following step 33. Feature data is presented as a histogram or a scatter-plot in the step 34. The step 35 consists of defining rules to the view. The rules can be drawn with a pointing device or be defined numerically by inputting the correct boundary values. A rule is a union of one or more regions. The linked regions are visually indicated in order to clarify the created logical set, step 36. The step 37 is the definition of the actions caused by each rule. The rules can be linked to subsequent rule levels. In this case, step 32, selection of one or two features to form a new level, is repeated for that level.

The rule hierarchy is complete after all the rule levels and actions are defined. The whole rule hierarchy can then be exported for later use, step 38.

To further clarify the use of the invention, the method for creating rules to build up a rule hierarchy for dividing a feature space is applied for classifying objects presented by the feature vectors to a fixed number of class categories. The objects in the embodiment of the invention are the images of defects of a sheet like material. The rules are used for reducing the number of appropriate classes based on the samples' feature values and for dividing training material for a number of supervised image classifiers. The task is to classify an image collection by applying rules based on the feature values of the samples.

Where unique class label cannot be defined based on the defined rules the classification

is carried out by training a trainable classifier for choosing between the appropriate classes. The image collection can be represented by labeled or unlabeled data.

Defining the first pair of features to be used in the division of the data starts the rule creation process. FIG. 4 presents rule definition to a scatter-plotted feature data. A scatter-plot is drawn with these selected two features Fx and Fy. The first rules 44, 45 can be drawn with a pointing device. The rules can also be defined numerically by inputting the correct boundary values. The boundary values achieved by drawing can also be checked and altered numerically. Li both ways the defined rule is visualized on top of the scatter-plot data 40. Each rule 44, 45 consists of one or more conditions. A condition is a comparison of the feature value to a given boundary value. The boundary value can constrain the feature to be less or more than the given boundary value. A region 41, 42, 43 consists of one or more conditions and can be visualized in one or two dimensions. The regions 41, 42, 43 are defined by drawing directly to the visualization plane using a pointing device or by inputting the boundary values numerically for each boundary. Each boundary value can also be set to negative or positive infinity, i.e. the feature value is not constrained in the corresponding direction. A rule is a union of one or more regions. In FIG. 4 a rule 44 is a union of two regions 41, 42. Further, the individual regions can be linked to other regions or left as single region rules. AU interlinked regions lead to the same branch in the rule hierarchy, thus creating exactly one logical rule. The linked regions are visually indicated in order to clarify the created logical set.

All rules can be modified afterwards by changing only the corresponding condition boundary values. The actions can also be modified without affecting the rule hierarchy.

For each rule created a number of actions can be defined. In this case the list of appropriate classes is reduced based on the applied rules. For all of the rules created in this level a successive rule branch can be created. This is achieved by selecting the rule and then defining the next two features to be used for creating the next level. Then another rule of the first level can be selected and that branch of the rule hierarchy can be expanded with successive rules. The successive rules are defined as in the first stage and the list of appropriate classes is further reduced. This process will continue until all the

desired rules have been defined. If there is only one class left at the leaf node of the rule hierarchy it can readily be used for classifying samples. On the other hand, if several classes still exist, a trainable classifier 19 (FIG. 1) can be linked to the leaf node. This linked classifier 19 can be used for choosing the correct class label. The built rule hierarchy can be applied when new samples are classified. It then helps the trainable classifiers by removing the undesired class labels using the rules.

In FIG. 5 the rule hierarchy 50 is visualized in a form of a branching tree, where each node 52, 53 can be branched to any number of child nodes 54. The node 53 is connected in cascade 56 to a parent node 55. This tree visualization is used for navigating the rule hierarchy, and any rule level, i.e. node of the tree, is reached easily by selecting the desired node by using a pointing device. The outlined area 51 presents the hierarchy level in FIG. 4. The two nodes 52, 53 correspond to previously defined two rules 44, 45 in HG. 4.

Another embodiment of the invention is to use the method for creating specialized trainable classifiers. When feeding the training material for the trainable classifiers, the rule hierarchy can be applied for each training material sample and the appropriate classifier is selected based on the rules. In this way each classifier only sees a portion of the training material samples, thus specializing in the classification task of certain types of defects.

The invented method can also be used to divide the training material for a number of supervised image classifiers. This image collection can be represented by labeled or unlabeled data.

In one embodiment the invention is applied to help the user to check and modify a rule hierarchy already made by some instance. The rules to be verified may have been made by the user himself/herself or by another person or it may have been generated automatically. A tree branching algorithm may produce an automatic rule set based on training data. An algorithm known as C4.5 can be used in this task, for example [the algorithm is described in J. Ross Quinlan. C4.5: Programs for Machine Learning. Morgan

F

12

Kaufmann, 1993]. In such a case, it is often particularly important to verify the reasonability of the rules manually.

In an advantageous embodiment the method is performed using a computer.

Various changes can be made to the invention without departing from the spirit thereof or scope of the following claims.