

Title:
QUALITY INSPECTION AND DATA CLASSIFICATION
Document Type and Number:
WIPO Patent Application WO/2019/171125
Kind Code:
A1
Abstract:
The present disclosure provides a method and a device for inspecting a quality of an object, data classification, a program and a storage medium. The method comprises: acquiring image data of multiple objects to be inspected (S101); for each piece of image data, inputting the image data into a classifier and acquiring an output of the classifier (S103), the output being the likelihood that the corresponding object to be inspected is qualified and the likelihood that it is unqualified; generating evaluation data for each piece of image data on the basis of the likelihood (S105), the evaluation data representing an absolute value of a difference between the likelihood that the object to be inspected is qualified and the likelihood that it is unqualified; identifying target image data on the basis of the evaluation data (S107); and outputting the target image data (S111) or information indicating the target image data. According to the present disclosure, the data for object detection is efficiently selected.

Inventors:
YANAGAWA YUKIKO (JP)
Application Number:
PCT/IB2018/051414
Publication Date:
September 12, 2019
Filing Date:
March 06, 2018
Assignee:
OMRON TATEISI ELECTRONICS CO (JP)
International Classes:
G06K9/62
Foreign References:
JP2011214903A 2011-10-27
Other References:
WEIGL EVA ET AL: "On improving performance of surface inspection systems by online active learning and flexible classifier updates", MACHINE VISION AND APPLICATIONS, SPRINGER VERLAG, DE, vol. 27, no. 1, 20 November 2015 (2015-11-20), pages 103 - 127, XP035857173, ISSN: 0932-8092, [retrieved on 20151120], DOI: 10.1007/S00138-015-0731-9
SHARMA MANALI ET AL: "Evidence-based uncertainty sampling for active learning", JOURNAL OF DATA MINING AND KNOWLEDGE DISCOVERY, NORWELL, MA, US, vol. 31, no. 1, 13 April 2016 (2016-04-13), pages 164 - 202, XP036133044, ISSN: 1384-5810, [retrieved on 20160413], DOI: 10.1007/S10618-016-0460-3
SUN JUN ET AL: "Further development of adaptable automated visual inspection-part II: implementation and evaluation", THE INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, SPRINGER, LONDON, vol. 81, no. 5, 16 May 2015 (2015-05-16), pages 1077 - 1096, XP035573630, ISSN: 0268-3768, [retrieved on 20150516], DOI: 10.1007/S00170-015-7214-Z
CHEN FENG ET AL: "Deep Active Learning for Civil Infrastructure Defect Detection and Classification", COMPUTING IN CIVIL ENGINEERING 2017, 22 June 2017 (2017-06-22), Reston, VA, pages 298 - 306, XP055440088, ISBN: 978-0-7844-8082-3, DOI: 10.1061/9780784480823.036
BURR SETTLES: "Active Learning Literature Survey", 26 January 2010 (2010-01-26), XP055219798, Retrieved from the Internet [retrieved on 20151009]
Attorney, Agent or Firm:
INABA, Yoshiyuki et al. (JP)
Claims:

1. A method for inspecting a quality of an object, comprising:

acquiring image data of multiple objects to be inspected;

for each piece of image data, inputting the image data into a classifier and acquiring an output of the classifier, the output being likelihood that the corresponding object to be inspected is qualified and unqualified;

generating evaluation data for each piece of image data on the basis of the likelihood, the evaluation data representing an absolute value of a difference between the likelihood that the object to be inspected is qualified and unqualified;

identifying target image data on the basis of the evaluation data, the target image data being the image data having the evaluation data that satisfies a predetermined condition; and

outputting the target image data or information indicating the target image data.

2. The method of claim 1, wherein identifying the target image data on the basis of the evaluation data includes:

allocating priorities to the multiple pieces of image data on the basis of the absolute values, wherein the image data corresponding to small absolute values has priorities higher than the image data corresponding to large absolute values.

3. A device for inspecting a quality of an object, comprising:

an input unit configured to acquire image data of multiple objects to be inspected;

a classifier configured to, for each piece of image data, receive the image data and output likelihood that the corresponding object to be inspected is qualified and unqualified;

a score evaluation unit configured to generate evaluation data for each piece of image data on the basis of the likelihood, the evaluation data representing an absolute value of a difference between the likelihood that the object to be inspected is qualified and unqualified;

an identification unit configured to identify target image data on the basis of the evaluation data, the target image data being the image data having the evaluation data that satisfies a predetermined condition; and an output unit configured to output the target image data or information indicating the target image data.

4. The device of claim 3, wherein the identification unit is further configured to allocate priorities to the multiple pieces of image data on the basis of the absolute values, wherein the image data corresponding to small absolute values has priorities higher than the image data corresponding to large absolute values.

5. A method for data classification, comprising:

acquiring multiple pieces of input data to be classified;

for each piece of input data to be classified, inputting the input data into a classifier and acquiring an output of the classifier, the output being likelihood for each of classification classes;

generating evaluation data based on the likelihood, the evaluation data representing a range of likelihood for the classification classes;

identifying target input data based on the evaluation data, the target input data being the input data having the evaluation data that satisfies a predetermined condition; and

outputting the target input data or information indicating the target input data.

6. The method of claim 5, wherein a number of the classification classes is two, the predetermined condition being that each likelihood is higher than zero and lower than a predetermined value, and the range of likelihood for the classification classes is in a predetermined interval.

7. The method of claim 5, wherein the predetermined condition being that each likelihood is equal to or lower than zero.

8. The method of claim 5, wherein the range of likelihood for the classification classes is the absolute value for the difference between the highest likelihood and the lowest likelihood, and the predetermined condition being that the absolute value is lower than a predetermined value.

9. The method of claim 8, wherein identifying the target input data based on the evaluation data further comprises:

allocating priorities to the target input data on the basis of the absolute values, wherein the target input data corresponding to small absolute values has priorities higher than the target input data corresponding to large absolute values.

10. The method of claim 9, wherein the information indicating the target image data includes an identification for identifying the input data and the priority allocated to the image data.

11. The method of claim 6 or 7, wherein identifying the target image data based on the evaluation data further comprises:

allocating priorities to the target input data on the basis of whether the predetermined condition is satisfied, wherein the target input data satisfying the predetermined condition has priorities higher than the other target input data.

12. The method of claim 11, wherein the information indicating the target image data includes an identification for identifying the input data and the priority allocated to the image data.

13. The method of claim 9, wherein outputting the target input data further comprises:

outputting each piece of target input data or identification information of each piece of target input data according to a sequence from high to low priorities.

14. The method of claim 13, further comprising:

determining the labels for the target input data according to the sequence.

15. A device for data classification, comprising:

an input unit configured to acquire multiple pieces of input data to be classified; a classifier configured, for each piece of input data to be classified, to receive input data and output likelihood for each of classification classes;

an evaluation unit configured to generate evaluation data based on the likelihood, the evaluation data representing a range of likelihood for the classification classes; an identification unit configured to identify target input data based on the evaluation data, the target input data being the input data having the evaluation data that satisfies a predetermined condition; and

an output unit configured to output the target input data or information indicating the target input data.

16. The device of claim 15, wherein the identification unit is configured to allocate priorities to the target input data on the basis of whether the predetermined condition is satisfied, wherein the target input data satisfying the predetermined condition has priorities higher than the other target input data.

17. The method of claim 16, wherein the information indicating the target image data includes an identification for identifying the input data and the priority allocated to the image data.

18. The device of claim 15, further comprising:

an evaluation condition storage unit that stores an evaluation condition configured to generate the evaluation data.

19. The device of claim 15, further comprises a determination unit which is configured to determine the corresponding labels for the target input data.

20. A program for data classification, the program comprising instructions which, when the program is executed in a computer, cause the computer to perform the method of any one of claims 1, 2, and 5-14.

21. A storage medium that stores a program for data classification, the program including instructions which, when the program is executed on a computer, cause the computer to perform the method of any one of claims 1, 2, and 5-14.

Description:
QUALITY INSPECTION AND DATA CLASSIFICATION

Technical Field

The present disclosure relates to a method and a device for inspecting a quality of an object, data classification, a program and a storage medium.

Background

It has been proposed that machine learning technology be utilized in inspecting the quality of products. For example, an inspection device described in JP2011-214903 includes a classifier, generated by a supervised learning method, for determining whether the inspection target object is defectless. When such an inspection device is operated in, for example, a production line after the initial learning has completed, the classifier may need to be updated in order to improve its accuracy or adapt it to the environment in which the inspection device is used. In this case, the user needs to perform additional learning on the classifier and prepare a training data set including a plurality of pairs of an input (i.e., an image of the inspection target object) and a label (in other words, the desired output corresponding to the input) indicating, for example, whether the inspection target object is defectless.

Assigning labels to a large number of input images costs the user time and expense, especially when the labels are generated manually. It is therefore desirable to extract a specific, manageable number of input images capable of effectively improving the performance of the inspection device, and to assign labels to the extracted input images to generate the training data for the additional learning. However, it is difficult for the user to identify inputs capable of improving the performance of the classifier. This issue may also occur in devices other than the above-described inspection device that include a classifier employing supervised learning.

Summary

The technical problem to be solved

The present disclosure is intended to solve at least part or all of the foregoing problems.

Means of solving the technical problem

In embodiments of the present disclosure, after data to be classified is acquired, the likelihood that the data to be classified belongs to each of multiple classification classes is determined, and the data to be classified that serves as candidates for training data to be used in additional learning is identified.

The embodiments of the present disclosure provide a method and a device for inspecting a quality of an object, data classification, a program and a storage medium, so as to at least solve the problems that, when there is a large amount of learning data, an operation of assigning labels to object data to be classified is cumbersome and time-consuming, and it is impossible to know the amount of learning data capable of effectively improving performance of a classification device.

According to an aspect of the embodiments of the present disclosure, a method for inspecting a quality of an object is provided, comprising: acquiring image data of multiple objects to be inspected; for each piece of image data, inputting the image data into a classifier and acquiring an output of the classifier, the output being likelihood that the corresponding object to be inspected is qualified and unqualified; generating evaluation data for each piece of image data on the basis of the likelihood, the evaluation data representing an absolute value of a difference between the likelihood that the object to be inspected is qualified and unqualified; identifying target image data on the basis of the evaluation data, the target image data being the image data having the evaluation data that satisfies a predetermined condition; and outputting the target image data or information indicating the target image data.

Therefore, when there is much image data of the objects to be classified, a required part of the image data configured to specify whether the objects are qualified or unqualified may be selected, and detection efficiency is improved.

In an exemplary implementation mode of the method for inspecting a quality of an object, identifying the target image data on the basis of the evaluation data includes: allocating priorities to the multiple pieces of image data on the basis of the absolute values, wherein the image data corresponding to small absolute values has priorities higher than the image data corresponding to large absolute values.

In such a manner, the required part of the data is preferentially provided for data classification.

According to another aspect of the embodiments of the present disclosure, a device for inspecting a quality of an object is provided, comprising: an input unit configured to acquire image data of multiple objects to be inspected; a classifier configured to, for each piece of image data, receive the image data and output likelihood that the corresponding object to be inspected is qualified and unqualified; a score evaluation unit configured to generate evaluation data for each piece of image data on the basis of the likelihood, the evaluation data representing an absolute value of a difference between the likelihood that the object to be inspected is qualified and unqualified; an identification unit configured to identify target image data on the basis of the evaluation data, the target image data being the image data having the evaluation data that satisfies a predetermined condition; and an output unit configured to output the target image data or information indicating the target image data.

Therefore, when there is much image data of the objects to be classified, a required part of the image data configured to specify whether the objects are qualified or unqualified may be selected, and detection efficiency is improved.

In an exemplary implementation mode of the device, the identification unit is further configured to allocate priorities to the multiple pieces of image data on the basis of the absolute values, wherein the image data corresponding to small absolute values has priorities higher than the image data corresponding to large absolute values.

In such a manner, the required part of the data is preferentially provided for data classification.

According to another aspect of the embodiments of the present disclosure, a method for data classification is provided, which includes acquiring multiple pieces of input data to be classified; for each piece of input data to be classified, inputting the input data into a classifier and acquiring an output of the classifier, the output being likelihood for each of classification classes; generating evaluation data based on the likelihood, the evaluation data representing a range of likelihood for the classification classes; identifying target input data based on the evaluation data, the target input data being the input data having the evaluation data that satisfies a predetermined condition; and outputting the target input data or information indicating the target input data.

In such a manner, the data to be classified which is required to be assigned with the labels may be extracted, so that classification efficiency is improved.

In a schematic implementation mode of the method for data classification, a number of the classification classes is two, the predetermined condition being that each likelihood is higher than zero and lower than a predetermined value, and the range of likelihood for the classification classes is in a predetermined interval.

In such a manner, the evaluation data used to select the data to be classified which is to be output and assigned labels is determined, and for data to be classified into two classification classes, the proper data is selected for output and label assignment.

In a schematic implementation mode of the method, the predetermined condition is that each likelihood is equal to or lower than zero.

In such a manner, the data difficult for a classifier to classify is determined.
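The two predetermined conditions described above can be sketched as simple predicates. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names, the `upper` bound and the `interval` tuple are hypothetical, and treating the interval as closed is an assumption.

```python
from typing import Sequence, Tuple


def likelihood_range(scores: Sequence[float]) -> float:
    """Range of the likelihood values: highest minus lowest."""
    return max(scores) - min(scores)


def weakly_positive(scores: Sequence[float], upper: float,
                    interval: Tuple[float, float]) -> bool:
    """Two-class condition: every likelihood lies strictly between zero and
    `upper`, and the range falls inside `interval` (closed, by assumption)."""
    low, high = interval
    return (len(scores) == 2
            and all(0.0 < s < upper for s in scores)
            and low <= likelihood_range(scores) <= high)


def all_non_positive(scores: Sequence[float]) -> bool:
    """Condition: every likelihood is equal to or lower than zero."""
    return all(s <= 0.0 for s in scores)
```

In both cases no class receives a clearly positive likelihood, which is consistent with the statement that such data is difficult for a classifier to classify.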

In a schematic implementation mode of the method for data classification, the range of likelihood for the classification classes is the absolute value for the difference between the highest likelihood and the lowest likelihood, and the predetermined condition is that the absolute value is lower than a predetermined value.

In such a manner, the evaluation data used to select the data to be classified which is to be output and assigned labels is determined, and for data to be classified into more than two classification classes, the proper data is selected for output and label assignment.
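For more than two classes, the evaluation described above might be computed as follows. This is a sketch only: the class names, the score values and the `threshold` parameter are hypothetical and do not come from the disclosure.

```python
from typing import Mapping


def evaluation_value(likelihoods: Mapping[str, float]) -> float:
    """Absolute difference between the highest and lowest likelihood."""
    values = list(likelihoods.values())
    return abs(max(values) - min(values))


def is_target(likelihoods: Mapping[str, float], threshold: float) -> bool:
    """True when the evaluation value is lower than the predetermined value."""
    return evaluation_value(likelihoods) < threshold


# Hypothetical three-class outputs for two inputs.
ambiguous = {"class_a": 0.4, "class_b": -0.1, "class_c": 0.2}
confident = {"class_a": 12.0, "class_b": -8.0, "class_c": -9.5}
```

With a threshold of 1.0, only the `ambiguous` input would be selected as target input data, because its likelihoods are bunched together.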

In a schematic implementation mode of the method for data classification, identifying the target input data based on the evaluation data further comprises: allocating priorities to the target input data on the basis of the absolute values, wherein the target input data corresponding to small absolute values has priorities higher than the target input data corresponding to large absolute values.

In such a manner, the data to be classified which is preferentially selected is provided for assignment of the labels, and the data classification efficiency is improved.

In a schematic implementation mode of the method for data classification, the information indicating the target image data includes an identification for identifying the input data and the priority allocated to the image data.

In a schematic implementation mode of the method for data classification, identifying the target image data based on the evaluation data further comprises: allocating priorities to the target input data on the basis of whether the predetermined condition is satisfied, wherein the target input data satisfying the predetermined condition has priorities higher than the other target input data.

In such a manner, the data to be classified which satisfies the conditions and may greatly improve the data classification efficiency is preferentially provided.

In a schematic implementation mode of the method for data classification, outputting the target input data further comprises: outputting each piece of target input data or identification information of each piece of target input data according to a sequence from high to low priorities.

In such a manner, the data to be classified which is most preferentially selected for classification is provided first.
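The priority allocation and ordered output described above can be illustrated with a short sketch. The identifications and absolute values are hypothetical, and representing priority as a rank starting at 1 is an assumption made for the example.

```python
from typing import Dict, List, Tuple


def prioritize(evaluations: Dict[str, float]) -> List[Tuple[int, str, float]]:
    """Return (priority, identification, absolute value) tuples, with
    priority 1 (highest) given to the smallest absolute value."""
    ordered = sorted(evaluations.items(), key=lambda item: item[1])
    return [(rank, data_id, value)
            for rank, (data_id, value) in enumerate(ordered, start=1)]


# Hypothetical identifications and absolute values.
queue = prioritize({"x1": 4.0, "x2": 0.3, "x3": 1.5})
for priority, data_id, value in queue:
    print(priority, data_id, value)
```

Each output entry pairs an identification with its allocated priority, so a person assigning labels can work through the list from the top.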

In a schematic implementation mode, the method for data classification further comprises determining the labels for the target input data according to the sequence.

In such a manner, the labels are determined first for the data to be preferentially classified.

According to another aspect of the embodiments of the present disclosure, a device for data classification is further provided, which includes: an input unit configured to acquire multiple pieces of input data to be classified; a classifier configured, for each piece of input data to be classified, to receive input data and output likelihood for each of classification classes; an evaluation unit configured to generate evaluation data based on the likelihood, the evaluation data representing a range of likelihood for the classification classes; an identification unit configured to identify target input data based on the evaluation data, the target input data being the input data having the evaluation data that satisfies a predetermined condition; and an output unit configured to output the target input data or information indicating the target input data.

In such a manner, the device capable of extracting the data to be classified which is required to be assigned with the labels is provided, so that classification efficiency is improved.

In a schematic implementation mode of the device for data classification, the identification unit is configured to allocate priorities to the target input data on the basis of whether the predetermined condition is satisfied, wherein the target input data satisfying the predetermined condition has priorities higher than the other target input data.

In a schematic implementation mode of the device for data classification, the information indicating the target image data includes an identification for identifying the input data and the priority allocated to the image data.

In a schematic implementation mode of the device for data classification, the device for data classification further includes: an evaluation condition storage unit that stores an evaluation condition configured to generate the evaluation data.

In such a manner, information configured to select the data to be classified is stored for use during output.

In a schematic implementation mode, the device for data classification further comprises a determination unit which is configured to determine the corresponding labels for the target input data.

In such a manner, a classification result of the data to be classified is determined.

According to another aspect of the embodiments of the present disclosure, a program for data classification is further provided, the program comprising instructions which, when the program is executed in a computer, cause the computer to perform any foregoing method.

According to another aspect of the embodiments of the present disclosure, a storage medium that stores a program for data classification is further provided, the program including instructions which, when the program is executed on a computer, cause the computer to perform any foregoing method.

The system, the program and the storage medium may achieve the same effect as the foregoing method.

Technical effect

In the embodiments of the present disclosure, after data to be classified is acquired, the likelihood that the data to be classified belongs to each of the multiple classification classes is determined, and the data to be classified that serves as candidates for training data to be used in additional learning is identified. With the candidates for the training data, classification performance may be effectively improved by additional learning, and the time for preparing labels is reduced.

Brief Description of the Drawings

The drawings described herein are used to provide a further understanding of the present disclosure and constitute a part of the present application. The schematic embodiments of the present disclosure and the descriptions thereof are used to explain the present disclosure, and do not constitute improper limitations to the present disclosure. In the drawings:

Fig. 1 is a flowchart of a method for inspecting a quality of an object, according to an embodiment of the present disclosure;

Fig. 2 is a schematic diagram of output data in a method for inspecting a quality of an object, according to an embodiment of the present disclosure;

Fig. 3 is a flowchart of a method for data classification according to an embodiment of the present disclosure;

Fig. 4 is a schematic diagram of a hardware structure of a classification system 100 according to an implementation mode of the present disclosure; and

Fig. 5 is a block diagram of a device for data classification.

Detailed Description of the Embodiments

In order to enable those skilled in the art to better understand the solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are clearly and completely described below in combination with the accompanying drawings in the embodiments of the present disclosure. The described embodiments are merely a part of the embodiments of the present disclosure, rather than all of the embodiments. All other embodiments obtained by those skilled in the art based on the embodiments in the present disclosure, without creative efforts, shall fall within the protection scope of the present disclosure.

It is important to note that the terms “first”, “second” and the like in the specification, claims and drawings of the present disclosure are adopted not to describe a specific sequence or order but to distinguish similar objects. It should be understood that data used in this way may be exchanged under proper conditions, so that the embodiments of the present disclosure described herein can be implemented in a sequence other than those shown or described herein. In addition, the terms “include” and “have” and any variations thereof are intended to cover nonexclusive inclusions. For example, a process, method, system, product or equipment including a series of steps or modules or units is not limited to those steps or modules or units which are clearly listed, but may include other steps or modules or units which are not clearly listed or are intrinsic to the process, the method, the product or the equipment.

In the present disclosure, for a learning network which has finished the initial learning, when additional learning is performed on site, all data acquired on site is input, and labels are assigned to the data to be classified starting from the data whose output results are close to a border. In one embodiment, a label represents the desired value (for example, a class) from a classifier or an identifier corresponding to an input value (for example, an image). Data whose results are close to the border is data that is difficult for the learning network to judge; by assigning labels to such data and using it for learning, a more stable learning network may be established.

According to an embodiment of the present disclosure, a method for inspecting a quality of an object, such as a product, is provided. Fig. 1 is a flowchart of a method for inspecting a quality of an object, according to an embodiment of the present disclosure. As shown in Fig. 1, the method for inspecting a quality of an object includes the following steps. In S101, image data of multiple objects to be classified is acquired. In S103, each piece of image data is input into a classifier and an output of the classifier is acquired, the output being the likelihood that the corresponding object to be classified is qualified (i.e., defectless) and the likelihood that it is unqualified (i.e., defective). In S105, evaluation data is generated for each piece of image data on the basis of scores based on the likelihood, the evaluation data representing an absolute value of a difference between the likelihood that the object to be classified is qualified and the likelihood that it is unqualified. In S107, target image data is identified on the basis of the evaluation data, the target image data being the image data having the evaluation data that satisfies a predetermined condition. In S109, priorities are allocated to the multiple pieces of image data on the basis of the absolute values. In S111, the target image data is output.
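Steps S101 to S111 can be sketched end to end as follows. This is a minimal illustration under stated assumptions rather than the disclosed implementation: the `toy_classifier` stands in for a trained classifier, the image bytes are placeholders, and using "evaluation value below a threshold" as the predetermined condition is one possible choice.

```python
from typing import Callable, Dict, List, Tuple

# A classifier maps image data to (qualified score, unqualified score).
Classifier = Callable[[bytes], Tuple[float, float]]


def select_targets(images: Dict[str, bytes], classifier: Classifier,
                   threshold: float) -> List[Tuple[str, float]]:
    """Return (image ID, evaluation value) pairs, hardest images first."""
    evaluations = {}
    for image_id, image in images.items():           # S101: acquired images
        qualified, unqualified = classifier(image)   # S103: classifier output
        evaluations[image_id] = abs(qualified - unqualified)  # S105
    # S107: keep images whose evaluation value is below the threshold
    # (one possible "predetermined condition", assumed here).
    targets = [(i, v) for i, v in evaluations.items() if v < threshold]
    # S109: smaller absolute values receive higher priority.
    targets.sort(key=lambda pair: pair[1])
    return targets                                    # S111: output


def toy_classifier(image: bytes) -> Tuple[float, float]:
    """Stand-in for a trained classifier: score derived from the first byte."""
    score = float(image[0]) - 100.0
    return score, -score


images = {"img1": bytes([150]), "img2": bytes([101]), "img3": bytes([98])}
result = select_targets(images, toy_classifier, threshold=10.0)
```

Here `img1` is scored confidently and is dropped, while `img2` and `img3` are returned in order of increasing absolute difference.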

First, when additional learning is performed on site, data of the multiple objects to be classified is acquired on site; the data may be images of the objects captured by a camera. An image is acquired for each object to be classified, and these images are used to determine whether the corresponding objects are qualified or unqualified. For each piece of acquired image data, the likelihood that the corresponding object is qualified and the likelihood that it is unqualified are judged. For an image that is easy to classify, the difference between the numerical values of the two likelihoods is relatively large, so that the image can be rapidly classified by a classification device. However, a large amount of data captured on site for additional learning may include much image data that is difficult to classify. For such data, the difference between the numerical values of the two likelihoods may be relatively small, and it is difficult for the classification device to judge whether the object is qualified or unqualified. The method of the embodiment of the present disclosure therefore first determines the data that is difficult for the classification device to classify.

After the likelihoods that the objects to be classified are qualified and unqualified are determined, scores corresponding to the likelihoods are generated to represent their magnitudes. For example, high scores may be generated for high likelihood and, conversely, low scores for low likelihood. It should be understood that the likelihood may take positive or negative values.

For example, likelihood is generated for each class, indicating the degree or possibility that the input data belongs to the class. The value of the likelihood can be in the range from -∞ to +∞; if the value is positive, a higher absolute value indicates a higher possibility that the input data belongs to the class. The scores corresponding to the likelihood are likewise negative or positive values. In the embodiment of the present disclosure, the evaluation data is generated for each piece of image data on the basis of the scores, and is used to determine the specific images for which it is to be judged whether the objects to be classified are qualified or unqualified.

For example, when classifying whether an object is qualified, the classes are “qualified object” and “unqualified object”. For each object, a likelihood, or a score for the likelihood, is generated for each class. An example is provided in the following table.

Table 1

As to object 1 , the score for the object being qualified is +90, and the score for the object being unqualified is -90. It is determined that object 1 is classified as “qualified”. As to object 2, it is determined that it is classified as“unqualified”.
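The class decision in this example amounts to choosing the class with the higher score, as the following sketch shows. The scores for object 1 are those stated in the text; the scores for object 2 are not given in the text and are hypothetical values chosen only so that the result matches the stated classification.

```python
def classify(scores: dict) -> str:
    """Return the class whose score is highest."""
    return max(scores, key=scores.get)


# Object 1 uses the scores stated in the text.
object_1 = {"qualified": 90.0, "unqualified": -90.0}
# Object 2's scores are not given in the text; these values are hypothetical.
object_2 = {"qualified": -70.0, "unqualified": 70.0}

print(classify(object_1))  # qualified
print(classify(object_2))  # unqualified
```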

As mentioned above, the degree of difficulty of the qualified/unqualified judgment differs between images. The image data may therefore be selected for this judgment according to the scores as required, and the image data is output on the basis of the evaluation data so that whether the objects to be classified are qualified or unqualified can be detected and a classification result determined.

Fig. 2 is a schematic diagram of output data in a method for inspecting a quality of an object, according to an embodiment of the present disclosure. As shown in Fig. 2, multiple pieces of image data for classification are acquired, and an image identifier (ID) is assigned to each piece of image data to distinguish it from the others. For each piece of image data, scores are generated for the likelihood that the corresponding object is a qualified product and for the likelihood that it is an unqualified product. For example, for the image data A012, the score corresponding to the qualified product is -0.5, and the score corresponding to the unqualified product is +0.5. For the image data A105, the score corresponding to the qualified product is +15.0, and the score corresponding to the unqualified product is -20.0. For the image data A012, the judgment result of the classification device is close to 0, and it is difficult for the classification device to confirm whether the object corresponding to the image data is a qualified product or an unqualified product. For the image data A105, the classification device obtains scores with relatively large absolute values, which indicates that, compared with the image data A012, the classification device classifies the image data A105 with relatively high accuracy. That is, which image data is easy to classify and which is difficult to classify may be judged according to the scores of each piece of image data. For the image data that is difficult to classify, a classification result may be determined first; that is, whether the object corresponding to the image data is a qualified product or an unqualified product is determined for additional learning of the classification device.

Therefore, when there is much image data of the objects to be classified, a required part of the image data configured to specify whether the objects are qualified or unqualified may be selected, and detection efficiency is improved.

In an exemplary implementation mode of the method for inspecting a quality of an object, for each piece of image data, an absolute value of a difference between the score of the likelihood that the object to be classified is qualified and the score of the likelihood that it is unqualified is calculated, the absolute value being used as evaluation data. As shown in Fig. 2, this score difference may be calculated for each piece of image data. For example, for the image data A012, the score difference is:

| (-0.5) - (+0.5) |=1.0.

For the image data A105, the score difference is:

| (+15.0) - (-20.0) |=35.0.

The score difference is calculated for each piece of the acquired image data. From the calculation results it can be seen that, for image data with small score differences, it is difficult for the classification device to confirm whether the corresponding objects to be classified are qualified products or unqualified products, whereas for image data with large score differences, it is easy for the classification device to determine classification results. The absolute values of the differences between the scores of the likelihood that the objects to be classified are qualified and unqualified, that is, the score differences, may be used as the evaluation data, and the degree of difficulty of classification may be confirmed according to the evaluation data. For the image data whose classification result is difficult to determine, the classification result may be determined first.
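The calculation of the evaluation data can be sketched as follows, using the scores given for the image data A012 and A105 in Fig. 2. The function name `evaluation_value` and the tuple layout (qualified score, unqualified score) are assumptions made for illustration only.

```python
def evaluation_value(score_qualified, score_unqualified):
    """Absolute value of the difference between the two scores."""
    return abs(score_qualified - score_unqualified)


# (qualified score, unqualified score) per image ID, as given for Fig. 2.
scores = {
    "A012": (-0.5, 0.5),
    "A105": (15.0, -20.0),
}

evaluation = {
    image_id: evaluation_value(qualified, unqualified)
    for image_id, (qualified, unqualified) in scores.items()
}
print(evaluation)  # {'A012': 1.0, 'A105': 35.0}
```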

Therefore, a standard for selecting a part of the image data is set to determine the required image data for detection.

As shown in Fig. 1, in an exemplary implementation mode of the method for inspecting a quality of an object, the target image data or the image data to be assigned with the labels is output on the basis of the evaluation data. In S109, priorities are allocated to the multiple pieces of image data on the basis of the absolute values, wherein the image data corresponding to small absolute values has priorities higher than the image data corresponding to large absolute values. As shown in Fig. 2, the score difference corresponding to the image data A012 is 1.0, the score difference corresponding to the image data B056 is 5.0, and the score difference corresponding to the image data A105 is 35.0. The priority of the image data A012 is therefore the highest, the priority of the image data B056 is lower than that of the image data A012, and the priority of the image data A105 is lower than those of the image data A012 and B056; in general, the priority decreases as the score difference increases. When being output, the image data may be output on the basis of the priorities. In such a manner, the target image data recommended for use in additional learning is preferentially provided to the user.
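The priority allocation of S109 can be sketched as a sort by ascending score difference, using the three score differences given for Fig. 2. Representing the priority as a rank, with 1 as the highest priority, is an assumption; the text only specifies the relative order.

```python
# Score differences (evaluation data) per image ID, as given for Fig. 2.
score_differences = {"A105": 35.0, "A012": 1.0, "B056": 5.0}

# Smaller score difference means higher priority, so sort in ascending order.
ranked_ids = sorted(score_differences, key=score_differences.get)

# Assumption: priority is expressed as a rank, 1 being the highest.
priorities = {image_id: rank for rank, image_id in enumerate(ranked_ids, start=1)}

print(ranked_ids)   # ['A012', 'B056', 'A105']
print(priorities)   # {'A012': 1, 'B056': 2, 'A105': 3}
```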

According to an aspect of the embodiments of the present disclosure, a method for data classification is provided. As shown in Fig. 3, the method for data classification includes the following steps. In S301, multiple pieces of data to be classified are acquired. In S303, each piece of data to be classified is input into a classifier as input data, and scores of the likelihood of correspondence to each classification class in multiple classification classes are output by the classifier. In S305, evaluation data is generated based on the likelihood; the evaluation data may represent a range of likelihood for the classification classes. In particular, the evaluation data may be generated on the basis of each score, the evaluation data being used to determine which pieces of data to be classified are to be assigned with labels, each label representing that the corresponding data to be classified corresponds to a classification class in the multiple classification classes. In S307, the data to be classified to be assigned with the labels, or information indicating the target input data, is output on the basis of the evaluation data.

During additional learning, whether an object is a qualified product or an unqualified product is judged, but classification of another type may also be performed. For example, in the method according to the embodiments of the present disclosure, the data to be classified may be classified as corresponding to a classification class in multiple classification classes. In the method, the multiple pieces of data to be classified are acquired, and for each classification class, the likelihood that the data to be classified corresponds to that classification class is determined. In the same way as the scores of the likelihood described above, scores are determined for the likelihood that the data to be classified corresponds to each classification class. If the likelihood that a certain piece of data to be classified corresponds to a classification class is high, the score of the likelihood is high, and if the likelihood is low, the score is low. That is, the scores of the likelihood represent the magnitudes of the likelihood that the data to be classified corresponds to the classification classes. After the scores of correspondence between the data to be classified and the multiple classification classes are determined, the evaluation data may be generated according to the scores; that is, the specific data to be classified to be assigned with the labels is determined according to the scores.
The data to be classified may in practice correspond to any classification class in the multiple classification classes, and for each classification class the data to be classified has a score representing the likelihood that it corresponds to that class. Assigning a label to the data to be classified therefore means determining the classification class to which the data actually corresponds. The data to be classified to be assigned with the labels is determined on the basis of the evaluation data. For data whose classification class is difficult to determine, the labels may be assigned first, or the data to be classified to be assigned with the labels may be selected according to the evaluation data.

In such a manner, the data to be classified which is required to be assigned with the labels may be extracted, so that classification efficiency is improved.

In one embodiment, the evaluation data represents a range of likelihood for the classification classes. That is, for likelihood S1 and S2, the evaluation data can represent a range from S1 to S2. The target input data can be identified on the basis of whether the evaluation data satisfies a predetermined condition, for example, as shown below.

In a schematic implementation mode of the method for data classification, the number of the classification classes is two, and the operation that the evaluation data is generated on the basis of each score includes that: for each piece of data to be classified, an absolute value of a difference between the two scores is calculated, the absolute value being used as evaluation data. For example, for each piece of data to be classified, it is necessary to judge whether it corresponds to the first classification class or the second classification class. A first likelihood that the data to be classified corresponds to the first classification class and a second likelihood that the data to be classified corresponds to the second classification class are determined; a first score corresponds to the first likelihood, and a second score corresponds to the second likelihood. An absolute value of a difference between the first score and the second score is calculated, and the absolute value is used as evaluation data to determine the target input data. If the absolute value is small, the first score is close to the second score, and it is difficult to determine which of the first classification class and the second classification class the data to be classified actually corresponds to. If the absolute value is large, this determination is easier than for a small absolute value. Therefore, the evaluation data indicates the degree of difficulty of determining the classification class to which the data to be classified corresponds.

In such a manner, the evaluation data configured to select the target input data which is to be output as candidate training data to be assigned with the labels is determined, and for the data to be classified of the two classification classes, the proper target input data is selected.

In a schematic implementation mode of the method for data classification, the operation that the target input data is output on the basis of the evaluation data includes that: priorities are allocated to the data to be classified on the basis of the absolute values, wherein the data to be classified corresponding to small absolute values has priorities higher than the data to be classified corresponding to large absolute values. For example, in the embodiment, if the absolute value of the difference between the first score (correspondence to the first classification class) and the second score (correspondence to the second classification class) of first data to be classified is lower than the corresponding absolute value for second data to be classified, a higher priority is allocated to the first data to be classified than to the second data to be classified. A smaller absolute value of the difference between the first score and the second score indicates that it is less certain which of the first classification class and the second classification class the data to be classified corresponds to; the classification result is close to the border, and it is difficult to implement accurate classification. Therefore, for such data to be classified, a label is assigned first for additional learning.

In such a manner, the target input data is preferably selected and provided to the user as candidate training data for efficiently improving classification performance of a classifier.

In a schematic implementation mode of the method for data classification, the number of the classification classes is more than two, the predetermined condition being that each likelihood is higher than zero and lower than a predetermined value, and that the range of likelihood for the classification classes is in a predetermined interval. If each likelihood is higher than zero and lower than a predetermined value, there is only a low likelihood that the object belongs to any class. If the range is narrow, the likelihood values for the classes are close to one another, and it is difficult for a classifier to correctly classify the object.

In another example, each likelihood is equal to or lower than zero. That is, the object is classified as not belonging to any class, which likewise makes it difficult for a classifier to correctly classify the object.
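The two conditions described for more than two classification classes can be sketched as a single predicate. This is an illustration under assumptions: the "range" is taken to be the spread between the largest and smallest likelihood, "in a predetermined interval" is read as that spread not exceeding a limit, and the parameter names `upper_bound` and `max_range` are hypothetical. Combining both conditions in one function is also a choice made here, not something the text prescribes.

```python
def is_hard_to_classify(likelihoods, upper_bound, max_range):
    """Return True if the likelihoods satisfy one of the described conditions.

    likelihoods: mapping of class name -> likelihood value.
    upper_bound: hypothetical "predetermined value" each likelihood must stay below.
    max_range:   hypothetical limit on the spread (max - min) of the likelihoods.
    """
    values = list(likelihoods.values())

    # Other example in the text: every likelihood is zero or lower,
    # i.e. the object appears to belong to no class.
    if all(value <= 0 for value in values):
        return True

    # Main condition: every likelihood is positive but low,
    # and the likelihoods lie close to one another.
    all_low_positive = all(0 < value < upper_bound for value in values)
    narrow_range = (max(values) - min(values)) <= max_range
    return all_low_positive and narrow_range


close_scores = {"class_a": 0.2, "class_b": 0.3, "class_c": 0.25}
clear_winner = {"class_a": 5.0, "class_b": 0.1, "class_c": 0.2}
no_class = {"class_a": -1.0, "class_b": 0.0, "class_c": -3.0}

print(is_hard_to_classify(close_scores, upper_bound=1.0, max_range=0.2))  # True
print(is_hard_to_classify(clear_winner, upper_bound=1.0, max_range=0.2))  # False
print(is_hard_to_classify(no_class, upper_bound=1.0, max_range=0.2))      # True
```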

In such a manner, the evaluation data configured to select the target input data which is to be output as candidate training data to be assigned with the labels is determined, and for the data to be classified of the more than two classification classes, the proper target input data is selected.

In a schematic implementation mode of the method for data classification, the operation that the target input data is output on the basis of the evaluation data includes that: priorities are allocated to the data to be classified on the basis of the evaluation data, wherein the data to be classified satisfying the conditions has priorities higher than the other data to be classified. In the foregoing embodiment, for the data to be classified of more than two classification classes, the proper target input data is selected for output. The data to be classified satisfying the conditions in the foregoing embodiment is assigned higher priorities than the data to be classified not satisfying the conditions.

In such a manner, the data to be classified which satisfies the conditions and may greatly improve the data classification performance is preferably provided.

In a schematic implementation mode of the method for data classification, the operation that the target input data is output on the basis of the evaluation data further includes that: each piece of data to be classified or identification information of each piece of data to be classified is output according to a sequence from high to low priorities. The data to be classified with high priorities is output at first, or the identification information of the data to be classified with high priorities is output, and then the data to be classified with low priorities is output. For a large amount of data to be classified, it is preferable that a part of the data to be classified may be selectively subjected to label assignment and additional learning, for example, the data to be classified difficult to classify or the data to be classified satisfying the conditions is preferably adopted.

In such a manner, the data to be classified which is most preferably selected for the target input data is provided at first.

In a schematic implementation mode of the method for data classification, the method for data classification further includes that: the labels are determined for the target input data according to a sequence. The labels are assigned for high priorities at first, and then the data to be classified with low priorities is processed. Therefore, the target input data which is assigned with the labels at first may be selected, and when the target input data is subjected to additional learning after being assigned with the labels at first, a learning process may be more efficient.

In such a manner, the labels are determined for the data which may improve classification performance of a classifier at first.

In a schematic implementation mode of the method for data classification, the operation that the target input data is output on the basis of the evaluation data includes that: the data to be classified of which the absolute values are lower than a predetermined value is output. In addition, the method according to the implementation mode of the present disclosure may also adopt another manner to determine the data to be classified to be adopted. For example, after multiple scores of correspondence between the data to be classified and the multiple classification classes are determined, the absolute values of the differences between the multiple scores are calculated. If the absolute values are smaller, it is indicated that the classification results of the data to be classified are closer to the border and it is more difficult to determine the classification classes the data to be classified corresponds to. Therefore, a threshold value may be set for the absolute values, and if the absolute values are lower than the threshold value, it is indicated that classification difficulties are relatively great, and the corresponding data to be classified should be selected for processing.
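The threshold-based selection can be sketched as a simple filter over the evaluation data. The score differences are those given for Fig. 2; the threshold of 10.0 and the function name `select_below_threshold` are hypothetical and chosen only for illustration.

```python
def select_below_threshold(evaluation, threshold):
    """Return the IDs whose absolute score difference is lower than the threshold."""
    return [data_id for data_id, value in evaluation.items() if value < threshold]


# Absolute score differences per data ID, as given for Fig. 2.
evaluation = {"A012": 1.0, "B056": 5.0, "A105": 35.0}

# Hypothetical threshold; the text only says "a predetermined value".
selected = select_below_threshold(evaluation, threshold=10.0)
print(selected)  # ['A012', 'B056']
```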

In such a manner, all the data to be classified which meets a classification requirement is provided.

In a schematic implementation mode of the method for data classification, the method for data classification further includes that: the labels are determined for the output data to be classified. In the foregoing embodiment, the selected data to be classified is output, and the labels are determined for the data to be used as training data for additional learning.

In such a manner, a label of the data to be classified is determined.

Next, a hardware structure of a classification system 100 for data classification according to an implementation mode of the present disclosure is described.

Fig. 4 is a schematic diagram of a hardware structure of a classification system 100 according to an implementation mode of the present disclosure. As shown in Fig. 4, for example, the classification system 100 may be implemented by a general-purpose computer of a universal computer architecture. The classification system 100 may include a processor 110, a main memory 112, a memory 114, an input interface 116, a display interface 118 and a communication interface 120. These parts may, for example, communicate with one another through an internal bus 122.

The processor 110 loads a program stored in the memory 114 into the main memory 112 and executes it, thereby realizing the functions and processing described hereinafter. The main memory 112 may be structured to be a nonvolatile memory, and serves as a working memory required for program execution by the processor 110.

The input interface 116 may be connected with an input unit such as a mouse and a keyboard, and receives an instruction input by an operator operating the input unit.

The display interface 118 may be connected with a display, and may output various processing results generated by program execution of the processor 110 to the display.

The communication interface 120 is configured to communicate with a Programmable Logic Controller (PLC), a database device and the like through a network 200.

The memory 114 may store programs that cause a computer to function as the classification system 100, for example, a moving body classification program and an Operating System (OS).

The program stored in the memory 114 for data classification may be installed in the classification system 100 through an optical recording medium such as a Digital Versatile Disc (DVD) or a semiconductor recording medium such as a Universal Serial Bus (USB) memory. Or, the program for data classification may also be downloaded from a server device and the like on the network.

The program for data classification according to the implementation mode may also be provided in a manner of combination with another program. Under such a condition, the program for data classification does not include a module included in the other program of such a combination, but cooperates with the other program for processing. Therefore, the program for data classification according to the implementation mode may also be in a form of combination with the other program.

According to another embodiment of the present disclosure, a device for data classification is provided.

Fig. 5 is a block diagram of a device for data classification. As shown in Fig. 5, the device 500 for data classification includes: an input unit 501, configured to acquire multiple pieces of data to be classified; a classifier 503, constructed by supervised learning based on training data, the training data including data to be classified for training and labels, each label representing that the corresponding data to be classified for training corresponds to a classification class in multiple classification classes; a score evaluation unit 505, configured to, for each piece of data to be classified, generate evaluation data based on the likelihood, the evaluation data representing a range of likelihood for the classification classes; an identification unit 507, configured to identify target input data based on the evaluation data, the target input data being the input data having the evaluation data that satisfies a predetermined condition; and an output unit 509, configured to output the data to be classified to be assigned with the labels, or information indicating the target input data, on the basis of the evaluation data.

The device 500 for data classification according to the embodiment of the present disclosure executes the foregoing method for data classification. The input unit 501 receives data, the data is, for example, data to be classified, and the data to be classified may be acquired on the spot by capturing with a camera. After receiving the data, the input unit 501 transmits the data to the classifier 503, which is, for example, a classification device, and is constructed by training data in advance, and the training data includes the data to be classified and the corresponding labels.

The classifier 503 may output the likelihood of correspondence between the data to be classified and each classification class in the multiple classification classes and obtain the scores corresponding to the likelihood. The classifier 503 transmits the scores to the score evaluation unit 505.

The score evaluation unit 505 performs the processing of the foregoing implementation mode on the basis of the received scores and, for example, may calculate, as evaluation data, absolute values of differences between multiple scores or data representing a range of likelihood for the classification classes. The evaluation data is received by the identification unit 507, which identifies data to be classified based on the evaluation data. The identification unit 507 may assign high priorities to the data to be classified with low absolute values, or may determine all the data to be classified of which the absolute values are lower than a predetermined value, or may judge whether each score corresponding to the multiple pieces of data to be classified satisfies a predetermined condition and assign high priorities to the data to be classified which satisfies the condition. The processing result (for example, an arrangement of the priorities of the multiple pieces of data to be classified, or a directory generated from the IDs of the data to be classified and the corresponding priorities, the directory further including the score of correspondence between each piece of data to be classified and each classification class) is sent to the output unit 509.
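The directory described above can be sketched as a list of entries holding the ID, the per-class scores, the evaluation data and the priority. This is a minimal illustration for the two-class case using the scores given for A012 and A105 in Fig. 2; the function name `build_directory`, the dictionary keys, and the use of rank 1 as the highest priority are assumptions, since the text does not fix a data layout.

```python
def build_directory(scores):
    """Build a priority-ordered directory from two-class scores.

    scores: mapping of data ID -> mapping of class name -> score.
    Assumes exactly two classes per entry.
    """
    entries = []
    for data_id, class_scores in scores.items():
        first, second = class_scores.values()  # two-class case only
        entries.append({
            "id": data_id,
            "scores": dict(class_scores),
            "evaluation": abs(first - second),
        })

    # Low absolute value means high priority; rank 1 is the highest.
    entries.sort(key=lambda entry: entry["evaluation"])
    for priority, entry in enumerate(entries, start=1):
        entry["priority"] = priority
    return entries


directory = build_directory({
    "A105": {"qualified": 15.0, "unqualified": -20.0},
    "A012": {"qualified": -0.5, "unqualified": 0.5},
})
for entry in directory:
    print(entry["priority"], entry["id"], entry["evaluation"])
```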

The output unit 509 outputs the directory, or outputs the data to be classified determined by the identification unit 507 for assignment with the labels, or outputs the information indicating the target image data, which includes an identification for identifying the input data and the priority allocated to the image data. In one example, the device may further include a determination unit which is configured to determine the corresponding labels for the data.

In such a manner, the device capable of extracting the data to be classified which is required to be assigned with the labels is provided, so that classification efficiency is improved.

As shown in Fig. 5, in a schematic implementation mode of the device for data classification, the device 500 for data classification further includes: an evaluation condition storage unit 511, storing an evaluation condition configured to generate the evaluation data. After obtaining the processing result on the basis of the scores, for example, after obtaining the generated directory, the score evaluation unit 507 may send the processing result to the evaluation condition storage unit 511 for calling according to a requirement in a subsequent processing process. When the processing result is required to be output, data stored before may be acquired from the evaluation condition storage unit 511, and it is sent to the output unit 509 for output.

In such a manner, information configured to select the data to be classified is stored for use during output.

In a schematic implementation mode of the device for data classification, the classifier 503 may determine the corresponding labels according to the acquired data to be classified. The classifier 503 is constructed according to the training data, and may determine classification results of correspondence to the classification classes for the data to be classified. After the data to be classified output by the output unit 509 is assigned with the labels, the data to be classified and the labels may be input into the classifier 503 again to train the classifier 503 and improve its classification capability. For example, the classifier 503 may have a classification mode and a learning mode. In the classification mode, the classifier 503 determines the likelihood of correspondence to the classification classes for the received data to be classified, and determines the classification results. In the learning mode, the classifier 503 is trained according to the received data to be classified and the corresponding labels.
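The two modes of the classifier 503 can be sketched with a stand-in model. The text does not specify the classifier's internal algorithm, so a simple perceptron is swapped in here purely for illustration; the class name `TwoModeClassifier`, the feature vectors, the learning rate and the symmetric score layout are all assumptions.

```python
class TwoModeClassifier:
    """Stand-in classifier with a classification mode and a learning mode.

    A perceptron is used only as an example; it is not the classifier
    described in the text.
    """

    def __init__(self, n_features):
        self.weights = [0.0] * n_features
        self.bias = 0.0

    def classify(self, features):
        """Classification mode: return (label, scores) for one feature vector."""
        margin = sum(w * x for w, x in zip(self.weights, features)) + self.bias
        # Assumption: the two class scores are symmetric around zero.
        scores = {"qualified": margin, "unqualified": -margin}
        label = "qualified" if margin >= 0 else "unqualified"
        return label, scores

    def learn(self, features, label, learning_rate=0.1):
        """Learning mode: update the model from one labelled example."""
        predicted, _ = self.classify(features)
        if predicted != label:
            target = 1.0 if label == "qualified" else -1.0
            self.weights = [
                w + learning_rate * target * x
                for w, x in zip(self.weights, features)
            ]
            self.bias += learning_rate * target


classifier = TwoModeClassifier(n_features=2)
sample = [1.0, -1.0]  # hypothetical feature vector of a hard-to-classify item

label_before, _ = classifier.classify(sample)   # untrained model, margin is 0
classifier.learn(sample, "unqualified")         # label assigned by the user
label_after, scores_after = classifier.classify(sample)
print(label_before, label_after)
```

In this sketch the sample sits exactly on the decision border before training, which corresponds to the kind of data the method selects for labelling first.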

In such a manner, a classification result of the data to be classified is determined.

According to another aspect of the embodiments of the present disclosure, a program for data classification is further provided, which, when being executed, executes any foregoing method.

According to another aspect of the embodiments of the present disclosure, a storage medium is further provided, on which a program is stored, the program, when being executed, executing any foregoing method.

The program for data classification and storage medium according to the embodiments of the present disclosure refer to the contents mentioned above, and their specific implementation mode will not be elaborated herein. In the embodiments of the present disclosure, different emphases are laid to descriptions about each embodiment, and parts which are not elaborated in a certain embodiment may refer to the related descriptions in the other embodiments.

In the several embodiments provided in the present disclosure, it should be understood that the disclosed technical content may be implemented in other manners. The device embodiment described above is merely schematic. For example, the unit or module division may be a logic function division, and other division manners may be adopted during practical implementation. For example, a plurality of units or modules or components may be combined or may be integrated into another system, or some features may be ignored or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, units or modules, and may be in electrical or other forms.

The units or modules described as separate components may or may not be physically separated. The components displayed as units or modules may or may not be physical units or modules, that is, may be located in one place or may be distributed on multiple units or modules. Some or all of the units or modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, each of the functional units or modules in the embodiments of the present disclosure may be integrated in one processing unit or module, or each of the units or modules may exist physically and independently, or two or more units or modules may be integrated in one unit or module. The above-mentioned integrated unit or module may be implemented in form of hardware, and may also be implemented in form of software functional unit or module.

If being implemented in the form of a software functional unit and sold or used as an independent product, the integrated unit may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the present disclosure essentially, or the part contributing to the prior art, or all or part of the technical solutions may be implemented in the form of a software product, and the computer software product is stored in a storage medium, including several instructions for causing a piece of computer equipment (such as a personal computer, a server or network equipment) to execute all or part of the steps of the method according to the embodiments of the present disclosure. The foregoing storage medium includes: various media capable of storing program codes such as a USB disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.

The foregoing describes only the preferred embodiments of the present disclosure, and it should be noted that those of ordinary skill in the art may make improvements and modifications without departing from the principle of the disclosure. These improvements and modifications should be regarded as being within the scope of protection of the present disclosure.

Reference signs in the accompanying drawings:

100: Classification system

110: Processor

112: Main memory

114: Memory

116: Input interface

118: Display interface

120: Communication interface

122: Bus

501: Input unit

503: Classifier

505: Score evaluation unit

507: Identification unit

509: Output unit

511: Evaluation condition storage unit