


Title:
IMPROVED MODEL BASED ON COLLECTIVE FEEDBACK-BASED DECISION ON MODEL PREDICTION QUALITY
Document Type and Number:
WIPO Patent Application WO/2023/209185
Kind Code:
A1
Abstract:
A method for providing model result quality improvement data for improving the quality of the results of a model, comprising: providing a result of the model to a plurality of evaluators; receiving feedback on the result of the model from the plurality of evaluators; determining a collective feedback on the result of the model based on the received feedback on the result of the model from the plurality of evaluators; wherein, when the collective feedback indicates a positive consensus on the received feedback on the result of the model from the plurality of evaluators, classifying the collective feedback as satisfactory; wherein, when the collective feedback indicates a negative consensus on the received feedback on the result of the model from the plurality of evaluators, classifying the collective feedback as non-satisfactory; wherein, when the collective feedback on the result of the model from the plurality of evaluators indicates a non-consensus on the received feedback on the result of the model from the plurality of evaluators, progressively providing further parameters instigating the plurality of evaluators to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback on the result of the model from the plurality of evaluators; and providing model result quality improvement data for improving the quality of the results of the model based on the collective feedback being classified as satisfactory.

Inventors:
SHARMA DIVYASHEEL (IN)
GOPALAKRISHNAN GAYATHRI (SE)
KLOEPPER BENJAMIN (DE)
ASTROM JOAKIM (SE)
SCHMIDT BENEDIKT (DE)
MAN YEMAO (SE)
ZIOBRO DAWID (SE)
KOTRIWALA ARZAM MUZAFFAR (DE)
DIX MARCEL (DE)
Application Number:
PCT/EP2023/061313
Publication Date:
November 02, 2023
Filing Date:
April 28, 2023
Assignee:
ABB SCHWEIZ AG (CH)
International Classes:
G06N3/008; G06Q10/06; G06N20/00; G06Q10/00
Foreign References:
US20190130904A12019-05-02
US20150089399A12015-03-26
Attorney, Agent or Firm:
MAIWALD GMBH (DE)
Claims:
Claims

1. A method for providing model result quality improvement data for improving the quality of the results of a model, comprising: providing a result of the model to a plurality of evaluators; receiving feedback on the result of the model from the plurality of evaluators; determining a collective feedback on the result of the model based on the received feedback on the result of the model from the plurality of evaluators; wherein, when the collective feedback indicates a positive consensus on the received feedback on the result of the model from the plurality of evaluators, classifying the collective feedback as satisfactory; wherein, when the collective feedback indicates a negative consensus on the received feedback on the result of the model from the plurality of evaluators, classifying the collective feedback as non-satisfactory; wherein, when the collective feedback on the result of the model from the plurality of evaluators indicates a non-consensus on the received feedback on the result of the model from the plurality of evaluators, progressively providing further parameters instigating the plurality of evaluators to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback on the result of the model from the plurality of evaluators; and providing model result quality improvement data for improving the quality of the results of the model based on the collective feedback being classified as satisfactory.

2. The method according to claim 1, wherein the feedback on the result of the model from the plurality of evaluators comprises a positive feedback indicating an agreement with the result of the model and a negative feedback indicating a disagreement with the result of the model.

3. The method according to claim 2, wherein, when the collective feedback on the result of the model from the plurality of evaluators indicates a non-consensus on the received feedback on the result of the model from the plurality of evaluators, progressively providing a further parameter comprises the following steps: providing at least one first model explanation to the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; receiving feedback on the first model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; determining a collective feedback on the first model explanation based on the received feedback on the first model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; wherein, when the collective feedback on the first model explanation indicates a consensus on the received feedback on the first model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model, the collective feedback on the first model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback on the result of the model from the plurality of evaluators.

4. The method according to claim 3, wherein, when the collective feedback on the first model explanation indicates a non-consensus on the received feedback on the first model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model, progressively providing a further parameter comprises the following steps: providing at least one second model explanation to the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; receiving feedback on the second model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; determining a collective feedback on the second model explanation based on the received feedback on the second model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; wherein, when the collective feedback on the second model explanation indicates a consensus on the received feedback on the second model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model, the collective feedback on the second model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback on the result of the model from the plurality of evaluators.

5. The method according to claim 4, further comprising: when the collective feedback on the second model explanation still indicates a non-consensus on the received feedback on the second model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model, repeating the following steps until consensus on the result of the model is reached: providing at least one other model explanation to the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; receiving feedback on the other model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback on the result of the model from the plurality of evaluators.

6. The method according to any one of claims 3 to 5, further comprising: when the plurality of evaluators are unable to reach a consensus on the first model explanation, the second model explanation and/or the other model explanations, checking for a lack in the first model explanation, the second model explanation and/or the other model explanations, wherein, when a lack is identified, classifying the result of the model as non-satisfactory such that the model is either rejected due to poor results or referred back to the domain authority for a countercheck of the model, wherein, when a lack is not identified, progressively repeating the following steps: providing at least one other model explanation to the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; receiving feedback on the other model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback on the result of the model from the plurality of evaluators.

7. The method according to any one of claims 3 to 6, wherein the first model explanation, the second model explanation, and/or the other model explanation comprises at least one variable of the model.

8. The method according to claim 7, wherein the first model explanation, the second model explanation, and/or the other model explanation further comprises a relevance-indicating-element for each one of the at least one variable of the model and/or a request-element for requesting feedback, in particular feedback on the result of the model, feedback on the first model explanation, feedback on the second model explanation and/or feedback on the other model explanation, from another evaluator of the plurality of evaluators.

9. The method according to any one of claims 3 to 7, further comprising: when the plurality of evaluators update their feedback such that consensus on the result of the model is reached, storing metadata for the first model explanation, the second model explanation and/or the other model explanation.

10. The method according to any one of the previous claims, wherein the feedback on the result of the model, the feedback on the first model explanation, the feedback on the second model explanation, and/or the feedback on the other model explanation comprises a tuple, in particular a 3-tuple, including the degree of agreement, the degree of disagreement, and the uncertainty in assigning an agreement or a disagreement.

11. The method according to any one of the previous claims, wherein the collective feedback on the result of the model is determined by averaging the feedback on the result of the model from the plurality of evaluators.

12. The method according to any one of the previous claims, wherein the collective feedback on the first model explanation, the second model explanation, and/or the other model explanation is determined by averaging the feedback or by a mathematical operator.

13. The method according to any one of the previous claims, wherein the plurality of evaluators comprises at least two evaluators.

14. The method according to any one of the previous claims, wherein the model is a machine learning model.

15. A system (10) for providing model result quality improvement data for improving the quality of the results of a model, the system comprising: an evaluator interface (11) for providing a result of the model, at least one first model explanation, at least one second model explanation, and/or at least one other model explanation to a plurality of evaluators, and for receiving feedback on the result of the model, feedback on the first model explanation, feedback on the second model explanation, and/or feedback on the other model explanation from the plurality of evaluators; a processor (12) for executing the method according to any one of claims 1 to 14; and a providing unit (13) for providing model result quality improvement data for improving the quality of the results of the model.

Description:
IMPROVED MODEL BASED ON COLLECTIVE FEEDBACK-BASED DECISION ON MODEL PREDICTION QUALITY

TECHNICAL FIELD

The present disclosure relates to a method for providing model result quality improvement data for improving the quality of the results of a model and to a system for providing model result quality improvement data for improving the quality of the results of a model.

TECHNICAL BACKGROUND

The general background of this disclosure is aiding a decision on model result quality based on collective feedback and providing model result quality improvement data for an improved model. In order to improve trust in the result of a model, model evaluators must agree with the result of the model. Hence, model scientists should incorporate the opinions of the model evaluators to train robust models that produce improved results.

Typically, none or only a small number of the widely differing opinions of the model evaluators can be used by the model scientists for improving the model, because only a minor share of those opinions is precise enough and/or sufficiently detailed. Further, the opinions of the model evaluators often differ completely, making it difficult for the model scientist to recognize how to improve the model and which opinion is more decisive, in particular to find a consensus among all opinions. Therefore, model scientists regularly face the problem that the few opinions received are mostly not usable, understandable or comparable, so that a model can only be optimized on the basis of the few opinions laboriously selected by the model scientist. This significantly weakens and prevents a comprehensive and sustainable improvement of the models.

Hence, there is a need for an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model evaluators and to process all of these opinions such that a model can be improved in a significant manner.

SUMMARY OF THE INVENTION

In one aspect of the invention, a method for providing model result quality improvement data for improving the quality of the results of a model is presented, comprising: providing a result of the model to a plurality of evaluators; receiving feedback on the result of the model from the plurality of evaluators; determining a collective feedback on the result of the model based on the received feedback on the result of the model from the plurality of evaluators; wherein, when the collective feedback indicates a positive consensus on the received feedback on the result of the model from the plurality of evaluators, classifying the model as satisfactory; wherein, when the collective feedback indicates a negative consensus on the received feedback on the result of the model from the plurality of evaluators, classifying the model as non-satisfactory; wherein, when the collective feedback on the result of the model from the plurality of evaluators indicates a non-consensus on the received feedback on the result of the model from the plurality of evaluators, progressively providing further parameters instigating the plurality of evaluators to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback on the result of the model from the plurality of evaluators; and providing model result quality improvement data for improving the quality of the results of the model based on the collective feedback indicating a positive consensus on the received feedback on the result of the model.
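For illustration only, the three-way decision described above (positive consensus, negative consensus, non-consensus) can be sketched in a few lines of Python. The function name and the consensus threshold are assumptions introduced for this sketch, not taken from the disclosure; the disclosure does not fix a particular threshold.

```python
def classify_collective_feedback(votes, threshold=0.75):
    """Classify a collective feedback from individual 1/0 votes.

    votes: list of 1 (agree) / 0 (disagree) from the evaluators.
    Returns 'satisfactory' on positive consensus, 'non-satisfactory'
    on negative consensus, and 'non-consensus' otherwise, in which
    case further parameters (model explanations) would be provided.
    """
    share_agree = sum(votes) / len(votes)
    if share_agree >= threshold:
        return "satisfactory"
    if share_agree <= 1 - threshold:
        return "non-satisfactory"
    return "non-consensus"
```

In this sketch the "non-consensus" branch is where the progressive provision of further parameters, described in the following embodiments, would be triggered.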

The term model as used herein is to be understood broadly and represents any system/algorithm that calculates, determines and/or modulates data. The model may be a machine learning model. For instance, the machine learning model may be a model trained to predict yields and provide model explanations, e.g. in the form of a decision tree or a rule, but is not limited thereto. The model may use historical parameters such as temperature, pressure, flowrate, etc., as well as feed composition, catalysts and their amounts and process yields per batch as input, but is not limited thereto. The model may be a model in the training phase or a model deployed in production.

The term evaluator as used herein is to be understood broadly and represents any person who uses, programs, and/or provides the model, or any domain agent. In particular, the evaluator may be a domain expert, a plant operator or a field engineer, but is not limited thereto. In particular, the domain agent may be any artificial intelligence agent having collective domain knowledge, like ChatGPT. For instance, the domain agent may be an SME’s digital twin, a trained Bayesian network based on SME knowledge and data, ChatGPT or another generative model, but is not limited thereto. The domain agent reviews the model based on the model explanations and, say, disagrees with what the model has learnt, e.g. a rule that the oil temperature should be greater than a certain value. In an alternate implementation, multiple domain agents across similar processes or plants may also exchange model explanations and collect recommendations from each domain agent. Considering their own knowledge of the process, the domain agents then propose another set of features and rules that may be more important. The domain agents may even provide a partial order of the relative importance of various features, derive a weighted importance of each feature and provide it as an input to the model.

The term feedback as used herein is to be understood broadly and represents any opinion, belief or view of the evaluator with respect to the model, but is not limited thereto. Further, the term feedback may denote the transmission of evaluative or corrective information about a model, action, event, or process to the original or controlling source. The feedback may be positive or negative, wherein a positive feedback indicates an agreement and is denoted by “1” and a negative feedback indicates a disagreement and is denoted by “0”.

The term result of the model as used herein is to be understood broadly and represents any result provided by a model. For example, the result of the model is a prediction, but is not limited thereto.

The term receiving as used herein is to be understood broadly and represents any action for providing, collecting, and/or recording the feedback from the evaluator.

The term providing as used herein is to be understood broadly and represents any action of showing, depicting or presenting results from a model to the plurality of evaluators. The providing can be carried out via an evaluator interface, a monitor, a display or a touchscreen, but is not limited thereto.

The term model result quality as used herein is to be understood broadly and represents any information or data indicating the validity and the correctness of the model results with respect to a real occurrence or a predefined state.

The term collective feedback as used herein is to be understood broadly and represents any summary/aggregation of the feedback on the result of the model from the plurality of evaluators. The collective feedback may be provided as a single output or as a plurality of outputs.

The term determining as used herein is to be understood broadly and represents any calculation/determination of the collective feedback. The determination may include allocating, averaging or statistical analysis, but is not limited thereto. Alternatively, the determination may be a calculation with a tuple, in particular a tuple <a,d,u>, where a is the degree of agreement with the explanation, d is the degree of disagreement with the explanation, and u is the uncertainty in assigning an agreement or a disagreement, satisfying a + d + u = 1 and a, d, u ∈ [0,1]. When using a tuple, the determining uses a mathematical operator, described by the following formula, for determining the collective feedback:
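The tuple-based determination above can be sketched as follows. The disclosure's exact consensus operator is given by a formula that is not reproduced in this text, so a simple component-wise average is used here as an illustrative stand-in; the class and function names are likewise assumptions for this sketch.

```python
# Sketch of collective-feedback aggregation over <a, d, u> tuples.
# Component-wise averaging is an assumed stand-in for the patent's
# (unreproduced) mathematical consensus operator.

from dataclasses import dataclass


@dataclass
class Feedback:
    agree: float      # a: degree of agreement with the explanation
    disagree: float   # d: degree of disagreement
    uncertain: float  # u: uncertainty in assigning agree/disagree

    def __post_init__(self):
        # Enforce the constraint a + d + u = 1 from the description.
        if abs(self.agree + self.disagree + self.uncertain - 1.0) > 1e-9:
            raise ValueError("a + d + u must equal 1")


def aggregate(feedbacks):
    """Combine individual <a, d, u> tuples into one collective tuple."""
    n = len(feedbacks)
    return Feedback(
        agree=sum(f.agree for f in feedbacks) / n,
        disagree=sum(f.disagree for f in feedbacks) / n,
        uncertain=sum(f.uncertain for f in feedbacks) / n,
    )


collective = aggregate([Feedback(0.8, 0.1, 0.1),
                        Feedback(0.6, 0.2, 0.2),
                        Feedback(0.7, 0.3, 0.0)])
```

Because each input tuple satisfies a + d + u = 1, the component-wise average does too, so the collective feedback remains a valid tuple.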

The term consensus as used herein is to be understood broadly and represents the output of the collective feedback determination. Therefore, the consensus may be the judgment arrived at by the plurality of evaluators or the collective opinion of the plurality of evaluators. The consensus may be positive or negative. A positive consensus indicates that the plurality of evaluators agree with the result of the model. A negative consensus indicates that the plurality of evaluators disagree with the result of the model. The term non-consensus indicates the opposite of the term consensus.

The term satisfactory as used herein is to be understood broadly and represents that the results of the model are good such that the collective feedback on the quality of the results of the model is satisfactory. The term non-satisfactory represents the opposite of the term satisfactory, i.e. the results of the model are bad such that the collective feedback on the quality of the results of the model is non-satisfactory.

The term parameter as used herein is to be understood broadly and represents any further explanation instigating the plurality of evaluators to update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback on the result of the model from the plurality of evaluators.

The term model result quality improvement data as used herein is to be understood broadly and represents any data indicating/being indicative of an improvement of the quality of the results of a model based on the collective feedback indicating a positive consensus on the received feedback on the result of the model. For instance, the model result quality improvement data include data indicating quality improvement adjustment parameters for adjusting the quality of the model, and quality improvement recommendation data recommending which model parameters, variables or routines have to be adjusted in order to improve the quality of the results of the model. The model result quality improvement data can be integrated into the model whose results were evaluated. By including the model result quality improvement data in the old model, the quality of the results of the model can be significantly increased, because disadvantageous quality differences can be compensated.

The term providing model result quality improvement data as used herein is to be understood broadly and represents any action of generating, identifying, determining or presenting data indicating/being indicative of an improvement of the quality of the results of a model based on the collective feedback indicating a positive consensus on the received feedback on the result of the model. For instance, the model result quality improvement data are provided by a further domain agent having collective domain knowledge (like an SME’s digital twin, which could be a trained Bayesian network based on SME knowledge and data or a generative model, but is not limited thereto). This domain agent reviews the model based on the model explanations and the collective feedback, and indicates how to improve the quality of the model. In this context, the domain agent, having knowledge about the model and the collective feedback, then proposes another set of features and rules that may be important for the quality of the results of the model and have to be adapted/adjusted for increasing the quality of the model. The domain agents may also provide a partial order of the relative importance of various features, derive a weighted importance of each feature and provide it as an input to the model, but are not limited thereto. Based on this domain agent use, a new model can be trained based on the domain agent’s recommendations, i.e. features are selected based on the recommendation or biased based on the recommended importance of the features. After training, the new model explanations are provided for predictions. Therefore, the quality of the results of the model can be increased.
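The step of deriving a weighted importance of each feature from a partial order, as described above, can be illustrated with a small sketch. The rank-based scoring scheme, the function name and the example feature names are all assumptions introduced for illustration; the disclosure does not prescribe a particular weighting scheme.

```python
# Illustrative sketch: converting a domain agent's ordering of
# feature importance into normalized weights that could bias
# feature selection when retraining the model.

def weights_from_ranking(ranked_features):
    """ranked_features: list ordered from most to least important.

    Assigns a simple rank-based score (n for the top feature down
    to 1 for the last) and normalizes the scores to sum to 1.
    """
    n = len(ranked_features)
    scores = {feature: n - i for i, feature in enumerate(ranked_features)}
    total = sum(scores.values())
    return {feature: score / total for feature, score in scores.items()}


# Hypothetical ranking a domain agent might propose:
w = weights_from_ranking(["oil_temperature", "pressure", "flowrate"])
```

The resulting weights could then be fed to the training pipeline, e.g. to select features above a weight cutoff or to scale their influence during training.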

The determination of the collective feedback on the result of the model based on the received feedback on the result of the model from the plurality of evaluators leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model evaluators and to process all of these opinions for providing model result quality improvement data for improving the quality of the results of a model. Further, the progressive providing of further parameters leads to an easy, less complex and time-saving way to find/provide a consensus on the received feedback on the result of the model from the plurality of evaluators such that a model scientist can easily process all of these opinions for significantly improving the models. Further, by providing model result quality improvement data, the quality of the model, in particular the quality of the results of the model, can be significantly increased.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the feedback on the result of the model from the plurality of evaluators comprises a positive feedback indicating an agreement with the result of the model and a negative feedback indicating a disagreement with the result of the model.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, when the collective feedback on the result of the model from the plurality of evaluators indicates a non-consensus on the received feedback on the result of the model from the plurality of evaluators, progressively providing a further parameter comprises the following steps: providing at least one first model explanation to the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; receiving feedback on the first model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; determining a collective feedback on the first model explanation based on the received feedback on the first model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; wherein, when the collective feedback on the first model explanation indicates a consensus on the received feedback on the first model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model, the collective feedback on the first model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback on the result of the model from the plurality of evaluators.

The progressive providing of at least one first model explanation, i.e. a further parameter, the receiving of feedback on the first model explanation and the determination of the collective feedback on the first model explanation provide a further progressive step which leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model evaluators and to process, in particular to find a consensus among, all of these opinions such that a model scientist can improve the models in a significant manner.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, when the collective feedback on the first model explanation indicates a non-consensus on the received feedback on the first model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model, progressively providing a further parameter comprises the following steps: providing at least one second model explanation to the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; receiving feedback on the second model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; determining a collective feedback on the second model explanation based on the received feedback on the second model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model; wherein, when the collective feedback on the second model explanation indicates a consensus on the received feedback on the second model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback on the result of the model, the collective feedback on the second model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update their given feedback on the result of the model leading to the non-consensus to a feedback on the result of the model that is in conformity with the consensus on the received feedback on the result of the model from the plurality of evaluators.

The progressive providing of at least one second model explanation, the receiving of feedback on the second model explanation and the determination of the collective feedback on the second model explanation provide a further progressive step which leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model evaluators and to process, i.e. find a consensus among, all of these opinions such that a model scientist can improve the models in a significant manner.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the method further comprises: when the collective feedback on the second model explanation still indicates a non-consensus on the received feedback on the second model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model, repeating the following steps until a consensus on the result of the model is provided: providing at least one other model explanation to those evaluators; receiving feedback on the other model explanation from those evaluators; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from those evaluators; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from those evaluators, the collective feedback on the other model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update the feedback on the result of the model that led to the non-consensus to a feedback being in conformity with the consensus on the received feedback of the result of the model from the plurality of evaluators.

The term repeating used herein is to be understood broadly and represents any replication of one or a plurality of method steps in the same or in an alternative order. The repeating is not an infinite cycle, because there is only a finite set of explanations available for the model, such that the system, in particular the user interface, is only able to show a limited number of explanations to provide a possibility for achieving consensus.

The progressive providing of at least one other model explanation, the receiving of feedback on the other model explanation and the determination of the collective feedback on the other model explanation provide a further progressive step which leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model evaluators and to process all of these opinions such that a model scientist can improve the model in a significant manner.
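The iterative loop described above can be sketched, for illustration only, as follows; the function and parameter names (`seek_consensus`, `collect_feedback`, `is_consensus`) are assumptions for the sketch and not part of the disclosed method:

```python
def seek_consensus(explanations, collect_feedback, is_consensus):
    """Present explanations one by one to the evaluators who caused the
    non-consensus until a consensus is reached or explanations run out.

    Because the set of explanations is finite, the loop always terminates,
    mirroring the 'not an infinite cycle' property stated above."""
    for explanation in explanations:
        collective = collect_feedback(explanation)  # gather evaluator feedback
        if is_consensus(collective):
            return explanation, collective          # consensus achieved
    return None, None                               # no consensus possible
```

The callables are deliberately injected so the sketch stays independent of any particular feedback representation or consensus test.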

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the method further comprises: when the plurality of evaluators are unable to reach a consensus on the first model explanation, the second model explanation and/or the other model explanations, checking for a lack in the first model explanation, the second model explanation and/or the other model explanations, wherein, when a lack is identified, classifying the result of the model as non-satisfactory such that the model is either rejected due to poor results or referred back to the domain authority for a countercheck of the model, and wherein, when a lack is not identified, repeating the following steps: providing at least one other model explanation to the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model; receiving feedback on the other model explanation from those evaluators; determining a collective feedback on the other model explanation based on the received feedback on the other model explanation from those evaluators; wherein, when the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from those evaluators, the collective feedback on the other model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update the feedback on the result of the model that led to the non-consensus to a feedback being in conformity with the consensus on the received feedback of the result of the model from the plurality of evaluators.

The term domain authority used herein is to be understood broadly and represents any person or company which owns the model or is responsible for the model. For example, the domain authority may be the provider of the model or the programmer/software engineer of the model, but is not limited thereto. Further, the domain authority may in particular be subject matter experts who understand the results, e.g. predictions, of the model.

The term lack used herein is to be understood broadly and represents any mistake, error, inconsistency or lack of knowledge identified in the first model explanation, second model explanation and/or other model explanation that makes it impossible for the plurality of evaluators to reach a consensus on the first model explanation, second model explanation and/or other model explanation.

The checking for a lack in the first model explanation, the second model explanation and/or the other model explanation enables an easy, less complex and time-saving way to classify results of a model as non-satisfactory.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the first model explanation, the second model explanation, and/or the other model explanation comprises at least one variable of the model. The term variable of the model as used herein is to be understood broadly and represents any variable or parameter that is able to control, influence, amend or manipulate the behavior, and thus the results, of the model.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the first model explanation, the second model explanation, and/or the other model explanation further comprises a relevance-indicating-element for each one of the at least one variable of the model and/or a request-element for requesting feedback, in particular feedback on the result of the model, feedback on the first model explanation, feedback on the second model explanation and/or feedback on the other model explanation, from another evaluator of the plurality of evaluators.

The term relevance-indicating-element as used herein is to be understood broadly and represents any element indicating the importance/relevance of each one of the at least one variable for the model. Further, the relevance-indicating-element may indicate the impact of a variable on the results of the model.
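A relevance-indicating-element might, purely as a sketch, be realized as a mapping from model variables to importance scores; the variable names and scores below are illustrative assumptions, since the application does not prescribe any particular data structure:

```python
# Hypothetical relevance-indicating-elements for one model explanation:
# each variable is paired with an importance score (e.g. a normalized
# SHAP-style magnitude). Names and values are illustrative only.
explanation = {
    "temperature": 0.55,   # most relevant variable for this prediction
    "pressure":    0.30,
    "flow_rate":   0.15,
}

# The most relevant variable can then be highlighted to the evaluator.
most_relevant = max(explanation, key=explanation.get)
```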

The term request-element as used herein is to be understood broadly and represents any element requesting the system to show the feedback of other evaluators. In other words, the system shares an evaluator's feedback with the other evaluators.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the method further comprises: when the plurality of evaluators update their feedback such that a consensus on the result of the model is provided, storing the metadata for the first model explanation, second model explanation and/or other model explanation.

The term metadata as used herein is to be understood broadly and represents any data being essential for the model. For example, the metadata may include data with respect to the variables of the model as described above. The term storing as used herein is to be understood broadly and represents any saving, exporting or transmitting of data/metadata to a storage means such as a memory, cloud or database, but is not limited thereto.

The storing of the metadata for the first model explanation, second model explanation and/or other model explanation leads to an easy, less complex and time-saving way to receive and collect a plurality of opinions of the model evaluators and to process all of these opinions, in particular to provide the evaluators in progressive steps with parameters instigating them most effectively to update their given feedback so as to find a consensus in an easy manner. Hence, the model scientist can easily process all of these opinions for significantly improving the model.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the feedback on the result of the model, the feedback on the first model explanation, the feedback on the second model explanation, and/or the feedback on the other model explanation comprises a tuple, in particular a 3-tuple, including the degree of agreement, the degree of disagreement, and the uncertainty in assigning an agreement or a disagreement.

The term degree of agreement used herein is to be understood broadly and represents the degree of agreement/accordance between the opinion of the plurality of evaluators and the results of the model. The degree of agreement may be described as a value in the range from 0 to 1 or may be described in percent, but is not limited thereto.

The term degree of disagreement used herein is to be understood broadly and represents the degree of disagreement/discordance between the opinion of the plurality of evaluators and the results of the model. The degree of disagreement may be described as a value in the range from 0 to 1 or may be described in percent, but is not limited thereto.

The term uncertainty used herein is to be understood broadly and represents the degree of uncertainty of the plurality of evaluators in assigning their opinions to an agreement or a disagreement. The uncertainty may be described as a value in the range from 0 to 1 or may be described in percent, but is not limited thereto. The detailed indication of the degree of agreement, the degree of disagreement and the uncertainty leads to an easy, less complex and fast way to instigate the plurality of evaluators to update their given feedback for finding a consensus such that a model scientist can significantly improve the model.
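The constraints on the 3-tuple (each component in [0, 1], components summing to 1) can be captured in a small validation helper; this is an illustrative sketch, and the name `make_feedback` is an assumption, not part of the application:

```python
def make_feedback(agreement, disagreement, uncertainty, tol=1e-9):
    """Return a validated <a, d, u> feedback tuple as defined above:
    a, d, u each in [0, 1] and a + d + u = 1 (within tolerance)."""
    t = (agreement, disagreement, uncertainty)
    if any(not 0.0 <= x <= 1.0 for x in t):
        raise ValueError("each component must lie in [0, 1]")
    if abs(sum(t) - 1.0) > tol:
        raise ValueError("a + d + u must equal 1")
    return t
```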

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the collective feedback on the result of the model is determined by averaging the feedback on the result of the model from the plurality of evaluators.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the collective feedback on the first model explanation, the second model explanation, and/or the other model explanation is determined by averaging the feedback or by a mathematical operator.

The term mathematical operator as used herein is to be understood broadly and represents a determination of the collective feedback over tuples, in particular tuples <a, d, u> (a is the degree of agreement with the explanation, d is the degree of disagreement with the explanation, and u is the uncertainty in assigning an agreement or a disagreement; satisfying a + d + u = 1 and a, d, u ∈ [0, 1]). When using tuples, the determination uses a mathematical operator which, for two feedback tuples <a1, d1, u1> and <a2, d2, u2>, may be described by the following formula for determining the collective feedback: a = a1 · a2, d = 1 − (1 − d1)(1 − d2), u = 1 − a − d.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the plurality of evaluators is at least two evaluators.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the model is a machine learning model.

In an embodiment of the method for providing model result quality improvement data for improving the quality of the results of a model, the method further comprises the step of: providing control data for controlling a providing of another model based on the model and the collective feedback being classified as satisfactory.

In this context, the term providing is to be interpreted as generating or programming etc., but is not limited thereto. In other words, the control data lead to the generation of a new model which has a higher result quality than the previously used model.

In a further aspect, a system for providing model result quality improvement data for improving the quality of the results of a model is presented, the system comprising: an evaluator interface for providing a result of the model, at least one first model explanation, at least one second model explanation, and/or at least one other model explanation to the plurality of evaluators, and for receiving feedback on the result of the model, feedback on the first model explanation, feedback on the second model explanation, and/or feedback on the other model explanation from the plurality of evaluators; a processor for executing the above-described method; and a providing unit (13) for providing model result quality improvement data for improving the quality of the results of a model.

Any disclosure and embodiments described herein relate to the method and the system outlined above and vice versa. Advantageously, the benefits provided by any of the embodiments and examples equally apply to all other embodiments and examples and vice versa.

As used herein, “determining” also includes “initiating or causing to determine”, “generating” also includes “initiating or causing to generate” and “providing” also includes “initiating or causing to determine, generate, select, send or receive”. “Initiating or causing to perform an action” includes any processing signal that triggers a computing device to perform the respective action.

BRIEF DESCRIPTION OF THE DRAWINGS

In the following, the present disclosure is further described with reference to the enclosed figures:

Fig. 1 illustrates a flow diagram of a method for deciding on a model result quality based on a collective feedback of a plurality of evaluators;

Fig. 2 illustrates an example embodiment of a system for deciding on a model result quality based on collective feedback of a plurality of evaluators.

DETAILED DESCRIPTION OF EMBODIMENTS

The following embodiments are mere examples for the method and the system disclosed herein and shall not be considered limiting.

Fig. 1 illustrates a flow diagram of a method for providing model result quality improvement data for improving the quality of the results of a model. In a first step, a result of the model, in particular a prediction of the model, is provided to the plurality of evaluators. The result of the model is provided to the plurality of evaluators via an evaluator interface. In a second step, feedback on the result of the model, i.e. the prediction, is received from the plurality of evaluators. The feedback is received from the plurality of evaluators via the same or another evaluator interface. The feedback of an evaluator is described as “0” for a disagreement of the evaluator with the result of the model and as “1” for an agreement of the evaluator with the result of the model.

In a third step, a collective feedback on the result of the model, i.e. the prediction, is determined based on the received feedback on the result of the model from the plurality of evaluators. The determination of the collective feedback is provided by averaging the received feedback, i.e. the feedback values “0” and “1”.

When the collective feedback indicates a positive consensus, i.e. f(average prediction) = 1, on the received feedback of the result of the model from the plurality of evaluators, the collective feedback is classified as satisfactory, i.e. the quality of the result of the model is good because all evaluators agree with the result of the model. In this case, no further parameters have to be provided. When the collective feedback indicates a negative consensus, i.e. f(average prediction) = 0, on the received feedback of the result of the model from the plurality of evaluators, the collective feedback is classified as non-satisfactory, i.e. the quality of the result of the model is bad because all evaluators disagree with the result of the model. In this case, the non-satisfactory classified collective feedback can be sent for a countercheck to the model authority or can be discarded due to poor results/model quality. When the collective feedback on the result of the model from the plurality of evaluators indicates a non-consensus, i.e. 0 < f(average prediction) < 1, on the received feedback of the result of the model from the plurality of evaluators, further parameters are provided progressively for instigating the plurality of evaluators to update the feedback on the result of the model that led to the non-consensus to a feedback being in conformity with the consensus on the received feedback of the result of the model from the plurality of evaluators. The progressive providing of further parameters comprises the following sub-steps. In a first sub-step, at least one first model explanation, i.e. a further parameter, is provided to the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model. The first model explanation is provided to the plurality of evaluators via an evaluator interface.
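The three-way decision on the averaged 0/1 prediction feedback described above can be sketched as follows; the function name and return labels are illustrative assumptions for the sketch, not terminology prescribed by the application:

```python
def classify_prediction_feedback(votes):
    """Classify the collective prediction feedback from a list of
    0 (disagree) / 1 (agree) evaluator votes, as described above."""
    avg = sum(votes) / len(votes)    # f(average prediction)
    if avg == 1.0:
        return "satisfactory"        # positive consensus: all agree
    if avg == 0.0:
        return "non-satisfactory"    # negative consensus: all disagree
    return "non-consensus"           # mixed votes: provide further parameters
```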
In a second sub-step, feedback on the first model explanation is received from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model. The feedback is received from the plurality of evaluators via the same or another evaluator interface. The feedback on the first model explanation is a tuple, in particular a 3-tuple/triple, including the degree of agreement, the degree of disagreement, and the uncertainty.

Feedback on an explanation (η) = a tuple <a, d, u>, where: a is the degree of agreement with the explanation, d is the degree of disagreement with the explanation, and u is the uncertainty in assigning an agreement or a disagreement, satisfying a + d + u = 1 and a, d, u ∈ [0, 1].

Alternatively, instead of using a 3-tuple/triple for providing feedback, the feedback can also be provided by using probability theory and uncertainty. In a third sub-step, a collective feedback on the first model explanation is determined based on the received feedback on the first model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model. The determination of the collective feedback on the first model explanation is provided by mathematical operators. For two feedback tuples <a1, d1, u1> and <a2, d2, u2>, the operator may be: a = a1 · a2, d = 1 − (1 − d1)(1 − d2), u = 1 − a − d.

For example, a first evaluator agrees with the result of the model (prediction) and largely with the first model explanation but is not sure about the explanation because of a lack of visibility into the complete workings of the model; the first evaluator might provide feedback F_first evaluator = {1, <0.7, 0.2, 0.1>}. Further, a second evaluator also agrees with the prediction but not quite with the first model explanation; the second evaluator provides feedback F_second evaluator = {1, <0.5, 0.4, 0.1>}. When determining the collective feedback by using the above-described mathematical operator and the feedback of the first evaluator and the feedback of the second evaluator, the collective feedback from the first evaluator and the second evaluator on the prediction is π_first evaluator ∧ second evaluator = 1, and the collective feedback on the explanation is η_first evaluator ∧ η_second evaluator = <0.35, 0.52, 0.13>. Therefore, the collective feedback on the explanation is interpreted as a disagreement, because the degree of disagreement has the largest value, i.e. 0.52. The collective feedback may be calculated once the feedback is received from everyone or once one or more feedbacks are received within an allocated time-frame. When the collective feedback on the first model explanation indicates a consensus on the received feedback on the first model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the first model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update the feedback on the result of the model that led to the non-consensus to a feedback being in conformity with the consensus on the received feedback of the result of the model from the plurality of evaluators.
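The operator can be sketched so that it reproduces the worked numbers above; note that the closed form used here (product of agreements, co-product of disagreements, remainder as uncertainty) is an assumption inferred from those numbers, since the original formula image is not reproduced in this text:

```python
def combine(f1, f2):
    """Combine two <a, d, u> explanation-feedback tuples into a
    collective tuple, consistent with the worked example above."""
    a1, d1, _ = f1
    a2, d2, _ = f2
    a = a1 * a2                          # agreements combine multiplicatively
    d = 1.0 - (1.0 - d1) * (1.0 - d2)    # disagreements accumulate
    u = 1.0 - a - d                      # remainder is uncertainty
    return (round(a, 2), round(d, 2), round(u, 2))
```

With the two evaluators' tuples from the example, `combine((0.7, 0.2, 0.1), (0.5, 0.4, 0.1))` yields `(0.35, 0.52, 0.13)`, matching the collective feedback stated in the text.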
In a fourth step, model result quality improvement data for improving the quality of the results of a model are provided based on the collective feedback being classified as satisfactory. The term model result quality improvement data denotes data indicating/being indicative of an improvement of the quality of the results of a model based on the collective feedback indicating a positive consensus on the received feedback of the result of the model. The model result quality improvement data include data indicating quality improvement adjustment parameters for adjusting the quality of the model, and data indicating quality improvement recommendations as to which model parameters, variables or routines have to be adjusted in order to improve the quality of the results of the model. The model result quality improvement data are provided by a further domain agent (having collective domain knowledge, like an SME's digital twin, which could be a trained Bayesian network based on SME knowledge and data or a generative model, but is not limited thereto). This domain agent reviews the model based on the model explanations and the collective feedback, and indicates how to improve the quality of the model.

Optionally, control data for controlling a providing of another model based on the model and the collective feedback being classified as satisfactory are provided. Control data are data leading to the generation of a new model which has a higher result quality than the previously used model.

Optionally, when the collective feedback on the first model explanation indicates a non-consensus on the received feedback on the first model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model, a further parameter is provided progressively, wherein the providing comprises the following sub-steps. In a first sub-step, at least one second model explanation is provided to the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model. The at least one second model explanation is provided to the plurality of evaluators via an evaluator interface. In a second sub-step, feedback on the second model explanation is received from those evaluators. The feedback is received from the plurality of evaluators via the same or another evaluator interface. In a third sub-step, a collective feedback on the second model explanation is determined based on the received feedback on the second model explanation from those evaluators. The determination of the collective feedback is provided by the above-described mathematical operator.
When the collective feedback on the second model explanation indicates a consensus on the received feedback on the second model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the second model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update the feedback on the result of the model that led to the non-consensus to a feedback being in conformity with the consensus on the received feedback of the result of the model from the plurality of evaluators.

Optionally, when the collective feedback on the second model explanation still indicates a non-consensus on the received feedback on the second model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model, the following sub-steps are repeated until a consensus on the result of the model is provided. In a first sub-step, at least one other model explanation is provided to those evaluators. The at least one other model explanation is provided to the plurality of evaluators via an evaluator interface. In a second sub-step, feedback on the other model explanation is received from those evaluators. The feedback is received from the plurality of evaluators via the same or another evaluator interface. In a third sub-step, a collective feedback on the other model explanation is determined based on the received feedback on the other model explanation from those evaluators. The determination of the collective feedback is provided by the above-described mathematical operator.
When the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update the feedback on the result of the model that led to the non-consensus to a feedback being in conformity with the consensus on the received feedback of the result of the model from the plurality of evaluators. This cycle may repeat until agreement is achieved. However, this is not an infinite cycle, because there is only a finite set of explanations available for a model, or only a limited number of explanations is shown, to provide a possibility for achieving consensus. Alternatively or additionally, apart from an evaluator's own professional experience, other further parameters can be used as explanations. Exemplary further parameters serving as explanations may be other modes which evaluators may use to become convinced of the model prediction.
Further parameters are, for example: the evaluator may ask for the importance of another variable than the one for which importance is shown, wherein the method may show the impact of that variable in a visualization (e.g., a SHAP values plot); a scenario evaluation may be performed, wherein the evaluator chooses multiple input variables and the system/method generates visualizations (e.g., partial dependency plots and SHAP summary plots) for them; the evaluators may request the system/method to see the other evaluators' feedback (in this case, the system/method shares each evaluator's feedback with the others) such that the evaluators may change their feedback on the prediction or explanation after reviewing the others' feedback; or the evaluators may contest each other's feedback and together come to a consensus on the feedback.

Optionally, when the plurality of evaluators are unable to reach a consensus on the first model explanation, the second model explanation and/or the other model explanations, the method comprises the step of checking for a lack in the first model explanation, the second model explanation and/or the other model explanations. Evaluators might be unable to reach a consensus on explanations (i.e., the method is unable to gain an agreement on explanations) for the following reasons: a) the evaluators might be undecided, b) there might be a lack of knowledge, c) the explanation did not increase the collective confidence of the evaluators, or d) the explanation did not help at all. Reason a), that the evaluators might be undecided, is present when the collective feedback tuple is, for example, <0.5, 0.0, 0.5>, <0.0, 0.5, 0.5>, <0.5, 0.5, 0.0> or <0.0, 0.0, 1.0>. Reason b), that there is a lack of knowledge, is present when the collective feedback tuple is <0.5, 0.0, 0.5> or <0.0, 0.5, 0.5>. Reason c), that the explanation did not increase the collective confidence of the evaluators in the result of the model, is present when the collective feedback tuple is <0.5, 0.5, 0.0>. Reason d), that the explanation did not help at all, is present when the collective feedback tuple is <0.0, 0.0, 1.0>. When a lack is identified, the result of the model is classified as non-satisfactory such that the model is either rejected due to poor results or referred back to the domain authority for a countercheck of the model. When a lack is not identified, the following sub-steps are repeated progressively. In a first sub-step, at least one other model explanation is provided to the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model.
In a second sub-step, feedback on the other model explanation is received from those evaluators. In a third sub-step, a collective feedback on the other model explanation is determined based on the received feedback on the other model explanation from those evaluators. The determination of the collective feedback is carried out by the above-described mathematical operator. When the collective feedback on the other model explanation indicates a consensus on the received feedback on the other model explanation from the plurality of evaluators who gave feedback leading to the non-consensus on the received feedback of the result of the model, the collective feedback on the other model explanation gains the confidence of the plurality of evaluators in the result of the model such that the plurality of evaluators update the feedback on the result of the model that led to the non-consensus to a feedback being in conformity with the consensus on the received feedback of the result of the model from the plurality of evaluators.
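The mapping from characteristic collective tuples to non-consensus reasons listed above can be sketched as a small classifier; the tuples overlap across reasons in the text, so the priority order chosen here, and the function name, are illustrative assumptions:

```python
def non_consensus_reason(t, tol=1e-9):
    """Map a collective <a, d, u> tuple to a non-consensus reason,
    following the characteristic tuples listed above."""
    a, d, u = t

    def eq(x, y):
        return abs(x - y) < tol

    if eq(u, 1.0):                          # <0.0, 0.0, 1.0>
        return "explanation did not help at all"
    if eq(a, 0.5) and eq(d, 0.5):           # <0.5, 0.5, 0.0>
        return "explanation did not increase collective confidence"
    if eq(u, 0.5) and (eq(a, 0.5) or eq(d, 0.5)):  # <0.5,0,0.5> / <0,0.5,0.5>
        return "undecided / lack of knowledge"
    return "no listed reason"
```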

Alternatively or additionally, when the feedback from the evaluators turns out to be “harmful” for the model, i.e. the result of the model degrades, this harmful feedback may be discarded and transparently shared with the evaluators that provided it, for root-cause analysis and agreement.

Optionally, when the plurality of evaluators update their feedback such that a consensus on the result of the model is provided, the metadata for the first model explanation, second model explanation and/or other model explanation are stored in a memory.

Fig. 2 illustrates an example embodiment of a system for providing model result quality improvement data for improving the quality of the results of a model. The system 10 comprises an evaluator interface 11 configured for providing a result of the model, at least one first model explanation, at least one second model explanation, and/or at least one other model explanation to the plurality of evaluators. Further, the evaluator interface 11 is configured for receiving feedback on the result of the model, feedback on the first model explanation, feedback on the second model explanation, and/or feedback on the other model explanation from the plurality of evaluators. The evaluator interface 11 is a touchscreen able to present information to an evaluator and to receive information from an evaluator. The system 10 further comprises a processor 12 for executing the method described with reference to Fig. 1. The processor 12 and the evaluator interface 11 are connected to each other wirelessly or by wire in order to provide a data exchange between the interface 11 and the processor 12. Further, the system 10 comprises a memory, not depicted, for storing the metadata for the first model explanation, second model explanation and/or other model explanation when the plurality of evaluators update their feedback such that a consensus on the result of the model is provided. The system 10 further comprises a providing unit 13 for providing model result quality improvement data for improving the quality of the results of a model.
The providing unit 13 is communicatively coupled to the evaluator interface 11 and the processor 12.

The present disclosure has been described in conjunction with preferred embodiments as examples as well. However, other variations can be understood and effected by persons skilled in the art in practicing the claimed invention, from a study of the drawings, this disclosure and the claims. In particular, the steps presented can be performed in any order, i.e. the present invention is not limited to a specific order of these steps. Moreover, it is also not required that the different steps be performed at a certain place or at one node of a distributed system, i.e. each of the steps may be performed at different nodes using different equipment/data processing units.

In the claims as well as in the description the word “comprising” does not exclude other elements or steps and the indefinite article “a” or “an” does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.