

Title:
MECHANISTIC CAUSAL REASONING FOR EFFICIENT ANALYTICS AND NATURAL LANGUAGE
Document Type and Number:
WIPO Patent Application WO/2021/092099
Kind Code:
A1
Abstract:
A system and method for mechanistic causal reasoning are provided herein. The method includes receiving an input text from a user, the input text specified in a natural language. The method also includes building a knowledge graph that represents real world facts and associations in the form of contextually tagged and weighted knowledge propositions, in multiple knowledge domains (e.g., causality, taxonomy, meronomy, time, space, identity, language, symbols and mathematical formulas). The method also includes resolving ambiguity and determining actual intent of the user for the input text, from a plurality of interpretations of intent for sentences in natural language understanding, using the knowledge graph in conjunction with natural language understanding and logical inference. The method also includes generating a response to the input text, as to why and/or how unknown factors resulted in a known outcome, or what outcomes are likely given known causal factors.

Inventors:
ROUSHAR JOSEPH (US)
Application Number:
PCT/US2020/058999
Publication Date:
May 14, 2021
Filing Date:
November 05, 2020
Assignee:
EPACCA INC (US)
International Classes:
G06F40/10; G06F40/20; G06F40/205; G06F40/284; G06F40/30; G06F40/40; G06F40/42
Foreign References:
US20160078039A1 (2016-03-17)
US20160378851A1 (2016-12-29)
US7403890B2 (2008-07-22)
Attorney, Agent or Firm:
RYAN, Michael, S. et al. (US)
Claims:
What is claimed is:

1. A system comprising: one or more memory units each operable to store at least one program; and at least one processor communicatively coupled to the one or more memory units, in which the at least one program, when executed by the at least one processor, causes the at least one processor to: receive input data from a user describing a case and known background information about the case, wherein the case is a set of causes and/or outcomes, wherein the information about the case lacks sufficient information about why a known outcome occurred or what outcome will occur as a result of known causal factors; determine whether the user intends to generate a predicted outcome from known causes or generate predicted causes from a known outcome, wherein forward reasoning that maps known causes to inferred outcomes, or reverse reasoning that maps known outcomes to inferred causal factors, are based on a knowledge graph with subgraphs, wherein at least one subgraph is linked to another subgraph, each of the subgraphs representing a knowledge proposition including: a subject component, an associate component, a named relationship component that links the subject component and the associate component, a context component that identifies a domain of knowledge in which an association is true, a qualifier component that describes a constraint governing the relationship that further narrows the context in which the association is true, a weight component that is a probability factor of a likelihood that the subject component is related to the associate component in the context identified by the context component, and a mechanism component that describes an action that the subject component is performing to affect the associate component; traverse the knowledge graph, including: associate each word of the input with a lexicon object and associate each lexicon object with a plurality of propositions in the knowledge graph, wherein each proposition corresponds to a subgraph, wherein propositions define a relationship between the subject component and the associate component in the subgraph; classify the input and associated knowledge propositions into named attributes of named specialized processing areas based on named relationships in propositions, wherein each specialized processing area represents a contextual component of a solution, wherein each attribute in each specialized processing area represents a characteristic associated with a concept defining a respective specialized processing area, wherein a candidate is a potential component of an unknown outcome and/or unknown cause associated with the named attribute, wherein each candidate is associated with a modifiable confidence vector including a weight component and an emergence flag, wherein processing in a specialized processing area includes: activating emergent behavior by modifying the weight component of each confidence vector of each candidate, wherein a starting value of the weight component is based on the knowledge proposition weight stored in the knowledge graph, wherein the value of the weight component is increased each time a corroborating knowledge proposition is processed, and wherein the value of the weight component is decreased each time a refuting knowledge proposition is processed, and modifying a candidate confidence vector of each candidate in each specialized processing area based on frequency of matching between a respective candidate and at least one of: (i) a respective user input, doping input and priming
input, and (ii) knowledge propositions encoded in at least one subgraph, to bring about emergent behavior by incrementing or decrementing a weighting component of the candidate confidence vector; extract emergent candidates from each specialized processing area with a largest value of the weighting component of the candidate confidence vector; and generate a solution of unknown outcome and/or unknown causes based on emergent candidates for each attribute of the specialized processing areas.

2. The system of claim 1, further comprising: a storage architecture configured to include temporary special processing area structures used to classify and organize the input data from a user by named category, including: a plurality of context dimensions as ordered multi-dimensional storage structures, including: a named context header, one or more attribute dimensions, each attribute dimension representing a subject component, each attribute dimension being associated with a respective candidate dimension, wherein one or more attribute dimensions is associated with a respective context dimension, each attribute dimension containing a name representing a specific concept applicable to the named context header of said associated context dimension; wherein at least one candidate dimension contains zero or more knowledge propositions, for each named attribute object in said attribute dimension.

3. The system of claim 2, wherein, in at least one multi-dimensional structure specialized in causality, when more than one causal candidate exists in an attribute, said causal candidates are ordered to represent a causal path of predecessor and successor knowledge propositions that form causal factors; wherein in any given multi-dimensional structure specialized in taxonomy, when more than one candidate exists in an attribute, said candidates are ordered to represent a hierarchical or taxonomical ordering scheme of super-ordinate and subordinate classes of objects; wherein in any given multi-dimensional structure specialized in space and time, when more than one candidate exists in an attribute, said candidates are ordered to represent a spatial or temporal ordering scheme of location and time classes of objects; wherein in any given multi-dimensional structure specialized in meronomy, when more than one candidate exists in an attribute, said candidates are ordered to represent a part-whole constructive ordering scheme of part and whole classes of objects; wherein each attribute dimension is defined as either required or optional for solution generation; wherein each said candidate is associated with a vector comprised of magnitude and direction components, constituting an adjustable score for each said candidate; and wherein candidate object related information further includes an original magnitude and emergence flag for each candidate.
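
For illustration only (not part of the claims), the storage structures recited in claims 2 and 3 might be sketched in Python as follows; the class names, field names, and the vector encoding are assumptions made for this sketch:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class CandidateEntry:
        proposition: object        # the knowledge proposition (a directed subgraph)
        magnitude: float           # adjustable score (vector magnitude)
        direction: str             # "emerging", "static", or "falling"
        original_magnitude: float  # retained original magnitude
        emergent: bool = False     # emergence flag

    @dataclass
    class AttributeDimension:
        name: str                  # concept applicable to the context header
        required: bool             # required vs. optional for solution generation
        # Candidates are kept ordered per the governing domain: causal path,
        # taxonomic hierarchy, spatial/temporal order, or part-whole order.
        candidates: List[CandidateEntry] = field(default_factory=list)

    @dataclass
    class ContextDimension:
        header: str                # named context, e.g. "causality" or "taxonomy"
        attributes: List[AttributeDimension] = field(default_factory=list)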

4. The system of claim 3, wherein the at least one program further includes instructions for analysis of meaning of an ordered group of input text objects forming natural language phrases and sentences based on a scoring strategy, the instructions comprising: segregating individual words in the input text, adding them to a word list in short-term memory (STM) and searching for each input word in a lexicon having a plurality of words therein, each said word linked to a plurality of knowledge propositions; analyzing morphology of said words by determining if a prefix or suffix has been added to a root word to form said input words, and adding root words to the word list; extracting, from the knowledge graph, said knowledge propositions formed, in part, by each word in the word list; classifying a plurality of candidates formed of directed subgraphs, each said candidate describing an explicit logical relationship between one object and another object, into a specialized processing area; comparing a first or X object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects; comparing a second or Y object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects; comparing a third or C object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects; invoking and executing interpretation heuristics associated with the named relationship or R values of the candidates with the highest score vectors to further reorder concepts in each attribute dimension of each specialized processing area based on fitness; adjusting the score vector assigned to affected candidates based on a quantity of recurring objects or a frequency of encountering recurring objects during heuristic processes; reordering said candidates based on the direction and magnitude of said vectors, wherein said vector directions comprise emerging, static, and falling conditions; wherein said vector magnitudes comprise numeric values that, when compared with a numeric threshold value, are determined to be above threshold, at threshold, or below threshold value; determining the context of the input text based on the highest scored C object in the appropriate specialized processing areas; invoking and executing additional heuristics to find candidates for any required attributes with no candidates, and if found, repeating the process of segregating, analyzing, extracting, classifying, comparing, invoking, adjusting, reordering, determining, invoking and executing additional heuristics steps; applying a fitness algorithm to determine the fittest candidates of those compared in each attribute dimension of each specialized processing area; and formulating a meaning profile based on the highest scoring or fittest emergent candidate of each attribute dimension of each specialized processing area.
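
For illustration only (not part of the claims), the X, Y and C comparisons and threshold logic of claim 4 could be approximated as below, reusing the structures sketched after claim 3; the step sizes and threshold are placeholders, and the proposition object is assumed to expose subject (X), associate (Y) and context (C) fields:

    SCORE_THRESHOLD = 1.0   # placeholder numeric threshold

    def score_candidate(entry, stm_word_list):
        # Compare the candidate's X (subject), Y (associate) and C (context)
        # objects against the short-term memory word list and adjust the
        # candidate's vector by the quantity of matching objects.
        prop = entry.proposition
        matches = sum(obj in stm_word_list
                      for obj in (prop.subject, prop.associate, prop.context))
        if matches:
            entry.magnitude += 0.1 * matches   # matches raise the score
            entry.direction = "emerging"
        else:
            entry.magnitude -= 0.05            # no support lets the score fall
            entry.direction = "falling"
        # "static" would apply when no adjustment is made at all.
        # Above, at, or below threshold determines emergence.
        entry.emergent = entry.magnitude >= SCORE_THRESHOLD
        return entry.magnitude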

5. The system of claim 3, wherein the at least one program further includes instructions for performing deep natural language understanding, the instructions comprising: receiving input text, formed of a plurality of words, and matching each word with a word in the lexicon to populate an ordered word list; extracting phrases including idioms in the lexicon in which one or more words in the input appear in the phrase, and adding such phrases to said word list; using punctuation and other linguistic cues to segregate each sentence in the input to store each input sentence into an ordered sentence matrix; extracting, from the knowledge graph, propositions formed, in part, by each word in the word list; classifying said extracted propositions in the specialized processing areas based on an applicable attribute of a respective specialized processing area; applying the fitness algorithms to determine the fittest propositions of those compared; and invoking natural language understanding heuristics to interpret the context and relationships of said words, phrases and sentences by analyzing each level of linguistic content of said data objects, wherein the levels include pragmatics or context, semantics, grammar or syntax, morphology, phonology, and prosody.

6. The system of claim 1, wherein the at least one program further includes instructions for analysis of the causality based on a scoring strategy of an ordered group of input text objects forming natural language words and phrases classified into a specialized processing area for causality fitness processing representing causal factors or outcomes, the instructions comprising: providing a plurality of candidates formed of directed subgraphs, each said candidate describing an explicit causal relationship between one object and another object; comparing a first or X object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects; comparing a second or Y object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects; comparing a third or C object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects; invoking and executing causality heuristics associated with the named relationship or R values of the candidates with the highest score vectors to further reorder concepts in the attributes dimension of each specialized processing area; adjusting the score vector assigned to affected candidates based on the quantity of common objects or the frequency of encountering common objects during heuristic processes; reordering said candidates based on the direction and magnitude of said vectors, wherein said vector directions comprise emerging, static, and falling conditions; wherein said vector magnitudes comprise numeric values that, when compared with a numeric threshold value, are determined to be above threshold or emergent, at threshold, or below threshold value; determining the context of the input text based on the highest scored or fittest emergent C object in the appropriate specialized processing areas; invoking and executing additional heuristics to find candidates for any required attributes with no candidates, and if found, repeating the providing, comparing the first or X object, comparing the second or Y object, comparing the third or C object, invoking and executing, adjusting the score vector, reordering, determining the context, and invoking and executing the additional heuristics; and invoking and executing causality heuristics to create contiguous causal chains or paths that identify and order the most likely causal factors and outcomes for the input data set.

7. The system of claim 6, further comprising means for generating, filtering and scoring alternative candidates for solutions including, forward-looking solutions selecting and prioritizing predicted outcomes for known causal factors; reverse solutions selecting and prioritizing likely candidate causal factors for known outcomes; heuristic algorithms for applying forward-chaining inference rules to adjust the prioritization of solution candidates; heuristic algorithms for applying backward-chaining inference rules to find candidates in the input or the knowledge network for required attribute dimensions with no candidates; rules within the heuristic algorithms for differentiating binary and non-binary factors and applying weighting to each candidate to show both the likelihood of the candidate of forming part of a final solution and the degree to which emergent candidates participate in the outcome; inheritance rules within the heuristic algorithms for applying characteristics of higher-ordered taxonomical concepts to lower-ordered taxonomical concepts; and a human user interface to display prioritized solutions, their weightings and explanations.

8. The system of claim 7, further comprising a lineage tracking algorithm for generating explanations based on the rules and causal path that lead to the solution, and why other possible solutions were rejected.

9. The system of claim 7, further comprising means of automatically validating a solution by searching literature with an advanced causal natural language interpreter to find and analyze corroborating text stating that said solution is possible, common, unlikely or impossible, including, searching and analyzing text in web pages on the open web; searching and analyzing text in deep web content sources with limited access controlled by membership; and searching and analyzing text in case data in internal systems, documents and databases.

10. The system of claim 1, wherein the at least one program further includes instructions for searching a plurality of named sources for information to be used in the creation of new knowledge propositions to build a knowledge graph for use in causal reasoning and natural language understanding, and in the validation of inferred knowledge propositions and solutions, further comprising: a knowledge graph comprising a plurality of predefined seed concept nodes connected by descriptive, taxonomical, meronomical, spatial, temporal, linguistic and other named relationship vertices, and a plurality of directed subgraphs containing manually defined mechanistic cause and effect nodes connected by relation vertices; a search string formulator algorithm and user interface to search a plurality of named sources for content matching the search string or logical components thereof; a source list manager and user interface for selecting sources to search to support learning and validation; a search bot to read text in each source to find phrases that contain the knowledge for comparison in natural language structures that augment, corroborate or refute existing knowledge propositions; machine learning algorithms using natural language analysis to scan text input from prior literature to automatically infer causal and other relationships contained in the text based on declarative statements containing both cause and effect in transitive active (if/then) or passive (result/because) structure; an inference heuristic with knowledge proposition formation rules that enable creation of new well-formed knowledge propositions of the structure of Claim 1; a plurality of heuristic algorithms for generating concept nodes and descriptive, taxonomical, meronomical, spatial, temporal, linguistic and other named relationships, and generating new directed subgraphs containing mechanistic cause and effect nodes connected by relation vertices based on previously inferred causal and other relationships; weighting algorithms for applying and adjusting confidence values to relations between nodes and directed subgraphs in the knowledge graph based on frequency of validation in literature search; qualifying heuristics using nodes, wherein the qualifier defines a known constraint that further defines the unique relationship between the nodes in a subgraph; machine learning algorithms and heuristics to associate newly acquired or inferred concepts and subgraphs to concepts and subgraphs already present in the knowledge graph, then flag them for validation prior to permanent storage; machine learning algorithms and heuristics to modify pre-existing stored knowledge graph nodes, named relationships, subgraphs, their components and weights; validation heuristics for using found knowledge propositions to augment, corroborate or refute solutions derived from causal reasoning processes; and wherein said sources of information include web pages, natural language material stored on permanent storage media such as file stores accessible to the system, or case data stored in content management systems or databases.
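
For illustration only (not part of the claims), the transitive active (if/then) and passive (result/because) extraction recited in claim 10 might, in its simplest form, be approximated with patterns such as these; the regular expressions are illustrative toys, far short of the disclosed natural language analysis:

    import re

    ACTIVE = re.compile(r"\bif\s+(?P<cause>.+?),?\s+then\s+(?P<effect>[^.;]+)", re.I)
    PASSIVE = re.compile(r"(?P<effect>[^.;]+?)\s+because\s+(?P<cause>[^.;]+)", re.I)

    def infer_causal_pairs(text):
        # Scan declarative statements for cause/effect pairs that could seed
        # new knowledge propositions, to be flagged for validation.
        for pattern in (ACTIVE, PASSIVE):
            for m in pattern.finditer(text):
                yield m.group("cause").strip(" ,."), m.group("effect").strip(" ,.")

    # e.g. list(infer_causal_pairs("If glucose is present, then yeast ferments."))
    #      -> [("glucose is present", "yeast ferments")]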

11. The system of claim 1, wherein processing in the specialized processing area further includes: retrieving doping inputs and priming inputs from a context-associated heuristic algorithm that generates respective doping inputs and priming inputs for the input, and applying the respective doping inputs and priming inputs to each candidate in each attribute in each specialized processing area.

12. The system of claim 1, wherein the at least one program further includes instructions for: detecting gaps by determining whether any attribute of any specialized processing area that is required for a solution has no candidates, and in response to determining that a respective attribute has no candidates, performing a further search of the knowledge graph for possible candidates.

13. A method comprising: receiving input data from a user, the input data describing a case and known background information about the case, wherein the case is a set of causes and/or outcomes, wherein the information about the case lacks sufficient information about why a known outcome occurred or what outcome will occur as a result of known causal factors; determining whether the user intends to generate a predicted outcome from known causes or generate predicted causes from a known outcome, wherein forward reasoning that maps known causes to inferred outcomes, or reverse reasoning that maps known outcomes to inferred causal factors, are based on a knowledge graph with subgraphs, wherein at least one subgraph is linked to another subgraph, each of the subgraphs representing a knowledge proposition including: a subject component, an associate component, a named relationship component that links the subject component and the associate component, a context component that identifies a domain of knowledge in which an association is true, a qualifier component that describes a constraint governing the relationship that further narrows the context in which the association is true, a weight component that is a probability factor of a likelihood that the subject component is related to the associate component in the context identified by the context component, and a mechanism component that describes an action that the subject component is performing to affect the associate component; traversing the knowledge graph, including: associating each word of the input with a lexicon object and associating each lexicon object with a plurality of propositions in the knowledge graph, wherein each proposition corresponds to a subgraph, wherein propositions define a relationship between the subject component and the associate component in the subgraph; classifying the input and associated knowledge propositions into named attributes of named specialized processing areas based on named relationships in propositions, wherein each specialized processing area represents a contextual component of a solution, wherein each attribute in each specialized processing area represents a characteristic associated with a concept defining a respective specialized processing area, wherein a candidate is a potential component of an unknown outcome and/or unknown cause associated with the named attribute, wherein each candidate is associated with a modifiable confidence vector including a weight component and an emergence flag, wherein processing by a specialized processing area includes: activating emergent behavior by modifying the weight component of each confidence vector of each candidate, wherein a starting value of the weight component is based on the knowledge proposition weight stored in the knowledge graph, wherein the value of the weight component is increased each time a corroborating knowledge proposition is processed, and wherein the value of the weight component is decreased each time a refuting knowledge proposition is processed, and modifying a candidate confidence vector of each candidate in each specialized processing area based on frequency of matching between a respective candidate and at least one of: (i) a respective user input, doping input and priming input, and (ii) knowledge propositions encoded in at least one subgraph, to bring about emergent behavior by incrementing or decrementing a weighting component of the candidate confidence vector; extracting emergent candidates from each
specialized processing area with a largest value of the weighting component of the candidate confidence vector; and generating a solution of unknown outcome and/or unknown causes based on emergent candidates for each attribute of the specialized processing areas.

14. A non-transitory computer readable storage medium, wherein the non-transitory computer readable storage medium stores instructions, which when executed by a computer system, cause the computer system to perform a method comprising: receiving input data from a user, the input data describing a case and known background information about the case, wherein the case is a set of causes and/or outcomes, wherein the information about the case lacks sufficient information about why a known outcome occurred or what outcome will occur as a result of known causal factors; determining whether the user intends to generate a predicted outcome from known causes or generate predicted causes from a known outcome, wherein forward reasoning that maps known causes to inferred outcomes, or reverse reasoning that maps known outcomes to inferred causal factors, are based on a knowledge graph with subgraphs, wherein at least one subgraph is linked to another subgraph, each of the subgraphs representing a knowledge proposition including: a subject component, an associate component, a named relationship component that links the subject component and the associate component, a context component that identifies a domain of knowledge in which an association is true, a qualifier component that describes a constraint governing the relationship that further narrows the context in which the association is true, a weight component that is a probability factor of a likelihood that the subject component is related to the associate component in the context identified by the context component, and a mechanism component that describes an action that the subject component is performing to affect the associate component; traversing the knowledge graph, including: associating each word of the input with a lexicon object and associating each lexicon object with a plurality of propositions in the knowledge graph, wherein each proposition corresponds to a subgraph, wherein propositions define a relationship between the subject component and the associate component in the subgraph; classifying the input and associated knowledge propositions into named attributes of named specialized processing areas based on named relationships in propositions, wherein each specialized processing area represents a contextual component of a solution, wherein each attribute in each specialized processing area represents a characteristic associated with a concept defining a respective specialized processing area, wherein a candidate is a potential component of an unknown outcome and/or unknown cause associated with the named attribute, wherein each candidate is associated with a modifiable confidence vector including a weight component and an emergence flag, wherein processing by a specialized processing area includes: activating emergent behavior by modifying the weight component of each confidence vector of each candidate, wherein a starting value of the weight component is based on the knowledge proposition weight stored in the knowledge graph, wherein the value of the weight component is increased each time a corroborating knowledge proposition is processed, and wherein the value of the weight component is decreased each time a refuting knowledge proposition is processed, and modifying a candidate confidence vector of each candidate in each specialized processing area based on frequency of matching between a respective candidate and at least one of: (i) a respective user input, doping input and priming input, and (ii) knowledge
propositions encoded in at least one subgraph, to bring about emergent behavior by incrementing or decrementing a weighting component of the candidate confidence vector; extracting emergent candidates from each specialized processing area with a largest value of the weighting component of the candidate confidence vector; and generating a solution of unknown outcome and/or unknown causes based on emergent candidates for each attribute of the specialized processing areas.

15. A method for mechanistic causal reasoning, comprising: receiving an input text from a user, the input text specified in a natural language; building a knowledge graph that represents real world facts and associations in the form of contextually tagged and weighted knowledge propositions, in multiple knowledge domains including causality, taxonomy, meronomy, time, space, identity, language, symbols and mathematical formulas; resolving ambiguity and determining actual intent of the user for the input text, from a plurality of interpretations of intent for sentences in natural language understanding, using the knowledge graph in conjunction with natural language understanding and logical inference; and generating a response for the user, as to why and/or how unknown factors resulted in a known outcome, or what outcomes are likely given known causal factors, based on resolving the ambiguity and determination of the actual intent.

16. A non-transitory computer readable storage medium, wherein the non-transitory computer readable storage medium stores instructions, which when executed by a computer system, cause the computer system to perform a method comprising: receiving an input text from a user, the input text specified in a natural language; building a knowledge graph that represents real world facts and associations in the form of contextually tagged and weighted knowledge propositions, in multiple knowledge domains including causality, taxonomy, meronomy, time, space, identity, language, symbols and mathematical formulas; resolving ambiguity and determining actual intent of the user for the input text, from a plurality of interpretations of intent for sentences in natural language understanding, using the knowledge graph in conjunction with natural language understanding and logical inference; and generating a response for the user, as to why and/or how unknown factors resulted in a known outcome, or what outcomes are likely given known causal factors, based on resolving the ambiguity and determination of the actual intent.

17. A system comprising: one or more memory units each operable to store at least one program; and at least one processor communicatively coupled to the one or more memory units, in which the at least one program, when executed by the at least one processor, causes the at least one processor to: receive an input text from a user, the input text specified in a natural language; build a knowledge graph that represents real world facts and associations in the form of contextually tagged and weighted knowledge propositions, in multiple knowledge domains including causality, taxonomy, meronomy, time, space, identity, language, symbols and mathematical formulas; resolve ambiguity and determine actual intent of the user for the input text, from a plurality of interpretations of intent for sentences in natural language understanding, using the knowledge graph in conjunction with natural language understanding and logical inference; and generate a response for the user, as to why and/or how unknown factors resulted in a known outcome, or what outcomes are likely given known causal factors, based on resolving the ambiguity and determination of the actual intent.

Description:
Mechanistic Causal Reasoning for Efficient Analytics and Natural

Language

PRIORITY CLAIM AND RELATED APPLICATION

[0001] This application claims the benefit of U.S. Provisional Application No. 62/930,742, filed November 05, 2019, the content of which is incorporated herein in its entirety.

TECHNICAL FIELD

[0002] The present disclosure relates to natural language processing systems, and in particular, to systems, methods, and devices for analytics, search and natural language processing using mechanistic causal reasoning.

BACKGROUND

[0003] The human brain is very good at resolving ambiguity; the computer is not. Conventional natural language processing (NLP) systems use statistical models that do not attempt to understand the intent of natural language (NL) text. These systems statistically match a phrase with a known task or a corresponding phrase in the same or another language. It has long been understood that “meaning-” or “knowledge-based” approaches to language understanding can come closer to human competency. However, the computational and storage demands of these more human-like approaches are assumed to be so high as to be infeasible with conventional computing hardware and software.

SUMMARY

[0004] Accordingly, there is a need for computationally efficient “meaning-” or “knowledge-based” systems and methods for language understanding. Techniques described herein can be used to implement automated mechanistic causal reasoning. Unlike conventional NLP tools that perform tokenization, morphology and syntax analysis, and lightweight semantics, and unlike machine learning (ML) tools that perform phrase analysis and fuzzy phrase comparisons, systems according to the techniques described herein analyze words, phrases and sentences in text at the morphology, syntax, semantics, context, and discourse pragmatics levels, with fuzzy heuristic processes at each level. These techniques can be used to interpret meaning, answer questions, perform tasks, control internet-of-things (IoT) devices, identify key ideas and topics, identify word correlations, analyze sentiment, summarize text, translate spoken words or phrases, implement chat-bots, implement dynamic dialog, translate text, and/or analyze causality.

[0005] Various implementations of systems, methods and devices within the scope of the appended claims each have several aspects, no single one of which is solely responsible for the desirable attributes described herein. Without limiting the scope of the appended claims, some prominent features are described. After considering this discussion, and particularly after reading the section entitled “Detailed Description,” one will understand how the features of various implementations are used to address issues with traditional methods.

[0006] According to some embodiments, a method is provided for mechanistic causal reasoning using techniques described above. The method is performed by a system that includes one or more memory units each operable to store at least one program, and at least one processor communicatively coupled to the one or more memory units, in which the at least one program, when executed by the at least one processor, causes the at least one processor to perform steps of the method. The method includes receiving input data from a user. The input data describes a case and known background information about the case. The case is a set of causes and/or outcomes. The information about the case lacks sufficient information about why a known outcome occurred or what outcome will occur as a result of known causal factors. The method also includes determining whether the user intends to generate a predicted outcome from known causes or generate predicted causes from a known outcome. Forward reasoning that maps known causes to inferred outcomes, or reverse reasoning that maps known outcomes to inferred causal factors, are based on a knowledge graph with subgraphs. At least one subgraph is linked to another subgraph. Each of the subgraphs represents a knowledge proposition including: a subject component, an associate component, a named relationship component that links the subject component and the associate component, a context component that identifies a domain of knowledge in which an association is true, a qualifier component that describes a constraint governing the relationship that further narrows the context in which the association is true, a weight component that is a probability factor of a likelihood that the subject component is related to the associate component in the context identified by the context component, and a mechanism component that describes an action that the subject component is performing to affect the associate component.
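
For illustration, a minimal Python sketch of one such knowledge proposition follows; the class name, field names, and the example values are assumptions made for this sketch, not structures defined by the disclosure:

    from dataclasses import dataclass

    @dataclass
    class KnowledgeProposition:
        # One directed subgraph: subject (X) --relationship (R)--> associate (Y),
        # tagged with context (C), qualifier, weight and mechanism (f).
        subject: str        # X: the acting concept
        relationship: str   # R: named relationship linking subject and associate
        associate: str      # Y: the affected concept
        context: str        # C: domain of knowledge in which the association holds
        qualifier: str      # constraint that further narrows the context
        weight: float       # likelihood that X relates to Y in context C
        mechanism: str      # f: action the subject performs to affect the associate

    # Hypothetical proposition from a sunrise model (cf. Figure 2C):
    rotation = KnowledgeProposition(
        subject="Earth's rotation", relationship="causes", associate="sunrise",
        context="astronomy", qualifier="observer fixed on Earth's surface",
        weight=0.99, mechanism="exposes each longitude to sunlight in sequence")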

[0007] The method also includes traversing the knowledge graph. Traversing the knowledge graph includes associating each word of the input with a lexicon object, and associating each lexicon object with a plurality of propositions in the knowledge graph. Each proposition corresponds to a subgraph, and the propositions define a relationship between the subject component and the associate component in the subgraph. Traversing the knowledge graph also includes classifying the input and associated knowledge propositions into named attributes of named specialized processing areas based on named relationships in propositions. Each specialized processing area represents a contextual component of a solution. Each attribute in each specialized processing area represents a characteristic associated with a concept defining a respective specialized processing area. A candidate is a potential component of an unknown outcome and/or unknown cause associated with the named attribute. Each candidate is associated with a modifiable confidence vector including a weight component and an emergence flag.
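
Continuing the sketch above, the word-to-lexicon-to-proposition classification could be approximated as follows; the relationship-to-area mapping and the two lookup tables are assumptions made for this sketch:

    from collections import defaultdict

    # Illustrative subset of named relationships mapped to specialized
    # processing areas (SPAs); the disclosure's inventory is broader.
    RELATION_TO_SPA = {"causes": "causality", "is_a": "taxonomy", "part_of": "meronomy"}

    def classify_input(words, lexicon, propositions_by_lexeme):
        # lexicon: surface word -> lexicon object (here, a canonical lexeme string)
        # propositions_by_lexeme: lexeme -> list of KnowledgeProposition
        spas = defaultdict(list)   # SPA name -> candidate propositions
        for word in words:
            lexeme = lexicon.get(word.lower())
            for prop in propositions_by_lexeme.get(lexeme, []):
                area = RELATION_TO_SPA.get(prop.relationship)
                if area is not None:
                    spas[area].append(prop)   # the proposition becomes a candidate
        return spas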

[0008] Processing in a specialized processing area includes activating emergent behavior by modifying the weight component of each confidence vector of each candidate. A starting value of the weight component is based on the knowledge proposition weight stored in the knowledge graph. The value of the weight component is increased each time a corroborating knowledge proposition is processed, and the value of the weight component is decreased each time a refuting knowledge proposition is processed. In some embodiments, processing in the specialized processing area also includes retrieving doping inputs and priming inputs from a context associated heuristic algorithm that generates respective doping inputs and priming inputs for the input, and applying the respective doping inputs and priming inputs to each candidate in each attribute in each specialized processing area. In some embodiments, processing in the specialized processing area also includes modifying a candidate confidence vector of each candidate in each specialized processing area based on frequency of matching between a respective candidate and at least one of: (i) a respective user input, doping input and priming input, and (ii) knowledge propositions encoded in at least one subgraph, to bring about emergent behavior by incrementing or decrementing a weighting component of the candidate confidence vector. Processing in a specialized processing area also includes generating (as described above) a solution of unknown outcome and/or unknown causes based on emergent candidates for each attribute of the specialized processing areas.
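
The corroborate/refute adjustment described above might be sketched as follows; the step size and emergence threshold are placeholders, since the disclosure does not fix numeric values:

    EMERGENCE_THRESHOLD = 0.75   # placeholder; no value is specified here

    class Candidate:
        # A potential component of an unknown cause or outcome, carrying a
        # modifiable confidence vector (weight component plus emergence flag).
        def __init__(self, proposition):
            self.proposition = proposition
            self.weight = proposition.weight   # starting value from the knowledge graph
            self.emergent = False

        def corroborate(self, step=0.05):
            # Each corroborating knowledge proposition processed raises the weight.
            self.weight = min(1.0, self.weight + step)
            self.emergent = self.weight >= EMERGENCE_THRESHOLD

        def refute(self, step=0.05):
            # Each refuting knowledge proposition processed lowers the weight.
            self.weight = max(0.0, self.weight - step)
            self.emergent = self.weight >= EMERGENCE_THRESHOLD

    def extract_emergent(candidates):
        # The candidate with the largest weight component emerges from an SPA.
        return max(candidates, key=lambda c: c.weight) if candidates else None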

[0009] Traversing the knowledge graph may be performed at different stages in the process and includes using candidate concepts to search the knowledge graph for all subgraph propositions composed partially of the identified concepts. Traversing the knowledge graph (in long-term memory) is also initiated by extracting emergent candidates from each specialized processing area (in short-term memory) with a largest value of the weighting component of the candidate confidence vector. In some embodiments, traversing the knowledge graph is also initiated by detecting gaps, that is, by determining whether any attribute of any specialized processing area that is required for a solution has no candidates, and in response to determining that a respective attribute has no candidates, performing a further search of the knowledge graph for possible candidates.
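
Gap detection, as just described, reduces to a small check; the shape of the spas structure (area -> attribute -> candidate list) is an assumption made for this sketch:

    def detect_gaps(spas, required_attributes):
        # required_attributes: SPA name -> names of attributes required for a solution.
        # Returns the (area, attribute) pairs with no candidates; each pair
        # triggers a further, targeted search of the knowledge graph.
        gaps = []
        for area, attributes in required_attributes.items():
            for attribute in attributes:
                if not spas.get(area, {}).get(attribute):
                    gaps.append((area, attribute))
        return gaps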

[0010] In another aspect, a computational system is provided, according to some embodiments. The computational system stores information in the form of a knowledge graph describing real world facts and associations in the form of contextually tagged and weighted knowledge propositions, in one or more knowledge domains including causality, taxonomy, meronomy, time, space, identity, language, symbols and mathematical formulas, that is used in conjunction with natural language understanding and logical inference to accurately determine (e.g., determination accuracy close to that of a human, or human level competence) why and/or how unknown factors resulted in a known outcome, and/or what outcomes are likely given known causal factors. The knowledge propositions are used as a basis of resolving ambiguity and determining the actual intent from among many possible interpretations of intent for sentences in natural language understanding.

[0011] In another aspect, a method is provided for mechanistic causal reasoning, according to some embodiments. The method includes receiving an input text from a user, the input text specified in a natural language. The method also includes building a knowledge graph that represents real world facts and associations in the form of contextually tagged and weighted knowledge propositions, in multiple knowledge domains (e.g., causality, taxonomy, meronomy, time, space, identity, language, symbols and mathematical formulas). The method also includes resolving ambiguity and determining actual intent of the user for the input text, from a plurality of interpretations of intent for sentences in natural language understanding, using the knowledge graph in conjunction with natural language understanding and logical inference. The method also includes generating a response to the user, as to why and/or how unknown factors resulted in a known outcome, or what outcomes are likely given known causal factors, based on the resolved ambiguity and the actual intent of the user.

[0012] In another aspect, a non-transitory computer readable storage medium is provided, according to some embodiments. The non-transitory computer readable storage medium stores instructions, which when executed by a computer system, cause the computer system to perform any of the methods described herein.

[0013] In another aspect, a server system is provided, according to some embodiments. The server system includes one or more processors, memory, and one or more programs. The one or more programs are stored in the memory and are configured to be executed by the one or more processors. The one or more programs include instructions for performing any of the methods described herein.

BRIEF DESCRIPTION OF THE DRAWINGS

[0014] So that the present disclosure can be understood in greater detail, a more particular description may be had by reference to the features of various implementations, some of which are illustrated in the appended drawings. The appended drawings, however, merely illustrate the more pertinent features of the present disclosure and are therefore not to be considered limiting, for the description may admit to other effective features.

[0015] Figures 1A, 1B, and 1C show block diagrams that illustrate a system architecture for mechanistic causal reasoning, according to some embodiments. Figure 1A shows multiple logical architecture tiers, Figure 1B shows a physical component architecture of servers in which disk storage is analogous to Long-Term Memory (LTM) and Random Access Memory (RAM) is analogous to Short-Term Memory (STM), and Figure 1C shows components of LTM, STM and cache memory, according to some embodiments.

[0016] Figures 2A, 2B, 2C, 2D, 2E, and 2F illustrate organization of nodes and relations in a complex knowledge graph structure showing named relationships, the explicit context and the weighting, according to some embodiments. Figure 2A shows a weighted contextual relationship, according to some embodiments. Figure 2B shows a directed, weighted, contextual subgraph with a qualifier, according to some embodiments. Figure 2C shows contents of two directed subgraphs that are part of a hypothetical model of sunrise, according to some embodiments. Figure 2D shows six subgraphs, two each for three ambiguous words, in which the knowledge propositions distinguish the possible interpretations in each context, according to some embodiments. Figure 2E illustrates six subgraphs whose knowledge propositions are related to the specific words “foot” and “bridge” in an input sentence, though some are completely unrelated to the intent of the input sentence, according to some embodiments. Figure 2F shows six subgraphs with knowledge propositions related to the causal relationships between antigens and antibodies, according to some embodiments.

[0017] Figures 3A-3D illustrate causal paths shown as directed linked nodes leading from root cause to mediating causes and finally to an outcome, according to some embodiments. Figure 3A shows a causal path consisting of a plurality of linked directed subgraphs, according to some embodiments. Figure 3B shows simplified diagrams of a causal confounder and a causal collider, according to some embodiments. Figure 3C shows how several linked causal knowledge propositions lead from a root cause to an effect or outcome, according to some embodiments. Figure 3D illustrates a segment of the knowledge graph containing causal and non-causal nodes and subgraphs connected by concept, according to some embodiments.

[0018] Figures 4A-4C illustrate examples of different types of causal paths in the knowledge graph, according to some embodiments. Figure 4A is a direct causal relationship, according to some embodiments. Figure 4B shows a causal detractor in which the factor impairs, delays or prevents an outcome, according to some embodiments. Figure 4C illustrates a complex causal path with confounder, collider and detractor subgraphs, according to some embodiments.

[0019] Figures 5A-5F illustrate specialized internal processing architecture for classifying, filtering, and/or selecting candidate solutions, according to some embodiments. Figure 5A shows a plurality of specialized multi-dimensional structures and linked heuristics, according to some embodiments. Figure 5B shows the internal structure of a single specialized processing dimension, according to some embodiments. Figure 5C shows examples of candidates with their vector magnitudes and directions, according to some embodiments. Figure 5D shows examples of candidate original and current vector magnitudes and their emergence flags in a word list structure in Short-Term Memory (STM), according to some embodiments. Figure 5E shows example structures in STM, according to some embodiments. Figure 5F shows an example sentence matrix, according to some embodiments.

[0020] Figure 6A shows a flowchart of a process of receiving input, classifying the input and performing causal reasoning based on a knowledge graph, according to some embodiments. Figure 6B shows separate roles of human supervisor curation and automated learning and inference, according to some embodiments.

[0021] Figures 7A-7C illustrate an example interpretation architecture at a logical level, according to some embodiments. Figure 7A shows interaction between permanently stored knowledge and volatile memory immediate processing space, according to some embodiments. Figure 7B illustrates threshold logic, and Figure 7C illustrates flow and formulas of emergence, according to some embodiments.

[0022] Figures 8A and 8B illustrate interoperation between model seeding and building process, and real-time knowledge use and feedback process, according to some embodiments. Figure 8A shows how automated and human processes contribute to the permanent knowledge graph, according to some embodiments. Figure 8B shows how initial knowledge building process and real-time causal reasoning process interact with the knowledge graph and contribute to learning, according to some embodiments.

[0023] Figures 9A-9E show example heuristics, according to some embodiments.

[0024] In accordance with common practice, the various features illustrated in the drawings may not be drawn to scale. Accordingly, the dimensions of the various features may be arbitrarily expanded or reduced for clarity. In addition, some of the drawings may not depict all of the components of a given system, method or device. Finally, like reference numerals are used to denote like features throughout the specification and figures.

DETAILED DESCRIPTION

[0025] The various implementations described herein include systems, methods, and/or devices for analytics, search and natural language processing using mechanistic causal reasoning.

[0026] Numerous details are described herein in order to provide a thorough understanding of the example implementations illustrated in the accompanying drawings. However, the invention may be practiced without many of these specific details. Also, well-known methods, components, and circuits have not been described in exhaustive detail so as not to unnecessarily obscure the more pertinent aspects of the implementations described herein.

A. THEORETICAL FOUNDATION

[0001] Without a foundation of scientific knowledge, it is easy to mistakenly assume co-occurring phenomena are related causally when they are not, and to mistake the nature of causal relationships when they are. This section describes unified mechanistic causal reasoning (UMCR) theory, including causal models used to represent prior knowledge, and processes to use it effectively. The techniques described herein can be used to build automated tools and associated reasoning to infer causal relationships using scientific evidence for logic-based bi-directional causal reasoning.

[0002] Bi-directional causal reasoning includes forward reasoning (known causes to inferred outcomes) and reverse reasoning (known outcomes to inferred causal factors). In some implementations, on the one hand, knowledge of underlying mechanisms guides causal ascriptions, while on the other, evidence of causal relationships helps discover mechanisms. Some embodiments apply these ideas to evidence-based medicine, in which mechanistic evidence plays a prominent role in explicit hierarchies of evidence. To establish a causal claim, some embodiments establish both a statistical connection between the putative cause and the putative effect and a mechanistic connection that can explain the statistical connection.
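
As a rough illustration of the two reasoning directions over causal propositions (reusing the proposition sketch from the Summary), with all helper names assumed for this example:

    def forward_reasoning(known_causes, causal_props):
        # Prospective: map known causes to inferred outcomes.
        return {p.associate for p in causal_props
                if p.relationship == "causes" and p.subject in known_causes}

    def reverse_reasoning(known_outcomes, causal_props):
        # Retrospective: map known outcomes back to inferred causal factors,
        # chaining through mediating causes toward root causes.
        frontier, factors = list(known_outcomes), set()
        while frontier:
            effect = frontier.pop()
            for p in causal_props:
                if (p.relationship == "causes" and p.associate == effect
                        and p.subject not in factors):
                    factors.add(p.subject)
                    frontier.append(p.subject)   # follow the causal path upstream
        return factors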

[0003] In some embodiments, understanding of causal relations enables human-level intelligence, making strong artificial intelligence (AI) a plausible goal. Some embodiments use the unified mechanistic causal reasoning approach to automatically answer “why” and “how” questions.

[0004] In some embodiments of this model, prospective and retrospective causal reasoning mean the identification of basic, underlying and direct determinants or factors that influence outcomes, as in the logic rule modus ponens. The meaning of mechanism in this model is a specific action or process (f) likely to influence a specific outcome. A factor may have a mechanism (f) or an object (X) or both. Epistemologically, this monistic model treats a mechanism as an intrinsic part of a causal factor, and is thus a “unified” model.

[0005] Quantitative analytics represent probabilistic models that can identify correlations but cannot demonstrate causality, partly because they lack a concept of the mechanisms associated with phenomena. This approach uses natural language techniques for mechanistic causal reasoning to answer “why” and “how” questions, which are needed for advanced diagnosis and qualitative analytics. The frequent co-occurrence of a rooster crowing and the sun rising is a commonly invoked correlation that illustrates why quantitative models can expose and describe correlations but cannot show causality.

[0006] If it is known that the Earth revolves around the sun and that the rotation of the Earth on its axis exposes each longitudinal area of the Earth’s surface to sunlight in sequence, then it is difficult to think of a rooster’s crowing as causing the sun to rise. Science and knowledge inform the likely mechanisms of many phenomena, so when a system observes correlations and events that co-occur predictably, the system can quickly dismiss implausible causal factors when the mechanism is scientifically unable to cause the phenomenon.

[0007] A mechanistic theory of causality posits that causal connections are defined by an underlying physical mechanism capable of producing the effect. The case for mechanistic reasoning, especially in health science, is strong. Identifying the component parts and operations of a mechanism, and their organization, is only part of the overall endeavor of developing a mechanistic explanation. The mechanism generating a phenomenon typically does so only in appropriate external circumstances. Some embodiments identify complex external circumstances and explore how variations affect the behavior of the mechanism. A simple example from cell biology: yeast cells carry out fermentation only when glucose and ADP are available and oxygen is not. In more complex examples, such as gene expression in cell biology and speciation in evolutionary biology, the relevant external circumstances are correspondingly more complex.
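
The yeast example can be encoded directly: a causal mechanism fires only when its qualifying external circumstances hold. The dictionary encoding below is an assumption made for this sketch:

    # Qualifier encoded as the set of external circumstances required for the
    # mechanism to produce its effect.
    fermentation = {
        "subject": "yeast", "mechanism": "fermentation", "associate": "ethanol",
        "qualifier": {"glucose available", "ADP available", "oxygen absent"},
    }

    def mechanism_can_fire(proposition, circumstances):
        # A mechanism generates its phenomenon only in appropriate external
        # circumstances: every qualifier condition must hold.
        return proposition["qualifier"].issubset(circumstances)

    print(mechanism_can_fire(fermentation,
          {"glucose available", "ADP available", "oxygen absent"}))  # True
    print(mechanism_can_fire(fermentation,
          {"glucose available", "oxygen present"}))                  # False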

[0008] When humans communicate by speaking or writing, they do not have to begin by sharing all their knowledge about the world so that the recipients can understand what they are saying. Speakers assume that the recipients share a huge body of knowledge about the world. In fact, communications are often tailored to address the recipients’ expected or perceived knowledge level. In some embodiments, the system is designed on the premise that, for a computational system to approximate human performance in interpreting language, the system must begin with a corresponding body of world knowledge.

[0009] Most sentences contain verbs, and verbs are inherently causal. Thus, causal reasoning is a fundamental part of understanding language. The interpreter described herein contains a knowledge graph stored as machine-readable digital data that provides this breadth of causal, pragmatic and other knowledge.

[0010] Pragmatics is a subfield of linguistics and semiotics that studies the ways in which context and knowledge taxonomy contribute to meaning. Pragmatics encompasses speech act theory, conversational implicature, talk in interaction and other approaches to language behavior in philosophy, sociology, linguistics and anthropology. In some embodiments, the language interpretation process focuses on pragmatics as a central part of the interpretation process and incorporates causal reasoning as a component of equal importance with semantics, syntax and other more traditional linguistic analyses.

[0011] In some embodiments, to manage the combinatorial explosion of possibilities, the natural language (NL) interpreter makes no attempt to store or seek any of the possible interpretations of an entire sentence or utterance in the knowledge base, but instead describes components of solutions associated with words and phrases. This mirrors the way people assemble words and phrases to communicate intent. The knowledge base, therefore, attempts to describe each possible solution of each token that is a component of any possible input text or utterance.

[0012] Input for NL interpretation is referred to herein as “input text”, while input strictly for causal reasoning is herein referred to as “case” data. This approach assumes that most presented inputs will have a sufficient mass of solvable or interpretable components, and that the aggregation of the solved components will be sufficient to describe an acceptable interpretation of the input. It also assumes that the more accurately and dependably the system can resolve the ambiguity and polysemy of the meanings of individual tokens as components, the more accurate the final interpretation will be.

[0013] The problem of polysemy applies to words, phrases and sentences with multiple meanings. Learning and delivering individual resolutions to polysemy at the lexical word and phrase levels makes the NL interpreter better able to solve aggregate problems of phrase and sentence ambiguity, therefore increasing the accuracy of interpretation. In some embodiments, this system resolves important components of ambiguity and polysemy through advanced causal reasoning.

B. SYSTEM OVERVIEW

[0014] Some embodiments include an optimized knowledge graph that supports advanced natural language (NL) interpretation and causal reasoning that runs in a multi-tiered computing environment (an example of which is shown in Figure 1A), on physical or virtual servers (as shown in Figure 1B). In some embodiments, knowledge components (examples of which are shown in Figures 2A-2C and 3A-3C) are structured based on a universal knowledge theory capable of efficiently supporting highly accurate NL understanding and causal reasoning across an unlimited number of contexts and knowledge domains.

[0015] Referring to Figure 1A, in some embodiments, the system is configured to receive input data from a user describing a case and known background information about the case through interfaces including mobile-based NL dialog interfaces 101, such as Apple and Android devices, and workstation-based visual interfaces. Mobile devices and workstations connect to the modules, services or micro-services layer 103 through application program interfaces 102, or APIs.

[0016] In some embodiments, the computer system modules or services 103 are configured to determine whether the user intends to generate a predicted outcome from known causes or generate predicted causes from a known outcome. For example, the computer system modules or services 103 may provide a user interface configured to receive an indication from a user to generate a predicted outcome from known causes or generate predicted causes from a known outcome.

[0017] In some embodiments, the computer system modules or services 103 are configured to automatically determine a forward or reverse causal reasoning. The process components or modules, services or micro-services 103 can be deployed to virtualized 104 or physical infrastructure 105 in an elastic "cloud" hosted data center or an on-premises data center. The system may be used in qualitative analytics and causal reasoning for scientific discovery and development of new therapies as an Innovation Knowledge Engine (IKE), supported by a suite of semantic language understanding tools (the IKE Semantic Suite), hereinafter abbreviated as "IKE".

[0018] IKE is a modular dialog-based workload and workflow-driven system, according to some embodiments. A workload manager is responsible for maintaining the overall state of each process in the system and notifying users when input is needed and when solutions are ready for review. IKE APIs provide device-independent user interfaces for both mobile interactions, mostly speech driven, and visually rich desktop interactions.

[0019] In some embodiments, IKE constitutes an operational system architecture and structure for storing and processing proposition information in digital, analog, or other machine-readable formats, and includes:

• a multi-tiered processing architecture with infrastructure and virtualization tiers underlying the modules or services, APIs and user services tiers, including a non-volatile permanent storage area, analogous to human long-term memory (LTM), for retaining the knowledge graph;

• a volatile working cache storage area or kernel memory, for temporary storage and analysis of not yet validated portions of said information;

• a volatile ready access storage area (such as RAM or random access memory), analogous to human short-term memory (STM), for retaining a portion of the information from said working cache storage area, the information in said working cache storage area copied from said permanent storage area, said ready access storage area, or from both;

• a lexicon in said permanent storage area comprised of letters, symbols, words, numbers, and combinations thereof;

• a lexicon hash table in said permanent storage area to expedite search of said lexicon;

• one or more structures comprised of a two-dimensional matrix supporting input management and coordination; and

• one or more structures comprised of a one-dimensional list of information for reference purposes; and/or

• one or more specialized processing structures to classify and organize input information by named category, supporting complex analytical and fitness testing processes.
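The patent provides no source code; purely as a structural illustration of the storage tiers listed above, a minimal Python sketch follows. All class and attribute names are hypothetical, and a real deployment would back LTM with a database while keeping STM and the working cache in volatile memory.

```python
# A hypothetical structural sketch of the storage tiers listed above;
# names and representations are invented for illustration only.
class IKEStore:
    def __init__(self):
        self.ltm = []            # non-volatile knowledge graph (LTM)
        self.working_cache = {}  # volatile cache for unvalidated data
        self.stm = {}            # volatile ready-access area (STM)
        self.lexicon = []        # letters, symbols, words, numbers
        self.lexicon_hash = {}   # hash table to expedite lexicon search
```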

[0020] IKE uses optimized structures in the internal architecture of computer servers 110 as shown in Figure 1B, according to some embodiments. In some embodiments, CPU cores 111 (sometimes called processors) process data passed across a system bus 112 to and from Random Access Memory 113 (RAM), which is analogous to human Short-Term Memory (STM). The bus 112 also mediates information exchange with a cache 114 and permanent storage 115, which is analogous to human long-term memory (LTM). The bus 112 connects the computing input and output to external interfaces 116 including keyboards and mice 117, display monitors 118, microphones and speakers 119 to receive and reproduce sound such as voice signals, and/or network adaptors 120 for local area and wide area interconnectivity, according to some embodiments. Some embodiments include GPUs, FPGAs, and/or ASICs, in addition to, or instead of, the CPU cores 111. In the following description, the operations described as being performed by the CPU cores 111 can be performed by any type of processor, according to some embodiments.

[0021] It is noted that the physical and/or virtual infrastructure described herein are only provided for illustration, as a generic framework for efficient computational capabilities, digital, and/or analog processes.

[0022] In some embodiments, the computer system is configured as one or more memory units 104 and 105 each operable to store at least one program.

[0023] In some embodiments, the computer system is configured as at least one processor 111 communicatively coupled to the one or more memory units 113 and 114, in which the at least one program 103, when executed by the at least one processor 111, causes the at least one processor to receive input data from a user describing a case and known background information about the case through interfaces including mobile-based NL dialog interfaces 101.

[0024] In some embodiments, the computer system modules or services 103 are configured to determine whether the user intends to generate a predicted outcome from known causes or generate predicted causes from a known outcome. For example, the computer system modules or services 103 may provide a user interface configured to receive an indication from a user to generate a predicted outcome from known causes or generate predicted causes from a known outcome.

[0025] In some embodiments, the computer system modules or services 103 are configured to determine a forward or reverse causal reasoning.

C. KNOWLEDGE ENCODING SCHEME

[0026] Referring next to Figure 1C, NL interpreters require exhaustive information about physical and abstract things in the real world as well as information about linguistic patterns and structures and their causal, taxonomical and other interrelationships. For efficient computational processes, knowledge must be stored intelligently and efficiently. The NL interpreter stores the interrelationship information in a knowledge graph 121 of symbolic propositions associated by explicit links, built in a framework of semantic primitives that relate to the full range of natural phenomena and human experiences. This knowledge graph 121 is analogous to Long-Term Memory (LTM) in humans. In some embodiments, the forward or reverse causal reasoning is based on said knowledge graph 121 with subgraphs.

[0027] For the IKE interpreter, the semantic base primitive is "intent" as expressed in words using its parent primitive, "communication". The IKE interpreter processes the speech or text communicated to determine intent based on the words chosen. Thus, the knowledge network contains the solution set as a whole, and the a-priori weights form the Bayesian distribution. In some embodiments, copying a subset of permanently stored propositions from LTM 115 into specialized processing areas in STM 122 includes populating a subset of the Bayesian network, which is the aggregate of potentially meaningful propositions that apply to any given input.

[0028] Because of the expansiveness of the knowledge network and the fact that only a small portion of that knowledge will be needed to interpret any given sentence or paragraph, in some embodiments, the salient information discovered through searching the knowledge network is copied into a temporary processing area that is an optimized STM 122. While knowledge or information in LTM is persistent, the contents of STM are frequently changed and modified during processing. For performance purposes, in some embodiments, a working storage area 123 or cache is also used to store information that may be needed for multiple successive or parallel reasoning or interpretation processes.

[0029] In some embodiments, the system is configured to associate each word of the input with a lexicon object 124 and to associate each lexicon object with a plurality of propositions in the knowledge graph 121. Each proposition corresponds to a subgraph, and the propositions define a relationship between the subject component and the associate component in the subgraph.

[0030] In some embodiments, the atomic or basic components of this information are encoded in a lexicon 124 holding lexical entries or tokens. These tokens can be letters, words, numbers and characters that are not alpha-numeric, but are used understandably in communication. For processing efficiency, in some embodiments, the lexicon is accompanied by a hash table for rapid information search and retrieval.

[0031] In some embodiments, in order to access knowledge in the knowledge graph, the lexicon 124 is used to provide direct access to each proposition in the network associated with that lexical entry or token through a link table or association file 125. Non-lexical object tokens can also be used to access the knowledge network. This direct access is analogous to a content-addressable mechanism for reading information in human LTM.
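A minimal sketch of this content-addressable access path follows, assuming, purely for illustration, that propositions are stored as Python dicts keyed by the role labels X, f, R, Y, C, Q and W used throughout this disclosure, that LTM is a flat list of them, and that the lexicon hash table (the association file 125) maps each token to the list indices of its associated propositions. The example values are invented, not drawn from the patented knowledge base.

```python
# Illustrative only: propositions as dicts keyed by the role labels;
# LTM as a flat list; the lexicon hash table maps tokens to indices.
ltm = [
    {"X": "battery", "f": "discharging", "R": "cause",
     "Y": "inoperative automobile", "C": "surface transportation",
     "Q": "insufficient charge", "W": 0.7},  # hypothetical values
]

lexicon_index = {"battery": [0], "discharging": [0]}

def propositions_for(token):
    """Direct, content-addressable access from a lexical token to its
    associated knowledge propositions."""
    return [ltm[i] for i in lexicon_index.get(token, [])]

print(propositions_for("battery"))
```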

[0032] In some embodiments, the lexicon 124, association file 125 and knowledge graph 121 are dynamic building blocks of correct interpretation. They are dynamic because new lexical entries can be added, new propositions can be added and confidence values of propositions can be changed. The primary processes of reasoning and interpretation are based on comparing input with this graph of propositions, determining the likelihood that specific propositions apply and are true, then delivering the set of the most applicable and likely propositions as the solution to causal inquiries or interpretation of the original intent.

[0033] Just as people use knowledge about underlying mechanisms to infer factors and outcomes, this process taps into stored "hypothetical" models that are preconceived, pre-validated expectations about how things work. In some ways, this is not unlike quantitative analytics. Analysts and data scientists spend a significant amount of time up front gathering and organizing the information needed for reports, visualizations and dashboards. They build and test formulas for optimally expressing meaningful indicators in the output. The optimized "Online Analytical Processing" data structures, report formats, formulas and choices of visualizations constitute the a-priori knowledge needed for successful analytics.

[0034] In some embodiments, each domain and context, such as surface transportation and driving, has a hypothetical model which comprises the set of directed causal subgraphs whose C object matches the name of the domain or context. Thus, extracting the hypothetical model is a simple search of the graph for subgraphs whose C objects match the identified domain or context. The search for environmental factors involves a "spreading activation" process in which the system extracts from LTM the subgraphs that are directly connected to the X, Y and C objects of the directed subgraphs in the hypothetical model.
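Using the illustrative dict representation sketched earlier, model extraction and spreading activation might look like the following; this is a sketch of the stated search rules, not the patented implementation.

```python
# A sketch over the illustrative dict representation introduced above.
def hypothetical_model(ltm, context):
    """All directed subgraphs whose C object matches the named domain
    or context."""
    return [p for p in ltm if p["C"] == context]

def environmental_factors(ltm, model):
    """Spreading activation: extract from LTM the subgraphs directly
    connected to the X, Y and C objects of the model's subgraphs."""
    anchors = set()
    for p in model:
        anchors |= {p["X"], p["Y"], p["C"]}
    return [p for p in ltm
            if p not in model and (p["X"] in anchors or p["Y"] in anchors)]
```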

[0035] In some embodiments, for processing input, IKE first loads the hypothetical model into STM as a session state, then classifies the verbal description and observational learning inputs into the session for text analysis, and the numeric data into the session for quantitative analysis. In this way, some embodiments rapidly identify inputs that correspond to model elements and begin identifying outcomes and ranking causal candidates even as data is being acquired.

[0036] Mechanisms are verbs, usually with -ing endings. As an example, a pairing of a component (X) and a mechanism (f) is X = "battery" and f = "discharging". This factor might be used in reasoning about an inoperative automobile. Each knowledge proposition can be read as a natural sentence: X "is a" R "of" Y "in the context of" C "that is" Q "with a probability of" W.

[0037] In some embodiments, the forward or reverse causal reasoning is based on said knowledge graph 121 with subgraphs. The system may be configured to associate each word of the input with a lexicon object 124 and associate each lexicon object with a plurality of propositions in the knowledge graph 121, wherein each proposition corresponds to a subgraph, wherein propositions define a relationship between the subject component and the associate component in the subgraph.
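The natural-sentence reading given in paragraph [0036] can be rendered mechanically. A sketch over the same illustrative dict representation follows; article smoothing ("a" versus "an", "the") is deliberately omitted.

```python
# A sketch of the natural-sentence reading of a proposition:
# X "is a" R "of" Y "in the context of" C "that is" Q
# "with a probability of" W.
def read_proposition(p):
    return (f'{p["X"]} is a {p["R"]} of {p["Y"]} in the context of '
            f'{p["C"]} that is {p["Q"]} with a probability of {p["W"]}')
```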

D. THE UNIVERSAL INFORMATION THEORY

[0038] The theory behind the IKE Knowledge Representation scheme, used to electronically store real-world knowledge, is that high performance can be achieved if all the knowledge is stored in the most compact efficient format that supports brain-like processing, simulating the "spreading activation" of excitatory and inhibitory electrical impulses. Consistency of format lends itself to computational efficiency, so IKE captures and stores knowledge in a format consistent with the following Universal Information Theory.

[0039] Referring next to Figure 2A, the Universal Information Theory states the following regarding all physical things and abstract concepts that exist:

• All things 201, physical and abstract, can be represented by unique words, symbols or phrases in human language.

• There is nothing, physical or abstract, that is not related to some other thing 202.

• A taxonomy of things or objects can be defined to describe category and part-whole relationships to connect all objects into a single interconnected graph or network.

• Causal chains or paths can be articulated which describe how physical things and abstract concepts interact with other things leading from actions to reactions.

• Each relationship between two things can be described in such a way that an explicit relationship “R” 203 ties each pair of things together.

• Explicit relationships "R" can be described by a finite set of words that logically and linguistically express the nature of each relationship.

• Relationships are governed by context "C" 204, such that a valid relationship between two things in one context may be invalid or different in another context.

• Relationships may be further qualified by constraints "Q" 205 that describe unique characteristics of the relationship or link relevant external information.

• All the objects in a relationship 206, and the descriptors of the relationship, context and constraints can be represented by human language words, symbols or phrases.

• The ordering of a pair of objects, a relationship, a context and a constraint constitutes a single directed proposition in which the first or X object is the subject concept, the second or Y object is the associated concept, and the R, C and Q objects uniquely define the proposition.

• For each proposition, a level of probability, confidence or belief may be applied and the confidence value expressed as a weight “W” 207.

[0040] The interconnectedness of this theory lends itself to modeling as a graph and implementing in a graph or relational database. Figure 2A emphasizes how the context defines the relationship R 203 between two objects, X 201 and Y 202; other relationships may exist between X and Y in the same context or in different contexts, according to some embodiments. Referring next to Figure 2B, the weight component 216 is a probability factor or likelihood that the subject component 211 is related to the associate component 212 in the context identified by the context component 214, according to some embodiments.

[0041] In some embodiments, the IKE system represents knowledge in a concise formal logic proposition format based on the theory. While the graph format favors a graph database platform for implementation, big data platforms such as Hadoop, relational databases and flat databases are also options because the format is simple and adaptable. Nothing in this disclosure should be interpreted as requiring a certain commercial database, server, network, programming language or other standard.

[0042] An example formula for representing this universal theory of information is shown below, according to some embodiments:

For all objects X, (((((X is related to at least one other object Y) by an explicit relationship R) within a specific context C) qualified by a constraint Q) with a probability of W).

[0043] In some embodiments, a context component 214 identifies a domain of knowledge in which the association is true.

[0044] A node is an element in a graph that has a name and a value and may be connected to any number of other name/value pair nodes by vertices. A vertex is a named relation, which may be a simple semantic role such as agent, instrument or object, a complex causal role such as catalyst, initiator or contributor, or a negative role such as barrier, impediment or terminator. Because this knowledge representation scheme includes a massive collection of compact statements of propositional logic, the structure of nodes and subgraphs is useful for the pictorial descriptions shown in Figures 2A-2C. Some embodiments use a graph database for implementation.

[0045] Referring next to Figure 2B, in some embodiments, each of the subgraph objects 206 represent a knowledge proposition including a subject component 211, associate component 212, named relationship component 213 that links the subject component and the associate component, a context component 214, a qualifier component 215 that further narrows the context in which the association is true, a weight component 216, and a mechanism component 219.

[0046] In some embodiments, this encoding scheme for real-world knowledge stores information as knowledge proposition subgraphs and objects in machine-readable format optimized for use by expert system, interpretation and learning algorithms. This is analogous to permanent or LTM in humans in which knowledge is learned and remembered. In this interconnected network or graph of explicit concept subgraphs, connections are formed by juxtaposition of objects in relationships that form directed subgraphs. These subgraphs are defined by their nodes, in which each node is named by an explicit object that consists of a token that can be lexical (a word, symbol, number or phrase) or non-lexical tokens, such as machine-readable images, videos and sounds, according to some embodiments.

[0047] In some embodiments, the knowledge proposition is represented computationally in a sequential manner by ordering the objects such that the sequence begins with the X object as subject, and sequentially followed by the remainder of the objects. In some embodiments, the objects contain explicit role labels (X, R, Y, C, Q, W) when implemented as a relational database or a tagged structure such as JSON or XML, or the label of the object role can be inferred from position in the subgraph if all subgraphs are structurally identical. The token nodes can be represented in a file independently from the subgraphs that represent the propositions associated with the tokens. In some embodiments, this independent representation is a list of words, symbols and phrases, called a lexicon, and the tokens are alternately described as lexical items. In some embodiments, the independent representation may also contain non-lexical objects.

[0048] In some embodiments, a lexicon is used to represent the basic elements or nodes of knowledge because all human knowledge is represented by words, symbols and phrases; if humans cannot describe something using a word, symbol or phrase, they cannot share it through verbal communication. As language evolves to accommodate new knowledge, new words and phrases are coined. In some embodiments, the NL interpreter invokes machine learning to add new words, phrases and other tokens to the lexicon to represent knowledge that is new to the system or new to the language. In some embodiments, the lexicon also contains non-lexical items, such as sounds and images, to broaden interpretation capabilities.

[0049] The name of any of the objects described above may be formed by a single letter, number or symbol, or a string of letters, numbers or symbols, and is valid as long as the name is recognizable by someone as a physical or abstract thing in the universe. The specific usage of nodes in knowledge subgraphs may be tailored for a specific class of relationships or a context. Referring back to Figure 2B, in some embodiments, causal relationships include a component X 211 that is a cause or a predecessor in a causal path to an outcome Y 212 that may operate as a mediator. X 211 and Y 212 are joined by a named relationship R 213 within a governing context C 214 with an optional qualifier Q 215. This molecular relationship forms a directed subgraph in the overall knowledge graph and is assigned a weight W 216. The ordering of the objects 217 represents the direction of the causal path and when the causal relationship is truly bi-directional, which is rare but possible, two separate subgraphs, one with X and Y reversed, are needed to express the bi-directionality, according to some embodiments.
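Because truly bi-directional causality requires a second subgraph with X and Y reversed, the companion subgraph could be derived as in the following sketch, again over the illustrative dict representation.

```python
# A sketch: derive the reversed companion subgraph needed to express
# bi-directional causality as two separate directed propositions.
def reversed_proposition(p):
    q = dict(p)
    q["X"], q["Y"] = p["Y"], p["X"]
    return q
```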

E. CAUSAL KNOWLEDGE

[0050] Because the IKE causal model is mechanistic, a causal factor includes both the component X 218, a noun, and the mechanism f 219, a verb usually ending in -ing, according to some embodiments. The subject or X object of causal relationships may be formed of a component noun X alone or a mechanism verb f alone, but the preferred structure embodies the combination of X and f. Non-causal relationships may also use the structure of X and f or any other unique structure suited to the nature of the relationship. Two words joined, such as "Earth rotating" or "batter swinging", form causal factors as phrases that act as natural X objects. In fact, there are many multi-word idioms in the lexicon that form valid X objects in molecular subgraphs, such as "up in the air" and "down in the dumps", each forming a discrete concept that serves as the subject of molecular knowledge propositions. Compact phrases of this type are common in the IKE knowledge model.

[0051] This approach treats each phenomenon as occurring in a domain and context. As an example, as shown below in Table 1 and in Figure 2C, within the domain "celestial bodies" and the context Earth's "Solar System", the phenomenon of sunrise can be defined in a finite set of knowledge propositions as directed subgraphs. Some propositions describe the taxonomy in which the related objects exist while some describe causal factors:

Table 1: Some knowledge pertaining to “sunrise”

[0052] The following examples illustrate using natural phrasing to articulate two of these knowledge propositions: "Earth rotating causes the day-night cycle in the context of the solar system that is 24 hours" and "Sunrise is an event of the day-night cycle in the context of Earth that is night's ending."

[0053] Things as complex as the functions and interaction of celestial bodies cannot be fully described in the few knowledge propositions shown above. For example, the need for an observer, without whom the concepts of sunrise and sunset are not completely meaningful in human terms, cannot be fully captured. The complexity of the astronomical and simply observable phenomena is not reflected in the small subset of the knowledge graph herein. Observers use a combination of senses and life experiences to interpret, remember and understand the meaning of "sunrise". But the ability of a plurality of such knowledge propositions to express natural phenomena and human experience, even when incomplete, serves as a foundation for both causal reasoning and natural language understanding, and as such, supports accumulating more knowledge (i.e., concept learning) to further improve the quality of artificial intelligence functions.

[0054] The subjects X and activities f may be articulated as follows:

• X1 is "Earth" and f1 is "revolving around the sun" and f2 is "rotating on its axis"

• X2 is "Sun" and f3 is "emitting light"; while humans understand that the sun is in motion vis-a-vis the galaxy, that motion is not critical to understanding the phenomenon of sunrise.

• X3 is a phenomenon called sunrise that marks the night’s ending and the day’s beginning.
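Encoded in the illustrative dict representation used earlier, the two propositions articulated in paragraph [0052] might look like the following; the weight values are invented for illustration.

```python
# The two "sunrise" propositions from paragraph [0052]; W is invented.
sunrise_knowledge = [
    {"X": "Earth", "f": "rotating", "R": "cause", "Y": "day-night cycle",
     "C": "solar system", "Q": "24 hours", "W": 0.99},
    {"X": "sunrise", "f": "", "R": "event", "Y": "day-night cycle",
     "C": "Earth", "Q": "night's ending", "W": 0.99},
]
```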

[0055] In some embodiments, the mechanism component 219 describes an action that the subject component 218 is performing to affect the associate component 212.

[0056] In some embodiments, the set of causal relations (R) 203 and 213 may include cause, instrument, agent, means, catalyst, mechanism, product, byproduct, output, response and result, among others. But non-causal relation types may also support causal reasoning.

[0057] The two core linguistic phenomena deeply connected with the ways humans express and understand causality are semantics and pragmatics. Semantics is the language stratum concerned with meaning, especially with the agents, instruments, objects and outcomes of actions. Pragmatics is concerned with truth values: the logic of propositions about what can, did or will occur, or not, in the real world, as expressed symbolically in spoken utterances or written text, and with the language strategies people use to express those ideas.

[0058] In some embodiments, knowledge related to both phenomena is embedded in the knowledge network, wherein subgraphs explicitly describe semantic and pragmatic phenomena as knowledge propositions. Examples include the following (shown in Figure 2D). The word "buckle" can be interpreted as a connecting 232 process 233 in the context of clothing 234 that involves a belt 235. In a different context, buckle 236 can be interpreted as a result 238 of stress 237 in the context of materials 239 that causes deformation. The word "fix" 241 is a process 243 of repairing 242 in the context of objects 244 (which is understood to be a universal construct embodying both physical and abstract objects) that is restorative 245. Fix 246 also refers to a surgical procedure 248 of neutering 247 in the context of animal reproduction 249 that becomes infertile 250. The idiom "throw out" 251 is an action 253 of disposal 252 in the context of cleaning 254 that involves garbage 255. In the same conference room where one person throws out an empty soda can, a participant may throw out an idea. In this case, throw out 256 is an action 258 of introducing 257 in the context of interaction 259 that involves ideas 260.

[0059] These three examples represent a small subset of the knowledge propositions describing the words “buckle”, “fix” and “throw out” in the knowledge network, but are intended to show how the contextual marking of these knowledge propositions enables resolution of ambiguity in ways not possible with other knowledge representation schemes. Resolution of ambiguity is the core contribution of semantic and pragmatic processes this approach uses for both language understanding and causal reasoning.

[0060] The example shown above illustrates how IKE knowledge propositions support the resolution of linguistic ambiguity. The next example in Figure 2E shows a small subset of the knowledge that would be used to resolve ambiguity in an explanation that a patient may state to a podiatrist: “I feel pain in my left bridge when I wear my dress shoes”.

[0061] The word "bridge" 261 is ambiguous, and, in addition to the foot 262 there are several parts 263 of the human anatomy 264 referred to as bridge. In the case of the foot, the bridge is in the upper 265 area. Bridge 266 is also a structure 267 type 268 described in civil engineering 269 that is used to pass over 270 roads, conduits or natural features. The knowledge proposition beginning with 261 applies directly to the patient's statement and the proposition beginning with 266 does not, illustrating the differentiation of knowledge used to resolve ambiguity.

[0062] "Foot" 271 is also ambiguous, and while the drawing does not show a knowledge proposition that describes the foot 262 as part of human anatomy 264, such propositions exist in the knowledge network to distinguish the body part from the unit of measure 273 that is used in description 274 to represent things of 12 inches 275 in length 272. Combinations of foot 271 and bridge 266 can exhibit further ambiguity, such as the term "footbridge" 276 which, in the context of civil engineering 279, is a type 278 of pedestrian 280 bridge 277.

[0063] To resolve the ambiguity of foot and bridge, in some embodiments, other knowledge propositions can both nudge the contextually consistent propositions toward emergence and trigger heuristic processes that serve to promote or “heat up” related knowledge and disqualify or “cool down” pragmatically unrelated knowledge propositions. They effectively create a “resonance” that favors the best interpretations of foot and bridge. The heating up and cooling down of candidates mimics a natural selection process embodied in genetic algorithms in which the fittest candidates survive.
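A minimal sketch of the heating and cooling updates follows, assuming each candidate carries a confidence weight seeded from the knowledge graph; the step size and the clamping to the interval [0, 1] are illustrative choices, not values from this disclosure.

```python
# A sketch of the "heat up"/"cool down" confidence updates.
def heat(candidate, step=0.05):
    """A corroborating proposition was processed: raise confidence."""
    candidate["W"] = min(1.0, candidate["W"] + step)

def cool(candidate, step=0.05):
    """A refuting or pragmatically unrelated proposition was
    processed: lower confidence."""
    candidate["W"] = max(0.0, candidate["W"] - step)
```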

[0064] As illustrations of these influences in the current example, among many other related knowledge propositions in the knowledge network, one describes a shoe 281 as a clothing 282 type 283 in the context of human attire 284 that is worn on the foot 285. This linkage will heat up knowledge propositions whose context is related to humans including the foot 262 of which bridge 261 is a part 263, and cool down a foot 271 used as a unit of measure 273 and a bridge 266 and 276 that exists in the context of civil engineering. There may also be a “clothing” heuristic that builds a new temporary special processing area with attributes that can use candidates to answer questions about dressing and attire, or an anatomy heuristic that can use candidates to answer questions about body parts.

[0065] Another knowledge proposition that will heat up human interpretations of foot and bridge will be the last shown in this series. There will be many propositions associated with pain 286, most of which will directly or indirectly refer to the context of organisms 289. The fact that pain 286 is a response 288 to irritation 287 in the context of organisms 289 that acts as a warning 290 will, in addition to the correct interpretation of the statement, support causal reasoning as the mechanism of the pain 286 is likely to be irritation 287 caused by the shoe 281 on the bridge 261 of the patient’s foot 262. Additional specific causal propositions could reinforce this causal inference, as could a “pain heuristic” that builds a new temporary special processing area with attributes that can use candidates to answer questions about its nature, sources and acuteness.

[0066] Many knowledge propositions support causal reasoning without explicitly containing members of the set of causal relations (R) 203 and 213. This is especially the case in complex knowledge domains such as human biology. As an example, Figure 2F shows that protein binding 2001 is fundamental to antibodies 2002 that participate in the process 2003 of natural 2005 healing 2004. Antibodies 2006 are a product 2008 of an immune reaction 2007 in the context of healing 2009 supported by the bone marrow 2010.

[0067] The system contains hundreds of knowledge propositions that define this process in enough clarity, completeness and expressiveness to enable robust natural language interpretation and causal reasoning. Some describe things at the cellular level, such as a lymphocyte 2011, a white blood cell 2012 type 2013 in the immune system 2014 needed for healing 2015. Knowledge propositions describe instances 2018 of systems 2017 such as the immune system 2016 possessed by many types of organisms 2019 to contribute to healing 2020. Again, the processes of classifying these knowledge propositions into specialized processing areas where heating and cooling heuristics bring about their emergence replicates human cognitive processes associated with language understanding and causal reasoning.

[0068] There may be many knowledge propositions that add important associations to causal reasoning processes. As an example, healing may be associated with injury, disease or both. For injuries there may be sets of knowledge propositions associated with cellular regeneration, cell division and mitosis. For diseases 2024 in which an antigen 2021 is a type 2023 of protein 2022 that acts as an irritant 2025, the biological mechanisms of healing 2029 involve natural processes in which antibodies 2026 are a protein 2027 type 2028 that is secreted by B cells 2030 to respond to the irritant 2025.

[0069] The way knowledge propositions support causal reasoning without explicitly containing members of the set of causal relations (R) 203 and 213 is by supporting semantic and pragmatic reasoning that corroborate or refute causal reasoning processes through the heating and cooling processes described earlier.

[0070] Semantically, an "object" may be an agent, instrument or object. Any pairing of an object and an activity forms a complete factor and may be treated as a candidate in a causal chain, according to some embodiments. Factors with a known mechanism and an unknown object, or a known object with an unknown mechanism, can also be causal candidates and outcomes, but the confidence in the verdict diminishes, according to some embodiments.

[0071] In some embodiments, the IKE causal model is an ontology of interactions between factors and outcomes that form causal chains or paths to comprise a hypothetical model. The elements in the model are weighted with confidence values to permit fuzzy reasoning, and may include tags that identify the factors as “basic”, “underlying” or “direct” determinants, but this may also be inferred by position in a causal chain based on proximity to the outcome. Each causal chain is enclosed in a context, which is part of a larger domain. Contextualization permits inheritance, so salient details that may apply to many objects and/or mechanisms may be encoded at a higher level and not repeated for each factor.

[0072] In some embodiments, the causal model for this approach contains complex definitions, with factors representing an object and mechanism, and causal chains tied to specific phenomena operating within a larger context and domain of knowledge. This representation permits similar or identical factors to have completely different behaviors and outcomes in different contexts. Note that in a causal path based on subgraphs, when the name of the Y element of one subgraph matches the X element of another, they form a chain. As the graph grows with greater breadth of knowledge, the key is finding the right chains, or the best chains, through heuristics that favor correct solutions through associations with a preponderance of corroborating knowledge.
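A sketch of chain assembly under the matching rule just stated (the Y element of one subgraph naming the X element of another) follows; depth-first enumeration is one plausible traversal, not necessarily the heuristic this disclosure uses.

```python
# A sketch: enumerate causal paths by chaining subgraphs whose Y
# element exactly matches another subgraph's X element.
def causal_chains(model, start, goal, path=None):
    path = (path or []) + [start]
    if start == goal:
        return [path]
    chains = []
    for p in model:
        if p["X"] == start and p["Y"] not in path:
            chains.extend(causal_chains(model, p["Y"], goal, path))
    return chains
```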

[0073] Granularity of descriptions of phenomena refers to the scope of the description, such as global vs. local, population vs. individual, and organism vs. system vs. organ vs. tissue vs. cell. Matching the granularity of the phenomena with the factors inferred to be causally related is critical to determining the validity of the model, and ultimately to the success of the UMCR process: mismatched granularity can lead to incorrect retrospective verdicts or unrealistic predictions. In some embodiments, the system uses natural language accounts to infer possible causes and their salience without regard to granularity. In the curation process, some embodiments tune the model by matching granularity.

[0074] Figures 3A-3C show examples of causal paths as graphs, according to some embodiments. The causal path shown in Figure 3A consists of five directed subgraphs, four causal factors and one outcome. In this path, the root cause 301 leads to a mediator 302. The mediator at 302 leads to two additional mediators, 303 and 304. Each of the subgraphs, shown by dotted lines 305, includes a named relationship R object 306. The final outcome 307 is shown as the result of all the predecessors in the causal path. Mediators 302, 303 and 304 are the Y objects of the subgraphs in which 301 and 302 are the X objects, and 302, 303 and 304 are the X objects of their own subgraphs. This is possible because nodes that share exactly the same name are effectively the same object or concept.

[0075] As an example, X 301 is "rain falling", R is "causes" and Y 302 is "slippery road" in the context C of "surface transportation" in the left-most subgraph. X 302 is "slippery road", R is "causes" and Y 303 is "reduced traction" in the context C of "driving" in the lower branch subgraph. Alternatively, a detractor in this position of the path might be X 302 is "slippery road", R is "impairs" and Y 303 is "traction". X 302 is "slippery road", R is "contributor" and Y 304 is "driver losing control" in the context C of "driving" in the upper branch subgraph. And X 304 is "driver losing control", R is "precursor" and Y 307 is "collision" in the context C of "driving" in the right-most subgraph.

[0076] This example illustrates how the same concept can act as both X or cause in a plurality of directed causal subgraphs, and Y or outcome/mediator in a plurality of other subgraphs. In this example, the domain and context are "surface transportation" and "driving". The set of all subgraphs whose C objects match the domain and context are the hypothetical model for that context. Viewing the network shown in Figure 3B, subgraph 311 shares an X object with subgraph 312. Subgraph 312 has such intersections at X, Y and C. Subgraph 313 shares a common Y element with subgraph 312, and subgraph 314 shares common C and Y elements with other subgraphs. Again, the commonality consists of exactly matching words that represent concepts.

[0077] As used herein, generating a predicted outcome from known causes may refer to advanced model search heuristics using inherited characteristics in the component and/or the mechanism to expose positive or negative causal impacts that do not appear in the causal paths in the model, according to some embodiments. Specifically, the model may show that water or ice on a road surface can reduce traction, and reduced traction can cause a driver to lose control of a vehicle, and losing control of a vehicle can cause a collision.

[0078] In some embodiments, the system’s ability to perform advanced natural language processing enables the use of models and subgraphs that are not exact spelling matches but different forms of the same word, thus conceptually linked. This is accomplished through the “morphological analysis” process, synonym matching, similarity heuristics and environmental heuristics.

F. CAUSAL PHENOMENA

[0079] The present application describes mechanisms for identifying causal phenomena such as co-occurrence, colliders and confounders, mediators and environmental factors, according to some embodiments.

[0080] Confounders: In causal paths, a confounding factor or lurking variable is a causal factor that influences more than one outcome 311, possibly causing a spurious association. In forward causal reasoning, confounders constitute a logical OR, stated as: either X1 or X2 can cause Y, and they are treated as separate valid paths. As an example, strenuous activity or poor nutrition can independently cause a reduction in a person's energy level. And even though both may be present, they are independent factors in the outcome.

[0081] If both X1 AND X2 are required to cause Y, such as when thrust, lift and a specific air density are each independently needed for an aircraft to take flight, the resolution requires a logical AND and is treated differently than unrelated confounders, according to some embodiments. For complex causality, the mechanisms and interplay of causal factors are especially important to capture. The directed molecular subgraph model supports complex causality using "required" qualifiers (Q) in cause-effect subgraphs. No matter how many causal factors are required for an outcome, the "required" qualifier forces the system to resolve each one.

[0082] In some embodiments, this system includes a Confounder Heuristic: when two or more outcomes (Y1, Y2...Yn) 312 and 313 are independently associated with or caused by the same causal factor (X), the system will search the model for any direct or indirect causal path between the outcomes. If none are in the model, the system will search literature for causal paths from each outcome (Yn) to each other outcome (Yn). If the literature search turns up no causal paths, X is a confounder.
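A sketch of the Confounder Heuristic follows, reusing the causal_chains traversal sketched above; the literature search is a stub, since its mechanics are outside the scope of this sketch.

```python
# A sketch of the Confounder Heuristic over the outcomes of a shared
# causal factor X; the literature search callable is a placeholder.
def is_confounder(model, outcomes, literature_search=lambda a, b: False):
    """X is a confounder if no direct or indirect causal path links
    any pair of the outcomes it independently produces."""
    for i, y1 in enumerate(outcomes):
        for y2 in outcomes[i + 1:]:
            if causal_chains(model, y1, y2) or causal_chains(model, y2, y1):
                return False  # the outcomes are causally linked
            if literature_search(y1, y2) or literature_search(y2, y1):
                return False
    return True
```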

[0083] As used herein, generating a predicted cause from a known outcome may refer to IKE causal reasoning used to identify factors that account for an outcome and explain why an outcome occurred, according to some embodiments. The reasoning process explicitly aims to differentiate primary causes from secondary causes such as "confounders". Deconfounding experiments seek to block secondary causes or "backdoors" to demonstrate that the outcome would occur absent their influence. While this system is designed to process input from such experiments, the primary purpose is to accept as input data describing normally occurring phenomena and use a-priori knowledge to identify and rank causal factors that could account for the phenomenon. In some embodiments, the system has no capability to run such experiments or block confounders/backdoors.

[0084] In some embodiments, the system is configured to find causal paths by traversing the knowledge graph.

[0085] Colliders: In causal paths, an outcome or mediator is a collider when it is causally influenced by two or more causal factors. The name "collider" refers to the symbology in graphical models (Figure 3B), in which arrows from more than one factor 314 and 315, often unrelated to one another, lead into the same node 316. That node, whether an outcome or a mediator in the causal path is the collider. A collider does not necessarily imply causal association between the predecessor variables.

[0086] In some embodiments, the system includes a Collider Heuristic: when two or more independent or unrelated causal factors (X1, X2...Xn) 314 and 315 are found in the input and have direct paths to the same outcome (Y) 316, the system will search the model for any direct or indirect causal path between the causal factors. If none are in the model, the system will search literature for causal paths from each causal factor (Xn) to each other causal factor (Xn). If the literature search turns up no causal paths, the factors are colliders and are treated as independent, even if both factors appear in the input case.
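The Collider Heuristic mirrors the Confounder Heuristic on the cause side; again a hedged sketch reusing causal_chains, with the literature search stubbed out.

```python
# A sketch of the Collider Heuristic over factors feeding one outcome.
def are_colliders(model, factors, literature_search=lambda a, b: False):
    """Factors with direct paths to the same outcome are treated as
    independent colliders if no causal path links any pair of them."""
    for i, x1 in enumerate(factors):
        for x2 in factors[i + 1:]:
            if causal_chains(model, x1, x2) or causal_chains(model, x2, x1):
                return False
            if literature_search(x1, x2) or literature_search(x2, x1):
                return False
    return True
```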

[0087] Figure 3D illustrates how the same concept can act as both X or cause in a plurality of directed causal subgraphs, and Y or outcome/mediator in a plurality of other subgraphs, according to some embodiments. In this example, the domain and context are "surface transportation" and "driving". The set of all subgraphs whose C objects match the domain and context are the hypothetical model for that context. Viewing the network shown in Figure 3D, subgraph 331 shares an X object with subgraph 332. Subgraph 332 has such intersections at X, Y and C. Subgraph 333 shares a common Y element with subgraph 332, and subgraph 334 shares common C and Y elements with other subgraphs.

[0088] Negative Causality: Some of the illustrations (e.g., Figures 4A-4C) show simplified directed subgraphs to emphasize the relationship between a cause 401 and an effect or outcome 402. The presence of a named relationship 403 makes the subgraph 404 more robust than an unnamed directional arrow. When a causal factor, comprised of a subject component X, a mechanism f, or both 411, has a negative impact on an outcome 412, the relationship R 413 describes the nature of the negative impact in a refuting subgraph 414, reducing the likelihood of the outcome or rendering it impossible. When the knowledge is available, the magnitude of the impact is typically stored in the Q object of a fully articulated knowledge proposition, according to some embodiments.

[0089] In some embodiments, the system includes a Negative Factor Heuristic: the IKE system uses a brain-like process of both "activation" and "inhibition" in which candidates and solutions "heat up" as the aggregate weight of confirming knowledge grows, and "cool down" as negative or refuting knowledge accumulates. The direct or indirect impact of obstacles, barriers and terminators is intrinsic to the causal path analysis, can affect the rise of candidates to solutions, and becomes part of the explanations that describe how the solution was selected.

[0090] Combinations of positive and negative factors in the hypothetical model are uncommon in automated causal reasoning systems but are essential to form complete models of real-world phenomena. While Figure 3A shows a model in which reduced traction is represented as a contributor, the detailed description suggests an alternative detractor, according to some embodiments. The path in Figure 4C shows that a topical ointment applied to a small laceration 421 makes the skin itch 422 and reduces bacteria 423. Scratching the itching skin 422 adds bacteria, thus detracting from the ointment's efficacy at 423. Using the topical ointment 421 contributes directly to healing 425, and reducing bacteria 423 also contributes to the healing 425. The decision to use positive or negative factors may be purposeful during seeding and curation as part of supervised learning, but inferred knowledge propositions may fall either way depending on the contents of the selected training set, according to some embodiments.

G. COMPLEX ANALYSIS

[0091] Root cause analysis demonstrates that many phenomena have multiple levels of depth between the outcome and the original or root cause. Sometimes one can draw a clean line from the root cause to the outcome, but many phenomena are far more complex and require a "network" model of causes.

[0092] Linearity in causal models is represented by the arrows (a process focus), yet many causal models, weather prediction for example, have many factors contributing to a single outcome (a complex-systems focus), and the factors often influence one another, creating chaotic patterns that defy directional path models.

[0093] For this reason, some embodiments use more breadth in the model and a greater variety of types of conceptual knowledge in the analysis, thereby contributing to better predictive and descriptive solutions. The distributed graph nature of the model is more interconnected and brain-like than, for example, relational database models that have limited and somewhat arbitrary interconnections between conceptually linked data objects. Besides brain-like structure, IKE uses emergent brain-like processes, according to some embodiments.

[0094] Both NL understanding contributing to causal reasoning and causal reasoning improving the quality of NL interpretation benefit from a brain-like process that uses specialized processing areas to analyze each dimension of salient knowledge. Referring to Figure 5A, specialized processing areas in STM 501 provide an expandable set of distinct reasoning frameworks as dimensions. The basic dimensions include causality 502, taxonomy 503, time and space 504, part-whole or meronomy 505, language 506, and/or other reasoning dimension(s) 507 as needed to address the context of the input.

[0095] Each dimension has one or more heuristics 508 tailored to that area of knowledge, with functions that answer specific questions related to that area.

[0096] An example of one such heuristic, in addition to the confounder, collider and negative factor heuristics described above, is the Environmental Factor Heuristic: the system automatically searches the knowledge network for possible environmental factors beyond the hypothetical model, inside or outside the domain of the input phenomena, that could impact the outcome or key factors in the causal path. This is possible because of the interconnected structure of the knowledge network, in which each outcome (Y), causal factor (X) and mechanism (f) is characterized by its core attributes: concepts that are part of the global taxonomy.

[0097] Related objects in the network are also characterized by their attributes, any of which may shed light on, and possibly influence, the solution, especially when subordinate classes of objects inherit descriptive attributes from super-ordinate classes. This is a brain-like approach because the brain also exhibits electrical signal flow that follows neuronal link associations wherever the dendrites lead. Innovators and poets are examples of people well known for their ability to tap into the more remote associative links as part of their cognitive processes. This capability of making associations across multiple subject areas is core to understanding humans' creative thinking capability and their ability to infer complex causal associations.

[0098] Referring next to Figure 5B, in some embodiments, the system may be configured to classify the input 606 (Figure 6 A) and associated knowledge propositions 516 into named attributes 514 of named specialized processing areas 511 based on named relationships in propositions. In some embodiments, each specialized processing area 511 represents a contextual component of the solution named in the header 512. In some embodiments, each attribute 514 in each specialized processing area 511 represents a characteristic associated with a concept defining a respective specialized processing area. A vector in the specialized processing area 513 is used to track the progress of emergence in that context, according to some embodiments.

[0099] In some embodiments, a candidate 516 is a potential component of an unknown outcome and/or unknown cause associated with the named attribute 514. In some embodiments, each candidate 516 is associated with a modifiable confidence vector 517. Each attribute also has a vector used to track the progress of emergence in that attribute, according to some embodiments.

[00100] Figure 5C shows examples of actual values in a special processing area dimension named "space" 520 with several attributes 521, each with one or more candidates 522 and 524 and their associated candidate vectors 523, 525 and 526. The vector has two parts, the magnitude 525 and the direction 526, according to some embodiments.

[00101] Figure 5D shows an example STM word list, according to some embodiments. The example STM word list is an ordered group of lexical items or words including input objects, in which 530 is a unique index for each lexical item in the matrix and 531 is the input as received or a related lexical item extracted from the knowledge network that could contribute to understanding the input. The original magnitude (M) 533 and emergence flag (F) 534 are indexed with numbers 532 matching the associated lexical item to which the values apply, according to some embodiments.
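A hypothetical sketch of one candidate entry follows; the field names are invented to mirror the magnitude (M), direction, and emergence flag (F) just described, and are not drawn from this disclosure.

```python
# A hypothetical candidate entry in a specialized processing area.
candidate = {
    "token": "bridge",   # the candidate lexical item
    "magnitude": 0.6,    # confidence magnitude (M)
    "direction": +1,     # vector direction: heating (+1) / cooling (-1)
    "emerged": False,    # emergence flag (F)
}
```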

[00102] Referring next to Figure 5E, in some embodiments, the structures in IKE STM 541, including a list of case records 542, a matrix of all the words in each input sentence 543, and a matrix 544 of all the associations between individual words and their locations as candidates in specialized processing areas 545 (e.g., Sentence 1, Sentence 2, ..., Sentence n, Space, Taxonomy, Response, Time, Causality, and Self), are used to efficiently process the complex heuristics used in interpretation and causal reasoning.

[00103] The ability to classify words and process them in contextually relevant specialized processing areas is fundamental to robust natural language understanding and helpful in effective causal reasoning. Adding geographic knowledge enables the system to identify a location in the input and associate the location with a causal factor, an outcome or both. Adding the ability to sequence events temporally is equally important. Understanding meronomy, to establish part-whole relationships that affect causality, also improves the causal reasoning process. In various embodiments, the system design includes heuristics for some or all of these.

[00104] Figure 5F shows an example sentence matrix, according to some embodiments. The example sentence matrix is a structure in STM which is an ordered group of input objects in which 546 is a unique index for each row in the sentence matrix and 547 is an example of a lexical item or word stored in the actual sequence it appears in the sentence.

[00105] Referring next to Figure 6A, in some embodiments, the process flow begins with a step for receiving input 601, including metadata and related historical case data. This step comprises creating a "session state" in cache and RAM or STM that persists until the causal analysis or interpretation is complete, and is then logged for future reference before the session state is purged from volatile memory. The prerequisites for the process to be successful include a training data set 602, a pre-established knowledge base in graph structure 603, and a validation data set 604.

[00106] The training data set is used to prime the system with selected knowledge propositions based on context information provided by the user prior to presenting the text to interpret or the case data for causal reasoning, according to some embodiments. The knowledge base is the complete set of known knowledge propositions stored in permanent non-volatile storage or LTM, and only a small portion of the knowledge is searched and used to process the input, according to some embodiments. The validation data set includes a list of named sources 605 to search to corroborate or refute the solution, according to some embodiments.

[00107] In some embodiments, when case data or text to interpret is presented to the system, the system classifies the input 606 along with any historical data presented by the user to support the reasoning or interpretation process. Classification generally means populating attributes in specialized processing areas in STM based on a natural language interpretation process. Bi-directional causal reasoning 607 involves searching LTM 603 for salient knowledge propositions, classifying them in the same specialized processing areas in STM based on the R objects in each proposition, and invoking the heuristics associated with each populated attribute, according to some embodiments.

[00108] For both prospective and retrospective causal reasoning, the inputs include the model, the case data tagged or positionally associated with model elements, and domain-specific rules to create causal predispositions, according to some embodiments. "Priming" information is treated as predispositions because the inputs, outputs and processes all use fuzzy logic; thus the system delivers likelihoods rather than certainties and inferences rather than hard facts, according to some embodiments. Knowledge domains sometimes have inherent uncertainties, and some embodiments derive best-guess verdicts or predictions and build explanations that quantify the uncertainty as accurately as possible.

[00109] If there are any required attributes, in other words attributes needed for a solution that are not populated, additional knowledge is sought for those attributes in the knowledge graph 603, according to some embodiments. Causal reasoning is bi-directional 607 because it attempts to discover both outcomes and causal factors not in the input. The fitness algorithm and heuristics cause the fittest knowledge propositions representing both causes (verdict) and outcomes (predictions) to emerge as the most likely. These are submitted to the user with explanation of the causal lineage 608 describing the causal path(s) and the outcome(s) that are associated with each other. The reason there may be more than one possible solution is that many domains have co-occurrences, such as comorbidities in health diagnosis and multiple cascading constraints and outcomes in weather forecasting.

[00110] A significant challenge to working with an expansive knowledge model is maintaining a process within boundaries that will not lead to a combinatorial explosion of possibilities, most of which are too low in probability to be worth the processing cost to consider. In some embodiments, the IKE interpreter procedure of creating specialized dimensions in STM effectively breaks the problem up into its logical subdivisions permitting components of the solution to be calculated independently, and later merged with the other component solutions. Each attribute of each specialized dimension is used to resolve a multivariate marginal likelihood from which the multinomial truth values constituting the end solution can be assembled when the system finishes analysis for a given sentence or input case, according to some embodiments.

[00111] The model-based automated approach to inferring causality does not deal with absolutes but with likelihoods. Using weighted models can help in differentiating the relevance of possible causal factors in the final outcome. This approach is not intended for use in analyzing human intent as an element in causality: “She decided to do it, and the outcome was assured.” While human intentionality is an important aspect of natural language understanding of causality, it is not as amenable to UMCR.

[00112] The model and the structure of mechanism pairings with objects in the ontology axiomatize the knowledge for efficient processing. In some embodiments, rules and heuristics, for example mereological, temporal, spatial and taxonomical reasoning, interoperate in the causal reasoning to deliver more robust predictions and verdicts. Axiomatization further enables a single model to support multiple types of causal reasoning, according to some embodiments.

[00113] Types of causality include probabilistic, counterfactual, regularity, dispositional and agency forms of causality, according to some embodiments. In some embodiments, within a phenomenon, a candidate is causally connected only if a change to the candidate affects the outcome. When the system cannot confirm or refute the verdict, expert input bridges the gap. In some instances, identifying incorrect verdicts is difficult without human curation of the model, adding constraints that identify counter-correlated factors that do not contribute to the outcome.

[00114] The respective roles of human curators and automated inference are shown in Figure 6B, according to some embodiments. Ingested data 611 describes the input set described earlier in 601. Assembling the input is largely manual, but can be augmented with bots to search for case history data for the case input and similar cases for training data sets 612. Supervisors curate 613 a training data set by identifying the historical cases that are closest to the case under consideration and defining why the cases are similar. The human curators are subject matter experts and they also curate the validation data sets 614 and perform supervised learning tasks associated with the interpretation algorithms 615 and the causality inferences 616.

[00115] When preparing input, a data entry task asks the person submitting the case to describe the context and what is known and assumed about the case. This information becomes a set of selected concepts 617 that help prime STM for the interpretation and causal reasoning processes. The hypothetical model 618 is the subset of the knowledge graph that relates directly to the context and conceptual details of the case, including related causal paths. The IKE system 619 automatically infers, learns and validates knowledge as it is being processed, and once the case submitter and subject matter experts review the verdict or predictions 620 and accept or decline them, IKE adjusts confidence values of key knowledge propositions that contributed to the solution, according to some embodiments.

G. MACHINE LEARNING

[00116] The weights in the knowledge network represent probabilities and the internal structure of each subgraph, and the links between objects represent probabilistic propositions, according to some embodiments. Thus, the knowledge network is structured as a Bayesian network. As a Bayesian network, the knowledge network is a multinomial distribution of millions of discrete elements, each complex in content and able to link with an arbitrary number of other elements. One element may be connected to one other element, or to 10,000. The link structure is, therefore, chaotic and unpredictable. Consequently, typical neural approaches, such as Boltzmann Machines and Hidden Markov Models, cannot be used with this model to deliver solutions through the typical training and processing functions of forward and backward propagation waves.

[00117] In some embodiments, using natural language interpretation and concept learning, however, the system adds to its knowledge and refines the confidence values of individual knowledge propositions as a result of processing new cases. The more closely the cases are related, the more new cases contribute to understanding prior cases, especially when there are overlapping or intersecting causal paths. In this way, the system constantly learns and becomes better able to perform interpretation and causal reasoning functions, according to some embodiments. The more the system learns, the less human input is required to curate the inferences.

[00118] Referring next to Figure 7A, human long-term memory 701 is like a disk drive for storing facts and associations. The knowledge graph 702 is intended to resemble the structure, contents and functions of the human brain and LTM. STM 703 is also a part and function of the human brain and some embodiments model it in computers using volatile storage or Random Access Memory (RAM) 704 as a ready access storage area. During analysis and interpretation 705, small subsets of knowledge propositions from LTM are copied into STM for efficient processing, according to some embodiments.

[00119] The working storage area or cache 706 has significant roles in supporting priming 707 and learning 708. Information from LTM that may not be directly related to the case, but that shares a conceptual framework, constitutes the selected concepts 617 described earlier. The selected concepts prime the network in a way similar to the brain function of constantly processing contextual cues from the five senses. These cues prepare the brain for new input. When humans encounter something completely out of context, it often creates confusion and is difficult to understand until enough context is gathered to make sense of it.

In addition to storing the context that primes the knowledge processes, the cache is also used for learning: newly acquired knowledge propositions and adjusted weights for existing knowledge propositions are stored in cache until enough evidence is gathered to commit them to LTM, according to some embodiments.

[00120] Though this description distinguishes between real-time analysis tasks and post-facto learning processes, much of the research in the field of cognitive modeling of neural processes treats the real-time adjustments of weights in STM as “learning”. In some embodiments, the primary interpretation algorithms that allow correct interpretations to emerge are learning about the input. In some embodiments, the IKE interpreter system simply chooses to deliver the learned information as output and to remember only things that are determined to be new information to the system.

[00121] In some embodiments, the working storage area in the IKE interpreter is also a persistent holding place for parameters used in causal reasoning and for information that is expected to be useful in helping to interpret inputs. By retaining information that generally applies to a user and domain of work, the IKE interpreter can better disambiguate words or phrases that have unique meanings in the user’s context. The collection of parameter and user context information is cached as propositions organized in working memory lists and matrices, according to some embodiments.

[00122] Objects in the cache may come from different sources:

• User Preferences (elicited and inferred)

• User Profile (elicited and inferred)

• Discourse Context (inferred)

• Operating Parameters (Preset, then adjusted automatically)

[00123] The same emergent brain-like processes that support processing, support learning, according to some embodiments. The term “emergent behavior” is applied to the human brain and other complex systems whose internal behavior involves non-deterministic functionality, is so complex, or involves the interaction of so many steps that tracing the process from beginning to end is not feasible. IKE algorithms catalyze emergent behavior through the application of multiple complex contextual constraints, genetic algorithms to assign, adjust and analyze the fitness of multiple candidates, attributes and contexts, and threshold logic, according to some embodiments.

[00124] In some embodiments, the system is configured to activate emergent behavior by modifying the weight component of each confidence vector 517 of each candidate 516. In some embodiments, the starting value of the weight component is based on the knowledge proposition weight stored in the knowledge graph. In some embodiments, the value of the weight component is increased each time a corroborating knowledge proposition 404 is processed and the value of the weight component is decreased each time a refuting knowledge proposition 414 is processed.
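
For concreteness, a minimal Python sketch of this update rule follows; the step size, clamping bounds and class layout are illustrative assumptions, not details given in this description.

```python
# Minimal sketch of the confidence-weight update described above.
# The step size and the [0, 1] clamping bounds are assumptions for
# illustration, not values specified in this description.

class Candidate:
    def __init__(self, proposition_weight: float):
        # The starting value comes from the knowledge proposition
        # weight stored in the knowledge graph.
        self.weight = proposition_weight
        self.emerged = False  # emergence flag of the confidence vector

    def corroborate(self, step: float = 0.05) -> None:
        """Increase the weight when a corroborating proposition is processed."""
        self.weight = min(1.0, self.weight + step)

    def refute(self, step: float = 0.05) -> None:
        """Decrease the weight when a refuting proposition is processed."""
        self.weight = max(0.0, self.weight - step)


candidate = Candidate(proposition_weight=0.6)
candidate.corroborate()              # weight rises toward emergence
candidate.refute()                   # weight falls back
print(round(candidate.weight, 2))    # 0.6
```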

[00125] Referring next to Figure 7B, threshold logic in the IKE interpreter involves mathematical functions applied to vectors between a maximum value 711 and a minimum value 712 to determine if the magnitude of the vector is sufficient to merit attention, according to some embodiments. The threshold may be expressed as a single minimum threshold 713, or may have standard 714 and maximum tiers 715. This logic conceptually places a bar below which the activation value is insufficient to emerge to consciousness and above which attention is drawn to the vector of a specific concept or candidate, according to some embodiments. This bar is expressed as a numerical value that is within range of the expected activation potential of vectors to which the threshold applies, according to some embodiments. Different thresholds may be applied to different vectors and the thresholds for a single vector or for multiple vectors may be adjusted during the course of processing, according to some embodiments.

[00126] In some embodiments, because thresholds are adjustable, the mathematical threshold function is a sigmoidal curve 716 over a candidate X whose values are inspected over time, as in the formula: f(Xt, f(Xt+1, f(Xt+2, ...))).
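
The following sketch illustrates one way such an adjustable, tiered sigmoidal threshold could be inspected over successive time steps; the curve parameters and tier values are assumptions for demonstration only.

```python
import math

# Illustrative sketch of tiered sigmoidal threshold logic inspected
# over time. The midpoint, steepness and tier values are assumed for
# demonstration; the description does not fix them.

def sigmoid(x: float, midpoint: float = 0.5, steepness: float = 10.0) -> float:
    """Map a raw activation value onto a 0..1 sigmoidal curve."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))

def inspect_over_time(values, minimum=0.3, standard=0.5, maximum=0.8):
    """Classify each time step's activation against tiered thresholds."""
    for t, x in enumerate(values):
        level = sigmoid(x)
        if level >= maximum:
            tier = "above maximum tier"
        elif level >= standard:
            tier = "at standard tier"
        elif level >= minimum:
            tier = "at minimum threshold"
        else:
            tier = "below threshold"
        print(f"t={t}: activation={level:.2f} ({tier})")

inspect_over_time([0.2, 0.45, 0.55, 0.9])
```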

[00127] Specialized dimension containers are the fundamental structures in the IKE interpreter system that exhibit emergent behavior, according to some embodiments. The three types of vectors, dimension or context, attribute and candidate, each possess activation levels that represent the fitness of each dimension, attribute and candidate, according to some embodiments. The threshold factor applicable to each of these determines whether the vector emerges to consciousness or not, according to some embodiments. At or above threshold magnitude, an object at any level is said to emerge or attract attention. Parameters in the system define how many emergent objects in each category are fit enough to survive, according to some embodiments.

[00128] Each dimension vector, attribute vector and candidate element vector has both a direction and a numeric level of activation, according to some embodiments. The distinct levels of activation are below threshold 717, at threshold 718 and above threshold 719. The directions are emerging 720, static 721 and falling 722.

[00129] Determining the emergence of candidates in a single attribute of a context dimension in a specialized processing area can be compared to a children’s game in which an object is hidden in a room and the person who hid the object guides the contestant to the object by telling them they are getting hotter or colder. The nearer they approach the object, the hotter they are, and the further they are, the colder. In some embodiments, in the IKE interpreter system, candidate, attribute and dimension vectors heat up and cool down. An automated interpreter agent searches through all hot dimensions for hot attributes, based on inputs from other concepts 731 and selects hot candidates (surviving genes) based on magnitude and rate of change, for resolutions to the meaning of the input, according to some embodiments.

[00130] In some embodiments, the system is configured to retrieve doping inputs 731 and priming inputs from a context associated heuristic algorithm and apply the respective doping inputs and priming inputs to applicable candidates 516 in applicable attributes 514 in applicable specialized processing areas 511.

[00131] Depending on the stage of the selection process at the time of emergence of any given object vector, the process can be different, according to some embodiments. In some embodiments, attention in the context of emergence is applied as the final interpretation of a part of input when emergence occurs at or near the end of the interpretation process. When emergence occurs earlier, it can trigger additional processes such as spawning a new wave of activation in LTM, in STM or both. The new wave of activation has the potential to increase and/or decrease the magnitude of any vector, including the vector object that spawned the wave, thus potentially forcing it below threshold and deselecting it, according to some embodiments.

[00132] In some embodiments, the system is configured to modify a candidate confidence vector 517 of each candidate 516 in each specialized processing area 511 based on frequency of matching between a respective candidate and at least one of: (i) a respective user input, doping input and priming input, and (ii) knowledge propositions encoded in at least one subgraph, to bring about emergent behavior by incrementing or decrementing a weighting component of the candidate confidence vector 517.

[00133] In some embodiments, candidate selection is based on aggregate activation generated through neural processes in the portion of the knowledge network in STM (shown in Figure 7C). This process applied to each individual candidate 732 is probabilistic in that the emergence of winning or surviving candidates arises from analyzing the Bayesian probability that this proposition applies to the current input, according to some embodiments. In other words, each increment of positive 733 and negative activation 734 applied to each candidate 738 respectively increases and decreases the probability that the recipient candidate will emerge, according to some embodiments. Hence, each increment of activation bolsters or weakens the probability that the recipient proposition will be found true and applicable to solving the problem needed to resolve the case or meaning of the input 731, according to some embodiments.

[00134] In some embodiments, the system is configured to extract emergent candidates from each specialized processing area with a largest value of the weighting component of the candidate confidence vector 739.

[00135] The combination of the direction and magnitude of activation of each element in each vector constitutes its state, according to some embodiments. There are nine possible states for each element as shown in Figure 7B. Activation can also be implemented as two states: 1) below threshold and 2) at or above threshold or “fired”. Both the direction and activation can be calculated from the vector weight, the previous vector weight, and new activation flow potentials. The original vector weight and other constraints can be combined to make the state more expressive or richer, enabling more complex reasoning.
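
A minimal sketch of computing these states from a vector's current and previous weight follows; the threshold, the tolerance defining "at threshold", and the epsilon defining "static" are illustrative assumptions, not values from this description.

```python
# Sketch of the nine-state computation described above: three activation
# levels crossed with three directions. Threshold, tolerance and the
# epsilon for "static" are assumptions for illustration.

def element_state(weight: float, previous_weight: float,
                  threshold: float = 0.5, tolerance: float = 0.05):
    # Level: below, at, or above threshold.
    if abs(weight - threshold) <= tolerance:
        level = "at threshold"
    elif weight > threshold:
        level = "above threshold"
    else:
        level = "below threshold"

    # Direction: emerging, static, or falling.
    delta = weight - previous_weight
    if abs(delta) <= 1e-3:
        direction = "static"
    elif delta > 0:
        direction = "emerging"
    else:
        direction = "falling"

    return level, direction

print(element_state(0.62, 0.55))  # ('above threshold', 'emerging')
```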

[00136] In some embodiments, the system is configured to detect gaps by determining whether any attribute 514 of any specialized processing area 511 is required for a solution that has no candidates 516 and in response to determining that a respective attribute has no candidates, performing further search of the knowledge graph 121 for possible candidates.

[00137] In some embodiments, the stochastic processes that determine and adjust the fitness of each candidate, attribute and specialized dimension in the IKE interpreter operate at the object level. This is necessary because resolution of ambiguity must successfully find the correct meaning or meanings for each symbol, word and phrase. This is possible because every specialized dimension, attribute and candidate possesses direct ties to knowledge in the knowledge network both at the object and proposition levels. The rise and fall of object vectors is the primary mechanism of genetic selection.

[00138] In some embodiments, from a propositional logic perspective, the fitness of a candidate is determined from the truth values of the objects at the proposition level. But unlike typical methods for mapping truth values such as Venn diagrams or truth tables, the IKE interpreter uses excitatory and inhibitory values that are derived from activation wave processes, according to some embodiments. Doping modifies the behavior of an activation wave, according to some embodiments. The starting value of a vector’s node comes directly from the knowledge network, but that value may rise or fall based on the frequency of encountering supporting and contradictory propositions in the knowledge network, according to some embodiments.

[00139] In some embodiments, doping introduces quasi-random variables after the first activation wave has propagated through all the specialized processing areas. In some embodiments, a genetic mutation process alters the characteristics of a solution candidate or path during the course of processing. The mutated result can then compete with other results for fitness as a solution.

[00140] The common use of weights in fuzzy logic or stochastic processes is appropriate as a measure of activation at the object level; therefore, the weight of an object reference to a candidate, attribute or specialized dimension constitutes the level of activation or magnitude of its vector. In some embodiments, this activation level or magnitude is used as the fitness for the genetic scoring processes. As such, unlike the weightings in typical neural networks that result in single “winner-take-all” results, the fitness values can result in multiple successful results, thus enabling interpretation of multiple meanings which may be present in text whether intended or unintended by the speaker or generator of the text.

[00141] In some embodiments, the system is configured to generate the solution of unknown outcome and/or unknown causes based on emergent candidates 516 for each attribute 514 of the specialized processing areas 511.

H. BUILDING THE MODEL

[00142] As with humans and many AI systems, IKE capabilities grow more accurate and broader over time, according to some embodiments. Referring next to Figure 8A, the foundational causal knowledge is explicitly seeded in the knowledge graph 801 by human knowledge engineers 802, according to some embodiments. In some embodiments, there are automated components and machine learning techniques 803 that make this process efficient, but much of the work is performed by human subject matter experts and AI technicians. The automated seeding processes scan linked open data and literature 804 located in deep web subscription sites 805 and open web free sites 806 based on very specific and relatively narrowly defined search criteria to build on manually created knowledge propositions, according to some embodiments.

[00143] In some embodiments, models are populated with supervised concept learning, where causal knowledge is inferred from source inputs and combined with seeded concepts. This repeats the process shown in 803 through 806, but on a much broader basis, giving the system the ability to follow new web links to expand the search to concepts not explicitly defined by the human knowledge engineers. Seeded concepts are predefined factors and causal chains that serve as templates for machine learning procedures 803, including Bayes classification algorithms, simplified genetic algorithms and path heuristics. Bayes classifiers are used to weight the model by calculating the posterior probability of a class from the prior probability of the class and the class-conditional predictor probabilities: P(c | x) = P(x1 | c) × P(x2 | c) × ... × P(xn | c) × P(c).
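
A toy naive Bayes scoring sketch in the shape of this formula follows; the class names, feature names and probability values are invented for illustration and do not come from the described system.

```python
# Sketch of the naive Bayes weighting step named above:
#   P(c|x) ∝ P(x1|c) × P(x2|c) × ... × P(xn|c) × P(c)
# The toy priors and conditionals are invented; in the described system
# they would come from the knowledge graph and training data.

def naive_bayes_score(features, cls, cond_prob, prior):
    """Multiply the class prior by each feature's class-conditional probability."""
    score = prior[cls]
    for f in features:
        score *= cond_prob[cls].get(f, 1e-6)  # small floor for unseen features
    return score

prior = {"cause": 0.4, "not_cause": 0.6}
cond_prob = {
    "cause": {"corroborated": 0.7, "temporal_order": 0.8},
    "not_cause": {"corroborated": 0.2, "temporal_order": 0.5},
}
features = ["corroborated", "temporal_order"]
scores = {c: naive_bayes_score(features, c, cond_prob, prior) for c in prior}
print(max(scores, key=scores.get))  # 'cause'
```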

[00144] Simplified genetic algorithms are used to discriminate, rank and validate possible candidate determinants in retrospective and prospective causal models, according to some embodiments.

[00145] In the initial phase, as the network is acquiring baseline knowledge, the testing and refining processes are more manual than automated 807, but the model is structured as a Bayesian Network, so it lends itself to automated concept testing and validation 808 as extensions of the core learning algorithms and heuristics, according to some embodiments. The automated knowledge validation processes scan linked open data and literature 809 located in deep web subscription sites and open web free sites based on the specific elements in the solution or newly acquired knowledge propositions in working memory, according to some embodiments.

[00146] In some embodiments, the model is completely distributed. In some embodiments, the model grows arbitrarily without impairing reasoning processes and outcomes. Some embodiments use a curated model (e.g., with unconstrained growth).

[00147] Referring next to Figure 8B, in some embodiments, the components and processes of building the model 811 and of using the model to support interpretation and causal reasoning 812 are similar. In some embodiments, the seeding process 813, while much more complex and time-consuming, is analogous to the manual portions of the input preparation process described above in reference to Figure 6B. In some embodiments, knowledge curation 814 occurs in both building the model and improving it. In some embodiments, the knowledge graph 815 serves as the central knowledge store for both.

[00148] Some embodiments scan literature to learn at both the model building 816 phase and solution validation. NL case data scanning 817 is used in the context of input for causal reasoning, as well as in the a-priori learning process to build causal knowledge and hypothetical models. In some embodiments, the algorithms and heuristics used to infer new knowledge propositions 818 use the existing knowledge graph to avoid duplicating existing knowledge and attempt to connect new knowledge with existing knowledge through matching Y or C objects. Some embodiments use synonym matching, similarity heuristics or morphological analysis when the words do not match exactly.

[00149] Some embodiments use advanced heuristics for fitting new knowledge to the model and optimizing fitted hypothetical models 819. Fitting is a way of differentiating binary from non-binary factors, and using heuristic mechanisms and rules to quantify the impact of non-binary factors to show the degree to which that factor influences the outcome. As an example of a binary factor: the automobile’s “alternator is functional”. This proposition can be true or not. If not, the battery is predicted to die after a certain number of miles driven. A non-binary factor is the “number of miles driven before the battery will die”. With these two factors, if the alternator is not functional and distance to the destination is greater than the number of miles driven before the battery will die, the system accurately predicts that the car will not be able to arrive at the destination under its own power and that the battery will need to be externally recharged before the next journey, according to some embodiments.
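
A minimal sketch of this alternator example follows; the function and argument names are hypothetical, and only the decision rule itself is taken from the passage above.

```python
# Sketch of the alternator example above: one binary factor and one
# non-binary factor combine to predict the outcome. The helper name
# will_reach_destination is hypothetical; only the decision rule
# reflects the description.

def will_reach_destination(alternator_functional: bool,
                           miles_until_battery_dies: float,
                           distance_to_destination: float) -> bool:
    if alternator_functional:
        return True  # binary factor true: the battery keeps charging
    # Non-binary factor: remaining battery range versus trip length.
    return distance_to_destination <= miles_until_battery_dies

print(will_reach_destination(False, 40.0, 60.0))  # False: external recharge needed
```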

[00150] In some embodiments, IKE supports an ever-growing list of heuristics, and each can be tied to either an R object, a C object or a Q object in any knowledge proposition.

[00151] The processes of classifying input data 820 to identify and filter candidates 821 so that they can be scored 822 using emergence algorithms and heuristics are described above, according to some embodiments.

[00152] In some embodiments, automated validation processes 823 as described earlier precede delivery of solutions and explanations 824 and work in conjunction with manual validation 825, also referred to as curation and “supervised learning”.

I. EXPLANATION UTILITIES

[00153] AI has long been characterized by “Explanation Utilities” that explain the reasoning process that leads to the conclusion or verdict presented 824. In the domain of human health, the ability to explain both the causal relationships and the mechanisms of causation will be critical to supporting highly trained medical professionals in their diagnostic work.

[00154] In some embodiments, the IKE system uses both inference rules and causal modeling to derive solutions. It uses a strong mechanistic causal model to describe causal links between factors and outcomes, and rules in conjunction with heuristics to filter, score and select candidates, constituting the core of causal reasoning. The system uses a more detailed and granular knowledge base and a more flexible set of heuristic inference options than are available elsewhere, according to some embodiments.

[00155] In some embodiments, much of the process is embedded in the knowledge, and the explanation utility’s ability to reconstruct the lineage of the causal reasoning makes it easy for people of varying levels of technical knowledge to understand the bases for solutions delivered. IKE builds an explanation based on the emergent set of knowledge propositions in specialized processing areas. Because each knowledge proposition can be articulated as an English language sentence, the collection of the emergent propositions serves as the explanation, especially when causal propositions from the hypothetical model are chained to show the progress from root cause to final outcome, according to some embodiments.

[00156] Figure 9A shows an example logical flow of a priming heuristic, according to some embodiments. In some embodiments, prior to input processing, expectations are established by preloading a set of commonly used words. Commonality is defined broadly in terms of frequency of occurrence in the user’s language, according to some embodiments. Commonality is also defined narrowly in terms of historical user inputs, according to some embodiments. As illustrated, common words are stored 901 in the knowledge graph 902 and loaded 903 in STM 904 at the start of a user session in which input is expected. The initial magnitudes for frequent words in the user’s language are set very low, for example at half the confidence value of the average a-priori magnitude of propositions in LTM, and the magnitudes for frequent words in the user’s prior inputs are set medium-low, for example at three-fourths of that average 905. Once priming is complete, the system is ready for input 906.
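
A small sketch of these priming magnitudes follows; the LTM weights and word lists are toy data, while the one-half and three-fourths scaling factors are the ones given above.

```python
# Sketch of the priming magnitudes described above. The toy LTM weights
# and word lists are invented; the 0.5 and 0.75 scaling factors come
# from the description.

ltm_weights = [0.5, 0.7, 0.9, 0.3]  # toy a-priori proposition confidences
avg_magnitude = sum(ltm_weights) / len(ltm_weights)

def prime_stm(common_language_words, prior_user_words):
    """Preload STM with low starting magnitudes for expected words."""
    stm = {}
    for word in common_language_words:
        stm[word] = 0.5 * avg_magnitude    # very low starting magnitude
    for word in prior_user_words:
        stm[word] = 0.75 * avg_magnitude   # medium-low starting magnitude
    return stm

print(prime_stm(["the", "cause"], ["alternator"]))
# {'the': 0.3, 'cause': 0.3, 'alternator': 0.45}
```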

[00157] Figure 9B shows an example logical flow of a doping heuristic, according to some embodiments. In some embodiments, early in the processing and emergence of concept nodes 911 in STM 913, i.e. prior to the completion of language, causality and/or validation heuristics, additional related concepts are searched and retrieved 912 from the knowledge network 914 (sometimes called the knowledge graph), classified in STM based on classification procedures 915, then used as a starting point for an activation wave 916 to activate emergent behavior in a way that is only indirectly related to the input.

[00158] Figure 9C shows an example logical flow of a sequence of language heuristics, according to some embodiments. In some embodiments, each stratum of language, from syntax and semantics (921) to deixis (926) and logical intent (927), is resolved using knowledge propositions from the knowledge graph or LTM 922, classified in special processing areas or dimensions in STM 924. In some embodiments, language related concepts are searched and retrieved from the knowledge network 922, classified in STM 924, then combined with causality heuristics 925 to resolve possible ambiguity in each language stratum and determine the intent of the user. Some embodiments use time, space, taxonomy, meronomy, identity and commerce heuristics, for interpretation of intent. In some embodiments, syntax roles include parts of speech, such as noun, verb, adjective and pronoun. Semantic roles include agent, action, instrument and object and reflect specific roles that persons and things perform vis-a-vis the action or verb, according to some embodiments.

[00159] Figure 9D shows an example logical process flow of a causality heuristic, according to some embodiments. In some embodiments, an extractor 931 uses advanced matching algorithms to identify and extract causal knowledge propositions from a knowledge graph 932 and classifies them in STM 933. The classification process forms multiple tentative causal chains and each is tied to concepts from the input 934 in the STM Word List and other STM structures. Based on the specific relationships contained in the knowledge propositions, some causal factors or elements in causal chains can be marked as probable colliders or confounders 935. Coordination with the semantic interpretation process 936 then identifies the action, its agent(s), any instrument(s) or mechanism(s) likely to contribute to the outcome, the object(s) of the action and the likely outcome(s), according to some embodiments.

[00160] Prior to invoking validation heuristics, additional heuristics associated with special processing areas or specific R, C or Q values may be applied to the candidates in specific attributes and/or context dimensions. As examples: temporal heuristics may be applied to attributes in the time dimension to infer the time the event described in the input occurred, or its beginning, ending or duration; spatial heuristics may be applied to attributes in the space dimension to infer the location, origin, destination or distance of the events in the input; taxonomical inheritance heuristics may be applied to candidates in the taxonomy dimension to infer characteristics of parent objects that may be applicable to child objects in ways that may affect the outcome.
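
One plausible way to organize such dimension-specific heuristics is a dispatch table keyed by context dimension, sketched below; the heuristic bodies are placeholders and the dictionary structure is an assumption, not a stated implementation.

```python
# Sketch of dispatching dimension-specific heuristics, as in the
# examples above. The heuristic bodies are placeholders; only the
# dimension-to-heuristic pairing reflects the description.

def temporal_heuristic(attrs):   # infer event time, beginning, ending, duration
    return {"duration": "inferred from time attributes"}

def spatial_heuristic(attrs):    # infer location, origin, destination, distance
    return {"location": "inferred from space attributes"}

def taxonomic_heuristic(attrs):  # inherit parent characteristics to child objects
    return {"inherited": "parent characteristics applied to child objects"}

HEURISTICS = {
    "time": temporal_heuristic,
    "space": spatial_heuristic,
    "taxonomy": taxonomic_heuristic,
}

def apply_dimension_heuristics(specialized_areas):
    """Run each populated dimension's heuristic over its attributes."""
    results = {}
    for dimension, attributes in specialized_areas.items():
        if dimension in HEURISTICS:
            results[dimension] = HEURISTICS[dimension](attributes)
    return results

print(apply_dimension_heuristics({"time": {"when": "yesterday"}}))
```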

[00161] Figure 9E shows an example logical process flow of a validation heuristic, according to some embodiments. The validation heuristic tests the emergent results interpretation in STM 941 by getting 942 the result set and using the key concepts as a basis for a literature scan 943 that uses a pre-existing validation set 944 based on relevant literature, or seeks to create a new validation set if none is available by searching online information about the concepts under consideration. The verbal statements about historical causes and outcomes matching the concepts in STM are linguistically and logically compared to see if they support the conclusions or not 946, and if not, the conclusions may be reformulated 947 based on the validation data set, according to some embodiments. In some embodiments, the meaning profile includes answering one or more questions to represent the intent of the input. For example, the questions may include the following: Is the sentence declarative, interrogative, imperative or exclamatory? Who did what to whom, where, when and with what instrument(s)? Why did the described action occur and why is it important? What is described and is the description consistent with common presuppositions?

[00162] A concept learning heuristic may use preformulated sentence structures as a basis for inferring new knowledge propositions from text content on web pages. As an example, if mined web data on a page describing types of glass for construction professionals contains the sentence “Coated glass is highly durable and performs well in harsh weather conditions”, the concept learning heuristic may infer the knowledge proposition: (X) “coated glass” (R) “type” (Y) “glass” (C) “construction” (Q) “highly durable”. This knowledge molecule can be read “Coated glass is a type of glass in the context of construction that is highly durable”. A similar knowledge proposition may be inferred from the same input with X, R, Y and C being identical and the Q reading “performs well in harsh weather”.
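
A toy sketch of this kind of pattern-based inference follows; the single regular expression stands in for the preformulated sentence structures mentioned above, and it extracts only the first of the two propositions. The Y and C values are supplied by the caller here for simplicity.

```python
import re

# Sketch of the pattern-based concept learning step above, applied to
# the coated-glass sentence. The one regular expression is a toy
# stand-in for the preformulated sentence structures the text mentions.

PATTERN = re.compile(r"^(?P<x>[\w ]+?) is (?P<q>[\w ]+?)(?: and .*)?$")

def infer_proposition(sentence: str, y: str, context: str):
    """Infer an (X, R, Y, C, Q) knowledge proposition from one sentence."""
    match = PATTERN.match(sentence.strip().rstrip("."))
    if not match:
        return None
    return {
        "X": match.group("x"),   # subject
        "R": "type",             # named relationship
        "Y": y,                  # associate
        "C": context,            # context
        "Q": match.group("q"),   # qualifier
    }

sentence = ("Coated glass is highly durable and performs well "
            "in harsh weather conditions.")
print(infer_proposition(sentence, y="glass", context="construction"))
# {'X': 'Coated glass', 'R': 'type', 'Y': 'glass',
#  'C': 'construction', 'Q': 'highly durable'}
```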

Example Systems and Methods of Mechanistic Causal Reasoning

[00163] According to some embodiments, a method is provided for mechanistic causal reasoning using techniques described above. The method is performed by a system (e.g., the system shown in Figure 1B) that includes one or more memory units (e.g., the memory 113, 114, and/or 115) each operable to store at least one program, and at least one processor (e.g., the processor 111) communicatively coupled to the one or more memory units, in which the at least one program, when executed by the at least one processor, causes the at least one processor to perform steps of the method.

[00164] The method includes receiving input data from a user (e.g., input obtained using any of the devices 117, ..., 120). The input data describes a case and known background information about the case (as described above in reference to Figure 1A). The case is a set of causes and/or outcomes. The information about the case lacks sufficient information about why a known outcome occurred or what outcome will occur as a result of known causal factors.

[00165] The method includes determining whether the user intends to generate a predicted outcome from known causes or generate predicted causes from a known outcome (as described above in reference to Figure 1C). Forward reasoning that maps known causes to inferred outcomes, or reverse reasoning that maps known outcomes to inferred causal factors, are based on a knowledge graph (as described above in reference to Figure 1C) with subgraphs. At least one subgraph is linked to another subgraph. Each of the subgraphs represents a knowledge proposition including: a subject component (e.g., the subject component 211 in Figure 2B), an associate component (e.g., the associate component 212), a named relationship component (e.g., the component 213) that links the subject component and the associate component, a context component (e.g., the component 214) that identifies a domain of knowledge in which an association is true, a qualifier component (e.g., the component 215) that describes a constraint governing the relationship that further narrows the context in which the association is true, a weight component (e.g., the component 216) that is a probability factor or a likelihood that the proposition that the subject component is related to the associate component in the context identified by the context component, and a mechanism component (e.g., the component 219) that describes an action that the subject component is performing to affect the associate component.
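
As an illustration, the proposition components enumerated above could be modeled as a plain data class; the field types and example values below are assumptions, with names following the X/R/Y/C/Q convention used later in this description.

```python
from dataclasses import dataclass

# Sketch of the knowledge proposition structure enumerated above.
# Field types and the example values are illustrative assumptions.

@dataclass
class KnowledgeProposition:
    subject: str        # X: subject component
    relationship: str   # R: named relationship linking subject and associate
    associate: str      # Y: associate component
    context: str        # C: domain in which the association is true
    qualifier: str      # Q: constraint that further narrows the context
    weight: float       # probability that the proposition holds in context
    mechanism: str      # action the subject performs to affect the associate

prop = KnowledgeProposition(
    subject="alternator", relationship="causes", associate="battery charge",
    context="automobile", qualifier="while engine runs",
    weight=0.95, mechanism="generates electrical current",
)
print(prop.weight)  # 0.95
```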

[00166] The method includes traversing the knowledge graph stored in LTM. Traversing the knowledge graph includes associating each word of the input with a lexicon object (e.g., the object 124), and associating each lexicon object with a plurality of propositions in the knowledge graph. Each proposition corresponds to a subgraph, and the propositions define a relationship between the subject component and the associate component in the subgraph. Traversing the knowledge graph also includes classifying (e.g., as described above in reference to Figure 5B and 6A) the input and associated knowledge propositions into named attributes (e.g., the attributes 514) of named specialized processing areas (e.g., the processing areas 511) in STM based on named relationships in propositions. Each specialized processing area represents a contextual component of a solution. Each attribute in each specialized processing area represents a characteristic associated with a concept defining a respective specialized processing area. According to some embodiments, a candidate is a potential component of an unknown outcome and/or unknown cause associated with the named attribute. Each candidate is associated with a modifiable confidence vector including a weight component and an emergence flag.

[00167] Processing by a specialized processing area includes: (i) activating emergent behavior (e.g., as described above in reference to Figure 7B) by modifying the weight component of each confidence vector of each candidate. A starting value of the weight component is based on the knowledge proposition weight stored in the knowledge graph. The value of the weight component is increased each time a corroborating knowledge proposition is processed, and the value of the weight component is decreased each time a refuting knowledge proposition is processed; (ii) retrieving doping inputs and priming inputs (e.g., the inputs 731 described above in reference to Figure 7C) from a context associated heuristic algorithm (e.g., as described above in reference to Figures 9A and 9B) that generates respective doping inputs and priming inputs for each candidate in each attribute in each specialized processing area, and applying the respective doping inputs and priming inputs to each candidate in each attribute in each specialized processing area; and (iii) modifying a candidate confidence vector (e.g., the candidate confidence vector 517 or the candidate confidence vector 739) of each candidate in each specialized processing area based on frequency of matching between a respective candidate and at least one of: (i) a respective user input, doping input and priming input, and (ii) knowledge propositions encoded in at least one subgraph, to bring about emergent behavior by incrementing or decrementing a weighting component of the candidate confidence vector.

[00168] Traversing the knowledge graph also includes extracting emergent candidates from each specialized processing area with a largest value of the weighting component of the candidate confidence vector. In some embodiments, traversing the knowledge graph also includes detecting gaps by determining whether any attribute of any specialized processing area is required for a solution (e.g., the solution in Figure 5C) that has no candidates and in response to determining that a respective attribute has no candidates, performing further search of the knowledge graph for possible candidates. Traversing the knowledge graph also includes generating (as described above) a solution of unknown outcome and/or unknown causes based on emergent candidates for each attribute of the specialized processing areas.

[00169] In some embodiments, the system further includes: a storage architecture (e.g., the memories 113 or 114) configured to include temporary special processing area structures used to classify and organize the input data from a user by named category, including a plurality of context dimensions (e.g., the dimensions 502, 503, ..., 507 in Figure 5A, the dotted line structure includes dimensions in short-term memory, also as described above in reference to Figure 5B) as ordered multi-dimensional storage structures. Each context dimension includes a named context header (e.g., the header 512) and one or more attribute dimensions (e.g., the attribute dimensions 514). Each attribute dimension represents a subject component, and each attribute dimension is associated with a respective candidate dimension. One or more attribute dimensions are associated with a respective context dimension. Each attribute dimension contains a name representing a specific concept applicable to the named context header of said associated context dimension. At least one candidate dimension contains zero or more knowledge propositions, for each named attribute object in said attribute dimension.
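
A minimal sketch of this nested storage architecture follows, using dictionaries as an illustrative (not stated) implementation choice; it also shows the kind of empty-attribute gap check described in the traversal steps above.

```python
# Sketch of the nested storage architecture described above: context
# dimensions contain named attribute dimensions, each holding a
# candidate dimension of zero or more propositions. The dict nesting
# and the sample content are assumptions for illustration.

stm = {
    "causality": {                      # context dimension (named header)
        "agent": ["driver"],            # attribute dimension -> candidates
        "mechanism": ["alternator failure", "battery drain"],
        "outcome": [],                  # required attribute, no candidates yet
    },
    "time": {
        "when": ["during trip"],
        "duration": [],
    },
}

def attributes_missing_candidates(stm_structure):
    """Find (context, attribute) pairs whose candidate dimension is empty."""
    return [(ctx, attr)
            for ctx, attrs in stm_structure.items()
            for attr, candidates in attrs.items()
            if not candidates]

print(attributes_missing_candidates(stm))
# [('causality', 'outcome'), ('time', 'duration')]
```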

[00170] In some embodiments, at least one multi-dimensional structure (e.g., the structure shown in Figure 5A) is specialized in causality; when more than one causal candidate exists in an attribute, said causal candidates are ordered to represent a causal path of predecessor and successor knowledge propositions that form causal factors. In some embodiments, in any given multi-dimensional structure specialized in taxonomy, when more than one candidate exists in an attribute, said candidates are ordered to represent a hierarchical or taxonomical ordering scheme of super-ordinate and subordinate classes of objects. In some embodiments, in any given multi-dimensional structure specialized in space and time, when more than one candidate exists in an attribute, said candidates are ordered to represent a spatial or temporal ordering scheme of location and time classes of objects. In some embodiments, in any given multi-dimensional structure specialized in meronomy, when more than one candidate exists in an attribute, said candidates are ordered to represent a part-whole constructive ordering scheme of part and whole classes of objects. In some embodiments, each attribute dimension is defined as either required or optional for solution generation. In some embodiments, each said candidate is associated with a vector comprised of magnitude and direction components, constituting an adjustable score for each said candidate. In some embodiments, candidate object related information further includes an original magnitude and emergence flag (e.g., the flags 534 shown in Figure 5D) for each candidate.

[00171] In some embodiments, the method further includes steps for analysis of meaning of an ordered group of input text objects (e.g., input text is tokenized, each token constitutes an ordered group, and each token is an object; as also described above in reference to Figure 5F; sometimes referred to as lexical items) forming natural language phrases and sentences based on a scoring strategy. The steps include segregating (or separating) individual words in the input text, adding them to a word list in a short-term memory (STM), and searching for each input word in a lexicon having a plurality of words therein. Each said word is linked to a plurality of knowledge propositions. In some embodiments, the steps also include analyzing morphology of said words by determining if a prefix or suffix has been added to a root word to form said input word and adding root words to the word list.

[00172] In some embodiments, the steps also include extracting, from the knowledge graph, said knowledge propositions formed, in part, by each word in the word list. In some embodiments, the word list is expanded to include additional words when extracted propositions contain X, Y, C, or Q objects not yet in the word list. In some embodiments, extracting includes using lexical items or tokens from the input and related information to search the knowledge network for knowledge propositions in which the lexical item matches the X, Y, C, or Q object in any knowledge propositions, then returning those knowledge propositions to STM for classification.

[00173] In some embodiments, the steps also include classifying a plurality of candidates formed of directed subgraphs, each said candidate describing an explicit logical relationship between one object and another object, into a specialized processing area. In some embodiments, the steps also include comparing a first or X object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects.

[00174] In some embodiments, the steps also include comparing a second or Y object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects. The steps also include comparing a third or C object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects. In some embodiments, comparing includes matching the lexical items or tokens from the input with lexical items that exactly or closely match the X, Y, C, or Q object in any knowledge proposition.

[00175] In some embodiments, the steps also include invoking and executing interpretation heuristics associated with the named relationship or R values of the candidates with the highest score vectors to further reorder concepts in each attribute dimension of each specialized processing area based on fitness.

[00176] In some embodiments, the steps also include adjusting the score vector assigned to affected candidates based on a quantity of recurring objects or a frequency of encountering recurring objects during heuristic processes. In some embodiments, adjusting includes changing the numerical value of the score vector or confidence value associated with a given concept or knowledge proposition. Corroborating indicators adjust the score upward, representing a higher confidence that this is a correct understanding of the concept or knowledge proposition in the case of the given input, while refuting indicators adjust the score downward, representing a lower confidence that this is a correct understanding of the concept or knowledge proposition in the case of the given input. The aggregate influence of all the corroborating and refuting indicators constitutes the final confidence value applied to each concept and knowledge proposition in STM.

[00177] In some embodiments, the steps also include reordering said candidates based on the direction and magnitude of said vectors, wherein said vector directions comprise emerging, static, and falling conditions. Said vector magnitudes include numeric values that, when compared with a numeric threshold value, are determined to be above threshold, at threshold, or below the threshold value. In some embodiments, the steps also include determining the context of the input text based on the highest scored C object in the appropriate specialized processing areas. In some embodiments, the steps also include invoking and executing additional heuristics to find candidates for any required attributes with no candidates, and if found, repeating the above steps of segregating, analyzing, extracting, classifying, comparing, adjusting, reordering, determining, and invoking and executing additional heuristics. In some embodiments, the steps also include applying a fitness algorithm to determine the fittest candidates of those compared in each attribute dimension of each specialized processing area. In some embodiments, the steps also include formulating a meaning profile based on the highest scoring or fittest emergent candidate of each attribute dimension of each specialized processing area.

[00178] In some embodiments, the method further includes steps for performing deep natural language understanding. The steps include receiving input text, formed of a plurality of words, and matching each word with a word in the lexicon to populate an ordered word list. In some embodiments, the steps also include extracting phrases (e.g., as described above) including idioms in the lexicon in which one or more words in the input appear in the phrase, and adding such phrases to said word list. In some embodiments, the steps also include using punctuation and other linguistic cues to segregate each sentence in the input to store each input sentence into an ordered sentence matrix. In some embodiments, the steps also include extracting, from the knowledge graph, propositions formed, in part, by each word in the word list. In some embodiments, the steps also include classifying said extracted propositions in the specialized processing areas based on an applicable attribute of a respective specialized processing area. In some embodiments, the steps also include applying the fitness algorithms to determine the fittest propositions of those compared. In some embodiments, the steps also include invoking natural language understanding heuristics to interpret the context and relationships of said words, phrases and sentences by analyzing each level of linguistic content of said data objects, wherein the levels include pragmatics or context, semantics, grammar or syntax, morphology, phonology, and prosody.

[00179] In some embodiments, the method includes steps for the analysis of the causality based on a scoring strategy of an ordered group of input text objects forming natural language words and phrases classified into a specialized processing area for causality fitness processing representing causal factors or outcomes. In some embodiments, the steps include providing a plurality of candidates formed of directed subgraphs, each said candidate describing an explicit causal relationship between one object and another object. In some embodiments, the steps also include comparing a first or X object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects. In some embodiments, the steps also include comparing a second or Y object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects. In some embodiments, the steps also include comparing a third or C object of each candidate to find matching objects in STM and adjusting the vector of each candidate based on the quantity of matching objects. In some embodiments, the steps also include invoking and executing causality heuristics (e.g., as described above in reference to Figure 9D) associated with the named relationship or R values of the candidates with the highest score vectors to further reorder concepts in the attributes dimension of each specialized processing area. In some embodiments, the steps also include adjusting the score vector assigned to affected candidates based on the quantity of common objects or the frequency of encountering common objects during heuristic processes. In some embodiments, the steps also include reordering said candidates based on the direction and magnitude of said vectors, wherein said vector directions comprise emerging, static, and falling conditions. Said vector magnitudes comprise numeric values that, when compared with a numeric threshold value, are determined to be above threshold or emergent, at threshold, or below the threshold value. In some embodiments, the steps also include determining the context of the input text based on the highest scored or fittest emergent C object in the appropriate specialized processing areas. In some embodiments, the steps also include invoking and executing additional heuristics (see examples described above) to find candidates for any required attributes with no candidates, and if found, repeating the steps of providing, comparing the first or X object, comparing the second or Y object, comparing the third or C object, invoking and executing, adjusting the score vector, reordering, determining the context, and invoking and executing the additional heuristics. In some embodiments, the steps also include invoking and executing causality heuristics (e.g., as described above in reference to Figure 9D) to create contiguous causal chains or paths that identify and order the most likely causal factors and outcomes for the input data set.

[00180] In some embodiments, the method further includes generating, filtering and scoring alternative candidates for solutions including: (i) forward-looking solutions selecting and prioritizing predicted outcomes for known causal factors; (ii) reverse solutions selecting and prioritizing likely candidate causal factors for known outcomes; (iii) heuristic algorithms for applying forward-chaining inference rules to adjust the prioritization of solution candidates; (iv) heuristic algorithms for applying backward-chaining inference rules to find candidates in the input or the knowledge network for required attribute dimensions with no candidates; (v) rules within the heuristic algorithms for differentiating binary and non-binary factors and applying weighting to each candidate to show both the likelihood of the candidate forming part of a final solution and the degree to which emergent candidates participate in the outcome; (vi) inheritance rules within the heuristic algorithms for applying characteristics of higher-ordered taxonomical concepts to lower-ordered taxonomical concepts; and/or (vii) a human user interface to display prioritized solutions, their weightings and explanations.

[00181] In some embodiments, the method further includes using a lineage tracking algorithm for generating explanations based on the rules and causal path that led to the solution, and why other possible solutions were rejected.

[00182] In some embodiments, the method further includes automatically validating a solution by searching literature with an advanced causal natural language interpreter to find and analyze corroborating text stating that said solution is possible, common, unlikely or impossible, including: (i) searching and analyzing text in web pages on the open web; (ii) searching and analyzing text in deep web content sources with limited access controlled by membership; and/or (iii) searching and analyzing text in case data in internal systems, documents and databases.

[00183] In some embodiments, the method further includes searching a plurality of named sources for information to be used in the creation of new knowledge propositions to build a knowledge graph for use in causal reasoning and natural language understanding, and in the validation of inferred knowledge propositions and solutions. Some embodiments include a knowledge graph comprising a plurality of predefined seed concept nodes (sometimes called seeded concept or seeded concept node; e.g., the node 813 described above in reference to Figure 8B) connected by descriptive, taxonomical, meronomical, spatial, temporal, linguistic and/or other named relationship vertices, and a plurality of directed subgraphs containing manually defined mechanistic cause and effect nodes connected by relation vertices. Some embodiments include a search string constructor or formulator algorithm and user interface to search a plurality of named sources for content matching the search string or logical components thereof. Some embodiments include a source list manager and user interface for selecting sources to search to support learning and validation. Some embodiments include a search bot to read text in each source to find phrases that contain the knowledge for comparison in natural language structures that augment, corroborate or refute existing knowledge propositions. Some embodiments include one or more machine learning algorithms using natural language analysis to scan text input from prior literature to automatically infer causal and other relationships contained in the text based on declarative statements containing both cause and effect in transitive active (if/then) or passive (result/because) structure. Some embodiments include an inference heuristic (an example of which is described above) with knowledge proposition formation rules that enable creation of new well-formed knowledge propositions (e.g., the knowledge propositions described above). Some embodiments include a plurality of heuristic algorithms for generating concept nodes and descriptive, taxonomical, meronomical, spatial, temporal, linguistic and other named relationships, and generate new directed subgraphs containing mechanistic cause and effect nodes connected by relation vertices based on inferred causal and other relationships (e.g., relationships stored in the memory of the system). Some embodiments include weighting algorithms for applying and adjusting confidence values to relations between nodes and directed subgraphs in the knowledge graph based on frequency of validation in literature search or the reliability of the source of the content. Some embodiments include qualifying heuristics using nodes, wherein the qualifier defines a known constraint that further defines the unique relationship between the nodes in a subgraph. Some embodiments include machine learning algorithms and heuristics to associate newly acquired or inferred concepts and subgraphs (e.g., the concepts and/or subgraphs stored in the memory of the system) to concepts and subgraphs already present in the knowledge graph, then flag them for validation prior to permanent storage. Some embodiments include machine learning algorithms and heuristics to modify pre-existing stored knowledge graph nodes, named relationships, subgraphs, their components and weights. 
Some embodiments include validation heuristics (e.g., as described above in reference to Figure 9E) for using found knowledge propositions to augment, corroborate or refute solutions derived from causal reasoning processes. Said sources of information include web pages, natural language material stored on permanent storage media such as file stores accessible to the system, or case data stored in content management systems or databases. A searching process is sometimes referred to as a literature search, even though some of the information searched is not of the form of published documents, according to some embodiments.

[00184] In another aspect, a computational system is provided, according to some embodiments. The computational system stores information in the form of a knowledge graph describing real world facts and associations in the form of contextually tagged and weighted knowledge propositions, in one or more knowledge domains including causality, taxonomy, meronomy, time, space, identity, language, symbols and mathematical formulas, that is used in conjunction with natural language understanding and logical inference to accurately determine (e.g., determination accuracy close to that of a human, or human-level competence) why and/or how unknown factors resulted in a known outcome, and/or what outcomes are likely given known causal factors. The knowledge propositions are used as a basis of resolving ambiguity and determining the actual intent from among many possible interpretations of intent for sentences in natural language understanding.

[00185] In another aspect, a method is provided for mechanistic causal reasoning, according to some embodiments. The method includes receiving an input text from a user, the input text specified in a natural language. The method also includes building a knowledge graph that represents real world facts and associations in the form of contextually tagged and weighted knowledge propositions, in multiple knowledge domains (e.g., causality, taxonomy, meronomy, time, space, identity, language, symbols and mathematical formulas). The method also includes resolving ambiguity and determining actual intent of the user for the input text, from a plurality of interpretations of intent for sentences in natural language understanding, using the knowledge graph in conjunction with natural language understanding and logical inference. The method also includes generating a response to the user, as to why and/or how unknown factors resulted in a known outcome, or what outcomes are likely given known causal factors, based on the resolved ambiguity and the actual intent of the user.

GLOSSARY OF TERMS

Activation

[00186] The spread of positive and negative electrical potential in the brain from neuron to neuron is called activation. Positive activation is called excitation and negative activation is called inhibition. In the IKE interpreter’s knowledge network, activation spreads from node to node based on associative links. This activation is a means of stochastic or fuzzy recognition used in the genetic selection algorithms. Excitation corresponds to “heating up”, or increasing confidence values, and inhibition corresponds to “cooling down”, or reducing confidence values.
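A minimal sketch of this behavior, assuming a network stored as a dict of node -> [(neighbor, link weight)] with link weights in [0, 1]: signed activation (excitation positive, inhibition negative) spreads along associative links and accumulates into confidence values until the signal decays below a threshold. The names and constants are illustrative only.

```python
def spread_activation(graph, seeds, decay=0.5, threshold=0.01):
    """Propagate signed activation (excitation > 0, inhibition < 0) along
    associative links, accumulating each node's confidence value.
    Assumes link weights in [0, 1], so the wave decays geometrically."""
    frontier = dict(seeds)                    # node -> signed activation
    confidence = {}
    while frontier:
        next_frontier = {}
        for node, signal in frontier.items():
            confidence[node] = confidence.get(node, 0.0) + signal
            for neighbor, link_weight in graph.get(node, []):
                passed = signal * link_weight * decay
                if abs(passed) > threshold:   # the wave subsides quickly
                    next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + passed
        frontier = next_frontier
    return confidence

net = {"smoke": [("fire", 0.9)], "fire": [("alarm", 0.8)]}
print(spread_activation(net, {"smoke": 1.0}))
# -> {'smoke': 1.0, 'fire': 0.45, 'alarm': 0.18}
```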

Activation Wave

[00187] This expression represents the serial flow of excitation and/or inhibition triggered by a single input in a natural or artificial neural network. Natural and artificial neural networks can exhibit directional or chaotic flow of activation. An example of directional activation flow in a natural system is the human visual cortex, which has multiple layers and through which activation flows sequentially from the back to the front, then on into the correlation and interpretation centers of the brain. Consequently, the deconstruction of the image in the brain’s visual center is an output of a relatively directional wave of activation flow. In other areas of the brain, electrical impulses flow in less directional and more chaotic patterns, but the level of activity triggered by an input usually subsides quickly, making it possible for the brain to handle new inputs that trigger new waves of activation.

[00188] Once in the correlation and interpretation centers, the flow becomes much less directional or more chaotic. Activation flows in parallel to many specialized areas of the brain. These processing centers respond by sending back activation patterns that contribute to the emergent phenomena of recognition and interpretation that go on to support all cognitive functions.

[00189] Whether directional or not, the path of any activation flow in a neural system can theoretically be traced backward from the point (neuron or node) where the flow stops to the point where it began, no matter how much it spreads or branches out in the process. The collection of all such serial paths triggered by a single input is called a wave.

[00190] The complexity of the input may be arbitrary, but the more complex the input, the more complex the wave will be, and hence the more difficult it is to trace. The science of tracing activation waves in the brain is not yet mature enough to trace the entire path of activation flow either backward or forward from neuron to neuron for a given input. Artificial systems, however, can be traced. Mimicking human activation flow patterns is one of the key objectives of many artificial neural systems, including the subject of this application.
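To illustrate the backward-tracing point in an artificial system, the sketch below assumes that the spreading process records, for each node, the node from which activation first arrived (a hypothetical `parents` map); the wave's serial path can then be walked back from where the flow stopped to where it began.

```python
def trace_wave(parents, end_node):
    """Walk first-arrival parent pointers back from where the activation
    flow stopped to the node where it began."""
    path = [end_node]
    while end_node in parents:
        end_node = parents[end_node]
        path.append(end_node)
    return list(reversed(path))

parents = {"alarm": "fire", "fire": "smoke"}  # recorded as the wave spread
print(trace_wave(parents, "alarm"))           # -> ['smoke', 'fire', 'alarm']
```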

Bi-directional Causal Reasoning

[00191] Unified mechanistic causal reasoning can function in both directions. Forward reasoning analyzes internal knowledge, data and literature to extract causal factors from which to build a model that predicts outcomes; reverse reasoning applies the model to new cases to test its ability to correctly infer causes.

Collider

[00192] In causal paths, an outcome or mediator is a collider when it is causally influenced by two or more causal factors. The name "collider" refers to the symbology in graphical models, in which arrows from two or more factors, often unrelated to one another, lead into the same node.

Confounder

[00193] In causal paths, a confounding factor or lurking variable is a causal factor that influences more than one outcome, possibly causing a spurious association.
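Using the definitions in the two entries above, here is a toy classifier over a directed causal graph: a node with two or more incoming arrows is a collider, and a factor with two or more outgoing arrows is a potential confounder. This is a sketch of the glossary definitions, not of the disclosed reasoning engine.

```python
def classify_nodes(edges):
    """edges: (cause, effect) pairs. Colliders have two or more incoming
    arrows; potential confounders have two or more outgoing arrows."""
    indeg, outdeg = {}, {}
    for cause, effect in edges:
        outdeg[cause] = outdeg.get(cause, 0) + 1
        indeg[effect] = indeg.get(effect, 0) + 1
    colliders = {n for n, d in indeg.items() if d >= 2}
    confounders = {n for n, d in outdeg.items() if d >= 2}
    return colliders, confounders

edges = [("rain", "wet_grass"), ("sprinkler", "wet_grass"), ("rain", "traffic")]
print(classify_nodes(edges))  # -> ({'wet_grass'}, {'rain'})
```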

Consciousness

[00194] In humans, consciousness is an emergent cognitive phenomenon usually active whenever one or more of the senses is perceptually active. Other cognitive phenomena, such as attention, derive from consciousness and may be described as "heightened states of consciousness". In the IKE interpreter, consciousness is a state of accepting and processing input while maintaining a broader map of the spatial, temporal, commercial and social context associated with its primary user.

Context

[00195] Context is a snapshot of the universe from a specific point of view to a specific depth. If the viewpoint is that of an astronomer at work, it could begin at her desk and include a radius of many thousands of light years. If the viewpoint is that of an electron in an inert substance, the context would encompass a very small distance. Context includes locations in space, points in time, activities, ideas, intentions, communications, motion, change, stasis, and any describable thing closely associated with the person, place or thing to which the context applies. Higher or superior levels of context may be described as domains.

Counterfactual

[00196] A counterfactual is a proposition that states that a certain associative proposition or causal link is unlikely; thus it spreads negative or inhibitory activation.

Disambiguation

[00197] Disambiguation is the process of resolving ambiguity, especially in words, symbols or phrases that carry multiple possible meanings (polysemy). This is necessary for accurate interpretation of input human language text or utterances. Context is needed to disambiguate polysemous input.
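A minimal sketch of context-driven disambiguation, assuming each sense of a polysemous word carries a hypothetical set of contextual markers; the sense whose markers best overlap the surrounding words wins. The sense labels and marker sets are invented for illustration.

```python
def disambiguate(word, context_words, senses):
    """senses: sense label -> set of contextual markers for that sense.
    Return the sense of `word` whose markers overlap the context most."""
    return max(senses, key=lambda s: len(senses[s] & set(context_words)))

# Toy senses for the polysemous word "bank":
senses = {"river_bank": {"water", "shore", "fishing"},
          "money_bank": {"loan", "account", "deposit"}}
print(disambiguate("bank", ["deposit", "account"], senses))  # -> money_bank
```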

Domain

[00198] A domain is a named concept representing a high level of a taxonomy of context in which multiple subordinate contexts exist. “Domain” may be considered to be shorthand for “domain of knowledge”. There are knowledge domains that correspond to specialized processing areas in IKE, such as “time”, “space”, “causality” and “meronomy”, and knowledge domains that describe specialized areas of science or human activity, such as “marine biology” and “professional sports”. All these domains combine to form a taxonomy of knowledge that supports definitions of domain characteristics, which may be inherited by lower-level domains and contexts.
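To illustrate inheritance down the taxonomy of knowledge, the sketch below assumes hypothetical `parent_of` and `traits` tables; characteristics defined at a higher domain are visible to lower-level domains unless overridden nearer the leaf.

```python
def inherited_characteristics(domain, parent_of, traits):
    """Collect characteristics from a domain and every ancestor above it;
    a nearer (more specific) domain overrides an inherited value."""
    collected = {}
    while domain is not None:
        for key, value in traits.get(domain, {}).items():
            collected.setdefault(key, value)
        domain = parent_of.get(domain)
    return collected

parent_of = {"marine biology": "biology", "biology": "science"}
traits = {"science": {"method": "empirical"},
          "marine biology": {"medium": "seawater"}}
print(inherited_characteristics("marine biology", parent_of, traits))
# -> {'medium': 'seawater', 'method': 'empirical'}
```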

Doping

[00199] In a genetic algorithm, doping is the process of introducing random or quasi-random variables into the equation, population or gene pool to affect the process, the output, or both. In this context, quasi-random may mean based on a random number generator, based on random selection of targets to which to apply variables, or based on non-random variables applied in a non-random way but in which the variables have no describable association with the core interpretive processes or the targets to which they are applied.

Emergence

The term “emergent behavior” has been applied to the human brain and other complex systems whose internal behavior involves non-deterministic functionality, or is so complex or involves the interaction of so many pieces that tracing the process from beginning to end is not possible or not feasible. Because of the power of computers and their ability to track internal activity, it is not possible to produce 100% untraceable processes, just as it is not possible to produce a random number generator that is actually random rather than merely ostensibly random. Thus some emergent behavior in computers, other than in some artificial neural systems or neural networks, is trackable, and therefore explainable.

[00200] In the context of the IKE interpreter, emergence is a computational behavior that mimics the inventor’s understanding of the spreading activation behavior of the human brain processes used to interpret human language.

Encoding Scheme

[00201] A way of representing something using a tightly specified symbol or token set. ASCII and EBCDIC are encoding schemes for symbol systems for alphabetic and numeric symbols. In this document, encoding scheme refers to a specific design for structuring language knowledge facts and real-world knowledge facts, in the form of words and other human- and machine-readable symbols, into conceptually relevant associations that can support automated or computerized access and processing. The English language is such an encoding scheme, but its irregularities make it difficult to use in its normal form for automated processing. Well-formed syllogisms or other logical statements with a finite set of connectors and operators are a more regular encoding scheme for knowledge of facts.

Expectation

[00202] Expectation is a concept that is relatively foreign to computing but essential to achieving high accuracy in natural language interpretation. Expectation is an a priori set of contextual markers that describe a world familiar to the IKE interpreter system, based on the world familiar to the human user. The more the system knows about the primary users and their surroundings, the better it will be able to determine the users’ intentions based on the words they submit to the system as input.

Fitness

[00203] Fitness is a characteristic of a candidate solution or a part thereof. In a genetic selection process, survival-of-the-fittest is used to differentiate possible solutions and enable the one or more fittest solutions to emerge victorious.

Genetic Selection

[00204] Genetic Selection is a process of survival-of-the-fittest in which fitness algorithms are applied to multiple possible solutions and only the best survive each generation. Unlike winner-take-all processes in which only the single best candidate solution emerges, genetic selection can yield multiple surviving solutions in each generation. Then, as successive generations are processed, survivors from previous generations may die off if the succeeding generations are more fit.
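A compact sketch of genetic selection as defined above, with hooks for the mutation and doping operations defined elsewhere in this glossary. The survivor count, toy fitness function and mutation step are illustrative assumptions, not parameters of the disclosed system.

```python
import random

def genetic_selection(population, fitness, survivors=3, generations=10,
                      mutate=None, dope=None):
    """Keep several of the fittest candidates each generation (not winner-
    take-all); earlier survivors die off only if newer candidates are fitter."""
    for _ in range(generations):
        pool = list(population)
        if mutate:
            pool += [mutate(c) for c in population]           # mutation
        if dope:
            pool += [dope() for _ in range(len(population))]  # doping
        population = sorted(pool, key=fitness, reverse=True)[:survivors]
    return population

best = genetic_selection(
    population=[random.random() for _ in range(8)],
    fitness=lambda x: -abs(x - 0.5),                # fittest is nearest 0.5
    mutate=lambda x: min(1.0, max(0.0, x + random.uniform(-0.1, 0.1))),
    dope=lambda: random.random(),                   # quasi-random newcomers
)
print(best)
```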

Inference

[00205] Inference involves correlating multiple constraints contained in input and derived from other sources as premises, and drawing conclusions based on testing the relative truth of multiple propositions affecting each constraint or premise.

[00206] Inference is what humans constantly do with their brains. Based on perceptions, humans make inferences about meaning, about the state of things, about consequences of actions, and about life, the universe, and everything. Inference involves applying logic to new information derived from senses and remembered information stored in the brain to form conclusions.

[00207] Forming conclusions is important because the conclusions form a basis for correct interpretation and appropriate further processing. The IKE interpreter is capable of abandoning an inferred conclusion if newer information prompts it to do so.

Knowledge

[00208] The term knowledge means correlated information. This definition is part of a larger continuum in which everything interpretable can be assigned to some point on the continuum. The position of knowledge in this continuum can be described in terms of its complexity relative to other things in the environment that are interpretable. The level of facts that humans can learn and describe in simple phrases is called existential knowledge or data. Data is the kind of knowledge expressed in almanacs. At the complex end of the knowledge continuum is one or more levels of meta-knowledge or knowledge about knowledge.

[00209] The term “noise” is borrowed from radio wave theory to describe environmental things that interfere with the interpretation or acquisition of knowledge. Noise, the simplest of all interpretable things, is made up of things in the perceptual environment or input that are less meaningful than data and typically irrelevant to the process or solution under consideration. An interpretation system must be able to process noise because it is omnipresent. Thus, a system must have knowledge that enables it to differentiate noise from salient data, though this may be more of an attention function than actual knowledge. Once the noise in the environment is filtered out, all that remains is data, which can be correlated to constitute information and knowledge.

[00210] Data elements that humans process are input in the form of perceptual stimuli to the five senses. The specific types of data available are tactile sensations, tastes, smells, sounds and images. These perceptual inputs are processed in specialized areas of the brain, correlated in parallel, then used as the basis for cognitive processing. The IKE interpreter algorithms are primarily designed to interpret human language, but can also be generalized to interpret the other forms of sensory input described above.

Knowledge Base

[00211] The IKE interpreter uses a combination of a lexicon and a Knowledge Network that contain information about things in the world and the way they are interrelated.

Knowledge Network

[00212] A massively interconnected network of information about how linguistic and real-world objects relate to one another.

Lexicon

[00213] A lexicon is a list of words, letters, numbers and phrases used in a natural language, such as English, that express meaning or facts or represent objects or phenomena. The lexicon consists of a list of lexical items, each a word or symbol or combination thereof. In the IKE interpreter, the lexicon is a gateway to the knowledge network.
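A toy sketch of the lexicon as a gateway: each lexical item fans out to the knowledge-network nodes it may denote (reflecting polysemy), and later disambiguation selects among them. The entries and node labels here are invented for illustration.

```python
lexicon = {
    "bank": ["node:river_bank", "node:financial_institution"],
    "run":  ["node:run_verb_move", "node:run_noun_sequence"],
}

def lookup(token):
    """Map a lexical item to the knowledge-network nodes it can denote;
    a polysemous entry fans out to several candidate nodes."""
    return lexicon.get(token.lower(), [])

print(lookup("Bank"))  # both candidate nodes; context must pick one
```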

Mutation

[00214] In a genetic algorithm, mutation is the process of altering, possibly randomly, the characteristics of a candidate solution or a part thereof during processing. The mutated result can then compete with other results for fitness as a solution.

Natural Language Processing

[00215] Natural language processing means using computers to analyze natural language input, such as English sentences, for the purpose of interpretation, paraphrasing or translation.

Neural

[00216] Of, resembling or having to do with the processing components and functions of the brain and/or its cells. Perceptual, inquisitive, communicative, interpretive, creative and decisive cognitive processes occur in the brain through the functioning of its network of neuron cells. Those processes are neural, and automated processes designed to resemble their structure and/or functions are often characterized as neural.

Polysemy

[00217] The linguistic phenomenon of multiple meanings applying to a single word, symbol or phrase.

Real-World Knowledge

[00218] Facts about phenomena and objects. In this document, real-world knowledge refers to information or data associations encoded in an ontology or knowledge graph in a meaningful or expressive way to represent facts in the world. Some facts describe the hierarchical relations between classes and subclasses of objects in the real world, such as “a dog is a canine in the animal kingdom”. Other facts describe causal relations, such as “gravity pulls physical objects toward itself”, and yet others describe constructive relations, such as “a knob is part of a door”.

Stochastic

[00219] Non-deterministic or “fuzzy” processing techniques and encoding approaches that deliver output from a process that uses statistical probabilities instead of simple true/false logic. In some stochastic processes it is virtually impossible to predict the output based on the inputs, because of the sheer number of permutations and/or the complexity of the weighting mechanisms and of the processes that adjust weights during the course of the process and prior to the output.

Token

[00220] A token is a discrete string of one or more symbols or characters that has a beginning, an end and unchanging content. If the content were to change through the addition, subtraction or modification of one or more of its characters, it would become a different token.
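As a small illustration, a whitespace tokenizer that yields tokens as immutable strings, consistent with the definition above: altering any character of a token's content produces a different token. The function is a generic sketch, not the tokenizer of the disclosed system.

```python
def tokenize(text):
    """Split input into discrete tokens; a token's content never changes,
    so modifying any character yields a different token."""
    return text.split()

print(tokenize("time flies like an arrow"))
# -> ['time', 'flies', 'like', 'an', 'arrow']
```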

[0027] The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the claims. As used in the description of the embodiments and the appended claims, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

[0028] As used herein, the term “if” may be construed to mean “when” or “upon” or “in response to determining” or “in accordance with a determination” or “in response to detecting” that a stated condition precedent is true, depending on the context. Similarly, the phrase “if it is determined [that a stated condition precedent is true]” or “if [a stated condition precedent is true]” or “when [a stated condition precedent is true]” may be construed to mean “upon determining” or “in response to determining” or “in accordance with a determination” or “upon detecting” or “in response to detecting” that the stated condition precedent is true, depending on the context.

[0029] The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated.