AUTOMATIC DETECTION OF INTENTION OF NATURAL LANGUAGE INPUT TEXT

Title:

AUTOMATIC DETECTION OF INTENTION OF NATURAL LANGUAGE INPUT TEXT

Document Type and Number:

WIPO Patent Application WO/2022/211737

Abstract:

According to one aspect of the invention, there is provided a platform for processing natural language input text, the platform configured to parse an input string into constituent components; categorise each of the constituent components through context with respect to other constituent components; identify, from the categorised constituent components, components that are activity related; validate a first hypothesis that determines which of the activity related components provides an intention of the input string from analysing the constituent components used to support the hypothesis, together with their placement relative to the activity-related component; identify, from the categorised constituent components, noun components; validate a second hypothesis that determines which of the noun components provides a target of the intention from analysing the constituent components used to support the second hypothesis, together with their placement relative to the noun component; and extract the activity related component that provides the intention of the input string and the noun component that provides the target of the intention, so as to construct the intention of the input string.

Inventors:

CHOY JUNYU (SG)

Application Number:

PCT/SG2022/050183

Publication Date:

October 06, 2022

Filing Date:

March 30, 2022

Assignee:

EMO TECH PTE LTD (SG)

International Classes:

G06F40/20; G06F40/30; G10L15/00

Foreign References:

US20200184307A1	2020-06-11
CN111984778A	2020-11-24
CN102789464A	2012-11-21
US10600419B1	2020-03-24
US20200012906A1	2020-01-09

Attorney, Agent or Firm:

BIRD & BIRD ATMD LLP (SG)

Claims:

CLAIMS

1. A platform for processing natural language input text, the platform configured to parse an input string into constituent components; categorise each of the constituent components through context with respect to other constituent components; identify, from the categorised constituent components, components that are activity related; validate a first hypothesis that determines which of the activity related components provides an intention of the input string from analysing the constituent components used to support the hypothesis, together with their placement relative to the activity-related component; identify, from the categorised constituent components, noun components; validate a second hypothesis that determines which of the noun components provides a target of the intention from analysing the constituent components used to support the second hypothesis, together with their placement relative to the noun component; and extract the activity related component that provides the intention of the input string and the noun component that provides the target of the intention, so as to construct the intention of the input string.

2. The platform of statement 1, wherein the categorisation is performed by a universal dependency treebank, which categorises based on interdependency of the constituent components.

3. The platform of statement 1 or 2, wherein the validation of the first hypothesis and the validation of the second hypothesis comprises analysing results of an application of rules that test the first hypothesis and the second hypothesis.

4. The platform of statement 3, wherein the rules used to test the first hypothesis comprise any one or more of determining whether there is a noun after the intention that is being validated; whether the intention that is being validated is subordinate to another intention; whether the input string has complementary intentions to the intention that is being validated; whether conjunctions follow the intention that is being validated; and whether the intention that is being validated is a first occurring verb in the input string.

5. The platform of statement 3 or 4, wherein the rules used to test the second hypothesis comprise any one or more of determining whether there is an intention before the noun that is being validated; and whether the noun that is being validated has been referenced by the constituent components in the input string.

6. The platform of any one of the statements 3 to 5, wherein the rules are grouped into two, wherein an application of the rules in either group results in an affirmative or a negative outcome.

7. The platform of any one of the preceding statements, wherein the input string comprises a phrase having a predefined minimum number of words.

8. The platform of any one of the preceding statements, wherein the input string comprises text tokenised into individual sentences found between two punctuation marks.

9. The platform of any one of the preceding statements, further configured to identify language of the input string before parsing into its constituent components.

10. The platform of any one of the preceding statements, further configured to classify the input string as either a question or an utterance.

11. The platform of any one of the preceding statements, further configured to map the extracted intention and the target of the intention to a matching input for an external application, to which the external application is able to recognise and respond.

Description:

Automatic detection of intention of natural language input text

FIELD

The present invention relates to processing of natural language input text to extract its intention through the use of, for example, semantic rules.

BACKGROUND

Chatbots are taking hold of the online world. Markets & Markets forecast chatbots to become a US$10bn industry, fuelled by demand for automation both in the covid and post-covid world.

Businesses favour them because they reduce customer service costs, increase sales leads and are available 24/7. Consumers also favour them because they are convenient, private (in the sense that they do not require human to human engagement) and easy to use.

Banks use chatbots to facilitate banking transactions, customer service requests and product fulfilment. E-commerce use them to offer 24/7 customer service, catering to shoppers who browse websites at night. Beauty companies rely on them to advise customers and generate sales leads.

Unfortunately, chatbots are not easy to implement. Users need bots to understand what they want (the intention) using natural language expressions. The industry has invested billions of dollars in advanced Natural Language Processing (NLP) technologies, using machine-learning processes that consume vast computational resources to parse, decode and interpret human intent.

Despite these investments, bots still struggle to understand human intent in the myriad ways that users express themselves. They need to be extensively trained and most organisations do not have the resources to fully support them. The result is that, for example, some chatbots employed by the Government fail to interpret human intent.

There is thus a need to develop a model that can be deployed with chatbots with minimal training effort, compared to typical machine-learning based NLPs. The benefit is lower development costs, faster time-to-market and significantly less effort to maintain and evolve the bot.

SUMMARY

Figure 1 is a block diagram of a text input type determination module present in a platform in accordance with an embodiment of the present invention.

Figure 2 is a block diagram of a parsing decision determination module present in a platform in accordance with an embodiment of the present invention.

Figure 3 is a block diagram of an intent determination module present in a platform in accordance with an embodiment of the present invention.

Figure 4 is a block diagram of an intent verification determination module present in a platform in accordance with an embodiment of the present invention.

OVERVIEW

This section provides an overview of several features of the present invention, as a preface to the detailed description and is not intended to delineate the scope of the present invention.

The present invention relates to intention identification of textual information input for determining a user's intention, which enables the appropriate actions to be taken. The intention refers to what an input text seeks to achieve, i.e. the objective of the input text, whereby the intention of the input text comprises an activity related component (in most instances, but not limited to, a verb) and a noun component (which is the object of the activity related component). “Components” when used in the context of input text refer to a word or a phrase extracted from the input text.

Extraction of one or more intentions from the textual inputs is done through semantical rules to detect a user's intention. Semantical rules are defined as rules that identify the specific nature of words, giving them meaning in the context of a particular sentence. The semantical rules are derived from the syntactical rules of the language of the input text. Syntactical rules define the nature and order of words used by a language for sentence construction, whereby detecting the presence of these rules allows for the sentence to be parsed and for its intention to be extracted.

Various types of languages consistently adhere to specific structures. Generally, languages follow either a S-O-V (Subject-Object-Verb) structure or a S-V-O (Subject-Verb-Object) structure. This underlying syntactic structure makes it possible to identify the subject and object for any sentence if the verb is identified. For example, in the case of English, which follows the S-V-O structure, the verb in the sentence "I love you." can be identified as "love", which according to the syntactical rules implies that "I" is the subject while "you" is the object. At the same time, languages adopt various syntactic rules to identify the sentence's nature actively. These unique characteristics of languages are harnessed to develop text parsing rules that achieve the purpose of automatically detecting the structure of text input, to determine semantic and lexical features that can identify intentions.
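
The S-V-O reasoning above can be sketched as follows. This is a minimal illustration that assumes the verb has already been identified and that the sentence is a simple clause; the function name `split_clause`, the whitespace tokenisation and the one-word-subject shortcut for S-O-V are assumptions made for illustration and are not part of the disclosed platform.

```python
def split_clause(sentence, verb, order="SVO"):
    """Return (subject, verb, object) for a simple clause, given its verb."""
    tokens = sentence.rstrip(".!?").split()
    idx = tokens.index(verb)  # raises ValueError if the verb is not present
    if order == "SVO":
        # Everything before the verb is the subject, everything after is the object.
        return " ".join(tokens[:idx]), verb, " ".join(tokens[idx + 1:])
    if order == "SOV":
        # Toy simplification: a one-word subject, then the object, then the verb.
        return tokens[0], verb, " ".join(tokens[1:idx])
    raise ValueError("order must be 'SVO' or 'SOV'")


subject, verb, obj = split_clause("I love you.", "love")
print(subject, verb, obj)
```

For the example sentence, the subject is "I" and the object is "you", matching the reading given in the text.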

The intention of an input text can be an action, idea or request. The identification of the action, idea, or request depends on identifying specific elements in the sentence. As with syntactic/semantic rules, there are other structures present in the sentence that identify the sentence's nature. For example, in the sentence "Could you please assist me?", the addition of "Please" and "Could" creates new semantic features that change the intention of the S-V-O structure of "You assist me." from an action to a request. To accomplish this step of identification, the system uses multiple decision rules to guide the parsing approach and to apply different semantic rules to, ultimately, identify the intention accurately. Identifying the intention accurately provides a means to positively engage with a provider of the input text.

DETAILED DESCRIPTION

The subject matter is now discussed with reference to the drawings, wherein like reference numbers are used to refer to like elements. As used in this application, the term “module” can refer to software, algorithm, hardware, a combination of hardware and software, process, or process in execution. One or more modules may reside within a process or hardware.

Herein disclosed is means to process natural language input to automatically extract (i.e. without user input) its objective, to return a result that meets the objective. The extraction is particularly advantageous for search, chatbot or similar applications that are unable to parse the natural language input in its original format. The extraction is done by an algorithm that analyses the natural language input to ascertain its semantic structure, identifies its intention and a target of the intention, i.e. the underlying objective of the input text and what a user seeks to achieve. Digital users have different needs under different contexts, resulting in them requesting various information with various intentions. Providing the right answer to the user can dramatically increase customer satisfaction.

This extraction algorithm may be hosted in a platform required to have natural language input text processing capability. The platform parses an input string into constituent components, whereby the input is separated into words or phrases that include noun(s), verb(s), connector word(s) like "is" and "are". The platform categorises each of the constituent components through context with respect to other constituent components. In one approach, placement of one constituent component with respect to another constituent component is used as basis for the categorisation. Each word in an input text is categorised as a noun, verb or connector by analysing its position and order compared to other words in a sentence. For example, the word "vote" in the sentence "I went to vote" is a verb. However, "vote" in the sentence "Every vote counts in an election" is a noun.
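
A toy illustration of categorisation by placement is given below. It assumes only two hypothetical cues, neither of which is stated in this specification: a word directly after "to" is treated as a verb, and a word directly after a determiner is treated as a noun. A real categoriser would need far richer context.

```python
# Hypothetical cue list, for illustration only.
DETERMINERS = {"a", "an", "the", "every", "each", "my", "your"}


def categorise(tokens, i):
    """Categorise tokens[i] from the word placed immediately before it."""
    previous = tokens[i - 1].lower() if i > 0 else ""
    if previous == "to":
        return "verb"   # e.g. "went to vote"
    if previous in DETERMINERS:
        return "noun"   # e.g. "every vote"
    return "unknown"    # the two toy cues do not cover this placement


print(categorise("I went to vote".split(), 3))
print(categorise("Every vote counts in an election".split(), 1))
```

The same surface word receives a different category in each sentence purely because of the word that precedes it.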

The platform then identifies, from the categorised constituent components, components that are activity related, such as components that convey the performance of an action, like verbs and adverbs. The platform validates a first hypothesis that determines which of the activity related components provides an intention of the input string from analysing the constituent components used to support the first hypothesis, together with their placement relative to the activity related component. This validation is required because each activity related component could possibly provide the intention of the input text. In one approach, a selected activity related component is tested to determine whether it conveys the intention of the input string, the testing involving the use of other constituent components (i.e. excluding the selected activity related component), factoring in their location compared to the selected activity related component. If the selected activity related component does not provide the intention, another activity related component is then tested.

The platform also identifies, from the categorised constituent components, noun components. Noun components are relevant for whether they provide a subject or an object in the input text and, when a noun component is an object, for whether it is the target of the intention of the input text. The platform validates a second hypothesis that determines which of the noun components provides a target of the intention from analysing the constituent components used to support the second hypothesis, together with their placement relative to the noun component. In one approach, a selected noun component is tested to determine whether it is the target of the intention (resulting from the validation of the first hypothesis), the testing involving the use of other constituent components (i.e. excluding the selected noun component), factoring in their location compared to the selected noun component. If the selected noun component is not the target of the intention, another noun component is then tested. As part of the second hypothesis validation, the platform may decide whether the input text follows a S-O-V (Subject-Object-Verb) or S-V-O (Subject-Verb-Object) syntactic structure, so as to identify the subject and object for any sentence for the already identified verb (i.e. the activity related component identified from the first hypothesis). The activity related component that provides the intention of the input string and the noun component that provides the target of the intention can both be extracted, for example, for use as input for external applications.

The platform for processing natural language input text to extract its intention and the target of the intention is described in greater detail below, with reference to Figures 1 to 4.

Figures 1 to 4 show block diagrams, each representative of a module present in a platform 150 for processing natural language input text, in accordance with an embodiment of the present invention. Each module may be a programmed module implemented by one or more processors executing instructions to perform its designated function, to allow the platform 150 to achieve its objective of extracting the intention of an input text and a target of the intention. The platform 150 may be part of a network of servers, not shown for the sake of simplicity, that are working in conjunction to achieve this objective and/or utilise the extracted intention and the target of the intention.

Figure 1 shows that the platform 150 comprises a text input type determination module 101 that receives an input string as text input 100 and outputs text input type 102. The input string may be a phrase preferably with a predefined minimum number of words. The text input type determination module 101 classifies the text input 100 into one of three types of textual inputs. The three types of textual inputs are identified as action, question and comments, so that the text input type determination module 101 essentially classifies an input string as either a question or an utterance. The classification process involves the active application of specialised syntactic/semantic rules developed to identify the textual inputs. These rules are provided by the user or pre-determined by the platform 150 or a system to which the platform 150 belongs.
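
The classification step can be pictured with the sketch below. The specification does not disclose the actual syntactic/semantic rules, so every rule here is a hypothetical stand-in: a trailing question mark marks a question, a small invented list of cue words marks an action, and anything else is a comment. The names `classify_input`, `ACTION_MARKERS` and `MIN_WORDS` are likewise assumptions.

```python
ACTION_MARKERS = {"want", "need", "please"}  # hypothetical cue words
MIN_WORDS = 2                                # hypothetical minimum length


def classify_input(text):
    """Classify text as 'question', 'action' or 'comment' using toy rules."""
    stripped = text.strip()
    words = stripped.rstrip(".!?").lower().split()
    if len(words) < MIN_WORDS:
        raise ValueError("input is shorter than the minimum number of words")
    if stripped.endswith("?"):
        return "question"
    if ACTION_MARKERS & set(words):
        return "action"
    return "comment"


for phrase in ("Where is my bill?", "I want to cancel my plan.", "The service was great."):
    print(phrase, "->", classify_input(phrase))
```

In this sketch, an action or a comment corresponds to what the text calls an utterance.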

Figure 2 shows that the platform 150 further comprises a parsing decision determination module 201 that analyses the text input type 102 (i.e. the output of the text input type determination module 101 providing a classification of the text input 100 into one of three types) to decide how to parse the input string into constituent components. The parsing decision determination module 201 determines the most appropriate parsing approach 202 that can facilitate the extraction of the intention of the input text 100. The determination of the most appropriate parsing approach 202, which also results in categorising each of the parsed constituent components through context with respect to other constituent components in the input string, is based on the placement of the core semantic feature of the language and the peripheral features of the language. The ordering and placement of the parsed constituent components in the input string will determine the core intention. The rules used by the parsing decision determination module 201 to decide on a parsing approach and the subsequent categorisation of the parsed constituent components are provided by the user or pre-determined by the platform 150 or a system to which the platform 150 belongs. In one implementation, the parsing decision determination module 201 uses a universal dependency treebank, which categorises based on interdependency of the constituent components, to determine how to categorise an input string into its constituent components.
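
To illustrate how dependency-based categorisation exposes activity related components and noun components, the sketch below uses a hand-annotated parse of "I want to cancel my plan" in the style of Universal Dependencies. The parse is written out by hand as an assumption for illustration; it is not the output of the platform 150 or of any particular parser, and the helper names are hypothetical.

```python
# (id, form, universal POS tag, head id, dependency relation); head 0 is the root.
PARSE = [
    (1, "I", "PRON", 2, "nsubj"),
    (2, "want", "VERB", 0, "root"),
    (3, "to", "PART", 4, "mark"),
    (4, "cancel", "VERB", 2, "xcomp"),
    (5, "my", "PRON", 6, "nmod:poss"),
    (6, "plan", "NOUN", 4, "obj"),
]


def components(parse):
    """Return the verb forms and the noun forms found in the parse."""
    verbs = [form for _, form, upos, _, _ in parse if upos == "VERB"]
    nouns = [form for _, form, upos, _, _ in parse if upos == "NOUN"]
    return verbs, nouns


def objects_by_verb(parse):
    """Map each head word to the word attached to it with the 'obj' relation."""
    forms = {token_id: form for token_id, form, _, _, _ in parse}
    return {forms[head]: form for _, form, _, head, rel in parse if rel == "obj"}


print(components(PARSE))
print(objects_by_verb(PARSE))
```

The interdependency between words, rather than their surface position alone, is what ties "plan" to "cancel" and not to "want" in this annotation.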

Figure 3 shows that the platform 150 further comprises an intention determination module 304, which includes a parsed output intention identification module 302. The parsing approach 202, chosen as explained with respect to Figure 2, is provided to the parsed output intention identification module 302. The parsed output intention identification module 302 then parses the input string, received as the text input 100 by the intention determination module 304, using the chosen parsing approach 202. This results in the input string being parsed into its constituent components, with each of the constituent components being categorised through context with respect to other constituent components. In addition, the parsed output intention identification module 302 applies a specific set of rules to parse the text input to produce initial intentions, i.e. activity related components and noun components are identified from the categorised constituent components. The rules are provided by the user or predetermined by the platform 150 or a system to which the platform 150 belongs. Once input parsing has been completed, the intention determination module 304 provides parsed intentions 303, being the identified activity related components and the identified noun components from the input string in the text input type 102.

Figure 4 shows that the platform 150 further comprises an intention verification determination module 404. The intention verification determination module 404 comprises an intention and object identification module 401 that receives parsed intentions 303 (i.e. the output from the intention determination module 304 of Figure 3); and an intention and object coherence determination module 402. The intention and object coherence determination module 402 utilises a specific set of rules to identify the intention and object relationship, from the activity related components and the identified noun components present in the parsed intentions 303, to determine the correct intention and a target of the correct intention. The rules are provided by the user or predetermined by the platform 150 or a system to which the platform 150 belongs.

In some situations, the parsed intentions 303 have multiple possible intentions, especially in the case where the parsed intentions 303 are derived from a complex statement having several activity related components (such as verbs and verb extensions, like gerunds) and several noun components (which can serve as an object to the activity related components). The intention verification determination module 404 tests these activity related components (i.e. the multiple intentions) by validating a first hypothesis that determines which of them provides an intention of the input string, from analysing the constituent components used to support the first hypothesis together with their placement relative to the activity related component that is being tested. The intention verification determination module 404 also validates a second hypothesis that determines which of the noun components provides a target of the intention from analysing the constituent components used to support the second hypothesis, together with their placement relative to the noun component.

A sequential approach may be adopted during the validation of the two hypotheses. If the validation of the first hypothesis returns a definitive outcome, the selected activity related component that was tested, along with the constituent components tested together with the selected activity related component, form the intention of the input string. There is then no requirement to validate the second hypothesis. If both hypotheses require validation, the first hypothesis is validated first, followed by the second hypothesis. In one approach, the intention verification determination module 404 performs the validation of the first hypothesis and the second hypothesis by checking, for each activity related component, coherence with a noun component in the parsed intentions 303. The checking of coherence is achieved through use of order and placement of semantic features. Identifying the location of the specific feature and its relative position to other features identifies the right intentions from the several possible intentions.

In other situations, there are ambiguous cases where the word placements are incorrect, leading to parsing of one or more phrases with an unclear purpose (i.e. the meaning of each such phrase is not readily discernible), each presenting a possible intention from which a correct intention for the input string has to be verified. To deal with this situation, the intention determination module 304 performs the validation of the first hypothesis and the second hypothesis by utilising multiple rules to test the most appropriate purpose for each such phrase. The correctly identified purposes are then tested to verify which provides the correct intention for the input string. The rules test various hypotheses of the sentence structure to evaluate the right intention.

Once intention verification has been completed, the intention and object coherence determination module 402 outputs coherent intentions 403, being the activity related component that provides the intention of the input string and the noun component that provides the target of the intention. The coherent intentions 403 may be exported, such as to map the intention and the target of the intention to a matching input for an external application, to which the external application can recognise and respond.
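
The export step can be sketched as a simple lookup from an (intention, target) pair to an input that an external application recognises. The command names and the table `EXTERNAL_INPUTS` are invented for illustration; the specification does not name any external application or its inputs.

```python
# Hypothetical table of inputs recognised by an external application.
EXTERNAL_INPUTS = {
    ("cancel", "plan"): "CANCEL_SUBSCRIPTION",
    ("upgrade", "plan"): "UPGRADE_SUBSCRIPTION",
}


def to_external_input(intention, target):
    """Return the matching external input, or None if there is no match."""
    return EXTERNAL_INPUTS.get((intention.lower(), target.lower()))


print(to_external_input("Cancel", "plan"))
print(to_external_input("vote", "primaries"))
```

Returning None for an unmatched pair leaves the caller free to decide how to respond, for example by asking the user to rephrase.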

In one implementation, the validation of the first hypothesis and the validation of the second hypothesis involves analysing results of an application of rules that test the first hypothesis and the second hypothesis. The rules used to test the first hypothesis may include any one or more of determining whether there is a noun after the intention that is being validated (e.g. the noun “polls” after the intention “habit to vote early in the primaries”); whether the intention that is being validated is subordinate to another intention (e.g. the relationship of the intention “polls open on weekend” to the intention “habit to vote early in the primaries”); whether the input string has complementary intentions to the intention that is being validated (e.g. “arrive at the polls station when it opens”, which is complementary to “habit to vote early in the primaries”); whether conjunctions follow the intention that is being validated (e.g. “and” in the phrase “and arrive at the polls station when it opens” after “habit to vote early in the primaries”); and whether the intention that is being validated is a first occurring verb in the input string (e.g. is “vote” in “habit to vote early in the primaries” the first verb in the input string?). The rules used to test the second hypothesis may include any one or more of determining whether there is an intention before the noun that is being validated (e.g. “vote” before “primaries” in the intention “habit to vote early in the primaries”); and whether the noun that is being validated has been referenced by the constituent components in the input string (e.g. “election day”).

The rules used for the validation of the first hypothesis and the validation of the second hypothesis may be categorised into two groups, those whose application results in an affirmative outcome and those whose application results in a negative outcome. To successfully validate either or both of the first or second hypothesis, the rules belonging to the two groups may have to be applied in the affirmative or negative accordingly.

For instance, to validate the first hypothesis, the rule of whether there is a noun component after an activity related component has to be positive; the rule of whether the intention that is being validated is subordinate to another intention has to be negative; the rule of whether there are complementary intentions to the intention that is being validated has to be positive; the rule of whether conjunctions follow the intention that is being validated has to be positive; and the rule of whether the intention that is being validated is a first occurring verb in the input string has to be positive.

On the other hand, to validate the second hypothesis, the rule of whether there is an intention before the noun that is being validated has to be negative; and the rule of whether the noun that is being validated has been referenced by the constituent components has to be positive.
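
The expected outcomes listed above can be captured in a small table and checked against observed rule results, as in the sketch below. The rule identifiers and the `validate` helper are hypothetical names; the sketch also assumes that only the rules actually applied need to match, since any one or more of the rules may be used.

```python
# Expected outcome per rule: True = affirmative, False = negative.
FIRST_HYPOTHESIS_EXPECTED = {
    "noun_after_intention": True,
    "subordinate_to_another_intention": False,
    "has_complementary_intentions": True,
    "conjunction_follows_intention": True,
    "is_first_occurring_verb": True,
}
SECOND_HYPOTHESIS_EXPECTED = {
    "intention_before_noun": False,
    "noun_referenced_by_components": True,
}


def validate(observed, expected):
    """Return True if at least one known rule was applied and all applied rules match."""
    applied = {rule: outcome for rule, outcome in observed.items() if rule in expected}
    return bool(applied) and all(expected[rule] == outcome for rule, outcome in applied.items())


observed_first = {"noun_after_intention": True, "subordinate_to_another_intention": False}
print(validate(observed_first, FIRST_HYPOTHESIS_EXPECTED))
```

Keeping the expected outcomes in data rather than in branching logic makes the two groups (affirmative and negative) easy to inspect and to extend.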

The input string may comprise text tokenised into individual sentences found between two punctuation marks. In addition, the platform 150 may be further configured to identify language of the input string before parsing into its constituent components.
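
Tokenising text into sentences bounded by punctuation marks can be sketched with a regular expression. The choice of full stop, exclamation mark and question mark as the delimiting punctuation is an assumption; the specification does not list which marks are used.

```python
import re


def split_sentences(text):
    """Split text into sentences, each ending at '.', '!' or '?' if present."""
    pieces = re.findall(r"[^.!?]+[.!?]?", text)
    # Drop surrounding whitespace and discard empty fragments.
    return [piece.strip() for piece in pieces if piece.strip()]


print(split_sentences("I want to cancel my plan. Can you help me?"))
```

Each resulting sentence can then be passed to the platform as a separate input string.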

The working of the platform 150 is described below with reference to a simple input phrase, along with an illustration of a selection of the rules used in determining the intention of the input phrase.

The input phrase is "I want to cancel my plan.". The phrase is received by the platform 150 as text input 100. Text input 100 is processed by the text input type determination module 101 to determine the classification of the input phrase. To perform the classification, the text input type determination module 101 may apply the most straightforward rule to identify the position of the verbs. The first rule of the S-V-O structure is applied to identify the placement of subject, verb and object. From the sentence parsing, the verbs “want” and “cancel” are found to be between the subject “I” and the object “plan”. Hence the text input type determination module 101 outputs the text input type 102, indicating that the input phrase is classified as a request.

The text input type 102, specifying the classification of the input phrase, is provided to the parsing decision determination module 201. As the input phrase is a request, the parsing decision determination module 201 will then select a parsing rule which is based on V - O. The intention of the input phrase is determined through the verb - object rule. The selected parsing rule is output as parsing approach output 202.

The input phrase and the selected parsing rule are respectively received as text input 100 and parsing approach output 202 to the intent determination module 304. The selected parsing rule from the parsing approach output 202 is applied to extract the intention from the text input 100. Using the V-O rule, two possible intentions are identified and output as parsed intentions 303. The first possible intention is "Want - Plan" and the second possible intention is "Cancel - Plan". The parsed intentions 303 form the input to the intention verification determination module 404. The intention and object coherence determination module 402 uses coherence determination rules and hypotheses to determine which of the possible intentions in the parsed intentions 303 is the actual intention of the input phrase. The coherence determination uses several hypotheses to verify the intention. A possible first hypothesis is V-C-V (verb - conjunction - verb). This first hypothesis assumes that the verbs are linked to one another and thus they will reinforce each other. In this case, "Want - to - Cancel" fits the hypothesis, which is then reinforced by a possible second hypothesis, "V - C - V - O", that identifies "Cancel - my Plan" as the verified intention. This then produces the verified intention of the input phrase as "Cancel - my plan", output as coherent intentions 403.
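
The worked example can be traced with the sketch below. It assumes a tiny hand-written lexicon in place of real categorisation, treats "to" as the linking word C in the V-C-V pattern, and falls back to the first candidate when no V-C-V pattern is found; all of these, along with the function names, are illustrative assumptions rather than the disclosed rules.

```python
# Hypothetical lexicon standing in for the categorisation step.
VERBS = {"want", "cancel"}
NOUNS = {"plan"}
CONNECTORS = {"to"}


def tokenize(text):
    return [token.lower() for token in text.rstrip(".!?").split()]


def parse_vo(tokens):
    """V-O rule: pair each verb with the first noun that follows it."""
    pairs = []
    for i, token in enumerate(tokens):
        if token in VERBS:
            obj = next((t for t in tokens[i + 1:] if t in NOUNS), None)
            if obj is not None:
                pairs.append((token, obj))
    return pairs


def verify(tokens, candidates):
    """V-C-V then V-C-V-O: prefer the second verb of a linked verb pair."""
    for i in range(len(tokens) - 2):
        first, link, second = tokens[i:i + 3]
        if first in VERBS and link in CONNECTORS and second in VERBS:
            for verb, obj in candidates:
                if verb == second:
                    return verb, obj
    # Assumed fallback when no linked verb pair exists.
    return candidates[0] if candidates else None


tokens = tokenize("I want to cancel my plan.")
parsed = parse_vo(tokens)
print(parsed)
print(verify(tokens, parsed))
```

The V-O step yields the two candidates "want - plan" and "cancel - plan", and the verification step selects "cancel - plan", mirroring the parsed intentions 303 and coherent intentions 403 described above.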

While this invention has been described with reference to exemplary embodiments, it will be understood by those skilled in the art that various changes can be made and equivalents may be substituted for elements thereof, without departing from the spirit and scope of the invention. In addition, modification may be made to adapt the teachings of the invention to situations and materials, without departing from the essential scope of the invention. Thus, the invention is not limited to the examples that are disclosed in this specification, but encompasses all embodiments falling within the scope of the appended claims.
