PROVIDE KNOWLEDGE ANSWERS FOR KNOWLEDGE-INTENTION QUERIES

Document Type and Number:

WIPO Patent Application WO/2021/257178

Abstract:

The present disclosure proposes methods and apparatuses for providing knowledge answers for knowledge-intention queries. A query may be obtained. It may be identified whether the query is a knowledge-intention query. In response to identifying that the query is a knowledge-intention query, a knowledge answer for the knowledge-intention query may be determined from a corpus, the corpus being established based on at least one reference document. A response for the knowledge-intention query may be provided, the response comprising the knowledge answer.

Inventors:

TIAN YANG (US)
ZHENG WEI (US)
CHEN PENG (US)

Application Number:

PCT/US2021/028517

Publication Date:

December 23, 2021

Filing Date:

April 22, 2021

Assignee:

MICROSOFT TECHNOLOGY LICENSING LLC (US)

International Classes:

G06F16/332

Domestic Patent References:

WO2019051845A1	2019-03-21

Foreign References:

US20190236142A1	2019-08-01
US20200005118A1	2020-01-02
US20200036659A1	2020-01-30

Attorney, Agent or Firm:

SWAIN, Cassandra T. et al. (US)

Claims:

CLAIMS

1. A method for providing knowledge answers for knowledge-intention queries, comprising: obtaining a query; identifying whether the query is a knowledge-intention query; in response to identifying that the query is a knowledge-intention query, determining a knowledge answer for the knowledge-intention query from a corpus, the corpus being established based on at least one reference document; and providing a response for the knowledge-intention query, the response comprising the knowledge answer.

2. The method of claim 1, wherein the determining a knowledge answer comprises: selecting at least one candidate text item for the knowledge-intention query from the corpus; ranking the at least one candidate text item; and selecting the knowledge answer from the at least one ranked candidate text item.

3. The method of claim 1, wherein the identifying whether the query is a knowledge-intention query comprises: classifying the query as a knowledge-intention query or a non-knowledge-intention query at least through a classifier.

4. The method of claim 1, wherein the corpus comprises a plurality of text items, each text item comprising a single sentence, a combination of sentences, a single paragraph, or a combination of paragraphs extracted from a corresponding reference document.

5. The method of claim 4, wherein the combination of sentences comprises two or more consecutive sentences, and the combination of paragraphs comprises a combination of relevant sentences from two or more paragraphs.

6. The method of claim 1, wherein the at least one reference document is associated with at least one target domain.

7. The method of claim 2, wherein the selecting at least one candidate text item comprises: searching out the at least one candidate text item from an index set corresponding to the corpus based on the knowledge-intention query.

8. The method of claim 2, wherein the ranking the at least one candidate text item comprises: ranking the at least one candidate text item at least through a ranker.

9. The method of claim 2, wherein the selecting the knowledge answer comprises: selecting a highest-ranked candidate text item in the at least one ranked candidate text item as the knowledge answer.

10. The method of claim 9, wherein the selecting a highest-ranked candidate text item as the knowledge answer comprises: determining whether a ranking score of the highest-ranked candidate text item exceeds a predetermined threshold; and in response to determining that the ranking score exceeds the predetermined threshold, selecting the highest-ranked candidate text item as the knowledge answer.

11. The method of claim 1, wherein the response is provided in a form of information card, the information card comprising the knowledge answer.

12. The method of claim 11, wherein the information card further comprises at least one of: a title of a reference document corresponding to the knowledge answer; an image in the reference document; and a link to the reference document.

13. The method of claim 1, further comprising: in response to identifying that the query is not a knowledge-intention query, providing a response to the query in a chitchat mode.

14. An apparatus for providing knowledge answers for knowledge-intention queries, comprising: a query obtaining module, for obtaining a query; a knowledge-intention query identifying module, for identifying whether the query is a knowledge-intention query; a knowledge answer determining module for, in response to identifying that the query is a knowledge-intention query, determining a knowledge answer for the knowledge-intention query from a corpus, the corpus being established based on at least one reference document; and a response providing module, for providing a response for the knowledge-intention query, the response comprising the knowledge answer.

15. An apparatus for providing knowledge answers for knowledge-intention queries, comprising: at least one processor; and a memory storing computer-executable instructions that, when executed, cause the at least one processor to: obtain a query, identify whether the query is a knowledge-intention query, in response to identifying that the query is a knowledge-intention query, determine a knowledge answer for the knowledge-intention query from a corpus, the corpus being established based on at least one reference document, and provide a response for the knowledge-intention query, the response comprising the knowledge answer.

Description:

PROVIDE KNOWLEDGE ANSWERS FOR KNOWLEDGE-INTENTION QUERIES

BACKGROUND

[0001] Artificial Intelligence (AI) chatbots are becoming more and more popular and are being applied in more and more scenarios. A chatbot is designed to simulate human utterances and may chat with a user through text, speech, images, etc. In general, a chatbot may identify language content in a message input by a user or apply natural language processing to the message, and then provide the user with a response to the message.

SUMMARY

[0002] This Summary is provided to introduce a selection of concepts that are further described below in the Detailed Description. It is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

[0003] Embodiments of the present disclosure propose methods and apparatuses for providing knowledge answers for knowledge-intention queries. A query may be obtained. It may be identified whether the query is a knowledge-intention query. In response to identifying that the query is a knowledge-intention query, a knowledge answer for the knowledge-intention query may be determined from a corpus, the corpus being established based on at least one reference document. A response for the knowledge-intention query may be provided, the response comprising the knowledge answer.

[0004] It should be noted that the above one or more aspects comprise the features hereinafter fully described and particularly pointed out in the claims. The following description and the drawings set forth in detail certain illustrative features of the one or more aspects. These features are only indicative of the various ways in which the principles of various aspects may be employed, and this disclosure is intended to include all such aspects and their equivalents.

BRIEF DESCRIPTION OF THE DRAWINGS

[0005] The disclosed aspects will hereinafter be described in conjunction with the appended drawings that are provided to illustrate and not to limit the disclosed aspects.

[0006] FIG.1 illustrates an exemplary application scenario of providing knowledge answers for knowledge-intention queries according to an embodiment.

[0007] FIG.2 illustrates an exemplary process of establishing a corpus according to an embodiment.

[0008] FIG.3 illustrates an exemplary process of responding to a knowledge-intention query according to an embodiment.

[0009] FIG.4 illustrates an exemplary process of determining a knowledge answer according to an embodiment.

[0010] FIG.5 illustrates an exemplary chat session window according to an embodiment.

[0011] FIG.6 illustrates a flowchart of an exemplary method for providing knowledge answers for knowledge-intention queries according to an embodiment.

[0012] FIG.7 illustrates an exemplary apparatus for providing knowledge answers for knowledge-intention queries according to an embodiment.

[0013] FIG.8 illustrates an exemplary apparatus for providing knowledge answers for knowledge-intention queries according to an embodiment.

DETAILED DESCRIPTION

[0014] The present disclosure will now be discussed with reference to several example implementations. It is to be understood that these implementations are discussed only for enabling those skilled in the art to better understand and thus implement the embodiments of the present disclosure, rather than suggesting any limitations on the scope of the present disclosure.

[0015] In general, a chatbot may chat automatically in a session with a user. A "session" may refer to a time continuous conversation between chat participants, and may comprise messages and responses in the conversation. A "message" may refer to any information input by a user, e.g., a query from a user, an opinion of a user, an answer of a user to a chatbot's response, etc. The term "message" and the term "query" may also be used interchangeably. A "response" may refer to any information provided by a chatbot, which comprises, e.g., an answer of a chatbot to a user's query, a comment of a chatbot, a topic proposed by a chatbot, etc.

[0016] Most of the existing chatbots are constructed, based on training data in a general domain, for performing chitchat with users, wherein the chitchat may also be referred to as free chat, pure chat, etc. In some scenarios, a chatbot may adopt traditional query-answering (QA) techniques for answering a user query in an automated chatting. Usually, a chatbot may only give answers to general queries, and the approach of retrieving answers is only based on simple matching between query texts and candidate answer texts.

[0017] Embodiments of the present disclosure propose to provide knowledge answers for knowledge-intention queries. Herein, a knowledge-intention query may refer to a query having a clear intention, for which an informative knowledge answer is expected. In an aspect, a knowledge-intention query is a query involving knowledge acquisition, which is intended to obtain the knowledge or information concerned. In another aspect, a knowledge-intention query has a clear intention, e.g., the query clearly expresses what kind of knowledge is desired. In some cases, a user may desire to obtain knowledge of interest about a specific domain, topic, or entity; therefore, a query proposed by the user in a chat with a chatbot may be a knowledge-intention query. To improve the chatting or service experience of a user, it is important for a chatbot to be able to identify a knowledge-intention query and provide a corresponding answer. The embodiments of the present disclosure propose implementations specifically for responding to queries of this knowledge-intention type. Herein, a knowledge answer may refer to an accurate and clear answer to a knowledge-intention query, which comprises knowledge or information expected by the knowledge-intention query.

[0018] In an aspect, the embodiments of the present disclosure may pre-establish a corpus for retrieving knowledge answers based on reference documents. A reference document may be a document containing a large amount of knowledge information. Text items extracted from the reference documents may be added into the corpus. Herein, a text item may refer to a corpus data item stored in the corpus. The extracted text items may cover one or more sentences, one or more paragraphs, etc. in the reference documents. Therefore, answers determined from these text items in the corpus will also be informative, e.g., including a plurality of sentences or a plurality of paragraphs, so that these answers will be more detailed and concrete. Moreover, when extracting the text items from the reference documents, specific processing rules may be considered, so as to further improve relevance and completeness of content in each text item.

[0019] In an aspect, the embodiments of the present disclosure propose effective identification processing for a knowledge-intention query, so as to trigger determination of an answer to the knowledge-intention query. A pre-established classifier may be used for efficiently and accurately performing knowledge-intention query identification.

[0020] In an aspect, the embodiments of the present disclosure propose accurate ranking processing for candidate text items in a corpus. The candidate text items may be ranked through a pre-established ranker, so as to accurately select knowledge answers.

[0021] In an aspect, the embodiments of the present disclosure propose a more effective and reasonable response presenting form. For example, a response containing a knowledge answer may be provided in the form of a predefined information card. This presentation approach helps guide a user to further obtain context information relevant to the knowledge answer, e.g., by accessing a corresponding reference document, thereby enhancing service usage by users.

[0022] According to the embodiments of the present disclosure, it may be effectively identified whether a user query has a clear knowledge acquisition intention, and an accurate and comprehensive knowledge answer may be provided if the user query is a knowledge-intention query. Thus, service experience by users may be significantly improved.

[0023] FIG.1 illustrates an exemplary application scenario 100 of providing knowledge answers for knowledge-intention queries according to an embodiment.

[0024] It is assumed that a user 110 may access a target service 122 through a terminal device 120. The terminal device 120 may be any type of electronic computing device that is capable of accessing service platforms, servers or websites on the network, and capable of processing data or signals. For example, the terminal device 120 may be a desktop computer, a notebook computer, a tablet computer, a smart phone, an AI terminal, etc.

[0025] The target service 122 may be a service provided by a dedicated application or client installed on the terminal device 120, or may be a web service accessible through an application such as a browser on the terminal device 120. In the application scenario 100, the target service 122 is a service capable of providing an automated chatting function.

[0026] A target service platform 130 may be connected to the terminal device 120 through the network, for providing the target service 122 to the user 110. The target service platform 130 may be various network elements such as a server, a website, etc., used for running, supporting, maintaining, or managing the target service 122.

[0027] It is assumed that the target service platform 130 provides an automated chatting function in the target service 122 through interacting and cooperating with a chatbot server 140. In this case, a chatbot or an electronic conversation agent supported by the chatbot server 140 may be embedded in the target service 122, so that the user 110 may chat with the chatbot when accessing the target service 122. Exemplarily, in the case that the target service 122 is a service related to a social networking application, the chatbot may act as a virtual character or a target character to chat with the user 110. For example, the chatbot may automatically chat with the user 110 on behalf of an operator of an official account within the official account of a social networking application. Exemplarily, in the case that the target service 122 is a service related to e-commerce, the chatbot may act as a virtual customer service staff of a store to chat with the user 110. It should be understood that the embodiments of the present disclosure are not limited to the above exemplary target services, but may cover any other service that provides an automated chatting function.

[0028] The target service platform 130 may have a large number of service reference documents 132 associated with the target service 122. For example, when the target service 122 is an official account related to sports content in a social networking application, an operator of the official account may provide a large number of reference documents about sports content. For example, when the target service 122 is an e-commerce website, the e-commerce website may have a large number of descriptive documents about commodities.

[0029] In order to enable the chatbot to provide knowledge answers to knowledge-intention queries, the chatbot server 140 may obtain the service reference documents 132, and extract text items from the service reference documents 132 to form a corpus. Since the service reference documents 132 contain knowledge information about the target service 122, answers provided based on text items in the corpus will also be knowledge answers. For example, when the user 110 proposes a knowledge-intention query about the target service in the target service 122, the chatbot may retrieve a knowledge answer from the corpus containing knowledge information relevant to the target service. Moreover, the chatbot server 140 may also support a conventional chitchat function, so that it is feasible to chat with the user 110 in a chitchat mode even when a query proposed by the user 110 is not a knowledge-intention query.

[0030] It should be understood that all the network entities shown in FIG.1 are exemplary, and according to specific application requirements and designs, any other network entities may be involved in the application scenario 100. Moreover, the embodiments of the present disclosure are not limited to the application scenario 100 in FIG.1, but may cover any other scenarios in which the embodiments of the present disclosure for providing knowledge answers for knowledge-intention queries can be applied. For example, instead of adopting reference documents specific to a target domain to which the target service belongs, a wider range of reference documents in a plurality of target domains may be adopted for establishing a corpus. For example, instead of accessing the target service 122, the user 110 may access a chatbot service client in the terminal device 120, and the chatbot service client may be configured for providing knowledge answers for knowledge-intention queries from the user 110. For example, in addition to the scenarios in which a chatbot is used for conducting real-time conversations or real-time automated chatting, the embodiments of the present disclosure may also be applied to any other non-real-time question-answering scenarios, e.g., providing knowledge answers to users' knowledge-intention queries in a knowledge question-answering website, etc. Therefore, it should be understood that the automated chatting scenario involved in the following discussion is only exemplary, and the embodiments of the present disclosure are not limited to the automated chatting scenario.

[0031] FIG.2 illustrates an exemplary process 200 of establishing a corpus according to an embodiment.

[0032] A large number of reference documents 220 may be obtained from a data source 210. The data source 210 may comprise various information sources capable of providing reference documents, e.g., the target service platform 130 in FIG.1, a content providing platform or website on the network, a partner or user of a chatbot service, etc. The reference documents 220 may be associated with a specific target domain. For example, the reference documents 220 may be associated with a specific topic such as sports, music, movie, etc., may be associated with a specific entity such as a product, a product manufacturer, a product seller, etc., and so on. Moreover, the reference documents 220 may also relate to a wider range of multiple target domains, or relate to a general domain.

[0033] Optionally, the reference documents 220 may be converted into normalized reference documents 230. In some cases, the reference documents 220 obtained from the data source 210 may have different formats, e.g., document format, URL format, etc. The reference documents of different formats may be converted into a specific normalized format through predetermined normalization rules, e.g., a file format including title, text, picture set, etc.

[0034] At 240, text item extraction may be performed on the reference documents 220 or the normalized reference documents 230. A plurality of text items may be extracted from each reference document. Each extracted text item may cover one or more sentences, one or more paragraphs, etc. in a corresponding reference document. For example, each text item may comprise any one of a single sentence, a combination of sentences, a single paragraph, and a combination of paragraphs extracted from a corresponding reference document. The combination of sentences may comprise two or more consecutive sentences in the reference document. The combination of paragraphs may comprise a combination of related sentences from two or more paragraphs in the reference document. In an implementation, when extracting text items from a reference document, predetermined processing rules 242 may be considered, so as to further improve relevance and completeness of content in each text item. In one case, the processing rules 242 may comprise performing context-based coreference resolution when extracting a combination of sentences as a text item. For example, when a first sentence in a combination of sentences contains a person "Michael Jordan", and a second sentence immediately after the first sentence contains a pronoun "he", the pronoun "he" may actually refer to the person in the first sentence; thus, the "he" in the second sentence may be replaced with "Michael Jordan" in an extracted text item. In one case, the processing rules 242 may comprise some heuristic rules. For example, some reference documents may recite different aspects of a topic in multiple paragraphs, and thus it may be attempted to extract one or more sentences from each of these paragraphs, to form a combination of paragraphs about the topic and take it as a text item. In one case, a first sentence of each of these paragraphs may be a summary of content in that paragraph, thus a first sentence of each paragraph may be extracted to form a combination of paragraphs. In one case, some important representative sentences may be highlighted (e.g., bold, italicized, underlined, etc.) in these paragraphs, thus these highlighted sentences may be extracted to form a combination of paragraphs. It should be understood that the processing rules 242 are not limited to the above examples, but may include any other rules that help enhance relevance and completeness of content of text items.
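The text item extraction at 240 can be illustrated with a minimal Python sketch. It assumes a reference document has already been normalized into a list of paragraph strings, uses a naive punctuation-based sentence splitter, and applies only the "first sentence of each paragraph" heuristic; coreference resolution and highlighted-sentence extraction are omitted. The function names and the `max_window` parameter are hypothetical and do not come from the disclosure.

```python
import re
from typing import List


def split_sentences(paragraph: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def extract_text_items(paragraphs: List[str], max_window: int = 2) -> List[str]:
    """Extract single sentences, consecutive-sentence combinations,
    single paragraphs, and one heuristic combination of paragraphs."""
    items: List[str] = []
    for paragraph in paragraphs:
        sentences = split_sentences(paragraph)
        # Single sentences.
        items.extend(sentences)
        # Combinations of 2..max_window consecutive sentences.
        for size in range(2, max_window + 1):
            for start in range(len(sentences) - size + 1):
                items.append(" ".join(sentences[start:start + size]))
        # The whole paragraph as one text item.
        items.append(paragraph.strip())
    # Heuristic combination of paragraphs: first sentence of each paragraph.
    if len(paragraphs) >= 2:
        firsts = [s[0] for s in map(split_sentences, paragraphs) if s]
        items.append(" ".join(firsts))
    # Drop duplicates while preserving order.
    return list(dict.fromkeys(items))


paragraphs = [
    "Warm up first. Stretch gently.",
    "Wear protective gear. Replace worn shoes.",
]
items = extract_text_items(paragraphs)
```

In this sketch a two-sentence paragraph and its two-sentence combination are the same string, so the final de-duplication step keeps only one copy.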

[0035] Assuming that a plurality of text items 250 are extracted through the text item extraction at 240, these text items may be included into a corpus 260. Optionally, the process 200 may also establish an index set 270 corresponding to the corpus 260, so as to improve execution efficiency of subsequent retrieval of answers from the corpus 260. For example, text items in the corpus 260 may be indexed through the inverted index technique, and the index set 270 may be established accordingly.
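As a rough illustration of the index set 270, the following Python sketch builds a plain in-memory inverted index that maps each term to the identifiers of the text items containing it. The tokenizer, the use of list positions as text item identifiers, and the function names are assumptions made for this example; a production system would more likely rely on an existing indexing service.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphanumeric runs as terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index_set(corpus: List[str]) -> Dict[str, Set[int]]:
    """Map each term to the set of text item ids (list positions) containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for item_id, text_item in enumerate(corpus):
        for term in tokenize(text_item):
            index[term].add(item_id)
    return dict(index)


def lookup(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return ids of text items sharing at least one term with the query."""
    ids: Set[int] = set()
    for term in tokenize(query):
        ids |= index.get(term, set())
    return ids


corpus = [
    "Warm up before exercise.",
    "Wear protective gear.",
    "Exercise posture matters.",
]
index = build_index_set(corpus)
```

Looking up a query only touches the posting sets of the query terms, which is why an index of this kind can speed up later retrieval compared with scanning every text item.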

[0036] It should be understood that all the steps and the sequence thereof in the process 200 are exemplary, and the embodiments of the present disclosure will cover any modification to the process 200.

[0037] FIG.3 illustrates an exemplary process 300 of responding to a knowledge-intention query according to an embodiment. It is assumed that the process 300 is performed in a chat session between a user and a chatbot.

[0038] A query 310 may be obtained from the user. For example, the query 310 may be provided in the chat session by the user.

[0039] At 320, it may be identified whether the query is a knowledge-intention query, in order to determine whether to trigger a subsequent knowledge answer determination processing. If the query 310 is identified as a non-knowledge-intention query at 320, the chatbot may enter a chitchat mode at 330. Accordingly, the chatbot may adopt any known chitchat responding mechanism to provide a response to the query 310. If the query 310 is identified as a knowledge-intention query at 320, execution of a subsequent knowledge answer determination processing may be triggered.

[0040] In an implementation, at 320, it may be identified whether the query 310 is a knowledge-intention query through a classification technique. For example, a classifier for a knowledge-intention query classification task may be pre-established, and the query 310 may be classified as a knowledge-intention query or a non-knowledge-intention query with the classifier. The classifier may be pre-established through a variety of different approaches. In an exemplary approach, a lightweight model such as a gradient boosting decision tree (GBDT) may be adopted in the classifier, and the classifier may be fine-tuned by, e.g., a bidirectional encoder representation from transformers (BERT) model to effectively enhance the performance of the classifier. A labeled dataset of knowledge-intention queries may be prepared in advance for use in the fine-tuning. For example, a large number of chat records may be collected as raw data, and messages or queries from users in these chat records may be labeled as to whether they are knowledge-intention queries. It should be understood that the embodiments of the present disclosure may also adopt any model other than BERT to fine-tune the classifier.

[0041] In the knowledge-intention query identification at 320 or in the preparation of the labeled dataset, judgment criteria about whether a query is a knowledge-intention query may be defined based on attributes specific to knowledge-intention queries, e.g., knowledge related, clear intention, etc. As an example, a query "What is the path to register a member" has a clear intention to learn about knowledge of member registration path, a query "How to avoid sports injury" has a clear intention to learn about knowledge of measures to avoid sports injury, and a query "What are the advantages of these shoes" has a clear intention to learn about knowledge of product characteristics. All the above queries are related to specific knowledge and have clear intentions, thus they may all be regarded as knowledge-intention queries. A query that lacks knowledge acquisition intention, e.g., greetings, etc., may be regarded as a non-knowledge-intention query. In one situation, a sentence involving a knowledge-intention query is not limited to an interrogative sentence. Taking a sentence "Pay service fees online" as an example, although this sentence is not an interrogative sentence, it actually expresses a clear intention to obtain knowledge about how to pay service fees online, thus this sentence may be regarded as a knowledge-intention query. In one situation, a message related to seeking resources may be regarded as a non-knowledge-intention query. For example, messages related to seeking pictures, music, movies, e-books, download links, etc. are not intended to obtain knowledge information, thus they are non-knowledge-intention queries. In one situation, if only one noun phrase is included in a message, it may be impossible to determine a clear intention of the message, thus this message is a non-knowledge-intention query. Taking a message "Italian cheese pizza" as an example, it may be difficult to determine whether the user wants to know how to make Italian cheese pizza or which restaurant serves Italian cheese pizza, thus this message is a non-knowledge-intention query. In one situation, if a message simply states a fact, it may be impossible to determine a clear intention of the message, thus this message is a non-knowledge-intention query. Taking a message "Children often get nosebleeds" as an example, it may be difficult to determine whether the user wants to know how to prevent the children from getting nosebleeds or why the children often get nosebleeds, thus this message is a non-knowledge-intention query. It should be understood that the embodiments of the present disclosure are not limited to the above exemplary judgment criteria related to knowledge-intention query, but may cover any other judgment criteria.

[0042] According to the process 300, if it is identified at 320 that the query 310 is a knowledge-intention query, then a knowledge answer for the knowledge-intention query may be determined at 340. For example, the knowledge answer may be determined from a pre-established corpus 350. An exemplary knowledge answer determination processing will be discussed in detail later in conjunction with FIG.4.
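The branching in the process 300 can be sketched in Python as a small routing function. The classifier, the knowledge answer determination, and the chitchat mechanism are passed in as callables because the disclosure leaves their implementations open; the toy stand-ins below are purely hypothetical. In particular, the keyword-based toy classifier is far cruder than the model-based classifier described above and would misclassify non-interrogative knowledge-intention queries such as "Pay service fees online".

```python
from typing import Callable


def respond(
    query: str,
    is_knowledge_intention: Callable[[str], bool],
    determine_knowledge_answer: Callable[[str], str],
    chitchat: Callable[[str], str],
) -> str:
    """Route a query: identify its type, then branch accordingly."""
    if is_knowledge_intention(query):             # identification at 320
        return determine_knowledge_answer(query)  # determination at 340
    return chitchat(query)                        # chitchat mode at 330


# Toy stand-ins (hypothetical) so the sketch can be run end to end.
def toy_classifier(query: str) -> bool:
    return query.lower().startswith(("how", "what", "why"))


def toy_answer(query: str) -> str:
    return "Fully warm up, correct exercise posture, and wearing protective gear"


def toy_chitchat(query: str) -> str:
    return "Nice to chat with you!"


knowledge_reply = respond("How to avoid sports injury",
                          toy_classifier, toy_answer, toy_chitchat)
chitchat_reply = respond("Hello", toy_classifier, toy_answer, toy_chitchat)
```

Injecting the three components as parameters keeps the routing logic independent of whichever classifier, ranker, or chitchat engine is actually deployed.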

[0043] At 360, a response to the knowledge-intention query 310 may be provided, which may comprise the knowledge answer determined at 340. For example, the response may be presented to the user in the chat session. A knowledge answer may be provided in a response through various approaches. In an implementation, the knowledge answer may be directly provided as a response. In an implementation, the response may be provided in the form of an information card, wherein the information card comprises the knowledge answer. For example, the knowledge answer may be presented as a part of the information card. Optionally, the information card may further comprise more information about a reference document corresponding to the knowledge answer, e.g., a title of the reference document, an image in the reference document, a link to the reference document, etc. Through providing a response in the form of an information card, a user may be enabled to obtain desired information from the response more quickly and intuitively. Moreover, through including a link to the reference document in the information card, e.g., setting a hyperlink to the reference document for at least one part of the information card, a user may be enabled to access the reference document by clicking the link, e.g., clicking the part containing the hyperlink, so as to obtain further information. Optionally, the knowledge answer may be placed in a prominent position in the information card, e.g., at the top, etc. Optionally, the knowledge answer may be highlighted in the information card, e.g., bold, italicized, zoomed in, etc. Optionally, when the link in the information card is clicked and the reference document is presented, in a user interface that presents the reference document, the reference document may be automatically focused on a position that includes a text involving the knowledge answer, or a text involving the knowledge answer may be highlighted in the reference document.
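One possible data structure for such an information card is sketched below in Python. The field names, the plain-text rendering, and the example URL are assumptions for illustration only; the disclosure does not prescribe a concrete card format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InformationCard:
    """A knowledge answer plus optional details of its reference document."""
    knowledge_answer: str
    document_title: Optional[str] = None
    image_url: Optional[str] = None
    document_link: Optional[str] = None

    def render(self) -> str:
        """Render as plain text with the knowledge answer at the top."""
        lines = [self.knowledge_answer]
        if self.document_title:
            lines.append(f"Source: {self.document_title}")
        if self.image_url:
            lines.append(f"Image: {self.image_url}")
        if self.document_link:
            lines.append(f"Read more: {self.document_link}")
        return "\n".join(lines)


card = InformationCard(
    knowledge_answer="Fully warm up, correct exercise posture, and wearing protective gear",
    document_title="Avoiding sports injury",      # hypothetical title
    document_link="https://example.com/doc",      # placeholder link
)
rendered = card.render()
```

Keeping the reference document fields optional lets the same structure cover both the bare knowledge answer and the richer card with title, image, and link.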

[0044] FIG.4 illustrates an exemplary process 400 of determining a knowledge answer according to an embodiment. The process 400 is intended to determine a knowledge answer corresponding to a knowledge-intention query 410. The knowledge-intention query 410 may be, e.g., identified through the knowledge-intention query identification processing at 320 in FIG.3.

[0045] At 420, at least one candidate text item for the knowledge-intention query 410 may be selected from a corpus 430. For example, text items associated with the knowledge-intention query 410 may be selected from the corpus 430 as candidate text items, wherein a knowledge answer will be selected from these candidate text items through subsequent processing. In an implementation, the selection at 420 may be performed with an index set corresponding to a corpus. For example, the at least one candidate text item may be searched out from an index set corresponding to the corpus 430 based on the knowledge-intention query 410. In the case that the index set is established through, e.g., the inverted index technique, candidate text items may be searched out from the index set with a searching function provided by an inverted index service, e.g., a searching approach that is based on BM25, etc.
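To make the BM25-based candidate selection concrete, the following Python sketch scores text items against a query with a common BM25 formulation. For brevity it scores every text item directly rather than walking an inverted index, and the parameter values `k1 = 1.5` and `b = 0.75`, the tokenizer, and the function names are conventional or hypothetical choices rather than details taken from the disclosure.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphanumeric runs as terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_search(query: str, corpus: List[str], top_k: int = 3,
                k1: float = 1.5, b: float = 0.75) -> List[Tuple[int, float]]:
    """Return up to top_k (text item id, score) pairs, best first."""
    docs = [tokenize(text) for text in corpus]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: number of text items containing each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scored: List[Tuple[int, float]] = []
    for item_id, doc in enumerate(docs):
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scored.append((item_id, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


corpus = [
    "Fully warm up and wear protective gear to avoid sports injury.",
    "Samsung and Apple are both trend leaders in the mobile phone industry.",
    "The mental health issue should not be ignored.",
]
candidates = bm25_search("How to avoid sports injury", corpus)
```

Only text items sharing at least one term with the query receive a positive score, so unrelated items never become candidates for the subsequent ranking.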

[0046] At 440, the candidate text items may be ranked. In an implementation, the candidate text items may be ranked through a ranker. For example, a ranker for ranking candidate text items may be pre-established. The ranker may be pre-established through a variety of different approaches. In an exemplary approach, a lightweight model such as FastRank may be adopted in the ranker, and the ranker may be fine-tuned by, e.g., a generative pre-training 2 (GPT2) model. The GPT2 model has strong text representation capability, and it may train a model with less supervised data and more unsupervised data, thus the ranking accuracy of the ranker may be significantly improved. It should be understood that the embodiments of the present disclosure are not limited to adopting the GPT2 model to fine-tune the ranker, but may also adopt any other appropriate model to fine-tune the ranker.

[0047] A large number of <knowledge-intention query, answer> labeled datasets may be previously prepared for use in the fine-tuning. Optionally, for each data pair of knowledge-intention query and answer, multi-level labeling may be performed on the data pair based on an accuracy level of utilizing the answer to reply to the knowledge-intention query. Exemplarily, it is assumed that labels comprise four levels of 0, 1, 2 and 3. If an entity involved in the answer is different from an entity in the knowledge-intention query, the data pair may be labeled as 0 to indicate that the answer cannot reply to the knowledge-intention query at all. For example, for a query "How to avoid sports injury" and an answer "The mental health issue should not be ignored", since the entity "sports injury" in the query is completely different from the entity "mental health" in the answer, this data pair may be labeled as 0. If an entity involved in the answer is the same as an entity in the knowledge-intention query, but discussion directions about the entity are different, the data pair may be labeled as 1 to indicate that the answer cannot effectively reply to the knowledge-intention query. For example, for a query "Buy a Samsung mobile phone or an Apple mobile phone" and an answer "Samsung and Apple are both trend leaders in the mobile phone industry", although both the query and the answer involve Samsung mobile phone and Apple mobile phone, the answer does not provide reference knowledge of how to choose between Samsung mobile phone and Apple mobile phone, thus this data pair may be labeled as 1. If the answer has the same entity and the same discussion direction as the knowledge-intention query, but content in the answer still needs to be improved, the data pair may be labeled as 2 to indicate that the answer may basically reply to the knowledge-intention query. For example, for a query "How to avoid sports injury" and an answer "Sports injury is an issue that requires attention, and various measures should be taken to avoid it during exercise", since this answer only provides limited information about avoiding sports injury, the data pair may be labeled as 2. If the answer has the same entity and the same discussion direction as the knowledge-intention query, and the answer fully replies to the knowledge-intention query, the data pair may be labeled as 3 to indicate that the answer may fully reply to the knowledge-intention query. For example, for a query "How to avoid sports injury" and an answer "Fully warm up, correct exercise posture, and wearing protective gear", since the answer provides accurate and sufficient information about avoiding sports injury, this data pair may be labeled as 3. It should be understood that the embodiments of the present disclosure are not limited to the above exemplary labeling levels, but may cover any other divisions of labeling levels.
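
By way of a non-limiting illustration, the exemplary four-level labeling may be sketched as follows. The names `AnswerLevel`, `LabeledPair` and `label_from_judgments` are hypothetical; the three boolean judgments are assumed to come from a human annotator, since the disclosure does not specify how they are obtained.

```python
from dataclasses import dataclass
from enum import IntEnum


class AnswerLevel(IntEnum):
    """The four exemplary label levels (names are hypothetical)."""
    DIFFERENT_ENTITY = 0                  # answer cannot reply to the query at all
    SAME_ENTITY_DIFFERENT_DIRECTION = 1   # answer cannot effectively reply
    BASIC_REPLY = 2                       # answer basically replies, content limited
    FULL_REPLY = 3                        # answer fully replies


@dataclass(frozen=True)
class LabeledPair:
    query: str
    answer: str
    label: AnswerLevel


def label_from_judgments(same_entity: bool, same_direction: bool,
                         fully_replies: bool) -> AnswerLevel:
    """Map three annotator judgments to a label, checking the coarsest mismatch first."""
    if not same_entity:
        return AnswerLevel.DIFFERENT_ENTITY
    if not same_direction:
        return AnswerLevel.SAME_ENTITY_DIFFERENT_DIRECTION
    return AnswerLevel.FULL_REPLY if fully_replies else AnswerLevel.BASIC_REPLY


# The example pairs from the description, labeled with the rule above.
pairs = [
    LabeledPair("How to avoid sports injury",
                "The mental health issue should not be ignored",
                label_from_judgments(False, False, False)),
    LabeledPair("Buy a Samsung mobile phone or an Apple mobile phone",
                "Samsung and Apple are both trend leaders in the mobile phone industry",
                label_from_judgments(True, False, False)),
    LabeledPair("How to avoid sports injury",
                "Fully warm up, correct exercise posture, and wearing protective gear",
                label_from_judgments(True, True, True)),
]
```

Using an integer enumeration keeps the levels ordered, so that they may be used directly as graded relevance targets when fine-tuning a ranker.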

[0048] According to the process 400, after performing the ranking of the candidate text items at 440, a knowledge answer may be selected from the ranked candidate text items. For example, the highest-ranked candidate text item among the ranked candidate text items may be selected as the knowledge answer.

[0049] In an implementation, at 450, it may be determined whether a ranking score of the highest-ranked candidate text item meets a predetermined condition, e.g., whether it exceeds a predetermined threshold. In response to determining that the ranking score exceeds the predetermined threshold, the highest-ranked candidate text item may be selected as a knowledge answer at 470. If it is determined at 450 that the ranking score of the highest-ranked candidate text item does not exceed the predetermined threshold, the chatbot may enter the chitchat mode at 460. Accordingly, the chatbot may adopt any known chitchat responding mechanism to provide a response to the knowledge-intention query 410.
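
By way of a non-limiting illustration, the threshold check at 450 may be sketched as follows. The function name `select_knowledge_answer` and the threshold value used in the example are assumptions; returning `None` is used here to signal that the chatbot should fall back to the chitchat mode.

```python
from typing import List, Optional, Tuple


def select_knowledge_answer(scored_items: List[Tuple[str, float]],
                            threshold: float) -> Optional[str]:
    """Return the highest-ranked text item if its score exceeds the threshold.

    Returns None when there are no candidates or the best score does not
    exceed the threshold, i.e., when the caller should enter chitchat mode.
    """
    if not scored_items:
        return None
    best_text, best_score = max(scored_items, key=lambda pair: pair[1])
    return best_text if best_score > threshold else None


scored = [
    ("Fully warm up, correct exercise posture, and wearing protective gear", 0.92),
    ("Sports injury is an issue that requires attention", 0.41),
]
answer = select_knowledge_answer(scored, threshold=0.5)
fallback = select_knowledge_answer(scored, threshold=0.95)  # None: enter chitchat mode
```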

[0050] After determining the knowledge answer for the knowledge-intention query 410 through the process 400, a response including the knowledge answer may be provided through, e.g., the processing at 360 in FIG.3.

[0051] FIG.5 illustrates an exemplary chat session window 500 according to an embodiment. The chat session window 500 shows a chat session between a chatbot and a user in a target service related to, e.g., health. The chatbot has an ability of providing knowledge answers for knowledge-intention queries according to the process 300 in FIG.3 and the process 400 in FIG.4. According to the process 200 in FIG.2, a corpus may be pre-established based on reference documents associated with a domain of the target service.

[0052] When receiving a user message "The weather is so nice today", since the message simply states a fact but does not indicate a definite intention, the chatbot may identify the message as a non-knowledge-intention query according to, e.g., the processing at 320 in FIG.3, and may therefore provide a response "Very sunny" in a chitchat mode.

[0053] When receiving a user message "Health is the greatest wealth", since the message simply states a fact but does not indicate a definite intention, the chatbot may identify the message as a non-knowledge-intention query according to, e.g., the processing at 320 in FIG.3, and may therefore provide a response "You are so right, I totally agree" in a chitchat mode.

[0054] When receiving a user message "How to keep healthy", since the message has a clear intention to learn about knowledge of keeping healthy, the chatbot may identify the message as a knowledge-intention query according to, e.g., the processing at 320 in FIG.3. Furthermore, the chatbot may determine a knowledge answer for the knowledge-intention query through, e.g., the processing at 340 and 360 in FIG.3 and the process 400 in FIG.4, and provide the knowledge answer directly as a response 510. As shown in the response 510, the knowledge answer comprises a plurality of sentences, and these sentences may constitute a text item, in the form of a combination of paragraphs, selected from the corpus. For example, each of these sentences may be extracted from a paragraph in a corresponding reference document.

[0055] When receiving a user message "How can I avoid sports injury", since the message has a clear intention to learn about knowledge of avoiding sports injury, the chatbot may identify the message as a knowledge-intention query according to, e.g., the processing at 320 in FIG.3. Furthermore, the chatbot may determine a knowledge answer for the knowledge-intention query through, e.g., the processing at 340 and 360 in FIG.3 and the process 400 in FIG.4, and provide a response 520 including the knowledge answer in a form of information card. As shown in the response 520, content of the determined knowledge answer is presented in the uppermost box 522 of the information card, a title of a reference document corresponding to the knowledge answer is presented in the box 524 at the lower left of the information card, and an image in the reference document corresponding to the knowledge answer is presented in the box 526 at the lower right of the information card. At least one part of the information card may contain a hyperlink connected to the reference document corresponding to the knowledge answer, so that when the at least one part is clicked by the user, the user will be automatically navigated to the reference document and the reference document will be presented to the user.

[0056] It should be understood that all the elements in the chat session window 500 in FIG.5 are exemplary. The embodiments of the present disclosure are not limited to any presented details in the chat session window 500, but may cover, e.g., any other chat session interface, any other approach of presenting knowledge answers and responses, etc.

[0057] FIG.6 illustrates a flowchart of an exemplary method 600 for providing knowledge answers for knowledge-intention queries according to an embodiment.

[0058] At 610, a query may be obtained.

[0059] At 620, it may be identified whether the query is a knowledge-intention query.

[0060] At 630, in response to identifying that the query is a knowledge-intention query, a knowledge answer for the knowledge-intention query may be determined from a corpus, the corpus being established based on at least one reference document.

[0061] At 640, a response for the knowledge-intention query may be provided, the response comprising the knowledge answer.

[0062] In an implementation, the determining a knowledge answer may comprise: selecting at least one candidate text item for the knowledge-intention query from the corpus; ranking the at least one candidate text item; and selecting the knowledge answer from the at least one ranked candidate text item.

[0063] In an implementation, the identifying whether the query is a knowledge-intention query may comprise: classifying the query as a knowledge-intention query or a non-knowledge-intention query at least through a classifier.

[0064] In an implementation, the corpus may comprise a plurality of text items, each text item comprising a single sentence, a combination of sentences, a single paragraph, or a combination of paragraphs extracted from a corresponding reference document. The combination of sentences may comprise two or more consecutive sentences, and the combination of paragraphs may comprise a combination of relevant sentences from two or more paragraphs.
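
By way of a non-limiting illustration, extracting text items of several granularities from a reference document may be sketched as follows. The function name `build_text_items`, the regular-expression sentence splitter, and the window size of two sentences are assumptions; combinations of relevant sentences from two or more paragraphs are omitted because the disclosure does not fix how relevance is determined.

```python
import re


def build_text_items(document, window=2):
    """Return (kind, text) pairs extracted from a plain-text reference document.

    Paragraphs are assumed to be separated by blank lines, and sentences to
    end with '.', '!' or '?' followed by whitespace.
    """
    items = []
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]
    for paragraph in paragraphs:
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        # Single sentences.
        items.extend(("sentence", s) for s in sentences)
        # Combinations of `window` consecutive sentences within the paragraph.
        for start in range(len(sentences) - window + 1):
            combined = " ".join(sentences[start:start + window])
            items.append(("sentence_combination", combined))
        # The single paragraph as a whole.
        items.append(("paragraph", paragraph))
    return items


document = "Warm up fully. Keep a correct posture.\n\nWear protective gear."
text_items = build_text_items(document)
```

Each resulting text item may then be added to the index set so that it can later be retrieved as a candidate.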

[0065] In an implementation, the at least one reference document may be associated with at least one target domain.

[0066] In an implementation, the selecting at least one candidate text item may comprise: searching out the at least one candidate text item from an index set corresponding to the corpus based on the knowledge-intention query.

[0067] In an implementation, the ranking the at least one candidate text item may comprise: ranking the at least one candidate text item at least through a ranker.

[0068] In an implementation, the selecting the knowledge answer may comprise: selecting a highest-ranked candidate text item in the at least one ranked candidate text item as the knowledge answer.

[0069] The selecting a highest-ranked candidate text item as the knowledge answer may comprise: determining whether a ranking score of the highest-ranked candidate text item exceeds a predetermined threshold; and in response to determining that the ranking score exceeds the predetermined threshold, selecting the highest-ranked candidate text item as the knowledge answer.

[0070] In an implementation, the response may be provided in a form of information card, the information card comprising the knowledge answer.

[0071] The information card may further comprise at least one of: a title of a reference document corresponding to the knowledge answer; an image in the reference document; and a link to the reference document.

[0072] In an implementation, the method 600 may further comprise: in response to identifying that the query is not a knowledge-intention query, providing a response to the query in a chitchat mode.

[0073] In an implementation, the method 600 may further comprise: in response to determining that the ranking score does not exceed the predetermined threshold, providing a response to the query in a chitchat mode.

[0074] In an implementation, the classifier may be fine-tuned by a BERT model.

[0075] In an implementation, the ranker may be fine-tuned by a GPT2 model.

[0076] It should be understood that, the method 600 may further comprise any step/process for providing knowledge answers for knowledge-intention queries according to the embodiments of the present disclosure as described above.
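
By way of a non-limiting illustration, the overall flow of the method 600, including both fallbacks to the chitchat mode, may be sketched as follows. The names `Response` and `provide_response` are hypothetical, and the classifier, retriever, ranker and chitchat mechanism are passed in as plain callables; the toy stand-ins in the usage example are assumptions and do not represent the BERT-based classifier or the GPT2-based ranker.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Response:
    mode: str  # "knowledge" or "chitchat"
    text: str


def provide_response(
    query: str,
    is_knowledge_intention: Callable[[str], bool],
    retrieve: Callable[[str], List[str]],
    rank: Callable[[str, List[str]], List[Tuple[str, float]]],
    chitchat: Callable[[str], str],
    threshold: float,
) -> Response:
    # 620: identify whether the query is a knowledge-intention query.
    if not is_knowledge_intention(query):
        return Response("chitchat", chitchat(query))
    # 630: select candidates, rank them, and check the top ranking score.
    scored = sorted(rank(query, retrieve(query)), key=lambda p: p[1], reverse=True)
    if scored and scored[0][1] > threshold:
        # 640: provide a response comprising the knowledge answer.
        return Response("knowledge", scored[0][0])
    return Response("chitchat", chitchat(query))


# Toy stand-ins (assumptions for illustration only).
CORPUS = ["Fully warm up to avoid sports injury"]

def toy_classifier(query):
    return query.lower().startswith(("how", "what", "why"))

def toy_rank(query, candidates):
    q = set(query.lower().split())
    return [(c, len(q & set(c.lower().split())) / len(q)) for c in candidates]

def toy_chitchat(query):
    return "I see!"

reply = provide_response("How to avoid sports injury", toy_classifier,
                         lambda q: CORPUS, toy_rank, toy_chitchat, threshold=0.5)
```

Passing the components as callables keeps the control flow independent of any particular classifier or ranker.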

[0077] FIG.7 illustrates an exemplary apparatus 700 for providing knowledge answers for knowledge-intention queries according to an embodiment.

[0078] The apparatus 700 may comprise: a query obtaining module 710, for obtaining a query; a knowledge-intention query identifying module 720, for identifying whether the query is a knowledge-intention query; a knowledge answer determining module 730 for, in response to identifying that the query is a knowledge-intention query, determining a knowledge answer for the knowledge-intention query from a corpus, the corpus being established based on at least one reference document; and a response providing module 740, for providing a response for the knowledge-intention query, the response comprising the knowledge answer.

[0079] In an implementation, the knowledge answer determining module 730 may be for: selecting at least one candidate text item for the knowledge-intention query from the corpus; ranking the at least one candidate text item; and selecting the knowledge answer from the at least one ranked candidate text item.

[0080] In an implementation, the corpus may comprise a plurality of text items, each text item comprising a single sentence, a combination of sentences, a single paragraph, or a combination of paragraphs extracted from a corresponding reference document.

[0081] Moreover, the apparatus 700 may further comprise any other module configured for providing knowledge answers for knowledge-intention queries according to the embodiments of the present disclosure as described above.

[0082] FIG.8 illustrates an exemplary apparatus 800 for providing knowledge answers for knowledge-intention queries according to an embodiment.

[0083] The apparatus 800 may comprise at least one processor 810 and a memory 820 storing computer-executable instructions. When the computer-executable instructions are executed, the processor 810 may: obtain a query; identify whether the query is a knowledge-intention query; in response to identifying that the query is a knowledge-intention query, determine a knowledge answer for the knowledge-intention query from a corpus, the corpus being established based on at least one reference document; and provide a response for the knowledge-intention query, the response comprising the knowledge answer. Moreover, the processor 810 may further perform any step/process for providing knowledge answers for knowledge-intention queries according to the embodiments of the present disclosure as described above.

[0084] The embodiments of the present disclosure may be embodied in a non-transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform any operations of the methods for providing knowledge answers for knowledge-intention queries according to the embodiments of the present disclosure as described above.

[0085] It should be appreciated that all the operations in the methods described above are merely exemplary, and the present disclosure is not limited to any operations in the methods or sequence orders of these operations, and should cover all other equivalents under the same or similar concepts.

[0086] It should also be appreciated that all the modules in the apparatuses described above may be implemented in various approaches. These modules may be implemented as hardware, software, or a combination thereof. Moreover, any of these modules may be further functionally divided into sub-modules or combined together.

[0087] Processors have been described in connection with various apparatuses and methods. These processors may be implemented using electronic hardware, computer software, or any combination thereof. Whether such processors are implemented as hardware or software will depend upon the particular application and overall design constraints imposed on the system. By way of example, a processor, any portion of a processor, or any combination of processors presented in the present disclosure may be implemented with a microprocessor, microcontroller, digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic device (PLD), a state machine, gated logic, discrete hardware circuits, and other suitable processing components configured to perform the various functions described throughout the present disclosure. The functionality of a processor, any portion of a processor, or any combination of processors presented in the present disclosure may be implemented with software being executed by a microprocessor, microcontroller, DSP, or other suitable platform.

[0088] Software shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, threads of execution, procedures, functions, etc. The software may reside on a computer-readable medium. A computer-readable medium may include, by way of example, memory such as a magnetic storage device (e.g., hard disk, floppy disk, magnetic strip), an optical disk, a smart card, a flash memory device, random access memory (RAM), read only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), a register, or a removable disk. Although a memory is shown as being separate from the processor in various aspects presented in this disclosure, a memory may also be internal to the processor (e.g., a cache or a register).

[0089] The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein. All structural and functional equivalents to the elements of the various aspects described throughout the present disclosure that are known or later come to be known to those of ordinary skill in the art are intended to be encompassed by the claims.
