Title:
FEATURE WORD EXTRACTION METHOD AND APPARATUS, TEXT SIMILARITY CALCULATION METHOD AND APPARATUS, AND DEVICE
Document Type and Number:
WIPO Patent Application WO/2021/072850
Kind Code:
A1
Abstract:
A feature word extraction method, comprising: in response to a word segmentation instruction for a target text, performing word segmentation on the target text to obtain a segmented word set (S101); combining segmented words in the segmented word set to obtain several phrases, each phrase comprising several segmented words (S102); calculating a first TF value and a TF-IDF value of each phrase (S103); calculating a second TF value of each segmented word combined into the phrase, so as to obtain several second TF values (S104); calculating a probability limit TF-IDF value of the phrase from the TF-IDF value, the first TF value and the several second TF values (S105); and selecting, as feature words of the target text, the phrases whose probability limit TF-IDF values rank before a predetermined position (S106). Also provided are a text similarity calculation method, a feature word extraction apparatus, a text similarity calculation apparatus, a computer device and a computer-readable storage medium.
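Steps S101 to S106 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not state how S105 combines the TF-IDF value, the first TF value and the second TF values, so the ratio used here (phrase TF divided by the product of the constituent word TFs, multiplied by the TF-IDF value) is purely an assumption. Whitespace tokenization, adjacent-word phrases of length `n`, the smoothed IDF, and all function and parameter names (`segment`, `phrases_of`, `extract_feature_words`, `corpus`, `top_k`) are likewise hypothetical stand-ins.

```python
import math
from collections import Counter


def segment(text):
    # Assumption: whitespace tokenization stands in for a real word segmenter (S101).
    return text.lower().split()


def phrases_of(words, n=2):
    # Assumption: a phrase is n adjacent segmented words (S102).
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def extract_feature_words(target, corpus, top_k=3, n=2):
    words = segment(target)
    phrases = phrases_of(words, n)
    if not phrases:
        return []
    phrase_counts = Counter(phrases)
    word_counts = Counter(words)
    # Phrase sets of the reference documents, used only for document frequency.
    corpus_phrase_sets = [set(phrases_of(segment(doc), n)) for doc in corpus]

    scores = {}
    for phrase, count in phrase_counts.items():
        tf_phrase = count / len(phrases)  # first TF value (S103)
        doc_freq = sum(phrase in s for s in corpus_phrase_sets)
        # Assumption: smoothed IDF so that unseen phrases do not divide by zero.
        idf = math.log((1 + len(corpus)) / (1 + doc_freq)) + 1
        tf_idf = tf_phrase * idf  # TF-IDF value (S103)
        # Second TF values, one per word combined into the phrase (S104).
        word_tfs = [word_counts[w] / len(words) for w in phrase]
        # Hypothetical S105 combination: reward phrases that occur more often
        # than their constituent words would suggest if they were independent.
        cohesion = tf_phrase / math.prod(word_tfs)
        scores[phrase] = tf_idf * cohesion

    # S106: keep the phrases ranked before position top_k (ties broken by phrase).
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return [" ".join(p) for p in ranked[:top_k]]
```

Calling `extract_feature_words(target_text, reference_texts, top_k=5)` returns up to five phrases as strings. Replacing `segment` with a language-appropriate segmenter and substituting the actual S105 formula from the full specification would be the first changes needed for real use.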
Inventors:
LIU XIANG (CN)
YAO FEI (CN)
YAO FEI (CN)
Application Number:
PCT/CN2019/117401
Publication Date:
April 22, 2021
Filing Date:
November 12, 2019
Assignee:
PING AN TECH SHENZHEN CO LTD (CN)
International Classes:
G06F40/289
Foreign References:
CN105550168A | 2016-05-04
CN101206752A | 2008-06-25
CN105095175A | 2015-11-25
US20110191355A1 | 2011-08-04
Attorney, Agent or Firm:
INTELLECPRO CHINA LIMITED (CN)