Title:
DOCUMENT CLUSTERING METHOD AND APPARATUS, ELECTRONIC DEVICE, AND STORAGE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2022/134343
Kind Code:
A1
Abstract:
A document clustering method, a document clustering apparatus (400), an electronic device (500), and a storage medium. In the method, the document clustering apparatus (400): obtains N documents to be clustered, N being an integer greater than 1 (101); determines a co-citation similarity between every pair of documents among the N documents (102); performs, according to these pairwise co-citation similarities, a first clustering on the N documents to obtain M clusters that together contain K of the N documents, M being an integer greater than or equal to 1 and K being an integer less than or equal to N (103); and performs a second clustering on the remaining (N-K) documents so as to fuse them into the M clusters (104). The method helps to increase the accuracy of document clustering.
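
The two-stage flow in the abstract (steps 101 to 104) can be illustrated with a minimal Python sketch. The abstract does not define the co-citation measure, the first-stage grouping rule, or the second-stage fusion criterion, so everything below is an assumption made for illustration: co-citation similarity is taken as the Jaccard overlap of the sets of documents that cite each document, the first clustering links pairs whose similarity reaches a hypothetical `threshold`, and the second clustering assigns each leftover document to the cluster whose pooled vocabulary overlaps most with its tokens. All function and parameter names are hypothetical and are not taken from the application.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def first_clustering(doc_ids, cited_by, threshold):
    """Step 103 (assumed rule): link documents whose co-citation
    similarity is >= threshold and return the groups of size > 1."""
    parent = {d: d for d in doc_ids}

    def find(d):
        # Union-find lookup with path halving.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    for a, b in combinations(doc_ids, 2):
        # Step 102 (assumed measure): Jaccard overlap of citing sets.
        sim = jaccard(cited_by.get(a, set()), cited_by.get(b, set()))
        if sim >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for d in doc_ids:
        groups.setdefault(find(d), []).append(d)
    return [members for members in groups.values() if len(members) > 1]


def second_clustering(clusters, remaining, tokens):
    """Step 104 (assumed rule): fuse each remaining document into the
    cluster whose pooled token set is most similar to its own tokens."""
    vocab = [set().union(*(tokens[d] for d in c)) for c in clusters]
    result = [list(c) for c in clusters]
    for d in remaining:
        best = max(range(len(clusters)),
                   key=lambda i: jaccard(tokens[d], vocab[i]))
        result[best].append(d)
    return result


def cluster_documents(doc_ids, cited_by, tokens, threshold=0.5):
    clusters = first_clustering(doc_ids, cited_by, threshold)
    if not clusters:
        # The abstract requires M >= 1; this sketch simply rejects other input.
        raise ValueError("first clustering produced no clusters")
    clustered = {d for c in clusters for d in c}
    remaining = [d for d in doc_ids if d not in clustered]  # the N-K documents
    return second_clustering(clusters, remaining, tokens)


# Toy input (invented for illustration): d5 is never cited, so it can only
# be placed by the second clustering.
doc_ids = ["d1", "d2", "d3", "d4", "d5"]
cited_by = {
    "d1": {"p1", "p2"}, "d2": {"p1", "p2"},
    "d3": {"p3", "p4"}, "d4": {"p3", "p4", "p5"},
}
tokens = {
    "d1": {"graph", "cluster"}, "d2": {"graph", "node"},
    "d3": {"protein", "gene"}, "d4": {"gene", "cell"},
    "d5": {"gene", "protein", "assay"},
}
result = cluster_documents(doc_ids, cited_by, tokens)
print(result)  # [['d1', 'd2'], ['d3', 'd4', 'd5']]
```

In this toy input N = 5, the first clustering yields M = 2 clusters covering K = 4 documents, and the one remaining document is fused into the second cluster. A real system would likely replace the token-overlap step with whatever text representation the application actually specifies.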

Inventors:
CHAI LING (CN)
Application Number:
PCT/CN2021/082726
Publication Date:
June 30, 2022
Filing Date:
March 24, 2021
Assignee:
PING AN TECH SHENZHEN CO LTD (CN)
International Classes:
G06F16/35; G06F40/258; G06F40/284; G06K9/62
Foreign References:
CN103455622A    2013-12-18
CN108509481A    2018-09-07
CN111898366A    2020-11-06
US6457028B1    2002-09-24
US20110295903A1    2011-12-01
Other References:
WU, FENGHUI: "Improvement of K-means Algorithm Based on Co-Citation Analysis", JOURNAL OF THE CHINA SOCIETY FOR SCIENTIFIC AND TECHNICAL INFORMATION, vol. 31, no. 1, 31 January 2012 (2012-01-31), CN, pages 82-94, XP009537875, ISSN: 1000-0135
Attorney, Agent or Firm:
SCIHEAD IP LAW FIRM (CN)