Title:
Document extraction program, document extraction device, and document extraction method
Document Type and Number:
Japanese Patent JP7419961
Kind Code:
B2
Abstract:
PROBLEM TO BE SOLVED: To extract, with high accuracy, two documents that correspond to each other from sets of documents written in two different languages.
SOLUTION: A computer generates first distribution information showing the distribution of first feature amounts, which are based on the distance between two words included in each of a plurality of first language documents written in a first language. The computer likewise generates second distribution information showing the distribution of second feature amounts, which are based on the distance between two words included in each of a plurality of second language documents written in a second language. Based on the similarity between the first distribution information of each first language document and the second distribution information of each second language document, the computer extracts a specific first language document from the plurality of first language documents and a specific second language document from the plurality of second language documents. The specific second language document is the second language document corresponding to the specific first language document.
SELECTED DRAWING: Figure 6
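
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say how the distance between two words is measured, how the distribution is represented, or which similarity is used, so the sketch assumes Euclidean distance between per-word vectors, a fixed-width normalized histogram, and histogram intersection. The function names, the bin settings, and the toy 2-D vectors are all hypothetical.

```python
import math
from itertools import combinations


def distance_histogram(word_vectors, bins=4, max_dist=2.0):
    """Normalized histogram of pairwise word distances within one document.

    Assumption: the "feature amount" is the Euclidean distance between
    two word vectors; the abstract does not specify the distance measure.
    """
    dists = [math.dist(a, b) for a, b in combinations(word_vectors, 2)]
    counts = [0] * bins
    for d in dists:
        # Distances at or beyond max_dist fall into the last bin.
        idx = min(int(d / max_dist * bins), bins - 1)
        counts[idx] += 1
    total = len(dists)
    return [c / total for c in counts] if total else [0.0] * bins


def histogram_similarity(h1, h2):
    """Histogram intersection: 1.0 for identical distributions (assumed metric)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def match_documents(first_docs, second_docs):
    """Map each first-language document ID to its most similar second-language ID."""
    second_hists = {k: distance_histogram(v) for k, v in second_docs.items()}
    matches = {}
    for first_id, vectors in first_docs.items():
        h_first = distance_histogram(vectors)
        matches[first_id] = max(
            second_hists,
            key=lambda sid: histogram_similarity(h_first, second_hists[sid]),
        )
    return matches


# Toy data (hypothetical): each document is a list of 2-D word vectors.
first_docs = {
    "ja_1": [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)],   # tightly clustered words
    "ja_2": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],   # widely spread words
}
second_docs = {
    "en_1": [(0.0, 0.0), (1.0, 0.1), (0.1, 1.0)],   # widely spread words
    "en_2": [(0.5, 0.5), (0.6, 0.5), (0.5, 0.6)],   # tightly clustered words
}

matches = match_documents(first_docs, second_docs)
print(matches)
```

Because only the distances between words inside each document are compared, the two sets of vectors in this sketch never need to share a coordinate system. That choice belongs to the sketch and is not a statement about the patent's claims.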

Inventors:
Shun Liang
Seiji Okajima
Application Number:
JP2020083737A
Publication Date:
January 23, 2024
Filing Date:
May 12, 2020
Assignee:
Fujitsu Limited
International Classes:
G06F40/216; G06F16/383
Domestic Patent References:
JP2018180839A
JP2012506596A
Foreign References:
WO2015145981A1
Attorney, Agent or Firm:
Infot Patent Attorney Corporation
Hiroyoshi Aoki
Masayuki Amada
Yoshiyuki Ohsuga