Title:
Document extraction program, document extraction device, and document extraction method
Document Type and Number:
Japanese Patent JP7419961
Kind Code:
B2
Abstract:
PROBLEM TO BE SOLVED: To extract, with high accuracy, two documents having a correspondence relation from sets of documents written in two respective languages.
SOLUTION: A computer generates first distribution information showing a distribution of first feature amounts based on a distance between two words included in each of a plurality of first language documents described in a first language. The computer generates second distribution information showing a distribution of second feature amounts based on a distance between two words included in each of a plurality of second language documents described in a second language. The computer extracts a specific first language document from the plurality of first language documents and extracts a specific second language document from the plurality of second language documents on the basis of a similarity between the first distribution information of each of the plurality of first language documents and the second distribution information of each of the plurality of second language documents. The specific second language document is a second language document corresponding to the specific first language document.
SELECTED DRAWING: Figure 6
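As an illustration only, the procedure outlined in the abstract might be sketched in Python as follows. The abstract does not state how words are represented, how the distance is measured, or how the similarity is computed, so everything here is an assumption: words are given as toy 2-D vectors, the feature amount is the Euclidean distance between each pair of words in a document, the distribution information is a normalized histogram of those distances, and the similarity is histogram intersection. The names `distance_histogram`, `histogram_similarity`, and `extract_pair` are hypothetical and do not come from the patent.

```python
import math
from itertools import combinations


def distance_histogram(vectors, bins=4, max_dist=2.0):
    """Normalized histogram of pairwise Euclidean distances between word vectors.

    Assumption: this histogram stands in for the "distribution information".
    """
    counts = [0] * bins
    for a, b in combinations(vectors, 2):
        d = math.dist(a, b)
        # Distances at or beyond max_dist fall into the last bin.
        idx = min(int(d / max_dist * bins), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def histogram_similarity(p, q):
    """Histogram intersection (an assumed similarity measure); 1.0 means identical."""
    return sum(min(x, y) for x, y in zip(p, q))


def extract_pair(first_docs, second_docs):
    """Return (first_id, second_id, similarity) for the most similar document pair."""
    first_hists = {k: distance_histogram(v) for k, v in first_docs.items()}
    second_hists = {k: distance_histogram(v) for k, v in second_docs.items()}
    score, i, j = max(
        (histogram_similarity(p, q), i, j)
        for i, p in first_hists.items()
        for j, q in second_hists.items()
    )
    return i, j, score


# Toy data: each document is a list of hypothetical 2-D word vectors.
first_docs = {
    "ja_1": [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)],   # tightly clustered words
    "ja_2": [(0.0, 0.0), (1.5, 0.0), (0.0, 1.5)],   # widely spread words
}
second_docs = {
    "en_1": [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1)],   # tightly clustered words
    "en_2": [(0.0, 0.0), (1.4, 0.0), (0.0, 1.4)],   # widely spread words
}

best_first, best_second, best_score = extract_pair(first_docs, second_docs)
print(best_first, best_second)  # prints "ja_1 en_1"
```

Because the histograms depend only on distances between words inside each document, the two vector spaces in this toy example need not be aligned with each other; whether the patented method relies on the same property is not stated in the abstract.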
Inventors:
Shun Liang
Seiji Okajima
Application Number:
JP2020083737A
Publication Date:
January 23, 2024
Filing Date:
May 12, 2020
Assignee:
Fujitsu Limited (富士通株式会社)
International Classes:
G06F40/216; G06F16/383
Domestic Patent References:
JP2018180839A
JP2012506596A
Foreign References:
WO2015145981A1
Attorney, Agent or Firm:
Infot Patent Attorney Corporation
Hiroyoshi Aoki
Masayuki Amada
Yoshiyuki Ohsuga