Title:
Parallel Translation Extraction Apparatus, Parallel Translation Extraction Method, and Program
Document Type and Number:
Japanese Patent JP6678087
Kind Code:
B2
Abstract:
PROBLEM TO BE SOLVED: To create a bilingual corpus of higher quality than is obtained when sentences are aligned based only on the number of matching words. SOLUTION: A parallel translation acquisition unit 105 of a parallel translation extraction apparatus 1 uses a bilingual dictionary to align the first-language sentences and second-language sentences of a bilingual document, and thereby acquires one or more parallel translations in the first language and the second language. A translation model generation unit 107 generates a translation model based on the one or more parallel translations. For each of the one or more parallel translations, a translation unit 108 translates the first-language sentence of that parallel translation into the second language using the generated translation model. For each of the one or more parallel translations, an edit distance calculation unit 109 calculates the edit distance between the first-language sentence as translated into the second language and the second-language sentence paired with it. A parallel translation selection unit 110 selects, from the one or more parallel translations, those whose calculated edit distance is larger than a threshold. SELECTED DRAWING: Figure 1
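The distance calculation and selection steps described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The function names `edit_distance` and `select_pairs`, the `translate` callable standing in for the generated translation model, and the whitespace tokenization are all assumptions made for illustration. The distance used here is the standard token-level Levenshtein distance, which the abstract does not specify.

```python
from typing import Callable, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (assumed metric)."""
    # prev[j] holds the distance between the first i-1 tokens of a
    # and the first j tokens of b.
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def select_pairs(
    pairs: List[Tuple[str, str]],
    translate: Callable[[str], str],  # hypothetical stand-in for the translation model
    threshold: int,
) -> List[Tuple[str, str]]:
    """Return the pairs whose calculated distance is larger than the threshold."""
    selected = []
    for first_lang, second_lang in pairs:
        hypothesis = translate(first_lang)
        # Whitespace tokenization is an assumption; real text may need a tokenizer.
        distance = edit_distance(hypothesis.split(), second_lang.split())
        # Comparison direction follows the abstract's wording ("larger than").
        if distance > threshold:
            selected.append((first_lang, second_lang))
    return selected
```

The comparison is kept on a single line so that it can be adjusted if the full specification defines the score or its direction differently from this sketch.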
Inventors:
Tsutomu Matsunaga
Daisuke Sato
Daisuke Sato
Application Number:
JP2016165873A
Publication Date:
April 08, 2020
Filing Date:
August 26, 2016
Assignee:
NTT DATA Corporation
International Classes:
G06F40/45; G06F16/36
Domestic Patent References:
JP2005250536A
JP2015170168A
JP2009289219A
JP2009223525A
Foreign References:
US20030110023
US20120310869
Attorney, Agent or Firm:
Asahi Patent Office