Title:
MULTILINGUAL DOCUMENT-SIMILARITY-DEGREE LEARNING DEVICE, MULTILINGUAL DOCUMENT-SIMILARITY-DEGREE DETERMINATION DEVICE, MULTILINGUAL DOCUMENT-SIMILARITY-DEGREE LEARNING METHOD, MULTILINGUAL DOCUMENT-SIMILARITY-DEGREE DETERMINATION METHOD, AND STORAGE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2015/145981
Kind Code:
A1
Abstract:
This invention provides a technology for searching for similar documents in a multilingual document group at lower cost and with higher precision, even if three or more languages are present. This multilingual document-similarity-degree learning device (1) comprises the following: a multilingual matrix storage unit (11) that holds a matrix for each target language; a word-vector acquisition unit (12) that acquires a word vector corresponding to a document; a meaning-vector creation unit (13) that creates a meaning vector for said document on the basis of the word vector for said document and the matrix corresponding to the language in which said document is written; a similarity-degree calculation unit (14) that calculates similarity degrees on the basis of meaning vectors for documents in a document group; and a multilingual matrix learning unit (15) that implements learning by adjusting values in the matrices corresponding to the respective target languages such that, within a set of documents each written in one of the target languages, the similarity degrees for groups of documents that exhibit source-translation relationships are higher than the similarity degrees for groups of documents that do not exhibit source-translation relationships.
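
The learning scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method claimed in the application: the abstract does not specify the similarity measure, the loss, or the update rule, so the dot-product similarity, the hinge (margin) loss, plain gradient descent, the toy bag-of-words vectors, and all names (`meaning_vector`, `similarity`, `train_step`, `triples`) are hypothetical choices made only for this example.

```python
import numpy as np

# Toy setup: every value here is an illustrative assumption.
rng = np.random.default_rng(0)
VOCAB, DIM = 4, 2  # per-language vocabulary size, meaning-vector size

# One matrix per target language (cf. multilingual matrix storage unit 11).
matrices = {lang: rng.normal(scale=0.1, size=(DIM, VOCAB))
            for lang in ("en", "ja", "de")}

# (language, word vector) per document (cf. word-vector acquisition unit 12).
docs = [
    ("en", np.array([1.0, 1.0, 0.0, 0.0])),  # 0: text A
    ("ja", np.array([0.0, 1.0, 1.0, 0.0])),  # 1: translation of 0
    ("de", np.array([1.0, 0.0, 0.0, 1.0])),  # 2: translation of 0
    ("en", np.array([0.0, 0.0, 1.0, 1.0])),  # 3: text B
    ("ja", np.array([1.0, 0.0, 0.0, 1.0])),  # 4: translation of 3
    ("de", np.array([0.0, 1.0, 1.0, 0.0])),  # 5: translation of 3
]

# (anchor, translation of anchor, non-translation) document indices.
triples = [(0, 1, 4), (0, 2, 5), (1, 2, 5), (3, 4, 1), (3, 5, 2), (4, 5, 2)]


def meaning_vector(matrices, doc):
    """Project a word vector with its language's matrix (cf. unit 13)."""
    lang, word_vector = doc
    return matrices[lang] @ word_vector


def similarity(matrices, doc_a, doc_b):
    """Dot product of two meaning vectors (cf. unit 14); an assumed measure."""
    return float(meaning_vector(matrices, doc_a) @ meaning_vector(matrices, doc_b))


def train_step(matrices, docs, triples, lr=0.05, margin=1.0):
    """One gradient step on a hinge loss over all triples (cf. unit 15)."""
    grads = {lang: np.zeros_like(m) for lang, m in matrices.items()}
    total_loss = 0.0
    for a, p, n in triples:
        (la, wa), (lp, wp), (ln, wn) = docs[a], docs[p], docs[n]
        va, vp, vn = matrices[la] @ wa, matrices[lp] @ wp, matrices[ln] @ wn
        loss = margin - va @ vp + va @ vn
        if loss <= 0:
            continue  # translation pair already ranks above by the margin
        total_loss += float(loss)
        grads[la] += np.outer(vn - vp, wa)
        grads[lp] -= np.outer(va, wp)
        grads[ln] += np.outer(va, wn)
    for lang in matrices:
        matrices[lang] -= lr * grads[lang]
    return total_loss


for _ in range(200):
    train_step(matrices, docs, triples)

for a, p, n in triples:
    print(f"doc {a}: translation sim = {similarity(matrices, docs[a], docs[p]):.2f}, "
          f"non-translation sim = {similarity(matrices, docs[a], docs[n]):.2f}")
```

Because each language has its own matrix projecting into one shared meaning space, adding a further language in this sketch only requires one more matrix and translation pairs involving it, rather than a separate model for every language pair.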

Inventors:
SADAMASA KUNIHIKO (JP)
Application Number:
PCT/JP2015/001028
Publication Date:
October 01, 2015
Filing Date:
February 27, 2015
Assignee:
NEC CORP (JP)
International Classes:
G06F17/30; G06F17/27
Foreign References:
US20100179933A1 2010-07-15
Other References:
JOHN C. PLATT ET AL.: "Translingual document representations from discriminative projections", EMNLP '10 PROCEEDINGS OF THE 2010 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING, ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, pages 251-261, XP055225060, Retrieved from the Internet [retrieved on 20150326]
Attorney, Agent or Firm:
SHIMOSAKA, NAOKI (JP)
Naoki Shimosaka (JP)