Title:
METHOD AND APPARATUS FOR EXPANDING DATA OF BILINGUAL CORPUS, AND STORAGE MEDIUM
Document Type and Number:
WIPO Patent Application WO/2015/067092
Kind Code:
A1
Abstract:
Disclosed are a method and an apparatus for expanding data of a bilingual corpus. The method comprises: searching, in a source language-pivot language corpus, for at least one first pivot language phrase matching the semanteme of a first source language phrase; searching, in the source language-pivot language corpus, for at least one second source language phrase matching the semanteme of each first pivot language phrase, the second source language phrases forming a source language phrase set; searching, in a pivot language-target language corpus, for at least one first target language phrase matching the semanteme of each first pivot language phrase, the first target language phrases forming a target language phrase set; combining the second source language phrases in the source language phrase set with the first target language phrases in the target language phrase set; and storing the combined phrase pairs of source language phrases and target language phrases into a source language-target language corpus. The data of the bilingual corpus is thereby expanded, which addresses the problem of data sparseness in the bilingual corpus.
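
The steps recited in the abstract can be illustrated with a minimal Python sketch. It is not the implementation claimed in the application: the function name `expand_corpus`, the representation of each corpus as a collection of phrase pairs, the toy phrases, and the use of exact string equality as a stand-in for "matching the semanteme" are all assumptions made for illustration.

```python
from itertools import product


def expand_corpus(first_source, src_pivot_pairs, pivot_tgt_pairs, src_tgt_corpus):
    """Add pivot-derived (source, target) phrase pairs to src_tgt_corpus.

    Hypothetical helper: corpora are modelled as iterables of phrase pairs,
    and semantic matching is approximated by exact string equality.
    """
    # Step 1: pivot phrases aligned with the first source phrase.
    pivots = {p for s, p in src_pivot_pairs if s == first_source}

    # Step 2: source phrases aligned with any of those pivot phrases
    # (this set includes the first source phrase itself).
    source_set = {s for s, p in src_pivot_pairs if p in pivots}

    # Step 3: target phrases aligned with any of those pivot phrases.
    target_set = {t for p, t in pivot_tgt_pairs if p in pivots}

    # Step 4: combine every source phrase with every target phrase.
    new_pairs = set(product(source_set, target_set))

    # Step 5: store the combined pairs in the source-target corpus.
    src_tgt_corpus.update(new_pairs)
    return new_pairs


# Toy data (assumed): romanized Chinese as source, English as pivot,
# Spanish as target.
src_pivot = [
    ("ni hao", "hello"),
    ("nin hao", "hello"),
    ("ni hao", "hi"),
    ("zai jian", "goodbye"),
]
pivot_tgt = [
    ("hello", "hola"),
    ("hi", "hola"),
    ("goodbye", "adios"),
]

corpus = set()
new_pairs = expand_corpus("ni hao", src_pivot, pivot_tgt, corpus)
print(sorted(new_pairs))
```

In this sketch the pair ("nin hao", "hola") is obtained even though "nin hao" was never the query phrase, which is the kind of expansion the abstract describes. A practical system would presumably score or filter the combined pairs rather than keep the full cross product; the abstract does not say how.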

Inventors:
ZHU XIAONING (CN)
HE ZHONGJUN (CN)
WU HUA (CN)
WANG HAIFENG (CN)
Application Number:
PCT/CN2014/085947
Publication Date:
May 14, 2015
Filing Date:
September 04, 2014
Assignee:
BEIJING BAIDU NETCOM SCI & TEC (CN)
International Classes:
G06F40/00; G06F17/28
Foreign References:
CN103577399A    2014-02-12
CN102591857A    2012-07-18
US20070010989A1    2007-01-11
US20080249760A1    2008-10-09
Attorney, Agent or Firm:
BEYOND ATTORNEYS AT LAW (CN)
北京品源专利代理有限公司 (CN)