Title:
SYSTEM FOR EXTRACTING TERM FROM DOCUMENT CONTAINING TEXT SEGMENT
Document Type and Number:
WIPO Patent Application WO/2010/038540
Kind Code:
A1
Abstract:
Provided are techniques for extracting terms from a document, classifying the extracted terms from a viewpoint useful for understanding the document in summary or in detail, and presenting the classified terms to a user. A computer system extracts noun words from document data containing a text segment by using first text processing information. Using second text processing information, it extracts term candidates for the noun words from the document data or from a corpus including text data described in the same language as the document data. Using third text processing information, it selects a kind of noun words to be given a weight, in order to determine to which of a plurality of kinds of noun words the noun words and the term candidates belong. It then gives the weight to the respective noun words and term candidates according to the selected kind, determines the kind to which the noun words and the term candidates belong according to the given weight, and outputs the noun words and the term candidates in association with the determined kind.
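
The flow described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the noun lexicon (standing in for the first text processing information), the adjacent-noun rule for term candidates (second), the cue-word table that selects a kind (third), the kind names "component" and "problem", and the unit weight are all hypothetical choices made only for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for the "first text processing information":
# a tiny lexicon of words treated as nouns.
NOUN_LEXICON = {"printer", "driver", "cartridge", "error", "timeout"}

# Hypothetical stand-in for the "third text processing information":
# cue words that select which kind receives a weight.
CUE_RULES = {
    "install": "component",
    "replace": "component",
    "reports": "problem",
    "fails": "problem",
}


def tokenize(sentence):
    """Lowercase the sentence and split it on whitespace, dropping punctuation."""
    return [w.strip(".,;:").lower() for w in sentence.split()]


def extract_nouns(tokens):
    """Step 1: keep the tokens found in the noun lexicon."""
    return [t for t in tokens if t in NOUN_LEXICON]


def extract_candidates(tokens):
    """Step 2 (assumed rule): two adjacent nouns form a term candidate."""
    return [
        f"{a} {b}"
        for a, b in zip(tokens, tokens[1:])
        if a in NOUN_LEXICON and b in NOUN_LEXICON
    ]


def classify_terms(document):
    """Steps 3 to 5: weight each term per kind, then pick the heaviest kind.

    Terms that never share a sentence with a cue word receive no weight
    and are left out of the result.
    """
    weights = defaultdict(lambda: defaultdict(float))
    for sentence in document.split("."):
        tokens = tokenize(sentence)
        terms = extract_nouns(tokens) + extract_candidates(tokens)
        kinds = [CUE_RULES[t] for t in tokens if t in CUE_RULES]
        for term in terms:
            for kind in kinds:
                weights[term][kind] += 1.0  # assumed unit weight per cue
    return {term: max(by_kind, key=by_kind.get) for term, by_kind in weights.items()}


sample = "Install the printer driver. The scanner reports a timeout error."
for term, kind in sorted(classify_terms(sample).items()):
    print(f"{term}: {kind}")
```

In this sketch each term is output together with its determined kind, mirroring the final step of the abstract. A real system would presumably replace the lexicon with a part-of-speech tagger and the cue table with richer rules.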

Inventors:
IKAWA YOHEI (JP)
TAKEUCHI HIRONORI (JP)
NEGISHI SHIHO (JP)
Application Number:
PCT/JP2009/063584
Publication Date:
April 08, 2010
Filing Date:
July 30, 2009
Assignee:
IBM (US)
IKAWA YOHEI (JP)
TAKEUCHI HIRONORI (JP)
NEGISHI SHIHO (JP)
International Classes:
G06F17/28; G06F17/21; G06F17/30
Foreign References:
JP2004151882A 2004-05-27
JPH09190438A 1997-07-22
JP2005196513A 2005-07-21
JPH10177575A 1998-06-30
Other References:
See also references of EP 2315129A4
Attorney, Agent or Firm:
UENO Takeshi et al. (JP)
Tsuyoshi Ueno (JP)