SYSTEM FOR EXTRACTING TERM FROM DOCUMENT CONTAINING TEXT SEGMENT

Document Type and Number:

WIPO Patent Application WO/2010/038540

Kind Code:

A1

Abstract:

Provided are techniques for extracting terms from a document, classifying the extracted terms from a viewpoint useful for understanding the document in summary or in detail, and presenting the classified terms to a user. A computer system extracts noun words from document data containing a text segment by using first text processing information. It extracts term candidates for the noun words, from the document data or from a corpus including text data written in the same language as the document data, by using second text processing information. To determine to which of a plurality of kinds of noun words the noun words and the term candidates belong, the system selects a kind of noun words to be given a weight by using third text processing information, gives the weight to the respective noun words and term candidates according to the selected kind, determines the kind to which the noun words and the term candidates belong according to the given weight, and outputs the noun words and the term candidates in association with the determined kind.
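The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the abstract does not say what the first, second, and third text processing information contain, so the noun lexicon, the rule that adjacent nouns form a term candidate, the cue-word table, and the kind names "component" and "actor" are all hypothetical stand-ins chosen for illustration.

```python
import re
from collections import defaultdict

# Hypothetical "first text processing information": a tiny noun lexicon.
NOUN_LEXICON = {"server", "database", "backup", "administrator", "log"}

# Hypothetical "third text processing information": a cue word that directly
# follows a term selects the kind that receives a weight.
KIND_CUES = {
    "component": {"stores", "runs"},
    "actor": {"restarts", "reviews"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_nouns(tokens):
    """Noun words: every token found in the lexicon."""
    return {t for t in tokens if t in NOUN_LEXICON}


def extract_candidates(tokens):
    """Hypothetical "second text processing information": a run of two or
    more adjacent lexicon nouns is treated as a term candidate."""
    candidates, run = set(), []
    for tok in tokens + [""]:  # the empty sentinel flushes the final run
        if tok in NOUN_LEXICON:
            run.append(tok)
        else:
            if len(run) >= 2:
                candidates.add(" ".join(run))
            run = []
    return candidates


def weigh(tokens, terms):
    """Add one unit of weight to a kind each time one of its cue words
    directly follows an occurrence of the term."""
    weights = defaultdict(lambda: defaultdict(int))
    for term in terms:
        words = term.split()
        n = len(words)
        for i in range(len(tokens) - n):
            if tokens[i:i + n] == words:
                following = tokens[i + n]
                for kind, cues in KIND_CUES.items():
                    if following in cues:
                        weights[term][kind] += 1
    return weights


def classify(text):
    """Return {term: kind}, where kind is the highest-weighted kind,
    or None when no cue word selected any kind for the term."""
    tokens = tokenize(text)
    terms = extract_nouns(tokens) | extract_candidates(tokens)
    weights = weigh(tokens, terms)
    result = {}
    for term in terms:
        kinds = weights.get(term)
        result[term] = max(kinds, key=kinds.get) if kinds else None
    return result


if __name__ == "__main__":
    sample = ("The database server stores the log. "
              "The administrator reviews the backup log.")
    for term, kind in sorted(classify(sample).items()):
        print(f"{term}: {kind}")
```

In this sketch each term is output in association with the kind that accumulated the most weight, and a term that never receives a weight is left unclassified. A real system would presumably replace the lexicon with a part-of-speech tagger and could draw term candidates from a separate corpus in the same language, as the abstract allows.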

More Like This:

JPH03138763	NATURAL LANGUAGE GENERATING METHOD AND NATURAL LANGUAGE PRESENTING METHOD
JPH08185404	SYSTEM AND METHOD FOR PROCESSING NATURAL LANGUAGE

Inventors:

IKAWA YOHEI (JP)
TAKEUCHI HIRONORI (JP)
NEGISHI SHIHO (JP)

Application Number:

PCT/JP2009/063584

Publication Date:

April 08, 2010

Filing Date:

July 30, 2009

Assignee:

IBM (US)
IKAWA YOHEI (JP)
TAKEUCHI HIRONORI (JP)
NEGISHI SHIHO (JP)

International Classes:

G06F17/28; G06F17/21; G06F17/30

Foreign References:

JP2004151882A	2004-05-27
JPH09190438A	1997-07-22
JP2005196513A	2005-07-21
JPH10177575A	1998-06-30

Other References: