Document feature extraction device, document feature extraction method, and document feature extraction program - Nippon Telegraph and Telephone Corporation

Title:

Document feature extraction device, document feature extraction method, and document feature extraction program

Document Type and Number:

Japanese Patent JP5739310

Kind Code:

B2

Abstract:

PROBLEM TO BE SOLVED: To appropriately extract features that reflect a viewer's browsing intention by using the reference (link) relationships between structured documents. SOLUTION: A browsing history recording unit 2 of a document feature extraction device 1 records each viewer's browsing history in a browsing history set DB 3. A feature extraction unit 4 extracts a link, together with the text related to that link, from each link-source structured document contained in the browsing history in the DB 3. Words are then extracted from the body text, taken as the representative portion, of the link-destination structured document that contains the extracted information. A feature recalculation unit 5 calculates a weight for each extracted word. An output unit 6 outputs the extracted words in priority order according to their weights.
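
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not state the weighting formula, so the scheme here (each body-text word counts 1.0, or a `boost` value when it also appears in the related text of the followed link) is an assumption made purely for illustration. The names `Link`, `tokenize`, `extract_features`, and `boost` are likewise hypothetical, and the regex tokenizer stands in for whatever word extraction the device actually uses.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Link:
    href: str          # URL of the link destination
    related_text: str  # anchor text plus nearby text in the link source


def tokenize(text):
    """Lowercase word tokenizer; a stand-in for real word extraction."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_features(history, links_by_page, body_by_page, boost=2.0):
    """Return (word, weight) pairs sorted by descending weight.

    history       -- list of (source_url, destination_url) page transitions
    links_by_page -- source_url -> list of Link objects found in that page
    body_by_page  -- url -> body text (the representative portion of the page)
    boost         -- assumed weight for words that also occur in the related
                     text of the link that was followed (not from the patent)
    """
    weights = Counter()
    for src, dst in history:
        for link in links_by_page.get(src, []):
            if link.href != dst:
                continue  # only the link that was actually followed matters
            related = set(tokenize(link.related_text))
            for word in tokenize(body_by_page.get(dst, "")):
                weights[word] += boost if word in related else 1.0
    # Highest weight first; ties are broken alphabetically for stable output.
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical browsing history: the user followed a link from a.html to b.html.
history = [("a.html", "b.html")]
links = {"a.html": [Link("b.html", "camera reviews"), Link("c.html", "travel")]}
bodies = {"b.html": "This camera has a fast lens. Camera reviews praise the lens."}

features = extract_features(history, links, bodies)
print(features[:3])
```

In this toy example, "camera" ranks first because it occurs twice in the destination body and also in the related text of the followed link. A real system would replace the tokenizer, the body-text lookup, and the weighting with the components the patent specification defines.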

Inventors:

Masayuki Sugisaki
Yamato Takahashi
Shigeru Fujimura
Tadashi Uchiyama

Application Number:

JP2011249430A

Publication Date:

June 24, 2015

Filing Date:

November 15, 2011

Assignee:

Nippon Telegraph and Telephone Corporation

International Classes:

G06F17/30

Domestic Patent References:

JP2008176685A
JP2003242166A
JP2011048730A
JP2007264718A

Foreign References:

US20070240031

Other References:

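小谷忠史 et al., "A method for identifying important parts of hyperlinked pages that considers link-source context," IPSJ SIG Technical Report, Japan, Information Processing Society of Japan, May 23, 2003, Vol. 2003, No. 54 (2003-DD-39(1)), pp. 1-6.
小谷忠史 et al., "A method for identifying important parts in hyperlink destination pages," Materials of the 64th SIG on Knowledge-Based Systems, Japanese Society for Artificial Intelligence, March 1, 2004, Vol. SIG-KBS-A304, No. 41, pp. 245-250.
近藤光正 et al., "A lightweight method for extracting body text from Web documents," Proceedings of the 2009 IEICE General Conference, Japan, Institute of Electronics, Information and Communication Engineers, March 4, 2009, Vol. 1, No. D-4-11, p. 29.
黒田慎介 et al., "Sharing and retrieval of user activities in the Web environment," Proceedings of the 13th Data Engineering Workshop (DEWS2002), Japan, IEICE Technical Committee on Data Engineering, May 15, 2002, Vol. 2002, No. A2-4, pp. 1-8.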

Attorney, Agent or Firm:

Hiromichi Kobayashi
Uzawa Hidehisa
Koji Yamaguchi
Tsuyoshi Hashimoto
