Title:
Table structure analysis device and table structure analysis method
Document Type and Number:
Japanese Patent JP5775839
Kind Code:
B2
Abstract:
PROBLEM TO BE SOLVED: To stably extract item name-data relations from a large number of unspecified documents, even when the item name words are not known in advance and a complete item name dictionary cannot be prepared.
SOLUTION: The form features and character string features of every pair of adjacent frames in a table are compared, and a difference score expressing how much they differ is assigned to the contact between the two frames. Next, for every ruled grid line in the table, the difference scores assigned to the frame contacts lying on that line are projected onto it (for example, by taking their sum or their average) to calculate an item name-data boundary score. This score is a certainty factor indicating whether the ruled grid line is a boundary between an item name frame and a data frame. It is based on the policy that a contact at which the frame features differ is a boundary between an item name frame and a data frame. The item name frames in the table are then determined from the position of the item name-data boundary, and the item name-data relations are determined from the adjacency between those frames and the other frames.
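The scoring idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a regular grid with one string per frame, considers only horizontal ruled lines, and uses two made-up string features (numeric or not, empty or not). The function names `cell_features`, `difference_score`, `row_boundary_scores`, and `extract_relations` are hypothetical.

```python
from typing import Dict, List, Tuple

Table = List[List[str]]


def cell_features(text: str) -> Tuple[bool, bool]:
    """Hypothetical string features of one frame: (is_numeric, is_empty)."""
    stripped = text.strip().replace(",", "")
    is_numeric = stripped.replace(".", "", 1).isdigit()
    return (is_numeric, stripped == "")


def difference_score(a: str, b: str) -> float:
    """Fraction of features that differ between two adjacent frames."""
    fa, fb = cell_features(a), cell_features(b)
    return sum(x != y for x, y in zip(fa, fb)) / len(fa)


def row_boundary_scores(table: Table) -> List[float]:
    """Project contact scores onto each horizontal line by averaging.

    Entry r is the score of the line between row r and row r + 1.
    """
    scores = []
    for r in range(len(table) - 1):
        diffs = [
            difference_score(table[r][c], table[r + 1][c])
            for c in range(len(table[r]))
        ]
        scores.append(sum(diffs) / len(diffs))
    return scores


def extract_relations(table: Table) -> List[Dict[str, str]]:
    """Treat the highest-scoring line as the item name-data boundary,
    then pair each data frame with the item name above it (assumption:
    item names sit in the rows above the boundary, same column)."""
    scores = row_boundary_scores(table)
    boundary = max(range(len(scores)), key=scores.__getitem__)
    header_rows = table[: boundary + 1]
    data_rows = table[boundary + 1 :]
    names = [
        " ".join(row[c] for row in header_rows).strip()
        for c in range(len(table[0]))
    ]
    return [dict(zip(names, row)) for row in data_rows]


if __name__ == "__main__":
    sample = [
        ["Name", "Qty", "Price"],
        ["Apple", "3", "120"],
        ["Pear", "5", "90"],
    ]
    print(row_boundary_scores(sample))
    print(extract_relations(sample))
```

For the sample table, the line under the first row gets the highest score because two of its three contacts separate a non-numeric frame from a numeric one, so the first row is taken as the item names. Form features (frame size, position, and so on) and vertical ruled lines, which the abstract also covers, would need further terms that this sketch leaves out.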

Inventors:
Junichi Hirayama
Masakazu Fujio
Yoshiyuki Kobayashi
Kimikichi Machii
Kaoru Kawabata
Application Number:
JP2012056656A
Publication Date:
September 09, 2015
Filing Date:
March 14, 2012
Assignee:
Hitachi, Ltd.
International Classes:
G06K9/00; G06K9/20; G06K9/68; G06T7/60
Domestic Patent References:
JP2001143018A
JP2009169844A
JP2011248609A
JP11161736A
Foreign References:
US6549662
Attorney, Agent or Firm:
Manabu Inoue
Yuji Toda
Shigemi Iwasaki