Title:
DOCUMENT PICTURE STRUCTURE ANALYSIS METHOD
Document Type and Number:
Japanese Patent JPH11232439
Kind Code:
A
Abstract:

The aim is to analyze the structure of a document image precisely and efficiently, using table-of-contents information, when the document image is converted into an electronic document.

To learn the document structure of the whole document, the document image of the table-of-contents page is captured first, a basic rectangle is extracted for every line, and the characters are recognized and analyzed. In this step, chapter/section numbers are analyzed, headings (index entries) are extracted, and the page number of each heading is extracted. The document images of the text pages are then captured: several dozen consecutive pages are input, and basic rectangles are extracted and analyzed for each page. The layout elements (header, footer, page number, chapter/section heading, body text, and graphic/table) are identified from the layout features of the extracted basic rectangles. All elements except the rectangles identified as graphic/table are character-recognized. As the matching process, for each page that the table-of-contents analysis assigns to a heading, the heading is matched against the heading candidates extracted in the text analysis, which yields more precise heading information.
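The table-of-contents analysis and the matching step can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify the line pattern or the similarity measure, so the regular expression `TOC_LINE`, the names `TocEntry`, `parse_toc_line`, and `match_headings`, the use of `difflib.SequenceMatcher`, and the 0.8 threshold are all assumptions made for illustration. Input is assumed to be text already produced by character recognition.

```python
import re
from dataclasses import dataclass
from difflib import SequenceMatcher

# Hypothetical pattern for one recognized table-of-contents line, e.g.
# "1.2 Layout analysis ..... 15": chapter/section number, title, page number.
TOC_LINE = re.compile(
    r"^\s*(?P<number>\d+(?:\.\d+)*)\.?\s+(?P<title>.+?)[\s.]*(?P<page>\d+)\s*$"
)


@dataclass
class TocEntry:
    number: str  # chapter/section number, e.g. "1.2"
    title: str   # heading text as recognized on the contents page
    page: int    # page number listed for the heading


def parse_toc_line(text):
    """Return a TocEntry for one contents line, or None if it does not fit."""
    m = TOC_LINE.match(text)
    if m is None:
        return None
    return TocEntry(m["number"], m["title"].strip(), int(m["page"]))


def normalize(text):
    """Collapse whitespace and lowercase so OCR spacing noise matters less."""
    return re.sub(r"\s+", " ", text).strip().lower()


def match_headings(toc, candidates_by_page, threshold=0.8):
    """Map each section number to the best heading candidate on its page.

    candidates_by_page maps a page number to the heading candidates that
    layout analysis found on that page. The similarity measure and the
    threshold are assumptions, chosen to tolerate small recognition errors.
    """
    result = {}
    for entry in toc:
        best, best_score = None, 0.0
        for cand in candidates_by_page.get(entry.page, []):
            score = SequenceMatcher(
                None, normalize(entry.title), normalize(cand)
            ).ratio()
            if score > best_score:
                best, best_score = cand, score
        result[entry.number] = best if best_score >= threshold else None
    return result


# Usage with made-up data: one candidate contains a recognition error.
toc_lines = ["1 Introduction ..... 3", "1.1 Layout analysis ... 5"]
toc = [e for e in map(parse_toc_line, toc_lines) if e is not None]
candidates = {
    3: ["Introductlon", "This chapter describes the method"],
    5: ["Page header", "Layout analysis"],
}
matched = match_headings(toc, candidates)
```

Restricting the comparison to the page named in the table of contents keeps the search small and lets the contents page confirm or reject candidates that layout features alone left ambiguous.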


Inventors:
HAYASHI TOSHINARI
Application Number:
JP5013098A
Publication Date:
August 27, 1999
Filing Date:
February 16, 1998
Assignee:
HAYASHI TOSHINARI
International Classes:
G06F17/21; G06F17/30; G06K9/20; G06T1/00; G06T7/00; G06T7/40; (IPC1-7): G06T1/00; G06F17/21; G06F17/27; G06T7/00
Attorney, Agent or Firm:
Kawahara Kazuho