


Title:
PDF data extraction system and program for PDF data extraction system
Document Type and Number:
Japanese Patent JP6719862
Kind Code:
B2
Abstract:
PROBLEM TO BE SOLVED: To provide a system capable of correctly reconstructing a table contained in a PDF file, for example an analysis report obtained as a result of controlling an analyzer and running an analysis, even when the table contains a blank cell.

SOLUTION: The system comprises: an extraction unit setting part that sets the middleware for the PDF file to extract character strings from a PDF file line by line; a horizontal threshold setting part that sets the horizontal threshold used in line-by-line character string extraction to a predetermined value; a character string data acquisition part that uses the middleware to extract the character strings line by line from a specified PDF file according to the setting of the extraction unit setting part; a table formation part that, when there are consecutive lines each containing a plurality of character strings separated by a horizontal movement amount equal to or greater than the horizontal threshold, forms a table from the coordinate values of the character strings in those lines; and an output part that outputs the table in a predetermined data format. The table can thus be reconstructed accurately, and the user's workload is reduced.

SELECTED DRAWING: Figure 2
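The table formation and output steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the input shape (each line as a list of `(x_start, text)` pairs, already split by the middleware at gaps equal to or greater than the horizontal threshold), the function names, the `tolerance` parameter for grouping x-coordinates into columns, and the choice of CSV as the output format are all assumptions made for illustration.

```python
import csv
import io
from typing import List, Tuple

# Hypothetical input shape: one entry per extracted line, each a list of
# (x_start, text) pairs already split at gaps >= the horizontal threshold.
TextRun = Tuple[float, str]
Line = List[TextRun]


def find_table_blocks(lines: List[Line], min_rows: int = 2) -> List[List[Line]]:
    """Group consecutive lines that each contain two or more separated strings."""
    blocks: List[List[Line]] = []
    current: List[Line] = []
    for line in lines:
        if len(line) >= 2:
            current.append(line)
            continue
        if len(current) >= min_rows:
            blocks.append(current)
        current = []
    if len(current) >= min_rows:
        blocks.append(current)
    return blocks


def column_anchors(block: List[Line], tolerance: float) -> List[float]:
    """Collect distinct column x-positions; values within `tolerance` share a column."""
    anchors: List[float] = []
    for x in sorted(x for line in block for x, _ in line):
        if not anchors or x - anchors[-1] > tolerance:
            anchors.append(x)
    return anchors


def build_table(block: List[Line], tolerance: float = 5.0) -> List[List[str]]:
    """Place each string in the nearest column; columns with no string stay blank."""
    anchors = column_anchors(block, tolerance)
    table: List[List[str]] = []
    for line in block:
        row = [""] * len(anchors)
        for x, text in line:
            col = min(range(len(anchors)), key=lambda i: abs(anchors[i] - x))
            row[col] = text  # assumption: at most one string per column per line
        table.append(row)
    return table


def to_csv(table: List[List[str]]) -> str:
    """Serialize the table as CSV (an assumed 'predetermined data format')."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(table)
    return buffer.getvalue()


# Made-up sample data: the second table row has no value in the "RT" column.
lines: List[Line] = [
    [(10.0, "Report")],
    [(10.0, "Peak"), (100.0, "RT"), (200.0, "Area")],
    [(10.0, "1"), (200.0, "1520")],
    [(10.0, "2"), (100.0, "3.41"), (200.0, "980")],
    [(10.0, "End of report")],
]
blocks = find_table_blocks(lines)
table = build_table(blocks[0])
print(to_csv(table))
```

Because each string is assigned to a column by its x-coordinate rather than by its position within the line, the row with the missing "RT" value keeps "1520" in the third column instead of shifting it left. A row that holds only a single string would end the block in this sketch, so a fuller treatment would need additional handling for that case.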

Inventors:
Wakabayashi Kazuto
Application Number:
JP2015057056A
Publication Date:
July 08, 2020
Filing Date:
March 20, 2015
Assignee:
SHIMADZU CORPORATION
International Classes:
G06F40/151
Domestic Patent References:
JP2008257737A
JP201015554A
JP6266742A
Foreign References:
US6757870
Attorney, Agent or Firm:
Kyoto International Patent Office