Title:
PRINT DATA TRANSFORMATION
Document Type and Number:
WIPO Patent Application WO/2008/022377
Kind Code:
A1
Abstract:
A method of providing an output file based on print data. The method includes, in a processing system determining a layout associated with the print data, extracting content and generating an output file using the extracted content and the layout.

Inventors:
KEMP, Todd, Mitchell (67 Crusis Street, Coorparoo, QLD 4151, AU)
KEMP, Jennifer, Frances (67 Crusis Street, Coorparoo, QLD 4151, AU)
Application Number:
AU2007/001186
Publication Date:
February 28, 2008
Filing Date:
August 20, 2007
Assignee:
ROMEENA PTY LIMITED AS TRUSTEE FOR KEMP FAMILY TRUST (67 Crusis Street, Coorparoo, QLD 4151, AU)
KEMP, Todd, Mitchell (67 Crusis Street, Coorparoo, QLD 4151, AU)
KEMP, Jennifer, Frances (67 Crusis Street, Coorparoo, QLD 4151, AU)
International Classes:
G06F17/00
Attorney, Agent or Firm:
SMITH, Alistair, James et al. (Davies Collison Cave, Level 14, 255 Elizabeth Street, Sydney, New South Wales 2000, AU)
Claims:
THE CLAIMS DEFINING THE INVENTION ARE AS FOLLOWS:

1) A method of providing an output file based on print data, the method including, in a processing system: a) determining a layout associated with the print data; b) extracting content; and, c) generating an output file using the extracted content and the layout.

2) A method according to claim 1, wherein the method includes, in the processing system: a) receiving the print data; and, b) validating a print data source.

3) A method according to claim 2, wherein the validation involves at least one of: a) validating an end station; and, b) validating a user of an end station.

4) A method according to claim 1, wherein the print data is a representation of at least one document.

5) A method according to claim 1, wherein the method includes, in the processing system, determining the layout by: a) determining at least one document type associated with the print data; and, b) determining at least one layout associated within the determined at least one document type.

6) A method according to claim 1, wherein the method includes, in the processing system, determining at least one document type by at least one of: a) determining at least one parameter associated with the print data; and, b) parsing the print data.

7) A method according to claim 1, wherein the method includes, in the processing system, determining the at least one layout by at least one of: a) selecting the at least one layout from a number of predetermined layouts; and, b) defining the at least one layout.

8) A method according to claim 7, wherein the method includes, in the processing system: a) displaying a representation of content within the print data; and, b) determining the layout in accordance with input commands from an operator.

9) A method according to claim 8, wherein the method includes, in the processing system, determining, using the input commands, at least one of: a) a field start location and a field end location; and b) a field region.

10) A method according to claim 1, wherein the method includes, in the processing system, decrypting the print data.

11) A method according to claim 1, wherein the method includes, in the processing system, generating a structured output file.

12) A method according to claim 11, wherein the structured output file includes at least one field identified using a respective identifier, the at least one field being associated with respective content.

13) A method according to claim 12, wherein the at least one field is identified using a respective structured output file includes at least one field associated with respective content.

14) A method according to claim 1, wherein the method includes, in the processing system: a) determining, using the layout, at least one field within the print data; and, b) at least one of: i) extracting the content from the at least one field; and, ii) importing content into the at least one field in the output file.

15) A method according to claim 1, wherein the method includes, in the processing system: a) comparing the output file to the print data; and, b) in response to the results of the comparison, at least one of: i) generating an alert; and, ii) providing the output file.

16) A method according to claim 1, wherein the method includes, in the processing system, receiving the print data from an end station via a communications network.

17) A method according to claim 1, wherein the print data is for use in a medical environment.

18) A processing system for providing an output file based on print data, the processing system being for: a) determining a layout associated with the print data; b) extracting content; and, c) generating an output file using the extracted content and the layout.

19) A processing system according to claim 18, the apparatus being for performing the method of claim 1.

20) A method of providing an output file based on print data, the method including, in an end station: a) generating print data; and, b) transferring the print data to a processing system, the processing system being for: i) determining a layout associated with the print data; ii) extracting content; and, iii) generating an output file using the extracted content and the layout.

21) A method according to claim 20, wherein the method includes, in the end station, generating the print data in accordance with validation information.

22) A method according to claim 20, wherein the method includes, in the end station, encrypting the print data.

23) A method according to claim 20, wherein the method includes, in the end station, transferring the print data to the processing system in accordance with at least one transfer option.

24) A method according to claim 20, wherein the method includes, in the end station, transferring the print data to the end station via at least one of: a) web services; b) e-mail; and, c) file transfer protocols.

25) A method according to claim 20, wherein the print data is for use in a medical environment.

26) Apparatus for providing an output file based on print data, the apparatus including an end station for: a) generating print data; and, b) transferring the print data to a processing system, the processing system being for: i) determining a layout associated with the print data; ii) extracting content; and, iii) generating an output file using the extracted content and the layout.

27) Apparatus according to claim 26, the apparatus being for performing the method of claim 1.

Description:

PRINT DATA TRANSFORMATION

Background of the Invention

The present invention relates to a method and apparatus for providing an output file based on print data, and in particular to a method and apparatus for transforming print data to generate a structured output file.

Description of the Prior Art

The reference in this specification to any prior publication (or information derived from it), or to any matter which is known, is not, and should not be taken as an acknowledgment or admission or any form of suggestion that the prior publication (or information derived from it) or known matter forms part of the common general knowledge in the field of endeavour to which this specification relates.

Databases are used in the medical and other industries for tracking usage and ordering of items, such as consumables, or the like. Thus, for example, when items such as consumables are used, it is typical for the individuals or devices using them to generate a report outlining the consumables or other items used, allowing the database to be updated to thereby reflect current stock levels or the like.

However, the reports are not usually provided in a form suitable to allow automated updating of the database. Thus, for example, existing medical equipment, such as sterilisers, typically generates reports of used consumables or process validation in a printed format. In this case, it is necessary for an individual to interpret the printed report, manually extract relevant information and utilise this to update database entries reflecting consumables used and/or to identify consumables required for re-order.

Accordingly, the maintenance of such databases is typically performed manually, which is a complex and time consuming procedure. This also suffers from the disadvantage that database maintenance can be inaccurate as a result of mistakes made by operators.

Summary of the Present Invention

In a first broad form the present invention provides a method of providing an output file based on print data, the method including, in a processing system: a) determining a layout associated with the print data; b) extracting content; and, c) generating an output file using the extracted content and the layout.

Typically the method includes, in the processing system: a) receiving the print data; and, b) validating a print data source.

Typically the validation involves at least one of: a) validating an end station; and, b) validating a user of an end station.

Typically the print data is a representation of at least one document.

Typically the method includes, in the processing system, determining the layout by: a) determining at least one document type associated with the print data; and, b) determining at least one layout associated within the determined at least one document type.

Typically the method includes, in the processing system, determining at least one document type by at least one of: a) determining at least one parameter associated with the print data; and, b) parsing the print data.

Typically the method includes, in the processing system, determining the at least one layout by at least one of: a) selecting the at least one layout from a number of predetermined layouts; and, b) defining the at least one layout.

Typically the method includes, in the processing system: a) displaying a representation of content within the print data; and, b) determining the layout in accordance with input commands from an operator.

Typically the method includes, in the processing system, determining, using the input commands, at least one of: a) a field start location and a field end location; and b) a field region.

Typically the method includes, in the processing system, decrypting the print data.

Typically the method includes, in the processing system, generating a structured output file.

Typically the structured output file includes at least one field identified using a respective identifier, the at least one field being associated with respective content.

Typically the at least one field is identified using a respective structured output file includes at least one field associated with respective content.

Typically the method includes, in the processing system: a) determining, using the layout, at least one field within the print data; and, b) at least one of: i) extracting the content from the at least one field; and, ii) importing content into the at least one field in the output file.

Typically the method includes, in the processing system: a) comparing the output file to the print data; and, b) in response to the results of the comparison, at least one of: i) generating an alert; and, ii) providing the output file.

Typically the method includes, in the processing system, receiving the print data from an end station via a communications network.

Typically the print data is for use in a medical environment.

In a second broad form the present invention provides a processing system for providing an output file based on print data, the processing system being for: a) determining a layout associated with the print data; b) extracting content; and, c) generating an output file using the extracted content and the layout.

Typically the apparatus is for performing the method of the first broad form of the invention.

In a third broad form the present invention provides a method of providing an output file based on print data, the method including, in an end station: a) generating print data; and, b) transferring the print data to a processing system, the processing system being for: i) determining a layout associated with the print data; ii) extracting content; and, iii) generating an output file using the extracted content and the layout.

Typically the method includes, in the end station, generating the print data in accordance with validation information.

Typically the method includes, in the end station, encrypting the print data.

Typically the method includes, in the end station, transferring the print data to the processing system in accordance with at least one transfer option.

Typically the method includes, in the end station, transferring the print data to the end station via at least one of: a) web services; b) e-mail; and, c) file transfer protocols.

Typically the print data is for use in a medical environment.

In a fourth broad form the present invention provides apparatus for providing an output file based on print data, the apparatus including an end station for: a) generating print data; and, b) transferring the print data to a processing system, the processing system being for: i) determining a layout associated with the print data; ii) extracting content; and, iii) generating an output file using the extracted content and the layout.

Typically the apparatus is for performing the method of the third broad form of the invention.

Brief Description of the Drawings

An example of the present invention will now be described with reference to the accompanying drawings, in which: -

Figure 1 is a schematic diagram of an example of a system for providing an output file based on print data;

Figure 2 is a schematic diagram of an example of a distributed architecture for use in providing an output file;

Figure 3 is a schematic diagram of an example of a processing system of Figure 2;

Figure 4 is a schematic diagram of an example of an end station of Figure 2;

Figures 5A to 5C are a flow chart of a second example of the process of providing an output file based on print data;

Figure 6 is a schematic diagram of an example of the relationship between a print client and print server;

Figure 7 is a schematic diagram of an example of the functionality of the print server of Figure 6;

Figure 8 is a schematic diagram of an example of a graphical user interface for defining a layout.

Detailed Description of the Preferred Embodiments

An example of a process of providing an output file based on print data will now be described with reference to Figure 1.

At step 100 print data is obtained. This may be achieved in any one of a number of ways but typically involves having a device, such as a computer system, or other custom device with embedded processing capabilities, generate data representing information to be printed.

The print data may be in any one of a number of forms depending on the preferred implementation. Thus, for example, the print data could be in a form that can be transferred directly to and interpreted by a printer to allow the printer to provide a physical hard copy of the respective information. This may therefore be a postscript file, or similar. Alternatively, the print data can be in custom or any other suitable form depending on the nature of the information being printed.

At step 110 a layout associated with the print data is determined. The layout will typically depend not only on the form of the print data, but also on the nature of the information contained therein.

The layout may be determined in any one of a number of ways. Thus, for example, the layout may be predefined depending on the nature or source of the print data, in which case the layout could be selected from a number of potential layouts. Alternatively, it may be necessary to define a layout if no previous suitable layout has been defined for the respective print data.

At step 120 fields within the print data are determined using the layout, with this being used to allow content to be extracted from the print data at step 130. At step 140 an output file is then generated using the extracted content.

Thus, for example, this process can utilise the layout to identify content within the print data. The content can then be extracted from the print data and imported into respective fields within an output file, based on the layout, allowing the output file to be created.

In one example, the output file is structured, such that the output file identifies the extracted content in some way. This can include, for example, providing identifiers, tags or elements within the output file that identify the nature, field or a context associated with the extracted content. The structured output file may be of any suitable form, such as an XML file, or the like.
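Steps 110 to 140 can be illustrated with a minimal Python sketch. It assumes the print data has already been reduced to plain text and that a layout is a list of field regions, each given by a line index and a start and end column. The `Field` class, the sample report and the element names are hypothetical and are not taken from the specification.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One field region of a hypothetical layout."""
    name: str   # identifier used as the element name in the output file
    line: int   # zero-based line index within the print data
    start: int  # zero-based start column (inclusive)
    end: int    # end column (exclusive)


def extract_content(print_text, layout):
    """Use the layout to locate each field and extract its content."""
    lines = print_text.splitlines()
    content = {}
    for field in layout:
        row = lines[field.line] if field.line < len(lines) else ""
        content[field.name] = row[field.start:field.end].strip()
    return content


def build_output(doc_type, content):
    """Generate a structured (XML) output with one element per field."""
    root = ET.Element("document", {"type": doc_type})
    for name, value in content.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


# Assumed sample print data and layout, for illustration only.
SAMPLE = (
    "STERILISER REPORT\n"
    "Cycle: 0042   Temp: 134C\n"
    "Duration: 18 min\n"
)
LAYOUT = [
    Field("cycle", 1, 7, 11),
    Field("temperature", 1, 20, 24),
    Field("duration", 2, 10, 16),
]

if __name__ == "__main__":
    print(build_output("steriliser_report", extract_content(SAMPLE, LAYOUT)))
```

Because each value is wrapped in an element named after its field, a downstream system can import the content into a database without re-interpreting the printed report.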

It will be appreciated that the print data may be indicative of a number of different documents being printed. Accordingly, it may be necessary to apply a number of different layouts to a single instance of print data. In this instance, an output file can be provided for each document. Alternatively, a single output file can be generated corresponding to the print data, in which case the output file can include multiple documents. In this instance, the documents may be identified within the output file, using fields in a manner similar to that described above with respect to the identification of specific content. Thus, for example, the start of a document can include groups such as "Tax Invoice" or "Tax Credit" when multiple documents defined by the print data are provided in a single output file.
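As a rough illustration of handling several documents within one instance of print data, the following Python sketch splits text-form print data wherever a line begins with a known document heading. The heading list and the `split_documents` helper are assumptions made for this example. Discarding any lines that precede the first heading is likewise an arbitrary choice, not something the specification prescribes.

```python
# Hypothetical headings that mark the start of a document.
DOCUMENT_HEADINGS = ("Tax Invoice", "Tax Credit")


def split_documents(print_text, headings=DOCUMENT_HEADINGS):
    """Return a list of (heading, document_text) pairs, in print order."""
    documents = []
    current = None
    for line in print_text.splitlines():
        heading = next(
            (h for h in headings if line.strip().startswith(h)), None
        )
        if heading is not None:
            # A new document starts here.
            current = (heading, [])
            documents.append(current)
        if current is not None:
            current[1].append(line)
    return [(heading, "\n".join(body)) for heading, body in documents]
```

Each returned pair could then be processed with its own layout, and written either to a separate output file or as a separate group within a single output file.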

The use of a structured file allows the output file to be interpreted automatically, for example to allow the content to be automatically imported into a database. This is possible as the use of the layout allows context or meaning associated with the content in the print data to be identified in the output file. This allows the context to be subsequently reused for interpretation purposes.

This process is typically performed using a distributed architecture, an example of which will now be described with reference to Figure 2.

In this example, a base station 201 is coupled to a number of end stations 203 via a communications network 202, such as the Internet, and/or via communications networks 204, such as local area networks (LANs), or wide area networks (WANs). Thus it will be appreciated that the LANs 204 may form an internal network at a specific location. A number of custom devices 205 may also be provided.

The base station 201 typically includes one or more processing systems 210, optionally coupled to one or more databases 211. In use, the processing system 210 is adapted to receive print data from the end stations 203, and provide an output file, allowing the print data to be automatically imported into databases, or to allow the print data to be more easily interpreted and utilised by other processing systems.

Accordingly, any form of suitable processing system 210 may be used. An example is shown in Figure 3. In this example, the processing system 210 includes at least a processor 300, a memory 301, an input/output (I/O) device 302, such as a keyboard, and display, and an external interface 303, coupled together via a bus 304 as shown.

In use, the processor 300 typically executes applications software stored in the memory 301 to allow received print data to be interpreted and transformed into an appropriate output file. This is typically achieved by having the processor parse the received print data in accordance with layouts stored in memory 301 or the database 211.

Additionally or alternatively, the processing system 210 may be used to define layouts, in which case the processor 300 can execute applications software to allow print data to be displayed to an operator, and to allow the operator to define an appropriate layout.

Accordingly, it will be appreciated that the processing system 210 may be formed from any suitable processing system, such as a suitably programmed PC, Internet terminal, lap-top, hand-held PC, web server, network server, or the like, which is typically operating applications software to enable the above described process to be performed.

The end stations 203 are typically adapted to generate print data and then communicate with the processing system 210 positioned at the base station 201 to allow the print data to be provided thereto. It will be appreciated that this allows a number of different forms of end station 203 to be used.

An example of a suitable end station 203 is shown in Figure 4. As shown the end station 203 includes a processor 400, a memory 401, an input/output device 402 such as a keyboard and display, and an external interface 403 coupled together via a bus 404, as shown. The external interface 403 is typically provided to allow the end station 203 to be coupled to one of the communications networks 202, 204, and accordingly, this may be in the form of a network interface card, or the like.

In use, the processor 400 executes applications software stored in the memory 401 allowing the print data to be generated and transferred to the base station 201, via the communications networks 202, 204.

Accordingly, it will be appreciated that the end stations 203 may be formed from any suitable processing system, such as a suitably programmed PC, Internet terminal, lap-top, hand-held PC, smart phone, PDA, web server, or the like.

In the case of a custom device 205, such as sterilisation equipment or other instruments, the device may be adapted to perform predetermined operations and then generate an output print data file indicative of the performance of the operation of the device. In one example, the custom device 205 may incorporate a processing system similar to that provided in the end station 203, thereby allowing the device 205 to operate in a manner substantially identical to that of the end station 203.

However, alternatively the device 205 may only be of limited configurability. In this case, it may not be possible for the manner in which the print data is provided to be altered as this is controlled by instructions embedded within the hardware. In this instance, the device is typically adapted to provide printed reports in a set format to a specific output port on the device. In this instance, an end station 203 may be utilised to receive the print data from the device 205 and then forward this to the base station 201. This can be achieved, either by connecting the end station 203 directly to the custom device 205, or by having the end station 203 monitor a specific port or other output from the custom device 205 to detect print data when it is generated. Alternatively there may be a requirement for an I/O or printer output to a LAN device to provide the TCP address for identification of the port or client/instrument.

It will therefore be appreciated that the custom device 205 may be of any suitable form, and may therefore be similar to the end station 203.

For the purpose of simplicity, the following description will therefore focus generally on the situation in which the end station 203 can provide print data in an appropriate manner allowing it to be transmitted to the base station 201. However, it will be appreciated that the techniques can also be applied to situations in which a custom device 205 is used, either alone, or in combination with an end station 203 in the event that the custom device cannot be suitably configured.

An example of this process will now be described in more detail with respect to Figures 5A to 5C.

At step 500 the user installs a print client on the end station 203. At step 505 the user then configures the print client with validation information and transfer options.

Steps 500 and 505 can be performed on a one off basis when the end station 203 is initially configured. Thus, this process will typically involve installation of the client in the form of a printer driver, allowing the end station 203 to transfer the print data to the base station 201.

In one example, installation of the print client causes an option to be displayed in a print dialogue box. This allows a print to an output file to be provided as an option in a drop down list including other available printers, such as any hardcopy printers, as will be appreciated by persons skilled in the art.

Once the client is installed the user can configure the validation information and transfer options. This can be achieved for example using a printer properties option in the print dialogue box.

The validation information can be utilised to identify the respective user and/or the end station 203 providing the print data. This may therefore be of any suitable form, such as a username and/or password, or a device identifier such as an IP or MAC (Media Access Control) address.

In one example, the validation information serves two purposes. Firstly, this allows the base station 201 to confirm that either the user and/or the end station 203 is registered to use the system. This is typically required as it is standard to charge for print operations in accordance with the process and in order to achieve this it may be necessary to identify the user and/or end station 203. Secondly, the system may be utilised to print sensitive information, in which case encryption processes may be used. In this instance identification of the user and/or end station 203 may be required to allow the encryption process to be performed correctly.

As part of this process it may also be necessary for the user to select and/or configure an encryption mechanism or the like, which may involve the provision of a public key, corresponding to a secret key held by the base station 201. This ensures that the print data cannot be decrypted and interpreted by third parties. This is of particular importance in the medical environment, in which print data may contain sensitive information such as patient details.

The transfer options will also typically list a manner in which the print data is to be transferred to the base station 201. This may be achieved in any one of a number of ways, such as through the use of FTP (File Transfer Protocol), email, HTTPS, submission to a website, or the like.

In any event once the initial configuration has been completed at step 505 the system may then be used.

Accordingly, at step 510 the end station 203 determines information for printing. This may be achieved in any one of a number of ways depending on the preferred implementation and the nature of the end station 203.

Thus, for example, if the end station 203 is a standard computer system the information will typically be generated by applications software installed thereon. Alternatively however if the end station 203 is a custom device 205, the information may be generated automatically during or upon completion of a process. Thus, for example, in the case of a sterilisation machine, each time the sterilisation process or part thereof is completed, the sterilisation machine will automatically generate information detailing the sterilising process. This will include information such as the time, duration, temperature, consumable used, or the like.

At step 515 the end station 203 generates the print data representing the information to be printed. The print data will be generated utilising the print client installed on the end station 203. Thus, this may involve having the user select a print option corresponding to a generate output file print selection. Alternatively however the print data may be generated in the normal way and simply transferred to a port of the device 205, in which case the print client may be provided on a separate end station 203, or may not be required.

The print data is typically in the form of a graphical representation of at least one document or other content being printed. This is typically in a form that can be interpreted by the printer to allow the representation to be printed, such as a postscript file. However, additionally, or alternatively, the print data can be in any suitable form, such as a PDF or text file.

The end station 203 may be used to perform a batch print of different documents, with this being used to generate a single print data file for transfer to the base station 201. In this case, the print data may contain print representations of one or more documents of different types, with each document including appropriate content corresponding to the document type. Similarly different content types may be provided within a single document within the print data. Thus, for example, if the document is an invoice, the content can include details for the entity being invoiced, details of the services performed or products provided, and a corresponding invoice amount. Similarly, if the document type is a prescription, the content may include details of the prescribing doctor, the patient and the drugs being prescribed.

At step 520 the end station 203 transfers the print data to the base station 201. The transfer of the print data will be performed in accordance with the validation information and transfer options supplied at step 505 above.

Thus, the print client may operate to append or prepend the print data with validation information and then transfer the data to the base station 201. This can include, for example, providing parameters allowing the nature or source of the print data to be subsequently determined, allowing for easier identification of layouts associated with the print data, as will be described in more detail below.
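One way a print client might prepend validation information and source parameters to the print data is sketched below in Python. The framing is purely an assumption for illustration: a 4-byte big-endian length, followed by a JSON header, followed by the unmodified print data. The specification does not define a wire format, and the function and key names are hypothetical.

```python
import json
import struct


def pack_print_job(print_data, validation, parameters):
    """Prepend a length-prefixed JSON header to the raw print data (bytes)."""
    header = json.dumps(
        {"validation": validation, "parameters": parameters},
        sort_keys=True,
    ).encode("utf-8")
    return struct.pack(">I", len(header)) + header + print_data


def unpack_print_job(message):
    """Split a framed message back into (header dict, raw print data)."""
    (header_len,) = struct.unpack(">I", message[:4])
    header = json.loads(message[4:4 + header_len].decode("utf-8"))
    return header, message[4 + header_len:]
```

Keeping the header separate from the payload means the base station can read the declared source and document parameters before it attempts to interpret, or decrypt, the print data itself.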

Additionally and/or alternatively the print client may operate to encrypt the print data. It will be appreciated that in this regard, the end station 203 may maintain security packages for different types of data transfer, which can be configured either by manual setup using known utilities or part of a custom install process similar to that outlined above in steps 500 and 505.

In the event in which the information is simply output from a port of the end station 203, it may be necessary to configure an alternative or additional processing system to route the print data to the base station 201, using for example an end station 203, as will be appreciated by persons skilled in the art.

Additionally, or alternatively, the print process may also involve printing to an additional printer. Thus, for example, this could allow a hard copy document to be created, whilst simultaneously forwarding the print data to the base station 201 for generation of the output file. This could be achieved in any one of a number of ways.

For example, the print client installed on the end station 203 can include a setting that allows the print data to be transferred as described above, whilst being simultaneously copied to another selected printer. Alternatively, the client can be adapted to forward any or selected instances of print data that are forwarded to a printer for printing. This could be used to allow a print stream sent to a printer to be hijacked directly or by output to a suitable LAN device, such as a network printer, network server, or the like.

At step 525 the base station 201 parses the received print data and attempts to determine the print data source at step 530. In particular this will typically involve utilising any provided validation information to identify the user and/or the end station 203 from which the print data is received.

At step 535 the base station 201 operates to validate the source of the print data. The validation may be performed in any one of a number of ways and may for example involve the successful decryption of encoded print data, comparison of validation information to predetermined validation information stored in the database 211 or the like.
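The comparison of supplied validation information against stored validation information might look like the following Python sketch. It assumes a username and password plus an end station identifier, with the password stored only as a salted PBKDF2 digest. The registry structure and function names are hypothetical, and an in-memory dictionary stands in for the database 211.

```python
import hashlib
import hmac
import os


def hash_secret(secret, salt):
    """Derive a digest from a password using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, 100_000)


def register(registry, username, password, station_id):
    """Store predetermined validation information for a user and end station."""
    salt = os.urandom(16)
    registry[username] = {
        "salt": salt,
        "digest": hash_secret(password, salt),
        "stations": {station_id},
    }


def validate_source(registry, username, password, station_id):
    """Return True only if both the user and the end station are registered."""
    record = registry.get(username)
    if record is None:
        return False
    candidate = hash_secret(password, record["salt"])
    # compare_digest avoids leaking information through comparison timing.
    password_ok = hmac.compare_digest(candidate, record["digest"])
    return password_ok and station_id in record["stations"]
```

A `False` result here would correspond to the case in which a validation alert is raised for operator review.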

At step 540 if the source is not validated the base station 201 generates a validation alert. This typically indicates that the print data has been received from a source, such as a user or end station 203 that is not registered with the system and accordingly no action is to be taken.

However, it may be that the validation process fails for some reason even though the print data is from a genuine source. Accordingly, when the validation alert is generated it is typical for an operator to review the print data, resend or re-queue it, or manage audit logs, queues and the like to determine if the process should proceed even though the validation has failed. It will be appreciated that if the process is to proceed then the operator can cause the process to move on to step 545.

In this instance, or in the event that the source is validated, at step 545 the base station 201 accesses transformation layouts stored in the database 211. At step 550 the base station 201 determines if one or more layouts are available for the respective print data.

In general, the layouts are specific not only to the source of the print data, but also depend on the nature of the print data, the types of documents within the print data and/or information within the print data itself. Thus, for example, an end station 203 may generate a number of different types of print data depending on the applications software utilised thereon, or dependent on the information being printed. Alternatively, as described above, the print data may be indicative of a number of different documents, which may in turn include a number of different document types, with a respective layout being associated with each document type.

To determine if one or more appropriate layouts are available, it is typically necessary for the base station 201 to determine the document type of each document in the print data, and then compare this to an indication of the document types associated with each of the types of layouts. To assist in identifying document types, it may be necessary to parse the print data to determine parameters relating to the documents or content, with the document types being determined using the parameters. This can include information based on the content of the documents or print data, as well as an indication of the application software utilised to generate the print data. This can then be compared to corresponding information provided by the end station 203, or with the print data.
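A minimal sketch of this determination is given below, assuming that each document type is recognised from the generating application and from keywords in the content. The rule table, the application names and the layout identifiers are hypothetical and are not taken from the specification.

```python
from typing import Optional

# Hypothetical rules: a document type matches when the generating application
# agrees and every listed keyword appears in the document text.
DOCUMENT_TYPE_RULES = {
    "prescription": {"application": "ClinicApp", "keywords": ["Rx", "Patient"]},
    "letter": {"application": "WordProcessor", "keywords": ["Dear", "Yours"]},
}

# Hypothetical mapping of document type to a stored layout identifier.
LAYOUTS = {"prescription": "layout-prescription-v1"}


def determine_document_type(text: str, application: str) -> Optional[str]:
    """Return the first document type whose parameters match, else None."""
    for doc_type, rule in DOCUMENT_TYPE_RULES.items():
        if rule["application"] != application:
            continue
        if all(keyword in text for keyword in rule["keywords"]):
            return doc_type
    return None


def find_layout(text: str, application: str) -> Optional[str]:
    """Return a layout identifier, or None if no layout is available."""
    doc_type = determine_document_type(text, application)
    return LAYOUTS.get(doc_type) if doc_type else None
```

In this sketch a result of None from `find_layout` corresponds to the case in which no layout is available and an operator is asked to define one.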

If a layout is not available for either the print data, or any of the documents therein, then at step 555 the base station 201 displays a representation of the print data to an operator. The representation will typically be in the form of a visual representation provided on the I/O device 302 and will have the appearance of the print data had it been provided on physical media. This allows the operator to define field locations within the print data at step 560. This would typically be achieved utilising a suitable GUI (Graphical User Interface) which allows the user to highlight regions of the representation utilising an appropriate input device such as a mouse, or to indicate field start and end locations. Once the user has selected a region of the print representation, the user defines a field associated with this region.

Thus, for example, if the print data is representative of a prescription, this will generally have a standard layout. Accordingly, the operator may select a first region of the document representing a prescribing doctor, a second region representing a patient and a third region representing a prescribed drug. The operator will then define fields that are to be associated with each of the regions. Thus, in this example, the operator will define that the first, second and third regions are to be associated with doctor, patient and drug fields respectively. An example GUI that can be used to perform the above described process is shown in Figure 8.

In this example the GUI 900 includes a print window 910, a document structure window 920, and a region window 930. The print window 910 is used to display the visual representation of the print data, whilst the document structure window 920 displays a visual representation of the structure of any regions and the region window 930 displays content of selected regions within the print data.

In use, when an operator is to define a layout, the operator will use an appropriate input means to select the relevant print data, causing this to be displayed in the print window 910. In the example of Figure 8, a letter is displayed including an address and letter text.

When the operator wishes to define a region, the operator uses an appropriate input mechanism such as a mouse or the like, to highlight a region within the document. A first example of this is shown at 912, in which the operator has highlighted the address so as to define an address region.

The operator then designates start and end points 911, 913 for the address region 912. The start and end points 911, 913 are used to define the address region 912, so that the address region 912 can subsequently be detected by parsing of the print data.

In one example, the start and end points 911, 913 are defined as particular locations on the document. Thus, each time a document having that layout is analysed, the region 912 is identified by its spatial position within the document. However, it will be appreciated that this is not always appropriate for all forms of document. Thus, for example, the position at which the address is printed may vary depending on other information within the document. Accordingly, in this example, the start and end points 911, 913 are identified by specific features within the address. Thus, as the address will always start with a "Mr" or other title, such as "Mrs", "Dr", this can be used to identify the start of the address region 912. Similarly, detecting the term "Australia" or other country or appropriate feature can be used to identify the end of the address region 912.

As the address region 912 is defined, the base station 201 will extract data contained in the start and end points 911, 913 and import these into respective header and footer portions 931, 933 of the region window 930. Thus, in this example, the header portion 931 contains the term "Mr", whilst the footer portion 933 contains the term "Australia". This allows the operator to review the terms and determine if any alternatives can be provided. Thus, as a letter address may start with a different title, the operator can add alternatives to the term "Mr" in the header portion 931. A similar process can be performed with the footer portion 933.
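The feature-based detection of a region using header and footer terms, including alternatives, can be sketched as follows. The function name `extract_region` and the `inclusive` flag, which controls whether the marker terms are kept as part of the returned region, are assumptions introduced for this example only.

```python
import re
from typing import Optional, Sequence


def extract_region(text: str, starts: Sequence[str], ends: Sequence[str],
                   inclusive: bool = True) -> Optional[str]:
    """Return the text bounded by the first start term and the next end term."""
    # Whole-word matching, so that "Mr" does not match inside "Mrs".
    start_pattern = re.compile("|".join(r"\b" + re.escape(s) + r"\b" for s in starts))
    end_pattern = re.compile("|".join(r"\b" + re.escape(e) + r"\b" for e in ends))
    start = start_pattern.search(text)
    if start is None:
        return None
    # Only look for the end term after the start term.
    end = end_pattern.search(text, start.end())
    if end is None:
        return None
    if inclusive:
        return text[start.start():end.end()]
    return text[start.end():end.start()].strip()
```

For a letter of the kind shown in the print window, `extract_region(letter, ["Mr", "Mrs", "Dr"], ["Australia"])` would return the address block, whilst passing `inclusive=False` with "Dear" and "Yours Sincerely" terms would return only the text between those terms.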

Any data within the region 912 is also extracted and displayed within a region portion 932 of the region window 930. This allows the operator to ensure that the region 912 has been appropriately defined. Once the region 912 is acceptable, an indication of this can be provided using suitable input means and other regions may then be defined.

Thus, in this example, a second region is defined at 915 corresponding to a letter text region. In this instance, start and end points 914, 916 are defined based on "Dear" and "Yours Sincerely" features.

It will be noted that in the example of the address region 912, the start and end points 911, 913 are outside the region 912, whereas in the case of the letter text region 915, the start and end points 914, 916 are within the region 915. It will be appreciated from this that the start and end points can be defined in any way, and in any positional relationship relative to the respective region, and that these examples are for the purpose of illustration only.

Details of the regions can also be shown in the document structure window 920, which in this example shows each of the regions 912, 915 as 922, 925 respectively. Any relationships between the regions can then be represented using a tree structure 921, and this can be used to indicate situations in which regions are related in some manner, such as if a region includes sub-regions.

It will be appreciated that this allows operators to define regions using a GUI in a straightforward and logical manner, allowing the defined regions to be reviewed using the region window 930 to ensure that the regions are defined correctly.

It will be appreciated that additionally, or alternatively, layouts may be determined using the end stations 203 in a similar manner. Thus, for example, in the event that no layout exists for a given document type, the base station 201 may generate an indication, which is supplied to the end station 203 indicating no layout exists. This can allow a user to determine a layout using the end station 203, for example, by having the end station 203 display a suitable GUI in a manner similar to that described above in steps 555 and 560, with the generated layout being returned to the base station 201.

At step 565 once a layout has been defined the base station 201 determines field locations within the print data and then it extracts content from the field locations at step 570. Thus, in the above prescription example, and for the first region, the base station 201 will operate to extract a doctor's name.

The extraction of the content may require interpretation of visual representation of alphanumeric characters, or the like. Thus, the extraction may require performing optical character recognition on alphanumeric characters. It will be appreciated however that any suitable content extraction technique may be used.

At step 575 the base station 201 imports the content into an output file. The output file is typically a structured output file such as an XML file which includes markers indicative of the respective fields. Thus, as would be appreciated by persons skilled in the art, the fields will be associated with elements in the XML file such that each element corresponds to a respective field. In the above prescription example, the XML file would therefore need to include doctor, patient and drug fields. It will be appreciated that this will be achieved using corresponding field identifying tags which can then be interpreted by any processing system receiving the output file.
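As an illustrative sketch only, the prescription example could be written to XML using one element per field. The element names `prescription`, `doctor`, `patient` and `drug`, and the sample values in the usage note, are assumptions; the specification does not fix a schema.

```python
import xml.etree.ElementTree as ET


def build_output(document_type: str, fields: dict) -> str:
    """Build a structured XML string with one child element per extracted field."""
    root = ET.Element(document_type)
    for name, value in fields.items():
        # Each field name becomes a tag that identifies the field to the recipient.
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")
```

Calling `build_output("prescription", {"doctor": "Dr A Jones", "patient": "J Smith", "drug": "Amoxicillin 500mg"})` yields a single `prescription` element containing `doctor`, `patient` and `drug` child elements, which a receiving processing system can read by tag name.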

In any event, this allows the content of the print data to be provided in an electronically interpretable file structure which can be used by other remote processing systems to interpret the print data. It may then be restructured and sent to another processing system, such as Australia Post's eLetter, in a format acceptable to Australia Post for bulk printing and mailing of multiple letters from multiple sources.

At step 580 the base station 201 determines if the output file is acceptable. This process may involve for example ensuring that all content in the print data has been associated with corresponding elements in the XML file, or that all the defined fields in the layout are populated as expected. This therefore ensures that no data has been overlooked.
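One possible form of this check, given purely as a sketch, is to confirm that every field defined in the layout appears in the output file with non-empty content. The function name `missing_fields` and the flat, single-level XML structure are assumptions for this example.

```python
import xml.etree.ElementTree as ET


def missing_fields(xml_text: str, required_fields: list) -> list:
    """Return the layout fields that are absent or empty in the output file."""
    root = ET.fromstring(xml_text)
    missing = []
    for name in required_fields:
        element = root.find(name)
        # A field counts as unpopulated if its element is missing or blank.
        if element is None or not (element.text or "").strip():
            missing.append(name)
    return missing
```

An empty list indicates that the output file is acceptable under this check, whereas a non-empty list would lead to the error alert of step 585.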

If it is determined that the output file is unacceptable, the base station 201 generates an error alert at step 585. This allows an operator to review the generated output file and the print data to determine if any modifications are required to the output file before it is utilised.

Otherwise at step 590 the base station 201 can provide the output file, for example, to another processing system to allow the content to be imported into a database, or otherwise utilised. The output file may also be provided together with the original print data, allowing the print data to be reviewed by the recipient of the data. This provides an additional checking process, allowing the accuracy of the conversion to the output file to be confirmed at a later date.

An example of the functionality of the system will now be described in more detail with respect to Figures 6 and 7.

In particular, Figure 6 shows the relationship between a client 600, a print server 610 and a transformation layout 620. In particular, the client 600, which is typically implemented by one of the end stations 203, provides the print data to the generic print server 610 either via a web service, a TCP printer, an email or the like as shown generally at 605.

The generic print server, implemented by the base station 201 in the above example, will validate the received print data and assuming this is successful, provide the print data to the layout 620, at 615, to attempt to translate the print data into a structured output file. As shown at 625 an indication of the status of this procedure is provided to the print server which may also generate an alert and transfer this back to the client 600 as shown at 630.

Functionality of the generic print server 610 will now be described in more detail with reference to Figure 7.

In this example the print data is received at step 700 and provided to a validate stream module 705, which attempts to validate the source of the print data. In the event that the source of the print data cannot be validated then the validate stream function generates a validation alert 710 which may be provided to an alert administrator at 715 allowing a status indication to be generated at 720. At 725 the invalidated print data is forwarded to a discard stream 730.

In the event that the validate stream function has validated the print data, it then determines if the data is qualified and has an associated layout, or is unqualified and has no associated layout.

In the event that the data is unqualified, then at 735 the data is forwarded to an unqualified stream 740 allowing a layout to be generated. Otherwise the data is forwarded to a queued stream 750 at 745 allowing the document to be provided to a layout module 755, which operates to compare the print data to a corresponding layout. To achieve this, the document is retrieved from the queued stream 750 at 760 with the validate stream module providing instructions to the layout module 755 at 765.

The data transform layout function can then operate to apply the layout and generate a result at 770 which is transferred to a check status module 775, allowing an indication of the status to be transferred to the validate stream function at 780. In the event that the status is correct an indication of this is transferred at 785 to a processed queue 790. Otherwise an indication of failure is transferred at 795 to an error queue 800.

In either case an indication of the status is provided by the validate stream module 705 at 805, allowing a log audit entry to be recorded by a log audit entry module 810, allowing a log entry 815 to be provided to an audit log 820.

The provision of an audit log allows tracing of creation of the output files, so that in the event that a mistake is subsequently found, this can be traced. Additionally, this allows transfer of information to be tracked. This can be particularly important in the medical environment, when it can be important to ensure that information can be correctly transferred.

In any event, it will be appreciated that this can be used to implement the processes described above.

Features utilised within the specific examples of Figures 6 and 7 will now be described in more detail.

Print Client 600

In one example, the print client intercepts an EMF data stream from any client application and allows for API callbacks to determine the required output format, render the output into the required format, and then generate a callback to send the output via an email or web service.

Installation

The print client function can operate as if it was an installed printer and does not require any special actions on the part of the user other than selecting the Print client as the desired printer.

The installation and deployment of the software allows for download and installation from a central repository (possibly a web-site). The process of installation can be a single operation with no user input as all activity performed by the software is to be configurable after installation of the printer.

The printer installed will have defaults inherited from a licence key and will direct the interactive user to the properties window if any required field is not available on a print request. An example of this behaviour is the initial use of the Fax Printer object.

The software typically requires a licence key to enable it to operate. This key can be supplied as a separate file as a localised INI configuration file holding client/site specific configuration settings and be refreshed on each print request or change to the configuration.

Configuration

The software typically allows for configuration of the parameters set out below by utilising the properties page of the installed printer.

Once the method of delivery (presumably by tabsheet or section design), as Email or Web Service, is chosen the appropriate parameters will be available for editing.

General settings are typically populated from license key provided on file system, including:

• Customer Name (Display only)

• Order Number (Display only)

• Expiration Date (Display only)

• Authorisation Key (Display only)

• File format (Display only) PDF/RTF

• Temporary File Path (default to Operating System default) - editable via File system navigator

Email settings typically include:

• From Email Address (required)

• To Email Address (default from license key)

• Cc email Address

• SMTP/MAPI Server settings (required)

Web Service settings

• URL path for Web Service (default from license key)
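A minimal sketch of reading such settings from an INI style licence file is given below. The section names, option names and sample values are hypothetical, as the specification does not define the file layout.

```python
import configparser

# Hypothetical licence file contents; all names and values are illustrative.
SAMPLE_LICENCE = """
[general]
customer_name = Example Clinic
order_number = 1001
expiration_date = 2030-01-01
authorisation_key = EXAMPLE-KEY
file_format = PDF

[email]
to_address = intake@example.com

[web_service]
url = https://example.com/print
"""


def load_settings(ini_text: str) -> dict:
    """Parse licence settings, rejecting unsupported file formats."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    file_format = parser.get("general", "file_format")
    if file_format not in ("PDF", "RTF"):
        raise ValueError(f"unsupported file format: {file_format}")
    return {
        "customer_name": parser.get("general", "customer_name"),
        "authorisation_key": parser.get("general", "authorisation_key"),
        "file_format": file_format,
        # Delivery settings are optional and default to None when absent.
        "to_address": parser.get("email", "to_address", fallback=None),
        "url": parser.get("web_service", "url", fallback=None),
    }
```

In this sketch the display-only values come from the licence file, whilst user-editable values such as the From Email Address would be held separately.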

Operation

Selection of the output file print option to perform a print operation from the calling application determines the appropriate file output format specified by the configuration details provided and creates a temporary file in the specified temp directory path. All resource locks on the file are then released and the chosen method of delivery is executed on the single file, as an attachment if by Email, or as a binary stream reference to the enabled Web Service.

All success or failure events can be written to the operating system's application event log and the temporary file deleted on completion.

Interfaces

A number of different user interfaces may be displayed by the end station 203 to allow installation and operation of the client printer application. These include:

• Installation - the product can allow for a web based point and click download and installation using standard browser and OS facilities

• Configuration - configuration interfaces can conform to standard OS print driver interface standards

• Use - print dialogs can conform to standard OS print dialogs

• Document formats - documents can be produced in unlocked PDF or in RTF formats

• Email message - the email message can contain the Authorisation key from the license and the document as an attachment. The email can be in a format suitable for a standard SMTP service.

• Web service - The web service will accept two arguments. One is the Authorisation key from the license, the second being the binary file stream.
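The email interface above can be sketched using the Python standard library as follows. Placing the Authorisation key in the message body, the subject line and the function name `build_email` are assumptions for this example; the specification states only that the message contains the key and the document as an attachment.

```python
from email.message import EmailMessage


def build_email(sender: str, recipient: str, authorisation_key: str,
                document: bytes, filename: str, subtype: str = "pdf") -> EmailMessage:
    """Build a message carrying the key in the body and the document attached."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "Print data"  # illustrative subject only
    message.set_content(f"Authorisation-Key: {authorisation_key}")
    # subtype would be "pdf" or "rtf" depending on the configured file format.
    message.add_attachment(document, maintype="application",
                           subtype=subtype, filename=filename)
    return message
```

The resulting message could then be handed to a standard SMTP service, for example via `smtplib.SMTP.send_message`.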

Print Server 610

The print server component will typically accept formatted documents from email, or as web services, and, subject to validation of the document source and type, extract content from the document and store the extracted content in a format to be specified.

Installation

The installation and deployment of the server side software is typically automated as much as possible. Where multiple applications are to be deployed, all software should be provided as a complete integrated set with associated documentation for installation.

Database compatibility

Functionality requiring data storage typically utilises a robust multi-user database via SQL compliant connectivity, thereby allowing database independence. If the database is not supplied as part of the solution, database schemas identifying any table structures used are typically made available in an appropriate format.

All data objects rendered as part of the solution can be created as XML objects and are therefore cross-database and application compliant, although other suitable formats may be used.

Data Transformation

Data transformation of received binary or other similar files can be accomplished by use of multiple processes. These processes are typically multi-thread capable and handle asynchronous communication with no concurrency issues.

• The first process(es) is typically to accept the binary stream (this may also be a file to be dumped into a specific location or other data format suitable for the appropriate use as required) as an incoming web service request, an incoming email or a TCP connection from the client software. The subsequent stream and client authentication value will be stored as operating system files and referenced in an XML object. A system service is generally used to poll and receive the incoming data.

• The second process is usually to poll the incoming XML objects, examine and authenticate the customer credentials. If the customer credentials fail authentication, the object is marked as Discarded and moved to the appropriate queue. An unattended alert is raised in this case and optionally will contact an appropriate administrative email address.

• If the customer credentials authenticate, the document content can be examined and compared to the customer's authorisation list of Data Transformation Layouts (DTL). If an acceptable match is not located, the object is marked as Unqualified and moved to the appropriate queue. An unattended alert is raised in this case and optionally will contact an appropriate administrative email address.

• If a matching DTL is located, the object is marked as Queued and moved to the appropriate queue. The incoming stream may be converted for this purpose depending on client settings for document examination, however the original will always be referenced and available with an optional converted document reference. Document conversion may be required as part of the tool supplied to interrogate the matching criteria to a DTL.

• The third process is to poll the Queued XML objects and apply the necessary DTL to the incoming stream. The rendering of the DTL against the stream provides a further XML object to be made available and recorded as the resultant output. This output may be an embedded XML stream or an external reference to one. The XML object will be marked as Processed and moved to the appropriate queue.

• Any errors in the rendering can be recorded on the object with a status of Error and moved to the appropriate queue. An unattended alert is raised in this case and optionally will contact an appropriate administrative email address.

• The resultant XML output is typically available to third party processing once the object is marked as Processed.
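The status transitions described in the above processes can be summarised in the following sketch. The status names are taken from the description, whereas the function names, the boolean inputs and the `apply_dtl` callable are assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    DISCARDED = "Discarded"
    UNQUALIFIED = "Unqualified"
    QUEUED = "Queued"
    PROCESSED = "Processed"
    ERROR = "Error"


def classify(credentials_ok: bool, has_matching_dtl: bool) -> Status:
    """Second process: authenticate the customer, then look for a matching DTL."""
    if not credentials_ok:
        return Status.DISCARDED
    if not has_matching_dtl:
        return Status.UNQUALIFIED
    return Status.QUEUED


def process_queued(stream: str, apply_dtl) -> tuple:
    """Third process: apply the DTL and record either the output or the error."""
    try:
        return Status.PROCESSED, apply_dtl(stream)
    except Exception as exc:
        # Rendering errors are recorded on the object with a status of Error.
        return Status.ERROR, str(exc)
```

Each returned status identifies the queue to which the XML object would be moved, with Discarded, Unqualified and Error also being the cases in which an unattended alert is raised.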

Data Transformation Layouts

In general a facility to create DTLs for a sample file stream is provided. Multiple stream content types may be supported, although generally at least PDF and TXT file types are supported.

Generally this process is performed via an administrative interface that is invoked from the Unqualified files received queue. These files are typically searchable by customer details and audit information to filter the queue. On selection of a file for DTL creation, the tool will launch and allow the administrator to create the necessary mappings of the selected file for this DTL.

Mouse and/or menu-driven capability to define sets of information including data labels, data boundaries (start and end data) and custom translation of data will be required. Properties may be specified on sets of information identifying them as a Document Set (which identifies this document to the layout as a unique identifier), Record Sets with the ability to nest and associate related Record Sets, and Field Sets which will identify individual data elements within a Record Set.

Capability for repeating record sets can be catered for. An example of this would be a multi-page invoice where each invoice line must be associated to a single invoice number that spans multiple pages and the invoice number is repeated in the header of every page. Furthermore if the file was a batch print of several invoices, the capability to either split the file by invoice as a pre-process step or create repeating invoices in the resultant XML output may also be provided.
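The multi-page invoice case can be sketched as below, in which lines from every page are grouped under the invoice number repeated in each page header. The header format "Invoice No: <number>" and the function name are hypothetical and chosen only for this example.

```python
import re

# Hypothetical page header format; a real layout would define its own pattern.
HEADER = re.compile(r"^Invoice No:\s*(\S+)$")


def group_invoice_lines(pages: list) -> dict:
    """Associate each invoice line with the invoice number in its page header."""
    invoices = {}
    for page in pages:
        current = None
        for raw_line in page.splitlines():
            line = raw_line.strip()
            match = HEADER.match(line)
            if match:
                # A repeated header continues the same invoice on a new page.
                current = match.group(1)
                invoices.setdefault(current, [])
            elif current and line:
                invoices[current].append(line)
    return invoices
```

Because the result is keyed by invoice number, a batch print of several invoices can be either split by invoice or emitted as repeating invoice elements in the resultant XML output.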

The data transformation layouts may be linked to a customer providing a one DTL to many customer relationship or alternatively the ability to copy existing DTLs and apply them to individual customers.

Queue Management

The queue management services and/or processes are typically automated with built in recovery processes on restart in the event of system failure.

Communication Processes

System services are typically required to accept binary streams as an incoming web service request, an incoming email or a TCP connection from the client software. The subsequent stream and client authentication value is typically stored as operating system file references and referenced in an XML object.

These system services can be capable of start-up on system boot and include self recovery capabilities and diagnostic or trouble shooting events in the case of error.

Customer Management

The administrator interface generally offers browse, filter, edit and new entry capabilities for allowing customers to interact with the print application.

Each customer record generally contains the necessary fields to allow for the recording of company details and contact information. Additionally, for license key generation, the record generally contains the type of file format to be supplied from this client, order number, expiry date, To Email address, and web service URL path.

The above details are used to generate a license key including an authorisation key (token) for delivery to the client computer. The authorisation key (token) will be transmitted with any email or web service request and used as the means of identifying the incoming file, and must therefore be unique.
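One way of generating a unique authorisation key, offered as a sketch rather than as the method used by the system, is to draw a random token and check it against the keys already issued. The function name and the use of an in-memory set in place of the customer database are assumptions.

```python
import secrets


def issue_authorisation_key(existing_keys: set) -> str:
    """Generate a random token that is not already in use and record it."""
    while True:
        key = secrets.token_urlsafe(32)
        # A collision is extremely unlikely, but uniqueness is checked explicitly.
        if key not in existing_keys:
            existing_keys.add(key)
            return key
```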

Auditing

The entire process flow is generally audited at each and every step to allow subsequent review. Auditing of creation, modification and deletion of DTLs to customer profiles is also typically provided.

Modifications through the customer management module or any XML object should also be audited by fields such as CreatedDate, CreatedTime, CreatedBy, ModifiedDate, ModifiedTime, ModifiedBy details.
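The audit fields named above could be carried on each record as in the following sketch. The field names are taken from the description, whereas the class name, the helper functions and the ISO date and time formats are assumptions.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class AuditStamp:
    CreatedDate: str
    CreatedTime: str
    CreatedBy: str
    ModifiedDate: Optional[str] = None
    ModifiedTime: Optional[str] = None
    ModifiedBy: Optional[str] = None


def stamp_created(user: str, when: datetime) -> AuditStamp:
    """Record who created an object and when."""
    return AuditStamp(when.date().isoformat(),
                      when.time().isoformat(timespec="seconds"), user)


def stamp_modified(stamp: AuditStamp, user: str, when: datetime) -> AuditStamp:
    """Return a copy with the modification details set; creation details are kept."""
    return replace(stamp,
                   ModifiedDate=when.date().isoformat(),
                   ModifiedTime=when.time().isoformat(timespec="seconds"),
                   ModifiedBy=user)
```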

The entire audit log is generally capable of exposure through an administrative interface tool, and the configuration of any optional alerts should be catered for within the same tool.

Security

Standard security protocols can be used allowing tools to be compatible with relevant security models for the operating system.

The interface is generally capable of multi-user operation and therefore offers standardised integrated authentication and authorisation limits for each menu driven function in the interface.

Communication processes should also be security compliant and secure protocol compliant, e.g. https, pgp, etc.

Where direct text print data is received from the TCP service, no authorisation key will be transmitted as the security control can be implemented via standard firewall techniques allowing only specific TCP client connections if required.

Persons skilled in the art will appreciate that numerous variations and modifications will become apparent. All such variations and modifications which become apparent to persons skilled in the art should be considered to fall within the spirit and scope of the invention as broadly hereinbefore described.