A COMPUTER IMPLEMENTED SYSTEM AND METHOD OF PROVIDING AN AUTOMATED COMPARATIVE INSURANCE QUOTE TO A USER

Document Type and Number:

WIPO Patent Application WO/2021/229496

Abstract:

A computer implemented system for providing a comparative insurance quote to a user includes a memory for storing data. A communications module receives an insurance document in the form of an electronic document or digital image containing information relating to an individual and information relating to an insurance policy and stores this in the memory. A processor is programmed to access the memory and retrieve the stored electronic document or digital image. The processor executes optical character recognition on the electronic document or digital image to extract text from the electronic document or digital image and analyses the extracted text to identify relevant fields related to the individual and relevant fields related to insurance information. The processor maps the insurance information fields to comparative insurance fields of the insurer and uses the mapped fields to generate an insurance quote which is then transmitted to the user via the communications module.

Inventors:

FRIEDLANDER GARETH (ZA)
KALLNER HYLTON (ZA)
SADOWSKI ROMUALD STANISLAW (ZA)
FALCONER STEVEN JOHN (ZA)

Application Number:

PCT/IB2021/054107

Publication Date:

November 18, 2021

Filing Date:

May 13, 2021

Assignee:

DISCOVERY LTD (ZA)

International Classes:

G06Q40/08; G06Q30/06

Foreign References:

US20200097714A1	2020-03-26
US20100202698A1	2010-08-12

Other References:

GRINGEL FABIAN: "Comparison of OCR tools: how to choose the best tool for your project", 20 January 2020 (2020-01-20), pages 1 - 13, XP055822840, Retrieved from the Internet [retrieved on 20210709]
PEIRSON ERICK: "Tutorial: Text Extraction and OCR with Tesseract and ImageMagick", 9 December 2015 (2015-12-09), XP055822920, Retrieved from the Internet [retrieved on 20210709]
ANONYMOUS: "Batch OCR for many PDF files (not already OCRed)?", 20 January 2015 (2015-01-20), pages 1 - 3, XP055822919, Retrieved from the Internet [retrieved on 20210709]

Attorney, Agent or Firm:

SPOOR & FISHER et al. (ZA)

Claims:

CLAIMS

1. A computer implemented system for providing an automated comparative insurance quote from an insurer to a user, the system including: a memory for storing data; a communications module for receiving an insurance document in the form of an electronic document or digital image containing information relating to an individual and information relating to an insurance policy or quote that the individual has received and to store the received electronic document or digital image in the memory; a processor operably coupled to the memory and communications module, the processor programmed to: access the memory and retrieve the stored electronic document or digital image; execute optical character recognition on the electronic document or digital image to extract text from the electronic document or digital image; analyse the extracted text to identify relevant fields related to the individual and relevant fields related to insurance information; map the insurance information fields to comparative insurance fields of the insurer; use the mapped fields to generate an insurance quote; and transmit the generated insurance quote to the user via the communications module.

2. A computer implemented system according to claim 1 wherein if the insurance document received is in the form of an electronic document the processor first converts the document to a digital image and then executes optical character recognition on the digital image.

3. A computer implemented system according to claim 2 wherein the digital image is submitted in batches for optical character recognition.

4. A computer implemented system according to any preceding claim wherein the executing of the optical character recognition by the processor includes: transmitting the electronic document or digital image via the communications module to an external optical character recognition engine; receiving back from the external optical character recognition engine data including optical character recognition results; and further processing the optical character recognition results.

5. A computer implemented system according to any preceding claim wherein the executing of the optical character recognition by the processor includes using a fuzzy matching library to improve the optical character recognition results.

6. A computer implemented system according to any preceding claim wherein the executing of the optical character recognition by the processor includes using a document detection framework to identify which insurer the received insurance document is from.

7. A computer implemented system according to any preceding claim wherein the processor translates words recognised from the insurance document into words used by the insurer before mapping the insurance fields to comparative insurance fields of the insurer.

8. A computer implemented system according to any preceding claim wherein the processor maps competitive values and benefits from the optical character recognised document to competitive values and benefits of the insurer.

9. A method of providing an automated comparative insurance quote from an insurer to a user, the method including: receiving, via a communications module, an insurance document in the form of an electronic document or digital image containing information relating to an individual and information relating to an insurance policy or quote that the individual has received; storing the received electronic document or digital image in a memory; accessing the memory and retrieving the stored electronic document or digital image; executing optical character recognition on the electronic document or digital image to extract text from the electronic document or digital image; analysing the extracted text to identify relevant fields related to the individual and relevant fields related to insurance information; mapping the insurance information fields to comparative insurance fields of the insurer; using the mapped fields to generate an insurance quote; and transmitting the generated insurance quote to the user via the communications module.

10. A method according to claim 9 wherein if the insurance document received is in the form of an electronic document it is first converted to a digital image and then optical character recognition is executed on the digital image.

11. A method according to claim 10 wherein the digital image is submitted in batches for optical character recognition.

12. A method according to any of claims 9 to 11 wherein the executing of the optical character recognition includes: transmitting the electronic document or digital image via the communications module to an external optical character recognition engine; receiving back from the external optical character recognition engine data including optical character recognition results; and further processing the optical character recognition results.

13. A method according to any of claims 9 to 12 wherein the executing of the optical character recognition includes using a fuzzy matching library to improve the optical character recognition results.

14. A method according to any of claims 9 to 13 wherein the executing of the optical character recognition includes using a document detection framework to identify which insurer the received insurance document is from.

15. A method according to any of claims 9 to 14 wherein words recognised from the insurance document are translated into words used by the insurer before mapping the insurance fields to comparative insurance fields of the insurer.

16. A method according to any of claims 9 to 15 wherein competitive values and benefits from the optical character recognised document are mapped to competitive values and benefits of the insurer.

Description:

A COMPUTER IMPLEMENTED SYSTEM AND METHOD OF PROVIDING AN AUTOMATED COMPARATIVE INSURANCE QUOTE TO A USER

BACKGROUND OF THE INVENTION

The present application relates to a system and a method of providing a comparative insurance quote to a user.

SUMMARY OF THE INVENTION

According to one example there is provided a computer implemented system for providing a comparative insurance quote to a user, the system including: a memory for storing data; a communications module for receiving an insurance document in the form of an electronic document or digital image containing information relating to an individual and information relating to an insurance policy or quote that the individual has received and to store the received electronic document or digital image in the memory; a processor operably coupled to the memory and communications module, the processor programmed to: access the memory and retrieve the stored electronic document or digital image; execute optical character recognition on the electronic document or digital image to extract text from the electronic document or digital image; analyse the extracted text to identify relevant fields related to the individual and relevant fields related to insurance information; map the insurance information fields to comparative insurance fields of the insurer; use the mapped fields to generate an insurance quote; and transmit the generated insurance quote to the user via the communications module.

If the insurance document received is in the form of an electronic document, the processor first converts the document to a digital image and then executes optical character recognition on the digital image.

The digital image may be submitted in batches for optical character recognition.

In one example, the executing of the optical character recognition by the processor includes: transmitting the electronic document or digital image via the communications module to an external optical character recognition engine; receiving back from the external optical character recognition engine data including optical character recognition results; and further processing the optical character recognition results.

Executing of the optical character recognition by the processor may also include using a fuzzy matching library to improve the optical character recognition results. Executing of the optical character recognition by the processor may also include using a document detection framework to identify which insurer the received insurance document is from.

In one example, the processor translates words recognised from the insurance document into words used by the insurer before mapping the insurance fields to comparative insurance fields of the insurer.

The processor typically maps competitive values and benefits from the optical character recognised document to competitive values and benefits of the insurer.

According to another example there is provided a method of providing an automated comparative insurance quote from an insurer to a user, the method including: receiving, via a communications module, an insurance document in the form of an electronic document or digital image containing information relating to an individual and information relating to an insurance policy or quote that the individual has received; storing the received electronic document or digital image in a memory; accessing the memory and retrieving the stored electronic document or digital image; executing optical character recognition on the electronic document or digital image to extract text from the electronic document or digital image; analysing the extracted text to identify relevant fields related to the individual and relevant fields related to insurance information; mapping the insurance information fields to comparative insurance fields of the insurer; using the mapped fields to generate an insurance quote; and transmitting the generated insurance quote to the user via the communications module.

If the insurance document received is in the form of an electronic document, it may be first converted to a digital image and then optical character recognition is executed on the digital image.

The digital image may be submitted in batches for optical character recognition.

In one example, the executing of the optical character recognition includes: transmitting the electronic document or digital image via the communications module to an external optical character recognition engine; receiving back from the external optical character recognition engine data including optical character recognition results; and further processing the optical character recognition results.

The executing of the optical character recognition may also include using a fuzzy matching library to improve the optical character recognition results.

The executing of the optical character recognition includes using a document detection framework to identify which insurer the received insurance document is from. In one example, words recognised from the insurance document are translated into words used by the insurer before mapping the insurance fields to comparative insurance fields of the insurer.

Competitive values and benefits from the optical character recognised document may be mapped to competitive values and benefits of the insurer.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a block diagram illustrating an example server to implement the methodologies described herein;

Figure 2 shows an example system of the present invention; and

Figure 3 shows an example mobile computing device for use with the present invention.

DESCRIPTION OF EMBODIMENTS

The systems and methodology described herein relate to providing a comparative insurance quote to a user.

A computer implemented system for providing an automated comparative insurance quote from an insurer to a user typically takes the form of a server 10, typically operated by an insurer.

The server 10 includes a memory 12 for storing data.

A communications module 14 is used for receiving an insurance document in the form of an electronic document or digital image containing information relating to an individual and information relating to an insurance policy or quote that the individual has received and to store the received electronic document or digital image in the memory 12. The electronic document may be a PDF document or may be any plain text, Word documents, spreadsheets or e-mail to name a few examples.

A processor 16 is operably coupled to the memory 12 and communications module 14.

The server 10 also includes a display 18 for displaying information to a user and a user interface 20 through which a user can input information or instructions into the server 10.

In use, the processor 16 controls the server 10 to access the memory 12 and retrieve the stored electronic document or digital image and to then execute optical character recognition on the electronic document or digital image to extract text from the electronic document or digital image.

The processor 16 will analyse the extracted text to identify relevant fields related to the individual and relevant fields related to insurance information and map the insurance information fields to comparative insurance fields of the insurer.

The processor 16 uses the mapped fields to generate an insurance quote and transmits the generated insurance quote to the user via the communications module.

The above will be described in more detail below.

Referring to Figure 2, a user can access the server for a comparative insurance quote using any one of a number of example devices.

For illustrative purposes, Figure 2 shows a tablet mobile computing device 22, a mobile telephone 24 and a computer 26.

All of these are able to communicate with the server 10 via a communications network 28.

One example embodiment will now be described with reference to Figure 3, which shows a mobile telephone 24.

The mobile telephone 24 includes a communication module 30 which is used for transmitting and receiving data to and from the mobile telephone 24.

The communication module 30 of a mobile telephone typically includes short-range communication ability and long-range communication ability.

The short-range communication ability could be Bluetooth, for example, by means of which the mobile telephone 24 can communicate with other devices in relatively near proximity to the mobile telephone 24.

The long-range communication ability could be, for example, to transmit data over a communications network.

The communication network may be a Public Switched Telephone Network (PSTN), a Private Network, a Virtual Private Network (VPN), an Intelligent Network (IN) or Converged Network for example, or may be a combination of one or more of these network types.

The mobile telephone 24 may also include a display 32 by means of which information can be displayed to a user of the device and a user interface 34 by means of which a user can input data and instructions into the device.

An example of a user interface 34 is a touchscreen which is often included in modern smartphones together with one or more buttons and/or switches also typically included in modern smartphones.

The mobile telephone 24 includes a camera 36 which is used for capturing images. For purposes of the present invention the image captured will relate to an insurance quote. The mobile telephone 24 also includes an on-board memory 38.

A processor 40 is connected to the memory 38, communication module 30, display 32, user interface 34 and camera 36.

The processor 40 has software executing thereon and the processor 40 controls the operation of the mobile telephone 24.

In use, the user will either use the camera 36 of the mobile telephone 24 to take a picture of an insurance quote and transmit this via the communications network to the server or use the camera 36 and appropriate scanning software to scan a document into an electronic format for transmission to the server 10.

Alternatively, or in addition, the user will already have an insurance quote in an electronic format and will forward this to the server 10, for example by forwarding an email with the electronic document attached.

Alternatively, or in addition, a user will upload the electronic insurance quote to the server 10 via a file sharing application, for example Dropbox or Google Drive.

On receipt at the server 10, the server will commence executing the processes described above.

Regarding executing optical character recognition on the electronic document or digital image to extract text from the electronic document or digital image, a number of existing services can be used such as Amazon, Azure and Google OCR engines.

The processor 16 submits batches of images, which are either photographic images or scans, for OCR. The processor 16 uses dense text recognition for document text, but free text is also supported for sparse areas of text within a larger image.

The pages submitted for OCR need to be images, for example in JPEG, PNG or TIFF format.

Thus, if the quote received is in PDF format, the processor 16 converts the PDF pages into images and then submits the images for OCR in batches.

An effect of this approach is that the conversion of documents is synchronous and real-time (executed immediately when requested), not asynchronous and offline (submitting a request and polling for completion).

This is achieved by submitting the images to be OCRed directly with the request. This also avoids unnecessary transmission of any content to and from cloud or other storage.
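The batching step can be sketched as follows. This is a minimal illustration only: it assumes the page images are already held as bytes and that the OCR engine accepts a limited number of images per request. The names `batch_pages` and `max_per_request` are hypothetical and are not taken from any particular OCR service.

```python
from typing import Iterator, List, Sequence


def batch_pages(pages: Sequence[bytes], max_per_request: int) -> Iterator[List[bytes]]:
    """Yield consecutive groups of page images, each small enough for one OCR request."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be at least 1")
    for start in range(0, len(pages), max_per_request):
        yield list(pages[start:start + max_per_request])


# Five placeholder page images split into requests of at most two pages each.
pages = [b"page-1", b"page-2", b"page-3", b"page-4", b"page-5"]
batches = list(batch_pages(pages, max_per_request=2))
```

Because each batch carries the image bytes inline, the caller can wait for the OCR response within the same request rather than uploading to storage and polling.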

To produce better extraction results, the processor 16 interprets the document content as either paragraphs when in columns, or as lines of text across the page.

Both capabilities are needed to deal with the various document structures.
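The "lines of text across the page" interpretation can be sketched as grouping recognised words by vertical position and reading each group from left to right. The `Word` structure, the coordinate convention and the `y_tolerance` value are assumptions made for illustration; real OCR engines return their own bounding-box formats. A column-aware interpretation would additionally split the words by horizontal position first, which is not shown here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    x: float  # left edge of the bounding box (assumed units: pixels)
    y: float  # top edge of the bounding box (assumed units: pixels)


def words_to_lines(words: List[Word], y_tolerance: float = 5.0) -> List[str]:
    """Group words with similar vertical positions, then read each group left to right."""
    rows: List[List[Word]] = []
    for word in sorted(words, key=lambda w: w.y):
        # Compare against the first (topmost) word of the current row.
        if rows and abs(word.y - rows[-1][0].y) <= y_tolerance:
            rows[-1].append(word)
        else:
            rows.append([word])
    return [" ".join(w.text for w in sorted(row, key=lambda w: w.x)) for row in rows]


# Words deliberately supplied out of reading order.
words = [
    Word("Premium", 10, 100), Word("R450.00", 200, 102),
    Word("Vehicle", 10, 50), Word("i20", 260, 51), Word("Hyundai", 200, 49),
]
lines = words_to_lines(words)
```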

In one example embodiment the images are transmitted via the communications module 14 to the OCR engine and the results are received back from the OCR engine via the communications module 14 in JSON format, for example.

The received OCR results are stored in the memory 12 and accessed for further processing by the processor 16.

The processor 16 retrieves the received OCR results from the memory 12 and reorders them when the OCR engine returns text out of order. To improve the reliability of OCR, the processor 16 is able to deal with characters that look like others but would not ordinarily match. For example, Cyrillic characters which look like Latin characters can be correctly matched by the processor 16 when the OCR engine gets them wrong.
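One way to handle look-alike characters is to normalise them before any matching takes place. The sketch below assumes a small, hand-picked table of Cyrillic letters that resemble Latin letters; the table and the function name `normalise_lookalikes` are illustrative and not taken from the described system.

```python
# Cyrillic letters (given as Unicode escapes) that are visually close to Latin letters.
CYRILLIC_TO_LATIN = str.maketrans({
    "\u0410": "A", "\u0412": "B", "\u0415": "E", "\u041a": "K",
    "\u041c": "M", "\u041d": "H", "\u041e": "O", "\u0420": "P",
    "\u0421": "C", "\u0422": "T", "\u0425": "X",
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p",
    "\u0441": "c", "\u0443": "y", "\u0445": "x",
})


def normalise_lookalikes(text: str) -> str:
    """Replace Cyrillic look-alikes with their Latin counterparts."""
    return text.translate(CYRILLIC_TO_LATIN)


# The first letter below is Cyrillic "Es" (U+0421), not Latin "C".
ocr_text = "\u0421OMPREHENSIVE \u0421OVER"
cleaned = normalise_lookalikes(ocr_text)
```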

The processor 16 further uses a fuzzy matching library to improve the OCR results. For example, “Hunday 120” is actually “Hyundai i20”, and fuzzy matching helps with this type of correction.

In the prototype this was implemented using a custom implementation of a Levenshtein distance calculator for fuzzy string matching.

Instead of edit distance between two strings (larger number of edits = more dissimilar, 0 edits = exactly alike), the processor 16 uses a confidence level to measure the similarity of two strings (0% = nothing alike, 100% = exactly alike), to match the semantics used by OCR engines.

The processor 16 adapts the Levenshtein distance calculation into a percentage confidence calculation, which is also increasingly strict when matching longer text values.

The processor 16 uses the fuzzy match calculator as the foundation on which to build a library of text comparison and manipulation functions (such as “matches”, “contains” and “startsWith”).
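The confidence calculation and the functions built on it can be sketched as follows. The Levenshtein routine is the standard dynamic-programming form. The threshold policy in `required_confidence`, which asks for a higher confidence as the expected text grows longer, and the word-window approach in `contains` and `starts_with` are assumptions chosen for illustration; the source does not state how the prototype tightens matching for longer values.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def confidence(a: str, b: str) -> float:
    """Similarity as a percentage: 100.0 = exactly alike, 0.0 = nothing alike."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)


def required_confidence(length: int) -> float:
    """Assumed policy: longer expected text must match with higher confidence."""
    return min(90.0, 60.0 + length)


def matches(text: str, expected: str) -> bool:
    return confidence(text, expected) >= required_confidence(len(expected))


def starts_with(text: str, expected: str) -> bool:
    """Fuzzy-compare the leading words of text against expected."""
    size = len(expected.split())
    return matches(" ".join(text.split()[:size]), expected)


def contains(text: str, expected: str) -> bool:
    """Slide a window of as many words as expected has across text."""
    words, size = text.split(), len(expected.split())
    if size == 0:
        return True
    return any(matches(" ".join(words[i:i + size]), expected)
               for i in range(len(words) - size + 1))


found = contains("Vehicle: Hunday 120 hatchback", "Hyundai i20")
```

With these assumed thresholds, “Hunday 120” is three edits away from “Hyundai i20”, giving a confidence of roughly 73%, which is enough to match an 11-character expected value.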

The processor 16 further uses some smart processing tools which can guess at the correct value of a string even when OCR has been unreliable. For example, the processor 16 will correctly interpret “R1.000.000” as “R1,000,000.00”, i.e. one million Rand, not one Rand.
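A sketch of this kind of amount repair is given below. It rests on one stated assumption: a final group of exactly two digits is treated as cents, and every other group is treated as part of the whole-Rand amount regardless of which separator OCR reported. The function names are hypothetical.

```python
import re
from decimal import Decimal


def parse_rand(raw: str) -> Decimal:
    """Interpret an OCR'd Rand amount whose separators may have been misread."""
    digits_and_separators = re.sub(r"[^0-9.,]", "", raw)
    groups = [g for g in re.split(r"[.,]", digits_and_separators) if g]
    if not groups:
        raise ValueError(f"no amount found in {raw!r}")
    # Assumption: a trailing two-digit group is cents; any other group
    # (typically three digits) belongs to the whole-Rand amount.
    if len(groups) > 1 and len(groups[-1]) == 2:
        return Decimal("".join(groups[:-1]) + "." + groups[-1])
    return Decimal("".join(groups))


def format_rand(amount: Decimal) -> str:
    """Render with comma thousands separators and two decimal places."""
    return f"R{amount:,.2f}"


repaired = format_rand(parse_rand("R1.000.000"))
```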

Once the optical character recognition is completed, the processor 16 analyses the extracted text to identify relevant fields related to the individual and relevant fields related to insurance information, and maps the insurance information fields to comparative insurance fields of the insurer. Before processing further, the document content is scrubbed to remove unwanted lines such as header and footer content.
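The scrubbing step could be approached as in the sketch below. It assumes that header and footer lines are those that recur on most pages once digits are ignored (so that "Page 1 of 2" and "Page 2 of 2" count as the same line). That heuristic, the `min_share` parameter and the sample document text are illustrative assumptions; the source does not say how unwanted lines are identified.

```python
import re
from collections import Counter
from typing import List


def _key(line: str) -> str:
    """Normalise a line so that page numbers and similar digits do not matter."""
    return re.sub(r"\d+", "#", line.strip())


def scrub_repeated_lines(pages: List[List[str]], min_share: float = 0.6) -> List[List[str]]:
    """Drop lines that recur on at least min_share of the pages."""
    if len(pages) < 2:
        return [list(page) for page in pages]
    seen_on = Counter()
    for page in pages:
        seen_on.update({_key(line) for line in page})  # count each line once per page
    repeated = {key for key, count in seen_on.items() if count / len(pages) >= min_share}
    return [[line for line in page if _key(line) not in repeated] for page in pages]


# Made-up two-page document with a repeated header and a numbered footer.
pages = [
    ["Acme Insurance Quote", "Vehicle: Hyundai i20", "Page 1 of 2"],
    ["Acme Insurance Quote", "Premium: R450.00", "Page 2 of 2"],
]
cleaned = scrub_repeated_lines(pages)
```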

The processor 16 includes two search tools that are capable of navigating the OCR document content to extract values from the text.

The key steps used by the processor 16 are:

• find content in the document text using fuzzy matching (“matches”, “contains”, “startsWith”, and similar operations).

• find content in the document using pattern matching (e.g. using regular expressions).

• find content in the document using position (line number, geometric position, relative position, i.e. a piece of text near another known piece of text).

• find key/value pairs.

• find and interpret tables of values such as Rand value amounts in a table of premiums.
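Three of the steps above (pattern matching, position-based lookup and key/value extraction) can be sketched with regular expressions. The patterns, function names and sample lines are assumptions for illustration; table interpretation is not covered here.

```python
import re
from typing import Dict, List, Optional, Pattern

KEY_VALUE = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")
RAND_AMOUNT = re.compile(r"R\s?\d+(?:[.,]\d+)*")


def find_key_values(lines: List[str]) -> Dict[str, str]:
    """Collect "key: value" pairs, one per line."""
    pairs: Dict[str, str] = {}
    for line in lines:
        found = KEY_VALUE.match(line)
        if found:
            pairs[found.group(1)] = found.group(2)
    return pairs


def find_by_pattern(lines: List[str], pattern: Pattern[str]) -> List[str]:
    """Return every pattern match in document order."""
    return [m.group(0) for line in lines for m in pattern.finditer(line)]


def find_near(lines: List[str], label: str, pattern: Pattern[str], reach: int = 1) -> Optional[str]:
    """Return the first pattern match on the label's line or within `reach` lines of it."""
    for index, line in enumerate(lines):
        if label.lower() in line.lower():
            low, high = max(0, index - reach), min(len(lines), index + reach + 1)
            # Check the label's own line first, then the lines above, then below.
            for candidate in [line] + lines[low:index] + lines[index + 1:high]:
                found = pattern.search(candidate)
                if found:
                    return found.group(0)
    return None


# Made-up document lines.
lines = ["Policyholder: J. Smith", "Vehicle: Hyundai i20",
         "Monthly premium", "R450.00", "Excess: R5,000"]
premium = find_near(lines, "Monthly premium", RAND_AMOUNT)
```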

When interpreting the document content, the processor 16 interprets content in the document in a competitor’s domain language into terms that make sense to the insurer. For example, translating “Bicycle” to “Pedal cycle” is a feature used by the processor 16 in an extraction and mapping engine.
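A translation step of this kind can be sketched as a simple lookup. Only the “Bicycle” to “Pedal cycle” entry comes from the text above; the table name, the function name and the fall-through behaviour for unknown terms are assumptions.

```python
# Competitor wording (lower-cased) mapped to the insurer's own wording.
TERM_TRANSLATIONS = {
    "bicycle": "Pedal cycle",
}


def translate_term(term: str) -> str:
    """Return the insurer's term, or the original term if no translation is known."""
    return TERM_TRANSLATIONS.get(term.strip().lower(), term)


translated = translate_term("Bicycle")
```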

The processor 16 is able to deal with several tricky special cases, for example:

• When two columns of text are so close together horizontally that OCR interprets them as a single paragraph.

• When OCR returns text out of order (due to tiny bounding box inaccuracies in OCR results), the processor 16 can still generally find what it’s looking for by scanning content above and below the expected location.

• When document content wraps across multiple lines, breaking a string that is expected to be seen such as “A quick brown fox” into multiple strings such as (“A quick brown”, “fox”).

• A document detection framework can take an unknown document and guess what type of document it is (and which insurer it’s from) using the library of tools listed above. This is done by looking for key values such as Name and Reg No, or a combination of uniquely identifying values.
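Two of the cases above, wrapped content and document detection, can be sketched as follows. The insurer names, signature phrases and registration numbers are placeholders invented for illustration, and exact substring comparison is used here for brevity where the described system would use fuzzy matching.

```python
from typing import Dict, List, Optional

# Placeholder signatures: phrases assumed to identify each competitor's documents.
INSURER_SIGNATURES: Dict[str, List[str]] = {
    "Insurer A": ["insurer a limited", "reg no 1111/111111/11"],
    "Insurer B": ["insurer b group", "reg no 2222/222222/22"],
}


def contains_wrapped(lines: List[str], expected: str) -> bool:
    """Find expected text even when it has been broken across consecutive lines."""
    document = " ".join(" ".join(lines).split()).lower()
    return " ".join(expected.split()).lower() in document


def detect_insurer(lines: List[str], minimum_hits: int = 1) -> Optional[str]:
    """Guess the issuing insurer by counting signature phrases found in the document."""
    best, best_hits = None, 0
    for insurer, phrases in INSURER_SIGNATURES.items():
        hits = sum(1 for phrase in phrases if contains_wrapped(lines, phrase))
        if hits > best_hits:
            best, best_hits = insurer, hits
    return best if best_hits >= minimum_hits else None


wrapped = contains_wrapped(["A quick brown", "fox jumps"], "A quick brown fox")
insurer = detect_insurer(["INSURER B GROUP", "Motor quote", "Reg No 2222/222222/22"])
```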

In the next step in the process, a mapping engine maps competitive values and benefits to the equivalent values and benefits of the insurer operating the system.

There are two example ways of doing this. The first uses configuration-based elements, where there is a configuration entry for each competitor value and the insurer value to which it maps.

In the second example way, more complex logic is applied where a number of competitor values map to a number of the insurer’s values. This is implemented as specific Java classes.
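The two mapping approaches can be sketched side by side. The example is written in Python for brevity, whereas the text above states that the complex logic is implemented as Java classes. All field names, values and the rule itself are hypothetical.

```python
from typing import Callable, Dict, List

Fields = Dict[str, str]

# First approach: one configuration entry per competitor field (hypothetical names).
SIMPLE_FIELD_MAP: Dict[str, str] = {
    "competitor_excess": "insurer_excess",
    "competitor_sum_insured": "insurer_cover_amount",
}


def map_cover_type(competitor: Fields) -> Fields:
    """Second approach: a hypothetical rule where two competitor values set two insurer values."""
    theft = competitor.get("theft_cover") == "yes"
    fire = competitor.get("fire_cover") == "yes"
    perils = [name for name, enabled in (("theft", theft), ("fire", fire)) if enabled]
    return {
        "insurer_plan": "comprehensive" if theft and fire else "limited",
        "insurer_perils": ",".join(perils),
    }


COMPLEX_RULES: List[Callable[[Fields], Fields]] = [map_cover_type]


def map_to_insurer(competitor: Fields) -> Fields:
    """Apply the simple configuration first, then each complex rule."""
    mapped = {target: competitor[source]
              for source, target in SIMPLE_FIELD_MAP.items() if source in competitor}
    for rule in COMPLEX_RULES:
        mapped.update(rule(competitor))
    return mapped


mapped = map_to_insurer({"competitor_excess": "5000", "theft_cover": "yes", "fire_cover": "no"})
```

Keeping the one-to-one cases in configuration means new competitor fields can be added without code changes, while rules are reserved for cases where several values interact.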

The output of the mapping engine is then passed to the insurer’s quoting engine that takes a quote provided in the insurer’s format and then validates the structure and calculates the premium that would apply thereby generating the quote.

The generated insurance quote is then transmitted to the user via the communications module.
