

Title:
A SYSTEM FOR IMPROVING FINANCIAL DOCUMENT DIGITIZATION AND EXTRACTION USING HIGH DEFINITION VIDEO CAPTURES
Document Type and Number:
WIPO Patent Application WO/2020/183499
Kind Code:
A1
Abstract:
The present invention relates to improving document digitization and extraction. The present invention includes a handheld mobile device (102) and a cloud server (104), connected through the internet. The handheld device (102) is configured to receive a video stream of the user's credential financial documents from a camera (116) of the handheld device (102) and to detect the financial documents in the video stream. If the criteria governing image capture from the video stream are satisfied, the documents are imaged. The image data comprises a plurality of frames, and a composite image is generated based on at least two of the plurality of frames. The cloud server (104) then performs optical character recognition and extracts information from the image sent by the handheld device.

Inventors:
BIANCHI LUCAS (IN)
JOHN JINESH (IN)
GUTHI SIDDHIVINAYAK (IN)
PARAMESWARAN KRISHNAN (IN)
Application Number:
PCT/IN2020/050232
Publication Date:
September 17, 2020
Filing Date:
March 13, 2020
Assignee:
OPENDOORS FINTECH PVT LTD (IN)
International Classes:
G06K9/22; G06K9/46; G06Q90/00
Foreign References:
US20140327940A12014-11-06
US20150078671A12015-03-19
Attorney, Agent or Firm:
SHARMA, Isha (IN)
Claims:
CLAIMS

1. A system (100) for improving document digitization and extraction, the system (100) comprising: at least one handheld mobile device (102), the at least one handheld mobile device (102) being used to video grab the documents; and at least one cloud server (104), the at least one handheld device (102) and the at least one cloud server (104) being connected through the internet; wherein the at least one cloud server (104) is used for optical character recognition and the extraction of information.

2. The system (100) as claimed in claim 1, wherein the handheld device (102) comprises: at least one camera (116); at least one handheld device machine readable storage medium (106), the at least one handheld device machine readable storage medium (106) storing executable instructions; and at least one handheld device processor (108); wherein the at least one handheld device processor (108) processes the executable instructions to execute an image analysis and extract a high quality image.

3. The system (100) as claimed in claim 1, wherein the cloud server (104) comprises: at least one server computer (110), the at least one server computer (110) having at least one cloud server machine readable storage medium (112), the at least one cloud server machine readable storage medium (112) storing executable instructions, and at least one cloud server processor (114); wherein the at least one cloud server processor (114) processes the executable instructions to perform optical character recognition and extract information from the image.

4. The system (100) as claimed in claim 1, wherein the at least one handheld device (102) executes a machine learning algorithm that uses neural networks to understand and identify good images from the video grab.

5. The system (100) as claimed in claim 1, wherein the at least one handheld device (102) is selected from a mobile device, a tablet, or a handheld scanner.

6. A method for operation and monitoring of document digitization, the method comprising: configuring at least one handheld device (102) to receive a video stream of the user's credential financial documents from at least one camera (116); detecting the financial documents in the video stream; if the criteria governing image capture from the video stream are satisfied, presenting a second indication on the at least one handheld device (102) that the user's credential document is ready to be imaged; the image data comprising a plurality of frames; generating a composite image based on at least two of the plurality of frames; transferring the composite image from the at least one handheld device (102) to at least one cloud server (104) to perform optical character recognition and extract information from the image; and capturing information from a user interface regarding a particular type of credential financial document from the image sent by the handheld device (102).

7. The method as claimed in claim 6, wherein the at least one handheld device(102) determines whether the following criteria for capturing the image of the user's credential document are satisfied: the video stream includes a complete surface of the user's credential document; the video stream indicates that the user's credential document is within a threshold distance of the at least one handheld device(102); the video stream indicates that the user's credential document is in focus; and the video stream of the user's credential document is recognized to match the particular type of credential document to be imaged from the captured information; if the criteria are not satisfied, presenting a first indication on the at least one handheld device(102) that the user's credential document is not ready to be imaged; and if the criteria are satisfied, presenting a second indication on the at least one handheld device(102) that the user's credential document is ready to be imaged and capturing an image of the user's credential document.

Description:
A SYSTEM FOR IMPROVING FINANCIAL DOCUMENT DIGITIZATION AND EXTRACTION USING HIGH DEFINITION VIDEO CAPTURES

FIELD OF THE INVENTION

The present invention relates to a system and method for improving financial document digitization; more specifically, the present invention relates to the extraction of data from financial documents using high definition video.

BACKGROUND

Modern mobile devices, such as smart phones and the like, combine multiple technologies to provide the user with a vast array of capabilities. For example, many smart phones are equipped with significant processing power, sophisticated multi-tasking operating systems, and high-bandwidth Internet connection capabilities. Moreover, such devices often have additional features that are becoming increasingly more common as standardized features. Such features include, but are not limited to, location-determining devices, such as Global Positioning System (GPS) devices; sensor devices, such as accelerometers and touch pads; and high-resolution video cameras. Modern mobile devices are well adapted to capturing images of a variety of objects, including documents, persons, automobiles, etc. Improvements to the mobile device camera capabilities and/or processing power make applications for capturing and/or processing digital image data using a mobile device increasingly attractive in an increasingly mobile-device-driven economy. However, limitations of the mobile device hardware and practical limitations of capturing images using a mobile device present major challenges to efficient and effective digital image processing. For example, digital images captured using a mobile device are often of insufficient quality for subsequent processing due to one or more artefacts such as blur, uneven illumination, insufficient illumination, oversaturated illumination, insufficient resolution, projective effects, etc. Attempts to process digital images including such artefacts may fail completely or produce inadequate quality results for the desired application. 
At best, the user may be required to repeat the capture operation and attempt to improve the quality of the image, but in some cases recapturing the image may be impossible, resulting in a lost opportunity for acquiring images of important but transient circumstances, such as the location or condition of a person or vehicle before, during, and/or after an automobile accident. US9253349B2 discloses a method that includes capturing plural frames of video data using a mobile device. The frames are analyzed to determine whether any depict an object exhibiting one or more defining characteristics, and if so, whether those frame(s) depicting the object also satisfy one or more predetermined quality control criteria. If one or more of the frames depict the object and also satisfy the one or more predetermined quality control criteria, the method further includes automatically capturing an image of the object. Exemplary defining characteristics are specified for various types of object, particularly objects comprising documents. Related systems and computer program products are also disclosed. The disclosed techniques and systems represent translational developments across the fields of image processing and business process management. Improved analytical techniques enable processing of images captured using cameras rather than traditional scanner technology, and facilitate the distribution, tracking and analysis of documents and information throughout business processes.

US8995012B2 discloses an automated document processing system, particularly for mobile image capture and processing of financial documents to enhance images captured on a mobile device with camera capabilities for data extraction. The systems comprise a mobile device that includes a capture device configured to capture color images of documents, and that has a processor for performing certain operations, such as color reduction, and a transmitter for sending an image from the mobile device to a server. The server is configured to optimize and enhance the image, and to apply an improved binarization algorithm using a window within a relevant document field and/or a threshold for the document field. Orientation correction may also be performed at the server by reading the MICR line on a check and comparing a MICR confidence to a threshold. A check image may also be size corrected using features of the MICR line and expected document dimensions.

The existing inventions are not effective in grabbing an image from the captured video. In some of the existing systems, video grabbing does not work properly and some images are blurred. The existing inventions are time consuming and require a lot of effort to grab a high definition image from video. The present invention overcomes these deficiencies in the prior art. Hence, the present invention is needed in order to facilitate the effective digitization of financial documents and the extraction of data therefrom.

OBJECTIVE OF THE INVENTION

The main objective of the present invention is to provide financial document digitization.

Another objective of the present invention is to provide clear images from the video grab.

Yet another objective of the present invention is to ensure that the captured image is of high enough quality for OCR, by means of an algorithm which determines and retains only the high quality images.

Yet another objective of the present invention is to automatically perform the digitization with the lowest maintenance cost.

Yet another objective of the present invention is to provide image analysis by a machine learning algorithm.

Further objectives and features of the present invention will become apparent from the detailed description provided herein below, in which various embodiments of the disclosed present invention are illustrated by way of example and appropriate reference to accompanying drawings.

SUMMARY OF THE PRESENT INVENTION

The present invention relates to improving document digitization and extraction. The present invention includes a handheld mobile device and a cloud server. The handheld mobile device is used to video grab the documents. The handheld device and the cloud server are connected through the internet. The cloud server is used for optical character recognition and the extraction of information. In an embodiment, the handheld device comprises a camera, a handheld device machine readable storage medium, and a handheld device processor. The handheld device machine readable storage medium stores executable instructions, and the handheld device processor processes the executable instructions to execute image analysis and extract a high quality image. The cloud server comprises a server computer, the server computer having a cloud server machine readable storage medium and a cloud server processor. The cloud server machine readable storage medium stores executable instructions, and the cloud server processor processes the executable instructions to perform optical character recognition and extract information from the image.

In an embodiment, the method for operation and monitoring of document digitization comprises: configuring a handheld device to receive a video stream of the user's credential financial documents from a camera of the handheld device and to detect the financial documents in the video stream; if the criteria governing image capture from the video stream are satisfied, presenting a second indication on the handheld device that the user's credential document is ready to be imaged; the image data comprising a plurality of frames, generating a composite image based on at least two of the plurality of frames; transferring the composite image from the handheld device to the cloud server to perform optical character recognition and extract information from the image; and capturing information from a user interface regarding a particular type of credential financial document from the image sent by the handheld device.
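By way of a non-limiting illustration, the division of labour described above (capture and frame selection on the handheld device, OCR and extraction on the cloud server) could be sketched as a minimal client-side handoff. The field names, the base64 transport encoding, and the Python language itself are assumptions made purely for illustration; the specification states only that the image is sent over the internet to the server.

```python
import base64
import json

def build_upload_payload(composite_image_bytes, document_type):
    """Package the composite image for transfer to the cloud server.

    The JSON field names and the base64 transport encoding are
    illustrative assumptions; the specification only requires that
    the image reach the server for OCR and information extraction.
    """
    return json.dumps({
        "document_type": document_type,
        "image_b64": base64.b64encode(composite_image_bytes).decode("ascii"),
    })
```

The server side would decode `image_b64` and run optical character recognition on the result; that step is outside the scope of this sketch.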

An advantage of the present invention is that it effectively provides a high quality image from the video grab.

Another advantage of the present invention is that it helps in rejecting images which are not good.

Another advantage of the present invention is that it does not produce any blurred images.

Yet another advantage of the present invention is that it helps speed up the overall process.

Yet another advantage of the present invention is that the present invention is operationally effective, cost effective, and easy to operate.

Yet another advantage of the present invention is that the present invention is very reliable.

Yet another advantage of the present invention is that the final image feed does not contain any shadows or lines.

Further advantages and features of the present invention will become apparent from the detailed description provided herein below, in which various embodiments of the disclosed present invention are illustrated by way of example and appropriate reference to accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings are incorporated in and constitute a part of this specification to provide a further understanding of the invention. The drawings illustrate one embodiment of the invention and, together with the description, serve to explain the principles of the invention.

Fig. 1 illustrates a schematic drawing of the system of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

Definition

The terms "a" or "an", as used herein, are defined as one or as more than one. The term "plurality", as used herein, is defined as two or as more than two. The term "another", as used herein, is defined as at least a second or more. The terms "including" and/or "having", as used herein, are defined as comprising (i.e., open language). The term "coupled", as used herein, is defined as connected, although not necessarily directly, and not necessarily mechanically.

The term "comprising" is not intended to limit inventions to only claiming the present invention with such comprising language. Any invention using the term "comprising" could be separated into one or more claims using "consisting" or "consisting of" claim language, and is so intended. The term "comprising" is used interchangeably with the terms "having" or "containing".

Reference throughout this document to "one embodiment", "certain embodiments", "an embodiment", "another embodiment", "yet another embodiment" or similar terms means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of such phrases in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments without limitation.

The term "or" as used herein is to be interpreted as an inclusive or, meaning any one or any combination. Therefore, "A, B or C" means any of the following: "A; B; C; A and B; A and C; B and C; A, B and C". An exception to this definition will occur only when a combination of elements, functions, steps or acts is in some way inherently mutually exclusive.

As used herein, the term "one or more" generally refers to, but is not limited to, the singular as well as the plural form of the term.

The drawings featured in the figures are for the purpose of illustrating certain convenient embodiments of the present invention, and are not to be considered as a limitation thereto. The term "means" preceding a present participle of an operation indicates a desired function for which there are one or more embodiments, i.e., one or more methods, devices, or apparatuses for achieving the desired function, and one skilled in the art could select from these or their equivalents in view of the disclosure herein; use of the term "means" is not intended to be limiting.

Fig. 1 illustrates a schematic drawing of the system (100), which relates to improving document digitization and extraction. The present invention includes a handheld mobile device (102) and a cloud server (104). The handheld device (102) comprises: a camera (116), a handheld device machine readable storage medium (106), and a handheld device processor (108). The cloud server (104) comprises a server computer (110), the server computer (110) having a cloud server machine readable storage medium (112) and a cloud server processor (114).

The present invention relates to improving document digitization and extraction. The present invention includes a handheld mobile device and a cloud server. The handheld mobile device is used to video grab the documents. The handheld device and the cloud server are connected through the internet. The cloud server is used for optical character recognition and the extraction of information. In an embodiment, the handheld device comprises a camera, a handheld device machine readable storage medium, and a handheld device processor. The handheld device machine readable storage medium stores executable instructions, and the handheld device processor processes the executable instructions to execute image analysis and extract a high quality image. The cloud server comprises a server computer, the server computer having a cloud server machine readable storage medium and a cloud server processor. The cloud server machine readable storage medium stores executable instructions, and the cloud server processor processes the executable instructions to perform optical character recognition and extract information from the image. In an embodiment, the handheld device executes a machine learning algorithm that uses neural networks to understand and identify good images from the video grab. In an embodiment, the handheld device is selected from a mobile device, a tablet, or a handheld scanner.
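The specification does not disclose the internal details of the frame-selection algorithm. As a non-limiting sketch of one plausible approach (not the claimed neural-network method), a simple sharpness score such as the variance of a discrete Laplacian can separate blurred frames from crisp ones; the threshold value below is a hypothetical tuning parameter, not a value taken from the specification.

```python
import numpy as np

def laplacian_variance(gray):
    """Sharpness score: variance of a discrete Laplacian.

    `gray` is a 2-D array of pixel intensities. Blurred frames have
    weak edges, so their Laplacian response has low variance.
    """
    g = gray.astype(np.float64)
    # 4-neighbour discrete Laplacian, computed over interior pixels only
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return lap.var()

def select_good_frames(frames, threshold=100.0):
    """Keep frames whose sharpness score exceeds `threshold`.

    `threshold` is a hypothetical tuning parameter chosen for
    illustration; it would be calibrated per camera in practice.
    """
    return [f for f in frames if laplacian_variance(f) > threshold]
```

A sharp frame of a document produces strong edge responses and a high score; a defocused frame of the same document scores near zero and is rejected.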

In an embodiment, the method for operation and monitoring of document digitization comprises: configuring a handheld device to receive a video stream of the user's credential financial documents from a camera of the handheld device; detecting the financial documents in the video stream; if the criteria governing image capture from the video stream are satisfied, presenting a second indication on the handheld device that the user's credential document is ready to be imaged, the image data comprising a plurality of frames; generating a composite image based on at least two of the plurality of frames; transferring the composite image from the handheld device to the cloud server to perform optical character recognition and extract information from the image; and capturing information from a user interface regarding a particular type of credential financial document from the image sent by the handheld device.
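The sequence of method steps above can be summarised, purely for illustration, as the following orchestration sketch. The injected callables stand in for analysis, compositing, and transport implementations that the specification leaves open; none of the names below appear in the specification itself.

```python
def digitize(frames, expected_type, is_ready, make_composite, upload):
    """End-to-end flow of the described method, as a hypothetical sketch.

    `is_ready` checks the capture criteria for one frame,
    `make_composite` merges at least two accepted frames, and
    `upload` sends the composite to the cloud server for OCR.
    All three are injected so the sketch stays independent of any
    particular implementation.
    """
    accepted = [f for f in frames if is_ready(f, expected_type)]
    if len(accepted) < 2:
        # Fewer than two usable frames: no composite can be formed,
        # corresponding to the 'not ready' indication on the device.
        return None
    return upload(make_composite(accepted[:2]), expected_type)
```

Dependency injection is used here only to keep the sketch self-contained; a production implementation would bind these to the device's camera pipeline and the server's API.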

In an embodiment, the handheld device determines whether the following criteria for capturing the image of the user's credential document are satisfied: the video stream includes a complete surface of the user's credential document; the video stream indicates that the user's credential document is within a threshold distance of the handheld device; the video stream indicates that the user's credential document is in focus; and the video stream of the user's credential document is recognized to match the particular type of credential document to be imaged from the captured information. If the criteria are not satisfied, a first indication is presented on the handheld device that the user's credential document is not ready to be imaged; and if the criteria are satisfied, a second indication is presented on the handheld device that the user's credential document is ready to be imaged, and an image of the user's credential document is captured.
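The four readiness criteria in this passage can be expressed as independent boolean checks. The structure below is a hypothetical sketch: the field names, the distance threshold, and how each value is computed are all assumptions for illustration, as the specification does not prescribe them.

```python
from dataclasses import dataclass

@dataclass
class FrameAnalysis:
    # Illustrative per-frame analysis results; the specification does
    # not prescribe how these values are computed.
    full_surface_visible: bool   # complete document surface in view
    distance_mm: float           # estimated camera-to-document distance
    in_focus: bool               # e.g. sharpness score above a threshold
    detected_type: str           # document type recognised in the frame

MAX_DISTANCE_MM = 300.0  # hypothetical threshold, not from the specification

def ready_to_image(a: FrameAnalysis, expected_type: str) -> bool:
    """All four criteria must hold before the 'ready' indication."""
    return (a.full_surface_visible
            and a.distance_mm <= MAX_DISTANCE_MM
            and a.in_focus
            and a.detected_type == expected_type)

def indication(a: FrameAnalysis, expected_type: str) -> str:
    # First indication: not ready; second indication: ready to image.
    return "ready" if ready_to_image(a, expected_type) else "not ready"
```

Keeping the criteria as separate fields lets the device report which check failed, which is what drives the "not ready" guidance shown to the user.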

The present invention relates to improving document digitization and extraction. The present invention includes one or more handheld mobile devices and one or more cloud servers. The one or more handheld mobile devices are used to video grab the documents. The one or more handheld devices and the one or more cloud servers are connected through the internet. The one or more cloud servers are used for optical character recognition and the extraction of information. In an embodiment, the one or more handheld devices have one or more cameras, one or more handheld device machine readable storage mediums, and one or more handheld device processors. The one or more handheld device machine readable storage mediums store executable instructions, and the one or more handheld device processors process the executable instructions to execute an image analysis and extract a high quality image. The one or more cloud servers comprise one or more server computers, the one or more server computers having one or more cloud server machine readable storage mediums and one or more cloud server processors. The one or more cloud server machine readable storage mediums store executable instructions, and the one or more cloud server processors process the executable instructions to perform optical character recognition and extract information from the image. In an embodiment, the one or more handheld devices are used for the video grab of documents, and the machine learning algorithm, which uses neural networks, understands and identifies good images from the video grab. In an embodiment, the one or more handheld devices are selected from a mobile device, a tablet, or handheld scanners.

In an embodiment, the method for operation and monitoring of document digitization comprises:

The one or more handheld devices are configured to receive a video stream of the user's credential financial documents from a camera of the one or more handheld devices; detecting the financial documents in the video stream; if the criteria governing image capture from the video stream are satisfied, presenting a second indication on the one or more handheld devices that the user's credential document is ready to be imaged, the image data comprising a plurality of frames; generating a composite image based on at least two of the plurality of frames; transferring the selected frames from the one or more handheld devices to the one or more cloud servers for data extraction; transferring the composite image from the one or more handheld devices to the one or more cloud servers to perform optical character recognition and extract information from the image; and capturing information from a user interface regarding a particular type of credential financial document from the image sent by the handheld device.
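The specification does not state how the composite image is formed from the selected frames. One common technique, assumed here purely for illustration and not claimed by the specification, is a per-pixel median over aligned frames, which suppresses transient defects such as momentary glare that appear in only some frames:

```python
import numpy as np

def composite_image(frames):
    """Per-pixel median over two or more aligned frames.

    `frames` is a sequence of equally sized 2-D (or 3-D colour)
    arrays. A median composite keeps detail present in most frames
    while rejecting outliers such as momentary glare. Real frames
    would first need registration (alignment), which is omitted here.
    """
    if len(frames) < 2:
        raise ValueError("composite requires at least two frames")
    stack = np.stack([f.astype(np.float64) for f in frames])
    return np.median(stack, axis=0)
```

With three frames in which one contains a bright glare pixel, the median composite discards the outlier while preserving the value shared by the other two frames.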

In an embodiment, the one or more handheld devices determine whether the following criteria for capturing the image of the user's credential document are satisfied: the video stream includes a complete surface of the user's credential document; the video stream indicates that the user's credential document is within a threshold distance of the one or more handheld devices; the video stream indicates that the user's credential document is in focus; and the video stream of the user's credential document is recognized to match the particular type of credential document to be imaged from the captured information. If the criteria are not satisfied, a first indication is presented on the one or more handheld devices that the user's credential document is not ready to be imaged; and if the criteria are satisfied, a second indication is presented on the one or more handheld devices that the user's credential document is ready to be imaged, and an image of the user's credential document is captured.

Further objectives, advantages and features of the present invention will become apparent from the detailed description provided herein below, in which various embodiments of the disclosed present invention are illustrated by way of example and appropriate reference to accompanying drawings. Those skilled in the art to which the present invention pertains may make modifications resulting in other embodiments employing principles of the present invention without departing from its spirit or characteristics, particularly upon considering the foregoing teachings. Accordingly, the described embodiments are to be considered in all respects only as illustrative, and not restrictive, and the scope of the present invention is, therefore, indicated by the appended claims rather than by the foregoing description or drawings. Consequently, while the present invention has been described with reference to particular embodiments, modifications of structure, sequence, materials and the like apparent to those skilled in the art still fall within the scope of the invention as claimed by the applicant.