Title:
METHOD AND DEVICE FOR CLASSIFYING FIELD TYPES OF A DIGITAL IMAGE
Document Type and Number:
WIPO Patent Application WO/2004/086292
Kind Code:
A1
Abstract:
A method (20) and electronic device (1) for classifying field types of text in an image captured by a camera (17). In use, both the method (20) and device (1) obtain the image from the camera (17) and then identify text areas of the image. Next, character recognition is performed on text in the text area to provide output character data that is classified into a field type. The output character data is then stored in a memory (16) in a location indicative of the field type.

Inventors:
ZHEN LI XIN (CN)
LI JUN (CN)
HUANG JIAN CHENG (US)
Application Number:
PCT/EP2004/050281
Publication Date:
October 07, 2004
Filing Date:
March 10, 2004
Assignee:
MOTOROLA INC (US)
MOTOROLA LTD (GB)
ZHEN LI XIN (CN)
LI JUN (CN)
HUANG JIAN CHENG (US)
International Classes:
G06K9/20; (IPC1-7): G06K9/20
Domestic Patent References:
WO2002061670A1 2002-08-08
Foreign References:
US20030044068A1 2003-03-06
US20020037104A1 2002-03-28
Other References:
PATENT ABSTRACTS OF JAPAN vol. 0082, no. 19 (P - 306) 5 October 1984 (1984-10-05)
Attorney, Agent or Firm:
Mccormack, Derek J. (Midpoint Alencon Link, Basingstoke Hampshire RG21 7PL, GB)
Claims:
WE CLAIM:
1. A method effected by an electronic device, the method providing for classifying field types of text in an image captured by a camera, the method including: obtaining the image; identifying at least one text area of the image; performing character recognition on text in the at least one said text area to provide output character data; classifying said at least one text area into a field type; and storing said output character data in a location indicative of said field type.
2. The method as claimed in claim 1, the method further including rotating the image to substantially remove skew associated with the text, the rotating being effected prior to said classifying.
3. The method as claimed in claim 1, wherein the obtaining includes performing resolution enhancement on said image.
4. The method as claimed in claim 3, wherein the resolution enhancement is effected if the image has a resolution below a threshold value.
5. The method as claimed in claim 3, wherein the obtaining includes performing binarization on said image if the image is represented in gray scale.
6. The method as claimed in claim 1, further characterized by the image being an image of a business card.
7. The method as claimed in claim 1, wherein the classifying of said at least one text area into a field type is effected by a set of rules.
8. The method as claimed in claim 1, wherein the storing effects a storing of said output character data in at least one address book field on an electronic device.
9. The method as claimed in claim 8, wherein said address book field is a telephone number field.
10. The method as claimed in claim 8, wherein said address book field is a person's name field.
11. The method as claimed in claim 8, wherein said address book field is a street address field.
12. The method as claimed in claim 8, wherein the said address book field is a company name field.
13. An electronic device for classifying field types of text in an image, the device comprising: a processor; a memory coupled to the processor; and a camera input port, wherein, in use, the camera input port allows for obtaining the image and the processor thereafter effects identifying at least one text area of the image, and thereafter the processor: performs character recognition on text in the at least one said text area to provide output character data; classifies said at least one text area into a field type; and stores in the memory said output character data in a location indicative of said field type.
14. An electronic device as claimed in claim 13 wherein the camera input port is coupled to a camera.
15. An electronic device as claimed in claim 13 wherein the processor provides for rotating the image to substantially remove skew associated with the text, the rotating being effected prior to the processor classifying the text area into a field type.
Description:
METHOD AND DEVICE FOR CLASSIFYING FIELD TYPES OF A DIGITAL IMAGE

FIELD OF THE INVENTION

This invention relates to classifying field types of a digital image that includes text. The invention is particularly useful for, but not necessarily limited to, classifying field types from a digital image of a business card.

BACKGROUND OF THE INVENTION

It is known to obtain data from business cards by using desktop scanning devices. Such scanning devices require cards to be inserted into a slot in an aligned manner so that fields such as a person's name field and telephone field can be easily identified. Once the fields are identified, the scanner can store each field's associated information (personal name, address, telephone number, company name etc.), which can be downloaded to a user's personal address book located on a computer, personal digital assistant, cellular telephone or any other suitable electronic device. However, it is not convenient to carry these scanning devices on business trips or to out-of-office meetings, and therefore a user would typically have to wait until they returned to their office or home before scanning any business cards that they received.

In US patent 6178270 there is described a method and device for processing images of, for example, written documentation. This patent describes the use of a camera for capturing an image of a document, and a user then selects areas of text in the image for processing. A skew-angle is then determined for use in processing the selected areas of text. Although this method and device are useful for processing images, they do not provide for the capture of data associated with classification of field types, where these field types are identified from a digital image of a business card captured by a camera or other similar device.

In this specification, including the claims, the terms 'comprises', 'comprising' or similar terms are intended to mean a non-exclusive inclusion, such that a method or apparatus that comprises a list of elements does not include those elements solely, but may well include other elements not listed.

SUMMARY OF THE INVENTION

According to one aspect of the invention there is provided a method effected by an electronic device, the method providing for classifying field types of text in an image captured by a camera, the method including: obtaining the image; identifying at least one text area of the image; performing character recognition on the text in the at least one said text area to provide output character data; classifying said at least one text area into a field type; and storing said output character data in a location indicative of said field type.

Preferably, the method includes rotating the image to substantially remove skew associated with the text, the rotating being effected prior to said classifying.

The obtaining may suitably include performing resolution enhancement on said image. Preferably, the resolution enhancement is effected if the image has a resolution below a threshold value.

The obtaining may preferably include performing binarization on said image if the image is represented in gray scale.

Suitably, the method is further characterized by the image being an image of a business card.

Preferably, the classifying of said at least one text area into a field type is effected by a set of rules.

Preferably, the storing effects a storing of said output character data in at least one address book field on an electronic device.

Preferably, said address book field is a telephone number field.

Suitably, said address book field is a person's name field.

Preferably, said address book field is a street address field.

Suitably, said address book field is a company name field.

According to another aspect of the invention there is provided an electronic device for classifying field types of text in an image, the device comprising: a processor; a memory coupled to the processor; and a camera input port, wherein, in use, the camera input port allows for obtaining the image and the processor thereafter effects identifying at least one text area of the image, and thereafter the processor: performs character recognition on text in the at least one said text area to provide output character data; classifies said at least one text area into a field type; and stores in the memory said output character data in a location indicative of said field type.

Preferably, the camera input port is coupled to a camera.

Suitably, the processor provides for rotating the image to substantially remove skew associated with the text, the rotating being effected prior to the processor classifying the text area into a field type.

BRIEF DESCRIPTION OF THE DRAWINGS

In order that the invention may be readily understood and put into practical effect, reference will now be made to a preferred embodiment as illustrated with reference to the accompanying drawings in which:

Fig. 1 is a block diagram illustrating an embodiment of an electronic device in accordance with the invention;

Fig. 2 is a flow diagram illustrating a method for classifying field types of text in an image captured by a camera of the electronic device of Fig. 1; and

Fig. 3 is a flow diagram illustrating the process of obtaining an image used in Fig. 2.

DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT OF THE INVENTION

In the drawings, like numerals on different Figs are used to indicate like elements throughout. With reference to Fig. 1, there is illustrated an electronic device 1 comprising a radio frequency communications unit 2 coupled to be in communication with a processor 3. An input interface in the form of a screen 5 and a keypad 6 is also coupled to be in communication with the processor 3. Furthermore, there is a camera input port 19 also coupled to be in communication with the processor 3, the camera input port 19 being coupled to an associated camera 17. As will be apparent to a person skilled in the art, the camera 17 may be an integral part of the device 1 or alternatively the camera 17 may be a detachable accessory.

The processor 3 includes an encoder/decoder 11 with an associated Read Only Memory (ROM) 12 storing data for encoding and decoding voice or other signals that may be transmitted or received by electronic device 1. The processor 3 also includes a micro-processor 13 coupled to both the encoder/decoder 11 and an associated character Read Only Memory (ROM) 14. Micro-processor 13 is also coupled to a Random Access Memory (RAM) 4, the keypad 6, the screen 5, the camera 17 and a static programmable memory 16.

Auxiliary outputs of micro-processor 13 are coupled to an alert module 15 that typically contains a speaker, vibrator motor and associated drivers.

The character Read Only Memory 14 stores code for decoding or encoding text messages that may be received by the communications unit 2 or input at the keypad 6. In this embodiment the character Read Only Memory 14 also stores operating code (OC) for micro-processor 13 and code for performing a method for classifying field types of text in an image captured by the camera 17.

The radio frequency communications unit 2 is a combined receiver and transmitter having a common antenna 7. The communications unit 2 has a transceiver 8 coupled to antenna 7 via a radio frequency amplifier 9. The transceiver 8 is also coupled to a combined modulator/demodulator 10 that couples the communications unit 2 to the processor 3.

As will be apparent to a person skilled in the art, the electronic device 1 can be any electronic device including a cellular telephone, a conventional type telephone, a laptop computer or a PDA.

Referring to Fig. 2 there is illustrated a method 20 for classifying field types of text in an image captured by the camera 17. The method 20 includes a start step 21 that is invoked by a user operating a command function on the keypad 6. An image is obtained at an obtaining an image step 22, wherein a user would typically direct the camera 17 at a business card until the business card approximately fitted within the boundary of the screen 5 that displays what is being detected by the camera 17. However, the complete business card does not need to be displayed on screen 5; only the required text portion of the card needs to be displayed. Once a user is satisfied with what is being displayed on screen 5, the image is captured and stored in RAM 4.

The processor 3 then effects a rotating if required step 23 to provide for rotating the image to substantially remove skew associated with the text. The skew is determined by an angle detection algorithm that calculates the angle between a reference axis (horizontal axis) and an axis along which text in the text areas extends. The skew-angle is determined by an algorithm described in US patent 6178270 that is incorporated into this specification by reference.
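The skew-angle algorithm itself is defined in US patent 6178270 and is not reproduced in this specification. Purely as an illustrative stand-in, the Python sketch below estimates the angle between the horizontal reference axis and a line of text by fitting a least-squares straight line through sample points along that line. The function name `estimate_skew_degrees` and the point-list input are assumptions made for illustration only.

```python
import math

def estimate_skew_degrees(points):
    """Estimate text skew from (x, y) points sampled along a text line.

    Illustrative stand-in only: a least-squares line fit, NOT the
    algorithm of US patent 6178270.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of squares and cross-products about the mean
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        return 90.0  # all points share one x value: vertical line
    # Angle between the fitted line and the horizontal axis
    return math.degrees(math.atan2(sxy, sxx))
```

Rotating the image by the negative of the returned angle would then substantially remove the skew.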

After the processor 3 effects the rotating if required step 23, the processor 3 then controls the method 20 to effect an identifying 24 of at least one text area of the image, the text area comprising text. The identification first projects the binarized image in both horizontal and vertical directions, then analyses the projected profiles of both directions using known layout analysis techniques to delimit (distinguish) every text area from the background of the image.
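The projection step can be sketched in Python as follows. This is a minimal illustration, assuming the binarized image is a list of rows in which 1 marks a black pixel; the helper names `projection_profiles`, `runs` and `find_text_areas` are hypothetical, and the "known layout analysis techniques" mentioned above are reduced here to splitting the profiles at empty rows and columns.

```python
def projection_profiles(binary):
    """Return (row_counts, col_counts) of black pixels; 1 means black."""
    rows = [sum(row) for row in binary]
    cols = [sum(col) for col in zip(*binary)]
    return rows, cols

def runs(profile, min_count=1):
    """Return (start, end) spans, end exclusive, where profile >= min_count."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value >= min_count and start is None:
            start = i
        elif value < min_count and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans

def find_text_areas(binary):
    """Delimit text areas as (top, bottom, left, right) boxes, end exclusive."""
    row_profile, _ = projection_profiles(binary)
    areas = []
    for top, bottom in runs(row_profile):
        # Project each horizontal band vertically to split it into areas
        _, col_profile = projection_profiles(binary[top:bottom])
        for left, right in runs(col_profile):
            areas.append((top, bottom, left, right))
    return areas
```

A practical layout analysis would also merge spans separated by small gaps, so that the characters of one word or line end up in a single text area; that step is omitted from this sketch.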

The method then conducts a test 25 to check the image quality by checking a separability ratio between black and white regions in the binarized image. The test determines the likelihood of the black regions being characters, and this likelihood is expressed as the separability ratio. A larger ratio indicates better image quality. The separability ratio is compared with a threshold that is usually evaluated and calculated beforehand by training on sample images containing just text areas; typically the separability ratio should be at least 0.8 for an image of good quality. If the separability ratio is below the threshold, binarization refinement 26 is required. Binarization refinement 26 is restricted to only the text areas detected by the step of identifying 24 of at least one text area. As will be apparent to a person skilled in the art, binarization refinement 26 recalculates a binarization decision threshold using data within only the identified text areas so as to avoid the effects from non-text regions.
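The specification does not give a formula for the separability ratio. One common measure that lies between 0 and 1 and grows with cleaner separation is the between-class variance divided by the total variance of the gray levels (the criterion used in Otsu-style thresholding). The Python sketch below uses that measure as an assumption; the function names and the 0.8 default are illustrative only.

```python
def separability_ratio(pixels, threshold):
    """Between-class variance over total variance for a gray-level split.

    Assumed definition; the specification does not state the formula.
    """
    dark = [p for p in pixels if p < threshold]
    light = [p for p in pixels if p >= threshold]
    n = len(pixels)
    if not dark or not light:
        return 0.0
    mean = sum(pixels) / n
    total_var = sum((p - mean) ** 2 for p in pixels) / n
    if total_var == 0:
        return 0.0
    mean_dark = sum(dark) / len(dark)
    mean_light = sum(light) / len(light)
    between = (len(dark) / n) * (len(light) / n) * (mean_dark - mean_light) ** 2
    return between / total_var

def needs_refinement(pixels, threshold, minimum_ratio=0.8):
    """Test 25 as sketched here: refine when the ratio is below the minimum."""
    return separability_ratio(pixels, threshold) < minimum_ratio

def refine_threshold(text_area_pixels):
    """Recalculate the decision threshold from text-area pixels only."""
    # Skip the smallest gray level so both classes are non-empty
    candidates = sorted(set(text_area_pixels))[1:]
    if not candidates:
        return None
    return max(candidates, key=lambda t: separability_ratio(text_area_pixels, t))
```

Passing only the gray levels that fall inside the identified text areas to `refine_threshold` mirrors the restriction of binarization refinement 26 to those areas.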

After binarization refinement 26, or if the test 25 determines the image is of a sufficient quality, the method 20 proceeds to performing character recognition 27 on text in at least one text area to provide output character data. Every image piece containing either a line of text or a word is input, one by one, into a dedicated Optical Character Recogniser and converted into the corresponding characters.

The method 20 then proceeds to classifying 28A each of the text areas into a respective field type by using a set of rules. These rules use keyword matching to perform field classification. For example, when a keyword such as "Address", "street", "st." or "Avenue" occurs, it is possible that this line or area is the text area where address information is located. To identify a telephone number field, the word "Tel" or "Telephone" or a "+" is identified, and the phone number should follow directly thereafter. For email addresses, an "@", "email", ".com", ".uk" and the like are identified. To identify a person's name, their title is identified (e.g. Dr., Mr., Mrs., Miss, Ms.) and their name should follow directly thereafter. For a company, firm or business name, identification keywords such as "Inc", "Pty", "Pte", "Ltd", "Limited" and "Partners" are used.

The method then provides for storing 28B the output character data in a location indicative of the field type, the output character data being stored in static memory 16. The output character data is typically stored in an address book field, and the method may populate, with the output character data, address book fields such as: a telephone number field; a person's name field; a street address field; or a company name field.
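The keyword rules and the storing step can be illustrated with a short Python sketch. The keywords are taken from the description above, but the rule order, the regular expressions, the field labels and the dictionary standing in for address book fields are all assumptions made for illustration.

```python
import re

# Keyword rules paraphrased from the description; the order sets
# precedence when several rules match (an assumption).
RULES = [
    ("email", re.compile(r"@|\bemail\b|\.com\b|\.uk\b", re.IGNORECASE)),
    ("telephone", re.compile(r"\btel\b|\btelephone\b|\+", re.IGNORECASE)),
    ("address", re.compile(r"\baddress\b|\bstreet\b|\bst\.|\bavenue\b", re.IGNORECASE)),
    ("company", re.compile(r"\b(inc|pty|pte|ltd|limited|partners)\b", re.IGNORECASE)),
    ("name", re.compile(r"\b(dr|mr|mrs|miss|ms)\b", re.IGNORECASE)),
]

def classify_field(text):
    """Return the field type of the first rule whose keyword occurs in text."""
    for field_type, pattern in RULES:
        if pattern.search(text):
            return field_type
    return "unclassified"

def store_fields(recognised_lines):
    """Store each recognised line under a key indicating its field type."""
    address_book = {}
    for line in recognised_lines:
        address_book.setdefault(classify_field(line), []).append(line)
    return address_book
```

For example, `store_fields(["Dr. Jane Smith", "Tel: +1 555 0100"])` files the first string under `"name"` and the second under `"telephone"`; both sample strings are invented.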

The method 20 then terminates at an end step 29 and the user may then actuate the keypad again to obtain more output character data from another business card.

Referring to Fig. 3 there is illustrated a more detailed description of the obtaining an image step 22 process. The process effects an image capture by camera 17, and the captured image is stored in RAM 4. A test 35 then determines whether the resolution of the image stored in RAM 4 is above a threshold value. Usually, Dots-Per-Inch (DPI) is used to represent an image's resolution and the threshold value is suitably set to a 200 DPI resolution. If the test determines that the resolution of the image is above the threshold value then a test 37 is conducted; otherwise resolution enhancement 36 is effected and an enhanced image is stored in RAM 4. The enhancement can be performed by image interpolation methods. Hence, if the original image is enlarged, every pixel on the original image will be mapped to several pixels on the enlarged image. The image values of those pixels mapped in the enlarged image will be calculated from the neighbours of the pixel in the original image.

Image interpolation is described in "T. M. Lehmann, C. Gonner, K. Spitzer, Survey: interpolation methods in medical image processing, IEEE Transactions on Medical Imaging, Volume: 18, Issue: 11, Nov. 1999, Page(s): 1049-1075". This document is incorporated into this specification as a shorthand reference.
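As an illustration of test 35 and resolution enhancement 36, the Python sketch below enlarges a gray scale image by bilinear interpolation, one of the interpolation families covered by the cited survey. The specification does not say which interpolation method is used, so bilinear interpolation, the integer scale factor and the function names are assumptions.

```python
def needs_enhancement(dpi, threshold_dpi=200):
    """Test 35 as sketched here: enhance unless resolution is above threshold."""
    return dpi <= threshold_dpi

def upscale_bilinear(image, factor):
    """Enlarge a gray scale image (list of rows) by an integer factor."""
    height, width = len(image), len(image[0])
    enlarged = []
    for r in range(height * factor):
        # Map the output row back to source coordinates, clamped to the edge
        y = min(r / factor, height - 1)
        y0 = int(y)
        y1 = min(y0 + 1, height - 1)
        fy = y - y0
        row = []
        for c in range(width * factor):
            x = min(c / factor, width - 1)
            x0 = int(x)
            x1 = min(x0 + 1, width - 1)
            fx = x - x0
            # Blend the four neighbouring source pixels
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        enlarged.append(row)
    return enlarged
```

Each source pixel is mapped to `factor * factor` output pixels, and each output value is computed from the neighbouring source pixels, as the paragraph above describes.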

At test 37 the image or enhanced image is checked to determine if it is a black and white or a gray scale image. Only if the image or enhanced image is determined to be in a gray scale format is binarization process 28 effected on the image or enhanced image; the process 22 then ends.
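Test 37 and the conditional binarization can be sketched in Python as below. The specification does not state how the gray scale check or the binarization threshold is chosen, so the 8-bit pixel range, the fixed threshold of 128 and the function names are assumptions for illustration.

```python
def is_gray_scale(image, black=0, white=255):
    """Test 37 as sketched here: any intermediate level means gray scale."""
    return any(p not in (black, white) for row in image for p in row)

def binarize(image, threshold=128):
    """Map dark pixels to 1 (black) and light pixels to 0 (white)."""
    return [[1 if p < threshold else 0 for p in row] for row in image]

def finish_obtaining(image):
    """Binarize only when the image is gray scale; otherwise just relabel."""
    if is_gray_scale(image):
        return binarize(image)
    # Already black and white: convert 0/255 levels to 1/0 labels
    return [[1 if p == 0 else 0 for p in row] for row in image]
```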

Advantageously, the present invention provides for a convenient method and device for obtaining and storing data in address book fields by simply taking a picture of a business card to obtain an image. The invention processes the image and fields on the image are classified and the address book fields are updated with data obtained from the image such as: a telephone number field; a person's name field; a street address field; or a company name field.

The detailed description provides a preferred exemplary embodiment only, and is not intended to limit the scope, applicability, or configuration of the invention. Rather, the detailed description of the preferred exemplary embodiment provides those skilled in the art with an enabling description for implementing the preferred exemplary embodiment of the invention. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the invention as set forth in the appended claims.