


Title:
METHOD AND SYSTEM FOR FACE IN VIVO DETECTION
Document Type and Number:
WIPO Patent Application WO/2017/078627
Kind Code:
A1
Abstract:
The present invention discloses a method and a system for face in-vivo detection based on an illumination component. The method and system focus on in-vivo detection based on the illumination information of the face image rather than relying on complex three-dimensional reconstruction or on detection based on facial feature points. They can distinguish between a real face and a face image safely and, during detection, only require a user to perform a facial movement casually instead of performing different movements strictly as required at specific times, offering a more friendly user experience. As the present invention does not rely on detection based on facial feature points, several deficiencies of such detection, including lower accuracy and complex calculation, are avoided. The present invention also does not involve three-dimensional face reconstruction, hence achieving higher calculation speed and permitting real-time processing.

Inventors:
WENG BIN (SG)
Application Number:
PCT/SG2016/050543
Publication Date:
May 11, 2017
Filing Date:
November 04, 2016
Assignee:
JING KING TECH HOLDINGS PTE LTD (SG)
International Classes:
G06T7/00; G06V10/60
Foreign References:
US20110188712A12011-08-04
US8254647B12012-08-28
US8542879B12013-09-24
Other References:
CHEN W. ET AL.: "Illumination compensation and normalization for robust face recognition using discrete cosine transform in logarithm domain", IEEE TRANS. SYST. MAN CYBERN. B CYBERN., vol. 36, no. 2, April 2006 (2006-04-01), pages 458 - 466, XP055178159
ZHAO M. ET AL.: "The discrete cosine transform (DCT) plus local normalization: a novel two-stage method for de-illumination in face recognition", OPTICA APPLICATA, vol. 41, no. 4, 2011, pages 825 - 839
Attorney, Agent or Firm:
AEDIFICARE LAW CORPORATION (SG)
Claims:
CLAIMS

1. A method for face detection, comprising:

capturing a facial movement by video and processing the video to obtain a plurality of facial images from a plurality of successive video frames;

using the Lambertian model to render each facial image obtained and computing the discrete cosine transform (DCT) to obtain an illumination component of each facial image;

calculating the mean local variance for the illumination components of the facial images; and

comparing the mean local variance with a predetermined threshold (Th) to determine whether the facial image is an image of a real face.

2. The method according to claim 1, wherein each of the facial images obtained is denoted as Ii, where i is a natural number.

3. The method according to claim 2, wherein the illumination component of each face image Ii is obtained as follows: according to the Lambertian model, the face image can be denoted as:

Ii(x,y) = Ri(x,y)Li(x,y)

where Ri is the reflection component, representing the surface reflectance of the facial image; Li is the illumination component, representing the illumination and shadow of the facial image, and (x,y) represents the coordinates of the pixels in the image; log-transform the face image Ii to obtain:

fi(x,y) = vi(x,y) + ui(x,y)

where fi, vi and ui respectively represent the values of I, R and L over the log-domain, i.e. fi = log Ii, vi = log Ri, ui = log Li; compute the DCT for fi, i.e.:

Fi(s,t) = c(s)c(t) Σx=0..N-1 Σy=0..N-1 fi(x,y) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

where N is the length and width of the image, c(s) = √(1/N) for s = 0 and c(s) = √(2/N) for s = 1, 2, ..., N-1, and the high frequency coefficients of Fi(s,t) are set at 0, i.e.:

F~i(s,t) = Fi(s,t) when s < M and t < M; F~i(s,t) = 0 otherwise

where M is a parameter to be defined, which is generally set at 5; compute the inverse DCT (discrete cosine transform) for the adjusted frequency domain coefficients F~i, i.e.:

f~i(x,y) = Σs=0..N-1 Σt=0..N-1 c(s)c(t) F~i(s,t) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

Take f~i as the estimation of the illumination component, i.e.:

u~i(x,y) = f~i(x,y)

Then the illumination component of the facial image can be obtained via inverse logarithmic transformation, i.e.:

L~i(x,y) = exp(u~i(x,y))

4. The method according to claim 3, wherein M has an empirical value of 5.

5. The method according to claim 4, wherein the step of calculating the mean local variance for the illumination components of the face images obtained from T successive video frames comprises: dividing the illumination component of each face image equally into a×b image blocks with a×b pixels contained in each block, and using Bi,j to represent the j-th image block of the face image from the i-th frame; the mean local variance for T successive video frames is therefore:

Avar = (1/(abT)) Σi=1..T Σj=1..ab var(Bi,j)

where var(Bi,j) is the variance of the pixel values of the image block Bi,j.

6. The method according to claim 5, wherein in the step of comparing the mean local variance (Avar) obtained with a predetermined threshold (Th), if the Avar value is greater than or equal to Th, the face image in the video is an image of a real face.

7. The method according to claim 6, wherein if the Avar value is less than Th, the face image in the video is not an image of a real face.

8. The method according to any one of the preceding claims, wherein the threshold (Th) is set according to specific image quality, whereby a lower image resolution means a lower threshold Th.

9. A system for face detection, comprising:

an acquisition unit configured to capture a facial movement by video and to process the video to obtain a plurality of facial images from a plurality of successive video frames;

a calculation unit configured to render each facial image obtained using the Lambertian model, compute the discrete cosine transform (DCT) to obtain an illumination component of each facial image, and calculate the mean local variance for the illumination components of the facial images; and

a determination unit configured to compare the mean local variance with a predetermined threshold (Th) to determine whether the facial image is an image of a real face.

10. The system according to claim 9, wherein each of the facial images obtained is denoted as Ii, where i is a natural number.

11. The system according to claim 10, wherein the illumination component of each face image Ii is obtained as follows: according to the Lambertian model, the face image can be denoted as:

Ii(x,y) = Ri(x,y)Li(x,y)

where Ri is the reflection component, representing the surface reflectance of the facial image; Li is the illumination component, representing the illumination and shadow of the facial image, and (x,y) represents the coordinates of the pixels in the image; log-transform the face image Ii to obtain:

fi(x,y) = vi(x,y) + ui(x,y)

where fi, vi and ui respectively represent the values of I, R and L over the log-domain, i.e. fi = log Ii, vi = log Ri, ui = log Li; compute the DCT for fi, i.e.:

Fi(s,t) = c(s)c(t) Σx=0..N-1 Σy=0..N-1 fi(x,y) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

where N is the length and width of the image, c(s) = √(1/N) for s = 0 and c(s) = √(2/N) for s = 1, 2, ..., N-1, and the high frequency coefficients of Fi(s,t) are set at 0, i.e.:

F~i(s,t) = Fi(s,t) when s < M and t < M; F~i(s,t) = 0 otherwise

where M is a parameter to be defined, which is generally set at 5; compute the inverse DCT (discrete cosine transform) for the adjusted frequency domain coefficients F~i, i.e.:

f~i(x,y) = Σs=0..N-1 Σt=0..N-1 c(s)c(t) F~i(s,t) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

Take f~i as the estimation of the illumination component, i.e.:

u~i(x,y) = f~i(x,y)

Then the illumination component of the facial image can be obtained via inverse logarithmic transformation, i.e.:

L~i(x,y) = exp(u~i(x,y))

12. The system according to claim 11, wherein M has an empirical value of 5.

13. The system according to claim 12, wherein the step of calculating the mean local variance for the illumination components of the face images obtained from T successive video frames comprises:

dividing the illumination component of each face image equally into a×b image blocks with a×b pixels contained in each block, and using Bi,j to represent the j-th image block of the face image from the i-th frame; the mean local variance for T successive video frames is therefore:

Avar = (1/(abT)) Σi=1..T Σj=1..ab var(Bi,j)

where var(Bi,j) is the variance of the pixel values of the image block Bi,j.

14. The system according to claim 13, wherein in the step of comparing the mean local variance (Avar) obtained with a predetermined threshold (Th), if the Avar value is greater than or equal to Th, the face image in the video is an image of a real face.

15. The system according to claim 14, wherein if the Avar value is less than Th, the face image in the video is not an image of a real face.

16. The system according to any one of claims 9 to 15, wherein the threshold (Th) is set according to specific image quality, whereby a lower image resolution means a lower threshold Th.

Description:
METHOD AND SYSTEM FOR FACE IN VIVO DETECTION

FIELD OF THE INVENTION

The present invention relates to a method and a system for face in vivo detection and in particular, but not exclusively, for face in vivo detection based on an illumination component.

BACKGROUND

The following discussion of the background to the invention is intended to facilitate an understanding of the present invention. However, it should be appreciated that the discussion is not an acknowledgment or admission that any of the material referred to was published, known or part of the common general knowledge in any jurisdiction as at the priority date of the application.

In recent years, biometric identification technology has made great progress and some common biometric features used for identification include human face, fingerprint, and iris. Biometric information for personal identification has been widely used in the world, through which real users and fake users can be distinguished. However, the accuracy of biometric identification can be compromised by various means, such as the use of forged images of human face, fingerprint, iris, etc., during biometric verification. To address this issue, there exist biometric identification systems for in vivo detection to distinguish between biometric information submitted to such a system from a living individual and that of a non-living individual (such as forged images of a living individual), so as to prevent illegal forgers from stealing other people's biometric information for personal identification.

Biometric identification systems, in particular, face recognition identification systems have been widely used for personal identification, video surveillance and video information retrieval analysis in recent years due to the convenience of use, high acceptability and other advantages. However, from the research phase to the application phase of face recognition technology, security threats associated with this technology must be addressed to ensure reliability and security of face recognition systems. In general, forgery login to a face recognition system may adopt one or more of the following methods: face images, face video clips and three-dimensional face model replica. Among them, a face image can be obtained more easily than a face video clip or a three-dimensional face model replica, and hence is more frequently used in forgery login to a face recognition system. It is thus necessary to design a face in vivo detection system that is protected against threats from forgery face image login for the purpose of practical application of a face recognition system. The technologies of face in vivo detection and face recognition are complementary; the advancement and maturity of the former technology would affect the practical applications of the latter technology.

Existing detection methods in the field of face in vivo detection to distinguish between an image of a face (or face image) and a real human face are typically as follows: 1) Estimating three-dimensional depth information from motion. The difference between a real face and a face image is that a real human face is a three-dimensional object with depth information while a face image is a two-dimensional plane; the two can be differentiated by reconstructing a three-dimensional face from several photos taken during a head-turning action. The disadvantage of this method is that three-dimensional facial reconstruction requires many photos with accurately tracked facial features, which still needs major adjustment. In addition, the calculation involved in three-dimensional facial reconstruction is very complicated, hence real-time application is not possible on this basis. 2) Distinguishing between the two by analyzing the high-frequency component ratio of a face image and a real human face. The basic assumption of this method is that imaging of a face image, compared to imaging of a real face, loses high frequency information. This method can effectively detect low-resolution face images, but does not apply to high-resolution photos. 3) Extracting features from face images and designing classifiers to distinguish between a face image and a real face. This method does not take into account the three-dimensional geometric information of real faces, making it difficult to achieve ideal distinction accuracy. 4) Judgment based on interaction. The system randomly sends various movement instructions to users (such as turning the head, nodding, opening the mouth, blinking, etc.), the users perform the corresponding actions, and the system subsequently distinguishes between real faces and face images by analyzing these actions.
This method requires the analysis of many kinds of actions and many complex algorithms for that analysis, and its verification accuracy and efficiency are far from satisfactory. Analyzing actions like the opening and closing of the mouth and the blinking of the eyes requires precise tracking of the feature points on human faces, which is a very big challenge. In addition, the method requires users to perform a variety of movements in strict accordance with instructions, which is not user-friendly.

Therefore, there is an urgent need to address the technical problems of existing methods for distinguishing between an image of a face and a real human face. The present invention seeks to provide a method and a system to overcome at least in part some of the aforementioned disadvantages.

SUMMARY OF THE INVENTION

Throughout this document, unless otherwise indicated to the contrary, the terms "comprising", "consisting of", and the like, are to be construed as non-exhaustive, or in other words, as meaning "including, but not limited to".

In accordance with a first aspect of the present invention, there is provided a method for face detection, comprising:

capturing a facial movement by video and processing the video to obtain a plurality of facial images from a plurality of successive video frames;

using the Lambertian model to render each facial image obtained and computing the discrete cosine transform (DCT) to obtain an illumination component of each facial image;

calculating the mean local variance for the illumination components of the facial images; and

comparing the mean local variance with a predetermined threshold (Th) to determine whether the facial image is an image of a real face.

Preferably, each of the facial images obtained is denoted as Ii, where i is a natural number.

Preferably, the illumination component of each face image Ii is obtained as follows: according to the Lambertian model, the face image can be denoted as:

Ii(x,y) = Ri(x,y)Li(x,y)

where Ri is the reflection component, representing the surface reflectance of the facial image; Li is the illumination component, representing the illumination and shadow of the facial image, and (x,y) represents the coordinates of the pixels in the image; log-transform the face image Ii to obtain:

fi(x,y) = vi(x,y) + ui(x,y)

where fi, vi and ui respectively represent the values of I, R and L over the log-domain, i.e. fi = log Ii, vi = log Ri, ui = log Li; compute the DCT for fi, i.e.:

Fi(s,t) = c(s)c(t) Σx=0..N-1 Σy=0..N-1 fi(x,y) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

where N is the length and width of the image, c(s) = √(1/N) for s = 0 and c(s) = √(2/N) for s = 1, 2, ..., N-1, and the high frequency coefficients of Fi(s,t) are set at 0, i.e.:

F~i(s,t) = Fi(s,t) when s < M and t < M; F~i(s,t) = 0 otherwise

where M is a parameter to be defined, which is generally set at 5; compute the inverse DCT (discrete cosine transform) for the adjusted frequency domain coefficients F~i, i.e.:

f~i(x,y) = Σs=0..N-1 Σt=0..N-1 c(s)c(t) F~i(s,t) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

Take f~i as the estimation of the illumination component, i.e.:

u~i(x,y) = f~i(x,y)

Then the illumination component of the facial image can be obtained via inverse logarithmic transformation, i.e.:

L~i(x,y) = exp(u~i(x,y))

Preferably, M has an empirical value of 5.

Preferably, the step of calculating the mean local variance for the illumination components of the face images obtained from T successive video frames comprises: dividing the illumination component of each face image equally into a×b image blocks with a×b pixels contained in each block, and using Bi,j to represent the j-th image block of the face image from the i-th frame; the mean local variance for T successive video frames is therefore:

Avar = (1/(abT)) Σi=1..T Σj=1..ab var(Bi,j)

where var(Bi,j) is the variance of the pixel values of the image block Bi,j.

Preferably, in the step of comparing the mean local variance (Avar) obtained with a predetermined threshold (Th), if the Avar value is greater than or equal to Th, the face image in the video is an image of a real face.

Preferably, if the Avar value is less than Th, the face image in the video is not an image of a real face.

Preferably, the threshold (Th) is set according to specific image quality, whereby a lower image resolution means a lower threshold Th.

In accordance with a second aspect of the present invention, there is described a system for face detection, comprising:

an acquisition unit configured to capture a facial movement by video and to process the video to obtain a plurality of facial images from a plurality of successive video frames;

a calculation unit configured to render each facial image obtained using the Lambertian model, compute the discrete cosine transform (DCT) to obtain an illumination component of each facial image, and calculate the mean local variance for the illumination components of the facial images; and

a determination unit configured to compare the mean local variance with a predetermined threshold (Th) to determine whether the facial image is an image of a real face.

Preferably, each of the facial images obtained is denoted as Ii, where i is a natural number.

Preferably, the illumination component of each face image Ii is obtained as follows: according to the Lambertian model, the face image can be denoted as:

Ii(x,y) = Ri(x,y)Li(x,y)

where Ri is the reflection component, representing the surface reflectance of the facial image; Li is the illumination component, representing the illumination and shadow of the facial image, and (x,y) represents the coordinates of the pixels in the image; log-transform the face image Ii to obtain:

fi(x,y) = vi(x,y) + ui(x,y)

where fi, vi and ui respectively represent the values of I, R and L over the log-domain, i.e. fi = log Ii, vi = log Ri, ui = log Li; compute the DCT for fi, i.e.:

Fi(s,t) = c(s)c(t) Σx=0..N-1 Σy=0..N-1 fi(x,y) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

where N is the length and width of the image, c(s) = √(1/N) for s = 0 and c(s) = √(2/N) for s = 1, 2, ..., N-1, and the high frequency coefficients of Fi(s,t) are set at 0, i.e.:

F~i(s,t) = Fi(s,t) when s < M and t < M; F~i(s,t) = 0 otherwise

where M is a parameter to be defined, which is generally set at 5; compute the inverse DCT (discrete cosine transform) for the adjusted frequency domain coefficients F~i, i.e.:

f~i(x,y) = Σs=0..N-1 Σt=0..N-1 c(s)c(t) F~i(s,t) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

Take f~i as the estimation of the illumination component, i.e.:

u~i(x,y) = f~i(x,y)

Then the illumination component of the facial image can be obtained via inverse logarithmic transformation, i.e.:

L~i(x,y) = exp(u~i(x,y))

Preferably, M has an empirical value of 5.

Preferably, the step of calculating the mean local variance for the illumination components of the face images obtained from T successive video frames comprises: dividing the illumination component of each face image equally into a×b image blocks with a×b pixels contained in each block, and using Bi,j to represent the j-th image block of the face image from the i-th frame; the mean local variance for T successive video frames is therefore:

Avar = (1/(abT)) Σi=1..T Σj=1..ab var(Bi,j)

where var(Bi,j) is the variance of the pixel values of the image block Bi,j.

Preferably, in the step of comparing the mean local variance (Avar) obtained with a predetermined threshold (Th), if the Avar value is greater than or equal to Th, the face image in the video is an image of a real face.

Preferably, if the Avar value is less than Th, the face image in the video is not an image of a real face.

Preferably, the threshold (Th) is set according to specific image quality, whereby a lower image resolution means a lower threshold Th.

Other aspects and advantages of the invention will become apparent to those skilled in the art from a review of the ensuing description, which proceeds with reference to the following illustrative drawings of various embodiments of the invention.

DETAILED DESCRIPTION

Particular embodiments of the present invention will now be described. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the scope of the present invention. Additionally, unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art to which this invention belongs.

The use of the singular forms "a", "an", and "the" includes both singular and plural referents unless the context clearly indicates otherwise. The use of "or" means "and/or" unless stated otherwise. Furthermore, the use of the terms "including" and "having", as well as other forms of those terms, such as "includes", "included", "has", and "have", are not limiting.

In order to overcome at least in part some of the aforementioned disadvantages of existing methods for distinguishing between an image of a face (or a face image) and a real face such as complex calculations, poor adaptability, low distinction accuracy, and low efficiency, the present invention provides a method and a system for face or face in vivo detection based on an illumination component which offers high accuracy, excellent real-time performance and improved user experience.

In order to achieve the above-mentioned technical objectives, the technical solution of the present invention comprises a method for face or face in vivo detection based on an illumination component, wherein the method includes the following steps:

Step 1: capture a facial movement, such as a human head movement, in the form of a video and obtain a plurality of face images by cropping the video.

Step 2: use the Lambertian model to render each face image obtained from Step 1, then compute the discrete cosine transform (DCT) to obtain the illumination component of each face image. It would be appreciated that in other embodiments, this can be combined with surface shading algorithms, ray casting and the like.

Step 3: based on the results above in Step 2, calculate the mean local variance for the illumination component of each of the face images obtained from several successive video frames.

Step 4: compare the mean local variance with a predetermined or predefined threshold to determine whether the face image is an image of a real face.

Step 1 above requires obtaining face images by cropping the human head movement video, and the face images are denoted as Ii.

Step 2 above requires extracting the illumination component of each face image Ii, which, according to the Lambertian model, can be denoted as:

Ii(x,y) = Ri(x,y)Li(x,y)

Where Ri is the reflection component, representing the surface reflectance of the image scene; Li is the illumination component, representing the illumination and shadow of the image scene, and (x,y) represents the coordinates of the pixels in the image; log-transform the face image Ii to obtain:

fi(x,y) = vi(x,y) + ui(x,y)

Where fi, vi and ui respectively represent the values of I, R and L over the log-domain,

Compute the DCT (discrete cosine transform) for fi, i.e.:

Fi(s,t) = c(s)c(t) Σx=0..N-1 Σy=0..N-1 fi(x,y) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

Where N is the length and width of the image, c(s) = √(1/N) for s = 0 and c(s) = √(2/N) for s = 1, 2, ..., N-1, and the high frequency coefficients of Fi(s,t) are set at 0, i.e.:

F~i(s,t) = Fi(s,t) when s < M and t < M; F~i(s,t) = 0 otherwise

Where M is a parameter to be defined, which is generally set at 5,

Compute the inverse DCT (discrete cosine transform) for the adjusted frequency domain coefficients F~i, i.e.:

f~i(x,y) = Σs=0..N-1 Σt=0..N-1 c(s)c(t) F~i(s,t) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

Take f~i as the estimation of the illumination component, i.e.:

u~i(x,y) = f~i(x,y)

Then the illumination component in the image domain can be obtained via inverse logarithmic transformation, i.e.:

L~i(x,y) = exp(u~i(x,y))

Preferably, M has an empirical value of 5. It would be appreciated that M is an adjustable parameter and can take on other values apart from the value 5.

Step 3 requires calculating the mean local variance for the illumination component of the face images obtained from T successive video frames: Divide the illumination component of each face image equally into a×b image blocks with a×b pixels contained in each block, and use Bi,j to represent the j-th image block of the face image from the i-th frame; the mean local variance for T successive video frames is therefore:

Avar = (1/(abT)) Σi=1..T Σj=1..ab var(Bi,j)

Where var(Bi,j) is the variance of the pixel values of the image block Bi,j.

Step 4 requires conducting face in-vivo detection: compare the Avar value obtained from Step 3 with the predefined threshold Th; if the Avar value is greater than or equal to Th, the face image in the video is an image of a real face, otherwise it is merely a face image. The threshold Th is set according to specific image quality; a lower image resolution means a lower threshold Th. There is described hereinafter a method for face or face in vivo detection in accordance with various embodiments of the present invention.
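The four steps above can be sketched end to end. The snippet below is a minimal illustration only, not the patent's implementation: `extract_illumination` is a caller-supplied routine (a hypothetical name standing in for the Lambertian/DCT estimate of Step 2), and `threshold` corresponds to Th.

```python
import numpy as np

def face_liveness(frames, extract_illumination, threshold, block=10):
    """Steps 1-4 end to end: given cropped face images from successive
    video frames, estimate each frame's illumination component, compute
    the mean local variance over fixed-size blocks, and compare it with
    the threshold Th. `extract_illumination` is an assumed stand-in
    supplied by the caller, not fixed by the text above."""
    variances = []
    for face in frames:                        # Step 1: cropped face images
        L = extract_illumination(face)         # Step 2: illumination component
        for y in range(0, L.shape[0], block):  # Step 3: per-block variance
            for x in range(0, L.shape[1], block):
                variances.append(L[y:y + block, x:x + block].var())
    avar = float(np.mean(variances))           # mean local variance Avar
    return avar >= threshold                   # Step 4: compare with Th
```

With an identity function for `extract_illumination`, a flat, unchanging sequence yields Avar = 0 and is rejected, while a sequence with strong per-frame texture is accepted.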

In an embodiment of the present invention, a method for face in vivo detection comprises the following steps:

Step 1: in order to capture a facial movement video, such as a human head movement video, in actual practice a voice clip or on-screen text can be used to give instructions to a user, requiring the user to shake or nod his/her head at the camera. It would be appreciated that other types of facial movement, such as blinking of the eyes, raising of the eyebrows, moving of the lips and the like, can also be performed and captured.

Step 2: conduct face detection for each image frame captured by the camera. As a well-known technology, face detection identifies the face in photos (or video frames) and returns the location of the face. Obtain a face region from a video frame through cropping according to the face detection results and scale the face region down into a 100×100 image. Use Ii to represent the face image obtained from the i-th frame through cropping and scaling down, where i is a natural number.

Step 3: Acquire the illumination component of each face image Ii. According to the Lambertian model, the face image Ii can be denoted as:

Ii(x,y) = Ri(x,y)Li(x,y) (1)

Where Ri is the reflection component, mainly describing the surface reflectance of the image scene; Li is the illumination component, mainly describing the illumination and shadow of the image scene. Log-transform each face image Ii to acquire:

fi(x,y) = vi(x,y) + ui(x,y) (2)

Where fi, vi and ui respectively represent the values of I, R and L over the log-domain, i.e. fi = log Ii, vi = log Ri, ui = log Li. By this time, the values of vi and ui are unknown and the value of ui needs to be estimated. Compute the DCT (discrete cosine transform) for fi, i.e.:

Fi(s,t) = c(s)c(t) Σx=0..N-1 Σy=0..N-1 fi(x,y) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)] (3)

Where N is the length and width of the image and

c(s) = √(1/N) for s = 0; c(s) = √(2/N) for s = 1, 2, ..., N-1 (4)

and the high frequency coefficients in Fi(s,t) are set at 0, i.e.:

F~i(s,t) = Fi(s,t) when s < M and t < M; F~i(s,t) = 0 otherwise (5)

Where M is a parameter to be defined, which is generally set at 5.

Compute the inverse DCT (discrete cosine transform) for the adjusted frequency domain coefficients F~i, i.e.:

f~i(x,y) = Σs=0..N-1 Σt=0..N-1 c(s)c(t) F~i(s,t) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)] (6)

Formulas (3) to (6) actually demonstrate the low frequency filtering for the face image fi over the log-domain via the DCT (discrete cosine transform).

It would be appreciated that in other embodiments, this can be combined with surface shading algorithms, ray casting and the like. It would also be appreciated that M is an adjustable parameter and can take on other values apart from the value 5.

According to a large body of existing research, the illumination component in images varies slowly, so the low frequency component can be used to estimate the illumination component. Therefore, f~i can be used as the estimation of the illumination component, i.e.:

u~i(x,y) = f~i(x,y) (7)

Then the illumination component in the image domain can be obtained by computing the inverse logarithmic transformation (exponential transformation), i.e.:

L~i(x,y) = exp(u~i(x,y)) (8)
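Formulas (1) to (8) can be exercised with a short sketch. This is an illustrative reading of the steps, not the patent's code: it builds an orthonormal DCT-II matrix directly in NumPy, keeps only the M×M low-frequency corner (one common reading of "high frequency coefficients set at 0"), and adds 1 before the log purely for numerical safety.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C; since C is orthogonal,
    C.T implements the inverse DCT."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)                  # c(0) = sqrt(1/N)
    return C

def illumination_component(face, M=5):
    """Estimate L~i per formulas (1)-(8): log-transform (2), 2D DCT (3),
    zero coefficients outside the M x M low-frequency corner (5),
    inverse DCT (6), and exponentiate (8)."""
    n = face.shape[0]                           # square N x N face image
    f = np.log(face.astype(np.float64) + 1.0)   # f = log I (+1 for safety)
    C = dct_matrix(n)
    F = C @ f @ C.T                             # 2D DCT of f
    F_low = np.zeros_like(F)
    F_low[:M, :M] = F[:M, :M]                   # keep low frequencies only
    f_low = C.T @ F_low @ C                     # inverse DCT -> u~i estimate
    return np.exp(f_low)                        # L~i = exp(u~i)
```

A constant image passes through unchanged (only the DC coefficient is non-zero), which is a quick sanity check on the transform pair.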

Step 4: calculate the mean local variance for the illumination component of the face images obtained from T successive video frames.

Divide the illumination component of each face image equally into 10×10 image blocks with 10×10 pixels contained in each block. Use Bi,j to denote the j-th image block of the face image from the i-th frame; the mean local variance for T successive video frames is therefore:

Avar = (1/(100T)) Σi=1..T Σj=1..100 var(Bi,j)

Where var(Bi,j) is the variance of the pixel values of the image block Bi,j. In the present embodiment, T is set at 100.
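Step 4 as described can be written directly; a minimal sketch (not the patent's code), assuming the illumination maps are square NumPy arrays whose side is a multiple of the block size:

```python
import numpy as np

def mean_local_variance(illuminations, block=10):
    """Avar: split each frame's illumination map into block x block pixel
    tiles, take the pixel variance of every tile, and average over all
    tiles of all T frames (10 x 10 tiles in the present embodiment)."""
    total, count = 0.0, 0
    for L in illuminations:                 # one map per video frame
        h, w = L.shape
        for y in range(0, h, block):
            for x in range(0, w, block):
                total += L[y:y + block, x:x + block].var()
                count += 1
    return total / count
```

Uniform maps give Avar = 0; a checkerboard of 0s and 10s gives the per-tile variance 25 exactly.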

Step 5: conduct face in-vivo detection.

Each face has a unique three-dimensional geometric structure (e.g. distinct unevenness can be seen around the nose, cheekbones, mouth, and eyes); therefore, when a person shakes or nods his/her head, the regional shadow on his/her face will experience significant changes, which are properly recorded in the illumination component Li. A photo has a smooth surface, so moving the photo will not lead to significant changes in the regional shadow. As a result, the mean local variance Avar can be used to distinguish between a real face and a face image. If the Avar value is greater than the preset threshold Th, the face image in the video can be considered a real face; otherwise it is merely a face image. The threshold Th is set according to specific image type and image quality. A lower image resolution means a lower threshold Th.
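The contrast described above can be made concrete with a toy comparison. This is a sketch only: the "photo" is a flat constant map, the shifting shadows of a real face are crudely mimicked by per-frame texture, and the threshold Th = 1.0 is a hypothetical value, not one taken from the patent.

```python
import numpy as np

def mean_local_var(frames, block=10):
    """Average per-tile pixel variance over all tiles and frames."""
    vals = [f[y:y + block, x:x + block].var()
            for f in frames
            for y in range(0, f.shape[0], block)
            for x in range(0, f.shape[1], block)]
    return float(np.mean(vals))

# Flat photo: the illumination map barely changes, so Avar stays near 0.
photo = [np.full((100, 100), 128.0) for _ in range(10)]

# Real face turning: regional shadows shift frame to frame; mimicked
# here (crudely) by texture in each frame's illumination map.
rng = np.random.default_rng(0)
real = [128.0 + 20.0 * rng.standard_normal((100, 100)) for _ in range(10)]

Th = 1.0  # hypothetical threshold; in practice tuned per camera/quality
assert mean_local_var(photo) < Th <= mean_local_var(real)
```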

In accordance with another aspect of the present invention, there is described a system for face detection in accordance with an embodiment of the present invention. The system comprises an acquisition unit configured to capture a facial movement by video and to process the video to obtain a plurality of facial images from a plurality of successive video frames; a calculation unit configured to render each facial image obtained using the Lambertian model, compute the discrete cosine transform (DCT) to obtain an illumination component of each facial image, and calculate the mean local variance for the illumination components of the facial images; and a determination unit configured to compare the mean local variance with a predetermined threshold (Th) to determine whether the facial image is an image of a real face.

Each of the facial images obtained is denoted as Ii, where i is a natural number.

The illumination component of each face image Ii is obtained as follows: according to the Lambertian model, the face image can be denoted as:

Ii(x,y) = Ri(x,y)Li(x,y)

where Ri is the reflection component, representing the surface reflectance of the facial image; Li is the illumination component, representing the illumination and shadow of the facial image, and (x,y) represents the coordinates of the pixels in the image; log-transform the face image Ii to obtain:

fi(x,y) = vi(x,y) + ui(x,y)

where fi, vi and ui respectively represent the values of I, R and L over the log-domain, i.e. fi = log Ii, vi = log Ri, ui = log Li; compute the DCT for fi, i.e.:

Fi(s,t) = c(s)c(t) Σx=0..N-1 Σy=0..N-1 fi(x,y) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

where N is the length and width of the image, c(s) = √(1/N) for s = 0 and c(s) = √(2/N) for s = 1, 2, ..., N-1, and the high frequency coefficients of Fi(s,t) are set at 0, i.e.:

F~i(s,t) = Fi(s,t) when s < M and t < M; F~i(s,t) = 0 otherwise

where M is a parameter to be defined, which is generally set at 5; compute the inverse DCT (discrete cosine transform) for the adjusted frequency domain coefficients F~i, i.e.:

f~i(x,y) = Σs=0..N-1 Σt=0..N-1 c(s)c(t) F~i(s,t) cos[(2x+1)sπ/(2N)] cos[(2y+1)tπ/(2N)]

Take f~i as the estimation of the illumination component, i.e.:

u~i(x,y) = f~i(x,y)

Then the illumination component of the facial image can be obtained via inverse logarithmic transformation, i.e.:

L~i(x,y) = exp(u~i(x,y))

It would be appreciated that in other embodiments, this can be combined with surface shading algorithms, ray casting and the like. It would also be appreciated that M is an adjustable parameter and can take on other values apart from the value 5.

The step of calculating the mean local variance for the illumination components of the face images obtained from T successive video frames comprises:

dividing the illumination component of each face image equally into a×b image blocks with a×b pixels contained in each block, and using Bi,j to represent the j-th image block of the face image from the i-th frame; the mean local variance for T successive video frames is therefore:

Avar = (1/(abT)) Σi=1..T Σj=1..ab var(Bi,j)

where var(Bi,j) is the variance of the pixel values of the image block Bi,j.

In the step of comparing the mean local variance (Avar) obtained with a predetermined threshold (Th), if the Avar value is greater than or equal to Th, the face image in the video is an image of a real face. If the Avar value is less than Th, the face image in the video is not an image of a real face. The threshold (Th) is set according to specific image quality, whereby a lower image resolution means a lower threshold Th.
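The three units above can be sketched as plain classes. This is a structural illustration only, with the Step 2 illumination estimate left as a caller-supplied function (an assumption, since the unit description does not prescribe an API) and face detection/cropping stubbed out.

```python
import numpy as np

class AcquisitionUnit:
    """Yields face images from successive frames (face detection and
    cropping are stubbed: frames are assumed already cropped)."""
    def acquire(self, frames):
        return [np.asarray(f, dtype=np.float64) for f in frames]

class CalculationUnit:
    """Computes each frame's illumination component and the mean local
    variance Avar over all frames."""
    def __init__(self, extract_illumination):
        self.extract = extract_illumination   # e.g. the DCT low-pass estimate
    def mean_local_variance(self, faces, block=10):
        maps = [self.extract(f) for f in faces]
        vals = [m[y:y + block, x:x + block].var()
                for m in maps
                for y in range(0, m.shape[0], block)
                for x in range(0, m.shape[1], block)]
        return float(np.mean(vals))

class DeterminationUnit:
    """Compares Avar against the predetermined threshold Th."""
    def __init__(self, threshold):
        self.threshold = threshold
    def is_real(self, avar):
        return avar >= self.threshold
```

Wiring the units in sequence (acquire, then calculate, then determine) reproduces the method's four steps.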

The technical effects of the present invention are that the detection method and system can distinguish between a real face and a face image safely, and during detection only require a user to perform a facial movement casually, such as moving his/her head casually, instead of making different movements strictly as required at specific times, offering a more friendly user experience. As the present invention does not rely on detection based on facial feature points, several deficiencies of such detection, including lower accuracy and complex calculation, are avoided. The present invention does not involve three-dimensional face reconstruction, hence achieving higher calculation speed and permitting real-time processing.

Advantageously, the present invention focuses on face in vivo detection based on the illumination information in a face image rather than relying on complex three-dimensional reconstruction or on detection based on facial feature points.

It is to be understood that the above embodiments have been provided only by way of exemplification of this invention, and that further modifications and improvements thereto, as would be apparent to persons skilled in the relevant art, are deemed to fall within the broad scope and ambit of the present invention described herein. It is further to be understood that features from one or more of the described embodiments may be combined to form further embodiments of the invention.