Title:
FACIAL EXPRESSION TO AUGMENT FACE ID AND PRESENTATION ATTACK DETECTION
Document Type and Number:
WIPO Patent Application WO/2022/147411
Kind Code:
A1
Abstract:
A system, computer readable medium, and method are configured to control access to a secure asset. A scanning system includes a lens system configured to sense light rays and a sensor board configured to capture image data. An electronic memory is configured to store a credential of a subject and a gesture key sequence. A processor is configured to acquire images over time of a biometric presentation of the subject, identify at least one of the images as corresponding to the subject, access the gesture key sequence from the electronic memory, identify gestures performed over time by the subject in the images, compare the gestures as identified to the gesture key sequence, and grant access to the secure asset to the subject based on the gestures as identified matching the gesture key sequence and the subject being identified in at least one of the images.

Inventors:
RAGUIN DANIEL HENRI (US)
MCCLURG GEORGE WILLIAM (US)
Application Number:
PCT/US2021/073075
Publication Date:
July 07, 2022
Filing Date:
December 22, 2021
Assignee:
ASSA ABLOY AB (SE)
RAGUIN DANIEL HENRI (US)
MCCLURG GEORGE WILLIAM (US)
International Classes:
G06V40/16; G06F21/32; G07C9/00
Foreign References:
US20170116490A12017-04-27
CN108108610A2018-06-01
US202063132338P2020-12-30
EP1953675A12008-08-06
US9690369B22017-06-27
US10417483B22019-09-17
US9997159B22018-06-12
US5835616A1998-11-10
US6301370B12001-10-09
US9471829B22016-10-18
US6681032B22004-01-20
US9135500B22015-09-15
Attorney, Agent or Firm:
PERDOK, Monique M. et al. (US)
Claims:
CLAIMS

What is claimed is:

1. An access control system (ACS) configured to control access to a secure asset, comprising: a scanning system, including a lens system configured to sense light rays and a sensor board configured to capture image data; an electronic memory configured to store a credential of a subject and a gesture key sequence; a processor, operatively coupled to the electronic memory and to the scanning system, configured to: acquire images over time of a biometric presentation of the subject upon the subject coming into visual range of the lens system; identify at least one of the images as corresponding to the subject; access the gesture key sequence from the electronic memory; identify gestures performed over time by the subject in the images; compare the gestures as identified to the gesture key sequence; and grant access to the secure asset to the subject based on the gestures as identified matching the gesture key sequence and the subject being identified in at least one of the images.

2. The access control system of claim 1, wherein the gestures are facial gestures and wherein the images acquired include a face of the subject.

3. The access control system of claim 2, wherein the facial gestures include mouth gestures.

4. The access control system of any one of claim 1 to claim 3, wherein the gesture key sequence comprises a plurality of gestures organized in a predetermined sequence.

5. The access control system of claim 4, wherein the gesture key sequence is associated with the subject and stored with the credential.

6. The access control system of claim 4, further comprising: a user interface operatively coupled to the processor;

wherein the gesture key sequence is generated by the access control system and presented to the subject via the user interface upon the subject coming into visual range of the lens system.

7. The access control system of any one of claim 1 to claim 6, wherein the processor is configured to identify the subject before identifying gestures performed over time by the subject.

8. The access control system of any one of claim 1 to claim 6, wherein the processor is configured to identify the subject concurrently with identifying gestures performed over time by the subject.

9. The access control system of any one of claim 1 to claim 6, wherein the processor is configured to identify the subject after identifying gestures performed over time by the subject.

10. The access control system of any one of claim 1 to claim 6, wherein the gesture key sequence includes at least one gesture performed by a body part of the subject other than the face.

11. A computer readable medium comprising instructions which, when implemented by a processor, cause the processor to perform operations comprising: acquire images over time of a biometric presentation of a subject upon the subject coming into visual range of a lens system of an access control system; identify at least one of the images as corresponding to the subject; access a gesture key sequence from an electronic memory of the access control system; identify gestures performed over time by the subject in the images; compare the gestures as identified to the gesture key sequence; and grant access to a secure asset to the subject based on the gestures as identified matching the gesture key sequence and the subject being identified in at least one of the images.

12. The computer readable medium of claim 11, wherein the gestures are facial gestures and wherein the images acquired include a face of the subject.


13. The computer readable medium of claim 12, wherein the facial gestures include mouth gestures.

14. The computer readable medium of any one of claim 11 to claim 13, wherein the gesture key sequence comprises a plurality of gestures organized in a predetermined sequence.

15. The computer readable medium of claim 14, wherein the gesture key sequence is associated with the subject and stored with a credential stored in the electronic memory.

16. The computer readable medium of claim 14, wherein the access control system includes a user interface operatively coupled to the processor, wherein the gesture key sequence is generated by the access control system and wherein the computer readable medium further comprises instructions that cause the processor to prompt the subject to provide the gesture key sequence, via the user interface, upon the subject coming into visual range of the lens system.

17. The computer readable medium of claim 16, wherein the prompt is a question.

18. The computer readable medium of claim 16, wherein the prompt is a gesture or gesture sequence to be mimicked by the subject.

19. The computer readable medium of claim 16, wherein the prompt is to induce the subject to give a coded response.

20. The computer readable medium of any one of claim 11 to claim 19, wherein the instructions cause the processor to identify the subject before identifying gestures performed over time by the subject.

21. The computer readable medium of any one of claim 11 to claim 19, wherein the instructions cause the processor to identify the subject concurrently with identifying gestures performed over time by the subject.

22. The computer readable medium of any one of claim 11 to claim 19, wherein the instructions cause the processor to identify the subject after identifying gestures performed over time by the subject.

23. The computer readable medium of any one of claim 11 to claim 19, wherein the gesture key sequence includes at least one gesture performed by a body part of the subject other than the face.

24. A processor-implemented method of controlling access to a secure asset, comprising: acquiring images over time of a biometric presentation of a subject upon the subject coming into visual range of a lens system of an access control system; identifying at least one of the images as corresponding to the subject; accessing a gesture key sequence from an electronic memory of the access control system; identifying gestures performed over time by the subject in the images; comparing the gestures as identified to the gesture key sequence; and granting access to a secure asset to the subject based on the gestures as identified matching the gesture key sequence and the subject being identified in at least one of the images.

25. The method of claim 24, wherein the gestures are facial gestures and wherein the images acquired include a face of the subject.

26. The method of claim 25, wherein the facial gestures include mouth gestures.

27. The method of any one of claim 24 to claim 26, wherein the gesture key sequence comprises a plurality of gestures organized in a predetermined sequence.

28. The method of claim 27, wherein the gesture key sequence is associated with the subject and stored with a credential stored in the electronic memory.

29. The method of claim 27, wherein the access control system includes a user interface operatively coupled to the processor, wherein the gesture key sequence is prompted by the access control system, and wherein the method further comprises prompting the subject, via the user interface, upon the subject coming into visual range of the lens system.

30. The method of any one of claim 24 to claim 29, wherein identifying the subject is before identifying gestures performed over time by the subject.

31. The method of any one of claim 24 to claim 29, wherein identifying the subject is concurrent with identifying gestures performed over time by the subject.

32. The method of any one of claim 24 to claim 29, wherein identifying the subject is after identifying gestures performed over time by the subject.

33. The method of any one of claim 24 to claim 29, wherein the gesture key sequence includes at least one gesture performed by a body part of the subject other than the face.


Description:
FACIAL EXPRESSION TO AUGMENT FACE ID AND PRESENTATION ATTACK DETECTION

PRIORITY APPLICATION

[0001] This application claims priority to U.S. Provisional Patent Application Serial Number 63/132,338, filed December 30, 2020, the disclosure of which is incorporated by reference herein in its entirety.

BACKGROUND

[0002] Biometric facial recognition and biometric facial recognition systems are used in numerous circumstances for the purposes of identifying or verifying an individual. Access control (e.g., access to a building, a room, a computer system, an online bank account) and time and attendance (e.g., tracking the time worked by factory employees) are two main classes of these deployed facial biometric use cases. Often, these applications require two-factor identification, where the face is combined with some other identity factor, such as something the individual possesses (e.g., an identification (ID) card) or knows (e.g., a personal identification number (PIN) or password), as the second factor.

SUMMARY

[0003] The following presents a simplified summary of one or more embodiments of the present disclosure in order to provide a basic understanding of such embodiments. This summary is not an extensive overview of all contemplated embodiments and is intended neither to identify key or critical elements of all embodiments nor to delineate the scope of any or all embodiments.

[0004] There are certain drawbacks to the two-factor identification protocols described above. The face presented to a biometric facial recognition system may in fact be a presentation attack (PA), such as a photograph or 3D replica corresponding to a stored facial biometric of an individual. Additionally, by requiring the individual to enter a password, type a PIN, or swipe or tap an ID card, the use of one or more hands may often be required. However, the user’s hands may not be readily available (e.g., travelers with suitcases in their hands or workers carrying material or equipment) and, at least in the case of a password or PIN, the individual may be required to touch a surface and therefore faces an increased risk of disease transmission. The present disclosure aims to address both of these drawbacks through the use of gestures as a second identification factor.

[0005] An access control system (ACS) may be or may include a face scanning system that may scan a biometric presentation of a subject, identify the presentation as a human face, and further identify that the face matches a specific entry for a subject in a database to which the apparatus has access. The ACS further utilizes a series of gestures from the subject that serve as an additional factor for identification. The gestures may be prompted by a user interface of the ACS or may be automatic or unprompted. The ACS may provide the subject with access to a secure asset in the first instance and/or may be utilized to maintain the subject's access to the secure asset after access has previously been granted.

[0006] In certain embodiments, the ACS first establishes that there is a human face being presented but does not yet perform a match. Instead, the ACS either prompts or expects the subject to perform the series of gestures associated with the gesture key. When the subject presents the gesture key and the ACS recognizes it, the ACS uses the gesture key to look up a record for the subject and thereby is able to retrieve the specific face information for that subject. At that point the ACS is able to perform a match with the subject, either one-to-one or, in the case where subjects may by chance have chosen the same gesture key as another subject, one-to-several (i.e., fewer than N, where N represents the number of all users in the database to which the ACS has access). Avoiding a full 1:N match provides for potentially improved use of computing resources, reduced latency, and improved accuracy relative to systems that rely only on facial recognition.
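For illustration only, the following Python sketch shows one way such a gesture-key-first lookup could be structured, with the gesture key narrowing a 1:N search to a 1:few candidate set. The record layout, the match_score callback, and the 0.8 threshold are illustrative assumptions rather than part of the disclosed system.

```python
# Illustrative sketch only: a gesture-key-first lookup that narrows a
# 1:N face search to a 1:few candidate set, per paragraph [0006].
# The record layout, `match_score` callback, and threshold are assumptions.
from typing import Callable, Optional

def identify_subject(face_template,
                     gesture_key: tuple,
                     records: dict,
                     match_score: Callable,
                     threshold: float = 0.8) -> Optional[str]:
    """Return the id of a matching subject, or None.

    `records` maps subject_id -> {"gesture_key": tuple, "template": object}.
    Only records sharing the presented gesture key are face-matched
    (1:few, or 1:1 when the key happens to be unique to one subject).
    """
    candidates = [(sid, rec["template"]) for sid, rec in records.items()
                  if rec["gesture_key"] == gesture_key]
    for sid, stored_template in candidates:
        if match_score(face_template, stored_template) >= threshold:
            return sid
    return None
```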

[0007] In various embodiments, the analysis of each retrieved image for face recognition and gesture recognition is not separate but is performed on each image. In such examples, each image captured in an image stream has the possibility of being analyzed both to determine if the image has a high enough quality score to be used by a face matching algorithm and to determine if the captured image is part of a gesture. The rationale for the word “possibility” is that if sufficient images of high enough quality are captured for face matching purposes, then the remaining images captured may be analyzed solely for gesture. In an example implementation, if a subject’s gesture key is blinking twice, followed by smiling, then looking up, down, right, and left, then as the subject is presenting these gestures, some images in the image stream will tend to include the subject looking straight forward with a neutral pose that may be used for face matching. The ACS can therefore begin reading images and, if the images are deemed to be those of a human face, the system can start analyzing the images both for sufficient quality for face matching and for gestures. If an image is found to have sufficient quality for face matching, the system might do a 1:N face match before a gesture key is recognized, or may compare the face image to other images previously acquired and stored to determine if the newly acquired face image is of higher quality than those stored. The ACS may be designed to store the highest quality-scoring face images, such as a predetermined number of the highest quality-scoring face images, for later face matching once the gesture key is recognized.
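As a non-authoritative sketch of the dual per-frame analysis described above, the loop below scores every frame for face-match quality while also checking it for gesture content, retaining a predetermined number of the highest quality-scoring frames for later matching. The quality_score and detect_gesture callbacks and the TOP_K value are assumptions for illustration.

```python
# Illustrative sketch: per-frame dual analysis from paragraph [0007].
# `quality_score` and `detect_gesture` are assumed upstream callbacks.
import heapq

TOP_K = 3  # assumed number of best face images retained for matching

def process_stream(frames, quality_score, detect_gesture):
    best = []      # min-heap of (quality, frame_index, frame)
    gestures = []  # gesture labels observed over time
    for i, frame in enumerate(frames):
        q = quality_score(frame)
        if len(best) < TOP_K:
            heapq.heappush(best, (q, i, frame))
        elif q > best[0][0]:
            heapq.heapreplace(best, (q, i, frame))  # evict lowest quality
        label = detect_gesture(frame)
        if label is not None:
            gestures.append(label)
    # Highest-quality frames first, plus the observed gesture stream.
    return [f for _, _, f in sorted(best, reverse=True)], gestures
```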

[0008] In various embodiments, this face gesture sequence may be predetermined and known to the subject in advance. The subject may then use this known face gesture sequence with the advantage that the subject does not need free hands to touch a keypad or keyboard, which provides greater convenience and hygiene. In the same manner as a password, the gesture sequence may be revocable and may be changed on a regular basis as part of a standard security protocol or changed upon notification of a potential or actual security breach.

[0009] In various examples, the ACS is configured to prompt the subject to perform the gesture key sequence according to a variety of methods. In one embodiment, the face gesture sequence is not known by the subject, but the subject is instructed to give a certain face gesture sequence via cues from a user interface. These cues may be a combination of one or more visual cues, e.g., text and/or images that are displayed by a user interface of the ACS, and audio cues, e.g., a speaker issuing voice commands. The cues may give the subject a gesture sequence to repeat in the order given, or a sequence that the subject must manipulate, e.g., encode, to arrive at the correct gesture key sequence. For the latter, by way of example, the ACS may show a gesture key sequence and then ask the subject to give the gesture key sequence in reverse order, or the subject may already know without prompting that the correct response is to give the gesture key in reverse of that indicated by the prompt. As a further example, the subject’s correct gesture key response may alternatively be a coded response to the prompt or a combination of a manipulation and a coded response. By way of example of a coded response, the code given to authorized subjects at a time prior to the subject’s encounter with the ACS may be that a prompted wink gesture is responded to with a smile or that the correct response to an open mouth gesture is a raised eyebrow. By way of another example, the ACS may prompt the subject via an audio or visual cue for a known pass phrase such as “Which baseball team is your favorite?” or “What is the name of your first pet?”, the answer to which the subject may whisper or simply mouth as the correct gesture key sequence. Consequently, additional levels of security may be added to or incorporated by the ACS by adding a priori known codes or manipulation responses to an ACS prompt. The ACS may then check to see if the subject did indeed present the correct gesture key requested. This recognition of a gesture key may be used by the ACS as a means of presentation attack detection, as the gestures cannot be made by a photo attack, nor can they be made by a prerecorded video playback if, for example, the apparatus changes the gesture code at each identification attempt. As such, the ACS may be implemented with both predetermined and one-time use gesture keys, where the predetermined gesture key is used, for example, to confirm a known secret input, while the one-time use gesture key is used, for example, to confirm that the subject is a human subject and not a prerecording of a subject reciting the known facial gesture key. However, both the predetermined gesture key and the one-time use gesture key may be used to identify the subject not just through the first-order recognition of the subject’s motion, but also by identifying how the specific features of the subject’s face change upon making the gesture. By way of example, in the blinking or winking of one’s eyes or the smiling of the face, certain wrinkle features present themselves that are not present in the subject’s neutral pose, which may be used in the identification process as well.
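A minimal sketch of the prompt-manipulation idea follows, deriving the expected response from a prompted sequence either by reversal or by a pre-shared substitution code. The rule names and the code table are hypothetical examples, not values defined by the disclosure.

```python
# Hypothetical sketch of the manipulated/coded responses in [0009].
CODE_TABLE = {"wink": "smile", "open_mouth": "raised_eyebrow"}  # pre-shared

def expected_response(prompted, rule="reverse"):
    """Derive the correct gesture response from a prompted sequence."""
    if rule == "reverse":          # give the prompted gestures backwards
        return list(reversed(prompted))
    if rule == "coded":            # substitute via the pre-shared table
        return [CODE_TABLE.get(g, g) for g in prompted]
    raise ValueError(f"unknown rule: {rule}")

def verify(observed, prompted, rule="reverse") -> bool:
    return list(observed) == expected_response(prompted, rule)

# e.g. a prompt of ["wink", "open_mouth"] under the "coded" rule expects
# the subject to respond with ["smile", "raised_eyebrow"].
```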

[0010] The ACS may operate in static environments in which the subject is stationary or in dynamic environments in which the subject may be in motion. For instance, the ACS may acquire the face of the subject while the subject is walking towards the location of the scanning system and the subject may give the gesture sequence while still walking. Additionally, the subject may be in motion and then stop moving to provide some or all of the gesture sequence.

[0011] For the purposes of this disclosure, facial gestures are used for illustrative purposes. However, it is to be recognized and understood that any of a variety of gestures not involving the subject's face may be used instead of or in addition to facial gestures. Thus, for instance, a gesture sequence may be comprised of or may include in part hand and/or finger gestures, arm gestures, leg gestures, foot gestures, head gestures, such as nodding or otherwise tilting or rotating the head, or whole body gestures. Such additional gestures may be effective in relatively private environments where the gestures may not typically be visible to third parties, while facial gestures may be relatively less apparent to third parties and thus effective, for example, in relatively more public areas. Accordingly, for the purposes of this disclosure, while use of the term “gesture(s)” or the like may reference a facial gesture(s) in particular, it is to be recognized and understood that references to “gesture(s)” may apply to any bodily gesture(s) by the subject. In the description of this disclosure, movement of the head, face, and/or certain facial features of a person may be described as a change in the subject’s facial gesture, expression, or pose. In all such cases, this lexicon is meant to convey the act of a subject changing their facial appearance in a manner that serves as a factor for identity authentication. Further, a sequence of facial gestures performed by the subject for identification by the apparatus of the present disclosure may be referred to as the gesture key sequence, gesture password, gesture sequence, or gesture key.

[0012] The present disclosure provides various advantages over previous image recognition-based identity systems. The use of gestures as the sole or partial basis of the identification process means the subject does not need to have their hands free. Identification modalities that involve fingerprinting or presentation of an identification badge that might be read via a host of technologies, such as RFID, magnetic stripe, or near-field communications (NFC), generally require that the badge be held in a person’s hands and presented very close to a scanner, if not touching it. Likewise, the use of a password or personal identification number (PIN) that is typed on a keypad or touchpad may require the subject’s hands to be free. In the cases where touching of an identification apparatus is required, there is also a hygiene risk that the present disclosure does not suffer from. Further, since the detection of gestures can be achieved remotely, the identification system of the present disclosure does not require the ACS to be in close proximity to the subject. For example, a subject may be identified via their face gestures as they are walking towards a locked door, and that door is unlocked or opened before the subject reaches it as a consequence of the face gesture identification made. In another example, face gesture identification is made remotely using a security camera that could be placed on a ceiling or on the side of a wall or building relatively far from the subject. A further advantage of the present disclosure is that the use of facial gestures may not be affected by ambient audio noise and therefore may not suffer the disadvantage of voice or speech recognition in high ambient audio noise environments, where background noise may degrade the ability of the system to perform an audio identification. Voice and speech recognition also suffer the risk of covert or even overt recording of a person’s audio password.

[0013] A gesture key sequence may incorporate one or more gestures. However, the use of a single gesture as the gesture key sequence may be clearly disadvantageous relative to a gesture key sequence with two or more gestures. A single gesture may provide poor authentication control as, similar to a single-character password or PIN, a single gesture may be inherently easy to guess. Furthermore, while a multi-gesture key sequence typically requires deliberate action by a subject, a single gesture may be subject to inadvertent or accidental completion simply by virtue of the subject acting naturally but without knowledge of what the single gesture is supposed to be, e.g., incidentally looking to the left while engaging with the ACS. Moreover, a single gesture may be easily spoofed, e.g., with a photograph or mask. If a user of the ACS does not require particularly strong protection for a secure asset, the user may implement examples of the ACS disclosed herein with a gesture key sequence including a single gesture and still be within the principles of the use of a gesture key sequence. However, it is explicitly contemplated that various examples of the ACS disclosed herein implement gesture key sequences which not only include multiple gestures but which explicitly require multiple gestures, and that ACS systems with multiple gestures as the gesture key sequence may be treated as distinct from ACS systems with single gestures as the gesture key sequence.

[0014] Facial gestures identified by the present disclosure may entail numerous movements performed by one’s face and may include such movements as smiling, frowning, opening/closing the mouth, blinking or squinting of eye(s), raising of eyebrow(s), smirking, wiggling of ears, etc. These gestures may be repeated with the same gesture type or may be combined with other gestures. By way of example, the gesture could be the blinking of one’s eyes in a pattern similar to Morse code. A rapid blinking of the eyes could be read as a dot and a longer closing of the eyes during blinking read as a dash. The dash-dot sequences could be actual Morse code, in that they actually spell out a letter or a word, or may simply be a blink combination that means nothing in Morse code but serves as an access sequence previously enrolled into the system by the user. As another example, the gesture key might be one made up of several different types of gestures, for example the opening and closing of the mouth twice, followed by two blinks of both eyes and then one blink of the left eye. The gestures may also be made simultaneously as part of the access code. By way of example, the subject may first smile and blink while smiling, followed by opening the mouth and blinking while their mouth is open. For gesture keys that involve a portion or all of the key being mouth movement, that portion of the gesture key may be tied to the mouthing of a phrase. By way of example, the subject may choose a certain phrase to say, such as “I like baseball”, or the ACS may prompt the subject with a question such as “What is your mother’s maiden name?” to which the subject may provide the correct response. Unlike other systems that require an audio input, the subject need only mouth that phrase or say it in a very low whisper, since the system does not use audio for identification. Mouthing or whispering the gesture key minimizes the risk of another entity recording an audio key while providing the relationship to actual words or sounds that may improve the ability of the subject to remember the gesture key.
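The Morse-style blink pattern could be reduced to a comparable token string as sketched below; the 0.4-second dot/dash cutoff is an assumed tuning value, not one given by the disclosure.

```python
# Illustrative sketch of the Morse-like blink key from [0014].
DOT_MAX_SECONDS = 0.4  # assumed cutoff between a rapid and a long blink

def blinks_to_key(blink_durations):
    """Map measured eye-closure durations to a dot/dash string; the
    pattern need not be valid Morse code, only match the enrolled key."""
    return "".join("." if d <= DOT_MAX_SECONDS else "-"
                   for d in blink_durations)

# Usage: a subject enrolled with "..-" blinks twice quickly, then long.
assert blinks_to_key([0.15, 0.2, 0.9]) == "..-"
```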

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] Some embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings.

[0016] FIG. 1 illustrates an access control system (ACS), in an example embodiment.

[0017] FIGs. 2A and 2B illustrate the ACS implemented in a physical environment, in an example embodiment.

[0018] FIG. 3 illustrates a block diagram schematic of various components of an example ACS.

[0019] FIGs. 4A-4D illustrate a gesture key sequence of eye configurations, in an example embodiment.

[0020] FIG. 5 illustrates various face images demonstrating various mouth gestures that may be part of a gesture key sequence, in example embodiments.

[0021] FIG. 6 illustrates eye motion or positioning as a gesture, in example embodiments.

[0022] FIGs. 7A and 7B illustrate the effect of light reflections on an eye for the purposes of identification of the eye positioning gestures, in an example embodiment.

[0023] FIGs. 8A and 8B are a flowchart for implementing a gesture key sequence by the ACS 102, in an example embodiment.

[0024] FIGs. 9A and 9B are a flowchart for implementing a gesture key sequence by the ACS 102, in an example embodiment.

[0025] FIGs. 10A and 10B are a flowchart for implementing a gesture key sequence by the ACS 102, in an example embodiment.

[0026] FIGs. 11A and 11B are a flowchart for implementing a gesture key sequence by the ACS 102, in an example embodiment.

DETAILED DESCRIPTION

[0027] Example methods and systems are directed to an access control system and related devices and methods. Examples merely typify possible variations. Unless explicitly stated otherwise, components and functions are optional and may be combined or subdivided, and operations may vary in sequence or be combined or subdivided. In the following description, for purposes of explanation, numerous specific details are set forth to provide a thorough understanding of example embodiments. It will be evident to one skilled in the art, however, that the present subject matter may be practiced without these specific details.

[0028] FIG. 1 illustrates an ACS 102, in an example embodiment. A scanning system 104 includes or is contained within a housing 106. Within the housing 106 is an optical biometric face scanning system that is capable of scanning a face 108 of a subject 110 presented to the ACS 102. The face 108 may be illuminated in whole or in part by an optional illumination module 112, by ambient light, or by both the illumination module 112 and ambient light. The illumination module 112 may contain one or more light sources, such as a light emitting diode (LED), along with associated optics to direct the light from the one or more light sources to a certain volume of space that may be occupied by the face 108 to be scanned, represented by emitted light ray 114a and emitted light ray 114b. The wavelength of light emitted by the light source(s) may include but is not limited to wavelengths that are generally used for face recognition, such as visible or near-infrared wavelengths. Ambient light, light from the illumination module 112, or a combination of the two is sensed as returned light rays, such as returned light ray 116a and returned light ray 116b, by lens system 118. The lens system 118 focuses the returned light rays such as 116a and 116b onto a sensor board 120. The lens system 118 and sensor board 120 may include componentry known in the art, such as electronic digital sensors, e.g., complementary metal-oxide-semiconductor (CMOS) sensors.

[0029] A user interface 122 may provide a prompt to the subject 110 to give a gesture key. The user interface 122 may include the necessary light source and optics to provide a visual prompt and/or a speaker with which an audio prompt may be produced. For the case of a subject-known gesture key, the visual cues may be as simple as the flashing of a certain color LED or may be more self-explanatory, such as the use of an organic LED (OLED) or liquid crystal (LC) pixelated array that spells out a message or graphic instructing the subject 110 to issue their known gesture key. Similarly, the audio cue may be a simple cue, such as a beep of a particular tone, or may be more self-explanatory, such as an audio phrase, e.g., “present face gesture”, in a language or languages that the subject 110 may understand or may be expected to understand. For apparatus-generated gesture keys that the subject 110 is to repeat or respond to, the user interface 122 may include a display or speaker system suitable to instruct the subject 110 regarding the gesture key, such as an LC display (LCD) or OLED pixelated display or any other suitable visual display.

[0030] In the illustrated example, a controller printed circuit board (PCB) 124 includes a controller and connection points for various components of the ACS 102, and the PCB 124 is coupled to an external computing system 126, such as a conventional computer or networked computing resource. Alternatively, the computing system 126 may in whole or in part be contained within the housing 106. Further alternatively, the computing system 126 may be a distributed computing system, with some components or resources physically located within or in proximity of the housing 106 and other components or resources accessed remotely. Moreover, the connection between the PCB 124 and the computing system 126 may be wired, wireless, or both. The wireless connection may be via any suitable wireless communication modality capable of supporting the range and data rate requirements of the ACS, such as Bluetooth or IEEE 802.11. The PCB 124 includes a processor/controller, memory, and other control components in order to communicate with and process data from the various modules that are part of the apparatus. Imagery data captured by the sensor board 120 is received by the PCB 124 and may be processed by the PCB 124, or partially processed by the PCB 124 and then communicated to the computing system 126.

[0031] It is noted that for the purposes of this disclosure the gesture key sequence is described in combination with facial recognition. As noted above, doing so provides for access control without requiring contact with physical objects by the subject 110. However, the principles disclosed with respect to the gesture key sequence apply as well to other two-factor authentication systems. Consequently, in various examples, facial recognition may be replaced by another authentication factor, such as (smart)card access, PIN entry, voice, iris, fingerprint, etc.

[0032] FIGs. 2A and 2B illustrate the ACS 102 implemented in a physical environment, in an example embodiment. In the illustrated example, the ACS 102 is a physical ACS in that the ACS 102 provides access to a secure asset 204 that is a physical space, such as a room. The ACS 102 controls a locking mechanism on a door 202, the enabling of which prevents someone in an unsecured area 206 from accessing the secure asset 204 and the disabling of which allows a subject 110 to open and pass through the door 202 into or to access the secure asset 204. It is to be recognized and understood that the door 202 and the secure asset 204 as a room beyond the door 202 are presented for illustrative purposes, that the door 202 may be any suitable mechanism for restricting access of or to a physical space, and that the secure asset 204 may be any physical space or object that may be subject to a need for security or restricted access.

[0033] Furthermore, while FIGs. 2A and 2B illustrate the ACS 102 in a physical environment, it is to be recognized and understood that the same principles may apply to an electronic or logical environment. In such an example, the secure asset 204 may be an electronic device or system, such as a computer, computer network, or the like, or an electronic file that may be stored in a memory, data storage, or the like. Consequently, for the purposes of this disclosure, the secure asset 204 is understood to be any physical, electronic, or logical item or collection of items that may have limited and controllable access.

[0034] FIG. 3 illustrates a block diagram schematic of various components of an example ACS 102. In general, the ACS 102 can include one or more of an electronic memory 302, a processor 304, one or more antennas 306, a communication module 308, a network interface device 310, a user interface 122, and a power source 312.

[0035] The electronic memory 302 can be used in connection with the execution of application programming or instructions by the processor 304, and for the temporary or long-term storage of program memory 318 and/or credentials 316 or other authorization data, such as credential data, credential authorization data, or access control data or instructions. For example, the electronic memory 302 can contain executable instructions 314 that are used by the processor 304 to run other components of the ACS 102 and/or to make access determinations based on credentials 316. The electronic memory 302 can comprise a computer readable medium that can be any medium that can contain, store, communicate, or transport data, program code, or instructions for use by or in connection with the processor 304 specifically or the ACS 102 generally. The computer readable medium can be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device. More specific examples of suitable computer readable media include, but are not limited to, an electrical connection having one or more wires or a tangible storage medium such as a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), Dynamic RAM (DRAM), any solid-state storage device in general, a compact disc read-only memory (CD-ROM or DVD-ROM), or another optical or magnetic storage device. Computer-readable media includes, but is not to be confused with, computer-readable storage medium, which is intended to cover all physical, non-transitory, or similar embodiments of computer-readable media.

[0036] The processor 304 can correspond to one or more computer processing devices or resources. For instance, the processor 304 can be provided as silicon, as a Field Programmable Gate Array (FPGA), an Application-Specific Integrated Circuit (ASIC), any other type of Integrated Circuit (IC) chip, a collection of IC chips, or the like. As a more specific example, the processor 304 can be provided as a microprocessor, Central Processing Unit (CPU), or plurality of microprocessors or CPUs that are configured to execute instruction sets stored in the memory 318 and/or the electronic memory 302.

[0037] The antenna 306 can correspond to one or multiple antennas and can be configured to provide for wireless communications between the ACS 102 and a credential or key device. The antenna 306 or antennas can be arranged to operate using one or more wireless communication protocols and operating frequencies including, but not limited to, the IEEE 802.15.1, Bluetooth, Bluetooth Low Energy (BLE), near field communications (NFC), ZigBee, GSM, CDMA, Wi-Fi, RF, UWB, and the like. By way of example, the antenna 306 can be RF antenna(s), and as such, may transmit/receive RF signals through free space to be received/transferred by a credential or key device having an RF transceiver. In some cases, at least one antenna 306 is an antenna designed or configured for transmitting and/or receiving UWB signals (referred to herein for simplicity as a “UWB antenna”) such that the reader can communicate using UWB techniques. The communication module 308 can be configured to communicate according to any suitable communications protocol with one or more different systems or devices either remote or local to the ACS 102.

[0038] The network interface device 310 includes hardware to facilitate communications with other devices over a communication network utilizing any one of a number of transfer protocols (e.g., frame relay, internet protocol (IP), transmission control protocol (TCP), user datagram protocol (UDP), hypertext transfer protocol (HTTP), etc.). Example communication networks can include a local area network (LAN), a wide area network (WAN), a packet data network (e.g., the Internet), mobile telephone networks (e.g., cellular networks), Plain Old Telephone Service (POTS) networks, wireless data networks (e.g., the IEEE 802.11 family of standards known as Wi-Fi, the IEEE 802.16 family of standards known as WiMax), the IEEE 802.15.4 family of standards, and peer-to-peer (P2P) networks, among others. In some examples, network interface device 310 can include an Ethernet port or other physical jack, a Wi-Fi card, a Network Interface Card (NIC), a cellular interface (e.g., antenna, electromagnetic signal filters, and associated circuitry), or the like. In some examples, network interface device 310 can include a plurality of antennas to wirelessly communicate using at least one of single-input multiple-output (SIMO), multiple-input multiple-output (MIMO), or multiple-input single-output (MISO) techniques.

[0039] The user interface 122 can include one or more input devices and/or display devices. Examples of suitable user input devices that can be included in the user interface 122 include, without limitation, one or more buttons, a keyboard, a mouse, a touch-sensitive surface, a stylus, a camera, a microphone, etc. Examples of suitable user output devices that can be included in the user interface 122 include, without limitation, one or more LEDs, an LCD panel, a display screen, a touchscreen, one or more lights, a speaker, etc. It should be appreciated that the user interface 122 can also include a combined user input and user output device, such as a touch-sensitive display or the like.

[0040] The power source 312 can be any suitable internal power source, such as a battery, capacitive power source, or similar type of charge-storage device, etc., and/or can include one or more power conversion circuits suitable to convert external power into suitable power (e.g., conversion of externally supplied AC power into DC power) for components of the ACS 102. The power source 312 can also include some implementation of surge protection circuitry to protect the components of the ACS 102 from power surges.

[0041] The ACS 102 can also include one or more interlinks 320 operable to transmit communications between the various hardware components of the reader. A system interlink 320 can be any of several types of commercially available bus structures or bus architectures.

[0042] FIGs. 4A-4D illustrate a gesture key sequence of eye configurations, in an example embodiment. Note that for the purposes of this disclosure, the eyes will be referred to as first eye 402 and second eye 404 in order to avoid confusion that may arise from the use of “right” and “left”. To clarify the schematics presented in FIGs. 4A-4D, the ACS 102 may identify the first eye 402 and the second eye 404 and the gesture being performed by the first eye 402 and second eye 404 in whole or in part on the basis of the identification of the presence of one or more of eyelash 406, sclera 408, iris 410, and pupil 412.

[0043] FIG. 4A, which may be representative of an image captured by the ACS 102 or a gesture that may be expected to be captured by the ACS 102 as part of the gesture key sequence at a first time, shows the first eye 402 and second eye 404 both open, e.g., as an eyes open gesture. The ACS 102 may identify the eyes open gesture, for example, on the basis of detecting some or all of the eyelashes 406, sclera 408, iris 410, and pupil 412 of each of the first and second eyes 402 and 404, respectively.

[0044] FIG. 4B, which may be representative of an image captured by the ACS 102 or a gesture that may be expected to be captured by the ACS 102 as part of the gesture key sequence at a second time later than the first time, shows the first eye 402 closed and the second eye 404 open, e.g., as a first wink gesture. The ACS 102 may identify the first wink gesture by, for example, identifying the eyelashes 406 and/or the absence of the sclera 408, iris 410, and pupil 412 of the first eye 402 and the presence of the sclera 408, iris 410, and pupil 412 of the second eye 404.

[0045] FIG. 4C, which may be representative of an image captured by the ACS 102 or a gesture that may be expected to be captured by the ACS 102 as part of the gesture key sequence at a third time later than the second time, shows the first eye 402 open and the second eye 404 closed, e.g., as a second wink gesture different than the first wink gesture. The ACS 102 may identify the second wink gesture by, for example, identifying the sclera 408, iris 410, and pupil 412 of the first eye 402 and the presence of the eyelashes 406 pointing downward or eyelid 414 curved upwards and/or the absence of the sclera 408, iris 410, and pupil 412 of the second eye 404.

[0046] FIG. 4D, which may be representative of an image captured by the ACS 102 or a gesture that may be expected to be captured by the ACS 102 as part of the gesture key sequence at a fourth time later than the third time, shows the first eye 402 closed and the second eye 404 closed, e.g., as an eyes closed or blink gesture. Consequently, FIGs. 4A-4D may be understood to illustrate an example gesture key sequence based on an eyes open gesture followed by a first wink gesture followed by a second wink gesture followed by an eyes closed gesture. While the gesture key sequence of FIGs. 4A-4D includes only eye gestures for the purposes of illustration, it is to be recognized and understood that other gesture key sequences may include other facial and body gestures in addition to or instead of some or all of the gestures of FIGs. 4A-4D.
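For illustration, per-frame eye states could be collapsed into the discrete gesture sequence of FIGs. 4A-4D as sketched below; this tokenization scheme is an assumption, not the disclosed algorithm.

```python
# Illustrative tokenization of the FIGs. 4A-4D gesture key sequence.
LABELS = {
    (True, True): "eyes_open",      # FIG. 4A
    (False, True): "first_wink",    # FIG. 4B
    (True, False): "second_wink",   # FIG. 4C
    (False, False): "eyes_closed",  # FIG. 4D
}

def eye_states_to_sequence(states):
    """Collapse per-frame (first_eye_open, second_eye_open) pairs into a
    gesture sequence; consecutive duplicates are merged so a gesture held
    across many frames counts once."""
    sequence = []
    for state in states:
        label = LABELS[state]
        if not sequence or sequence[-1] != label:
            sequence.append(label)
    return sequence

# The example key of FIGs. 4A-4D:
EXAMPLE_KEY = ["eyes_open", "first_wink", "second_wink", "eyes_closed"]
```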

[0047] As noted herein, detection of the gesture key sequence is in general a temporal process in that information from previous images factors into the decision that a given gesture is completed. By way of example, the blinking of the eyes takes place across several image frames, and knowing how long a subject 110 has their eyes shut relative to other blinking gestures may be part of the gesture key sequence. Determining whether a subject 110 blinks or winks can be accomplished by running the image of the face 108 through a face segmentation algorithm in which landmarks of the face are analyzed and their coordinates determined. In general, the landmarks of the face 108 include landmarks that track the shape of the eyes 402, 404 or eyelids 414. Blinking changes these landmarks and so can be used to determine the blinking or winking of a subject 110. Similarly, image processing algorithms like the Hough Transform may be applied to find circles or circular segments (see the OpenCV cv2.HoughCircles() tutorial, https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_houghcircles/py_houghcircles.html, and https://en.wikipedia.org/wiki/Circle_Hough_Transform) that correspond to the iris 410 and/or pupil 412 of the subject 110. A blink by the subject may be identified by determining when these identified circles are present, indicating that the first eye 402 is open, and when the circles are not present, indicating that the first eye 402 is closed. A method of determining how the change in position of face features from a neutral state can be used to interpret expression may be found in European Patent Application No. EP1953675, U.S. Patent No. 9,690,369, and U.S. Patent No. 10,417,483, all of which are incorporated by reference herein in their entirety. Similar algorithms, when used to examine a temporal sequence of images, can be used to identify the execution of facial gestures.
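Since the passage cites the OpenCV cv2.HoughCircles() routine, a minimal open-eye check along those lines might look like the following; the Hough parameters are illustrative assumptions that would need tuning for a real camera and eye-crop size.

```python
# Minimal sketch of the Hough-transform blink check from [0047]: an open
# eye exposes a circular iris/pupil boundary, a closed eye does not.
import cv2

def eye_is_open(eye_region_gray) -> bool:
    """`eye_region_gray` is an 8-bit grayscale crop around one eye.
    Parameter values below are assumptions requiring per-camera tuning."""
    blurred = cv2.medianBlur(eye_region_gray, 5)  # suppress sensor noise
    circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT,
                               dp=1, minDist=20, param1=50, param2=30,
                               minRadius=5, maxRadius=40)
    return circles is not None  # a detected circle suggests an open eye
```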

[0048] FIG. 5 illustrates various face images demonstrating various mouth gestures that may alternately or additionally be part of a gesture key sequence, in example embodiments. The gestures include smile 502, frown 504, neutral 506, pursed lips 508, and mouth open 510. These gestures are presented without limitation and it is to be recognized and understood that various other mouth gestures may be implemented instead of or in addition to the gestures illustrated in FIG. 5.

[0049] While gestures, e.g., those illustrated in FIGs. 4A-4D and FIG. 5, are presented as individual gestures, it is to be recognized that the individual gestures may be utilized in combination with other individual gestures as a compound gesture including multiple individual gestures presented simultaneously and identified by the ACS 102 as a single gesture in a gesture key sequence. Thus, for instance, a compound gesture may include the smile 502 and the first wink gesture of FIG. 4B. In such an example, the gesture key sequence illustrated in FIGs. 4A-4D may replace the first wink gesture with a compound gesture of a smile 502 and first wink gesture, in which the subject 110 winks their first eye 402 and smiles 502 in conjunction. In the case of such a compound gesture, the presence of one individual gesture, e.g., the smile 502, would be necessary but not sufficient to accomplish the compound gesture.

[0050] A streaming video from the lens system 118 and sensor board 120 may be analyzed temporally to detect the occurrence of a given gesture. The temporal change across image frames may be used to detect the gesture, though a gesture may also be detected by examining a single image. In the latter case, by way of example, the system may simply be looking for the mouth of the subject 110 to be closed, then opened and shut three times, and the counting of the mouth openings can be performed by finding the image frames wherein the mouth is fully or mostly open. However, such a process may leave the ACS 102 open to the risk of an attacker using pictures of a closed and then an open mouth in rapid succession to try to fool the ACS 102. Therefore, if the ACS 102 utilizes and analyzes the video stream and looks for the mouth opening across several frames and closing across several image frames, the system is more robust to presentation attacks. The software algorithm may be a handcrafted one in that specific features are looked for in the face images, but may alternatively or in combination include machine learning algorithms such as neural networks (NN) and support vector machines (SVM). These machine learning algorithms may be trained with a set of ground truth images, such as of the subject 110 performing the face gesture of his or her gesture key sequence, but may alternatively be trained with, or in combination with, images of other subjects making similar and dissimilar gestures with the appropriate ground truth.
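One way to realize the multi-frame robustness described above is to debounce the per-frame mouth state, as in the sketch below; the per-frame open/closed flag is assumed to come from an upstream detector, and the three-frame persistence threshold is an illustrative choice.

```python
# Illustrative sketch of the multi-frame mouth-opening counter from [0050].
MIN_CONSECUTIVE = 3  # assumed frames a state must persist to count

def count_mouth_openings(open_flags):
    """Count mouth openings from per-frame booleans, requiring each state
    change to persist MIN_CONSECUTIVE frames so alternating photographs
    of an open and a closed mouth do not register as real openings."""
    openings, run, state = 0, 0, False  # state: debounced mouth-open flag
    for is_open in open_flags:
        run = run + 1 if is_open != state else 0
        if run >= MIN_CONSECUTIVE:
            state = not state  # accept the transition as genuine
            run = 0
            if state:
                openings += 1
    return openings
```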

[0051] This gesture-based second factor also provides for increased robustness of presentation attack detection (PAD). Since the ACS 102 is expecting a facial gesture, if a presentation attack is a photograph, a simple two-dimensional (2D) imaging facial recognition system may be fooled, but the photograph is static and cannot emulate the movement of eyebrows, winking, squinting, etc. that the true subject 110 may provide as part of their face gesture key sequence. A mask is a more complex face presentation attack in that it is three-dimensional (3D); however, it does not have the flexibility to mimic the range of face gestures allowed by the ACS 102.

[0052] The gestures may also incorporate spoken or silent forming of words with the mouth by the subject 110, e.g., “lip reading”. The detection of lip reading is disclosed in U.S. Patent No. 9,997,159, which is incorporated herein by reference in its entirety. Consequently, the gesture key sequence may be or may include mouth gestures such as the subject 110 mouthing, whispering, or speaking words or a phrase. Lip reading may be understood as the ability of the ACS to image and recognize a certain mouth-related gesture. Lip reading need not be related to interpreting the mouth gesture as a specific word, set of words, or phrase(s). In the case of speaking a password or phrase, the captured voice recording of the spoken key may be used along with lip reading of the mouth gesture as an additional security measure. Moreover, as disclosed herein, the phrase may be prompted by a question, such as “What is your mother’s maiden name?” or the like, as known in the art.

[0053] FIG. 6 illustrates eye motion or positioning as a gesture, in example embodiments. In the illustrated examples, the first eye 402 is variously engaging in look right 602, look up 604, look straight ahead 608, look down 606, and look left 610. Note that for the purposes of this illustration the directions are presented from the standpoint of an observer of the subject 110, e.g., the ACS 102. The determination of the gesture being performed by the first eye 402 may, for example, be made on the basis of a relative position or size of the sclera 408, iris 410, and/or pupil 412.

[0054] As noted above, the eye motion or positioning gestures illustrated in FIG. 6 may be combined with other gestures, such as those of FIGs. 4A-4D, to create compound gestures. By their nature, the eye positioning gestures 602, 604, 606, 608, 610 may be combined with the eyes open and first and second wink gestures but not typically (though not impossibly) with the eyes closed or blink gestures. It is further noted that eye motion gestures may be gestures identified over time, and that, e.g., an eye sweep gesture that starts with look left 610, progresses to look straight ahead 608, and ends with look right 602 may be understood either as a single compound gesture that is captured over time or as a series of discrete gestures that constitute in whole or in part the gesture key sequence.

[0055] The movement of the eye can be determined using the previously described landmarks for the face, where landmarks of the shapes of the eyes are used to locate the eyelids, and then the Hough transform can be used to determine the location of the iris 410 or pupil 412 center relative to the position of the eyelids and particularly the corner landmarks of the eye. In another method, the centers of the pupil 412 and/or iris 410 can be determined (as is typically performed by iris recognition software, once the circular boundaries are found with a Hough Transform) and those center locations can be examined relative to the position of the specular reflection of light emanating from an optional illumination module 112.

[0056] FIGs. 7A and 7B illustrate the effect of light reflections on an eye for the purposes of identification of the eye positioning gestures, in an example embodiment. In FIG. 7A, the first eye 402 is making the look straight ahead 608 gesture. A specular reflection 702 may be roughly at or near the center of the pupil 412. However, as shown in FIG. 7B, when the subject 110 moves their first eye 402 to the look right 602 gesture, the boundaries of the iris 410 and pupil 412 shift relative to the specular reflection 702. This shift in the center of the iris 410 and pupil 412 relative to the specular reflection 702 can be used to determine eye movement and eye gestures.
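A coarse classification of the eye direction from the pupil-versus-glint offset in FIGs. 7A and 7B could be sketched as follows, assuming image coordinates where x grows toward the observer's right and y grows downward; the deadband size is an assumed tolerance, not a disclosed value.

```python
# Illustrative sketch of gaze classification from the pupil/glint offset
# shown in FIGs. 7A and 7B. Coordinates and tolerance are assumptions.
GAZE_DEADBAND_PX = 4  # assumed pixel tolerance around "straight ahead"

def gaze_direction(pupil_center, glint_center):
    """Classify coarse eye direction from the offset between the pupil
    center and the specular reflection (glint), from the observer's
    (ACS's) point of view."""
    dx = pupil_center[0] - glint_center[0]
    dy = pupil_center[1] - glint_center[1]
    if abs(dx) <= GAZE_DEADBAND_PX and abs(dy) <= GAZE_DEADBAND_PX:
        return "straight_ahead"
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"
```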

[0057] FIGs. 8A and 8B are a flowchart for implementing a gesture key sequence by the ACS 102, in an example embodiment. It is to be recognized and understood that while the ACS 102 is described with particularity, the operations described herein may not necessarily be limited only to the ACS 102 in particular and may additionally or alternatively be performed by or on any suitable system or device. While the flowchart is described with respect to the processor 304, it is to be recognized and understood that some or all of the blocks may be performed by the processor 304 itself or by any other processing, hardware, software, or firmware resources of the ACS 102. Moreover, as disclosed in subsequent figures, the ACS 102 is not constrained to operate only pursuant to the flowchart of FIGs. 8A and 8B.

[0058] At block 802, the operations start, e.g., by placing the ACS 102 in an operating condition to start acquiring images. At block 804, the ACS 102 initiates biometric scanning with the scanning system 104 using lens system 118 and sensor board 120. At block 806, the ACS 102 acquires an image of a biometric presentation from the sensor board 120, such as the face 108. The ACS 102 may acquire the image upon the subject 110 coming into visual range of the lens system 118, though block 806 may operate in a functionally continuous stream, i.e., not dependent on the presence of the subject 110 or any other object to acquire images. In an example, the acquisition of an image that is ultimately analyzed may be dependent on the subject 110 being detected, e.g., by coming into range as determined by a range finder module, such as an ultrasonic, near infrared, lidar, or other system known in the art. The ACS 102 may further or alternatively determine range by attempting to autofocus on a presumed subject and, if unable to do so, may assume that the subject or presumed subject is out of range.

[0059] At block 808, the processor 304 of the PCB 124 and/or the computing system 126 more generally segments a raw file of the image in order to locate landmarks that are characteristic of a human face and to generate a face template. The segmentation of the raw file may utilize segmentation algorithms known in the art, such as those presented by U.S. Patent Nos. 5,835,616, 6,301,370, and 9,471,829, all of which are incorporated by reference herein in their entirety, as well as algorithms available in open source code such as OpenCV and its face_detect.py script (see coding examples cited in https://realpython.com/face-recognition-with-python/). As part of such algorithms, classifiers may be employed that compare features detected in the image to expected features of a face. The more features that match and the closer the features match, the higher the classifier score.

[0060] At block 810, the processor 304 determines if the classifier score is sufficiently high relative to a predetermined threshold to positively identify a face 108 with acceptable quality. If not, at block 812, the processor 304 proceeds to determine if a timeout condition has been met. If not, the processor 304 returns to block 806 and seeks to acquire a new biometric facial presentation. If the timeout condition has been met, the processor 304 proceeds to block 832 and determines that an authentication for access to a secure asset 204 has not been met. If at block 810 the processor 304 determines that the face classifier score is sufficiently high relative to the predetermined threshold, then the processor 304 proceeds to block 814.
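The acquire-score-timeout loop of blocks 806 through 812 might be sketched as below; the acquire_image and classify_face callbacks, the threshold, and the time budget are stand-in assumptions for illustration.

```python
# Illustrative sketch of blocks 806-812: acquire until a face image
# scores above threshold or the timeout elapses.
import time

FACE_THRESHOLD = 0.7    # assumed classifier score for a usable face image
TIMEOUT_SECONDS = 10.0  # assumed face-capture time budget

def acquire_face(acquire_image, classify_face):
    deadline = time.monotonic() + TIMEOUT_SECONDS
    while time.monotonic() < deadline:
        image = acquire_image()          # sensor board capture (block 806)
        if classify_face(image) >= FACE_THRESHOLD:
            return image                 # proceed to matching (block 814)
    return None                          # timeout: not authenticated (832)
```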

[0061] At block 814, the segmented face is used to generate a face template and match against a face template that is stored in a database, e.g., in the electronic memory 302 as part of the credentials 316. Depending upon the use case of the ACS 102, the match may be a 1:N, 1:few, or 1:1 match. With a 1:N match, the ACS 102, prior to matching a face template, does not know who the subject is. Therefore, the ACS 102 may search through a number N of subject records in the credentials 316 to try to find a match with the face template generated from the face imagery presented. With a 1:few match, the ACS 102 may have a certain amount of information to narrow down the search criteria from the full N subjects that may be available for searching. The information could be forms of identification including soft biometrics such as gait, ethnicity, gender, height, etc., or may be identification via a gesture key that might not be unique to a single subject. In order to perform a 1:1 identity verification with face matching, the ACS 102 receives information regarding who the subject 110 might be (e.g., information from an identity card implementing RF, magnetic stripe information, or employing NFC), pulls up a record of the subject 110, and then compares the retrieved face template to the template generated from the face imagery presented. If the segmented face does not match the face template, the processor 304 proceeds to the not-authenticated block 832. Alternatively, if the allotted face capture time has not elapsed, the process flow may return to block 806 and a new image is acquired for biometric ID analysis.
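
The template matching of block 814 may be sketched as follows; the extract_template() function producing the probe vector is assumed to exist (e.g., a face-embedding network), and the cosine-similarity measure and 0.6 threshold are illustrative choices rather than prescribed values.

```python
# Illustrative sketch of block 814's matching. credentials maps subject_id to
# an enrolled template vector; a 1:1 check compares one record, while 1:N
# scans all enrolled subjects.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match_template(probe, credentials, subject_id=None, threshold=0.6):
    if subject_id is not None:                      # 1:1 verification
        return cosine(probe, credentials[subject_id]) >= threshold
    # 1:N identification: return the best-scoring enrolled subject, if any.
    best_id, best = None, threshold
    for sid, tmpl in credentials.items():
        score = cosine(probe, tmpl)
        if score >= best:
            best_id, best = sid, score
    return best_id
```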

[0062] If the segmented face does match the face template then the processor 304 proceeds to block 816 to retrieve the gesture key(s) or gesture key sequence(s) associated with the face template, if the processor 304 has not already retrieved them. It is noted that the electronic memory 302 may hold more than one gesture key sequence for a subject 110. The different gesture keys or gesture key sequences may be utilized for different levels of system authorization, different secure assets 204, changing authorization for the same secure asset 204, or any other purposes.

[0063] The selection between 1:N and 1:1 matching may depend on the availability of extrinsic information. In an example, the ACS 102 may be configured to receive a wireless signal, such as cellular, Bluetooth, 802.11, or the like, from a cellphone or other mobile device that a subject 110 may be expected to have on their person as they approach the ACS 102. On the basis of the wireless signal, the ACS 102 may infer the likely identity of the subject 110, and the template matching may occur on the basis of 1:1 identification to either confirm or disconfirm the actual identity of the subject 110 in relation to the expected identity. Other mechanisms for arriving at 1:1 identity may be utilized. Moreover, in the event that one or more attempts at 1:1 identity are unsuccessful, the processor 304 may then proceed to 1:N identity across the credentials 316. The generation of face templates and the performance of face template matching are disclosed in U.S. Patent Nos. 6,681,032 and 9,135,500, both of which are incorporated herein by reference in their entirety.
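
One possible, non-limiting shape for such records in the credentials 316 is sketched below; the field and asset names are illustrative assumptions.

```python
# Illustrative record layout for the credentials 316, assuming each subject
# may enroll several gesture key sequences tied to different authorization
# levels or secure assets. Field names are illustrative, not from the patent.
from dataclasses import dataclass, field

@dataclass
class SubjectRecord:
    subject_id: str
    face_template: list          # enrolled template vector
    gesture_keys: dict = field(default_factory=dict)
    # e.g., {"door_lobby": "4,4,10,2", "server_room": "2.10,13-21"}

def keys_for_asset(record: SubjectRecord, asset: str):
    return record.gesture_keys.get(asset)
```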

[0064] At block 818, the processor 304 causes the user interface 122 to prompt the subject 110 for the gesture key sequence. As disclosed herein, the prompt may be audio, visual, or any other suitable mechanism. At block 820, the ACS 102 begins acquiring images as at block 806, but directed to collecting images of the subject 110 performing the gesture key sequence as prompted at block 818. The acquiring of images in block 820 may be a continuation of the acquiring of images at block 806.

[0065] At block 822 the processor 304 analyzes image and/or sequence data for use in determining if the gesture key sequence has been met. The processing of the collected images may utilize the same or similar image recognition processes described above with respect to block 808 and throughout this flowchart. At block 824, the processor 304 determines if the gesture key sequence has been met. If not, the processor 304 proceeds to block 826 to determine if a gesture capture timeout condition has been met. If the timeout condition has not been met, then the processor 304 returns to the block 820 to continue acquiring image data. If the timeout condition has been met, then the processor 304 proceeds to the not-authenticated block 832.

[0066] Consequently, the blocks 820, 822, 824, and 826 operate as a loop capturing multiple images by the ACS 102 in order to determine if the correct gesture key sequence has been given by the subject 110. After identifying the subject and retrieving the gesture key sequence, the apparatus continues to scan and capture an image stream at block 820. As each image is captured, the processor 304 analyzes the image at block 822 and determines if gestures that might make up a full gesture key sequence are detected over time. At block 824, the system checks if the latest captured image has completed a gesture key sequence. If not, the processor 304 checks if the time to capture a gesture has expired and, if not, the apparatus loops back and captures another image to analyze at block 820.

[0067] Determination that a gesture key sequence is complete may utilize information not only of the current captured image but also information from previous images captured. For instance, to detect a blinking gesture the ACS 102 may need to have recorded that the eyes were first open then closed then open again, which depending upon the speed of the subject 110 making the gestures and the frame rate of the lens system 118 and sensor board 120 may take more or fewer image frames, but at minimum may span three image frames. In addition to recording that a gesture was made, the ACS 102 may also store the time (or number of image frames or other temporal related metric) that it took to make the gesture in the event that the cadence of providing the gesture key sequence is relevant to the completion of the gesture key sequence.
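
The open-closed-open logic for a blink may be sketched as follows, assuming a hypothetical eye_openness() measure (e.g., an eye aspect ratio) that is near 1.0 for open eyes and near 0.0 for closed eyes; the thresholds are illustrative.

```python
# Sketch of blink detection across frames: the eyes must be seen open, then
# closed, then open again, spanning at minimum three frames. The detector
# also reports how many frames the blink spanned, supporting the cadence
# checks described in the text. eye_openness is a hypothetical measure.
OPEN_T, CLOSED_T = 0.6, 0.3

def detect_blink(frames, eye_openness):
    state, start = "open", None
    for i, frame in enumerate(frames):
        openness = eye_openness(frame)
        if state == "open" and openness < CLOSED_T:
            state, start = "closed", i
        elif state == "closed" and openness > OPEN_T:
            # Eyes reopened: record the gesture and its duration in frames.
            return {"gesture": "blink", "frames": i - start + 1}
    return None  # no complete blink observed in this window
```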

[0068] The gesture key(s) stored in the electronic memory 302 can take several forms. A gesture key sequence could be or comprise video imagery of the subject 110 making the gesture sequence during an enrollment process. However, the storage of video may be an inefficient use of electronic memory 302 resources. Alternatively, still images or subimages of key points of the video that are relevant to the individual gestures may be stored in the electronic memory 302 as a gesture key sequence or as data to be used to analyze images of the subject making the gesture key sequence.

[0069] Additionally or alternatively, metadata describing the gesture(s) or gesture key sequence(s) may be stored and retrieved for the matching process. By way of example, if a gesture key sequence is blinking twice, smiling once, followed by a first eye 402 wink, then a primary area of importance for the purposes of the matching may first be the region of the subject's first eye 402 and second eye 404 together, followed by the mouth of the subject 110, and followed by the first eye 402. Consequently, image data related to other parts of the face 108 of the subject not relevant to the gesture key sequence at any given time may be disregarded and not stored in the electronic memory 302. In this manner, identification of the subject may come not just from the gesture sequence but from analyzing how the subject's specific eye or eyes or mouth look when winking or smiling, if these elements are part of the subject's gesture key. Alternatively, once a portion of a gesture is recorded, for example a blink, only the fact that a blink was detected may be stored, with no imagery data in either image or metadata form.

[0070] The gesture key sequence data stored may be gesture-specific or subject-specific, depending upon how the ACS 102 is implemented. In certain examples of gesture-specific gesture keys, the data may only include information regarding the gestures themselves and not how a subject looks while executing the gesture sequence. For example, the eyes open gesture of FIG. 4A may be assigned the numeral 1, the first wink gesture of FIG. 4B may be assigned the numeral 2, the second wink gesture of FIG. 4C may be assigned the numeral 3, and the blink gesture of FIG. 4D may be assigned the numeral 4. The smile 502 gesture may be assigned numeral 10, the frown 504 may be assigned numeral 11, the neutral 506 may be assigned numeral 12, the pursed lips 508 may be assigned numeral 13, and the mouth open 510 may be assigned numeral 14. In such an example, a gesture key sequence may be stored as simply as “4,4,10,2”, where the comma (or other suitable character) within the quotes serves as a delimiter for the distinct gestures of the gesture key sequence for the purposes of this text description. Compound gestures may consist of numeral pairs, e.g., with the first wink gesture combined with a smile 502 coded as “2.10”, where the dot (or other suitable character) within the quotes serves as a delimiter between the two or more gestures making up the compound gesture for the purposes of this text description. Similarly, timing requirements may be coded as numerals. For instance, a long time period may be coded as 21 and a short time period may be coded as 20. Consequently, the stored gesture key might be “4-21,4-20,10-20,2-20”, where the dash (or other suitable character) within the quotes serves as a delimiter between the gesture and its time code.
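
A minimal parser for this illustrative text encoding may be sketched as follows; the delimiter choices mirror the description above and are not a mandated format.

```python
# Parser sketch for the numeric encoding described above: commas delimit
# gestures, dots join compound gestures, and dashes attach a time code.
def parse_gesture_key(key: str):
    sequence = []
    for token in key.split(","):
        gesture, _, timing = token.partition("-")
        parts = [int(p) for p in gesture.split(".")]  # compound gestures
        sequence.append({"gestures": parts,
                         "timing": int(timing) if timing else None})
    return sequence

# parse_gesture_key("4-21,4-20,10-20,2-20")
# -> [{'gestures': [4], 'timing': 21}, {'gestures': [4], 'timing': 20},
#     {'gestures': [10], 'timing': 20}, {'gestures': [2], 'timing': 20}]
```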

[0071] For a subject-specific gesture key, the same or similar gesture-specific data as described earlier may be stored, but then information regarding how the subject 110 specifically makes one or more of the gestures in the gesture key sequence may also be stored in the electronic memory 302. For example, if a subject 110, when winking its first eye 402, cannot fully close the first eye 402, or in the process of closing the first eye 402 substantially changes the appearance of the second eye 404, these facts or similar facts may be noted in the credentials 316. Similarly, the credentials 316 might store how the subject 110 makes the smile 502. As described earlier, this stored information might be images of those areas of the subject's face or metadata representing how the subject 110 makes the gesture. For the example of a smile 502, frown 504, neutral 506, pursed lips 508, mouth open 510, or other mouth gesture, the relative coordinates of relevant landmarks of the mouth when the subject 110 makes a given mouth gesture can be stored based upon imagery captured during enrollment.
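
By way of a non-limiting sketch, the enrollment-time storage of mouth landmarks might look as follows, assuming a hypothetical mouth_landmarks() detector and normalizing by the inter-ocular distance so the stored coordinates are robust to capture distance; the normalization is an assumption for illustration.

```python
# Enrollment sketch: store relative mouth landmark coordinates for a given
# mouth gesture. mouth_landmarks is a hypothetical detector returning (x, y)
# points; eye_distance is the measured inter-ocular distance in pixels.
def enroll_mouth_gesture(image, mouth_landmarks, eye_distance):
    pts = mouth_landmarks(image)
    # Normalize so the stored template does not depend on capture distance.
    return [(x / eye_distance, y / eye_distance) for (x, y) in pts]
```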

[0072] The metadata for subject-specific gestures may extend beyond the immediate area of the face 108 that is being moved. The movement of the face 108 from a neutral pose may tend to expose new features in the face 108, particularly the more aged the subject 110 is. By way of example, a smile 502 can expose creases and/or dimples at the corner of the mouth as well as expose creases where the skin around the mouth meets the skin of the cheeks, in addition to exposing any dimples in that area, particularly in the cheeks. Again, the relative coordinates (e.g., Euclidean coordinates and angle of inclination of the feature) of landmarks representing these creases and dimples may be stored in the credentials 316 and may be factored in for identifying the subject 110 or various specific gestures.

[0073] If the gesture key sequence is complete at block 824, the processor 304 proceeds to block 828 to determine if the gesture key sequence received is correct by matching the received key to a recorded key from the electronic memory 302. If the correct gesture key was not received, the processor 304 proceeds to block 830 to determine if the gesture capture timeout period has lapsed. If the timeout has lapsed, then the processor 304 proceeds to the not-authenticated block 832. If the timeout period has not lapsed, then the processor 304 may return to block 818 and prompt the subject for the gesture sequence, or may alternatively proceed to block 820 and continue acquiring images in the event that another prompt for the gesture key sequence is not desired for any reason.

[0074] However, if at block 828 the processor 304 identifies a match between the identified gesture sequence, or a subset of the identified gesture sequence, and the required gesture key sequence, then the subject 110 is deemed to have been authenticated and the processor 304 proceeds to block 834. It is noted that a subset of the identified gesture sequence may be utilized because a subject may make unintentional gestures before intending to start the gesture key sequence, e.g., an inadvertent blink while walking up to the ACS 102. Consequently, certain gestures in a gesture sequence that come immediately before or after a subset that corresponds to the gesture key sequence may be disregarded. Upon reaching block 834, the ACS 102 then grants access to the secure asset 204 to the subject 110. Alternatively, the authentication may be only part of a wider authentication needed to access the secure asset 204, and the authentication may be output to another system for use in authenticating the subject 110.
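
The subset matching may be sketched as a contiguous-window comparison over the observed gesture stream, as follows; gesture codes follow the illustrative numeric encoding introduced above.

```python
# Sketch of block 828's subset matching: a contiguous window of the observed
# gesture stream is compared against the enrolled key, so stray gestures
# (e.g., an inadvertent blink on approach) before or after the window are
# disregarded. Both sequences are lists of gesture codes.
def contains_key(observed, key):
    n, k = len(observed), len(key)
    return any(observed[i:i + k] == key for i in range(n - k + 1))

# contains_key([4, 4, 4, 10, 2], [4, 4, 10, 2]) -> True (leading blink ignored)
```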

[0075] In an alternative example, the gesture may serve as the basis for providing a 1:1 identification of the face 108 and subject 110. In such an example, just as a personal identification number (PIN) may be entered into a keypad before a biometric is obtained, the gesture key can be given first to obtain a subject 110 record from the credentials 316, after which the face identification is performed. While in various examples the gesture key may not be unique, by having the gesture key given first the number N may be reduced, which may conserve system resources. Consequently, in an alternative example, block 814 may be moved after the blocks related to the obtaining and identification of the gesture, including but not necessarily limited to blocks 816, 818, 820, 822, and 824. Moreover, the various blocks of the flowchart may be adapted as needed to account for the fact that the ACS 102 may not necessarily know what the "correct" gesture key sequence is for the purposes of block 828. In such an example, the completion of a gesture key sequence may be determined by inaction by the subject 110, by recognizing a particular gesture that the system understands to mean the gesture key is complete (similar to hitting the enter key for passwords or the hash key for many touchpad ID applications), or by a gesture capture timeout condition. Whatever gesture sequence has been obtained is then compared in order to identify the likely subject 110, or subjects in the event that the gesture key sequence is not unique to a single subject 110, whereupon the block 814 may be performed against the known image associated with the subject(s) 110.
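
A non-limiting sketch of this gesture-first down-selection, reusing the illustrative SubjectRecord and parse_gesture_key() helpers above, follows.

```python
# Gesture-first flow sketch: the observed gesture key narrows the N enrolled
# records to the few subjects sharing that key, after which face matching
# (block 814) runs only against those candidates.
def candidates_by_gesture(records, observed_key, asset):
    observed = parse_gesture_key(observed_key)
    matches = []
    for record in records:
        enrolled = keys_for_asset(record, asset)
        if enrolled is not None and parse_gesture_key(enrolled) == observed:
            matches.append(record)
    return matches  # possibly more than one if the key is not unique
```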

[0076] FIGs. 9A and 9B are a flowchart for implementing a gesture key sequence by the ACS 102, in an example embodiment. It is to be recognized and understood that while the ACS 102 is described with particularity, the operations described herein may not necessarily be limited only to the ACS 102 in particular and may additionally or alternatively be performed by or on any suitable system or device. While the flowchart is described with respect to the processor 304, it is to be recognized and understood that some or all of the blocks may be performed by some or all of the processor 304 itself or any other processing, hardware, software, or firmware resources of the ACS 102. Moreover, as will be disclosed in subsequent figures, the ACS 102 is not constrained only to operate pursuant to the flowchart of FIGs. 9A and 9B.

[0077] The flowchart of FIGs. 9A and 9B relates to that of FIGs. 8A and 8B. Various blocks from FIGs. 9A and 9B operate the same or effectively the same as similarly titled blocks from FIGs. 8A and 8B. Thus, in the interest of conciseness, the discussion of FIGs. 9A and 9B will be limited to the differences between the flowchart of FIGs. 8A and 8B and the flowchart of FIGs. 9A and 9B. Thus, block 902 relates to block 802; block 904 relates to block 804; block 906 relates to block 806; block 908 relates to block 808; and block 910 relates to block 810.

[0078] Flowing from block 910, there need not be a related face capture timeout block 812; the ACS 102 simply seeks to capture an image of a face 108 or the subject 110 on a more generally continual basis. If the face quality is not acceptable, the processor 304 proceeds back to block 906. If the face quality is acceptable, the processor 304 proceeds to block 912 to determine if the face 108 matches another face according to 1:1, 1:few, or 1:N matching, depending upon the information the ACS 102 has available to potentially down-select the potential faces being presented. If no match is found, the processor 304 returns to block 906. If a face 108 is matched, the process proceeds to block 914.

[0079] At block 914, the processor 304 generates or otherwise selects, from a predetermined set of gesture keys or gesture key sequences, one gesture key sequence to present to the subject 110 via the user interface 122. In various examples, the credentials 316 stored in the electronic memory 302 may include a variety of gestures obtained from the subject 110 at an earlier date, and the processor 304 may select some of the stored gestures for inclusion in a gesture key sequence. In such an example, the processor 304 may look for the resultant images to conform not only to each individual gesture in the appropriate sequence but also to replicate how the subject 110 previously performed each individual gesture, according to methods described herein and as known in the art. At block 916, the prompt for the gesture key sequence would then specify what gesture(s) to perform and in what order, rather than being a general prompt to perform a gesture key sequence. Otherwise, blocks 920, 922, 924, 926, 928, 930, 932, and 934 function as their respective counterparts in FIGs. 8A and 8B.
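
A possible sketch of such challenge generation follows; the use of the secrets module and the challenge length of four are illustrative assumptions, with unpredictable draws being desirable so that a replayed recording of a previous session does not satisfy a new challenge.

```python
# Sketch of block 914's challenge generation, sampling a fresh sequence from
# the gestures the subject enrolled. The gesture vocabulary reuses the
# numeric codes introduced earlier; the length is arbitrary for illustration.
import secrets

def generate_challenge(enrolled_gestures, length=4):
    # secrets.choice gives unpredictable draws, which matters for a challenge
    # meant to defeat replayed video of a previous session.
    return [secrets.choice(enrolled_gestures) for _ in range(length)]

# generate_challenge([1, 2, 3, 4, 10, 11]) -> e.g. [10, 2, 2, 4]
```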

[0080] FIGs. 10A and 10B are a flowchart for implementing a gesture key sequence by the ACS 102, in an example embodiment. It is to be recognized and understood that while the ACS 102 is described with particularity, the operations described herein may not necessarily be limited only to the ACS 102 in particular and may additionally or alternatively be performed by or on any suitable system or device. While the flowchart is described with respect to the processor 304, it is to be recognized and understood that some or all of the blocks may be performed by some or all of the processor 304 itself or any other processing, hardware, software, or firmware resources of the ACS 102. Moreover, as will be disclosed in subsequent figures, the ACS 102 is not constrained only to operate pursuant to the flowchart of FIGs. 10A and 10B.

[0081] The flowchart of FIGs. 10A and 10B relates to those of FIGs. 8A, 8B, 9A, and 9B. Various blocks from FIGs. 10A and 10B operate the same or effectively the same as similarly titled blocks from FIGs. 8A, 8B, 9A, and 9B. Thus, in the interest of conciseness, the discussion of FIGs. 10A and 10B will be limited to the differences between the flowcharts of FIGs. 8A, 8B, 9A, and 9B and the flowchart of FIGs. 10A and 10B. Thus, block 1002 relates to block 902; block 1004 relates to block 904; block 1006 relates to block 906; block 1008 relates to block 908; block 1010 relates to block 910; block 1012 relates to block 912; block 1016 relates to block 816; block 1020 relates to block 820; block 1022 relates to block 922; block 1024 relates to block 924; block 1028 relates to block 928; block 1034 relates to block 934; block 1030 relates to block 930; and block 1032 relates to block 932.

[0082] The flowchart of FIGs. 10A and 10B differs from the flowcharts of FIGs. 8A, 8B, 9A, and 9B in that the negative conditions of both block 1020, for completion of the gesture sequence, and block 1022, for the correct gesture key sequence being detected, flow into block 1026 to determine if the gesture capture timeout has occurred. If the gesture capture timeout has occurred, the processor 304 proceeds to block 1028 and does not authenticate the subject 110. If the gesture capture timeout has not occurred, the processor 304 returns to block 1016 and acquires images. Consequently, the operation of the flowchart of FIGs. 10A and 10B varies from that of FIGs. 8A and 8B principally in that the flowchart presents fewer conditions preventing the return to more image gathering and instead operates in a relatively less constrained loop in comparison with the flowchart of FIGs. 8A and 8B.

[0083] FIGs. 11A and 11B are a flowchart for implementing a gesture key sequence by the ACS 102, in an example embodiment. It is to be recognized and understood that while the ACS 102 is described with particularity, the operations described herein may not necessarily be limited only to the ACS 102 in particular and may additionally or alternatively be performed by or on any suitable system or device. While the flowchart is described with respect to the processor 304, it is to be recognized and understood that some or all of the blocks may be performed by some or all of the processor 304 itself or any other processing, hardware, software, or firmware resources of the ACS 102. Moreover, as will be disclosed in subsequent figures, the ACS 102 is not constrained only to operate pursuant to the flowchart of FIGs. 11A and 11B.

[0084] The flowchart of FIGs. 11A and 11B relates to those of FIGs. 8A, 8B, 9A, 9B, 10A, and 10B. The flowchart of FIGs. 11A and 11B generally incorporates a temporal element into the gesture key sequence, specifying the timing with which gestures may be required to be performed. Various blocks from FIGs. 11A and 11B operate the same or effectively the same as similarly titled blocks from FIGs. 8A, 8B, 9A, 9B, 10A, and 10B. Thus, in the interest of conciseness, the discussion of FIGs. 11A and 11B will be limited to the differences between the flowcharts of FIGs. 8A, 8B, 9A, 9B, 10A, and 10B and the flowchart of FIGs. 11A and 11B. Broadly speaking, the flowchart of FIGs. 11A and 11B implements various blocks and operations disclosed in FIGs. 8A, 8B, 9A, 9B, 10A, and 10B utilizing specific scores at various times.

[0085] Thus, block 1102 relates to block 1002. Block 1104 relates to block 1004 to initiate a biometric scan but also sets Last Match Score to equal zero (0). Block 1106 relates to block 1006. Block 1108 relates to block 1008. Block 1110 relates to block 1010, where if the face image is not of acceptable quality the processor 304 proceeds to block 1130 to determine if the face capture timeout has occurred. If the face image is of acceptable quality, the processor 304 proceeds to block 1112. Note that the appending operation of block 1112 may be a literal appending of the image to a sequence of images to be analyzed for a gesture or may be an appending of the metadata associated with the image. By way of example, instead of a temporal sequence of actual images (e.g., bitmap, PNG, or TIFF images), the sequence may be a series of sets of face landmark coordinates as described herein, where each set of face landmark coordinates corresponds to a collected image. At block 1114, the processor 304 generates a match score based on a degree of match between the face 108 as segmented in block 1108 and as obtained from the credentials 316. As discussed herein, a match score may be 1:1 or 1:few if information to down-select the subject's identity is available to the ACS 102, or may be a 1:N match. Note that while a match score is discussed with particularity for blocks 1114, 1116, and 1118, it is to be recognized and understood that a quality score may be utilized instead of or in addition to the match score based on the same principles disclosed herein. With the quality score, the image with the top quality score may be kept, or the images with the top M quality scores may be kept for matching at a later time (for instance in block 1124, where actual face matching is performed for identification).
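
The top-M retention may be sketched as follows, with quality() standing in for whatever quality-scoring function the ACS 102 employs.

```python
# Sketch of top-M retention: rather than matching every frame, keep only the
# M best-quality face images for later matching. quality is a hypothetical
# stand-in for the quality-scoring function.
import heapq

def keep_top_m(frames, quality, m=5):
    # heapq.nlargest retains the m frames with the highest quality scores.
    return heapq.nlargest(m, frames, key=quality)
```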

[0086] At block 1116, if the match score is greater than the Last Match Score, the processor 304 proceeds to block 1118. If not, the processor 304 proceeds to block 1120. At block 1118, the processor 304 sets the Last Match Score to equal the match score. At block 1120, the processor 304 analyzes the temporal image sequence generated through the repeated operation of block 1112. At block 1122, the processor 304 generates a temporal gesture key sequence based on the analysis of the temporal image sequence at block 1120. At block 1124, the processor 304 determines if the Last Match Score meets a requirement for verification of the subject 110. If the processor 304 determines that the match score does not meet the requirement, the processor 304 proceeds to block 1130. If the processor 304 determines that the match score does meet the requirement, the processor 304 proceeds to block 1126.
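
A compact sketch of the running-best logic of blocks 1114, 1116, and 1118 follows; match_score() and the 0.8 requirement are illustrative stand-ins.

```python
# Sketch of the running-best loop: each acceptable frame contributes a match
# score, and only an improvement updates Last Match Score, so the verification
# check always sees the best evidence collected so far.
def best_match_score(frames, match_score, required=0.8):
    last = 0.0
    for frame in frames:
        score = match_score(frame)
        if score > last:
            last = score          # block 1118: update Last Match Score
        if last >= required:
            return True, last     # requirement met on best frame so far
    return False, last
```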

[0087] At block 1126, the processor 304 determines if the temporal gesture key and face authenticate. If the temporal gesture key and face do authenticate, then the processor 304 proceeds to block 1128 and authenticates the subject. If either the temporal gesture key or the face does not authenticate, then the processor 304 proceeds to block 1130. At block 1130, the processor 304 determines if a face capture timeout has occurred. If not, the processor 304 returns to block 1106 and acquires another image. If the timeout condition has been met, the processor 304 proceeds to block 1132 and does not authenticate the subject 110.

EXAMPLES

[0088] Example 1 is an access control system (ACS) configured to control access to a secure asset, comprising: a scanning system, including a lens system configured to sense light rays and a sensor board configured to capture image data; an electronic memory configured to store a credential of a subject and a gesture key sequence; a processor, operatively coupled to the electronic memory and to the scanning system, configured to: acquire images over time of a biometric presentation of the subject upon the subject coming into visual range of the lens system; identify at least one of the images as corresponding to the subject; access the gesture key sequence from the electronic memory; identify gestures performed over time by the subject in the images; compare the gestures as identified to the gesture key sequence; and grant access to the secure asset to the subject based on the gestures as identified matching the gesture key sequence and the subject being identified in at least one of the images.

[0089] In Example 2, the subject matter of Example 1 includes, wherein the gestures are facial gestures and wherein the images acquired include a face of the subject.

[0090] In Example 3, the subject matter of Example 2 includes, wherein the facial gestures include mouth gestures.

[0091] In Example 4, the subject matter of Examples 1-3 includes, wherein the gesture key sequence comprises a plurality of gestures organized in a predetermined sequence.

[0092] In Example 5, the subject matter of Example 4 includes, wherein the gesture key sequence is associated with the subject and stored with the credential.

[0093] In Example 6, the subject matter of Examples 4-5 includes, a user interface operatively coupled to the processor; wherein the gesture key sequence is generated by the access control system and presented to the subject via the user interface upon the subject coming into visual range of the lens system.

[0094] In Example 7, the subject matter of Examples 1-6 includes, wherein the processor is configured to identify the subject before identifying gestures performed over time by the subject.

[0095] In Example 8, the subject matter of Examples 1-7 includes, wherein the processor is configured to identify the subject concurrently with identifying gestures performed over time by the subject.

[0096] In Example 9, the subject matter of Examples 1-8 includes, wherein the processor is configured to identify the subject after identifying gestures performed over time by the subject.

[0097] In Example 10, the subject matter of Examples 1-9 includes, wherein the gesture key sequence includes at least one gesture performed by a body part of the user other than the face.

[0098] Example 11 is a computer readable medium comprising instructions which, when implemented by a processor, cause the processor to perform operations comprising: acquire images over time of a biometric presentation of a subject upon the subject coming into visual range of a lens system of an access control system; identify at least one of the images as corresponding to the subject; access a gesture key sequence from an electronic memory of the access control system; identify gestures performed over time by the subject in the images; compare the gestures as identified to the gesture key sequence; and grant access to a secure asset to the subject based on the gestures as identified matching the gesture key sequence and the subject being identified in at least one of the images.

[0099] In Example 12, the subject matter of Example 11 includes, wherein the gestures are facial gestures and wherein the images acquired include a face of the subject.

[0100] In Example 13, the subject matter of Example 12 includes, wherein the facial gestures include mouth gestures.

[0101] In Example 14, the subject matter of Examples 11-13 includes, wherein the gesture key sequence comprises a plurality of gestures organized in a predetermined sequence.

[0102] In Example 15, the subject matter of Example 14 includes, wherein the gesture key sequence is associated with the subject and stored with a credential stored in the electronic memory.

[0103] In Example 16, the subject matter of Examples 14-15 includes, wherein the access control system includes a user interface operatively coupled to the processor, wherein the gesture key sequence is generated by the access control system and wherein the computer readable medium further comprises instructions that cause the processor to prompt the subject to provide the gesture key sequence, via the user interface, upon the subject coming into visual range of the lens system.

[0104] In Example 17, the subject matter of Example 16 includes, wherein the prompt is a question.

[0105] In Example 18, the subject matter of Example 16 includes, wherein the prompt is a gesture or gesture sequence to be mimicked by the subject.

[0106] In Example 19, the subject matter of Example 16 includes, wherein the prompt is to induce the subject to give a coded response.

[0107] In Example 20, the subject matter of Examples 11-19 includes, wherein the instructions cause the processor to identify the subject before identifying gestures performed over time by the subject.

[0108] In Example 21, the subject matter of Examples 11-20 includes, wherein the instructions cause the processor to identify the subject concurrently with identifying gestures performed over time by the subject.

[0109] In Example 22, the subject matter of Examples 11-21 includes, wherein the instructions cause the processor to identify the subject after identifying gestures performed over time by the subject.

[0110] In Example 23, the subject matter of Examples 11-22 includes, wherein the gesture key sequence includes at least one gesture performed by a body part of the user other than the face.

[0111] Example 24 is a processor-implemented method of controlling access to a secure asset, comprising: acquiring images over time of a biometric presentation of a subject upon the subject coming into visual range of a lens system of an access control system; identifying at least one of the images as corresponding to the subject; accessing a gesture key sequence from an electronic memory of the access control system; identifying gestures performed over time by the subject in the images; comparing the gestures as identified to the gesture key sequence; and granting access to a secure asset to the subject based on the gestures as identified matching the gesture key sequence and the subject being identified in at least one of the images.

[0112] In Example 25, the subject matter of Example 24 includes, wherein the gestures are facial gestures and wherein the images acquired include a face of the subject.

[0113] In Example 26, the subject matter of Example 25 includes, wherein the facial gestures include mouth gestures.

[0114] In Example 27, the subject matter of Examples 24-26 includes, wherein the gesture key sequence comprises a plurality of gestures organized in a predetermined sequence.

[0115] In Example 28, the subject matter of Example 27 includes, wherein the gesture key sequence is associated with the subject and stored with a credential stored in the electronic memory.

[0116] In Example 29, the subject matter of Examples 27-28 includes, wherein the access control system includes a user interface operatively coupled to the processor, wherein the gesture key sequence is generated by the access control system, and wherein the method further comprises prompting the subject, via the user interface, upon the subject coming into visual range of the lens system.

[0117] In Example 30, the subject matter of Examples 24-29 includes, wherein identifying the subject is before identifying gestures performed over time by the subject.

[0118] In Example 31, the subject matter of Examples 24-30 includes, wherein identifying the subject is concurrent with identifying gestures performed over time by the subject.

[0119] In Example 32, the subject matter of Examples 24-31 includes, wherein identifying the subject is after identifying gestures performed over time by the subject.

[0120] In Example 33, the subject matter of Examples 24-32 includes, wherein the gesture key sequence includes at least one gesture performed by a body part of the user other than the face.

[0121] Example 34 is at least one machine-readable medium including instructions that, when executed by processing circuitry, cause the processing circuitry to perform operations to implement any of Examples 1-33.