

Title:
PROACTIVE DEVICE AUTHENTICATION PLATFORM
Document Type and Number:
WIPO Patent Application WO/2019/156680
Kind Code:
A1
Abstract:
Embodiments of the invention can describe a method comprising receiving a plurality of data packets associated with a plurality of online events and determining that an online event is a statistical anomaly. A community group associated with the online event can be determined by a computer, which can then initiate re-authentication events for members of the community group. The method may further comprise assessing a threat level for the community group associated with the online event and comparing the threat level against a threshold value. Assessing a threat level can include determining a risk score for the online event, determining a risk status for the community group associated with the online event, and identifying a risk feature using an optimization process.

Inventors:
HARRIS THEODORE D (US)
O'CONNELL CRAIG (US)
LI YUE (US)
KOROLEVSKAYA TATIANA (US)
Application Number:
PCT/US2018/017652
Publication Date:
August 15, 2019
Filing Date:
February 09, 2018
Assignee:
VISA INT SERVICE ASS (US)
International Classes:
H04L29/06; H04L9/32
Foreign References:
US9231962B1 (2016-01-05)
US20180033006A1 (2018-02-01)
US20150371044A1 (2015-12-24)
US8875267B1 (2014-10-28)
US20080120214A1 (2008-05-22)
Attorney, Agent or Firm:
JEWIK, Patrick et al. (US)
Claims:
WHAT IS CLAIMED IS:

1. A method comprising:

a) receiving, by a computer, a plurality of data packets associated with a plurality of online events;

b) determining, by the computer, that an online event in the plurality of online events is a statistical anomaly;

c) determining, by the computer, a community group associated with the online event; and

d) initiating, by the computer, re-authentication events for members of the community group.

2. The method of claim 1, wherein b) comprises:

determining one or more statistical distributions relating to the plurality of data packets;

performing a statistical test based on a comparison of the online event to the one or more statistical distributions; and

detecting the statistical anomaly from the statistical test.

3. The method of claim 1, further comprising, after a) and before b): regenerating a plurality of community groups using the plurality of data packets and historical data for the plurality of community groups, wherein a risk status is updated for one or more community groups in the plurality of community groups.

4. The method of claim 3, further comprising, after c) and before d): assessing a threat level for the community group associated with the online event; and

determining that the threat level exceeds a threshold value.

5. The method of claim 4, wherein assessing a threat level comprises: determining a risk score for the online event; determining a risk status for the community group associated with the online event; and

identifying a risk feature of the community group using an optimization process.

6. The method of claim 5, wherein the risk score for the online event is determined using one or more supervised machine learning models.

7. The method of claim 5, wherein the optimization process comprises an ant colony optimization algorithm.

8. The method of claim 1, wherein d) comprises:

querying an account associated with each member of the community group;

determining one or more devices accessing the account;

sending an authentication challenge to the one or more devices; and restricting access to the account by the one or more devices until a user of the one or more devices delivers a correct authentication response to the computer.

9. The method of claim 8, further comprising:

receiving, in response to the re-authentication events initiated for the members of the community group, a plurality of authentication results;

determining, based on the plurality of authentication results, an accuracy level; and

adjusting a sensitivity for one or more machine learning components based on the accuracy level.

10. The method of claim 1, wherein the plurality of data packets comprise device command sequences associated with the plurality of online events.

11. A server computer comprising:

a processor; a network interface; and

a computer-readable medium comprising executable instructions in the form of code, the instructions including a method comprising:

a) receiving a plurality of data packets associated with a plurality of online events;

b) determining that an online event in the plurality of online events is a statistical anomaly;

c) determining a community group associated with the online event; and

d) initiating re-authentication events for members of the community group.

12. The server computer of claim 11, wherein b) comprises:

determining one or more statistical distributions relating to the plurality of data packets;

performing a statistical test based on a comparison of the online event to the one or more statistical distributions; and

detecting the statistical anomaly from the statistical test.

13. The server computer of claim 11, wherein the method further comprises, after a) and before b):

regenerating a plurality of community groups using the plurality of data packets and historical data for the plurality of community groups, wherein a risk status is updated for one or more community groups in the plurality of community groups.

14. The server computer of claim 13, wherein the method further comprises, after c) and before d):

assessing a threat level for the community group associated with the online event; and

comparing the threat level against a threshold value, wherein d) occurs only if the threat level exceeds the threshold value.

15. The server computer of claim 14, wherein assessing a threat level comprises:

determining a risk score for the online event;

determining a risk status for the community group associated with the online event; and

identifying a risk feature of the community group using an optimization process.

16. The server computer of claim 15, wherein the risk score for the online event is determined using one or more supervised machine learning models.

17. The server computer of claim 15, wherein the optimization process comprises an ant colony optimization algorithm.

18. The server computer of claim 11, wherein d) comprises: querying an account associated with each member of the community group;

determining one or more devices accessing the account;

sending an authentication challenge to the one or more devices; and restricting access to the account by the one or more devices until a user of the one or more devices delivers a desired authentication response to the server computer.

19. The server computer of claim 18, wherein the method further comprises:

receiving, in response to the re-authentication events initiated for the members of the community group, a plurality of authentication results;

determining, based on the plurality of authentication results, an accuracy level; and adjusting a sensitivity for one or more machine learning components based on the accuracy level.

20. The server computer of claim 11, wherein the plurality of data packets comprise device command sequences associated with the plurality of online events.

Description:
PROACTIVE DEVICE AUTHENTICATION PLATFORM

CROSS-REFERENCES TO RELATED APPLICATIONS

[0001] None.

BACKGROUND

[0002] It is common for users to perform everyday tasks remotely over networks and from interconnected devices. Users are typically associated with a plurality of personal accounts that provide the users with their digital identities during interactions. For example, users may be associated with social media accounts, online banking accounts, media streaming accounts, etc., which allow the users to identify themselves over the internet and access any resources attributed to them. In today’s technological environment, service providers and accounts issued therefrom can be accessed over the internet or “from the cloud” using generally any network-enabled device. In other systems such as payment networks, accounts can be accessed from common access devices, such as point-of-sale terminals or ATM machines.

[0003] However, the widespread use of digital identities in everyday life has greatly incentivized hackers to breach user accounts, especially at a system-wide or network level. When digital information is accessed from a central access point by several devices, hackers are drawn to inspect the access point for any vulnerability that can be exploited against multiple users and thus provide greater gain for the attacker. For example, public WiFi networks at coffee shops, public libraries, airports, and universities may be especially vulnerable to attacks which can be exploited against an initial user and his or her peers. In another example, outdated credit card machines and other point-of-sale devices may be of particular interest to an attacker who can skim payment information from several consumers on any given day. In other examples, high traffic websites can be used to install malware or Trojans into multiple devices, unbeknownst to the users.

[0004] Furthermore, interconnected device security threats are dynamic in nature and do not just manifest themselves when users in a system are authenticated. ATM machines, remote traffic sensors, APIs, and servers can be hijacked at any time, and the evidence of the hijack may only be visible by looking at how users, accounts, and devices are behaving as a whole and within the network in which they are interacting. Even when a particular attack is detected, it is of utmost importance to determine the source of vulnerability and identify and restrict any other affected accounts before the vulnerability spreads throughout the network.

[0005] Embodiments of the invention described herein address these and other problems, individually and collectively.

BRIEF SUMMARY

[0006] One embodiment of the invention describes a method comprising receiving a plurality of data packets associated with a plurality of online events and determining that an online event within the plurality is a statistical anomaly. A community group associated with the online event can be determined by a computer, which can then initiate re-authentication events for members of the community group.

[0007] The method may further comprise assessing a threat level based on a risk score for the online event, a risk status for the community group associated with the online event, and risk features identified from an optimization process. The threat level can be compared against a threshold value, wherein re-authentication events for members of the community group occur only if the threat level exceeds the threshold value.

[0008] Additional embodiments can describe identifying a risk feature using an optimization process. In the process, a graph comprising nodes connected by edges can be obtained. One or more weights for one or more edges can be updated by performing path optimizations, each of which uses a set of agents to explore the graph over cycles to reduce a cost function. Nodes can be grouped based on the weights of their connecting edges, and a group can be assigned as a risk feature (i.e., the group including a node for a risk status). Furthermore, nodes for the group assigned as a risk feature can be compared to nodes for a community group to determine a relation, and a risk status of the community group can be determined based on its relation to the risk feature.

[0009] Other embodiments may describe computers, systems, and apparatuses for implementing various methods executed in embodiments. Further details regarding embodiments of the invention can be found in the Detailed Description and the Figures described herein.

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] FIG. 1 shows an illustration of a system for proactively authenticating users according to an embodiment.

[0011] FIG. 2 shows a block diagram of a server computer for performing functions in an authentication system according to embodiments.

[0012] FIG. 3 shows a swim-lane diagram of a proactive authentication process according to embodiments.

[0013] FIG. 4 shows a diagram of a process flow at an authentication system according to an embodiment.

[0014] FIG. 5 shows a depiction of identifying new paths in a graph according to an embodiment.

[0015] FIG. 6 shows a depiction of overlapping communities according to an embodiment.

TERMS

[0016] A “user device” may refer to a computing device operable by a user. Computing devices may include any device having a processor and a computer-readable medium. Many user devices are network-enabled devices that allow for remote communications over a communications network, such as the internet. A user device can also be a mobile device that is easily worn or carried by an individual, such as a smart phone, smart watch, fitness tracker, or other wearable device. Other examples of user devices can include IOT devices, personal computers, laptops, PDAs, smart vehicles, smart televisions, or any other machine or device capable of processing, receiving, transmitting, and storing data.

[0017] An “access device” may be any suitable device for providing access to an external computer system. An access device may be in any suitable form. Some examples of access devices include point of sale (POS) devices, cellular phones, PDAs, personal computers (PCs), tablet PCs, hand-held specialized readers, set-top boxes, electronic cash registers (ECRs), automated teller machines (ATMs), virtual cash registers (VCRs), kiosks, security systems, access systems, Websites, and the like. An access device may use any suitable contact or contactless mode of operation to send or receive data from, or associated with, a portable communication device. In some embodiments, where an access device may comprise a POS terminal, any suitable POS terminal may be used and may include a reader, a processor, and a computer-readable medium. A reader may include any suitable contact or contactless mode of operation. For example, exemplary card readers can include radio frequency (RF) antennas, optical scanners, bar code readers, or magnetic stripe readers to interact with a portable communication device.

[0018] A“credential” may be any suitable information that serves as reliable evidence of worth, ownership, identity, or authority. A credential may be a string of numbers, letters, or any other suitable characters, as well as any object or document that can serve as confirmation. Examples of credentials include value credentials, identification cards, certified documents, access cards, passcodes and other login information, etc.

[0019] “Authentication” may refer to an action of providing proof of genuineness or validity. In computer networking, authentication of a user may refer to verifying a user’s digital identity. This may be achieved by evaluating the user’s credentials. In this manner, only legitimate users can gain access to shared resources or any resources attributed to a requesting user.

[0020] An “authentication result” may refer to the result of an authentication process. In an authentication process, determinations are made as to whether or not a person or other entity is genuinely who the person or entity claims to be. For example, an authentication result can be positive if a user’s identity is successfully confirmed with some degree of certainty, while an authentication result can be negative if the user cannot be correctly identified or is suspected to be misrepresenting his or her identity.

[0021] An “authentication challenge” may refer to a challenge/question posed to a user or entity during an authentication process, such as in the case of a challenge-response authentication protocol. In a challenge-response authentication protocol, an authentication challenge may involve asking a user for his or her credentials (e.g., username and password, biometric, or other security question) and an authentication response may include the credentials provided in response to the challenge. If the desired authentication response is received by the authorizing computer or entity that posed the challenge, the user may be successfully authenticated and granted access.

[0022] The term “online” may refer to a state in which a device is connected to a communications network, such as the internet. In such a manner, the device is able to receive and send data globally across connections. The term “offline” may refer to any state in which a device is not online.

[0023] An “online event” may refer to an action that occurs over a communications network. For example, online events can include login attempts, requests for services, downloads, uploads, or any other set of actions that can be performed by a device in the network. An online event can further be associated with data packets that communicate information about the online event. The data packets can further be aggregated, recorded, and/or logged when an online event takes place. For example, device command sequences and other processing instructions can be cached so that facts relating to an online event can be inspected. In one illustrative example, device commands generated during multiple failed login attempts by a user can be recorded and used to identify trends and patterns of fraudulent users.

[0024] A “re-authentication event” may refer to an online event in which a user accessing resources online is challenged to present further authentication in order to continue access. For example, users already logged into an online account may not be able to continue to interact with the online account until they provide further authentication.

[0025] “Artificial intelligence” may refer to the simulation of human intelligence by a computer or machine. The term “artificial intelligence model” or “AI model” may refer to a model, such as a statistical model, that can be used to predict outcomes in an intelligent way and as necessary for achieving specified tasks. According to some artificial intelligence methods, an AI model may be developed using a learning algorithm, in which training data is classified based on known or inferred patterns so that the AI may learn concepts for achieving a target goal. Such an artificial intelligence model is often referred to as a “machine learning model.”

[0026] “Machine learning” may refer to an artificial intelligence process in which software applications may be trained to make accurate predictions through learning. The predictions can be generated by applying input data to a predictive model formed from performing statistical analysis on aggregated data. Machine learning that involves learning patterns from a topological graph can be referred to as “graph learning.” Some non-limiting examples of types of machine learning processes can include clustering, neural nets, reinforcement learning, deep learning, etc.

[0027] A “graph” may refer to a diagram showing one or more relations between one or more variables. In graph theory, a graph may refer to a plane of distinct vertices connected by edges. The distinct vertices in a graph may be referred to as “nodes.” Each node may correspond to a unique data element representing specific information, such as categorical information for an event, profile, or entity. The nodes may be related to one another by a set of edges, E. An “edge” may be described as an unordered pair composed of two nodes as a subset of the graph G = (V, E), where G is a graph comprising a set V of vertices (nodes) connected by a set of edges E. In this manner, relations between data can be expressed as “graph structures.” A collection of data that is organized according to graph structures may be referred to as a “graph database.” For example, a graph database for a transaction network may comprise nodes representing transactions, which may be connected by edges to one or more nodes that are related to or describe the transaction, such as nodes representing information of a device, a user, a transaction type, etc. In a “weighted graph,” an edge may be associated with a numerical value, referred to as a “weight,” that may be assigned to the pairwise connection between two nodes, thus quantifying their relationship. The edge weight may be identified as a strength of connectivity between two nodes and/or may be related to a cost or distance, as it often represents a quantity that is required to move from one node to the next. In some instances, a weight may represent the probability that two nodes will ever be expressed together in a data sample, as recorded from historical data. For example, in a graph representing survey data for the population of San Francisco, a node for ‘San Francisco’ can be connected by edges to a node for ‘male’ as well as to ‘female,’ where the weights of each edge may represent the probability that, picking one person at random from the population, the person will be either male or female.
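As an informal illustration of the weighted-graph idea above (not part of the application itself), the following Python sketch builds a small adjacency map in which each edge weight is the empirical probability that two attributes co-occur in a set of records; the records and attribute names are invented for the example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical historical records: each record lists attributes observed together.
records = [
    {"San Francisco", "male"},
    {"San Francisco", "female"},
    {"San Francisco", "female"},
    {"Oakland", "male"},
]

# Count how often each attribute and each attribute pair appears.
pair_counts = defaultdict(int)
node_counts = defaultdict(int)
for rec in records:
    for node in rec:
        node_counts[node] += 1
    for a, b in combinations(sorted(rec), 2):
        pair_counts[(a, b)] += 1

# Weighted graph as an adjacency map: weight = P(neighbor observed | node observed).
graph = defaultdict(dict)
for (a, b), count in pair_counts.items():
    graph[a][b] = count / node_counts[a]
    graph[b][a] = count / node_counts[b]

print(dict(graph["San Francisco"]))  # e.g. {'male': 0.33..., 'female': 0.66...}
```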

[0028] A “community” or “community group” may refer to a group/collection of nodes in a graph that are densely connected within the group. As such, nodes in the same community may have similar characteristics from which decisions about an individual node can be similarly applied to its peers. A community may be identified from a graph using a graph learning algorithm, such as a graph learning algorithm for mapping protein complexes. Communities identified using historical data can be used to classify new data for making predictions. As one example, communities that classify types of consumers can be used to predict a user’s preferences based on his or her peers in a community. With regards to communities identified for a data set of user profile data, account profile data, or device profile data in a network, “members” of a community may refer to specific users, accounts, or devices for which profile data has been classified into the community. Communities can further overlap if members are allowed to belong to more than one community. This can be done by performing a clustering algorithm multiple times on a graph in iterations, where a node used as the clustering seed on a given iteration is removed from the graph during the subsequent iteration, as sketched in the example after the reference below. More information relating to generating overlapping communities from graph structures can be found at:

Li, Min, Chen, Jianer, Wang, Jianxin, Hu, Bin, & Chen, Gang. (2008). Modifying the DPClus Algorithm for Identifying Protein Complexes Based on New Topological Structures. BMC Bioinformatics, 9, 398. doi:10.1186/1471-2105-9-398.
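The following Python sketch illustrates, under simplified assumptions, the overlapping-community idea described above: a community is grown around a seed node and only the seed is removed before the next iteration, so its neighbors may reappear in later communities. The greedy seed selection, the weight threshold, and the toy graph are illustrative choices, not the DPClus algorithm from the cited reference.

```python
import copy

def overlapping_communities(graph, weight_threshold=0.5):
    """Greedy sketch: grow a community around the strongest seed node, then
    drop only the seed so its neighbors can join later communities, which is
    what allows communities to overlap."""
    g = copy.deepcopy(graph)
    communities = []
    while g:
        # Seed = remaining node with the largest total edge weight.
        seed = max(g, key=lambda n: sum(g[n].values()))
        community = {seed} | {
            nbr for nbr, w in g[seed].items() if w >= weight_threshold
        }
        communities.append(community)
        # Remove the seed (and its edges) before the next iteration.
        for nbr in g.pop(seed):
            g.get(nbr, {}).pop(seed, None)
    return communities

# Example on a tiny weighted graph (adjacency map of edge weights).
toy = {
    "A": {"B": 0.9, "C": 0.8},
    "B": {"A": 0.9, "C": 0.7},
    "C": {"A": 0.8, "B": 0.7, "D": 0.6},
    "D": {"C": 0.6},
}
print(overlapping_communities(toy))  # node "B" ends up in several groups
```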

[0029] A “data set” may refer to a collection of related sets of information composed of separate elements that can be manipulated as a unit by a computer. A data set may comprise known data, which may be seen as past data or “historical data.” Data that is yet to be collected may be referred to as future data or “unknown data.” When future data is received at a later point in time and recorded, it can be referred to as “new known data” or “recently known” data, and can be combined with initially known data to form a larger history. A data set can be aggregated as a collection of parts or “data packets.”

[0030] “Supervised machine learning” or simply “supervised learning” may refer to a type of machine learning process in which a learning algorithm is performed on labeled data. A supervised learning algorithm may be used to infer a function from training data that is already labeled. When given a training data set consisting of inputs and possible outputs, input vectors can be mapped to desired output values as a means of learning patterns from the data set. As a simple example, the effect of independent variables (inputs) on a dependent variable (output) can be mapped as a trend line or “regression line” that can be used to make predictions. In the more specific example of image recognition, images labeled based on their contents (e.g., pictures of cats, dogs, faces, etc.) can be mapped against their pixel data (input vectors) so that the contents of new images can be predicted based on learned pixel patterns.
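A brief numerical illustration of the regression-line example above, using NumPy and invented numbers (failed login attempts as the input and a labeled risk value as the output); the feature and the labels are hypothetical.

```python
import numpy as np

# Hypothetical labeled training data: input feature vs. known output label.
failed_logins = np.array([0, 1, 2, 3, 5, 8])                 # inputs
labeled_risk = np.array([0.1, 0.2, 0.3, 0.5, 0.7, 0.95])     # outputs

# Fit a least-squares regression line: risk ~ slope * logins + intercept.
slope, intercept = np.polyfit(failed_logins, labeled_risk, deg=1)

# Predict the output for an unseen input value.
print(slope * 4 + intercept)
```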

[0031] “Unsupervised learning” may refer to a type of machine learning process in which a learning algorithm is performed on unlabeled data. An unsupervised learning algorithm can be used to identify hidden patterns for inputs and/or groups of inputs. Oftentimes this is performed based on a degree of similarity between data elements. One method of unsupervised learning can be cluster analysis, in which the measure of similarity between data elements can be defined by a Euclidean distance or probabilistic distance. Common clustering algorithms may include hierarchical clustering, k-means clustering, Gaussian mixture models, self-organizing maps, and Hidden Markov models, amongst others. For a topological graph in which data elements are represented as nodes connected by edges assigned a weight, grouping of data can be performed based on the weights of edges connecting nodes.
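The sketch below shows one of the clustering methods named above, a bare-bones k-means over two-dimensional points using Euclidean distance; the data points, the fixed iteration count, and the random seed are illustrative simplifications.

```python
import numpy as np

def kmeans(points, k, iterations=20, seed=0):
    """Bare-bones k-means: assign each point to its nearest centroid
    (Euclidean distance), recompute the centroids, and repeat."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Distance from every point to every centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

# Two obvious clusters of 2-D points.
data = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
                 [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]])
labels, centers = kmeans(data, k=2)
print(labels, centers, sep="\n")
```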

[0032] A “statistical distribution” or “probabilistic distribution” can be an organization of data that links outcomes in a statistical experiment to their probability of occurrence. For example, a Gaussian function can be used to generate a normal distribution for recorded data, such as recorded actions of devices in a network (i.e., online events and data packets thereof), so as to map typical behavior within the network.

[0033] An “anomaly” may refer to a deviation from expected behavior. A “statistical anomaly” can refer to an occurrence that falls outside expected statistical results. For example, the expected statistical results for a group of data or type of data may be determined from a statistical distribution, statistical trend line, or other statistical pattern previously observed, and a statistical anomaly can be any obtained data that does not adhere to the trends and patterns previously observed for the group. As one example, for a normal distribution, in which a data set is mostly aggregated around a mean value in a symmetric manner (i.e., normally distributed), an anomaly can be an outlier value that deviates too far from the mean, a set of occurrences that form a rare distribution (e.g., atypical deviation), or an unusual occurrence that causes a notable shift in the distribution (i.e., a significant change in the mean and/or standard deviation).
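As a minimal illustration of the outlier case described above (assuming a roughly normal history of observations and an invented z-score threshold), the following sketch flags an observation that deviates too far from the historical mean:

```python
import statistics

# Hypothetical history of daily login counts for a device.
history = [4, 5, 6, 5, 4, 7, 5, 6, 4, 5]
mean = statistics.mean(history)
stdev = statistics.stdev(history)

def is_statistical_anomaly(observation, z_threshold=3.0):
    """Flag an observation whose z-score puts it too far from the mean."""
    z_score = (observation - mean) / stdev
    return abs(z_score) > z_threshold

print(is_statistical_anomaly(6))    # False: within the usual range
print(is_statistical_anomaly(42))   # True: far outside the observed pattern
```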

[0034] A “statistical test” may refer to a test used to determine the statistical significance of an observation. For example, a statistical test can be used to test a hypothesis in a probabilistic manner by performing trials and observing how many occurrences conform to expected results. In one form of a statistical test, new data that is obtained can be compared against a probabilistic distribution for a similar data set so as to make probabilistic determinations about the obtained data. As an example, data relating to current login attempts made by devices can be compared against a probabilistic distribution or probabilistic function mapped for previous login attempts, so as to determine the probabilistic context of any one login event being anomalous.

[0035] “Bayesian probability” may refer to a type of statistical test or statistical interpretation in which new or unknown data is treated conditionally based on a previous experiment relating to the particular case being examined (i.e., based on “conditional probability”). As examples, Bayes’ theorem can be used to evaluate the lending risk of potential borrowers based on evidence from past lending (e.g., recent occurrences of credit default), or can be used to evaluate the probability of a particular stock failing based on recent price behavior for a stock index.
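A small worked example of Bayes’ theorem in the spirit of the paragraph above, with invented prior and conditional probabilities for fraud given an anomalous observation:

```python
# P(fraud | anomaly) = P(anomaly | fraud) * P(fraud) / P(anomaly)
p_fraud = 0.01                  # prior: 1% of events are fraudulent (made up)
p_anomaly_given_fraud = 0.90    # most fraudulent events look anomalous (made up)
p_anomaly_given_legit = 0.05    # a few legitimate events also look anomalous (made up)

p_anomaly = (p_anomaly_given_fraud * p_fraud
             + p_anomaly_given_legit * (1 - p_fraud))
p_fraud_given_anomaly = p_anomaly_given_fraud * p_fraud / p_anomaly

print(round(p_fraud_given_anomaly, 3))  # ~0.154 with these illustrative numbers
```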

[0036] A “test statistic” may refer to a quantity statistically derived from a data set, namely for the purpose of evaluating a statistical test. Examples of test statistics may include, but are not limited to, a k-statistic, Z-statistic, t-statistic, and/or an F-statistic. In this manner, samples of data can be both interpreted and evaluated, such as in the case of testing a hypothesis, detecting a statistical outlier or anomaly, or for making a prediction.

[0037] A “threat level” may refer to a measure of malicious activity. In the context of computer networking, a threat level may refer to the condition of one or more devices in the network. For example, a threat level may be evaluated as “low,” “high,” or “moderate,” thereby indicating the degree to which a threat may exist. A threat level can further be evaluated for a particular group of devices based on their collective activity in the network.

[0038] A “threshold value” may include a magnitude that needs to be exceeded for a certain result. A threshold value can include a minimum value, a maximum value, and/or an equal value. In decision making, a threshold value can be compared to measured quantities for the purpose of triggering an effect. For example, when determining an appropriate response to an event based on its potential threat, a threat level can be evaluated on a 0-100 scale, 0 being minimal threat and 100 being maximal threat, and an alert can be triggered if the threat level is assessed to be 70 or higher, with 70 being the threshold value for taking action.

[0039] A “risk status” may refer to an evaluated state of riskiness for one or more accounts, devices, users, or entities. For example, a risk status for an account can be evaluated as ‘high risk’ or ‘low risk,’ or more simply ‘good’ or ‘bad,’ depending on data attributed to said account. The data attributed to the account may include actions taken by a user relating to the account, which may be identified as more risky or less risky actions over time. In one non-limiting example, a risk status can be evaluated using machine learning, such as by applying learned classifications of risky versus non-risky behavior to the data attributed to the account.

[0040] A “risk feature” may be one or more characteristics believed to be indicative of a particular risk status. For example, suspicious actions taken by a user may be marked as potentially fraudulent or risky, and characteristics of said actions can be considered a risk feature. In the context of machine learning, “features” may refer to categories of input values (i.e., attributes) that, when set as variables for the learner (i.e., dimensions), may lead to the most accurate predictions. When a risk feature is present in historical data and communicated in a machine learning process, it can be used to better make predictions about the overall riskiness or threat level of a particular event, account, user, entity, device, or group in the future.

[0041] An “optimization process” or “mathematical optimization” may refer to finding a best solution in relation to alternatives and according to set criteria. This may include using an “optimization algorithm.” For example, an optimization algorithm may include a mathematical technique for finding maximum or minimum values within a defined domain or range.

[0042] An “ant colony optimization algorithm” or simply “ant colony optimization” may refer to a technique for using a set of computational agents to search or explore an information space for approximate solutions to an optimization problem by sharing information between agents over cycles. The search by the computational agents for a solution is analogized to ants (i.e., agents) searching for shortest paths to food (i.e., solutions). In one implementation of ant colony optimization involving a topological graph, the computational agents may be programmed to reduce a cost function representing the cost of linking a first type of node (nest) to a second type of node (food source). A nest and food source can also be referred to as a “nest node” and “target node,” respectively. A link between a nest and a food source is often referred to as a path, having a particular cost or length as defined by the sum of edge weights included in the path calculated by a given computational agent. At each cycle of the algorithm, the computational agents record path information so as to converge to a solution over time. Path information, typically referred to as “pheromones” in accordance with the ant colony analogy, can be communicated over cycles by altering a probability function that describes the probability that a given computational agent will search a particular path at a subsequent cycle, thus determining its next position value. In this manner, an ant colony optimization can be used to perform path optimizations within a graph, such as for finding an approximate solution to the traveling salesman problem or a similar routing problem, as sketched in the example after the reference below. In the context of machine learning, path optimizations may refer to detecting/identifying predictive features, as unlabeled data can be nests with their labels as the desired food source. In this manner, the most predictive features or “shortest paths” can be identified and accounted for by inferring and creating new relations in a graph, such as by the re-weighting of edges or creation of new edges. More information relating to ant colony optimization can be found at:

C. Blum, “Ant colony optimization: Introduction and recent trends,” Phys. Life Reviews, vol. 2, pp. 353-373, 2005.
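The sketch below is a toy ant colony optimization for the shortest-path case described above: agents walk from a nest node to a target node, deposit pheromone in inverse proportion to path cost, and the pheromone biases later agents toward cheaper paths. The parameter values, the update rule, and the toy graph are illustrative assumptions rather than the specific model used in the embodiments.

```python
import random

def ant_colony_shortest_path(graph, nest, target, n_ants=20, n_cycles=30,
                             evaporation=0.5, seed=1):
    """Toy ACO: agents build paths probabilistically, pheromone is deposited
    along completed paths in proportion to 1/cost, and evaporates each cycle."""
    rng = random.Random(seed)
    pheromone = {(a, b): 1.0 for a in graph for b in graph[a]}
    best_path, best_cost = None, float("inf")

    for _ in range(n_cycles):
        paths = []
        for _ in range(n_ants):
            node, path, visited = nest, [nest], {nest}
            while node != target:
                choices = [n for n in graph[node] if n not in visited]
                if not choices:          # dead end: abandon this ant
                    path = None
                    break
                # Desirability = pheromone * (1 / edge cost).
                weights = [pheromone[(node, n)] / graph[node][n] for n in choices]
                node = rng.choices(choices, weights=weights)[0]
                path.append(node)
                visited.add(node)
            if path:
                cost = sum(graph[a][b] for a, b in zip(path, path[1:]))
                paths.append((path, cost))
                if cost < best_cost:
                    best_path, best_cost = path, cost
        # Evaporate, then deposit pheromone along each completed path.
        pheromone = {edge: p * (1 - evaporation) for edge, p in pheromone.items()}
        for path, cost in paths:
            for edge in zip(path, path[1:]):
                pheromone[edge] += 1.0 / cost
    return best_path, best_cost

# Tiny weighted graph: edge weights act as costs/distances.
toy_graph = {
    "nest": {"A": 1.0, "B": 4.0},
    "A": {"food": 1.0, "B": 1.0},
    "B": {"food": 1.0},
    "food": {},
}
print(ant_colony_shortest_path(toy_graph, "nest", "food"))
```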

[0043] A “command sequence” may refer to a series of actions performed in succession. For example, in the context of computing, a command sequence may be a successive list of unique commands executed in a computer process, such as a sequence of Bash commands or any other unique sequence that executes a particular task when entered into a terminal and translated by the processor.

DETAILED DESCRIPTION

[0044] Embodiments described herein can relate to methods, systems, and computers for proactively authenticating users. Such proactive authentication measures can be used to limit any damage caused by imposters within a network. In embodiments, actions taken by users can be recorded as historical information, which can then be used to train a multi-tier artificial intelligence model and/or perform other statistical analyses. An authentication system may be configured to detect anomalies that occur amongst connected devices, and can then use AI models to score and predict the probability of activity being indicative of an imposter or compromised account.

[0045] Upon detecting a threat, the authentication system may further determine potential devices that have been affected based on characteristics shared between a user of the compromised account and other users of other devices within the user’s community. For example, users that are frequently active at a particular location, service provider, and/or point in time can be queried and alerted by the authentication system upon identifying a threat that has potentially spread to the identified users. As one example, a particular mobile application and typical time of day to be using the application may be highly correlated with a specific type of user and thus, a specific type of security vulnerability.

[0046] FIG. 1 shows an illustration of a system for proactively authenticating users according to an embodiment. System 100 may comprise a plurality of users including user 101, user 102, user 103, user 104, user 105, etc. The users may each be operating one or more computing devices, such as device 111, device 112, device 113, device 114, device 115, etc., which may communicate with service providers and/or with each other. For example, user 101 may use a smart phone to contact friends through a messaging service or through a social media application. As another example, user 104 may use a smart card to interact with an ATM network and obtain funds. Other examples can include the use of IOT devices, wearable devices, autonomous vehicles, remote sensors, or any other interaction involving users and network-enabled devices.

[0047] The computing devices may be configured to communicate with an authentication system 130 over a communications network 120. For example, the computing devices may be configured to perform a remote authentication process with authentication system 130 in order to use certain controlled applications and/or functions, such as access to user accounts, payment functions, media content, etc. In embodiments, the authentication system 130 may comprise one or more server computers configured to perform the required authentication processes.

[0048] The authentication processes performed at the authentication system 130 may include analyzing data contained in one or more memory stores, such as events cache 130A, historical data database 130B, and graph database 130C. The analysis of said data may be used to evaluate recorded actions and execute appropriate decisions in response. For example, events cache 130A may comprise data for authentication events that can be identified as either typical or anomalous, such as information pertaining to login attempts. As another example, historical data database 130B may comprise a history of transaction data that may be used to determine trends and make predictions about future consumer behavior. In yet another example, graph database 130C may include a plurality of graphs that group users by shared characteristics, which can be used to understand features of different segments within a network and classify groups of computing nodes or accounts associated therewith. The analysis may comprise building an artificial intelligence model for identifying security threats and determining potentially affected accounts. The artificial intelligence model can further be compared against new data over time (i.e., evaluation data) so as to assess the accuracy of predictions and provide improved results through the adjustment of individual model components.

[0049] As previously explained, the authentication system described may be configured to use the artificial intelligence model to identify compromised accounts and proactively initiate re-authentication for accounts that are similar. Taken as one illustrative example, user 103 may use wearable device 113 to access a public WiFi network at a local park. The wearable device 113 may further store a payment application that can be used to conduct transactions with the user’s payment account. The user may typically use the application to make purchases at health food stores once or twice per day on average. At some point in time, a criminal actor may hack into the public WiFi network and use the WiFi network to gain access to the payment application and conduct fraudulent transactions on behalf of the user. In this example, the fraudulent activity of the payment application may be detected as relatively noteworthy or anomalous by authentication system 130. One or more machine learning models can then be used by the authentication system 130 to identify which features of the compromised account are related to a security vulnerability or potential entry point for an attack.

[0050] For example, a machine learning model may identify a correlation between fraud and accounts regularly conducting transactions at health food stores within a specific geographical area. The machine learning model may further identify that the accounts are commonly associated with the use of a fitness watch payment application. These correlations and commonalities may be used to group the accounts into a community, which may be treated as similar and deserving of the same responsive action from the authentication system 130. The authentication system 130 may then query a database for users belonging to the community, and may restrict use of their payment credentials until they are able to re-authenticate (e.g., by providing some means of identification to the authentication system, such as a password, PIN, security question, or other suitable identifying information). Thus, the threat of a criminal actor exploiting the particular community of vulnerable users can be contained. In embodiments, the composition of community groups may be reflective of real-world behavior, thus allowing the authentication system 130 to deduce the scope of an attack based on shared characteristics. In this example, the vulnerable community of users may be a group of users that are frequently active around the local park which may have an unsecured public WiFi network, and were thus likely to be targeted. Furthermore, the vulnerable community of users may more narrowly have been users at the local park that also use a specific payment application for wearable devices found to possess an exploitable vulnerability.

[0051] In some embodiments, user activity after re-authentication can be further evaluated by authentication system 130 to learn and correct false alarms or mischaracterized threats. As an example, a false positive rate can be calculated and used to measure or quantify incorrect identifications of fraud made by the authentication system 130. In an embodiment, a high rate of users re-authenticating on a first attempt or the absence of risky behavior for an extended period of time after re-authentication may be indicative of a high volume of false positives. In one embodiment where the authentication system 130 includes multiple machine learning components or machine learning models, the authentication system 130 may further evaluate the accuracy of each machine learning component and may adjust their effect or effective sensitivity on threat level assessments made by the authentication system 130.

[0052] FIG. 2 shows a block diagram of a server computer for performing functions in an authentication system according to embodiments. Authentication system server computer 200 may be a server computer configured to authenticate one or more users over a network and determine a plurality of authentication results. The server computer may be a server computer of authentication system 130 and may be capable of accessing data of events cache 130A, historical data database 130B, and graph database 130C of FIG. 1. The users that may be authenticated may include users 101, 102, 103, 104, 105, etc. of FIG. 1, which can be remotely authenticated over communications network 120 by server computer 200. In embodiments, the server computer may connect to the communications network via network interface 220, which may allow for the handling of data messages according to one or more communications protocols.

[0053] Authentication system server computer 200 may comprise a processor 210 for executing instructions stored in computer-readable medium 230. Computer-readable medium 230 may store instructions in the form of code, which may further be programmed as one or more program modules. For example, server computer 200 may comprise data collection module 231, threat assessment module 232, re-authentication module 233, and feedback module 234.

[0054] Data collection module 231 may be a module including code for collecting, processing, and monitoring incoming data, such as data for authentication requests and metadata associated therewith. The module may include code for an automated data aggregator 231A, a signal processor 231B, and an anomaly detector 231C. Automated data aggregator 231A may comprise code for receiving and storing data over network interface 220. For example, automated data aggregator 231A may comprise instructions for collecting device behavior and/or account login behavior. This may be done in a batch process where data logged at devices is collected over the communications network with a predetermined frequency, such as on a per second, per minute, or per hour basis. Other examples of data that can be aggregated may include transaction data, user profile data, network connection histories, API session and routing logs, sensor data, amongst others.

[0055] Signal processor 231B may comprise code for processing signals into data that can be analyzed. In one embodiment, the signal processor may comprise instructions for tagging command sequences and indexing command sequences that are categorical features for one or more machine learning models, such as risk features used by an AI to predict the presence of a network threat. In another embodiment, signal processor 231B may further comprise instructions for binning collected data. For example, signal processor 231B may comprise instructions for combining two features into a single feature, so as to reduce the dimensionality of input data during a machine learning process. In one embodiment, signals processed using signal processor 231B can include continuous signals collected from device sensors (e.g., motion sensors, camera feeds, audio feeds), as in the case of remote sensing at an access device or other device that may require authentication for its intended use, such as that of any IOT type device (smart vehicle, smart home, street light, appliance, etc.). In such an instance, signal processor 231B may comprise instructions for quantizing signal data through analog-to-digital conversion, lossy compression, or another suitable technique.

[0056] Anomaly detector 231C may comprise instructions for performing statistical analyses required for detecting a statistical anomaly. In one embodiment, an aggregation of incoming data that has been processed can be used by the authentication system server computer 200 to generate probabilistic distributions (e.g., probability mass functions) for one or more categories of input data. A probabilistic distribution can, for example, include expected occurrences of typical command behavior (i.e., command sequences) of a specific device/account or specific group of devices or accounts. In one embodiment, data for an online event can be compared to a probabilistic distribution for a community group associated with the online event to determine if the online event is statistically anomalous. The comparison may include performing a statistical test, in which the event is determined to be anomalous if a measured test statistic is outside a predefined range. For example, a prior probability distribution (e.g., a Bayesian prior) for a community group can be recalculated by the server computer when an associated online event occurs, and the online event may be considered anomalous if the mean and standard deviation for the newly calculated distribution fall outside of a predetermined window. A detected anomaly can later be evaluated by the authentication system server computer 200 for its threat level, as further described below.
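As a rough sketch of the window check described above (with invented per-community history, window sizes, and field meanings), the distribution for a community can be recomputed when a new event arrives and the event flagged if the mean or standard deviation shifts too far:

```python
import statistics

def update_and_check(history, new_value, mean_window=1.0, stdev_window=0.5):
    """Recompute the community's distribution with the new observation and flag
    the event if the mean or standard deviation shifts outside a predetermined
    window around the previous values."""
    old_mean, old_stdev = statistics.mean(history), statistics.stdev(history)
    updated = history + [new_value]
    new_mean, new_stdev = statistics.mean(updated), statistics.stdev(updated)
    anomalous = (abs(new_mean - old_mean) > mean_window
                 or abs(new_stdev - old_stdev) > stdev_window)
    return anomalous, updated

# Hypothetical per-community history of commands issued per session.
community_history = [12, 14, 13, 12, 15, 13, 14]
print(update_and_check(community_history, 14))   # (False, ...) ordinary event
print(update_and_check(community_history, 95))   # (True, ...)  shifts the distribution
```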

[0057] Threat assessment module 232 may be a module of code for assessing a threat level based on one or more machine learning processes. This may include machine learning components of event-based risk evaluator 232A, peer-based risk evaluator 232B, and latent risk evaluator 232C. In one embodiment, threat assessment module 232 may comprise code for generating scores from a plurality of machine learning components and calculating a value for a threat level based on the scores generated (e.g., using a weighted average).
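A minimal sketch of the weighted-average combination mentioned above; the component weights, the threshold, and the example scores are assumptions for illustration only:

```python
def assess_threat_level(event_risk, community_risk, latent_risk,
                        weights=(0.5, 0.3, 0.2)):
    """Combine the three evaluator scores into one threat level (weighted average)."""
    scores = (event_risk, community_risk, latent_risk)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# Illustrative scores from the three hypothetical components.
threat_level = assess_threat_level(event_risk=0.9, community_risk=0.6, latent_risk=0.4)
THRESHOLD = 0.7  # made-up threshold value for triggering re-authentication
if threat_level >= THRESHOLD:
    print("initiate re-authentication for the community group")
else:
    print(f"threat level {threat_level:.2f} below threshold, no action")
```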

[0058] Event-based risk evaluator 232A may comprise code for generating a risk score for an online event. In one embodiment, the risk score for an online event may be generated using supervised learning. For example, event-based risk evaluator 232A may be used to compare data collected from a login event to a known trend of fraud in order to score the likelihood of risky behavior associated with the login. The data for scoring may comprise a set of information received during the online event which, as examples, may include an IP address, number of login attempts, amount of network traffic at the IP address, etc. In other examples, the online events may be transactions in a payment network or ATM network, and a risk score for an online event may be based on a transaction amount, transaction type, transaction location, time of day, method of payment, type of credit/debit card, etc.

[0059] Peer-based risk evaluator 232B may comprise code for identifying threats through grouping of similar peers. For example, peer-based risk evaluator 232B may comprise instructions for executing a graph learning or other unsupervised learning algorithm, so as to group devices or accounts that may have similar security vulnerabilities into communities. In one embodiment, peer-based risk evaluator 232B may further comprise code for determining a risk status for a community group associated with an online event. For example, when a current online event is being analyzed, communities can be updated/regenerated using the appropriate learning algorithm, and a community group associated with the online event may be determined to be risky if it has been regenerated to include a high risk node. Thus, when data that is indicative of high risk activity is incorporated into a community as the result of a member of the community (an account) conducting high risk actions, the authentication system server computer 200 can identify the resulting effect on the member’s effective peers (e.g., similar accounts/users or devices connecting to a network from the same access point). In one embodiment, communities may overlap, such that members can belong to more than one community and can be characterized/classified according to multiple behavior patterns. As an example, a device can be classified according to its most frequent WiFi connections, as well as by the types of programs it has downloaded. Thus, when the authentication system server computer 200 determines a risk associated with a peer of either community, the device may potentially be alerted. More information regarding the prediction of network behavior based on overlapping communities may be found in WO Patent Application No. PCT/US2017/041537 titled “Machine Learning and Prediction Using Graph Communities” filed on July 11, 2017 and assigned to the same assignee as the current application, and which is hereby incorporated by reference for all purposes.

[0060] Latent risk evaluator 232C may comprise code for predicting fraud through the detection of latent features in collected data. In one embodiment, latent features may be inferred by performing an optimization process comprising a set of computational agents (i.e., an agent-based model). For example, a path optimization process, such as ant colony optimization, may be used to determine shortest paths in a graph between data points (nest nodes) linked to fraud (target node), and may further be used to infer a connection (edge) between fraud and other data points in the graph connected to said shortest paths. In this manner, computational agents employed for computing shortest paths to fraud can simulate future attacks by evaluating various paths that an attacker may attempt to exploit. In some embodiments, other metaheuristic techniques for optimization may be used, including evolutionary algorithms, particle swarm optimization, simulated annealing, harmony search, among others. More information regarding using an agent-based model to detect latent features in a graph through path optimizations can be found in U.S. Patent Application No. 15/590,988 titled “Autonomous Learning Platform for Novel Feature Discovery” filed on May 9, 2017 and assigned to the same assignee as the current application, and which is hereby incorporated by reference for all purposes.

[0061] Re-authentication module 233 may be a module of code for initiating a re-authentication process at one or more accounts. This may include re-authentication event generator 233A and affected accounts identifier 233B. Re-authentication event generator 233A may comprise code for generating re-authentication events that may restrict access until further authentication at an account has been completed. For example, the re-authentication events may include a process of locking out logged in users at an account and posing a security question controlling further access. In one embodiment, re-authentication event generator 233A may comprise a challenge-response protocol, and re-authentication events may comprise sending an authentication challenge to one or more devices and receiving from the devices a desired authentication response.

[0062] Affected accounts identifier 233B may comprise code for identifying one or more accounts associated with an identified threat triggered by an online event. Affected accounts identifier 233B may further comprise code for generating a query for devices or network addresses relating to said identified accounts. For example, the query may comprise searching a record of devices or IP addresses from which an account was most recently and/or most frequently accessed. The authentication system server computer 200 may then send an authentication challenge to the queried devices and restrict access to their associated accounts until the correct authentication response is received. In one embodiment, affected accounts may be determined by the authentication system server computer 200 based on a community group associated with the online event. For example, nodes representing data elements in the online event may be compared to nodes for each community in a list of learned communities. In one embodiment, the comparison may be made by evaluating a vector similarity score, which may include generating a vector from online event data, generating a vector for a given community in a community list, and calculating an overlap. More information regarding scoring a degree of similarity between samples of data can be found in U.S. Patent Application No. 15/639,094 titled “GPU Enhanced Graph Model Build and Scoring Engine” filed on June 30, 2017 and assigned to the same assignee as the current application, and which is hereby incorporated by reference for all purposes.
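The following sketch illustrates one way the vector comparison described above could look, using simple presence vectors over a feature vocabulary and cosine similarity as the overlap measure; the vocabulary, the community profiles, and the choice of cosine similarity are assumptions, not the method of the incorporated application.

```python
import math

def to_vector(features, vocabulary):
    """Presence vector marking which vocabulary features appear in the data."""
    return [1.0 if f in features else 0.0 for f in vocabulary]

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical feature vocabulary and community profiles.
vocabulary = ["park_wifi", "wearable_pay_app", "health_food_store", "atm_access"]
communities = {
    "park_wearable_users": {"park_wifi", "wearable_pay_app", "health_food_store"},
    "atm_users": {"atm_access"},
}

event_features = {"park_wifi", "wearable_pay_app"}
event_vec = to_vector(event_features, vocabulary)
for name, feats in communities.items():
    score = cosine_similarity(event_vec, to_vector(feats, vocabulary))
    print(name, round(score, 2))   # higher score = more likely affected community
```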

[0063] Feedback module 234 may be a module of code for generating feedback to improve predictions. This may include outcome evaluator 234A and sensitivity calibrator 234B. Outcome evaluator 234A may comprise code for evaluating the accuracy level of predictions based on new data. For example, outcome evaluator 234A may comprise instructions for evaluating authentication results from re-authentication sessions to determine a false-positive rate quantifying a percentage of legitimate users who were asked to unnecessarily re-authenticate. Sensitivity calibrator 234B may comprise code for adjusting and/or re-weighting components of threat assessment module 232 based on an evaluated accuracy level. For example, sensitivity calibrator 234B may comprise code for re-weighting the significance of latent risk evaluator 232C in assessing a threat level. This may be done based on evaluated outcomes. For example, a particular risk feature identified through ant colony optimization may prove to be less predictive over time and may increase an evaluated false-positive rate, and as a result, the effect or weight of the ant colony model in assessing/scoring a threat may be reduced by authentication system server computer 200.
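A hedged sketch of the feedback idea above: estimate a false-positive rate from re-authentication results (treating a challenge passed on the first attempt as a likely false alarm) and reduce a component's weight when that rate is too high. The field names, the target rate, and the adjustment step are illustrative assumptions.

```python
def false_positive_rate(results):
    """Treat a challenge passed on the first attempt as a likely false alarm."""
    false_alarms = sum(1 for r in results if r["passed_first_attempt"])
    return false_alarms / len(results) if results else 0.0

def recalibrate(weight, fp_rate, target_fp=0.2, step=0.1):
    """Reduce the component's weight when false positives exceed the target."""
    if fp_rate > target_fp:
        weight = max(0.0, weight - step)
    return weight

# Hypothetical re-authentication outcomes for members of a community group.
results = [{"passed_first_attempt": True}, {"passed_first_attempt": True},
           {"passed_first_attempt": False}, {"passed_first_attempt": True}]
fp = false_positive_rate(results)
print(fp, recalibrate(weight=0.2, fp_rate=fp))   # 0.75 0.1
```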

[0064] FIG. 3 shows a swim-lane diagram of a proactive authentication process according to embodiments. Process 300 may involve peer accounts 310, a requesting user 320, a user device 330, and an authentication system 340. The particular process shown may be an embodiment where a threat has been detected. In embodiments, the process may occur over a network, such as communications network 120 of FIG. 1. Authentication system 340 may include one or more server computers for controlling access to user accounts, including server computer 200 of FIG. 2. User device 330 may be a device for accessing an account, such as any one of devices 111, 112, 113, and 114 of FIG. 1. As shown, the requesting user 320 in FIG. 3 may be a legitimate user or a fraudulent user who has illegally obtained legitimate credentials. Peer accounts 310 may be accounts belonging to peers identified as sharing characteristics as classified according to one or more overlapping communities, and may thus share one or more security vulnerabilities.

[0065] At step S301, an attempt to access an account is initiated by a requesting user 320 at a device 330. For example, the requesting user 320 may submit a username and password for an account into a login form displayed on his or her smart phone. The login session may be considered an online event and may be associated with data that can be logged by the authentication system 340. For example, command sequences executed during the login session may be recorded by device 330 and uploaded to a server computer of authentication system 340, such as authentication system server computer 200 of FIG. 2.

[0066] At step S302, an authentication request is submitted from device 330 to authentication system 340. The credentials supplied by the requesting user 320 may be generated into a data message and submitted to a remote server of authentication system 340 for verification and association with the account for which access is requested.

[0067] At step S303, authentication system 340 determines an initial authentication result and logs the event. For example, a server computer of authentication system 340 may query a database for credentials presented by the user (e.g. username and password) so as to verify that the credentials match stored records. The server computer may determine that the credentials received match retrieved records, generate a positive authentication result, and log any data collected from the event.
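
A minimal, hypothetical sketch of the kind of check described at step S303 is shown below; the table layout, the use of PBKDF2 password hashing, and the helper names are assumptions made for illustration and are not mandated by the embodiment:

    import hashlib, hmac, os, sqlite3

    def hash_password(password, salt, iterations=200_000):
        # PBKDF2-HMAC-SHA256 is used here only as one reasonable way to avoid
        # storing plaintext passwords; the embodiment does not mandate it.
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)

    def verify_credentials(db, username, password):
        # Look up the stored salt and hash for the presented username and
        # compare the recomputed hash in constant time.
        row = db.execute(
            "SELECT salt, pw_hash FROM credentials WHERE username = ?", (username,)
        ).fetchone()
        if row is None:
            return False
        salt, stored_hash = row
        return hmac.compare_digest(stored_hash, hash_password(password, salt))

    # Hypothetical setup and check.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE credentials (username TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB)")
    salt = os.urandom(16)
    db.execute("INSERT INTO credentials VALUES (?, ?, ?)",
               ("alice", salt, hash_password("correct horse", salt)))
    result = verify_credentials(db, "alice", "correct horse")  # True -> log event, grant access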

[0068] At step S304, authentication system 340 submits the authentication result and grants access to the device. For example, a server computer may send an authentication result of 'credentials verified' in a data message to the device 330 and may subsequently grant the device 330 access to the requested account, such as by sending data for the account to the device or by permitting any requested downloads.

[0069] At step S305, device 330 communicates the authentication result to the requesting user 320. For example, the device 330 may display a confirmation or may simply grant the requesting user 320 access to the account, such as by displaying any loaded data for the account and/or assets stored therein.

[0070] At step S306, the requesting user 320 conducts actions from device 330. For example, the user may begin entering inputs into the device 330 for manipulating account data. At step S307, device commands are transmitted from device 330 to the authentication system 340. The authentication system may then receive and monitor the commands.

[0071] At step S308, the authentication system 340 performs one or more statistical tests. In embodiments, the authentication system 340 may detect an anomaly relating to the account in question as a result of the statistical test. In one embodiment, the statistical test may comprise comparing the commands received in step S307 to a distribution of command sequences logged for a community group associated with the initial login session. In one embodiment, one or more peer groups associated with the login session may be determined using a graph learning algorithm in which data for online events are grouped into communities based on historical data.
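
As a non-limiting sketch of such a statistical test, the commands observed in the current session can be compared to the command distribution logged for the associated community; the choice of a chi-square goodness-of-fit test, the significance level, and the toy data below are illustrative assumptions:

    from collections import Counter
    from scipy.stats import chisquare

    def command_anomaly(session_commands, community_history, alpha=0.01):
        # session_commands: list of command tokens observed in the current session.
        # community_history: Counter of command tokens logged for the community.
        vocab = sorted(set(session_commands) | set(community_history))
        observed = [session_commands.count(c) for c in vocab]
        # Expected counts are the community's proportions scaled to the session
        # size, with a small floor so no expected cell is zero.
        total_hist = sum(community_history.values())
        expected = [max(community_history.get(c, 0) / total_hist * len(session_commands), 0.5)
                    for c in vocab]
        # Rescale so both vectors sum to the same total (required by chisquare).
        scale = sum(observed) / sum(expected)
        expected = [e * scale for e in expected]
        stat, p_value = chisquare(f_obs=observed, f_exp=expected)
        return p_value < alpha  # True -> treat the session as a statistical anomaly

    community_history = Counter({"view_balance": 800, "pay_bill": 150, "update_profile": 50})
    session = ["export_contacts"] * 6 + ["view_balance"] * 2  # unusual command mix
    is_anomalous = command_anomaly(session, community_history)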

[0072] At step S309, the authentication system 340 applies data for the online event to a risk model. For example, an IP address may be scored against a supervised learning model to generate a risk score. At step S310, communities are regenerated from historical data and data for the online event. For example, relations between the historical data and data for the online event can be updated in a graph database to which a graph learning algorithm is applied. At step S311, a risk status for the community associated with the event is determined. For example, the community group associated with the online event may be scored against risk features predictive of either a risk status of 'good' or 'bad'. At step S312, the authentication system 340 searches for new risk features. For example, the authentication system 340 may use an optimization technique to infer new predictive features, and may then generate an additional risk score for the community.
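
A hedged sketch of the supervised scoring at step S309 is given below; the feature set (IP reputation, geo-velocity, login hour), the use of logistic regression, and the toy training data are illustrative assumptions rather than the specific model of the embodiment:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Toy historical data: each row is [ip_reputation (0 bad .. 1 good),
    # geo_velocity_km_per_h, login_hour]; labels: 1 = known fraud, 0 = legitimate.
    X_train = np.array([
        [0.9,   20,  9], [0.8,    5, 14], [0.95,  0, 11], [0.7,   60, 19],
        [0.1, 900,  3], [0.2, 1500,  4], [0.05, 700,  2], [0.3,  400,  1],
    ])
    y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    def risk_score(event_features):
        # Returns the model's probability that the online event is fraudulent,
        # used downstream as one weighted input to the overall threat level.
        return float(model.predict_proba(np.array([event_features]))[0, 1])

    score = risk_score([0.15, 1100, 3])  # low-reputation IP, implausible travel, 3 AM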

[0073] At step S313, a threat level for the online event is assessed. The threat level may be assessed based on a plurality of machine learning models. For example, the threat level may be a weighted average of the scores generated in steps S309 through S312. The threat level may then be compared to a threshold value for triggering re-authentication events.
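
The aggregation at step S313 might, for example, take the following form; the component names, weights, and threshold are placeholders chosen only to illustrate the weighted combination described above:

    def threat_level(scores, weights):
        # scores: dict of component name -> score in [0, 100].
        # weights: dict of component name -> relative weight; normalized here so
        # the result stays on the same 0-100 scale as the inputs.
        total_weight = sum(weights.values())
        return sum(scores[name] * weights[name] for name in scores) / total_weight

    scores = {
        "event_risk_model": 82,   # step S309: supervised model score
        "community_risk":   65,   # steps S310-S311: community risk status score
        "aco_risk_feature": 74,   # step S312: optimization-derived feature score
    }
    weights = {"event_risk_model": 0.5, "community_risk": 0.3, "aco_risk_feature": 0.2}

    level = threat_level(scores, weights)
    THRESHOLD = 70  # illustrative trigger value
    trigger_reauthentication = level > THRESHOLD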

[0074] At step S314, community peers affected by the determined threat are queried. In embodiments, members of the community group associated with the online event may be identified, and authentication system 340 may query for device identifiers and/or network addresses associated with the community members. At step S315, re-authentication challenges are posed to peer accounts 310. For example, a challenge-response session may be initiated at the devices queried in step S314 that restricts access until a security question is answered correctly. At step S316, re-authentication responses are received from the peer accounts by the authentication system 340. For example, users at peer accounts 310 may attempt to answer the security question posed by the authentication system 340 in step S315.
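
The peer query and challenge dispatch of steps S314 through S316 could be organized roughly as sketched below; the account data model, the send_challenge helper, and the security-question mechanism are hypothetical stand-ins, since the embodiment does not limit the type of challenge used:

    from dataclasses import dataclass, field

    @dataclass
    class Account:
        account_id: str
        community: str
        device_ids: list = field(default_factory=list)
        network_address: str = ""

    def peers_in_community(accounts, community):
        # Step S314: collect every member of the affected community group so their
        # device identifiers and network addresses can be retrieved.
        return [a for a in accounts if a.community == community]

    def send_challenge(account):
        # Hypothetical transport: a real system would push a challenge-response
        # prompt (e.g. a security question) to each of the member's devices and
        # suspend access until it is answered correctly.
        return {"account_id": account.account_id,
                "devices": list(account.device_ids),
                "challenge": "security_question"}

    accounts = [
        Account("acct-1", "community_A", ["dev-11"], "10.0.0.4"),
        Account("acct-2", "community_A", ["dev-12", "dev-13"], "10.0.0.9"),
        Account("acct-3", "community_B", ["dev-21"], "10.0.1.2"),
    ]
    pending = [send_challenge(a) for a in peers_in_community(accounts, "community_A")]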

[0075] At step S317, outcomes are evaluated by the authentication system 340. The authentication system 340 may evaluate how successfully peers re-authenticate and may determine its accuracy level in identifying the group as being vulnerable. For example, the authentication system 340 can determine a false-positive ratio for assessing how many legitimate and authorized users were forced to re-authenticate due to the authentication system 340 identifying a particular anomaly as a threat.

[0076] At step S318, sensitivities for machine learning components are adjusted based on the evaluated outcomes. For example, the authentication system 340 may reduce the effect of a particular machine learning model when assessing/scoring a threat level associated with a given anomaly. As an example, it may be determined that a risk feature identified using the optimization process in step S312 has become less predictive over time, and thus the authentication system 340 may reduce the effect of the optimization process in assessing a threat by re-weighting the component or by adjusting a model sensitivity.

[0077] FIG. 4 shows a diagram of a process flow at an authentication system according to an embodiment. Initially, a user requests authentication at 401. For example, the requesting user may submit credentials to the authentication system through a login form or access device.

[0078] Information from the user is collected at 402. This may include the credentials submitted by the user to the authentication system as well as device information (e.g. IP address, location, etc.). Collected information may be stored in a knowledge base at 404. The knowledge base may comprise one or more databases, such as database 130A, 130B, and/or 130C of FIG. 1. The collected information stored in the knowledge base may additionally be processed for assessing a threat to particular accounts and initiating re-authentication, as described in greater detail further below.

[0079] The authentication system may determine if the user is known at 403. If the requesting user is not known (i.e. invalid credentials are submitted), then the authentication system may return 'authenticated = false' at 422, indicating that the user was not successfully authenticated for the requested account. If, however, the user is known (i.e. valid credentials), collected information may be compared to historical data at 419 and may be applied to determine an authentication score at 420. For example, information regarding known instances of fraudulent use of accounts may be used to score the likelihood that the user is authentic. Additionally, the collected information may be used to regenerate communities at 418. At 421, it is determined if the authentication score exceeds a threshold, in which case the authentication system returns 'authenticated = true' at 423. Otherwise, the authentication system returns 'authenticated = false' at 422.

[0080] As previously mentioned, a signal for online events, including requested authentication events at accounts, can be processed at 405. When a given signal is collected by the authentication system, it is determined at 406 whether the signal has caused a statistical shift in recorded behavior for one or more accounts. If a signal change has been detected, a risk analysis can be initiated at 408. Otherwise, the online event can simply be logged at 407.

[0081] In embodiments, a risk analysis may involve a multi-tiered AI model. This may include applying a supervised risk model to an online event, determining a risk status for a community group or peer group relating to an associated account using an unsupervised model, and identifying risk features using a metaheuristic optimization technique such as ant colony optimization. As such, a risk score can be determined using one or more risk models at 409. A community for the account associated with a given online event can be evaluated at 410. And results from an ant colony optimization process may be aggregated at 411. The outputs from 409, 410, and 411 may each be weighted and used to assess a threat level associated with the online event at 412. Additionally, the ant colony optimization process can be restarted at 413.

[0082] It is determined if the threat level is assessed to be high enough to trigger an alert at 414. In one embodiment, this may be determined relative to a predetermined threshold value established at the authentication system. For example, the threat level may be assessed on a 0-100 scale, and only a threat level above a level/score of 70 may trigger an alert.

[0083] If the threat level is above the predetermined threshold value, a query for associated peers can be performed at 416. Otherwise, the online event can simply be logged by the authentication system at 415. In one embodiment, the query for associated peers performed at 416 may be based on one or more communities associated with the online event. For example, the online event may occur at an account that is associated with a particular group of accounts, users, or devices thereof (e.g. a particular consumer group, particular type of device, location, time of day, network, etc.), which may prompt the authentication system to query for other accounts in the group as they may have also been affected by the determined threat.

[0084] Once the associated peers in a community have been queried, re-authentication for the peers may be initiated at 417. For example, users accessing peer accounts at devices may be asked to provide credentials in order to continue accessing said accounts. As such, in embodiments, accounts can be authenticated proactively, thereby containing a threat and preventing attackers from further misusing any affected accounts.

[0085] FIG. 5 shows a depiction of identifying new paths in a graph according to an embodiment. Shown are paths in a graph linking device communities to a risk status of either 'good' or 'bad' based on recorded actions. In embodiments, the paths may be considered predictive risk features linking device communities to a particular risk status. Here, data in a network may be expressed as nodes, and the relationship between nodes can be expressed as edges. In embodiments, edges can further have weights defining the strength of a relationship between nodes. Here, the edges are labeled based on the percentage of community members that move from the first node to the second node of the edge. In other words, the edges are weighted based on the probability that a community will perform a set of actions.
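
A small sketch of how such probability-style edge weights could be derived from recorded actions is given below; the action logs and the simple transition-count estimate are illustrative assumptions:

    from collections import defaultdict

    def transition_weights(action_sequences):
        # action_sequences: one list of node labels per community member, in the
        # order the actions were performed. The weight of edge (a, b) is the
        # fraction of observed moves out of node a that went directly to node b,
        # matching the probability-style edge labels described for FIG. 5.
        counts = defaultdict(lambda: defaultdict(int))
        totals = defaultdict(int)
        for seq in action_sequences:
            for a, b in zip(seq, seq[1:]):
                counts[a][b] += 1
                totals[a] += 1
        return {(a, b): counts[a][b] / totals[a]
                for a in counts for b in counts[a]}

    # Hypothetical logs for three members of one device community.
    logs = [
        ["login", "view_balance", "logout"],
        ["login", "view_balance", "export_contacts"],
        ["login", "change_password", "export_contacts"],
    ]
    weights = transition_weights(logs)  # e.g. ("login", "view_balance") -> 2/3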

[0086] As can be seen, at a first time period 510, device community 1 511, community 2 512, and community 3 513 may be linked to actions which are related to a status of 'good' 514 or non-risky behavior. Then, at a second time period 520, the status for community 3 523 may shift towards fraudulent behavior, in which 5% of active devices in community 3 523 are compromised and linked to a risk status of 'bad' 525 as determined by risky actions taken by the devices. At a third time period 530, computational agents of an ant colony optimization algorithm (i.e. hive learners) may find a shortest path (risk feature 540) between actions of community 2 532 and the nearest risk status, which is the risk status of 'bad' 535. As such, devices of community 2 532 may be identified as risky by the authentication system.
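
To make the path search concrete, the following is a deliberately small ant colony sketch in which each computational agent walks from a community's action node toward a risk-status node, and repeated reinforcement of short paths with pheromone allows a path such as risk feature 540 to emerge. The graph, edge costs, parameter values, and node names are invented for illustration; a production hive learner would differ substantially:

    import random

    # Illustrative directed graph: edge -> cost (lower cost = stronger link).
    edges = {
        ("community_2_actions", "action_x"): 1.0,
        ("action_x", "action_y"): 1.0,
        ("action_y", "status_bad"): 1.0,
        ("community_2_actions", "action_z"): 2.0,
        ("action_z", "status_good"): 4.0,
    }
    neighbors = {}
    for (a, b), cost in edges.items():
        neighbors.setdefault(a, []).append(b)

    pheromone = {edge: 1.0 for edge in edges}

    def walk(start, goals, max_steps=10, alpha=1.0, beta=1.0):
        # One ant builds a path probabilistically: edges with more pheromone and
        # lower cost are more likely to be chosen.
        path, node = [start], start
        for _ in range(max_steps):
            if node in goals:
                return path
            options = neighbors.get(node, [])
            if not options:
                return None
            scores = [(pheromone[(node, n)] ** alpha) * ((1.0 / edges[(node, n)]) ** beta)
                      for n in options]
            node = random.choices(options, weights=scores, k=1)[0]
            path.append(node)
        return None

    def ant_colony(start, goals, ants=50, rounds=20, evaporation=0.1, deposit=1.0):
        best = None
        for _ in range(rounds):
            for edge in pheromone:                   # evaporate old pheromone
                pheromone[edge] *= (1.0 - evaporation)
            for _ in range(ants):
                path = walk(start, goals)
                if path and path[-1] in goals:
                    length = sum(edges[(a, b)] for a, b in zip(path, path[1:]))
                    for a, b in zip(path, path[1:]):  # reinforce edges on this path
                        pheromone[(a, b)] += deposit / length
                    if best is None or length < best[1]:
                        best = (path, length)
        return best

    # The shortest discovered path here ends at 'status_bad', analogous to risk feature 540.
    risk_feature = ant_colony("community_2_actions", goals={"status_bad", "status_good"})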

[0087] FIG. 6 shows a depiction of overlapping communities according to an embodiment. Graph 600 may be a graph for login events in a network, such as in communications network 120 of FIG. 1. Further, the network may be a network that is monitored by authentication system 130 of FIG. 1 for identifying threats and initiating authentication events. In graph 600 shown, nodes for data elements relating to login events are represented in relation to one another by edges. This may include nodes for specific wireless routers, IP addresses, device identifiers, users, locations, and login times, which are connected by edges to express how often they appear together in any given login event within the network. According to embodiments, nodes can be grouped into communities based on weight, such that each community may include nodes that are highly connected. In this example, the weight of an edge between two nodes may be the probability of the two nodes occurring together in a login event.

[0088] For graph 600 shown, the weights are shown in Euclidean space, with nodes being more highly correlated if they are separated by a shorter distance. As such, the communities may be generated using an appropriate graph learning algorithm for generating overlapping communities of dense linkage, as previously referred to. For example, the graph learning algorithm may comprise generating communities by iteratively performing a clustering algorithm on the graph, with the most highly connected nodes in the graph acting as seed nodes for each community. In the example shown, the seed nodes may be nodes for specific wireless routers to which users may connect, such as router-1 601, router-2 602, and router-3 603.
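
A simplified, hypothetical version of this seed-based grouping appears below. Treating the most highly connected nodes as seeds and allowing a node to join every seed it is sufficiently connected to (so that communities can overlap) is one possible reading of the algorithm; all thresholds and data are invented:

    from collections import defaultdict

    def overlapping_communities(weighted_edges, num_seeds=3, min_weight=0.3):
        # weighted_edges: dict mapping an undirected node pair to a weight in
        # [0, 1] (e.g. the probability of co-occurring in a login event).
        strength = defaultdict(float)
        adj = defaultdict(dict)
        for (a, b), w in weighted_edges.items():
            adj[a][b] = adj[b][a] = w
            strength[a] += w
            strength[b] += w
        # The most highly connected nodes act as the seed of each community.
        seeds = sorted(strength, key=strength.get, reverse=True)[:num_seeds]
        communities = {seed: {seed} for seed in seeds}
        for node in strength:
            for seed in seeds:
                # Overlap is allowed: a node may join several communities.
                if node != seed and adj[node].get(seed, 0.0) >= min_weight:
                    communities[seed].add(node)
        return communities

    edges = {
        ("router-1", "user-X"): 0.9, ("router-1", "user-Y"): 0.8,
        ("router-2", "user-Z"): 0.7, ("router-2", "cmd-seq-604"): 0.6,
        ("router-3", "user-W"): 0.6, ("router-3", "user-V"): 0.6,
        ("router-1", "user-Z"): 0.35,
    }
    groups = overlapping_communities(edges)  # user-Z lands in both router-1 and router-2 groups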

[0089] From these communities, users can be treated similarly by an authentication system according to their shared characteristics and network behavior. For example, user X 605 and user Y 606 may both belong to community A 610, which may be based, in part, on their network behavior being highly correlated to router-1 601. If an attacker were to infiltrate the network through router-1 601 and compromise any associated devices, the authentication system may force re-authentication for all devices belonging to community A 610, including devices of user X 605 and devices of user Y 606. In addition, the authentication system may further identify any risk features associated with an attack, which may affect peers of another community that is found to be associated with the online event. For example, an online event may include command sequence 604, which may be actions executed by a malicious program. The command sequence may be associated with community B 620, which includes user Z 606. As such, user Z 606 may also be forced to re-authenticate.

[0090] Embodiments of the invention provide a number of technical advantages. For example, the authentication system may continuously monitor online events for anomalies that may warrant further investigation, such as suspicious sequences of commands executed by devices. The system can investigate anomalous behavior using multiple predictive models, which in combination may identify threats associated with different types of risk that occur in a network. Unlike other authentication systems, the authentication system described allows for identifying compromised accounts and devices prior to any indicative behavior being observed, and can do so for an entire group of related accounts rather than individually. In addition, the system can correct for and mitigate false-positive alerts by calibrating the different components of a multi-component artificial intelligence system.

[0091] As described, the inventive service may involve implementing one or more functions, processes, operations or method steps. In some embodiments, the functions, processes, operations or method steps may be implemented as a result of the execution of a set of instructions or software code by a suitably-programmed computing device, microprocessor, data processor, or the like. The set of instructions or software code may be stored in a memory or other form of data storage element which is accessed by the computing device, microprocessor, etc. In other embodiments, the functions, processes, operations or method steps may be implemented by firmware or a dedicated processor, integrated circuit, etc.

[0092] Any of the software components or functions described in this application may be implemented as software code to be executed by a processor using any suitable computer language such as, for example, Python, Java, C++ or Perl using, for example, conventional or object-oriented techniques. The software code may be stored as a series of instructions, or commands on a computer-readable medium, such as a random access memory (RAM), a read-only memory (ROM), a magnetic medium such as a hard-drive or a floppy disk, or an optical medium such as a CD-ROM. Any such computer-readable medium may reside on or within a single computational apparatus, and may be present on or within different computational apparatuses within a system or network.

[0093] While certain exemplary embodiments have been described in detail and shown in the accompanying drawings, it is to be understood that such embodiments are merely illustrative of and not intended to be restrictive of the broad invention, and that this invention is not to be limited to the specific arrangements and constructions shown and described, since various other modifications may occur to those with ordinary skill in the art.

[0094] As used herein, the use of "a", "an" or "the" is intended to mean "at least one", unless specifically indicated to the contrary.