Title:
AUTOMATED PREDICTIVE MODELING AND FRAMEWORK
Document Type and Number:
WIPO Patent Application WO/2017/139237
Kind Code:
A1
Abstract:
Systems and methods of a predictive framework are provided. The predictive framework comprises plural neural layers of adaptable, executable neurons. Neurons accept one or more input signals and produce an output signal that may be used by an upper-level neural layer. Input signals are received by an encoding neural layer, where there is a 1:1 correspondence between an input signal and an encoding neuron. Input signals are received at the encoding layer and processed successively by the various neural layers. An objective function utilizes the output signals of the topmost neural layer to generate predictive results for the data set according to an objective. In one embodiment, the objective is to determine the likelihood of user interaction with regard to a specific item of content in a set of search results, or the likelihood of user interaction with regard to any item of content in a set of search results.

Inventors:
SHAN YING (US)
HOENS THOMAS RYAN (US)
JIAO JIAN (US)
WANG HAIJING (US)
YU DONG (US)
MAO JC (US)
Application Number:
PCT/US2017/016759
Publication Date:
August 17, 2017
Filing Date:
February 06, 2017
Assignee:
MICROSOFT TECHNOLOGY LICENSING LLC (US)
International Classes:
G06N3/04; G06Q30/02
Foreign References:
US20140279773A1 (2014-09-18)
Other References:
LI DENG ET AL: "Deep Learning: Methods and Applications", FOUNDATIONS AND TRENDS IN SIGNAL PROCESSING, vol. 7, no. 3-4, 30 June 2014 (2014-06-30), pages 197 - 387, XP055365440, ISSN: 1932-8346, DOI: 10.1561/2000000039
KAIMING HE ET AL: "Deep Residual Learning for Image Recognition", 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 10 December 2015 (2015-12-10), arXiv, pages 1 - 12, XP055353100, Retrieved from the Internet [retrieved on 20170309]
YOAV GOLDBERG: "A Primer on Neural Network Models for Natural Language Processing", 5 October 2015 (2015-10-05), XP055273933, Retrieved from the Internet [retrieved on 20160520]
Attorney, Agent or Firm:
MINHAS, Sandip et al. (US)
Claims:

1. A computer-implemented framework for providing predicted results, comprising: a plurality of neural layers comprising a plurality of middle neural layers and an encoding layer, wherein:

each neural layer comprises a plurality of neurons, each neuron comprising an executable object that accepts one or more inputs and generates an output; and

the encoding layer is the first neural layer, the encoding layer comprising a plurality of encoding neurons having a 1 : 1 correspondence with a plurality of input signals for a set of data to evaluate; and

an objective function that, in execution, determines predicted results from the output signals of a topmost neural layer of the plurality of neural layers according to a predetermined objective;

wherein, in execution, the framework obtains input signals for a set of data to evaluate by way of the encoding layer and processes the input signals successively through the plurality of neural layers to a topmost neural layer; and

wherein the objective function determines the predicted results from the output signals of the topmost neural layer according to the predetermined objective and provides the predicted results to a requesting party.

2. The computer-implemented framework of Claim 1, wherein, in execution, the framework further:

iteratively processes a plurality of sets of training data and provides predicted results from each set of training data to a results analysis process;

receives corrective data from the results analysis process; and

back-propagates the corrective data through the framework to update a current model of the framework.

3. The computer-implemented framework of Claim 2, wherein the current model of the framework is a validated model that meets predetermined accuracy thresholds.

4. The computer-implemented framework of Claim 3, wherein a second neural layer comprises a condensing layer, the condensing layer comprising a plurality of condensing neurons having a 1 : 1 correspondence with a plurality of output signals from the plurality of encoding neurons of the encoding layer as input signals to the condensing neurons.

5. The computer-implemented framework of Claim 4, wherein each condensing neuron is configured to reduce the dimensionality of the input signal into a denser form.

6. The computer-implemented framework of Claim 4, wherein a third neural layer comprises a residual layer, the residual layer comprising a plurality of residual neurons having a 1 : 1 correspondence with a plurality of output signals from the plurality of condensing neurons of the condensing layer as input signals to the residual neurons.

7. The computer-implemented framework of Claim 6, wherein each residual neuron is configured to add back elements of an input signal after two layers of rectified linear operations (ReLU) are applied to the input signal, thereby identifying a per-element maximum operator.

8. The computer-implemented framework of Claim 1, wherein the predetermined objective of the objective function is to determine the likelihood of user interaction with regard to a specific item of content in a set of search results.

9. The computer-implemented framework of Claim 1, wherein the specific item of content in the set of search results comprises an advertisement in the set of search results.

10. The computer-implemented framework of Claim 1, wherein the specific item of content in the set of search results comprises a sponsored search result in the set of search results.

11. The computer-implemented framework of Claim 1, wherein the predetermined objective of the objective function is to determine the likelihood of user interaction with regard to any item of content in a set of search results.

12. A computer system configured to generate predicted results with regard to base data, the computer system comprising a processor and a memory, wherein the processor executes instructions stored in the memory as part of or in conjunction with additional executable components to generate the predicted results, comprising:

an executable predictive framework, wherein the predictive framework comprises:

a plurality of neural layers comprising a plurality of middle neural layers and an encoding layer, wherein: each neural layer comprises a plurality of neurons, each neuron comprising an executable object that accepts one or more inputs and generates an output; and

the encoding layer is the first neural layer and comprises a plurality of encoding neurons having a 1 : 1 correspondence with a plurality of input signals for a set of data to evaluate; and

an objective function that, in execution, determines predicted results from the output signals of a topmost neural layer of the plurality of neural layers according to a predetermined objective;

wherein, in execution, the framework obtains input signals for a set of data to evaluate by way of the encoding layer and processes the input signals successively through the plurality of neural layers to a topmost neural layer; and

wherein the objective function determines the predicted results from the output signals of the topmost neural layer according to the predetermined objective and provides the predicted results to a requesting party; and a results analysis module, wherein in execution the results analysis module accepts predicted results for a set of data, determines the accuracy of the predicted results in view of actual results regarding the set of data, and generates corrective data for the framework according to the predicted results and the actual results; wherein the predictive framework back-propagates the corrective data through the framework to update a current model of the predictive framework.

13. The computer system of Claim 12, wherein, in execution, the information service module further:

determines a user profile of the first user according to the dynamic user identifier of a plurality of user profiles associated with the first user; and

stores the user information in the data store in association with the determined user profile of the first user.

14. The computer system of Claim 13, wherein the current model of the predictive framework is a validated model that meets predetermined accuracy thresholds of predicted results.

15. A computer-implemented method for determining predicted results for a set of data, the method comprising: providing an executable predictive framework having a validated model, the executable predictive framework comprising:

a plurality of neural layers comprising a plurality of middle neural layers and an encoding layer, wherein:

each neural layer comprises a plurality of neurons, each neuron comprising an executable object that accepts one or more inputs and generates an output; and

the encoding layer is the first neural layer and comprises a plurality of encoding neurons having a 1 : 1 correspondence with a plurality of input signals for a set of data to evaluate; and

an objective function that, in execution, determines predicted results from the output signals of a topmost neural layer of the plurality of neural layers according to a predetermined objective;

wherein, in execution, the framework obtains input signals for a set of data to evaluate by way of the encoding layer and processes the input signals successively through the plurality of neural layers to a topmost neural layer; and

wherein the objective function determines the predicted results from the output signals of the topmost neural layer according to the predetermined objective and provides the predicted results to a requesting party;

obtaining input signals corresponding to a set of data for processing by the predictive framework;

processing the set of data by the predictive framework;

obtaining the predictive results from the objective function; and

providing the predictive results to a requesting party.

Description:
AUTOMATED PREDICTIVE MODELING AND FRAMEWORK

Background

[0001] A challenge for search engine providers is to be able to predict the likelihood that a person will interact with a given item of content or, stated in the context of multiple items of a search results page, which of all the search results and/or items of content in a search results page will the person select or interact with?

[0002] In order to generate probabilities/likelihoods regarding user interaction with any given item of content on a search results page, or a specific item of content, search engine providers utilize and combine a wide range of criteria, conditions, and/or factors into a formula (or set of formulae) to generate the various probabilities and likelihoods of user interaction. However, while the results (probabilistic determinations) of the formulae are determined according to query logs and training models, the formulae are human-crafted by one or more persons. These "encoders" select from the various signals (i.e., the criteria, conditions, and/or factors that are available to the search engine in regard to receiving a query) and, based on their expertise and intuition, determine how to interpret, condition, combine and weight the selected signals, and then produce the formula (or formulae) that generates a number representing a probability or likelihood of user interaction.

[0003] Obviously, when the formulae are generated according to a particular person's (or group of persons') expertise, experience and intuition, expansion and modifications are also the product of intuition and experimentation. Also, if an encoder leaves the group or company, a void is created and institutional knowledge regarding "how" and "why" a formula is crafted in a certain manner is often lost. For example, if a particular signal is no longer available, or if additional signals become available, modifying a given formula requires the expertise, experimentation, and intuition that were needed to generate the formula in the first place, and the person who originally crafted the formula may or may not still be available to assist. In short, these human-crafted formulae are fragile and unmanageable.

Summary

[0004] The following Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. The Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

[0005] According to aspects of the disclosed subject matter, systems and methods for providing a predictive framework are provided. The predictive framework comprises plural neural layers of adaptable, executable neurons. Neurons accept one or more input signals and produce an output signal that may be used by an upper-level neural layer. Input signals are received by an encoding neural layer, where there is a 1 : 1 correspondence between an input signal and an encoding neuron. Input signals for a set of data are received at the encoding layer and processed successively by the plurality of neural layers. An objective function utilizes the output signals of the topmost neural layer to generate predictive results for the data set according to an objective. In one embodiment, the objective is to determine the likelihood of user interaction with regard to a specific item of content in a set of search results, or the likelihood of user interaction with regard to any item of content in a set of search results.

[0006] According to additional aspects of the disclosed subject matter, a computer-implemented framework for providing predicted results is presented. The framework comprises a plurality of neural layers including a plurality of middle neural layers and an encoding layer. The framework also comprises an objective function. Regarding the plurality of neural layers, each neural layer comprises a plurality of neurons, where each neuron is an executable object that accepts one or more inputs and generates an output. The encoding layer is the first neural layer and comprises a plurality of encoding neurons having a 1 : 1 correspondence with a plurality of input signals for a set of data to evaluate/process. The objective function, in execution, determines predicted results from the output signals of a topmost neural layer of the plurality of neural layers according to a predetermined objective. In operation, the objective function determines the predicted results from the output signals of the topmost neural layer according to the predetermined objective and provides the predicted results to a requesting party.

[0007] According to further aspects of the disclosed subject matter, a computer system configured to generate predicted results with regard to base data is presented. The computer system includes a processor and a memory, where the processor executes instructions stored in the memory as part of or in conjunction with additional executable components to generate the predicted results. The additional components include an executable predictive framework and a results analysis module. Regarding the framework, the framework includes a plurality of neural layers comprising a plurality of middle neural layers and an encoding layer. Each neural layer comprises a plurality of neurons, where each neuron is an executable object that accepts one or more inputs and generates an output. The encoding layer is the first neural layer and comprises a plurality of encoding neurons having a 1 : 1 correspondence with a plurality of input signals for a set of data to evaluate. The objective function determines predicted results from the output signals of the topmost neural layer of the framework according to a predetermined objective. In operation, the framework obtains input signals by way of the encoding layer and processes the input signals successively through the plurality of neural layers to the topmost neural layer where the objective function takes the output signals and generates predictive data. Additionally, a results analysis module obtains the predicted results for a set of data, determines the accuracy of the predicted results in view of actual results regarding the set of data, and generates corrective data for the framework according to the predicted results and the actual results. The corrective data is then back-propagated through the framework to update the current model of the predictive framework.

[0008] According to still further aspects of the disclosed subject matter, a computer-implemented method for determining predicted results for a set of data is presented. The method includes the step of providing an executable predictive framework, where the framework operates according to a validated model. The framework comprises a plurality of neural layers comprising a plurality of middle neural layers and an encoding layer. Each neural layer comprises a plurality of neurons, where each neuron is an executable object that accepts one or more inputs and generates an output. The encoding layer is the first neural layer and comprises a plurality of encoding neurons having a 1 : 1 correspondence with a plurality of input signals for a set of data to evaluate. An objective function determines predicted results from the output signals of the topmost neural layer of the plurality of neural layers according to a predetermined objective. In execution, the framework obtains input signals for a set of data to evaluate by way of the encoding layer and processes the input signals successively through the plurality of neural layers to a topmost neural layer. The method further includes obtaining input signals corresponding to a set of data for processing by the predictive framework. The set of data is processed by the predictive framework. The predictive results are obtained from the objective function and provided to a requesting party.

Brief Description of the Drawings

[0009] The foregoing aspects and many of the attendant advantages of the disclosed subject matter will become more readily appreciated as they are better understood by reference to the following description when taken in conjunction with the following drawings, wherein:

[0010] Figure 1 is a pictorial diagram illustrating an exemplary network environment suitable for implementing aspects of the disclosed subject matter;

[0011] Figure 2 is a block diagram illustrating the exemplary framework with regard to training and validation of the framework;

[0012] Figure 3 is a block diagram illustrating an exemplary environment in which predictive results are provided to a computer user in regard to a request data set;

[0013] Figure 4 is a flow diagram illustrating an exemplary routine suitable for training a predictive framework, such as the framework of Figure 1, according to aspects of the disclosed subject matter;

[0014] Figure 5 is a flow diagram illustrating an exemplary routine for providing predictive results in regard to an actual set of data;

[0015] Figure 6 is a block diagram illustrating an exemplary computer readable medium encoded with instructions to generate predictive results of a request data set; and

[0016] Figure 7 is a block diagram illustrating an exemplary computing device configured to provide a framework for generating predictive results according to aspects of the disclosed subject matter.

Detailed Description

[0017] For purposes of clarity and definition, the term "exemplary," as used in this document, should be interpreted as serving as an illustration or example of something, and it should not be interpreted as an ideal or a leading illustration of that thing. Stylistically, when a word or term is followed by "(s)", the meaning should be interpreted as indicating the singular or the plural form of the word or term, depending on whether there is one instance of the term/item or whether there is one or multiple instances of the term/item. For example, the term "user(s)" should be interpreted as one or more users.

[0018] By way of definition as used herein, the term "neuron" refers to an executable object (a software object, a hardware object, or a combination of the two) that accepts one or more inputs and generates an output. Generally speaking, a neuron performs interpretive, combinatorial, and transformative operations on the various input signals to produce an output signal. Each neuron executes according to a set of predetermined functions (which may be unique or shared among other neurons of the neural layer in which it resides) in order to identify some characteristic from the inputs. Each neuron also includes state information, reflecting the current state of functionality as well as weightings assigned to input signals. Neurons may also include prior state information reflecting previous states of the neurons. As a whole, the neurons of each neural layer collectively comprise a current model of the predictive framework which governs how predicted data is generated from input signals.

[0019] The term "neural layer" refers to an arrangement or collection of neurons that accepts inputs/signals from neurons of a lower level (or inputs/signals that have not yet been processed by a neural layer) and generates outputs/signals for an upper layer. Generally, neurons of the same neural layer do not interact with each other with regard to signals, i.e., one neuron of a neural layer does not provide input to or utilize output from another neuron of the same neural layer.

[0020] In contrast to a human-crafted formula for generating predictive results, a framework for automated predictive modeling and generation is presented. Advantageously, input is accepted without regard to human combinatorial efforts. According to aspects of the disclosed subject matter, deep learning techniques employing multiple layers of neural networks are utilized to automatically combine, transform and process multiple input signals in order to identify probabilities of user interaction with an item of content, such as (by way of illustration and not limitation): search results, advertisements, information links, a search results page, and the like. Additionally, while much of the discussion of the disclosed subject matter is cast in regard to predicting user interaction with regard to one or more items of content among search results, it is anticipated that the disclosed subject matter may be suitably applied to other scenarios in which predictive results are desired, utilizing multiple neural layers that include an encoding layer having a 1 : 1 correspondence between input signals and encoding neurons, training models, a suitably configured objective function and back-propagation of results. By way of illustration and not limitation, the objective function of a predictive framework may be suitably configured to conduct regression analysis, classifications, and rankings in a variety of fields including image processing, speech processing, finance, and the like.

[0021] As suggested above, while the number of possible combinations of signals increases logarithmically with human-crafted formulae, according to the disclosed subject matter the number of human-crafted elements (i.e., neurons) increases linearly with the number of signals used. More particularly, rather than utilizing human-crafted and curated combinations to input signals into the framework, the disclosed subject matter utilizes encoding neurons in a 1 : 1 (one to one) correspondence with input signals. In other words, for each input signal, there is an encoding neuron in what is referred to as the encoding layer. An encoding neuron is associated with a particular signal and encodes that input signal into a format that can be utilized by the other neural layers of the framework, irrespective of any other input signal that is input to the framework. Indeed, combinations of signals in order to identify the objective (e.g., the likelihood of user interaction with an item of content) are automatically generated and processed through the various neural layers of the framework. As will be discussed below, the processing of input signals causes the framework to refine its operations in light of the predetermined objective according to training data sets, and the framework is then validated according to validation data sets.

[0022] Turning then to the figures, Figure 1 is a block diagram illustrating an exemplary framework 100 for modeling and generating predictive information regarding whether a person/computer user will interact with an item of content, or with one or more items of content in a search results page. According to aspects of the disclosed subject matter, the framework 100 includes a plurality of neural layers 102-112 and an objective function 114. Generally speaking, each of the neural layers comprises a plurality of neurons, such as neurons 118, e1-en, C1-Cm, and R1-Ro. As discussed above, each neuron executes according to a set of predetermined functions (which may be unique or shared among other neurons of the neural layer in which it resides) in order to identify some characteristic from the inputs that it receives. The output of a neuron from a lower level is then provided as input to one or more neurons of the neural layer immediately above, where "above" and "below" (and "upper" and "lower") refer to the processing order of neural layers. For example, with regard to box 116, neurons L1-L3 are on a lower neural layer 120 and the outputs of these neurons are input signals to neurons U1-U3 on the upper neural layer 122. In this example, the lower neural layer 120 is "below" the "upper" neural layer 122 in regard to the processing flow of data.

[0023] In regard to the input that each neuron receives, each of the various neurons may individually associate and control the weight that is attributed to the input signals that the neuron receives. For example, neuron U1, which receives inputs from neurons L1-L3, may associate different weighting values to each of the input signals, i.e., neuron U1 may weight (i.e., assign a weight to) the input signal from neuron L2 as zero, effectively ignoring the input altogether, and associate other weights to the input signals from neurons L1 and L3.
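
By way of illustration and not limitation, the per-neuron weighting described above can be sketched as follows. The `Neuron` class, its `fire` method, the rectifying activation, and all numeric values are hypothetical names and values chosen for this sketch; the disclosure does not prescribe them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Neuron:
    """Hypothetical executable neuron: weights its inputs, then transforms the sum."""
    weights: List[float]                  # one weight per input signal (state information)
    activation: Callable[[float], float]  # assumed transformation of the weighted sum

    def fire(self, inputs: List[float]) -> float:
        if len(inputs) != len(self.weights):
            raise ValueError("expected one input per weight")
        total = sum(w * x for w, x in zip(self.weights, inputs))
        return self.activation(total)


# Illustrative output signals of lower-layer neurons L1, L2 and L3.
lower_outputs = [0.5, 9.0, 2.0]

# U1 assigns a weight of zero to L2, so L2's signal cannot influence U1's output.
u1 = Neuron(weights=[0.4, 0.0, 0.1], activation=lambda t: max(0.0, t))
u1_output = u1.fire(lower_outputs)
```

Because the weight on L2 is zero, changing L2's output to any other finite value leaves `u1_output` unchanged, which is the "effectively ignoring" behavior described above.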

[0024] As indicated above, beginning with input signals S1-Sn, these signals are processed through the various neural levels of the framework 100, i.e., a deep learning framework, each neural layer producing more and more abstract results (output signals) until, at the topmost level, the objective function 114 utilizes (according to its own functions, weighting factors, and heuristics) the output signals of the topmost neural layer 112 according to a predetermined objective, e.g., to generate a likelihood of user interaction with one or more items of content within a set of search results or, alternatively, a likelihood of user interaction with regard to a particular item of content in a set of search results.

[0025] By way of illustration and not limitation, input signals S1-Sn may correspond to a variety of factors including: user identification (i.e., the user to which the set of search results are presented); query; entity/subject matter of the query; keywords; advertiser; advertisement campaign; time of day; day of week; holiday; season; gender of the user; advertisement; and the like. Indeed, these signals often correspond to the same signals that encoders use, in combination, to generate their predictions as to user interaction and, in many cases, correspond to thousands of signals for consideration. These signals are inputs to a first neural layer 102, referred to as an encoding layer.

[0026] As indicated above and according to aspects of the disclosed subject matter, the encoding layer 102 typically has a 1 : 1 correspondence between input signals and neurons. Each neuron e1-en is encoded such that it can receive the corresponding input signal and encode it such that the data/input signal can be utilized by other neurons in the framework 100. By way of illustration and not limitation, an input signal representing a query submitted by a computer user may be encoded into 3-letter N-grams and mapped into a matrix of all potential 3-letter N-grams (a matrix or vector of approximately 56,000 entries). Of course, the content of certain input signals, such as an input signal representing the time of day, may not require encoding. In such cases, the encoding neuron may be optionally excluded or be encoded to do nothing but pass the information through.
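
By way of illustration and not limitation, the 3-letter N-gram encoding of a query can be sketched as follows. The alphabet is an assumption made for this sketch (lowercase letters, digits, a space, and "#" as a boundary marker, giving 38 × 38 × 38 = 54,872 slots); the disclosure states only that the full vector has approximately 56,000 entries and does not name the alphabet. The function names `trigram_slot` and `encode_query` are likewise hypothetical.

```python
import string
from typing import Dict

# Assumed alphabet: lowercase letters, digits, space, and '#' as a boundary marker.
ALPHABET = string.ascii_lowercase + string.digits + " #"
BASE = len(ALPHABET)        # 38 symbols
VECTOR_SIZE = BASE ** 3     # 54,872 possible 3-letter N-grams
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def trigram_slot(trigram: str) -> int:
    """Map one 3-character N-gram to its position in the full vector."""
    a, b, c = (CHAR_INDEX[ch] for ch in trigram)
    return (a * BASE + b) * BASE + c


def encode_query(query: str) -> Dict[int, int]:
    """Return a sparse vector {slot: count} of the query's 3-letter N-grams."""
    # Drop characters outside the assumed alphabet, then mark both boundaries.
    cleaned = "".join(ch for ch in query.lower() if ch in CHAR_INDEX and ch != "#")
    padded = "#" + cleaned + "#"
    sparse: Dict[int, int] = {}
    for i in range(len(padded) - 2):
        slot = trigram_slot(padded[i:i + 3])
        sparse[slot] = sparse.get(slot, 0) + 1
    return sparse


encoded = encode_query("cat")   # N-grams: "#ca", "cat", "at#"
```

Only the populated slots are stored, so a short query occupies a handful of entries out of the tens of thousands available.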

[0027] A condensing layer 104, sometimes called a stacking layer, optionally included and positioned just above the encoding layer 102, is provided to reduce the sparsity or data size of the output signals from the encoding neurons e1-en. Typically, the condensing layer 104 also has a 1 : 1 correspondence between input signals (i.e., the output signals of the encoding neurons) and condensing neurons C1-Cn. By way of illustration and with regard to the example above regarding the matrix/vector of 3-letter N-grams, the matrix of N-grams will typically be an extremely sparse matrix: i.e., perhaps less than a dozen non-zero entries in an otherwise empty matrix of 56,000 entries. Obviously, an input signal that is sparse raises substantial processing and storage concerns. Accordingly, the corresponding condensing neuron of the condensing layer is tasked with reducing the sparsity or size of the input signal into a denser format without a loss of fidelity, i.e., the condensed signal accurately represents the underlying information. Of course, just as with the encoding neurons, the content of certain input signals, such as an input signal representing the time of day, may not require condensing. In such cases, the condensing neuron of a condensing layer may be optionally excluded or be encoded to do nothing but pass the information through.
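
By way of illustration and not limitation, one way a condensing neuron might turn a sparse signal into a short dense one is a linear projection, sketched below. The projection technique, the `CondensingNeuron` class, the sizes, and the random weights are all assumptions of this sketch; the disclosure does not specify how condensing is performed. In practice the weights would be learned during training, and the sketch makes no claim about fidelity.

```python
import random
from typing import Dict, List


class CondensingNeuron:
    """Hypothetical condensing neuron: sparse {slot: value} in, short dense vector out."""

    def __init__(self, input_size: int, dense_size: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.dense_size = dense_size
        # One row of weights per input slot (random here, learned in practice).
        self.rows: List[List[float]] = [
            [rng.uniform(-1.0, 1.0) for _ in range(dense_size)]
            for _ in range(input_size)
        ]

    def condense(self, sparse: Dict[int, float]) -> List[float]:
        dense = [0.0] * self.dense_size
        # Only non-zero entries contribute, so the cost follows the number of
        # populated slots rather than the full input size.
        for slot, value in sparse.items():
            row = self.rows[slot]
            for j in range(self.dense_size):
                dense[j] += value * row[j]
        return dense


# A small input size keeps the sketch quick; the text's example would use ~56,000.
neuron = CondensingNeuron(input_size=1000, dense_size=8)
dense_signal = neuron.condense({3: 1.0, 417: 2.0, 998: 1.0})
```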

[0028] A third, residual layer 106 is optionally included in the framework 100 and resides immediately above the condensing layer 104. The unique structure of each residual neuron R1-Rn (where n denotes the number of input signals) is to add back elements of an input signal after two layers of rectified linear operations (ReLU) to identify a per-element maximum operator, as shown in Figure 1 as the exploded diagram of residual neuron R1. In similar manner to the previous two neural layers, the residual layer typically has a 1 : 1 correspondence between input signals (i.e., the output signals of the condensing neurons) and residual neurons R1-Rn.
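
By way of illustration and not limitation, the add-back structure of a residual neuron can be sketched as follows. The exact placement of the two ReLU operations relative to the weights is an assumption of this sketch, as are the function names and the example weights; the figure referenced above is not reproduced here.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    """Rectified linear operation: the per-element maximum of 0 and the input."""
    return [max(0.0, x) for x in v]


def affine(w: Matrix, b: Vector, x: Vector) -> Vector:
    """Compute w @ x + b with plain lists."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def residual_neuron(x: Vector, w0: Matrix, b0: Vector, w1: Matrix, b1: Vector) -> Vector:
    """Pass x through two ReLU layers, then add the original x back element-wise."""
    hidden = relu(affine(w0, b0, x))
    transformed = relu(affine(w1, b1, hidden))
    return [t + xi for t, xi in zip(transformed, x)]


identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
no_bias = [0.0, 0.0]
signal = [1.0, -2.0]

with_identity = residual_neuron(signal, identity, no_bias, identity, no_bias)
with_zeros = residual_neuron(signal, zeros, no_bias, zeros, no_bias)
```

With all-zero weights the two ReLU layers contribute nothing and the neuron returns its input unchanged, so the layers only need to model a correction to the input signal.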

[0029] While the first layers described above typically comprise a 1 : 1 correspondence between input signals and layer neurons, the "middle" neural layers are not so constrained. Indeed, each middle neural layer may comprise many more neurons than there are input signals for the layer and each neuron may accept any number of the available input signals. According to aspects of the disclosed subject matter, a suitably configured framework, such as framework 100, may include a plurality of middle neural layers. While framework 100 illustrates five middle neural layers, including neural layers 108-112, the disclosed subject matter is not so constrained. Indeed, as the ellipses indicate, there may be any number of middle neural layers, though typically at least three are utilized to achieve the desired results.

[0030] Additionally and according to further embodiments of the disclosed subject matter, one or more neurons of a middle neural layer receive all of the output signals of the previous, lower neural layer. Indeed, it may be common that each of the neurons of a given middle neural layer receives each of the output signals of the previous neural layer as input signals. Typically, though not exclusively, neurons of a given neural layer are encoded to process the input signals in a manner that is unique among the other neurons of the neural layer. Various weightings assigned to input signals, combinatorial directives and processing, and other encoding provide diversity among the various neurons of a neural layer. Further still, the number of neurons in a first middle level, such as neuron level 108, need not match the number of neurons in another middle level, such as neuron levels 110 or 112.
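
By way of illustration and not limitation, a stack of middle neural layers in which every neuron receives every output of the layer below, and in which layer widths differ, can be sketched as follows. The widths, the random weights, the rectifying activation, and the helper names `make_layer` and `run_layer` are assumptions of this sketch.

```python
import random
from typing import List

Layer = List[List[float]]   # one list of input weights per neuron


def make_layer(n_inputs: int, n_neurons: int, rng: random.Random) -> Layer:
    """Create a layer whose neurons each hold a weight for every input signal."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)] for _ in range(n_neurons)]


def run_layer(layer: Layer, inputs: List[float]) -> List[float]:
    """Each neuron combines all inputs with its own weights (assumed ReLU activation)."""
    return [max(0.0, sum(w * x for w, x in zip(weights, inputs))) for weights in layer]


rng = random.Random(0)
lower_outputs = [0.2, -0.7, 1.5]   # three signals arriving from the layer below
widths = [6, 4, 5]                 # middle layers need not share a width

layers: List[Layer] = []
n_inputs = len(lower_outputs)
for width in widths:
    layers.append(make_layer(n_inputs, width, rng))
    n_inputs = width               # the next layer consumes this layer's outputs

signals = lower_outputs
for layer in layers:
    signals = run_layer(layer, signals)
```

Distinct weights per neuron supply the diversity among the neurons of a layer that the paragraph above describes.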

[0031] The objective function 114 accepts the output signals of the neurons of the topmost neural layer 112 and performs the final analysis of the signals to produce an output: a likelihood of user interaction with regard to the basis data (i.e., the basis from which the input signals S1-Sn are drawn). In regard to an item of content of a search result, the produced output is the likelihood of user interaction with regard to that item; in regard to a search results page, the produced output is the likelihood of user interaction with an item of content within the search results page.
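
By way of illustration and not limitation, one way for an objective function to turn the output signals of the topmost neural layer into a likelihood is a weighted sum passed through a logistic function. The disclosure does not state the form of the objective function 114; the logistic form, the name objective_function, and the parameters weights and bias are assumptions of this sketch.

```python
import math
from typing import List


def objective_function(top_outputs: List[float],
                       weights: List[float],
                       bias: float) -> float:
    """Map topmost-layer output signals to a likelihood in [0, 1].

    Assumption: a logistic (sigmoid) of a weighted sum. The two branches
    avoid overflow in math.exp for large-magnitude sums.
    """
    z = sum(w * o for w, o in zip(weights, top_outputs)) + bias
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

A weighted sum of zero yields a likelihood of 0.5, and larger sums move the likelihood toward 1.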

[0032] In order to ensure that the exemplary framework 100 generates reliable results, the framework must be trained. Turning to Figure 2, Figure 2 is a block diagram illustrating the exemplary framework 100 with regard to training and validation of the framework. Indeed, in order to train the framework, one or more sets of training data 202-206 are utilized. More particularly, input signals corresponding to the sets of training data are identified and supplied to the framework, which processes the data as set forth above. The objective function 114, based on the processing of the input signals, generates predicted results 208. These predicted results are then supplied to a results analysis service 212 that accesses the actual results corresponding to the current set of training data, such as actual results 218. Based on the actual results, the results analysis service 212 determines/identifies corrective data with regard to the predicted results, and provides the corrective data 210 to the framework 100 (via the objective function 114) with regard to the current set of training data. Corrective data 210 includes indications as to what is incorrect with regard to the predicted results 208. For its part, the framework back-propagates the corrective data "down" through the neural layers, and in so doing, corrections to the operation of the various neurons of the various middle neuron layers may take place. These corrections may include associating different weights to the various input signals, transforming and/or combining results in a new manner, and the like.

[0033] As indicated, multiple sets of training data are used to train the framework 100 by way of processing and back-propagation of corrective data. At some point in the training of the framework, after some determined number of sets of training data have been processed, a set (or sets) of validation data is processed through the framework 100 such that the objective function 114 generates predicted results 208 for the validation data. The results analysis service 212 accesses the actual results 216 corresponding to the set of validation data 214 to determine the performance/accuracy of the current state of the framework 100. However, in validating the current state of the framework 100, the results analysis service does not provide corrective data 210 for back-propagation as it does with sets of training data. The corrective data 210 is not provided to the framework 100 because training on the validation data 214 might lead to the current model accurately predicting the results for the set of validation data 214 without being accurate with regard to other sets of data. Rather, the corrective data 210 is used to evaluate whether or not the current state of the framework 100 (referred to as the current model 220) meets one or more predetermined accuracy thresholds for predicting results. Upon determining that a current model meets or exceeds the predetermined accuracy thresholds, that model can then be used, in conjunction with the framework, for generating predictive results in regard to actual sets of data as illustrated in regard to Figure 3.

[0034] Figure 3 is a block diagram illustrating an exemplary environment 300 in which predictive results are provided to a computer user 301 in regard to a request data set 302, i.e., a data set for which predictive results are requested. As described above, various input signals are identified from the request data set 302 and supplied to the framework 100 having a current model that has been validated to meet or exceed predetermined accuracy thresholds. Indeed, the input signals are provided to the framework 100 via the encoding layer 102. The input signals are then processed by the various layers of the framework such that the objective function 114 generates predicted results 304. These predicted results are then provided to the requesting computer user 301 via a computing device 306.

[0035] As part of an on-going learning process, in addition to providing the predicted results 304 to the requesting computer user 301, the predicted results are also provided to the results analysis service 212. The results analysis service then obtains actual results 310 regarding the objective (e.g., whether a computer user interacted with an item of content), typically from a search engine that provides the results to a requesting user, generates corrective data 308 according to the actual results, and provides the corrective data to the framework for back-propagation.

[0036] Turning now to Figure 4, Figure 4 is a flow diagram illustrating an exemplary routine 400 suitable for training a predictive framework, such as framework 100, according to aspects of the disclosed subject matter. Beginning at block 402, a framework suitable for deep learning, as described above in regard to framework 100 of Figure 1, is provided. According to aspects of the disclosed subject matter, the framework includes an encoding layer 102 having a 1:1 correspondence between input signals and encoding neurons within the encoding layer. Additionally, a condensing layer 104 may also be optionally included, as well as a residual layer 106, each having a 1:1 correspondence between input signals (from the immediate lower level neural layer) and neurons.

[0037] At block 404 a first set of training data is obtained or accessed. At block 406, input signals corresponding to the currently processed training data are obtained. At block 408, the set of training data is processed by the framework 100 according to the input signals identified for the set of training data. Processing the input signals is described above in regard to Figure 1.

[0038] At block 410, the predicted results are determined according to the objective function 114 and, at block 412, an analysis of the predicted results is conducted in view/against the actual results.

[0039] At block 414, corrective data is generated in light of the predicted vs. actual results. At block 416, the corrective data is back-propagated through the various neural levels of the framework. At block 418, a next set of training data is obtained and the routine 400 returns to block 408 to continue processing training data.
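
By way of illustration and not limitation, the training portion of routine 400 may be sketched as follows. To keep the sketch short, the neural layers are collapsed into a single weighted neuron with a logistic output, so that back-propagation reduces to one gradient update per example; the corrective data is taken to be the predicted result minus the actual result. These simplifications, and the names predict, train and learning_rate, are assumptions and do not appear in the disclosure.

```python
import math
from typing import List, Tuple

# (input signals, actual result: 1.0 = interaction, 0.0 = no interaction)
Example = Tuple[List[float], float]


def predict(weights: List[float], bias: float, signals: List[float]) -> float:
    """Blocks 408-410: process input signals and produce a likelihood."""
    z = sum(w * s for w, s in zip(weights, signals)) + bias
    z = max(-60.0, min(60.0, z))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def train(training_sets: List[List[Example]], n_signals: int,
          learning_rate: float = 0.5) -> Tuple[List[float], float]:
    """Process each set of training data in turn and update the model."""
    weights = [0.0] * n_signals
    bias = 0.0
    for training_set in training_sets:             # blocks 404 and 418
        for signals, actual in training_set:       # block 406
            predicted = predict(weights, bias, signals)
            corrective = predicted - actual        # blocks 412-414
            # Block 416: propagate the correction back to the weights.
            weights = [w - learning_rate * corrective * s
                       for w, s in zip(weights, signals)]
            bias -= learning_rate * corrective
    return weights, bias


# Hypothetical data: a positive signal coincides with an interaction.
sets = [[([1.0], 1.0), ([-1.0], 0.0)] for _ in range(5)]
trained_weights, trained_bias = train(sets, n_signals=1)
```

In a multi-layer framework the same corrective value would be propagated through each neural layer in turn rather than applied to a single set of weights.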

[0040] Also shown in routine 400, periodically in conjunction with training the current model of the framework 100, validation for current workability and accuracy is also determined. At these times and as shown in decision block 420, a determination is made as to whether the current results (as determined by the current model of the framework) are within accuracy thresholds. As described above, validating a current model includes processing a set of validation data and examining the predicted results against the actual results. If the predicted results are within acceptable validation thresholds, at block 422 the framework's current model is stored as a validated model. Alternatively, if the predictive results are not within predetermined accuracy thresholds, the current model of the framework is not stored as a validated model.
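
By way of illustration and not limitation, the validation check of decision block 420 and block 422 may be sketched as follows. The disclosure does not name an accuracy measure; the sketch assumes the fraction of validation examples whose predicted likelihood falls on the correct side of 0.5. The names validate, store_if_valid and accuracy_threshold are hypothetical. No corrective data is generated or back-propagated here.

```python
from typing import Callable, Dict, List, Tuple

# (input signals, actual result: 1.0 = interaction, 0.0 = no interaction)
Example = Tuple[List[float], float]


def validate(predict: Callable[[List[float]], float],
             validation_set: List[Example]) -> float:
    """Return the fraction of validation examples predicted correctly.

    The model is only read, never updated, so the validation data does
    not influence training.
    """
    if not validation_set:
        raise ValueError("validation set must not be empty")
    correct = 0
    for signals, actual in validation_set:
        predicted_label = 1.0 if predict(signals) >= 0.5 else 0.0
        if predicted_label == actual:
            correct += 1
    return correct / len(validation_set)


def store_if_valid(model_name: str, accuracy: float,
                   accuracy_threshold: float,
                   validated_models: Dict[str, float]) -> bool:
    """Blocks 420-422: store the model only if it meets the threshold."""
    if accuracy >= accuracy_threshold:
        validated_models[model_name] = accuracy
        return True
    return False
```

A model that fails the check is simply not stored, and training may continue with the next set of training data.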

[0041] It should be appreciated that while Figure 4 illustrates the steps of 420 and 422 returning to block 418, this is for illustrative purposes of the routine 400 and should not be construed as a limiting embodiment. Indeed, validation of a current framework model may operate in a parallel thread with regard to processing sets of training data, or be interspersed within the sets of training data. Of course, when processing a set of validation data (as described above), the corrective data and back-propagation do not typically occur, so that the validation sets are used as validation sets, not training sets. Moreover, routine 400 does not include an ending step. While framework training may come to a close, the lack of an ending step is indicative of ongoing training of the framework, including using actual request data for training purposes.

[0042] Turning now to Figure 5, Figure 5 is a flow diagram illustrating an exemplary routine 500 for providing predictive results in regard to an actual set of data. Beginning at block 502, a framework 100 with a validated model is provided. As suggested above, a validated model corresponds to a working model that has been validated as providing predictive results within predetermined accuracy thresholds.

[0043] At block 504, request data 302 is obtained for processing and, at block 506, input signals from the request data are obtained. At block 508, the request data, as represented by the obtained input signals, is processed by the framework 100, beginning with the encoding layer 102. At block 510, predictive results 304 are determined by the objective function 114 of the framework 100. At block 512, the predictive results are provided to the requesting party 301.

[0044] In addition to providing the predictive results and as an optional extension of routine 500, at block 514 actual results 310 corresponding to the request data 302 are accessed and, at block 516, an analysis of the predictive results is made by the results analysis service in view of the actual results. At block 518, corrective data 308 is generated and, at block 520, the corrective data is back-propagated through the framework 100 as described above in regard to Figures 1 and 3. Thereafter, routine 500 terminates.
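
By way of illustration and not limitation, routine 500 and its optional extension may be sketched as follows. The class ValidatedModel is a hypothetical stand-in for the framework 100 holding a validated model; it is reduced to a single weighted neuron with a logistic output, and the corrective data is assumed to be the predicted result minus the actual result. None of these names or simplifications appears in the disclosure.

```python
import math
from typing import List, Optional


class ValidatedModel:
    """Hypothetical single-neuron stand-in for a validated framework model."""

    def __init__(self, weights: List[float], bias: float,
                 learning_rate: float = 0.1) -> None:
        self.weights = list(weights)
        self.bias = bias
        self.learning_rate = learning_rate

    def predict(self, signals: List[float]) -> float:
        z = sum(w * s for w, s in zip(self.weights, signals)) + self.bias
        z = max(-60.0, min(60.0, z))  # clamp so math.exp cannot overflow
        return 1.0 / (1.0 + math.exp(-z))

    def back_propagate(self, signals: List[float], corrective: float) -> None:
        self.weights = [w - self.learning_rate * corrective * s
                        for w, s in zip(self.weights, signals)]
        self.bias -= self.learning_rate * corrective


def handle_request(model: ValidatedModel, signals: List[float],
                   actual: Optional[float] = None) -> float:
    """Blocks 504-512, plus optional blocks 514-520 when `actual` is known."""
    predicted = model.predict(signals)                 # blocks 508-510
    if actual is not None:                             # block 514
        corrective = predicted - actual                # blocks 516-518
        model.back_propagate(signals, corrective)      # block 520
    return predicted                                   # block 512
```

The returned value is always the prediction made before any update, so feedback from the actual result affects only subsequent requests.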

[0045] Regarding routines 400 and 500 described above, as well as other processes described herein, while these routines/processes are expressed in regard to discrete steps, these steps should be viewed as being logical in nature and may or may not correspond to any specific actual and/or discrete steps of a given implementation. Also, the order in which these steps are presented in the various routines and processes, unless otherwise indicated, should not be construed as the only order in which the steps may be carried out. Moreover, in some instances, some of these steps may be combined and/or omitted. Those skilled in the art will recognize that the logical presentation of steps is sufficiently instructive to carry out aspects of the claimed subject matter irrespective of any particular development or coding language in which the logical instructions/steps are encoded.

[0046] Of course, while these routines and processes include various novel features of the disclosed subject matter, other steps (not listed) may also be carried out in the execution of the subject matter set forth in these routines. Those skilled in the art will appreciate that the logical steps of these routines may be combined together or be comprised of multiple steps. Steps of the above-described routines may be carried out in parallel or in series. Often, but not exclusively, the functionality of the various routines is embodied in software (e.g., applications, system services, libraries, and the like) that is executed on one or more processors of computing devices, such as the computing device described in regard to Figure 7 below. Additionally, in various embodiments all or some of the various routines may also be embodied in executable hardware modules including, but not limited to, systems on a chip (SoCs), codecs, specially designed processors and/or logic circuits, and the like on a computer system.

[0047] As suggested above, these routines and/or processes are typically embodied within executable code modules comprising routines, functions, looping structures, selectors and switches such as if-then and if-then-else statements, assignments, arithmetic computations, and the like. However, as suggested above, the exact implementation in executable statements of each of the routines is based on various implementation configurations and decisions, including programming languages, compilers, target processors, operating environments, and the linking or binding operation. Those skilled in the art will readily appreciate that the logical steps identified in these routines may be implemented in any number of ways and, thus, the logical descriptions set forth above are sufficiently enabling to achieve similar results.

[0048] While many novel aspects of the disclosed subject matter are expressed in routines embodied within applications (also referred to as computer programs), apps (small, generally single or narrow purposed applications), and/or methods, these aspects may also be embodied as computer-executable instructions stored by computer-readable media, also referred to as computer-readable storage media, which are articles of manufacture. As those skilled in the art will recognize, computer-readable media can host, store and/or reproduce computer-executable instructions and data for later retrieval and/or execution. When the computer-executable instructions that are hosted or stored on the computer-readable storage devices are executed by a processor of a computing device, the execution thereof causes, configures and/or adapts the executing computing device to carry out various steps, methods and/or functionality, including those steps, methods, and routines described above in regard to the various illustrated routines. Examples of computer-readable media include, but are not limited to: optical storage media such as Blu-ray discs, digital video discs (DVDs), compact discs (CDs), optical disc cartridges, and the like; magnetic storage media including hard disk drives, floppy disks, magnetic tape, and the like; memory storage devices such as random access memory (RAM), read-only memory (ROM), memory cards, thumb drives, and the like; cloud storage (i.e., an online storage service); and the like. While computer-readable media may reproduce and/or cause to deliver the computer-executable instructions and data to a computing device for execution by one or more processors via various transmission means and mediums, including carrier waves and/or propagated signals, for purposes of this disclosure computer-readable media expressly excludes carrier waves and/or propagated signals.

[0049] Turning to Figure 6, Figure 6 is a block diagram illustrating an exemplary computer-readable medium encoded with instructions to generate predictive results of a request data set as described above. More particularly, the implementation 600 comprises a computer-readable medium 608 (e.g., a CD-R, DVD-R or a platter of a hard disk drive), on which is encoded computer-readable data 606. This computer-readable data 606 in turn comprises a set of computer instructions 604 configured to operate according to one or more of the principles set forth herein. In one such embodiment 602, the processor-executable instructions 604 may be configured to perform a method, such as at least some of the exemplary methods 400 and 500, for example. In another such embodiment, the processor-executable instructions 604 may be configured to implement a system, such as at least some of the exemplary system 700, as described below. Many such computer-readable media may be devised, by those of ordinary skill in the art, which are configured to operate in accordance with the techniques presented herein.

[0050] Turning now to Figure 7, Figure 7 is a block diagram illustrating an exemplary computing device 700 configured to provide a framework for generating predictive results according to aspects of the disclosed subject matter. The exemplary computing device 700 includes one or more processors (or processing units), such as processor 702, and a memory 704. The processor 702 and memory 704, as well as other components, are interconnected by way of a system bus 710. The memory 704 typically (but not always) comprises both volatile memory 706 and non-volatile memory 708. Volatile memory 706 retains or stores information so long as the memory is supplied with power. In contrast, non-volatile memory 708 is capable of storing (or persisting) information even when a power supply is not available. Generally speaking, RAM and CPU cache memory are examples of volatile memory 706 whereas ROM, solid-state memory devices, memory storage devices, and/or memory cards are examples of non-volatile memory 708.

[0051] As will be appreciated by those skilled in the art, the processor 702 executes instructions retrieved from the memory 704 (and/or from computer-readable media, such as computer-readable media 600 of Figure 6) in carrying out various functions of predictive framework 100 as set forth above. The processor 702 may be comprised of any of a number of available processors such as single-processor, multi-processor, single-core units, and multi-core units.

[0052] Further still, the illustrated computing device 700 includes a network communication component 712 for interconnecting this computing device with other devices and/or services over a computer network, including other user devices, such as user computing devices 306 of Figure 3. The network communication component 712, sometimes referred to as a network interface card or NIC, communicates over a network using one or more communication protocols via a physical/tangible (e.g., wired, optical, etc.) connection, a wireless connection, or both. As will be readily appreciated by those skilled in the art, a network communication component, such as network communication component 712, is typically comprised of hardware and/or firmware components (and may also include or comprise executable software components) that transmit and receive digital and/or analog signals over a transmission medium (i.e., the network).

[0053] The exemplary computing device 700 further includes a predictive framework 100 that comprises plural neural layers 714. Moreover, these neural layers include at least an encoding layer 102, as well as an optional condensing layer 104 and an optional residual layer 106. Further still, in addition to the encoding, condensing and residual layers, the predictive framework 100 includes plural middle layers, as discussed above in regard to Figure 1.

[0054] Also included as part of the predictive framework 100 of the exemplary computing device 700 is an objective function 114. The objective function 114 utilizes the output signals of the topmost neural layer to determine predictive results regarding the basis data in accordance with a predetermined objective. As discussed above, in one embodiment the objective function determines predictive results corresponding to the likelihood of user interaction with regard to an item of content of search results, or corresponding to the likelihood of user interaction with regard to a search results page. By way of illustration and not limitation, the objective function may be trained to determine the likelihood of user interaction with regard to an advertisement on a search results page for a given search query, or the likelihood of user interaction with regard to a paid search result within a search results page. Of course, as should be readily appreciated, the predictive framework 100 could be advantageously trained according to any number of predetermined objectives such that the objective function generates predictive or interpretive results with regard to a variety of objectives outside of the domain of search results.

[0055] As shown in Figure 7, the exemplary computing device 700 further includes a results analysis service 212. As discussed above, the results analysis service analyzes the predicted results in view of actual results (for a given set of training data) to determine whether the current model of the framework meets or exceeds one or more accuracy thresholds. Further, the results analysis service 212 generates corrective data that may be utilized by the predictive framework 100 to update one or more neurons within the various neural levels 714 through back-propagation as discussed above.

[0056] Also included in the exemplary computing device 700 is a data store 718. The data store includes/stores information for use by the predictive framework 100 including one or more sets of training data 724-728 with corresponding actual results 734-738. Further, the data store 718 includes/stores one or more validation sets 720 with corresponding actual results 722.

[0057] Regarding the various components of the exemplary computing device 700, those skilled in the art will appreciate that many of these components may be implemented as executable software modules stored in the memory of the computing device, as hardware modules and/or components (including SoCs - system on a chip), or a combination of the two. Indeed, components may be implemented according to various executable embodiments including executable software modules that carry out one or more logical elements of the processes described in this document, or as hardware and/or firmware components that include executable logic to carry out the one or more logical elements of the processes described in this document. Examples of these executable hardware components include, by way of illustration and not limitation, ROM (read-only memory) devices, programmable logic array (PLA) devices, PROM (programmable read-only memory) devices, EPROM (erasable PROM) devices, and the like, each of which may be encoded with instructions and/or logic which, in execution, carry out the functions described herein.

[0058] Moreover, in certain embodiments each of the various components of the exemplary computing device 700 may be implemented as an independent, cooperative process or device, operating in conjunction with or on one or more computer systems and/or computing devices. It should be further appreciated, of course, that the various components described above should be viewed as logical components for carrying out the various described functions. As those skilled in the art will readily appreciate, logical components and/or subsystems may or may not correspond directly, in a one-to-one manner, to actual, discrete components. In an actual embodiment, the various components of each computing device may be combined together or distributed across multiple actual components and/or implemented as cooperative processes on a computer network.

[0059] While various novel aspects of the disclosed subject matter have been described, it should be appreciated that these aspects are exemplary and should not be construed as limiting. Variations and alterations to the various aspects may be made without departing from the scope of the disclosed subject matter.