

Title:
SYSTEM AND METHOD FOR CLIENT-SIDE MODEL TRAINING IN RECOMMENDER SYSTEMS
Document Type and Number:
WIPO Patent Application WO/2017/178870
Kind Code:
A1
Abstract:
A method and system are implemented by a computing device functioning as a consumption device in a local recommender system. The method is for updating a local model including user preferences and content item ratings for a catalog of content items. The method to efficiently update, at the consumption device, the local model that is a subset of a model maintained by an offline recommender system. The method to compute a user latent vector as a product of a first sum and a second sum, the first sum being an inverse of a product of content item latent vectors with user ratings and corresponding transpose content item latent vectors with a regularization term added, the second sum being a product of content item latent vectors with user ratings and available user ratings, to multiply the user latent vector by the content item latent vectors to generate content item ratings, and to rank content items by the content item ratings to produce content item recommendations for a current user.

Inventors:
FELLER EUGEN (US)
FORGEAT JULIEN (US)
Application Number:
PCT/IB2016/052184
Publication Date:
October 19, 2017
Filing Date:
April 15, 2016
Assignee:
ERICSSON TELEFON AB L M (PUBL) (SE)
International Classes:
G06Q30/06
Domestic Patent References:
WO2012013996A1 (2012-02-02)
WO2012105884A1 (2012-08-09)
WO2016040211A1 (2016-03-17)
Foreign References:
US20150012378A1 (2015-01-08)
Other References:
None
Attorney, Agent or Firm:
DE VOS, Daniel M. (US)
Claims:
CLAIMS

What is claimed is:

1. A method implemented by a computing device functioning as a consumption device in a local recommender system, the method for updating a local model including user preferences and content item ratings for a catalog of content items, the method to efficiently update, at the consumption device, the local model that is a subset of a model maintained by an offline recommender system, the method comprising:

computing (611) a user latent vector as a product of a first sum and a second sum, the first sum being an inverse of a product of content item latent vectors with user ratings and corresponding transpose content item latent vectors with a regularization term added, the second sum being a product of content item latent vectors with user ratings and available user ratings;

multiplying (613) the user latent vector by the content item latent vectors to generate content item ratings; and

ranking (615) content items by the content item ratings to produce content item recommendations for a current user.

2. The method of claim 1, further comprising:

registering (605) a new user with the local recommender system; and

requesting (607) content item latent vectors from a session model builder of a consumption session manager.

3. The method of claim 1, further comprising:

logging (601) into the local recommender system by an existing user; and

requesting (603) a current user ratings matrix, current user latent vector, and content item latent vectors from a session model builder of a consumption session manager.

4. The method of claim 2, wherein the content item latent vectors are filtered to the subset of the model maintained by the offline recommender system.

5. The method of claim 1, further comprising:

requesting retraining of the local model including a rating entered by a user or implicit feedback collected by the local recommender system.

6. The method of claim 1, further comprising:

requesting a filtered local model and filtered ratings, where the filter is partially based on user identifier and collected sensor data.

7. The method of claim 6, wherein the filtered local model has a reduced size relative to the model of the offline recommender system excluding a subset of content item vectors and associated ratings.

8. The method of claim 7, further comprising:

storing the filtered local model in local model storage of the local recommender system to enable real-time updates of the filtered local model and generation of recommendations using the filtered local model.

9. A computing device to execute a method, the computing device functioning as a consumption device in a local recommender system, the method for updating a local model including user preferences and content item ratings for a catalog of content items, the method to efficiently update, at the consumption device, the local model that is a subset of a model maintained by an offline recommender system, the computing device comprising:

a non-transitory machine readable storage medium (157) having stored therein a local recommender system and a consumption application; and

a processor (175) coupled to the non-transitory machine readable storage medium, the processor to execute the local recommender system and consumption application, the local recommender system to compute a user latent vector as a product of a first sum and a second sum, the first sum being an inverse of a product of content item latent vectors with user ratings and corresponding transpose content item latent vectors with a regularization term added, the second sum being a product of content item latent vectors with user ratings and available user ratings, to multiply the user latent vector by the content item latent vectors to generate content item ratings, and to rank content items by the content item ratings to produce content item recommendations for a current user.

10. The computing device of claim 9, wherein the consumption application is configured to register a new user with the local recommender system, and to request content item latent vectors from a session model builder of a consumption session manager.

11. The computing device of claim 9, wherein the consumption application to log into the local recommender system by an existing user, and to request a current user ratings matrix, current user latent vector, and content item latent vectors from a session model builder of a consumption session manager.

12. The computing device of claim 10, wherein the content item latent vectors are filtered to the subset of the model maintained by the offline recommender system.

13. The computing device of claim 9, wherein the local recommender system to request retraining of the local model including a rating entered by a user or implicit feedback collected by the local recommender system.

14. The computing device of claim 9, wherein the local recommender system to request a filtered local model and filtered ratings, where the filter is partially based on user identifier and collected sensor data.

15. The computing device of claim 14, wherein the filtered local model has a reduced size relative to the model of the offline recommender system excluding a subset of content item vectors and associated ratings.

16. The computing device of claim 15, wherein the local recommender system to store the filtered local model in local model storage of the local recommender system to enable real-time updates of the filtered local model and generation of recommendations using the filtered local model.

17. A non-transitory machine-readable storage medium that provides instructions that, if executed by a processor, will cause said processor to perform operations including a method implemented by a computing device functioning as a consumption device in a local recommender system, the method for updating a local model including user preferences and content item ratings for a catalog of content items, the method to efficiently update, at the consumption device, the local model that is a subset of a model maintained by an offline recommender system, the operations comprising:

computing (611) a user latent vector as a product of a first sum and a second sum, the first sum being an inverse of a product of content item latent vectors with user ratings and corresponding transpose content item latent vectors with a regularization term added, the second sum being a product of content item latent vectors with user ratings and available user ratings;

multiplying (613) the user latent vector by the content item latent vectors to generate content item ratings; and ranking (615) content items by the content item ratings to produce content item recommendations for a current user.

18. The non-transitory machine readable medium of claim 17, having further instructions stored therein, which when executed cause operations further comprising:

registering (605) a new user with the local recommender system; and

requesting (607) content item latent vectors from a session model builder of a consumption session manager.

19. The non-transitory machine readable medium of claim 17, having further instructions stored therein, which when executed cause operations further comprising:

logging (601) into the local recommender system by an existing user; and

requesting (603) a current user ratings matrix, current user latent vector, and content item latent vectors from a session model builder of a consumption session manager.

20. The non-transitory machine readable medium of claim 18, wherein the content item latent vectors are filtered to the subset of the model maintained by the offline recommender system.

Description:
SYSTEM AND METHOD FOR CLIENT-SIDE MODEL TRAINING IN RECOMMENDER SYSTEMS

TECHNICAL FIELD

[0001] Embodiments of the invention relate to the field of recommendation systems; and more specifically, to the improvement of the real-time generation of recommendations using a client side model based on a subset of the model maintained by the recommender system.

BACKGROUND

[0002] Recommender systems play a key role in providing personalized service experience in various domains (e.g., e-commerce, media delivery) by recommending relevant items to users. Good recommender systems contribute to user retention and drive up sales. In a traditional recommender system, users interact with the recommender system via a user interface such as a web browser or dedicated application. The user interface presents content to users and assists the users in finding additional content by making recommendations on available content that may be of interest to the users based on prior feedback from the user. The feedback received from the user may either be explicit and/or implicit feedback. Explicit feedback can be entered into the system by means of ratings for specific content items. Examples of implicit feedback include things such as time, location of the user, and browsing behavior of the user. Such feedback can be referred to as user preferences. Once user preferences are collected, they are used by the recommender system to generate predictions for content items that users are most likely to consume (e.g., read, watch, listen, or buy). A list of recommendations is generated based on the generated predictions and displayed to the users on their device via the user interface.

[0003] Constructing quality recommender systems is a challenging task for several reasons. First, many recommender systems need to deal with a massive scale of millions of users interacting with the system. This scale results in high computation and data infrastructure requirements. Moreover, this large scale requires algorithms that can be easily run in a distributed manner to handle the load of the millions of users. Second, recommender systems need to provide accurate predictions, ideally in most cases, in a near real-time fashion. For example, as users browse a content item catalog and interact with the recommender system they need to be provided with updated recommendations. The aforementioned real-time requirements call for recommender algorithms and systems that are capable of performing incremental/online updates.

[0004] There are two standard approaches to build recommender systems: server-side and client-side. One popular server-side approach for building recommender systems is referred to as collaborative filtering. Collaborative filtering approaches only leverage user preference information (e.g., ratings, time, or user location) to produce recommendations. Collaborative filtering approaches are usually further divided into neighborhood and model-based approaches. Neighborhood approaches typically make use of similarity metrics (e.g., Cosine or Pearson similarity metrics) to compute similarities between users and/or content items. In model-based approaches, the user-item preferences are modeled in the form of a rating matrix. The rating matrix is typically very sparse as users rarely rate all available content items. Matrix Factorization (MF) techniques are used to determine the latent (i.e., the non-explicit or hidden) factors yielding the user-item preferences. In other words, MF identifies the latent factors (e.g., genre preference, actor preference, etc.) that led the users to give a rating of a content item. The factors are called latent because they are not visible to the system at the time of the rating and must be inferred or similarly determined by examining correlations between ratings of content items. In an attempt to determine the latent factors, MF decomposes the rating matrix into two lower-rank matrices U and M. Each row and column in U corresponds to a user and a latent factor, respectively. Similarly, each row and column in M corresponds to a latent factor and content item, respectively. The user latent factor indicates the strength for a particular user to favor the latent factor. Similarly, the content item latent factor represents the strength of the content item (e.g., a movie, consumer item or similar consumable) having that latent factor. A rating prediction for a particular user and item is computed by multiplying the user vector in U with the corresponding content item vector in M.
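The decomposition described above can be illustrated with a short sketch. This is illustrative only; the matrix sizes, the number of latent factors, and the random factor values are hypothetical and not taken from the application:

```python
import numpy as np

# Illustrative sizes: 4 users, 5 content items, k = 2 latent factors.
num_users, num_items, k = 4, 5, 2

rng = np.random.default_rng(0)
U = rng.normal(size=(num_users, k))   # each row: a user's latent factors
M = rng.normal(size=(k, num_items))   # each column: a content item's latent factors

# A rating prediction for user u and item i is the product of the user's
# row in U with the item's column in M.
u, i = 1, 3
prediction = U[u] @ M[:, i]

# Multiplying the full matrices yields all estimated ratings at once.
R_hat = U @ M
assert np.isclose(R_hat[u, i], prediction)
```

In a real system U and M are learned from the observed ratings rather than drawn at random; the point here is only the shape of the factorization and how a single prediction is read off it.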

[0005] Alternatively, a client-side system may be utilized. The key idea of such a system is to run many of the recommender system components at the client-side while keeping the model training and periodic updates at the server side. Specifically, the following components are run on the client-side: prediction, ranking, and filtering. Running the aforementioned components at the client-side helps to decrease load on the backend and enables the overall recommender system to achieve higher scalability. In some implementations, for the model training in the backend the K-Nearest Neighbor (KNN) algorithm may be used. Such client-side recommender systems work well when the number of items is small (e.g., TV programs).

SUMMARY

[0006] In one embodiment, a method is implemented by a computing device functioning as a consumption device in a local recommender system. The method is for updating a local model including user preferences and content item ratings for a catalog of content items. The method to efficiently update, at the consumption device, the local model that is a subset of a model maintained by an offline recommender system. The method to compute a user latent vector as a product of a first sum and a second sum, the first sum being an inverse of a product of content item latent vectors with user ratings and corresponding transpose content item latent vectors with a regularization term added, the second sum being a product of content item latent vectors with user ratings and available user ratings, to multiply the user latent vector by the content item latent vectors to generate content item ratings, and to rank content items by the content item ratings to produce content item recommendations for a current user.
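In linear-algebra terms, the computation recited above is a regularized least-squares update of the form u = (Σ mᵢmᵢᵀ + λI)⁻¹ (Σ mᵢrᵢ), taken over the items the user has rated. A minimal sketch follows; the function names and the regularization value are illustrative assumptions, not part of the application:

```python
import numpy as np

def compute_user_latent_vector(rated_item_vectors, ratings, reg=0.1):
    """Compute a user latent vector from the latent vectors of the
    content items the user has rated.

    rated_item_vectors: (n_rated, k) content item latent vectors.
    ratings:            (n_rated,) the user's available ratings.
    reg:                regularization term lambda (illustrative value).
    """
    k = rated_item_vectors.shape[1]
    # First sum: product of item latent vectors with their transposes,
    # plus the regularization term (inverted below).
    gram = rated_item_vectors.T @ rated_item_vectors + reg * np.eye(k)
    # Second sum: product of item latent vectors with the available ratings.
    rhs = rated_item_vectors.T @ ratings
    # Solving the linear system is numerically preferable to an explicit inverse.
    return np.linalg.solve(gram, rhs)

def rank_items(user_vector, all_item_vectors, top_n=5):
    """Multiply the user latent vector by all content item latent vectors
    to generate ratings, then rank items by those ratings."""
    scores = all_item_vectors @ user_vector
    return np.argsort(scores)[::-1][:top_n]
```

Because the system is only k x k, this update is cheap enough to run on the consumption device every time the user adds a rating.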

[0007] In another embodiment, a computing device to execute a method. The computing device functioning as a consumption device in a local recommender system. The method for updating a local model including user preferences and content item ratings for a catalog of content items. The method to efficiently update, at the consumption device, the local model that is a subset of a model maintained by an offline recommender system. The computing device comprising a non-transitory machine readable storage medium having stored therein a local recommender system and a consumption application, and a processor coupled to the non-transitory machine readable storage medium. The processor to execute the local recommender system and consumption application. The local recommender system to compute a user latent vector as a product of a first sum and a second sum, the first sum being an inverse of a product of content item latent vectors with user ratings and corresponding transpose content item latent vectors with a regularization term added, the second sum being a product of content item latent vectors with user ratings and available user ratings, to multiply the user latent vector by the content item latent vectors to generate content item ratings, and to rank content items by the content item ratings to produce content item recommendations for a current user.

[0008] In one embodiment, a non-transitory machine-readable storage medium that provides instructions that, if executed by a processor, will cause said processor to perform operations including a method implemented by a computing device functioning as a consumption device in a local recommender system. The method for updating a local model including user preferences and content item ratings for a catalog of content items. The method to efficiently update, at the consumption device, the local model that is a subset of a model maintained by an offline recommender system. The operations to compute a user latent vector as a product of a first sum and a second sum, the first sum being an inverse of a product of content item latent vectors with user ratings and corresponding transpose content item latent vectors with a regularization term added, the second sum being a product of content item latent vectors with user ratings and available user ratings, to multiply the user latent vector by the content item latent vectors to generate content item ratings, and rank content items by the content item ratings to produce content item recommendations for a current user.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] The invention may best be understood by referring to the following description and accompanying drawings that are used to illustrate embodiments of the invention. In the drawings:

[0010] Figure 1 is a diagram of one embodiment of a recommender system.

[0011] Figure 2 is a diagram of one embodiment of a session start up or bootstrap process.

[0012] Figure 3 is a flowchart of the generalized session startup process.

[0013] Figure 4 is a diagram of one embodiment of a content item rating update process.

[0014] Figure 5 is a flowchart of the generalized content rating update process.

[0015] Figure 6 is a flowchart of the local recommendation computation process.

DETAILED DESCRIPTION

[0016] The following description describes methods and apparatus for improving the efficiency and operation of recommender systems. The embodiments provide a client-side implementation of the recommender system where a local model is updated and maintained at the consumption device. The processes include a method for updating the local model from a central server (i.e., offline recommender system) as well as a method for initiating a local model. The processes further include a method for updating the local model such that it is computationally efficient while producing new recommendations over large data sets. These methods scale to support recommender systems with large data sets. In the following description, numerous specific details such as logic implementations, opcodes, means to specify operands, resource partitioning/sharing/duplication implementations, types and interrelationships of system components, and logic partitioning/integration choices are set forth in order to provide a more thorough understanding of the present invention. It will be appreciated, however, by one skilled in the art that the invention may be practiced without such specific details. In other instances, control structures, gate level circuits and full software instruction sequences have not been shown in detail in order not to obscure the invention. Those of ordinary skill in the art, with the included descriptions, will be able to implement appropriate functionality without undue experimentation.

[0017] References in the specification to "one embodiment," "an embodiment," "an example embodiment," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to affect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.

[0018] Bracketed text and blocks with dashed borders (e.g., large dashes, small dashes, dot-dash, and dots) may be used herein to illustrate optional operations that add additional features to embodiments of the invention. However, such notation should not be taken to mean that these are the only options or optional operations, and/or that blocks with solid borders are not optional in certain embodiments of the invention.

[0019] In the following description and claims, the terms "coupled" and "connected," along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. "Coupled" is used to indicate that two or more elements, which may or may not be in direct physical or electrical contact with each other, co-operate or interact with each other. "Connected" is used to indicate the establishment of communication between two or more elements that are coupled with each other.

[0020] The operations in the flow diagrams will be described with reference to the exemplary embodiments of the other figures. However, it should be understood that the operations of the flow diagrams can be performed by embodiments of the invention other than those discussed with reference to the other figures, and the embodiments of the invention discussed with reference to these other figures can perform operations different than those discussed with reference to the flow diagrams.

[0021] An electronic device stores and transmits (internally and/or with other electronic devices over a network) code (which is composed of software instructions and which is sometimes referred to as computer program code or a computer program) and/or data using machine-readable media (also called computer-readable media), such as machine-readable storage media (e.g., magnetic disks, optical disks, read only memory (ROM), flash memory devices, phase change memory) and machine-readable transmission media (also called a carrier) (e.g., electrical, optical, radio, acoustical or other form of propagated signals - such as carrier waves, infrared signals). Thus, an electronic device (e.g., a computer) includes hardware and software, such as a set of one or more processors coupled to one or more machine-readable storage media to store code for execution on the set of processors and/or to store data. For instance, an electronic device may include non-volatile memory containing the code since the non-volatile memory can persist code/data even when the electronic device is turned off (when power is removed), and while the electronic device is turned on that part of the code that is to be executed by the processor(s) of that electronic device is typically copied from the slower non-volatile memory into volatile memory (e.g., dynamic random access memory (DRAM), static random access memory (SRAM)) of that electronic device. Typical electronic devices also include a set of one or more physical network interface(s) to establish network connections (to transmit and/or receive code and/or data using propagating signals) with other electronic devices. One or more parts of an embodiment of the invention may be implemented using different combinations of software, firmware, and/or hardware.

Overview

[0022] The prior art has numerous disadvantages that the embodiments of the present invention overcome. The prior art, in particular prior art server-side implementations, requires a massive number of servers, and all related processes operate entirely at the backend of the recommender system. Server-side implementations such as the neighborhood-based approaches have several shortcomings such as performance/prediction issues under sparse data and limited scalability for large datasets. In some implementations, these server-side implementations are centralized and require calculating a similarity metric for all users using the constrained Pearson Algorithm.

[0023] Model-based server-side implementations also have advantages and disadvantages relative to the neighborhood-based approach. The model-based approaches have been shown to better address sparse datasets and scalability problems. Also, several efficient algorithms have been discovered that can train models (i.e., derive U and M) at scale. Such training can be computed in parallel on a massive number of servers. As a result several distributed data processing systems have emerged that implement algorithms to train models. One such algorithm is called Alternating Least Squares (ALS). In ALS, the matrices U and M are computed at the backend in an iterative manner. First U is fixed, and M is computed by solving a least-squares minimization problem, then M is fixed and U is computed in a similar manner. The iteration process repeats until U and M no longer change or the change falls below a threshold. However, these model-based approaches have disadvantages as well. Unfortunately, with increasing real-time recommendation requirements, such server-side model-based approaches tend to run into several limitations. Every time a new rating is submitted by a user, the entire model needs to be retrained for the new rating to be taken into account. Such an operation with potentially millions of users and items can take hours and is therefore not suitable in the context where real-time recommendations are required. Furthermore, continuous communication with the server backend adds latency and networking overheads that limit the real-time capabilities of the recommender system.

[0024] In comparison, a client-side approach performs better than a prior art server-side implementation in terms of real-time performance requirements. However, the prior art client-side implementations also suffer from several drawbacks. First, the prior art client-side implementation is limited to handling only a small number of items. The limitation comes from the fact that a model needs to be kept small in order to be sent from the server to the client. Second, the K-nearest neighbor (KNN) algorithm is typically used that requires computing similarities between all items, which is a computationally expensive operation.
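The ALS iteration described in paragraph [0023] can be sketched as a toy, dense-matrix illustration. Unrated entries are marked with NaN, and the hyperparameter values (k, reg, iters) are arbitrary assumptions for the sketch:

```python
import numpy as np

def als(R, k=2, reg=0.1, iters=20):
    """Toy Alternating Least Squares: factor the rating matrix R
    (users x items, with np.nan marking unrated entries) into
    U (users x k) and M (k x items)."""
    n_users, n_items = R.shape
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(n_users, k))
    M = rng.normal(scale=0.1, size=(k, n_items))
    rated = ~np.isnan(R)
    for _ in range(iters):
        # Fix M and solve a least-squares problem for each user vector.
        for u in range(n_users):
            Mi = M[:, rated[u]]                      # items this user rated
            A = Mi @ Mi.T + reg * np.eye(k)
            U[u] = np.linalg.solve(A, Mi @ R[u, rated[u]])
        # Fix U and solve for each content item vector in the same manner.
        for i in range(n_items):
            Ui = U[rated[:, i]]                      # users who rated item i
            A = Ui.T @ Ui + reg * np.eye(k)
            M[:, i] = np.linalg.solve(A, Ui.T @ R[rated[:, i], i])
    return U, M
```

A production backend would run these per-user and per-item solves in parallel across many servers; the structure of each solve is unchanged, which is also what makes the per-user update cheap enough to move to the client.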

[0025] The embodiments of the invention overcome these defects of the prior art by providing an alternative client-side recommender system to enable accurate and near real-time recommendations in the presence of new users and user preferences. The embodiments operate such that in the event of new user arrival or registration with the recommender system, an initial set of user preferences (explicit and/or implicit) is provided during registration or interaction with the system, respectively. Consequently, the embodiments avoid a cold-start problem in computing recommendations. The embodiments of the invention perform initial and periodic model training at the server side. However, in contrast to server side implementations, the embodiments do not rely on KNN and do not send the entire model to the client. Indeed, such an approach cannot work with systems where the set of content items exceeds a thousand content items.

[0026] To enable a recommender system to scale to support thousands of content items the embodiments use a form of Matrix Factorization where model training is split between the server and client-side. Specifically the embodiments of the recommender system combine training, prediction, ranking, and filtering at the client-side. The embodiments of the recommender system do model training and periodic retraining from scratch at the backend. However, when the user attempts to interact with the recommender system, the recommender system only sends a subset of content item latent vectors to the client. The content item latent vectors are used to train a user-specific model directly at the client-side using a training algorithm such as the Least Squares method. Such a model training split between server and client enables near real-time recommendations. The embodiments can produce recommendations as a user of a client interacts with the content items in question. Moreover, no requests need to be sent to the backend server to produce continuous recommendations for the content item subset. Such requests would further reduce the recommendation responsiveness, thus avoiding such requests improves the user experience.

[0027] The embodiments provide several advantages over the prior art. The embodiments provide an improved client-side model training recommender system and an improved process to produce recommendations at the client side via incremental training. Examples provided herein may focus on explicit user preferences, where a user provides direct feedback about a content item (e.g., ratings); however, one skilled in the art would understand that the embodiments of the recommender system work with implicit preferences as well. For example, implicit feedback, where a user preference is inferred by sensor information or user activity (e.g., number of times a movie was watched) can be used instead of or in conjunction with user ratings.

[0028] Figure 1 is a diagram of one embodiment of a recommender system. The embodiments of the recommender system encompass an architecture that is composed of three main layers: an offline recommender system 101, a consumption session manager 121, and a consumption device 151.

[0029] The offline recommender system 101 includes a model serving layer 109 and model storage 105. The model serving layer is executed by a set of compute resources 107 that may include a set of processing devices, memory, networking interfaces and similar resources. The model storage 105 may be stored in a non-transitory storage medium such as a set of magnetic or optical drives that may be local or remote to the compute resources 107. The model storage 105 can be maintained by a database management system or similar system. Any number or arrangement of compute resources 107 and non-transitory storage medium 103 may be provided that may be distributed over any number of systems such as in a cloud system.

[0030] The model storage 105 is a data store that contains the computed model generated by the recommender system that tracks the preferences related to a catalog of content items for a set of users of the recommender system. The computed model consists of two matrices which, when multiplied, yield all estimated ratings for all content items and all users in the form of a user/content item matrix. The model serving layer is a software module (i.e., software code or component) that is exposing querying application programming interfaces (APIs) for the model. Examples of typically supported API functions include: for a given (user, movie) pair, provide the rating as estimated by the model; and for a given user, provide the top n movies with the highest ratings as estimated by the model.
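The two API functions named above could be exposed over the factor matrices along the following lines. This is a hypothetical sketch; the class and method names are illustrative assumptions, not part of the application:

```python
import numpy as np

class ModelServingLayer:
    """Hypothetical querying API over the two factor matrices that make up
    the computed model."""

    def __init__(self, U, M):
        self.U = U  # users x k user latent vectors
        self.M = M  # k x items content item latent vectors

    def estimated_rating(self, user_id, item_id):
        # For a given (user, movie) pair: the rating estimated by the model.
        return float(self.U[user_id] @ self.M[:, item_id])

    def top_n(self, user_id, n=10):
        # For a given user: the n movies with the highest estimated ratings.
        scores = self.U[user_id] @ self.M
        return np.argsort(scores)[::-1][:n].tolist()
```

Note that neither query requires the full user/content item matrix to be materialized; each one touches a single row of U and, at most, all columns of M.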

[0031] In the embodiments, the offline recommender system has a limited role and responsibility. An initial subset of the model data may be retrieved from the offline recommender system and user preferences may be reported back to the offline recommender system. An incremental matrix factorization is used on the client side of the recommender system as described further below, but this does not put any restriction on any optimizations that can be added to a server-side recommender system as long as it is capable of exposing a user/content item matrix with ratings. For example, specific techniques like Pearson correlation can be used to estimate ratings for new users and solve the cold-start problem.

[0032] The recommender system can include a consumption session manager 121. The consumption session manager can be implemented by a set of compute resources 123 including processing devices and memory that are shared with the offline recommender system 101 or independent therefrom. These compute resources 123 can have any number or arrangement. The role of the consumption session manager 121 is to build a low scale model for a specific context. The context consists of a user, a viewing session on the consumption device, where the session is defined as the act of searching and eventually consuming a content item (e.g., a movie) on the device, and some context information that relates to the session, for example, the device on which the session is happening, the time of the day, the day of the week and similar information.

[0033] The consumption session manager 121 has two software implemented components: a session context analyzer 127 and a session model builder 125. The session context analyzer 127 collects session context data as provided by the consumption device 151 and merges it with overall contextual information such as time of day. The role of the session model builder 125 is more complex: it provides a low scale model to the consumption client based on the full scale model as exposed by the recommender system and the contextual information collected by the session context analyzer.

[0034] In the embodiments, the session model builder 125 generates the low-scale model by picking the top-n content items returned by the server-side offline recommender system 101. The way the low-scale model is created can be further optimized by using rules applied on the user historical consumptions and the provided context (for example, the user never watches TV series in the morning; the user likes to watch documentaries on his iPad, and similar information) and adding mechanisms to include some content items below top-n ratings to guarantee that a certain level of diversity and serendipity is present in the recommendations.

[0035] The consumption device 151 is the device on which content items are searched and played; the consumption device 151 can be any type of device such as a mobile phone, a tablet, a smart TV, a web browser in a computer and similar devices. Within the device, there are three sub-systems: the local recommender system 153, the consumption application 155 and the sensors 167 with which the device is equipped. These systems can be implemented using a non-transitory machine-readable storage medium 157 and compute resources 175. The non-transitory storage medium can be any number and arrangement of storage devices within a consumption device 151. The compute resources 175 can be processor devices, memory and network interfaces that implement the software components of the local recommender system 153. The local recommender system 153 can be seen as a compact version of a typical recommender system designed to run with a very small footprint so that it can be executed locally on consumption devices with limited compute resources 175 (e.g., smartphones, smart TVs and similar devices) or within a web browser, in which case the execution environment may be for example a JavaScript virtual machine. Compactness is achieved by drastically reducing the amount of data and computation by only considering the simplified model provided by the consumption session manager 121 according to the consumption context.

[0036] The local recommender system 153 is composed of a local model storage component 159 used to store the initial model issued by the consumption session manager 121 and updated by the local training layer 163. The initial model consists of the user latent vector and a subset of all the content item latent vectors, as the content items are filtered to only keep those that are relevant to the consumption session. The local recommender system 153 also includes a local rating storage 161 used to store implicit and explicit ratings provided by or derived from the user within the session. A local training layer 163 takes as input the current model stored in the local model storage component 159 as well as all the ratings stored in the local rating storage 161 and outputs a new model to be stored in the local model storage component 159. A local model serving layer 165 allows the consumption application 155 to query the current local model as the user browses it.

[0037] The benefit of computing the recommendations on the client side at the consumption device 151 is that each time a rating is sent to the client side local recommender system 153, an updated current local model can be made available in a matter of tens or hundreds of milliseconds, thus allowing the user interface to present new content items that take user inputs into consideration in real-time. Without the client side local recommender system 153, the user interface has to rely on one of the following options. First, using a server side traditional matrix factorization recommender system, which is how many state-of-the-art solutions are implemented, but since training is long and expensive, this does not permit having real-time recommendation updates on the client. Second, using a server side incrementally updated model trained the same way as the one that is described herein; this allows the system to compute very fast model updates, but refreshing the content items on the client user interface is then subject to network latency on the uplink (when the client needs to upload the new rating to the server) and the downlink (when the server needs to send an updated list of recommendations to the client). These delays can very easily affect the user experience, as the graphical user interface needs to be refreshed very fast for a smooth experience (typically no more than 200 or 300 milliseconds).

[0038] The consumption application 155 is the application used by the user to search, select and consume content. It is composed of a user interface (UI) 169 that permits the user to browse content, search for content, rate content in an explicit manner (by assigning a score to a movie) and view content. Content ratings are sent to the local recommender system 153 as they are issued. The session context builder 171 is used at the beginning of the consumption session (i.e., when the user starts the application) to compile information on the session context, which is then sent to the consumption session manager 121 where it will be used to generate a session-specific consumption model.

[0039] Finally, the consumption device 151 is also equipped with sensors 167 that are used to acquire relevant information when building the session context. Available sensors 167 will be different based on the type of the consumption device 151, but examples of device sensors include a microphone, camera, temperature sensor, accelerometer, global positioning system (GPS) and similar systems. Examples of information these sensors 167 are capable of acquiring are: the identity of the user 173 that is accessing a content item, derived from sensors 167 such as a microphone, camera or fingerprint sensor; whether the user 173 is indoors or outdoors, from the temperature sensor; the light level, from the camera or the microphone; and whether the user 173 is traveling or stationary, from the accelerometer or GPS. Inferring this kind of knowledge from raw sensor data is often not trivial yet possible, and any combination of such data can be tracked by the recommender system.

[0040] The components of the recommender system can have network interfaces and communicate with one another via a network 181. The network 181 can be any type and size of network including a local area network, wide area network (e.g., the Internet) or similar network. The connections to the network can be through any number of intermediate networks and the network 181 itself can be composed of any number and arrangement of intermediate machines and sub-networks.

[0041] The embodiments of the recommender system encompass a number of inter-related processes. These processes include, but are not limited to, a session initiation or bootstrap process, a content rating process during a session and a content rating update process.

[0042] Figure 2 is a diagram of one embodiment of a session start up or bootstrap process. The diagram shows how the initial local model is obtained by the consumption device and local recommender system when the user starts a session on the consumption device. Figure 3 is a flowchart of the generalized session startup process. Figure 2 provides a specific implementation as an example. Thus, these two Figures are described in conjunction with one another using the implementation of Figure 2 as an example of the process of Figure 3.

[0043] The session startup process begins when the user starts the session by activating the consumption application on the consumption device. Via the user interface of the consumption application, a user can command the session context builder to create a new session (Block 301). This may be done via a create session function of the user interface. The session context builder, in response to receiving this command, collects sensor data from the consumption device sensors to acquire more information on the context of the session, for example, who the user is that may be accessing the content, what time it is and similar information (Block 303). The session context builder may query the sensor devices through an acquire sensor data function or similar operation.

[0044] The session context builder contacts the consumption session manager over the network (e.g., over the Internet), provides the user identity as well as the collected sensor data and requests a low footprint filtered recommendation model along with corresponding (filtered) ratings from the user that will be used for the session (Block 305). This contact may be via a get local model function or similar operation. The session model builder of the consumption session manager may service this request. The session model builder sends user and sensor information to the session context analyzer within the consumption session manager (Block 307). In response, the session context analyzer sends back a set of filtering rules (Block 309). This exchange may be done via a get session rules function or similar function of the session model builder.

[0045] Subsequently, the session model builder requests the recommendation model from the offline recommender system for the user in the session (Block 311). This model consists of the user vector for the user who started the session as well as all the content item vectors. This may be implemented as a get model function or similar operation of the session model builder. The session model builder gets the ratings of the user from the offline recommender system (Block 313). The ratings request may be via a get ratings function or similar operation of the session model builder. The requests of the session model builder may be serviced by the model serving layer of the offline recommender system.

[0046] The session model builder uses the filtering rules obtained from the session context analyzer to filter out content item vectors from the recommendation model in order to reduce its size and remove content items that are not relevant to the session (Block 315). Ratings are also filtered (i.e. only keeping ratings for content items that have not been filtered out). The filtered model and the filtered ratings are sent to the session context builder on the consumption device by the session model builder (Block 317) as a response to the get model function or operation.
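The filtering step (Block 315) can be sketched as follows; the rule representation, the item metadata and all names here are assumptions made for illustration, not part of the described system:

```python
# Sketch of the session model builder filtering the model: keep only
# content item vectors that pass every filtering rule from the session
# context analyzer, then keep only ratings for items that survived.
def filter_model(item_vectors, item_meta, ratings, rules):
    kept = {item_id: vec for item_id, vec in item_vectors.items()
            if all(rule(item_meta[item_id]) for rule in rules)}
    kept_ratings = {item_id: r for item_id, r in ratings.items()
                    if item_id in kept}
    return kept, kept_ratings

# Example rule from the session context: the user never watches TV
# series in the morning, so a morning session filters out that genre.
def no_tv_series(meta):
    return meta["genre"] != "tv_series"
```

Both the reduced set of item vectors and the reduced ratings are then sent to the consumption device.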

[0047] The received filtered model is stored on the local recommender system on the consumption device in the local model storage, where it will be used to provide real-time recommendations during the session (Block 319). The storage of the local model may be via a store model function or operation. The filtered ratings are also stored on the local recommender system in a local rating storage, as they are also required to train the real-time current local model (Block 321).

[0048] Figure 4 is a diagram of one embodiment of a content item rating update process. The diagram shows how the local model is updated by the consumption device and local recommender system when the user provides additional rating information during a session on the consumption device. Figure 5 is a flowchart of the generalized content rating update process. Figure 4 provides a specific implementation as an example. Thus, these two Figures are described in conjunction with one another using the implementation of Figure 4 as an example of the process of Figure 5.

[0049] The content item rating update process is triggered whenever a user rates a content item on the client (Block 501). While using the consumption application user interface, the user has the ability to rate a content item (e.g., a movie) by giving it a score (e.g., a score based on a 5 star rating system), which triggers the content rating update process. The process stores the rating into the local rating store (Block 503). For example, the consumption application user interface uses one of the local model serving layer APIs to submit the rating; specifically, the local model serving layer may store the {content item; rating} couple into the local rating storage. The local model serving layer sends a request to the local training layer to retrain the model (Block 505). For example, a trigger training function or similar operation is utilized.

[0050] The local training layer then retrains the model (Block 507). In some embodiments, the retraining fetches all local ratings from the local rating storage (Block 509) (e.g., via a get ratings function), fetches the current local model from the local model storage (Block 511) (e.g., via a get model function), then trains the current local model to produce a new model (Block 513) (e.g., via a train model function). In some embodiments, the training of the new model uses a one-sided Least Squares function as further described herein below. Once the new model is trained, it is stored into the local model storage by the local model serving layer (Block 515).

[0051] After the model has been retrained, the next time the display of content items is refreshed on the consumption application (Block 517), the user interface will use another API of the local model serving layer to get a list of content items to be displayed (Block 519). In an example where the content items are movies, a get movie function may be used by the user interface to query the local model serving layer. The local model serving layer will query the newly trained model to get a list of content items to return to the user interface (Block 521). In the example, the query model function is utilized. The user interface shows a new list of recommendations that are computed while taking into account the rating that was submitted at the beginning of the process (Block 523).
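The rate-retrain-query cycle can be sketched schematically as follows; the class, its methods and the model dictionary layout are assumptions that mirror the description, not an actual API:

```python
# Schematic of the content item rating update cycle on the consumption
# device (Blocks 503-521). training_layer is any callable that produces
# a new user vector from the current model and the stored ratings.
class LocalModelServing:
    def __init__(self, model_store, rating_store, training_layer):
        self.model_store = model_store      # current local model
        self.rating_store = rating_store    # {content item: rating} couples
        self.training_layer = training_layer

    def submit_rating(self, item_id, rating):
        # Block 503: store the {content item; rating} couple locally.
        self.rating_store[item_id] = rating
        # Blocks 505-515: trigger retraining and store the new model.
        self.model_store["user_vector"] = self.training_layer(
            self.model_store, self.rating_store)

    def query_model(self, n):
        # Blocks 519-521: rank content items under the newly trained model.
        p_u = self.model_store["user_vector"]
        scores = {i: sum(a * b for a, b in zip(p_u, q_v))
                  for i, q_v in self.model_store["item_vectors"].items()}
        return sorted(scores, key=scores.get, reverse=True)[:n]
```

In the described system the retraining would run asynchronously; the sketch runs it inline for clarity.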

[0052] The content displayed on the user interface does not necessarily consist of a raw list of recommendations; consumption applications typically add additional filtering and manipulation of the data to make sure that there is enough variety in the displayed list of content items, to advertise content items, to filter out some of the movies the user has already seen and so on.

[0053] The example of a trigger in response to an explicit user rating is provided. However, one skilled in the art would understand that implicit or other derived feedback or rating information could also trigger a retraining of the model. The training process is asynchronous to the rating submission and queries for recommendations and takes an amount of time in the range of hundreds of milliseconds to a few seconds. The local recommender system may not guarantee that the recommendations will be updated the next time the content is refreshed on the consumption device user interface, but they will be updated at most within a few seconds after the rating was submitted. The ratings submitted by the user also need to be transmitted to the offline recommender system in order to consolidate the offline model; this is not shown in the diagram and is a separate process that may be asynchronous with this process.

[0055] Figure 6 is a flowchart of the local recommendation computation process. The local recommendation algorithm is implemented as part of the client side local recommender system and supports two scenarios: (1) a new user enters the local recommender system and starts giving ratings, (2) an existing user needs to receive new recommendations. The local recommendation algorithm can be understood as being applied in the context of the content rating update process described above.

[0056] When a new user enters or registers with the local recommendation system, no explicit information (i.e., ratings, likes/dislikes) and/or other implicit information (e.g., number of times a user has watched a certain movie) exists (Block 605). In this case the local recommender system requests a subset of the available content item latent vectors, for example the top-N most popular content item latent vectors (e.g., movie latent vectors), from the session model builder (Block 607), which interfaces with the offline recommender system that maintains the full or actual model for the recommender system. The session model builder and session context analyzer may retrieve the model and filter it, in this example to provide the top-N most popular content item latent vectors. Once the top-N most popular content item latent vectors are received, the current local model is trained on the consumption device. The filters at the consumption session manager are not limited to generating the top-N most popular content items. Indeed, any subset of content items deemed relevant can be sent.

[0057] In another case, an existing user needs to receive new recommendations after logging into or joining the recommender system (Block 601). The local recommender requests the current user ratings, current user latent vector along with a subset (i.e., filtered subset) of content item latent vectors from the session model builder (Block 603). In either a new user or existing user case, once a model has been received from the session model builder, a local model training procedure is applied (Block 609).

[0058] Traditional algorithms such as Alternating Least Squares (ALS) perform model training from scratch in an alternating manner. First the user latent vectors are fixed and the item latent vectors are updated to minimize the prediction error. Then the item latent vectors are fixed and the user latent vectors are updated to minimize the prediction error. This process repeats until the latent vectors no longer change. Such an approach adds a significant cost (computation and storage) given the potentially large size of the models in terms of both users and content items (e.g., movies). It is therefore not practical to apply at the consumption device level.
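The alternating scheme described above can be illustrated as follows; this is a minimal sketch on a small dense rating matrix, to show why each sweep touches every user's and every item's vector, and not the embodiments' client-side method:

```python
import numpy as np

def als_step(R, P, Q, lam):
    """One full ALS iteration on a dense rating matrix R (users x items),
    with user factors P (users x k) and item factors Q (items x k), so
    that R is approximated by P @ Q.T. Sketch only, not a production solver."""
    k = P.shape[1]
    I = np.eye(k)
    # Fix Q, solve the regularized least-squares problem for every user.
    for u in range(R.shape[0]):
        P[u] = np.linalg.solve(Q.T @ Q + lam * I, Q.T @ R[u])
    # Fix P, solve the regularized least-squares problem for every item.
    for v in range(R.shape[1]):
        Q[v] = np.linalg.solve(P.T @ P + lam * I, P.T @ R[:, v])
    return P, Q
```

Note that both sweeps need the latent vectors of all users and all items, which is exactly the cost the client-side one-sided variant avoids.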

[0059] To enable model training on the client-side/device, the embodiments fix the item latent vectors and update only the user latent vectors. This way the computation cost can be reduced, as it is no longer needed to access the latent vectors belonging to all users. Moreover, the amount of data that needs to be sent to the client is minimized. Indeed, only item latent vectors need to be sent to the client. Given that the embodiments only send a subset of the content item latent vectors (i.e., initially the most popular ones), this also minimizes the amount of data to send over the network. This allows the embodiments to achieve responsive recommendations and even enable recommendations when the device becomes disconnected from the network.

[0060] The local model-training algorithm works as follows. The received model data contains the user id, item id, and corresponding ratings r_uv. Each vector in the subset of content item latent vectors is called q_v. The received model includes, in this example, N such content item latent vectors from the session model builder. Once all data is fetched from the received model data, the process computes the user latent vector p_u as described below.

[0061] The solution to p_u is given by taking the inverse of the sum of q_v dot products and a regularization term lambda times I_k over all known ratings r_u*. The regularization term is used to avoid model overfitting. Such an inverse can be computed in O(n^2).

[0062] Stated more broadly, the user latent vector is computed as a result of a product of two sums. In the first sum, item latent vectors for which user ratings are known are multiplied with their corresponding transpose item latent vectors. A regularization term is added to the result (a matrix) of the item latent vector multiplications to avoid model overfitting. Finally, the inverse of the result (again a matrix) is taken. I_k in the regularization term corresponds to an identity matrix (ones on the diagonal) of dimension k.

[0063] In the second sum, content item latent vectors for which ratings exist are multiplied with the corresponding ratings and summed up. The summation results in the final vector, which gets multiplied with the result (a matrix) of the first sum (as described above). The multiplication of a matrix with a vector results in a vector, which forms the user latent vector.
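Using the notation of paragraphs [0060] through [0063], the two sums and their product can be written compactly as:

```latex
p_u = \Big( \sum_{v \in r_{u*}} q_v q_v^{\mathsf{T}} + \lambda I_k \Big)^{-1}
      \sum_{v \in r_{u*}} r_{uv}\, q_v
```

where both sums run over the content items v for which the user's ratings r_uv are known, lambda is the regularization term, and I_k is the k-dimensional identity matrix.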

[0064] Once p_u is computed, the process can multiply the p_u user latent vector with the q_v content item latent vectors to produce recommendations for user u (Block 613). In some embodiments, the process can multiply p_u with all q_v, and then rank the resulting ratings in descending order and show the corresponding content items to the user (Block 615).
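The one-sided update and the ranking step can be sketched as follows; the function names are illustrative, and Q is assumed to be the N-by-k matrix whose rows are the received content item latent vectors q_v:

```python
import numpy as np

def update_user_vector(Q, ratings, lam):
    """Compute p_u = (sum q_v q_v^T + lambda I_k)^(-1) * sum r_uv q_v,
    where both sums run over the items v with known ratings r_uv."""
    k = Q.shape[1]
    A = lam * np.eye(k)          # first sum, starting from lambda * I_k
    b = np.zeros(k)              # second sum accumulator
    for v, r_uv in ratings.items():
        A += np.outer(Q[v], Q[v])    # add q_v q_v^T for each known rating
        b += r_uv * Q[v]             # add r_uv * q_v for each known rating
    return np.linalg.inv(A) @ b      # p_u as the product of the two sums

def recommend(Q, p_u, n):
    # Multiply p_u with all q_v, rank the resulting ratings in
    # descending order, and return the top-n content item indices.
    scores = Q @ p_u
    return np.argsort(scores)[::-1][:n].tolist()
```

Because only the small k-by-k matrix A is inverted and Q holds just the filtered subset of item vectors, this fits the limited compute resources of a consumption device.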

[0065] While the flow diagrams in the figures show a particular order of operations performed by certain embodiments of the invention, it should be understood that such order is exemplary (e.g., alternative embodiments may perform the operations in a different order, combine certain operations, overlap certain operations, etc.).

[0066] While the invention has been described in terms of several embodiments, those skilled in the art will recognize that the invention is not limited to the embodiments described, and can be practiced with modification and alteration within the spirit and scope of the appended claims. The description is thus to be regarded as illustrative instead of limiting.