
Title:
SYSTEM AND METHOD FOR PREDICTING REPEAT BEHAVIOR OF CUSTOMERS
Document Type and Number:
WIPO Patent Application WO/2018/069817
Kind Code:
A1
Abstract:
System and method for predicting repeat behaviour of customers are disclosed. In an embodiment, the method includes abstracting a customer interaction data associated with interactions of the customer with respect to a target entity into a common data format (CDF) to obtain an abstracted customer interaction data. Based on at least a portion of the abstracted customer interaction data, a set of features corresponding to the target entity is extracted. The set of features characterizes customer interaction with respect to the target entity. Based on the set of features, a prediction model is modelled to predict the repeat behaviour probability of the customer with respect to the target entity.

Inventors:
AGARWAL PUNEET (IN)
KAZMI AUON HAIDAR (IN)
SHROFF GAUTAM (IN)
Application Number:
PCT/IB2017/056227
Publication Date:
April 19, 2018
Filing Date:
October 09, 2017
Assignee:
TATA CONSULTANCY SERVICES LTD (IN)
International Classes:
G06E1/00
Foreign References:
US20140222506A1 (2014-08-07)
US20070094066A1 (2007-04-26)
US20070244741A1 (2007-10-18)
US20150081469A1 (2015-03-19)
Attorney, Agent or Firm:
KOSHAL, Amit et al. (IN)
Claims:
WE CLAIM:

1. A processor-implemented method for predicting repeat behavior of a customer, the repeat behavior indicative of a probability of the customer repeatedly performing one of purchasing and utilizing a target entity, the method comprising:

abstracting, into a common data format (CDF), a customer interaction data associated with interactions of the customer with respect to at least one target entity, to obtain an abstracted customer interaction data, via one or more hardware processors, abstracting the customer interaction data into the CDF comprises mapping data dimensions from the customer interaction data associated with interactions of the customer with respect to the at least one target entity with the CDF;

extracting, based on at least a portion of the abstracted customer interaction data, a set of features corresponding to the at least one target entity, the set of features characterizing customer interaction with respect to the target entity, via the one or more hardware processors; and

modeling, based on the set of features, a prediction model to predict repeat behavior probability of the customer with respect to the at least one target entity, via the one or more hardware processors.

2. The method as claimed in claim 1, wherein abstracting the customer interaction data into the CDF further comprises identifying at least a plurality of mandatory fields and a plurality of optional fields in the customer interaction data to be utilized for extracting the set of features.

3. The method as claimed in claim 1, wherein the set of features comprises at least one of customer based features, target entity based features, customer-target entity interaction based features, profile based features, and similarity based features.

4. The method as claimed in claim 1, wherein the customer interaction data comprises at least a transaction history data of the customer with respect to the at least one target entity and product information associated with a plurality of products.

5. The method as claimed in claim 4, wherein modeling the prediction model based on the set of features comprises:

obtaining a training data using the transaction history data of a plurality of labeled customer samples, wherein a customer sample of the plurality of labeled customer samples with a repeater label is classified as a positive sample, and a customer sample of the plurality of labeled customer samples with a non-repeater label is classified as a negative sample; and

learning the prediction model based on the training data and the set of features.

6. The method as claimed in claim 5, further comprising updating the prediction model based on products considered for offer, campaign parameters and test set of customers.

7. The method as claimed in claim 6, further comprising generating an optimization model based at least on the repeat behavior probability obtained from the prediction model, the optimization model configured to provide recommendations of offers to the customer with respect to the at least one target entity to maximize an Expected Incremental Revenue (ER) for a set of products (S) and customers (C),

wherein the ER is the difference between Lift(c, p) at time = (t + 1), and offer cost OC(p) incurred at time = t,

i.e. $\mathrm{ER}(c, p) = \mathrm{Lift}(c, p) - \mathrm{OC}(p)$

wherein Lift(c, p) represents expected revenue from the customer c shopping for a product p after a campaign,

i.e. $\mathrm{Lift}(c, p) = \mathrm{RP}(c, p) \cdot (1 - \mathrm{BP}(c, p)) \cdot P(p)$

where BP(c, p) is the fraction of transactions in customer c's transaction history data associated with the product p.

8. The method as claimed in claim 7, wherein maximizing the ER comprises offering a product $p_h$ to the customer c such that:

$p_h = \arg\max_{p_i} \mathrm{ER}(c, p_i)$

9. The method as claimed in claim 7, further comprising determining a total ER and a total budget associated with a set of offered products, wherein:

Total budget is:

$\mathrm{CC} + \sum_{(c', p') \in \mathrm{OP}} \mathrm{OC}(p')$, and

Total expected incremental revenue is:

$\sum_{(c', p') \in \mathrm{OP}} \mathrm{ER}(c', p')$

where CC is the campaign cost representative of costs associated with the campaign, and

offered products $\mathrm{OP} = \{(c', p')\}$, where p' represents the product offered to customer c' according to the optimization model.

10. A system for predicting repeat behavior of a customer, the repeat behavior indicative of a probability of the customer repeatedly performing one of purchasing and utilizing a target entity, the system comprising:

one or more memories; and

one or more hardware processors, the one or more memories coupled to the one or more hardware processors, wherein the one or more hardware processors are capable of executing programmed instructions stored in the one or more memories to:

abstract, into a common data format (CDF), a customer interaction data associated with interactions of the customer with respect to at least one target entity, to obtain an abstracted customer interaction data, abstracting the customer interaction data into the CDF comprises mapping data dimensions from the customer interaction data associated with interactions of the customer with respect to the at least one target entity with the CDF;

extract, based on at least a portion of the abstracted customer interaction data, a set of features corresponding to the at least one target entity, the set of features characterizing customer interaction with respect to the target entity; and

model, based on the set of features, a prediction model to predict repeat behavior probability of the customer with respect to the at least one target entity.

11. The system as claimed in claim 10, wherein to abstract the customer interaction data into the CDF, the one or more hardware processors are further configured by the instructions to identify at least a plurality of mandatory fields and a plurality of optional fields in the customer interaction data to be utilized for extracting the set of features.

12. The system as claimed in claim 10, wherein the set of features comprises at least one of customer based features, target entity based features, customer-target entity interaction based features, profile based features, and similarity based features.

13. The system as claimed in claim 10, wherein the customer interaction data comprises at least a transaction history data of the customer with respect to the at least one target entity and product information associated with a plurality of products.

14. The system as claimed in claim 13, wherein to model the prediction model based on the set of features, the one or more hardware processors are further configured by the instructions to:

obtain a training data using the transaction history data of a plurality of labeled customer samples, wherein a customer sample of the plurality of labeled customer samples with a repeater label is classified as a positive sample, and a customer sample of the plurality of labeled customer samples with a non-repeater label is classified as a negative sample; and

learn the prediction model based on the training data and the set of features.

15. The system as claimed in claim 14, wherein the one or more hardware processors are further configured by the instructions to update the prediction model based on products considered for offer, campaign parameters and test set of customers.

16. The system as claimed in claim 15, wherein the one or more hardware processors are further configured by the instructions to:

generate an optimization model based at least on the repeat behavior probability obtained from the prediction model, the optimization model configured to provide recommendations of offers to the customer with respect to the at least one target entity to maximize an Expected Incremental Revenue (ER) for a set of products (S) and customers (C),

wherein the ER is the difference between Lift(c, p) at time = (t + 1), and offer cost OC(p) incurred at time = t,

i.e. $\mathrm{ER}(c, p) = \mathrm{Lift}(c, p) - \mathrm{OC}(p)$

wherein Lift(c, p) represents expected revenue from the customer c shopping for a product p after a campaign,

i.e. $\mathrm{Lift}(c, p) = \mathrm{RP}(c, p) \cdot (1 - \mathrm{BP}(c, p)) \cdot P(p)$

where BP(c, p) is the fraction of transactions in customer c's transaction history data associated with the product p.

17. The system as claimed in claim 16, wherein to maximize the ER, the one or more hardware processors are further configured by the instructions to offer a product $p_h$ to the customer c such that:

$p_h = \arg\max_{p_i} \mathrm{ER}(c, p_i)$

18. The system as claimed in claim 16, wherein the one or more hardware processors are further configured by the instructions to determine a total ER and a total budget associated with a set of offered products:

Total budget is:

$\mathrm{CC} + \sum_{(c', p') \in \mathrm{OP}} \mathrm{OC}(p')$, and

Total expected incremental revenue is:

$\sum_{(c', p') \in \mathrm{OP}} \mathrm{ER}(c', p')$

where CC is a campaign cost representative of costs associated with the campaign, and

offered products $\mathrm{OP} = \{(c', p')\}$, where p' represents the product offered to customer c' according to the optimization model.

Description:
SYSTEM AND METHOD FOR PREDICTING REPEAT BEHAVIOR OF CUSTOMERS

CROSS-REFERENCE TO RELATED APPLICATIONS AND PRIORITY

[001] The present invention claims priority to Indian provisional specification (Title: System and method for predicting repeat behavior of customers) No. 201621034672, filed in India on October 10, 2016.

TECHNICAL FIELD

[002] The present disclosure in general relates to customer behaviour prediction, and more particularly to a system and method for predicting repeat behaviour of customers using machine learning models. The predicted repeat behaviour is further utilized for modelling an optimization model that provides recommendations of offers to the customer.

BACKGROUND

[003] Consumer brands often run promotional campaigns and offer incentives such as discounts or coupons to attract new customers. In the electronic commerce (also referred to as e-commerce) industry pertaining to retail businesses, the shopping behaviour of customers is analysed after such promotional campaigns in order to identify the customers that are more likely to make a repeat purchase after an initial incentivized purchase. By focusing on such potential customers in subsequent targeted marketing campaigns, merchants may greatly reduce promotional costs and enhance the return on investment (ROI) for the products. Typically, targeted marketing campaigns can be organised not just for products but also for various other target entities such as retail stores, merchants, markets, store chains, and the like. The likelihood of a customer associating with a target entity after a promotional campaign may be termed the 'repeat behaviour' of the customer with respect to said target entity.

[004] Conventionally, various techniques are available for identifying relevant or potential customers that can be targeted in subsequent targeted campaigns. Such techniques utilize the transaction history of customers to model customer loyalty and repeat behavior. Another conventional technique computes a parameter called Current Customer Lifetime Value (CLV), indicative of the total potential value from a customer over the life of the relationship, to decide which customers should be allowed to churn and which should be retained. Yet another conventional technique utilizes click-behavior in addition to the transaction history to identify customer repeat behavior. These conventional techniques utilize different features for the same prediction task, so the machine learning model adopted differs from case to case.

[005] The inventors here have recognized several technical problems with such conventional systems, as explained below. The conventional frameworks or models use different features for the task of prediction, primarily because every retail company has its unique model of operations. For example, e-commerce data contains users' click-behavior in addition to transaction history, and therefore said models face the challenge of pre-processing such data. Moreover, it becomes challenging to use such custom models in a generic prescriptive analytics framework.

SUMMARY

[006] The following presents a simplified summary of some embodiments of the disclosure in order to provide a basic understanding of the embodiments. This summary is not an extensive overview of the embodiments. It is not intended to identify key/critical elements of the embodiments or to delineate the scope of the embodiments. Its sole purpose is to present some embodiments in a simplified form as a prelude to the more detailed description that is presented below.

[007] In view of the foregoing, an embodiment herein provides methods and systems for predicting repeat behavior of a customer. The repeat behavior is indicative of a probability of the customer repeatedly purchasing or utilizing a target entity. The method for predicting repeat behavior of the customer includes abstracting, into a common data format (CDF), a customer interaction data associated with interactions of the customer with respect to at least one target entity, to obtain an abstracted customer interaction data, via one or more hardware processors. Abstracting the customer interaction data into the CDF comprises mapping data dimensions from the customer interaction data associated with interactions of the customer with respect to the at least one target entity with the CDF. Further, the method includes extracting, based on at least a portion of the abstracted customer interaction data, a set of features corresponding to the at least one target entity, the set of features characterizing customer interaction with respect to the target entity, via the one or more hardware processors. Furthermore, the method includes modeling, based on the set of features, a prediction model to predict repeat behavior probability of the customer with respect to the at least one target entity, via the one or more hardware processors.

[008] In another aspect, a system for predicting repeat behavior of a customer is provided. The repeat behavior is indicative of a probability of the customer repeatedly purchasing or utilizing a target entity. The system includes one or more memories; and one or more hardware processors, the one or more memories coupled to the one or more hardware processors, wherein the one or more hardware processors are capable of executing programmed instructions stored in the one or more memories to abstract, into a common data format (CDF), a customer interaction data associated with interactions of the customer with respect to at least one target entity, to obtain an abstracted customer interaction data. Abstracting the customer interaction data into the CDF comprises mapping data dimensions from the customer interaction data associated with interactions of the customer with respect to the at least one target entity with the CDF. The one or more hardware processors are further configured by the instructions to extract, based on at least a portion of the abstracted customer interaction data, a set of features corresponding to the at least one target entity. The set of features characterizes customer interaction with respect to the target entity. Furthermore, the one or more hardware processors are capable of executing programmed instructions to model, based on the set of features, a prediction model to predict repeat behavior probability of the customer with respect to the at least one target entity.

[009] In yet another aspect, a non-transitory computer-readable medium having embodied thereon a computer program for executing a method for predicting repeat behavior of a customer is provided. The repeat behavior is indicative of a probability of the customer repeatedly purchasing or utilizing a target entity. The method for predicting repeat behavior of the customer includes abstracting, into a common data format (CDF), a customer interaction data associated with interactions of the customer with respect to at least one target entity, to obtain an abstracted customer interaction data. Abstracting the customer interaction data into the CDF comprises mapping data dimensions from the customer interaction data associated with interactions of the customer with respect to the at least one target entity with the CDF. Further, the method includes extracting, based on at least a portion of the abstracted customer interaction data, a set of features corresponding to the at least one target entity, the set of features characterizing customer interaction with respect to the target entity. Furthermore, the method includes modeling, based on the set of features, a prediction model to predict repeat behavior probability of the customer with respect to the at least one target entity.

BRIEF DESCRIPTION OF THE FIGURES

[0010] The detailed description is described with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the drawings to reference like features and modules.

[0011] FIG. 1 illustrates a networking environment implementing repeat behaviour prediction of customers, in accordance with an example embodiment.

[0012] FIG. 2 illustrates a block diagram of a customer repeat behaviour prediction system, in accordance with an example embodiment.

[0013] FIG. 3 illustrates an example for presenting common data format (CDF) in a tabular format for repeat behaviour prediction of customers, in accordance with an example embodiment.

[0014] FIG. 4 illustrates an example for presenting sample feature set in a tabular format for repeat behaviour prediction of customers, in accordance with an example embodiment.

[0015] FIG. 5 illustrates an example system architecture for repeat behaviour prediction of customers, in accordance with an example embodiment.

[0016] FIG. 6 illustrates an example process flow for repeat behaviour prediction of customers, in accordance with an example embodiment.

[0017] FIG. 7 illustrates an example UI for offer optimization for customer repeat behaviour prediction, in accordance with an example embodiment.

[0018] It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative systems and devices embodying the principles of the present subject matter. Similarly, it will be appreciated that any flow charts, flow diagrams, and the like represent various processes which may be substantially represented in computer readable medium and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.

DETAILED DESCRIPTION

[0019] The probability of a customer repeatedly acquiring a target entity, for example, a brand, merchant, store, or product, after a promotional campaign indicates the repeat behaviour of the customer towards the target entity. Herein, acquiring may refer to one of purchasing or utilizing the target entity. As the repeat behaviour is triggered by promotional campaigns, product portfolio managers of the target entity may wish to maximize the gain with respect to the marketing budget allocated for the offers and the cost of said offers. Maximization of the gain with respect to the marketing budget can be achieved by making personalized offers to specific customers that are more likely to become loyal. Typically, in order to make such personalized offers, various statistical models are utilized that are capable of predicting the buying behaviour of customers based on their transaction history.

[0020] Conventional models for predicting the repeat behaviour of customers utilize different features for the same task of customer behaviour prediction, primarily because every organization or retail company operates on a unique personalized model. For example, an e-commerce retail store may utilize e-commerce data such as users' click-behaviour in addition to transaction history to predict the repeat behaviour of the customers. Typically, a prediction model for such a retail store may perform significant processing to pre-process the e-commerce data, followed by feature generation. Additionally, such prediction models may require a view of the effective factors and/or features affecting the repeat behaviour of consumers, when such factors and/or features are lost during transactions. Moreover, such models that are customized to the target entity may not be efficient for use in maximizing gains on a planned marketing budget in a generic prescriptive analytics framework.

[0021] The repeat behaviour of the customers is affected by various factors such as presentation-related factors (for instance, website layout for online shopping and product placement in shelves in retail), customer service, available payment methods, availability of variety of choices, product quality, and so on. However, conventional prediction models primarily utilize the transaction history of customers to predict the customer loyalty and repeat behaviour, thereby yielding inaccurate prediction of customer repeat behaviour.

[0022] Various embodiments disclosed herein provide a generic framework with capabilities for data pre-processing, feature extraction and predictive modelling, which could be helpful in addressing the above-mentioned challenges. The embodiments also provide a method and system to facilitate prediction of the repeat behaviour of customers in an accurate manner. For example, the embodiments provide a system capable of extracting various features that may capture different indicators of customer loyalty and repeat behaviour. In addition, the disclosed system is capable of predicting and suggesting the target entity, for example, the products on which an offer may be made so that the customer returns to buy the product the next time. In an implementation, the disclosed system utilizes customer interaction data with the target entity to predict such repeat behaviour and at the same time suggest products. The customer interaction data includes various parameters such as the transaction history of the customer, products information, and so on.

[0023] In an implementation, the disclosed system includes an offer optimization module for recommending products based on the predictions. In an implementation, the disclosed system utilizes customer preferences (as modelled from transaction history in form of repeat behaviour) and promotional campaign costs for offer optimization. The system may include modules for processing the information received from various sources, in a manner that said information may present the optimal product recommendations to the customer.
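The offer optimization described above can be illustrated with the Expected Incremental Revenue (ER) definitions recited in the claims. The following is a minimal sketch under stated assumptions, not the disclosed implementation: it assumes RP(c, p) is the repeat probability produced by the prediction model and P(p) is the price of product p (the claims do not define either explicitly), and every function name and sample value is hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (customer, product)

# Hypothetical inputs; every value below is invented for illustration.
RP: Dict[Pair, float] = {("c1", "tea"): 0.5, ("c1", "coffee"): 0.25}  # assumed: repeat probability
BP: Dict[Pair, float] = {("c1", "tea"): 0.5, ("c1", "coffee"): 0.0}   # fraction of past transactions with p
PRICE: Dict[str, float] = {"tea": 10.0, "coffee": 40.0}               # assumed meaning of P(p)
OFFER_COST: Dict[str, float] = {"tea": 1.0, "coffee": 2.0}            # OC(p)


def lift(c: str, p: str) -> float:
    """Lift(c, p) = RP(c, p) * (1 - BP(c, p)) * P(p)."""
    return RP[(c, p)] * (1.0 - BP[(c, p)]) * PRICE[p]


def er(c: str, p: str) -> float:
    """ER(c, p) = Lift(c, p) - OC(p)."""
    return lift(c, p) - OFFER_COST[p]


def best_offer(c: str, products: List[str]) -> str:
    """Pick the product with the highest ER for customer c (the argmax step)."""
    return max(products, key=lambda p: er(c, p))


def campaign_totals(offered: List[Pair], campaign_cost: float) -> Tuple[float, float]:
    """Return (total budget, total ER) over the offered (customer, product) pairs."""
    total_budget = campaign_cost + sum(OFFER_COST[p] for _, p in offered)
    total_er = sum(er(c, p) for c, p in offered)
    return total_budget, total_er


offered = [("c1", best_offer("c1", ["tea", "coffee"]))]
total_budget, total_er = campaign_totals(offered, campaign_cost=5.0)
```

With these sample values, the product the customer already buys often (high BP) earns a small lift, so the sketch prefers the product that is new to the customer.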

[0024] In an embodiment, the system includes a meta-model for transactional data representation and feature engineering. The system enables exhaustive feature-set generation and places different machine learning algorithms at the user's disposal. The optimization model helps users select the right products for making offers to individual customers. Features generated by the framework are applicable to multiple datasets associated with different domains. It is to be noted that the generated features are those generally applicable across transactional datasets of this type, and may therefore work with various datasets.
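As an illustration of abstracting source data into a common data format by mapping data dimensions, the sketch below maps one source record onto a small CDF with mandatory and optional fields. The field names, the mapping, and the `to_cdf` helper are all hypothetical; the actual CDF of FIG. 3 is not reproduced here.

```python
from typing import Any, Dict

# Hypothetical CDF schema (assumption, not the schema of FIG. 3).
MANDATORY_FIELDS = ("customer_id", "target_entity_id", "timestamp")
OPTIONAL_FIELDS = ("quantity", "amount")


def to_cdf(record: Dict[str, Any], mapping: Dict[str, str]) -> Dict[str, Any]:
    """Map a source record into the CDF using a source-column -> CDF-field mapping."""
    known = set(MANDATORY_FIELDS) | set(OPTIONAL_FIELDS)
    cdf: Dict[str, Any] = {}
    for source_column, cdf_field in mapping.items():
        # Ignore mappings to unknown CDF fields and columns absent from the record.
        if cdf_field in known and source_column in record:
            cdf[cdf_field] = record[source_column]
    missing = [f for f in MANDATORY_FIELDS if f not in cdf]
    if missing:
        raise ValueError(f"missing mandatory CDF fields: {missing}")
    for f in OPTIONAL_FIELDS:
        cdf.setdefault(f, None)  # optional fields default to None when unmapped
    return cdf


# Hypothetical retail schema and record.
retail_mapping = {"cust": "customer_id", "sku": "target_entity_id",
                  "date": "timestamp", "qty": "quantity"}
row = {"cust": "c1", "sku": "tea", "date": "2017-01-05", "qty": 2, "cashier": "x"}
cdf_row = to_cdf(row, retail_mapping)
```

Only the mapping changes from one data source to the next, so downstream feature code can be written once against the CDF field names.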

[0025] In an embodiment, the method for predicting customer repeat behaviour includes capturing the transaction history data associated with the customer using a computer and display device, creating a mapping between the common data format and the data schema of the transaction history data, performing feature engineering to generate feature sets, learning a model to produce predictions of repeat behaviour, formulating a strategy for recommending offers by using an optimization model, selecting products to recommend, and then communicating information regarding the selected products to the same users using the same computer and display device. The above-described system and method for predicting customer repeat behaviour are described in detail with respect to FIGS. 1 through 7.
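The feature engineering step in this flow can be sketched as follows. This is an assumption-laden example rather than the disclosed feature set: it derives one customer-based, one target-entity-based, and two customer-target interaction features from records already in a CDF-like form, and the record layout and feature names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def extract_features(cdf_records: List[Dict[str, str]],
                     customer_id: str,
                     target_entity_id: str) -> Dict[str, float]:
    """Count-based features for one (customer, target entity) pair."""
    by_customer = Counter(r["customer_id"] for r in cdf_records)
    by_target = Counter(r["target_entity_id"] for r in cdf_records)
    by_pair = Counter((r["customer_id"], r["target_entity_id"]) for r in cdf_records)

    n_customer = by_customer[customer_id]
    n_pair = by_pair[(customer_id, target_entity_id)]
    return {
        "customer_txn_count": n_customer,                  # customer-based
        "target_txn_count": by_target[target_entity_id],   # target-entity-based
        "pair_txn_count": n_pair,                          # interaction-based
        # Share of the customer's transactions involving this target entity.
        "pair_txn_fraction": n_pair / n_customer if n_customer else 0.0,
    }


# Hypothetical CDF records.
records = [
    {"customer_id": "c1", "target_entity_id": "tea"},
    {"customer_id": "c1", "target_entity_id": "tea"},
    {"customer_id": "c1", "target_entity_id": "coffee"},
    {"customer_id": "c2", "target_entity_id": "tea"},
]
features = extract_features(records, "c1", "tea")
```

The `pair_txn_fraction` value plays the same role as BP(c, p) in the claims, namely the fraction of a customer's transactions associated with a given product.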

[0026] The method(s) and system(s) for predicting repeat behaviour of customer are further described in conjunction with the following figures. It should be noted that the description and figures merely illustrate the principles of the present subject matter. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the present subject matter and are included within its spirit and scope. Furthermore, all examples recited herein are principally intended expressly to be only for pedagogical purposes to aid the reader in understanding the principles of the present subject matter and the concepts contributed by the inventor(s) to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the present subject matter, as well as specific examples thereof, are intended to encompass equivalents thereof.

[0027] FIG. 1 illustrates a network environment 100 implementing a system 102 for predicting repeat behaviour of a customer, according to an embodiment of the present subject matter. The system 102 is configured to utilize the transaction history of the customer to predict the repeat behavior. Additionally, the system 102 is capable of recommending products to the customers that may assist in enhancing their loyalty with respect to the target entity. In an embodiment, in case the target entity is a product, the system 102 may suggest products to the customers based on transaction history as well as the promotional campaign cost associated with the target entity. Herein, it will be noted that for the purpose of this description, the aforementioned examples of the target entity are considered. However, for various other applications or domains, the target entity may be different from the above-mentioned examples. For example, in a domain such as healthcare, the target entity may include a healthcare provider, where it may be required to determine the customer's (or, in this case, the patient's) visiting behavior with respect to the healthcare provider. In general, the disclosed embodiments can be applied to various domains where a behavior pattern is to be predicted. Such a behavior pattern is predicted with respect to the corresponding target entity.

[0028] Although the present disclosure is explained considering that the system 102 is implemented on a server, it may be understood that the system 102 may also be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a cloud-based computing environment and the like. It will be understood that the system 102 may be accessed by multiple users through one or more user devices 106-1, 106-2... 106-N, collectively referred to as user devices 106 hereinafter, or applications residing on the user devices 106. Examples of the user devices 106 may include, but are not limited to, a portable computer, a personal digital assistant, a handheld device, a Smartphone, a Tablet Computer, a workstation and the like. The user devices 106 are communicatively coupled to the system 102 through a network 108. Herein, the users of the user-devices 106 may include one or more of the target entities, product managers, and so on.

[0029] In an embodiment, the network 108 may be a wireless or a wired network, or a combination thereof. In an example, the network 108 can be implemented as a computer network, as one of the different types of networks, such as a virtual private network (VPN), intranet, local area network (LAN), wide area network (WAN), the internet, and the like. The network 108 may either be a dedicated network or a shared network, which represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), and Wireless Application Protocol (WAP), to communicate with each other. Further, the network 108 may include a variety of network devices, including routers, bridges, servers, computing devices, and storage devices. The network devices within the network 108 may interact with the system 102 through communication links.

[0030] As discussed above, the system 102 may be implemented in a computing device 104, such as a hand-held device, a laptop or other portable computer, a tablet computer, a mobile phone, a PDA, a smartphone, and a desktop computer. The system 102 may also be implemented in a workstation, a mainframe computer, a server, and a network server. In an embodiment, the system 102 may be coupled to a data repository, for example, a repository 112. The repository 112 may store data processed, received, and generated by the system 102. In an alternate embodiment, the system 102 may include the data repository 112.

[0031] FIG. 2 illustrates a block diagram of a system 200 for predicting repeat behaviour of a customer, in accordance with an example embodiment. The customer repeat behaviour prediction system 200 (hereinafter referred to as system 200) may be an example of the system 102 (FIG. 1). In an example embodiment, the system 200 may be embodied in, or be in direct communication with, a system such as the system 102 (FIG. 1). In an embodiment, the system 200 facilitates using transaction history data of the customer to predict the repeat behaviour of the customer. Additionally, the system 200 facilitates presenting optimal product recommendations to the customer based on the predicted repeat behaviour. The system 200 includes or is otherwise in communication with one or more hardware processors such as a processor 202, at least one memory such as a memory 204, and a user interface 206. The processor 202, the memory 204, and the user interface 206 may be coupled by a system bus such as a system bus 208 or a similar mechanism.

[0032] The one or more hardware processors such as the processor 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that facilitate managing access to a financial account. Further, the processor 202 may comprise a multi-core architecture. Among other capabilities, the processor 202 is configured to fetch and execute computer-readable instructions or modules stored in the memory 204. The processor 202 may include circuitry implementing, among others, audio and logic functions associated with the communication. For example, the processor 202 may include, but is not limited to, one or more digital signal processors (DSPs), one or more microprocessors, one or more special-purpose computer chips, one or more field-programmable gate arrays (FPGAs), one or more application-specific integrated circuits (ASICs), one or more computer(s), various analog to digital converters, digital to analog converters, and/or other support circuits. The processor 202 thus may also include the functionality to encode messages and/or data or information. The processor 202 may include, among other things, a clock, an arithmetic logic unit (ALU) and logic gates configured to support operation of the processor 202. Further, the processor 202 may include functionality to execute one or more software programs, which may be stored in the memory 204 or otherwise accessible to the processor 202.

[0033] The one or more memories such as the memory 204 may store instructions, any number of pieces of information, and data used by a computer system, for example the system 200, to implement the functions of the system 200. The memory 204 may include, for example, volatile memory and/or non-volatile memory. Some examples of the volatile memory include, but are not limited to, random access memory (RAM), dynamic random access memory, static random access memory, and the like. Some examples of the non-volatile memory include, but are not limited to, hard disks, magnetic tapes, optical disks, programmable read only memory, erasable programmable read only memory, electrically erasable programmable read only memory (EEPROM), flash memory, and the like. The memory 204 may be configured to store information, data, applications, instructions or the like for enabling the system 200 to carry out various functions in accordance with various example embodiments. Additionally or alternatively, the memory 204 may be configured to store instructions which, when executed by the processor 202, cause the system 200 to behave in a manner as described in various embodiments.

[0034] In an embodiment, the memory 204 includes a plurality of modules 220 and a repository 240 for storing data processed, received, and generated by one or more of the modules 220. The modules 220 may include routines, programs, objects, components, data structures, and so on, which perform particular tasks or implement particular abstract data types. The repository 240, amongst other things, includes a system database 242 and other data 244. The other data 244 may include data generated as a result of the execution of one or more modules in the modules 220. The repository 240 is further configured to maintain a customer interaction data. The customer interaction data may include, but is not limited to, transaction history of customers 246, a plurality of customer-target entity pairs with repeater label 248, and products information 250. The details of customer interaction data along with the transaction history of customers, the plurality of customer-target entity pairs with repeater label, and the products information are described further in the description below.

[0035] In an example embodiment, the user interface 206 is in communication with the processor 202. Examples of the user interface 206 include, but are not limited to, an input interface and/or an output user interface. The input interface is configured to receive an indication of a user input. The output user interface provides an audible, visual, mechanical or other output and/or feedback to the user. Examples of the input interface may include, but are not limited to, a keyboard, a mouse, a keypad, a touch screen, soft keys, and the like. Examples of the output interface may include, but are not limited to, a display such as light emitting diode display, thin-film transistor (TFT) display, liquid crystal displays, active-matrix organic light-emitting diode (AMOLED) display, a microphone, a speaker, ringers, vibrators, and the like. In an example embodiment, the user interface 206 may include, among other devices or elements, any or all of a speaker, a microphone, a display, and a keyboard, touch screen, or the like. In this regard, for example, the processor 202 may comprise user interface circuitry configured to control at least some functions of one or more elements of the user interface 206, such as, for example, a speaker, ringer, microphone, display, and/or the like. The processor 202 and/or user interface circuitry comprising the processor 202 may be configured to control one or more functions of one or more elements of the user interface 206 through computer program instructions, for example, software and/or firmware, stored on a memory, for example, the one or more memories 204, and/or the like, accessible to the processor 202. Herein, the memory, for example the memory 204, and the computer program code are configured to, with the hardware processor, for example the processor 202, cause the system 200 to perform various functions described herein under.

[0036] According to the present subject matter, the system 200 facilitates in predicting repeat behavior of the customers. The system 200 may access customer interaction data associated with interactions of the customer with respect to at least one target entity in order to predict the repeat customer behavior. In an embodiment, the customer interaction data may be stored in the repository. Alternatively, the customer interaction data may be stored in an external repository associated with the target entity, and said external repository may be accessed by the system 200.

[0037] The system 200 abstracts a customer interaction data into a common data format (CDF) to obtain an abstracted customer interaction data. The customer interaction data is associated with interactions of the customer with respect to at least one target entity. As described previously, the customer interaction data may include transaction history data 246 of customers, the plurality of customer-target entity pairs with repeater label 248, and the products information 250 associated with a plurality of products. The product information 250 may include, for example, number of units bought, weight, price, company, category, brand, online-actions on products (click, add-to-cart, add-to-favorite for online shopping, and so on).

[0038] In an embodiment, the transaction history data may include transaction baskets of every interaction/visit of the customer with respect to the target entity. Alternatively, in case of online shopping, the transaction history also contains click data of customers. In an example representation, the transaction history data of the customer may be represented by a time-series, as shown below:

$T_c = \{(h(t_1), g(t_1)),\ (h(t_2), g(t_2)),\ (h(t_3), g(t_3)),\ \ldots\}$

Here, $T_c$ represents a transaction of a customer $c$,

$h(t_i) = \langle p_1 : I_1;\ p_2 : I_2;\ \ldots \rangle$ represents transaction basket at time $t_i$,

$g(t_i)$ represents information about the target market/store-chain/website customer visited,

$p_i$ is a product, and $I_i$ is the product information.
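The time-series representation above can be sketched as a small data structure. This is only an illustrative sketch: the class name `Visit`, the function `build_history`, the field names, and the use of ISO date strings as timestamps are assumptions made here and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Visit:
    """One element (h(t_i), g(t_i)) of the time series T_c (hypothetical type)."""
    timestamp: str                         # t_i, assumed to be an ISO date string
    basket: Dict[str, Dict[str, object]]   # h(t_i): product id -> product information
    venue: Dict[str, str]                  # g(t_i): market / store-chain / website visited


def build_history(visits: List[Visit]) -> List[Visit]:
    """Return T_c: the visits of one customer ordered by time."""
    # ISO date strings sort chronologically when compared as text.
    return sorted(visits, key=lambda v: v.timestamp)


history = build_history([
    Visit("2017-03-02",
          {"p2": {"units": 1, "price": 4.5, "brand": "B7"}},
          {"store_chain": "S1"}),
    Visit("2017-01-15",
          {"p1": {"units": 2, "price": 3.0, "brand": "B1"},
           "p2": {"units": 1, "price": 4.5, "brand": "B7"}},
          {"store_chain": "S1"}),
])
```

Each basket maps a product to its product information, so features such as units bought or price can be read per product and per visit.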

[0039] The system 200 abstracts the customer interaction data into the CDF by mapping data dimensions from the customer interaction data with the CDF. In an embodiment, the system 200 may accept input from the user, and utilize said user input to create a mapping of the CDF to data dimensions or data schema. In an example embodiment, the said user input may include names of fields present in the transaction history data. The CDF identifies different dimensions present in transactional history data. Herein, a data dimension refers to a broad categorization of the type of data present in the dataset. For example, fields such as market, store-chain or merchant can be put under the marketplace dimension. Examples of data dimensions in a CDF for an example case are shown in FIG. 3.

[0040] The abstraction provided by the CDF facilitates the data preprocessing and subsequent feature engineering, as will be explained later in the description. In an embodiment, said abstraction of the customer interaction data into the CDF identifies at least a plurality of mandatory fields and a plurality of optional fields in the customer interaction data to be utilized for extracting the set of features. For example, referring to the example shown in FIG. 3, various fields in column # are shown as mandatory and optional respectively.
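A minimal sketch of such a CDF-to-data-schema mapping is given below. The particular mandatory and optional field names, and the helper names `validate_mapping` and `abstract_record`, are assumptions chosen for illustration; the actual fields are those shown in FIG. 3.

```python
from typing import Dict

# Hypothetical CDF fields; the real mandatory/optional split is given in FIG. 3.
MANDATORY_FIELDS = {"customer_id", "product_id", "date"}
OPTIONAL_FIELDS = {"brand", "category", "price", "store_chain"}


def validate_mapping(mapping: Dict[str, str]) -> None:
    """Check a user-supplied mapping of CDF field -> source column name."""
    unknown = set(mapping) - MANDATORY_FIELDS - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"not CDF fields: {sorted(unknown)}")
    missing = MANDATORY_FIELDS - set(mapping)
    if missing:
        raise ValueError(f"mandatory CDF fields not mapped: {sorted(missing)}")


def abstract_record(record: Dict[str, object],
                    mapping: Dict[str, str]) -> Dict[str, object]:
    """Rename one source record into CDF fields.

    Source columns that are not mapped are dropped; a mapped column that is
    absent from the record yields None.
    """
    validate_mapping(mapping)
    return {cdf_field: record.get(source) for cdf_field, source in mapping.items()}


mapping = {"customer_id": "id", "product_id": "sku",
           "date": "purchase_dt", "store_chain": "chain"}
row = {"id": 7, "sku": "A1", "purchase_dt": "2017-01-15", "chain": 12, "extra": "x"}
abstracted = abstract_record(row, mapping)
```

Because later stages read only CDF field names, a new dataset would need a new mapping but no change to the feature extraction code.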

[0041] Based on at least a portion of the abstracted customer interaction data, the system 200 extracts a set of features corresponding to the at least one target entity. Herein, the set of features characterizes customer interaction with respect to the target entity. In an embodiment, the system 200 may include a feature engineering module that may operate on the abstracted customer interaction data to generate said exhaustive set of features. It will be noted that the system 200 is capable of extracting features with respect to different types of target entities, such as product, merchant or store-chain. For example, if the target entity is a product, the system may capture the customer's interest and buying behavior with respect to the product on which the customer was made an offer. On the other hand, if the target entity is a merchant, the system may capture the affinity of the customer with products sold by the merchant.

[0042] In an embodiment, the set of features includes at least one of customer based features, target entity based features, customer-target entity interaction based features, profile based features, and similarity based features. Various sets of features are described as herein under:

(i) Customer based features: The customer based features, characterizing the buying behavior of the customer, may include age group, gender, income and so on. The customer based features are derived from the available information regarding customer demographics and transaction history. For example, the buying behavior of the customers, obtained from the transaction history, may include quantity of items purchased, the time spent on the website, click rates, and so on.

(ii) Target Entity based features: The target entity based features represent the characteristics of the target entity. For example, if the target entity is a product, the target entity based features may include, but are not limited to, repeater rate for the product's company, brand and category.

(iii) Customer-Target Entity Interaction-based features: The customer's transaction history and the product that is being offered to the customer collectively are indicators of the customer's interest in the offer. The customer-target entity interaction based features are based on the interactions between the customer and the target entity, for example, the frequency of the customer buying a particular brand of offer-product or the average amount spent by the customer on the offer-product. In an embodiment, the system 200 is capable of analyzing different fields present in the dimension 'Product' to generate features for a plurality of characteristics. In alternative examples, if more characteristics of the product are present, they can easily be added to the CDF to Data Schema mapping and the system 200 can be used to generate features on them.

(iv) Product based features: Product based features indicate the features of the products, for example, number of clicks, number of purchases, and cost of the product.

(v) Similarity based features: The similarity based features are obtained from the profiles of customers or products or merchants drawn on various fields. For example, if a customer bought products similar to offer-product in past, then features based on such similar products are considered as indicators of the repeat behavior of the customer.

[0043] Herein, it will be noted that the interaction-based features may have an added functionality where user can select a window or at least a portion of the transaction data to be considered for calculating the set of features. A list of some of the above mentioned features is shown in FIG. 4.
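As an illustration of customer-target entity interaction based features computed over a user-selected window, a minimal sketch follows. The record keys (`date`, `brand`, `amount`), the function name `interaction_features`, and the three feature names are assumptions for this example; the features actually generated are listed in FIG. 4.

```python
from datetime import date
from typing import Dict, List, Optional


def interaction_features(transactions: List[Dict[str, object]],
                         offer_brand: str,
                         window_start: Optional[date] = None) -> Dict[str, float]:
    """Compute simple interaction features for one customer and one offer brand.

    transactions: records with hypothetical CDF-style keys 'date' (datetime.date),
    'brand' and 'amount'. Records dated before window_start are ignored.
    """
    in_window = [t for t in transactions
                 if window_start is None or t["date"] >= window_start]
    on_brand = [t for t in in_window if t["brand"] == offer_brand]
    count = len(on_brand)
    total = sum(t["amount"] for t in on_brand)
    return {
        # how often the customer bought the brand of the offer-product
        "brand_purchase_count": count,
        # share of the customer's windowed transactions that involve that brand
        "brand_purchase_share": count / len(in_window) if in_window else 0.0,
        # average amount spent per purchase of that brand
        "brand_avg_amount": total / count if count else 0.0,
    }


txs = [
    {"date": date(2017, 1, 5), "brand": "B1", "amount": 10.0},
    {"date": date(2017, 2, 1), "brand": "B2", "amount": 4.0},
    {"date": date(2017, 3, 1), "brand": "B1", "amount": 20.0},
]
all_time = interaction_features(txs, "B1")
recent = interaction_features(txs, "B1", window_start=date(2017, 2, 1))
```

Computing the same features over several windows (for example all history and the most recent months) is one way a single definition can yield multiple model inputs.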

[0044] In an embodiment, the system 200 converts the exhaustive set of features into a desirable format suitable for machine learning. In an embodiment, the system 200 may include a learning pre-processor module (not shown) that converts the exhaustive set of features into the desirable format suitable for machine learning. Based on the set of features, the system 200 further models the prediction model to predict repeat behaviour probability of the customer with respect to the at least one target entity. Herein, the term 'repeat behaviour probability' refers to the precision of the prediction model at a threshold. In other words, the repeat behaviour probability is defined as the precision taken at a threshold equal to the Prediction Score. The repeat probability is the fraction of the customers who would be considered as repeaters by the prediction model at said threshold that are actual repeaters in the labelled data.
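The definition of repeat behaviour probability as a precision at a threshold can be sketched as follows. The function name `precision_at_threshold` and the convention that a score greater than or equal to the threshold counts as a predicted repeater are assumptions; the specification does not state how ties at the threshold are handled.

```python
from typing import Sequence


def precision_at_threshold(scores: Sequence[float],
                           labels: Sequence[int],
                           threshold: float) -> float:
    """Fraction of customers scored at or above `threshold` that are labelled repeaters.

    labels use 1 for repeater and 0 for non-repeater. Returns 0.0 when no
    customer reaches the threshold (an assumed convention).
    """
    predicted = [y for s, y in zip(scores, labels) if s >= threshold]
    if not predicted:
        return 0.0
    return sum(predicted) / len(predicted)


scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 0, 1, 1, 0]
# Repeat probability for a pair whose Prediction Score is 0.6:
rp = precision_at_threshold(scores, labels, threshold=0.6)
```

Here three labelled customers score at least 0.6 and two of them are repeaters, so the sketch yields a repeat probability of 2/3 for that score.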

[0045] In an embodiment, the prediction model may be a supervised machine learning model. Modelling of the prediction model may include obtaining training data using the transaction history data of the plurality of labelled customer samples corresponding to a target entity. For example, the labelled customer samples may include customer-target entity pairs with repeater labels. The system 200 may utilize said training data and the set of features to model and learn the prediction model. Herein, for the purpose of learning or training the prediction model, a customer sample of the plurality of labelled customer samples with a repeater label is classified as a positive sample, and a customer sample of the plurality of labelled customer samples with a non-repeater label is classified as a negative sample. Moreover, the machine learning compatible feature sets are used by a machine learning algorithm to learn the prediction model with repeater as positive class and non-repeater as negative class. The objective of parameter tuning is to maximize the Area Under ROC Curve (AUC). The framework user may tweak the learning model for a different evaluation metric such as F-measure, Precision, Recall, and so on. Based on the training using the machine learning approach, the prediction model outputs repeat behaviour probabilities of the customer samples for target entities. In particular, during the training stage, the customer samples considered may be training samples. Once the prediction model is trained with training customer samples, it can be utilized for predicting repeat behaviour of other (non-training) customer samples. In an example embodiment, the prediction model may be trained on Vowpal Wabbit, which includes several in-built machine learning algorithms.
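Since parameter tuning targets the Area Under ROC Curve, a small sketch of how AUC could be computed on a validation split is shown here. It uses the pairwise-ranking definition of AUC in plain Python rather than any particular library; the function name `roc_auc` is an assumption, and the quadratic loop is meant for illustration, not for datasets of the size discussed in this description.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a repeater outscores a non-repeater.

    labels use 1 for the positive class (repeater) and 0 for the negative
    class (non-repeater). Tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


auc = roc_auc([0.9, 0.8, 0.6, 0.4, 0.2], [1, 0, 1, 1, 0])
```

Because AUC depends only on the ordering of scores, it is insensitive to the class imbalance ratio itself, which is one plausible reason to tune on it when repeaters are a minority.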

[0046] As is described previously, herein repeat behaviour represents the probability of a customer buying the product (or from the merchant/market/store-chain) after a promotional campaign. Hence, it is required to formulate a strategy of recommending offers to the customers in such a manner so as to maximize the gain achieved by way of providing offers to the customers in the promotional campaign with respect to the marketing budget and the cost of said promotional offers.

[0047] In an embodiment, the system includes an optimization model that is capable of recommending products to customers to achieve the objective of gain maximization. The optimization model operates based on predictions produced by the prediction model. In an embodiment, the optimization model considers both customer preferences (as modeled from transaction history in the form of repeat behavior) and promotional campaign costs for offer optimization. The optimization model is explained further in the description below.

[0048] The objective of the optimization model is to maximize an Expected Incremental Revenue (ER) for a given set of products and customers. In an embodiment, the system generates the optimization model based at least on the repeat behaviour probability obtained from the prediction model. The optimization model is configured to provide recommendations of offers to the customer with respect to the at least one target entity to maximize the Expected Incremental Revenue (ER) for a set of products (S) and customers (c).

Herein, the ER is the difference between Lift(c, p) at time = (t + 1), and the offer cost OC(p) incurred at time = t,

i.e. $\mathrm{ER}(c, p) = \mathrm{Lift}(c, p) - \mathrm{OC}(p)$

wherein, Lift(c, p) represents the expected revenue from the customer c shopping for a product p after a campaign,

i.e. $\mathrm{Lift}(c, p) = \mathrm{RP}(c, p) \cdot (1 - \mathrm{BP}(c, p)) \cdot P(p)$

where the Buying Probability, BP(c, p), is the fraction of transactions in customer c's transaction history data in which the product p was bought.

Here, the second term (1 - BP(c, p)) is to ensure that previously loyal customers of product p are not offered the same product.

The Prediction Score(c, p) generated by the prediction model is not a probability; therefore the optimization model generates a Repeat Probability, RP(c, p), as the Precision of the model at threshold = Prediction Score(c, p).

[0049] The Precision at threshold = Prediction Score(c, p) is taken as repeat probability because it is a fraction of number of customers that are actual repeaters in labelled data with the number of customers who may be considered as repeaters by the optimization model at this threshold. Price of a product p is indicated as P(p) and Offer Cost as OC(p). Campaign Cost, CC represents miscellaneous costs associated with campaign such as promotions, advertisements, and so on.
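The Expected Incremental Revenue for a single customer-product pair, as defined above from the repeat probability, buying probability, price and offer cost, can be written as a short worked sketch. The function and parameter names are illustrative assumptions, and the numeric values are made up for the example.

```python
def lift(repeat_prob: float, buying_prob: float, price: float) -> float:
    """Lift(c, p) = RP(c, p) * (1 - BP(c, p)) * P(p)."""
    return repeat_prob * (1.0 - buying_prob) * price


def expected_incremental_revenue(repeat_prob: float, buying_prob: float,
                                 price: float, offer_cost: float) -> float:
    """ER(c, p) = Lift(c, p) - OC(p)."""
    return lift(repeat_prob, buying_prob, price) - offer_cost


# Illustrative values only: RP = 0.5, BP = 0.2, price = 10, offer cost = 1.
er_example = expected_incremental_revenue(0.5, 0.2, 10.0, 1.0)
# A customer who bought p in every past transaction (BP = 1) has zero lift,
# so offering p to that customer only incurs the offer cost.
er_loyal = expected_incremental_revenue(0.9, 1.0, 10.0, 1.0)
```

With these values the lift is 0.5 * 0.8 * 10 = 4 and the expected incremental revenue is 3, while the fully loyal customer yields -1.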

[0050] The system is caused to generate the optimization model with an optimization goal to maximize the Expected Incremental Revenue. In an embodiment, the system 200 learns the prediction model for repeat behaviour prediction using transaction history data of the labelled customer samples. Now, considering C denotes a set of customers into consideration and S denotes a set of products, the learnt model is used to predict repeat behaviour scores for these customer-product pairs.

[0051] The ER(c, S) is calculated $\forall c \in C$ and $\forall p_i \in S$ as

$\mathrm{ER}(c, S) = \{\mathrm{ER}(c, p_1), \mathrm{ER}(c, p_2), \ldots, \mathrm{ER}(c, p_i), \ldots\}$

The optimization model is configured to provide recommendations to offer customer c the product $p_h$ such that $p_h = \arg\max_{p_i} \mathrm{ER}(c, p_i)$.

[0052] For the product offered $p_h$, the Expected Incremental Revenue from customer c is $\mathrm{ER}(c, p_h)$ and the budget incurred is $\mathrm{OC}(p_h)$. The optimization model keeps offering one/no product to a customer till the budget gets exhausted. Let Offered Products $OP = \{(c', p') \mid p' \text{ is the product offered according to the strategy to customer } c'\}$, then,

$\text{Total Budget} = CC + \sum_{(c', p') \in OP} \mathrm{OC}(p')$

$\text{Total Expected Incremental Revenue} = \sum_{(c', p') \in OP} \mathrm{ER}(c', p')$

So, for a given budget, the disclosed optimization model suggests products from set OP that should be offered to customers with the expected revenue as mentioned above.
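One possible reading of this offer strategy is sketched below. Several details are assumptions not stated in the description: customers are visited in decreasing order of their best Expected Incremental Revenue, a customer whose best ER is not positive is offered no product, and an offer is skipped when its cost no longer fits the remaining budget. The function name `recommend_offers` and the dictionary layout are likewise hypothetical.

```python
from typing import Dict, List, Tuple


def recommend_offers(er: Dict[str, Dict[str, float]],
                     offer_cost: Dict[str, float],
                     budget: float,
                     campaign_cost: float) -> Tuple[List[Tuple[str, str]], float, float]:
    """Select at most one product per customer under a total budget.

    er[c][p] is ER(c, p); offer_cost[p] is OC(p). Returns the offered
    (customer, product) pairs OP, the total budget used (CC plus offer
    costs), and the total Expected Incremental Revenue.
    """
    candidates = []
    for customer, per_product in er.items():
        if not per_product:
            continue
        best_product = max(per_product, key=per_product.get)  # p_h = argmax ER(c, p_i)
        if per_product[best_product] > 0:                     # assumed: otherwise no offer
            candidates.append((per_product[best_product], customer, best_product))
    candidates.sort(key=lambda item: item[0], reverse=True)   # assumed ordering

    remaining = budget - campaign_cost
    offered: List[Tuple[str, str]] = []
    for _, customer, product in candidates:
        if offer_cost[product] <= remaining:
            offered.append((customer, product))
            remaining -= offer_cost[product]

    total_budget = campaign_cost + sum(offer_cost[p] for _, p in offered)
    total_er = sum(er[c][p] for c, p in offered)
    return offered, total_budget, total_er


er_table = {
    "c1": {"p1": 5.0, "p2": 3.0},
    "c2": {"p1": -1.0, "p2": 2.0},
    "c3": {"p1": -0.5, "p2": -2.0},
}
costs = {"p1": 2.0, "p2": 1.0}
offers, used, revenue = recommend_offers(er_table, costs, budget=13.0, campaign_cost=10.0)
```

In this toy input, customer c3 receives no offer because no product has positive ER, and raising the budget beyond 13 would not add offers, which mirrors the saturation behaviour described for the experiments.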

[0053] Various embodiments provide method and system for predicting repeat behaviour of customer, along with its abstract feature generation module, which works on multiple types of datasets. Certain embodiments describe offer optimization model, which prescribes what offers should be made to a customer. An example system architecture having various modules for predicting repeat behaviour of the customers is described further with reference to FIG. 5.

[0054] FIG. 5 illustrates example system architecture 500 for predicting repeat behaviour of customers, in accordance with an example embodiment. The system architecture 500 is shown to include a prediction model, for example a prediction model 502 and an optimization model 504. The prediction model 502 is configured to predict repeat behaviour probability of customers.

[0055] As described with reference to FIGS. 2-5, the system takes customer interaction data with the at least one target entity as input, and extracts a set of features therefrom. The system includes an input module 506 that may include the customer interaction data. As illustrated in FIG. 5, the input module is shown to include transaction history data of the customer and may include a CDF to data schema mapping, transaction history data, selected features and data split for parameter tuning.

[0056] The input module 506 provides said input including the transaction history to a feature engineering module 508. The feature engineering module 508 includes a feature extractor sub-module 510 and a learning pre-processor sub-module 512. The feature extractor sub-module 510 extracts a set of features from the customer interaction data and provides said features to the learning pre-processor sub-module 512. The learning pre-processor sub-module 512 pre-processes the extracted features and converts said features into desired format for Machine Learning. The feature engineering module 508 provides the pre-processed features to machine learning based prediction module 514. The prediction module 514 includes a learning sub-module 516 and the prediction model 502. The selected features are used by the learning sub-module 516 to learn the prediction model with repeater as positive class and non-repeater as negative class. The learning of the prediction model 502 is already explained with reference to FIG. 2. The prediction model 502 gives repeat behaviour probabilities of test customers for target entities.

[0057] The optimization model 504 receives input from the prediction model in the form of repeat behaviour prediction probability, buying probabilities (from the input module 506), and a campaign constraints module 518 including campaign parameters, products considered for offers and a test set of customers. Based on the repeat behaviour prediction probability and the campaign constraints, the optimization model 504 provides recommendations for offering products to the customers. For example, the optimization model may provide the recommendations via an output module 520. The output module 520 may include a customer-offer pair list 522, and a graphical analysis 524 of the recommendations. In an embodiment, the output module may be embodied in a user interface (UI), as will be explained further with reference to FIG. 7. In an embodiment, the prediction model is further updated based on the campaign constraints such as products considered for offer, campaign parameters and test set of customers. An example flow-diagram representing functionalities of the system 102 for predicting customer behaviour is described further with reference to FIG. 6.

[0058] FIG. 6 illustrates a flow diagram of a method 600 for predicting customer behaviour, in accordance with an example embodiment. The method 600 depicted in the flow chart may be executed by a system, for example, the system 102/200 of FIG. 1/FIG. 2, respectively. In an example embodiment, the system 102/200 may be embodied in a computing device. Operations of the flowchart, and combinations of operations in the flowchart, may be implemented by various means, such as hardware, firmware, processor, circuitry and/or other device associated with execution of software including one or more computer program instructions. For example, one or more of the procedures described in various embodiments may be embodied by computer program instructions. In an example embodiment, the computer program instructions, which embody the procedures described in various embodiments, may be stored by at least one memory device of an apparatus and executed by at least one processor in the apparatus. Any such computer program instructions may be loaded onto a computer or other programmable apparatus (for example, hardware) to produce a machine, such that the resulting computer or other programmable apparatus embodies means for implementing the operations specified in the flowchart. It will be noted herein that the operations of the method 600 are described with the help of the system 200. However, the operations of the method 600 can be described and/or practiced by using any other system.

[0059] The method 600 for predicting repeat behaviour of a customer initiates at 602. As previously described, the repeat behaviour is indicative of a probability of the customer repeatedly purchasing or utilizing a target entity. At 602, the method 600 includes abstracting a customer interaction data into a CDF to obtain an abstracted customer interaction data. The customer interaction data is associated with interactions of the customer with respect to at least one target entity. In an embodiment, abstracting the customer interaction data into the CDF includes mapping data dimensions from the customer interaction data associated with interactions of the customer with respect to the at least one target entity with the CDF. An example CDF based on experimental results on Kaggle-AVS and IJCAI-RBP datasets (customer interaction data) is illustrated in FIG. 3.

[0060] At 604, the method 600 includes extracting a set of features corresponding to the at least one target entity based on at least a portion of the abstracted customer interaction data. Herein, the set of features characterizes customer interaction with respect to the target entity. An example of the set of features derived from the abstracted customer interaction data (using Kaggle-AVS and IJCAI-RBP datasets) and the user data is illustrated in FIG. 4.

[0061] At 606, the method 600 includes modelling a prediction model based on the set of features to predict repeat behaviour probability of the customer with respect to the at least one target entity. An example of modelling the repeat behaviour probability by the prediction model is described further with reference to FIG. 7 in conjunction with FIGS. 3 and 4.

[0062] FIG. 7 illustrates an example UI 700 of a system for predicting customer behaviour in accordance with an example embodiment. The UI 700 displays an interface for providing recommendations by exploiting predictions generated earlier for selected customer-product pairs. The UI 700 includes a products catalogue window 702, a constraints window 704, and a results window 706. Additionally or alternatively, the UI 700 may include a graph 710 illustrating a comparison between the recommended offers provided by using the disclosed optimization model and random selection, as will be explained later.

[0063] The products catalogue window 702 displays a plurality of products, price and offers (or discounts) corresponding to said products. The constraints window 704 may include tabs for providing marketing budget and campaign costs. Based on said constraints and offer information, the optimization module may determine best possible recommendations for the customers, as shown in the results window 708.

[0064] Referring collectively now to FIGS. 3, 4 and 6, experimental results for the prediction model and the optimization model are presented. The experiments are performed by using example customer interaction data such as the Kaggle-AVS and IJCAI-RBP datasets. In an example scenario, in the Kaggle-AVS data, the transaction history of customers is available with the repeat behaviour of a subset of customers. Attributes such as company and brand of product, store chain, purchase amount, and so on are provided. Analysis is done on 159,857 customers and 14 products, out of which 27% of customers are labelled repeaters for some products (class imbalance ratio of 1:3). Features such as shown in FIG. 5B are extracted by the system. Almost 100 features were selected to generate repeat behaviour scores for selected customer-product pairs. Customers are randomly divided into training, testing and validation sets in the ratio 3:1:1. The offer optimization interface as shown in FIG. 6 exploits predictions generated earlier for selected customer-product pairs. The 'discount' shown in FIG. 6 is equivalent to Offer Cost (OC) here. For a marketing budget of 200,000 and a campaign cost of 50,000, the offer optimization model gives a set of recommendations of products to different customers. These are shown under Result, where a drop-down lists the ids of the customers to whom the corresponding product should be offered. The graph shows a comparison of the optimization algorithm with the strategy in which products are randomly suggested to the customers. As the budget increases, more customers are offered products and revenue keeps on increasing. Saturation occurs when all the customers have been offered some product for which incremental revenue was expected. As shown by the dotted line in the graph in FIG. 6, for the assigned budget, the revenue expected by recommending products to customers randomly is 43,361 dollars. In contrast, the offer optimization model estimates a revenue of 306,562 dollars for the same set of products and assigned budget. It also indicates that the maximum revenue attainable from the customers is 340,000 dollars (approx.) for the given constraints. Thus, the system's predictions were used and exploited by the offer optimization model to finally recommend the products to customers with maximum Expected Incremental Revenue. Random selection is chosen as the baseline as this is the general industrial practice. Thus, the prediction model's results are compared with the results when products are randomly suggested to customers. To test the generality of the system for providing recommendations, the experiments were also done on data provided by IJCAI-RBP. The data comprises around 212,000 customers each in the labelled and unlabelled sets. The target entity is a merchant, with 1994 merchants present in the dataset. This problem also has a high class-imbalance ratio (1:15). The disclosed system was able to generate a comprehensive set of features for this dataset. Logistic regression was used as the learning model and the inventors were able to get an AUC of 0.676 on the competition's test set (the winner had an AUC of 0.71).
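The random division of customers into training, testing and validation sets in the ratio 3:1:1 could be performed along the following lines. This is a sketch only: the function name `split_customers`, the fixed seed, and the choice to give any rounding remainder to the validation set are assumptions, since the description does not say how the split was implemented.

```python
import random
from typing import List, Sequence, Tuple


def split_customers(customer_ids: Sequence[int],
                    ratios: Tuple[int, int, int] = (3, 1, 1),
                    seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Randomly split customer ids into (training, testing, validation) sets."""
    ids = list(customer_ids)
    random.Random(seed).shuffle(ids)   # seeded so the split is reproducible
    total = sum(ratios)
    n_train = len(ids) * ratios[0] // total
    n_test = len(ids) * ratios[1] // total
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    validation = ids[n_train + n_test:]  # receives any rounding remainder
    return train, test, validation


train_ids, test_ids, validation_ids = split_customers(range(10))
```

Splitting by customer id, rather than by individual transaction, keeps all transactions of a customer in the same set.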

[0065] Various embodiments disclose method and system for customer repeat behaviour prediction. In an embodiment, a generic framework for customer repeat behaviour prediction is described along with an abstract feature generation module, which works on multiple types of datasets. The system also includes an offer optimization module, which prescribes what offers should be made to a customer.

[0066] The foregoing description of the specific implementations and embodiments will so fully reveal the general nature of the implementations and embodiments herein that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. Therefore, while the embodiments herein have been described in terms of preferred embodiments, those skilled in the art will recognize that the embodiments herein can be practiced with modification within the spirit and scope of the embodiments as described herein.

[0067] It is intended that the disclosure and examples be considered as exemplary only, with a true scope and spirit of disclosed embodiments being indicated by the following claims.