Title:
TALENT MANAGEMENT PLATFORM
Document Type and Number:
WIPO Patent Application WO/2019/108133
Kind Code:
A1
Abstract:
A method and system of determining suitability of an individual for a job position is disclosed. The method comprises: identifying a plurality of individual characteristics from an individual profile data set; retrieving a data model built based on a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with the plurality of individual characteristics of the job position; inputting the identified plurality of individual characteristics into the data model; and computing a score for the individual based on the input into the data model, wherein the score is used to determine the suitability of the individual for the job position.

Inventors:
KEPPO JUSSI (SG)
YANG ZHENGZHI (SG)
PRATAP VISHNU (SG)
Application Number:
PCT/SG2018/050583
Publication Date:
June 06, 2019
Filing Date:
November 29, 2018
Assignee:
X0PA AI PTE LTD (SG)
International Classes:
G06Q10/06; G06Q10/10
Foreign References:
US20120123956A12012-05-17
US20150006422A12015-01-01
US20140122355A12014-05-01
Attorney, Agent or Firm:
AMICA LAW LLC (SG)
Claims:
CLAIMS

1. A method of determining suitability of an individual for a job position, the method comprises:

identifying a plurality of individual characteristics from an individual profile data set;

retrieving a data model built based on a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with the plurality of individual characteristics of the job position;

inputting the identified plurality of individual characteristics into the data model; and

computing a score for the individual based on the input into the data model, wherein the score is used to determine the suitability of the individual for the job position.

2. A method of determining suitability of an individual for a job position, the method comprises:

identifying a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with a plurality of individual characteristics of the job position; and

building a data model based on the job demographic profile data set, wherein input of the identified plurality of individual characteristics into the data model allows for computation of a score for the individual, wherein the score is used to determine the suitability of the individual for the job position.

3. A method of determining suitability of an individual for a job position, the method comprises:

identifying a plurality of individual characteristics from an individual profile data set;

identifying a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with the plurality of individual characteristics of the job position;

building a data model based on the job demographic profile data set;

inputting the identified plurality of individual characteristics into the data model; and

computing a score for the individual based on the input into the data model, wherein the score is used to determine the suitability of the individual for the job position.

4. The method according to any one of claims 1 to 3, comprises performing feature engineering to generate a modified individual characteristic from at least one of the plurality of individual characteristics.

5. The method according to any one of claims 1 to 4, comprises identifying a plurality of requirements from a job description for the job position; wherein building the data model comprises building a plurality of data models, each data model corresponding to a requirement and quantifying the relevance of the individual characteristic to the requirement.

6. The method according to claim 5, comprises matching each identified individual characteristic to the corresponding data model; inputting each identified individual characteristic to the corresponding data model to generate a plurality of individual data model scores; and computing a final score by summing the plurality of individual data model scores.

7. The method according to claim 6, wherein computing the final score comprises applying a non-linear transformation to each individual data model score to obtain a plurality of transformed individual data model scores; summing the plurality of transformed individual data model scores to obtain an overall score; and applying a second non-linear transformation to the overall score to obtain the final score.

8. The method according to any one of claims 5 to 7, wherein the data model is built by natural language processing methods.

9. The method according to claim 8, wherein the natural language processing method is any one selected from word embedding of the individual characteristic, topic modelling of the individual characteristic, and term frequency-inverse document frequency of the individual characteristic.

10. The method according to claim 9, wherein the individual characteristic is an experience level of the individual and the natural language processing method employed includes word embeddings and Latent Dirichlet allocation.

11. The method according to any one of claims 6 to 10, comprises applying rule-based filtering based on the match made.

12. The method according to any one of claims 6 to 10, wherein the score is a relevance score and indicates the relevance of the individual profile to the job description.

13. The method according to any one of claims 1 to 4, wherein the job demographic profile data set comprises an employment history of every person in the database.

14. The method according to claim 13, wherein a deep neural network algorithm is implemented to train the data model.

15. The method according to any one of claims 13 to 14, wherein the data model is optimised through a weighted cost function.

16. The method according to any one of claims 13 to 15, wherein the data model is built by blending two or more statistical models.

17. The method according to any one of claims 13 to 16, wherein the score is a loyalty score and indicates the probability that the individual leaves the job position at a point in time or over a time period.

18. The method according to any one of claims 1 to 4, wherein the job demographic profile data set comprises a time period a person spends in a first job position before moving to a second job position, wherein the second job position is a promotion from the first job position.

19. The method according to claim 18, wherein the job demographic profile data set comprises key performance indicators of an organisation.

20. The method according to any one of claims 18 to 19, wherein the score is a productivity score and is computed based on a selected reference.

21. The method according to any one of claims 1 to 4, wherein the job demographic profile data set comprises a salary level of a job in the market, past salary offers to a plurality of candidates and an acceptance rate of past salary offers.

22. The method according to claim 21, wherein the score is an acceptability score and indicates the probability of the individual accepting a job offer at a given salary level.

23. The method according to any one of claims 13 to 22, comprises building a causal data model to identify a causal relationship between at least one of the plurality of individual characteristics and the computed score; and computing a second score reflecting the elasticity of the score.

24. The method according to claim 23, comprises generating an alert when a change in the at least one of the plurality of individual characteristics causes the score to fall below or rise above a threshold level.

25. The method according to any one of claims 13 to 22, comprises merging a plurality of data models to generate a merged data set to build a second causal data model.

26. The method according to claim 25, wherein building the second causal data model comprises calculating a propensity score for an individual in the merged data set; matching the propensity scores for all individuals in the merged data set to generate a PSM data set comprising pairs of individuals, each pair comprising a first individual who has received treatment, and a second individual who is a control; and performing statistical analysis to determine the impact of treatment.

27. The method according to claim 26, wherein the statistical analysis is a paired t-test method.

28. The method according to any one of claims 25 to 27, wherein a first data set corresponding to a loyalty score is merged with a second data set corresponding to a productivity score to build the second causal data model.

29. The method according to any one of claims 1 to 4, wherein the job demographic profile data set comprises the plurality of individual characteristics of people who have previously qualified for a similar or identical job position in a similar or identical industry.

30. The method according to claim 29, wherein computing the score comprises applying a cosine similarity function to determine the similarity between the individual and the job demographic profile data set.

31. The method according to claim 30, wherein a weighted cosine similarity function is applied to give a higher weightage to a job profile which appears more frequently in the job demographic profile data set.

32. The method according to claim 31, wherein the weighted cosine similarity function is:

33. The method according to any one of claims 29 to 32, wherein the score is a similarity score and indicates the similarity of the individual to a stereotypical job profile in the job demographic profile data set.

34. The method according to any one of claims 1 to 33, wherein the job demographic profile data set is identified from a database comprising an organisation data set and a market data set.

35. The method according to claim 34, comprises subsetting the organisation data set and the market data set to form the job demographic profile data set to build the data model.

36. The method according to any one of claims 1 to 35, comprises computing a hireability score by combining the relevance score and one or more additional scores.

37. The method according to claim 36, wherein combining the plurality of the computed score and/or second score is by applying any one selected from a linear function, a step-like function, and a logistic function.

38. The method according to claim 37, comprises allocating each requirement of the job description into one of the following groups: an essential component, a variable component and an optional component.

39. The method according to claim 38, wherein computing the score comprises at least one of the following:

(a) applying a non-linear transformation to the variable component to obtain a variable component range;

(b) applying a step function to the essential component;

(c) allocating a weightage to each component, wherein the optional component has a smaller weightage.

40. The method according to any one of claims 1 to 39, comprises receiving an individual profile from a user terminal.

41. The method according to any one of claims 1 to 40, comprises parsing the individual profile into the individual profile data set.

42. The method according to any one of claims 1 to 41, comprises scraping the Internet to build an individual online data set to form part of the individual profile data set.

43. The method according to any one of claims 1 to 42, comprises checking for errors in the individual profile data set; and penalising the computed score for errors found.

44. The method according to claim 43, wherein checking for errors is by a symmetric spelling correction method.

45. The method according to any one of claims 1 to 44, comprises adjusting the computing step of the score.

46. The method according to any one of claims 1 to 45, comprises ranking the individual based on the computed score relative to a plurality of other candidates.

47. The method according to any one of claims 1 to 46, comprises removing an individual from consideration for the job position if the score is below a qualifying score.

48. A non-transitory computer readable medium comprising instructions that, when executed, cause at least one computing device to perform the method according to any one of claims 1 to 47.

49. A system for determining suitability of an individual for a job position, the system comprises at least one processor; a non-transitory computer readable medium comprising instructions that when executed cause the at least one processor to perform the method according to any one of claims 1 to 47.

50. The system according to claim 49, comprises at least one of the following: the database, wherein the database comprises the market data set and/or the organisation data set; a user terminal for submitting the individual profile; and a parser tool to convert the individual profile into the individual profile data set.

Description:
Talent Management Platform

Technical Field

The invention relates to talent management. In particular, it relates to computer based techniques for calculating scores for an employee or potential employee to determine their suitability for an available job position, and a platform to support such techniques.

Background

Human resource is a key component in any organisation or company. No matter the degree of automation used, manpower is still required by organisations. The key challenges for any organisation includes hiring a talented and capable individual and retaining the individual. However, problems arise both in the recruitment phase and employment phase.

Summary

In a first aspect, there is a method of determining suitability of an individual for a job position, the method comprises: identifying a plurality of individual characteristics from an individual profile data set; retrieving a data model built based on a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with the plurality of individual characteristics of the job position; inputting the identified plurality of individual characteristics into the data model; and computing a score for the individual based on the input into the data model, wherein the score is used to determine the suitability of the individual for the job position.

In a second aspect, there is a method of determining suitability of an individual for a job position, the method comprises: identifying a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with a plurality of individual characteristics of the job position; and building a data model based on the job demographic profile data set, wherein input of the identified plurality of individual characteristics into the data model allows for computation of a score for the individual, wherein the score is used to determine the suitability of the individual for the job position.

In a third aspect, there is a method of determining suitability of an individual for a job position, the method comprises: identifying a plurality of individual characteristics from an individual profile data set; identifying a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with the plurality of individual characteristics of the job position; building a data model based on the job demographic profile data set; inputting the identified plurality of individual characteristics into the data model; and computing a score for the individual based on the input into the data model, wherein the score is used to determine the suitability of the individual for the job position.

The methods described above and herein are computer implemented methods, and the steps described herein may be performed by one or more processors, or computing devices. Preferably, the method comprises performing feature engineering to generate a modified individual characteristic from at least one of the plurality of individual characteristics.

Preferably, the method comprises identifying a plurality of requirements from a job description for the job position; wherein building the data model comprises building a plurality of data models, each data model corresponding to a requirement of the plurality of requirements and quantifying the relevance of the individual characteristic to the requirement.

Preferably, the method comprises matching each identified individual characteristic to the corresponding data model; inputting each identified individual characteristic to the corresponding data model to generate a plurality of individual data model scores; and computing a final score by summing the plurality of individual data model scores.

Preferably, computing the final score comprises applying a non-linear transformation to each individual data model score to obtain a plurality of transformed individual data model scores; summing the plurality of transformed individual data model scores to obtain an overall score; and applying a second non-linear transformation to the overall score to obtain the final score.
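
As a minimal sketch of this two-stage combination (assuming, for illustration, a logistic function for both non-linear transformations; the function names and the centring constant are not from the original disclosure):

import math

def sigmoid(x: float) -> float:
    # Logistic squashing function, used here as the non-linear transformation.
    return 1.0 / (1.0 + math.exp(-x))

def combine_model_scores(model_scores: list[float]) -> float:
    # Apply the first non-linear transformation to each individual data model
    # score, sum the transformed scores into an overall score, then apply a
    # second transformation to map the overall score into the (0, 1) range.
    transformed = [sigmoid(s) for s in model_scores]
    overall = sum(transformed)
    return sigmoid(overall - len(transformed) / 2)  # centring constant is illustrative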

Preferably, the data model is built by natural language processing methods. The natural language processing method may be any one selected from word embedding of the individual characteristic, topic modelling of the individual characteristic, and term frequency-inverse document frequency of the individual characteristic.

In an embodiment, the individual characteristic is an experience level of the individual and the natural language processing method employed includes word embeddings and Latent Dirichlet allocation.

Preferably, the method comprises applying rule-based filtering based on the match made. More preferably, the rule-based filtering includes a location filter.

Preferably, the score is a relevance score and indicates the relevance of the individual profile to the job description.

In an embodiment, the job demographic profile data set comprises an employment history of every person in the database. Further, a deep neural network algorithm is implemented to train the data model.

Preferably, the data model is optimised through a weighted cost function.
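
A weighted cost function of the kind referred to here could, for example, be a class-weighted binary cross-entropy, which counteracts the skew of employment data (few leavers, many stayers). The following sketch and its weight values are illustrative assumptions, not values from the disclosure:

import numpy as np

def weighted_bce(y_true: np.ndarray, y_pred: np.ndarray,
                 w_pos: float = 5.0, w_neg: float = 1.0) -> float:
    # Errors on the rare positive class (e.g. leavers) are up-weighted by
    # w_pos relative to the common negative class (stayers), weighted by w_neg.
    eps = 1e-12
    y_pred = np.clip(y_pred, eps, 1 - eps)
    loss = -(w_pos * y_true * np.log(y_pred)
             + w_neg * (1 - y_true) * np.log(1 - y_pred))
    return float(loss.mean())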

Preferably, the data model is built by blending two or more statistical models.

Preferably, the score is a loyalty score and indicates the probability that the individual leaves the job position at a point in time or over a time period. In other words, the loyalty score indicates the likelihood of the individual staying in the job position beyond a certain time period after being hired for the job position. In an embodiment, the job demographic profile data set comprises a time period a person spends in a first job position before moving to a second job position, wherein the second job position is a promotion from the first job position. The promotion of a person may be used as a proxy for the productivity of the person in the organisation. The second job position is preferably in the same organisation as the first job position, but may be in another organisation. This has an advantage in that it does not require the organisation to provide sensitive commercial information.

In an embodiment, the job demographic profile data set comprises key performance indicators of an organisation.

Preferably, the score is a productivity score and is computed based on a selected reference.

In an embodiment, the job demographic profile data set comprises a salary level of a job in the market, past salary offers to a plurality of candidates and an acceptance rate of past salary offers.

Preferably, the score is an acceptability score and indicates the probability of the individual accepting a job offer at a given salary level.

Preferably, the method comprises building a causal data model to identify a causal relationship between at least one of the plurality of individual characteristics and the computed score; and computing a second score indicating the elasticity of the score.

Preferably, the method comprises generating an alert when a change in the at least one of the plurality of individual characteristics causes the score to fall below or rise above a threshold level.

Preferably, the method comprises merging a plurality of data models to generate a merged data set to build a second causal data model.

More preferably, building the second causal data model comprises calculating a propensity score for an individual in the merged data set; matching the propensity scores for all individuals in the merged data set to generate a PSM data set comprising pairs of individuals, each pair comprising a first individual who has received treatment, and a second individual who is a control; and performing statistical analysis to determine the impact of treatment. The statistical analysis may be a paired t-test method.
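
The propensity score matching (PSM) step described above can be sketched as follows; the logistic-regression propensity model, the nearest-neighbour matching rule, and the variable names are illustrative assumptions:

import numpy as np
from scipy.stats import ttest_rel
from sklearn.linear_model import LogisticRegression

def psm_effect(X: np.ndarray, treated: np.ndarray, outcome: np.ndarray):
    # X: individual characteristics from the merged data set;
    # treated: 0/1 treatment indicator (e.g. received a salary raise);
    # outcome: the score of interest (e.g. a productivity score).
    propensity = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]

    # Pair each treated individual with the control whose propensity score is closest.
    t_idx = np.where(treated == 1)[0]
    c_idx = np.where(treated == 0)[0]
    matched = [c_idx[np.argmin(np.abs(propensity[c_idx] - propensity[i]))] for i in t_idx]

    # A paired t-test over the matched pairs determines the impact of treatment.
    return ttest_rel(outcome[t_idx], outcome[matched])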

Preferably, a first data set corresponding to a loyalty score is merged with a second data set corresponding to a productivity score to build the second causal data model.

In an embodiment, the job demographic profile data set comprises the plurality of individual characteristics of people who have previously qualified for a similar or identical job position in a similar or identical industry. Preferably, computing the score comprises applying a cosine similarity function to determine the similarity between the individual and the job demographic profile data set. More preferably, a weighted cosine similarity function is applied to give a higher weightage to a job profile which appears more frequently in the job demographic profile data set. The score is a similarity score and indicates the similarity of the individual to a stereotypical job profile in the job demographic profile data set.

The weighted cosine similarity function may be as follows:
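
The formula itself is not reproduced in this text; one standard form consistent with the description above (an assumption, weighting each job profile by its frequency in the data set) is:

$$\text{score}(\mathbf{a}) = \frac{\sum_{j} f_j \cos(\mathbf{a}, \mathbf{b}_j)}{\sum_{j} f_j}, \qquad \cos(\mathbf{a}, \mathbf{b}) = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}$$

where $\mathbf{a}$ is the individual's characteristic vector, $\mathbf{b}_j$ is the vector of the $j$-th job profile in the job demographic profile data set, and $f_j$ is the frequency with which that job profile appears in the data set.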

Preferably, the job demographic profile data set is identified from a database comprising an organisation data set and a market data set.

Preferably, the method comprises subsetting the organisation data set and the market data set to form the job demographic profile data set to build the data model.

Preferably, the method comprises computing a hireability score by combining the relevance score and one or more additional scores (such as the loyalty score, productivity score, or acceptability score). More preferably, combining the plurality of computed scores and/or second scores is by applying any one selected from a linear function, a step-like function, and a logistic function.
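
A minimal sketch of a logistic combination (the weights and offset below are illustrative assumptions, not values from the disclosure):

import math

def hireability(relevance: float, loyalty: float, productivity: float,
                acceptability: float) -> float:
    # Weighted sum of the component scores, squashed into (0, 1) by a
    # logistic function; a linear or step-like function could be used instead.
    z = 2.0 * relevance + 1.0 * loyalty + 1.0 * productivity + 0.5 * acceptability
    return 1.0 / (1.0 + math.exp(-(z - 2.0)))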

Preferably, the method comprises allocating each requirement of the job description into one of the following groups: an essential component, a variable component and an optional component.

Preferably, computing the score comprises at least one of the following:

(a) applying a non-linear transformation to the variable component to obtain a variable component range;

(b) applying a step function to the essential component;

(c) allocating a weightage to each component, wherein the optional component has a smaller weightage.

Preferably, the method comprises receiving an individual profile from a user terminal.

Preferably, the method comprises parsing the individual profile into the individual profile data set.

Preferably, the method comprises scraping the Internet to build an individual online data set to form part of the individual profile data set.

Preferably, the method comprises checking for errors in the individual profile data set; and penalising the computed score for errors found. For example, checking for errors is by a symmetric spelling correction method. Preferably, the method comprises adjusting the computing step of the score.

Preferably, the method comprises ranking the individual based on the computed score relative to a plurality of other candidates.

Preferably, the method comprises removing an individual from consideration if the score is below a qualifying score.

In a fourth aspect, there is provided a non-transitory computer readable medium comprising instructions that when executed cause at least one computing device to perform the method according to the above methods.

In a fifth aspect, there is provided a system for determining suitability of an individual for a job position, the system comprises at least one processor; a non-transitory computer readable medium comprising instructions that when executed cause the at least one processor to perform the method according to the above methods.

Preferably, the system comprises at least one of the following: the database, wherein the database comprises the market data set and/or the organisation data set; a user terminal for submitting the individual profile; and a parser tool to convert the individual profile into the individual profile data set.

Features that are described in the context of an embodiment may correspondingly be applicable to the same or similar features in the other embodiments. Features that are described in the context of an embodiment may correspondingly be applicable to the other embodiments, even if not explicitly described in these other embodiments. Furthermore, additions and/or combinations and/or alternatives as described for a feature in the context of an embodiment may correspondingly be applicable to the same or similar feature in the other embodiments.

The talent management platform provides organisations with an ability to manage their human resource requirements from the initial recruiting process to the management of employees in the organisation. The platform allows organisations to simplify and improve their recruitment process by utilising available information to find the most suitable candidates for an available job position, and may be used to predict the loyalty and productivity of an employee to find the most suitable fit for the organisation. The platform can also be used to determine the likelihood that a candidate will accept a job offer and the suitability of the candidates for a job or task, i.e. whether the organisation should hire a particular candidate. Various aspects of the platform show how salary, promotion, learning initiatives and other characteristics of the organisation vary an employee's productivity and loyalty, or a potential employee's acceptance of a job offer and forecasted productivity and loyalty, and thus the hireability of the job candidate. The talent management platform may be further configured to generate recommendations or a retention strategy for the organisation to identify best performing employees with a high likelihood to leave and retain them. The talent management platform allows the organisation to use its resources more effectively to hire and retain its employees, providing savings in time and costs.

Brief Description of Figures

Figure 1 shows an overview of an embodiment of the invention;

Figure 2 shows a schematic diagram of a database;

Figures 3A and 3B show embodiments of the invention;

Figure 4 shows an embodiment of the invention;

Figure 5 shows an embodiment of the invention;

Figure 6 shows an embodiment of the invention;

Figure 7 shows an embodiment of the invention;

Figure 8 shows a system of an embodiment of the invention;

Figure 9 shows an embodiment of the invention;

Figure 10 shows an embodiment of the invention.

Detailed Description

In recruiting individuals for a job position, organisations face many problems. In the initial recruitment phase, there is a need for the organisation (or a recruiter) to shortlist a list of candidates from a large number of submissions. However, merely checking whether a candidate fulfils the minimum requirements of the job position may often result in a "poor fit" between the job position and the individual, as there are many traits of an individual which are not immediately apparent in the individual's resume (or curriculum vitae). Further, after hiring an individual, the organisation needs to spend time and resources to train the individual, who may leave after a short training period to the detriment of the organisation. Recruiters typically rely on their experience to shortlist candidates, but this is often subjective and inaccurate. Thus, there is a need for a systematic method to technically analyse and determine the suitability of an individual (or candidate) for a job position. The methods and system described herein apply artificial intelligence and other analytical techniques to available data in a specific manner, and use the results to determine the suitability of the individual for a job position. For example, market data is typically skewed, and without applying specific techniques the results obtained are inaccurate.

Figure 1 shows an overview of the workflow to use the talent management platform. Data is obtained from a user of the talent management platform in block 105, i.e. the user is an organisation looking to hire or retain an employee. The data may also be provided by an individual looking to apply for a job in the organisation (or by a recruiter), and could be in the form of a physical curriculum vitae (or a scanned copy), or the individual could enter the data into a webform. The data obtained is then prepared, sorted, and filtered in block 110. The data is analysed and predictions or forecasts are made based on the data in block 115. The predictions or forecasts are combined into solutions for the user for further interpretation in block 120.

Figure 8 shows an example of a system 800 which may be used as the platform. The system 800 comprises a client terminal 805 and a server terminal 815, the client terminal 805 and server terminal 815 being connected via a network 810, for example a wide area network like the Internet or a local area network like an intranet, as shown schematically in Figure 8. Multiple client terminals 805 and server terminals 815 may be used as required. Examples of devices usable as the client terminal 805 include personal computers, mobile devices and the like. The server 815 generally comprises a processor 820, a memory 825, a user input device 830, an output device 835 to render the results and solutions in a readable format (e.g. text file, tabular file, graphs), and a network interface 840. The server 815 may further contain control logic 845 and a database 200. Different embodiments described herein may be performed by different server terminals 815 and/or client terminals 805, hence each server terminal 815 need not contain the same components. For example, one server terminal 815 may be a database server while another is an application server.

Figure 9 shows an overview of the workflow 900 for a user to use the system 800. In block 905, the user first accesses the system 800, which recognises the user in block 910, for example with a user identification and password system. The required data is prepared for analysis by data checking and assembling in block 915, and data appending, updating and subsetting in block 920. The data need not come from a single database, nor be in a single data structure; thus there may be a need to connect the data and update it in the form of a time series. In block 925, analysis and modelling of the data take place as described herein.

In block 930, the system 800 determines if there are additional functionalities, for example other predictive solutions or value, which may be obtained from the user inputs and data sets. If no additional functionalities are required, the system 800 assembles and visualises output into the output device for the user to view (block 935) and subsequently exits (block 940). If present or required, the additional functionalities and the specified data are prepared (block 945) with further data analysis and modelling (block 950). The system 800 then loops back to block 930, and if there are no other functionalities, creates the output (block 935) and exits (block 940).

Figure 10 shows a method 1000 to be performed by the system 800 to aid the recruitment of a job candidate. When there is an open job position in the organisation, interested candidates will submit their curriculum vitae. The submission could come from an existing employee in the organisation, an internal referral of an external candidate, an external candidate, an external human resource recruiter, or an existing database of curricula vitae for identical or similar positions. The candidate's curriculum vitae is submitted to the system (block 1005) either by accessing an organisation's job portal or human resource management system, or as an Excel document, an entry into an online form, a Portable Document Format file, a text file, a Hyper-Text Markup Language file, an image file, or another readable format. The system 800 analyses the candidate's curriculum vitae (block 1010) and determines if it meets a first hiring threshold (block 1020).

In an embodiment, analysing the candidate's curriculum vitae (block 1010) includes identifying whether the curriculum vitae contains a plurality of qualification parameters. The qualification parameters typically include at least one of the following: educational and/or professional qualifications, a specific skill set, work experience, expected income, expected employment benefits, and ability to travel or relocate. The qualification parameters are typically provided or set by the hiring organisation with a minimum set of requirements for a job; alternatively, the system 800 may suggest the qualification parameters based on the market data 210 and/or organisation data 205. In an embodiment, the qualification parameters are divided into intelligence quotient (IQ) factors, skill factors, and emotional quotient (EQ) factors. A score is determined for each of the intelligence quotient (IQ) factors, skill factors, and emotional quotient (EQ) factors (block 1015). The intelligence quotient (IQ) factor and emotional quotient (EQ) factor scores are obtained by having the candidate take a customised online test, while the skill factor score is obtained by matching the candidate's curriculum vitae against the skill features in the job description. The different factors can also be weighted to value certain factors more. This can be determined by the organisation with the open job position, or could be identified from the organisation and/or market data. In an example, the final score comprises a 10% to 60% weightage of each of the three scores, such that the total is 100%. For example, for a specific job the organisation may determine they wish to employ a weightage of 40% IQ factor, 25% skill factor, and 35% EQ factor, and a total score is obtained using these weightages. The weightage of each factor is usually provided by the hiring organisation, or a default weightage could be used, or it could be based on market data. It is subsequently determined if the candidate meets the first hiring threshold (block 1020); this could be with respect to the total score, or to each individual score.
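
A minimal sketch of the weighted total using the 40/25/35 example above (assuming each factor score is normalised to 0-100):

def total_score(iq: float, skill: float, eq: float,
                w_iq: float = 0.40, w_skill: float = 0.25, w_eq: float = 0.35) -> float:
    # Weighted sum of the three factor scores; the weightages must total 100%.
    assert abs(w_iq + w_skill + w_eq - 1.0) < 1e-9, "weights must sum to 100%"
    return w_iq * iq + w_skill * skill + w_eq * eq

# Example: total_score(80, 60, 70) = 0.40*80 + 0.25*60 + 0.35*70 = 71.5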

If the first hiring threshold is not met, the candidate is notified of non-selection for the next round (block 1025), and may additionally be added into a database for future opportunities. If the first hiring threshold is met, a second pass analysis is conducted (block 1030) by computing the matching, loyalty, productivity and hireability scores as described by the methods described herein. This allows the hiring organisations to take into consideration more factors than what are present in the candidate's curriculum vitae and decrease the guesswork involved in existing recruitment processes.

It is further determined if the scores meet a second threshold (block 1035); if not, the candidate is informed of non-selection; if so, the candidate is invited for further interviews. This may include a video interview (block 1040) and a psychometric test and/or gamification etc. (block 1045), after which it is determined if the candidate meets a third hiring threshold (block 1050). Gamification refers to conducting programmes, courses or selection in a game-like way, for example a hackathon where developers code to get a job, based on a real-life scenario or problem. The candidate is subsequently invited for a final round of interview (block 1055) or informed of non-selection (block 1025). The number of interviews may be reduced or increased as required. For example, there could be additional interview rounds, or an organisation may wish to conduct the final interview directly after the second threshold test. This depends on the requirements of the hiring organisation and may be readily varied without deviating from the methods described.

The two stage analysis may be useful if there are a large number of candidates for a job; if there are a smaller number of candidates, the first threshold analysis may be combined with the second threshold analysis.

The method 1000 provides an advantage over existing methods which rely only on the personal data provided in the curriculum vitae of the individual, which may not provide the best fit for both the individual and organisation, and may also cause the organisation to miss out on some applicants who possess other qualification factors which are not immediately identifiable through the curriculum vitae. The system 800 and method 1000 allow the organisation to consider more qualification factors in a candidate to increase the diversity and different abilities present in the organisation, which is not readily done by conventional methods.

In use, instructions for the methods described herein can be provided to be used directly by the human resource department of an organisation, or provided to the organisation as a third party service, for example as a software as a service model via a client terminal 805. Application programming interfaces (APIs) may be used to allow the system 800 to interact with the organisation's existing human resource management system or human capital management system, or a separate database 200 containing the necessary organisation data.

Data analytics model workflow

A method to determine the suitability of an individual (or candidate) for an available job position is shown in Figure 3A. The method (300) comprises:

(a) identifying a plurality of individual characteristics from an individual profile data set (block 305);

(b) identifying a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with the plurality of individual characteristics of the job position (block 310);

(c) building a data model based on the job demographic profile data set (block 315);

(d) inputting the identified plurality of individual characteristics into the data model (block 320); and

(e) computing a score for the individual based on the input into the data model, wherein the score is used to determine the suitability of the individual for the job position (block 325).

The data model used in the methods herein can be prepared and provided to an organisation to be used with the scoring method, and need not be built by the same organisation. The market and/or organisation data set will most likely contain more characteristics (or features) than the individual characteristics available, or only certain individual characteristics will be relevant for a particular method. Thus, only those individual characteristics relevant to the method will be identified in the individual profile data set. Further, only the historical data (i.e. the market and/or organisation data) associated with the identified individual characteristics will be identified in the job demographic profile data set. By "associated with", it is meant that the relevant historical data may be directly or indirectly related to the individual characteristic.

In another embodiment shown in Figure 3B, the method (350) comprises:

(a) identifying a plurality of individual characteristics from an individual profile data set (block 355);

(b) retrieving a data model built based on a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with the plurality of individual characteristics of the job position (block 360);

(c) inputting the identified plurality of individual characteristics into the data model (block 365); and

(d) computing a score for the individual based on the input into the data model, wherein the score is used to determine the suitability of the individual for the job position (block 370).

In another embodiment, the method comprises:

(a) identifying a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with a plurality of individual characteristics of the job position; and

(b) building a data model based on the job demographic profile data set, wherein input of the identified plurality of individual characteristics into the data model allows for computation of a score for the individual, wherein the score is used to determine the suitability of the individual for the job position.

The method may further comprise any one of the following:

a) subsetting the organisation data set and the market data set to form the job demographic profile data set to build the data model;

b) receiving an individual profile from a user terminal;

c) parsing the individual profile into the individual profile data set;

d) scraping the Internet to build an individual online data set to form part of the individual profile data set;

e) ranking the individual based on the computed score relative to a plurality of other candidates; and

f) removing an individual from consideration for the job position if the score is below a qualifying score.

1. Data Description

A database 200 contains data to be used by a system 800 or talent management platform. There are generally three categories of data in the database 200, as shown in Figure 2: organisation data 205, market (or third party) data 210 and individual profile data 215. The database 200 referred to herein is not limited to a single physical device, and can refer to several physical devices located in different locations. In particular, the organisation data 205 may be stored by the organisation itself and separate from the remaining data.

The data may be in structured (e.g. data entered into fields of a form) or unstructured (e.g. having no pre-defined data model or not organised in a pre-defined manner) form, and is converted into a machine readable format. For example, the system or platform may allow a user (be it the organisation recruiting or the individual applying for the job) to enter the required information in specified fields. Alternatively, the system converts the unstructured data into a machine readable format by parsing. Natural language processing systems may be used to parse the data into a suitable data set, as explained further below.

Referring to Figure 2, organisation data 205 refers to data on the organisation, including information on the company such as its size, industry, revenue, profits, descriptions of jobs in the organisation, and human resource data of the organisation's employees. The human resource data can be sourced from an organisation's Human Resource Management System or Human Capital Management System and includes information such as the employee's age, ethnicity, education, job skills and experience, performance review, employment tenure for various job positions in the organisation (including the mean and median values), and the salary of various job positions in the organisation. The organisation data set may be provided by an organisation which uses the method, system or platform described herein. However, not every organisation may be willing to provide such data or have such data readily on hand; the organisation data set is not essential to the database 200 and methods, but allows for better results. When an organisation is seeking to hire a person for a job position, a job description including the requirements the person should possess will be provided by the organisation. This job description may then be used to create a job profile data set and is part of the organisation data 205. The job description may be provided as a hard copy which is converted into a machine readable format, or submitted through a webform.

Market data 210 includes the salary and length of stay for a particular job position (from which the average of each can be derived), and covers both jobs which are present in multiple industries and industry-specific jobs. The market data 210 may be obtained from surveys of employees and/or employers, third party providers or from public data, for example government employment statistics. The market data 210 may be obtained for particular jobs, industries, countries, and regions, and the data set used can be modified accordingly as desired and required. The market data 210 varies largely depending on the job and industry. For example, some jobs may require skills that are highly transferable between industries and/or jobs which are highly mobile and need not be restricted by geographic region. On the other hand, some jobs may be highly specific for various reasons, including country specific requirements or a specific skill set.

Individual data 215 refers to data on the candidates applying for a job or employees of an organisation. The individual data 215 includes personal individual data 220, i.e. information typically obtained from the individual's curriculum vitae or employment records, such as age, ethnicity, education, skills and work experience. These are some examples of individual characteristics that may be identified and used in the analysis methods described herein. For an employee, this individual data 220 is also part of the organisation data as stated above. The individual data 220 could also have been stored in an organisation's existing database of potential candidates for previous job openings, or come from a general submission by the candidate. The individual data 215 can be provided in the form of a hard copy or file (e.g. a Portable Document Format file, a text file, a Hyper-Text Mark-up Language file, or another readable format) of the curriculum vitae, in which case it will first be processed and converted into a machine readable form. Alternatively, the individual could be asked to provide such information directly into a user interface (or webform) which captures the individual data 215 in a structured format.

In addition, the talent management platform sources information from the Internet to establish the individual's Internet footprint 225, i.e. scraping public data to build the individual's online data set. Examples of web scraping techniques that can be used include application programming interface links, text pattern matching, HTTP programming, HTML parsing, DOM parsing, vertical aggregation, semantic annotation recognition, and machine learning. The individual's email and/or name can be used to find information relevant to the individual, and used to compile an individual online profile.
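
As a minimal illustration of the HTML-parsing approach named above (the function and the extracted fields are assumptions; a production scraper would combine several of the listed techniques and respect each site's terms of use):

import requests
from bs4 import BeautifulSoup

def scrape_public_page(url: str) -> dict:
    # Fetch one public page and extract its title and visible text for the
    # individual's online data set.
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    return {
        "url": url,
        "title": soup.title.get_text(strip=True) if soup.title else "",
        "text": soup.get_text(separator=" ", strip=True),
    }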

The individual characteristics include any "push" or "pull" factors that may induce an employee to leave or stay with the organisation. For example, some individual characteristics include the age, education qualifications, work experience, gender, ethnicity, marital status, salary of the employee (the salary here is taken to be the financial remuneration to the employee including any fixed or variable wage component, like bonuses, commissions, and options), work benefits (e.g. leave, opportunity for professional development, flexible work arrangements), commuting distance and time from home, work satisfaction level, relationships with colleagues, relationships with clients, work environment, and diversity and other characteristics of the workforce in the firm. The individual data includes the data on the individual's current job and previous job(s), and may also include the individual's current and previous company and industry. Every individual is different, and may value different factors or benefits differently.

The individual data may be provided by the individual applying for the position, or by the organisation to analyse the suitability of a candidate (be it an external or internal candidate), through a user terminal either locally on a device or through a webform. The individual data is received by a receiving module, passed to an extraction module to extract the relevant data, and then to a sorting and filtering module to sort and filter the data. For example, information regarding an individual can be found from the individual's LinkedIn account, including the number of social media contacts, the people or groups the individual follows, the groups the individual joins, and the recommendations the individual has.

2. Data Preparation

If the market data 210 is in the form of free text and hence unstructured, the data is converted into a semi-structured format using a parser tool, which is further converted into a structured format in the form of tables. Further data cleaning is performed to remove null values, unnecessary punctuation and outliers. Feature engineering is also performed to derive features that are useful when computing the similarity scores. This may include deriving completely new features or creating dummy variables for existing features.

Feature engineering is the process of using both domain knowledge and the statistical behaviour of the data to create features that enhance the machine learning predictive capability. There are various ways of doing feature engineering; combining several raw features to generate a new meaningful feature is one approach. An example of feature engineering would be using a feature like the experience (in number of years) of a person to derive the seniority of the person. Usually in a job requirement, the experience is represented as a range (like at least 3 years, 3-5 years, or mid-senior level). By creating this engineered feature (a modified individual characteristic) for seniority, the manner in which experience is advertised in the recruitment domain is replicated for use in the methods herein. The appropriate ranges may be determined using inputs from domain experts. Another advantage is that grouping experiences helps in reducing the skewness in the data. For example, if there is a relatively high number of candidates with less than 5 years of experience, feature engineering as described will reduce the skewness of the data set. It will be appreciated that other job requirements may be similarly engineered to generate other modified individual characteristics.
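
A minimal sketch of this engineered seniority feature (the band boundaries below are illustrative assumptions; as noted above, appropriate ranges should come from domain experts):

def seniority_band(years_experience: float) -> str:
    # Map raw years of experience to a seniority band (a modified individual
    # characteristic), mirroring how experience is advertised in job requirements.
    if years_experience < 3:
        return "junior"
    if years_experience < 5:
        return "mid"
    if years_experience < 10:
        return "senior"
    return "executive"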

Here only one feature is used to perform feature engineering, but a combination of existing features may also be used. If feature engineering is done appropriately, it increases the predictive power through an iterative process, and provides more meaningful interpretation of the machine learning algorithms by creating features from raw data that are more interpretable and that help to facilitate the machine learning process.

The market data 210 may be analysed by a data analysis engine to determine trends and/or patterns, to allow the system 800 to identify a job demographic profile data set. For example, the job demographic profile may include information such as the average salary for a particular job; additional factors can be added, including age, education, gender, location (e.g. city, state, country, region, and continent), industry etc., to define the job demographic profile more accurately or broadly as required. For example, a job demographic profile may establish that individuals in a certain age group have a high tendency to change jobs or companies every two years. In addition to the job demography, job description features such as education, past experiences, skills and certifications are also considered to identify the right candidate in the hiring process.

Feature engineering may also be performed on the market data set. For example, based on the job industry mentioned in the job description, we compare how close this job industry is to the job industry of the candidate. For example, if the job is in the internet industry and the candidate is from the information technology industry, the job might still be relevant to the candidate.

The above is similarly applicable to the organisation data 205 supplied by the company to supplement the market data. An organisation specific job demographic profile data set can be created based on the market and organisation data, and is a subset of the job demographic profile. For example, the average tenure of an employee in a specific job may be higher than in other similar organisations and could reflect better human resource management policies in the organisation or other factors that may not be immediately apparent from numerical data. In another example, the average tenure of an employee in a specific job could be shorter than the industry average but could be attributable to a job rotation or promotion policy within the organisation. Such information may not be readily available outside the organisation, but could provide a more accurate reflection of the organisation's policies and strategies.

The organisation may also provide a job description containing the requirements for the available job position, some non-limiting examples of these requirements include the education level, number of years of experience, salary range, technical skills required, travel frequency, specific experience in managing a team or project. The job description may be provided as structured or unstructured data, and may be converted to a machine readable format to form the job profile data set 207 as described for the individual profile data set. The requirements for the position may typically be divided into three types: essential, variable, and optional. Essential requirements are those which the organisation considers to be necessary for the position, variable requirements are those which the organisation may accept a wider range than indicated, and the optional requirements are those which are good to have for the candidates. Thus, the job profile data set comprises an essential component, a variable component, and an optional component, each corresponding to the requirements of the position.

The method may further check for errors in the individual profile data set 215, and penalise the score for the errors found in the check. This is because frequent misspellings in the curriculum vitae, or submission, reflect inadequate preparation or carelessness in the person, and would imply a higher chance of unfitness for the job. A fast spelling check on the submitted individual data is conducted and a measurement of the spelling errors is provided to penalise the overall score. The spelling check may be done by string comparison using a symmetric spelling correction method for fast string checking and editing, in which the editing count is used to measure the number of spelling errors.
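
A simplified stand-in for this penalty step (illustrative assumptions: a production system would use a symmetric-deletion spelling corrector such as SymSpell and count actual edit operations; the penalty rate and cap below are invented for the sketch):

def spelling_penalty(tokens: list[str], dictionary: set[str],
                     penalty_per_error: float = 0.01) -> float:
    # Count tokens absent from the dictionary as likely misspellings and
    # convert the error count into a capped score penalty.
    errors = sum(1 for t in tokens if t.lower() not in dictionary)
    return min(errors * penalty_per_error, 0.2)

# Usage: penalised_score = max(0.0, raw_score - spelling_penalty(cv_tokens, word_list))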

3. Model Building

3.1 Ensemble Matching Algorithm Overview

In this method, natural language processing (NLP) techniques are used to measure how relevant the candidate's resume is to the job description, using text analytical methods.

In this example, the method (400) comprises:

(a) identifying a plurality of individual characteristics from an individual profile data set (block 405);

(b) identifying a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with the plurality of individual characteristics of the job position (block 410);

(c) identifying a plurality of requirements from a job description for the job position (block 415);

(d) building a data model based on the job demographic profile data set, wherein building the data model comprises building a plurality of data models, each data model corresponding to a requirement of the plurality of requirements and quantifying the relevance of the individual characteristic to the requirement (block 420);

(e) matching each identified individual characteristic to the corresponding data model (block 425);

(f) inputting each identified individual characteristic to the corresponding data model to generate a plurality of individual data model scores (block 430); and

(g) computing a score for the individual based on the input into the data model, wherein the score is used to determine the suitability of the individual for the job position, and wherein the (final) score is computed by combining the plurality of individual data model scores (block 435).

To analyse the relevance of the resume to the job description, the resume is parsed and different approaches are adopted for each section. Each individual's profile is segmented into several sections with different weightage for each as described below. In general, it is difficult to use a single model to analyse the relevance of each section of the individual's profile to the job description.

The data on the individual is compiled into an individual profile data set 215, from which a plurality of individual characteristics are identified by a processor 820 in block 305. In this example, the processor 820 identifies a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with the plurality of individual characteristics of the job position in block 310. A plurality of requirements are further identified from a job description for the job position and can be compiled into a job profile data set 207. Subsequently, the processor 820 builds multiple data models, each data model corresponding to a specific requirement and quantifying the relevance of the individual characteristic to the requirement. The processor 820 further matches and inputs each individual characteristic to the corresponding data model to compute an individual score for each data model, and combines the individual scores from the data models to compute a final score.

Different models may be used to analyse each section, after which the scores from the sections are combined. As an example, for location we use both keywords and geographical distance to quantify proximity. For experience level, we use a nonlinear transformation of the experience so that we do not set a strict range as the requirement. For the experience description, we use Latent Dirichlet Allocation (LDA) topic modelling to assess the similarity between the individual and the job requirement. For skills or other key requirements, we use customised word embeddings to compare the numerical representations of words/phrases; the embedding model is trained on a large corpus of text from job descriptions and resume elaborations. These models are typically probabilistic, so each individual model score is a probability score, and the scores are combined to give a final score that determines the relevance of an individual's CV to the job description.

Matching Design

1. Location: the majority of jobs prefer candidates to be in close proximity.

Currently, rule-based filtering is implemented to select people in the same country. For small countries like Singapore, a geographical distance measurement is able to provide the actual proximity score.

2. Experience level: each job has certain requirements on experience level, and there is usually an upper and lower bound expected by recruiters. Instead of applying strict cutoffs, a non-linear transformation is implemented to give a variable tolerance bandwidth to each experience level. The experience level requirement comes from the job description, and the individual's total experience is calculated from the employment history. This model compares the personal total experience to the required experience level in the job description. Instead of a strict bound, e.g. 5-10 years meaning applicants must have between 5 and 10 years of experience (or perhaps between 4 and 11 years with a 1-year tolerance), the score is calculated based on how much the personal experience deviates from the required value, and the deviation is transformed nonlinearly, such that even if the individual falls short of the requirement, the individual may still be considered for the job. However, if the difference from the required range is too large, the model will penalise the individual heavily (a sketch of such a transformation follows this list).

3. Must-have (essential) requirements: for the skills/education/background which are crucial to the job description, the model deploys a step-like function to simulate the importance of such requirements.

4. Good-to-have (optional) requirements: for the desirable skills, a smaller weightage is allocated. Moreover, skill similarities are used instead of rule-based filtering.

5. Experience description: for long text chunks inside the work experience/activities/awards descriptions, the model extracts the topics from the text and calculates the topic similarity to the job description. This similarity measurement indicates how relevant the person's history is to the job scope.

6. Spelling check: frequent misspellings in the resume reflect careless preparation, and imply a higher chance of unfitness. A fast spelling check is conducted on the resume write-ups, and a measurement of the spelling errors is used to penalise the overall relevance score.

7. Fine-tuning based on users' behaviour: given the focus on different sections for relevance comparison, the users' decisions are collected to understand which sections are more critical, and the model is fine-tuned to produce a better overall matching score.

The above criteria have been identified to be important factors in designing and implementing the matching solution.
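To make the experience-level transformation concrete, the sketch below implements one plausible form of the variable-tolerance scoring described in item 2 above. The Gaussian shape and the rule tying the tolerance to the width of the required band are illustrative assumptions, not the platform's exact formula.

```python
import math

def experience_score(years, lower, upper):
    """Score in (0, 1]: 1.0 inside the required band, decaying smoothly
    outside it. The tolerance grows with the width of the band, so
    senior roles (wide bands) forgive larger deviations than junior
    roles (narrow bands)."""
    tolerance = max((upper - lower) / 2.0, 0.5)  # variable buffer
    if lower <= years <= upper:
        return 1.0
    deviation = min(abs(years - lower), abs(years - upper))
    # Gaussian decay: small shortfalls cost little, large ones are
    # penalised heavily, as described above.
    return math.exp(-((deviation / tolerance) ** 2))

print(experience_score(4, 5, 10))   # slightly short: ~0.85
print(experience_score(1, 5, 10))   # far short: ~0.08
```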

Implemented Models/Methods

1. Rule-based filtering: identifying the key words/phrases which are required based on the job description, and keeping those profiles which fulfil most of the requirements.

2. Geo-location measurement: using the Google Maps API, retrieve the longitude and latitude of each location to calculate the geographical distance between two places (see the sketch after this list).

3. Nonlinear smoothing: using a nonlinear function with mean, lower bound, and upper bound values provided, fit the experience tolerance with a variable buffer based on the spread of the bounds; for more junior levels, where the difference between the bounds and the mean value is small, the derived tolerance is low, while for more senior jobs the tolerance is larger.

4. Word embeddings: mapping words/phrases to a higher dimensional vector space to quantify their meaning, and calculating similarity by comparing the word vectors.

5. Topic modelling: Latent Dirichlet Allocation and/or probabilistic latent semantic analysis are utilised to retrieve the topic information from text paragraphs and summarise the topics.

6. String comparison: a symmetric spelling correction method is adopted for fast string checking and editing. The edit count is used to measure the misspelling errors.

7. Term frequency-inverse document frequency (tf-idf): a numerical statistic that reflects how important a word is to a document.

8. Supervised learning: to fine-tune the overall relevance score calculation, machine learning models are used with the scores from each section as the input and the users' decisions as the final output. The bias is reduced through a large user base, and the final relevance calculation becomes more appropriate.
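The sketch below illustrates the geo-location measurement (item 2 above) with the standard haversine formula. In practice the coordinates would be retrieved through the Google Maps API or a similar service; the exponential decay constant used to turn distance into a proximity score is an illustrative assumption.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def proximity_score(distance_km, scale_km=20.0):
    """Map distance onto (0, 1]: 1 at zero distance, decaying smoothly
    with distance. The scale constant is an assumed tuning parameter."""
    return math.exp(-distance_km / scale_km)

# Two points in central Singapore (coordinates as returned by a
# geocoding API such as Google Maps).
d = haversine_km(1.3521, 103.8198, 1.2897, 103.8501)
print(round(d, 1), "km ->", round(proximity_score(d), 2))
```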

Each individual characteristic may be numerical or text based. A numerical value may be matched to a fixed value or range in the job profile data set 207, for example 3 years of experience matched to a requirement of at least 2 years of experience. For a text-based individual characteristic, word embedding and/or topic modelling is performed instead. For example, a person's curriculum vitae may contain long lines of text in the work experience, activities, and awards sections. However, since it is unlikely that every person uses the same word(s) to describe the same item, direct comparison often leads to errors and inaccurate results. In addition, job titles and awards may differ between organisations, leading to errors as well. Word embedding maps a word or phrase in a sentence to a higher dimensional vector space to quantify its meaning. Subsequently, the similarities between the word vectors are calculated and added to the scoring function. In topic modelling, latent Dirichlet allocation and/or probabilistic latent semantic analysis methods are utilised to retrieve topic information from text paragraphs, summarise the topics, and calculate the similarities with the requirements of the job profile. This provides a more accurate determination of how relevant a person's history (work experience, education levels and skills) is to the job requirements, and removes or reduces variance due to differences between organisations. Each section of the job requirements may utilise one or more of the natural language processing models or methods, each targeted at calculating the relevance of that section based on the individual characteristics.
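The sketch below shows the topic-modelling route on toy data, using the gensim library as one possible LDA implementation. The documents, the number of topics, and the use of cosine similarity between topic distributions are illustrative assumptions.

```python
import numpy as np
from gensim import corpora
from gensim.models import LdaModel

# Toy documents: a resume snippet, a job description, and background
# text to give the model something to learn topics from.
resume = "led backend team built payment microservices in java".split()
job = "seeking engineer to build java microservices for payments".split()
background = [
    "designed marketing campaigns for retail brands".split(),
    "developed java services and distributed systems".split(),
]

texts = [resume, job] + background
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]
lda = LdaModel(corpus, num_topics=2, id2word=dictionary,
               passes=20, random_state=0)

def topic_vector(tokens):
    """Dense topic-probability vector for one document."""
    bow = dictionary.doc2bow(tokens)
    vec = np.zeros(lda.num_topics)
    for topic_id, prob in lda.get_document_topics(bow,
                                                  minimum_probability=0.0):
        vec[topic_id] = prob
    return vec

a, b = topic_vector(resume), topic_vector(job)
similarity = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"topic similarity: {similarity:.2f}")  # feeds the section score
```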

Model Outcome Evaluation

In addition, the method 300 may further apply rule-based filtering. This could be based on the match made between the individual characteristics and the job profile requirements (i.e. the number of matches and/or the quality of the match). One of the rules used could be a location filter, by country, state, city, or geographical distance, depending in part on the job and the individual. For example, in a small country the geographical distance may be more relevant, since the distance between two points may not be significant. On the other hand, in a large country with many states and cities, people may be less inclined to relocate across the country. Thus, the filter depends on the country in question. The location measurement may be performed using the Google Maps API (or other similar programmes) to retrieve the coordinates of each location (longitude and latitude) and measure the geographical distance between them. The method may factor in the transport links between the two locations when determining the location filter.

The matching of an individual characteristic to a component of the job profile data set may cover several areas, and each area could be viewed as rule-based filtering. However, it is not a simple yes-or-no filter; for the location filter example above, geospatial analysis is also used to understand the physical distance when performing the filtering.

The individual data model will provide an individual data model score for each requirement (or section) in the job description. A different non-linear transformation may be applied to each score, and all the transformed scores are summed up to get an overall score. A final non-linear transformation is performed on the overall score to obtain a final score.

The nonlinear transformation for the individual data model score varies across the different scores, and it is not a fixed formula. As an example, a benchmark formula is used as a starting point, and as more validations and decisions become available, reinforcement learning is used to train the nonlinear transformation formula for each individual requirement (or section).

For example, the benchmark transformation for word embedding may be:

f(x) = x / (1 + exp(0.65 × (0.3 − x))) if x < 0.5, and f(x) = x if x ≥ 0.5

In addition, the method may subtract 70 from the must-have skill word embedding score, as we set 70 as the benchmark (the numerical value provided is purely for illustrative purposes and may be any other suitable value). If the result is negative, a continuously increasing power is applied to amplify the negative score, such that if the difference becomes too negative, no matter how well the individual scores on the other sections, the individual will not be able to obtain a positive score, in other words will not be qualified for the job position. This is an example of a non-linear transformation. A similar non-linear transformation may be applied to each section, but with slight differences to account for the different individual characteristic in each section (i.e. the non-linear transformation applied to each section is similar but unique to that section).

The final matching score is obtained by another transformation:

Final matching score = 100 / (1 + exp(−overall score))
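Putting the pieces together, the sketch below combines per-section scores through section-specific nonlinear transforms, sums them into an overall score, and applies the final sigmoid above. The particular transforms and weights are illustrative assumptions, not the platform's tuned values.

```python
import math

def must_have_transform(score, benchmark=70.0):
    """Amplify shortfalls below the benchmark so a large miss on an
    essential requirement dominates the overall score (the 1.5 power
    and the /10 scaling are assumed values)."""
    diff = score - benchmark
    return diff if diff >= 0 else -((-diff) ** 1.5) / 10.0

# Illustrative section scores: a 0-100 must-have embedding score and
# two probability-style section scores.
section_scores = {"must_have": 80.0, "experience": 0.9, "topics": 0.7}

overall = (must_have_transform(section_scores["must_have"]) / 10.0
           + 2.0 * (section_scores["experience"] - 0.5)   # assumed weight
           + 1.5 * (section_scores["topics"] - 0.5))      # assumed weight

# Final transformation from the text: 100 / (1 + exp(-overall score)).
final = 100.0 / (1.0 + math.exp(-overall))
print(round(final, 1))  # final matching score on a 0-100 scale
```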

In an embodiment, the score calculated, termed the relevance score, has been used and evaluated both internally and externally by recruiters/managers. With more than 50 jobs tested before the fine-tuning, 65% of the people recommended as relevant for a job position using a base model (with tf-idf) were validated as being truly relevant by expert recruiters and by the hiring organisation. For example, out of a shortlist of 20 candidates, 13 candidates were found to be a good match (in other words, the relevance score achieved at least 65% accuracy for those jobs). For the ensemble matching algorithm, using the method, models and final matching score described, an accuracy of 70-75% has been achieved. The accuracy will further increase with additional fine-tuning, and if the percentage of good resumes is higher for each job.

3.2 Loyalty Model

Overview

The loyalty score reflects the probability or likelihood that the individual leaves the organisation, and may be provided at a point in time, for example the likelihood that the individual leaves within a year, or as a plot over a time interval, since the likelihood that the individual leaves over a whole year may vary for various reasons. For example, there may be a higher tendency for employees to leave after a calendar year or financial year, once the employee has received an annual bonus, or the organisation could award a loyalty bonus for every five years of service, so that an employee is more likely to leave only after a certain financial bonus is received. This applies similarly to the other benefits described herein.

A method to calculate the loyalty score is described and shown in Figure 3. The method (300) comprises:

(a) identifying a plurality of individual characteristics from an individual profile data set (block 305);

(b) identifying a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with the plurality of individual characteristics of the job position (block 310);

(c) building a data model based on the job demographic profile data set (block 315);

(d) inputting the identified plurality of individual characteristics into the data model (block 320); and

(e) computing a score for the individual based on the input into the data model, wherein the score is used to determine the suitability of the individual for the job position (block 325).

The processor (820) may identify and compile the job demographic profile data set from a database (200) comprising an organisation data set (205) and a market data set (210). The method (300) may further comprise subsetting the organisation data set and the market data set to form a job demographic profile data set. In this example, the data model is termed a loyalty model and may be considered a predictive model.

Model Design

The profile information (the individual's past employment history) is used as base features for building the loyalty model described here. The employment history of the individual contains the factual target variable, namely whether the individual left the job within the first year after joining, and it is from this that the target variable values are derived; this provides the foundation for supervised learning. The employment history of every individual is extracted and formatted as the job demographic profile data set. Given the common seasonality observed in people's resignation behaviour, a time period (for example one year) is chosen as the time horizon for calculating the loyalty score. The predictive loyalty model utilises a supervised learning strategy and converts the loyalty problem into a binary classification problem. The loyalty model may utilise linear models, for example continuous linear models and/or over-time-horizon linear models.

Besides the existing features inside the personal data, we have also performed feature engineering to derive new features, e.g. the frequency of job change, education status during work, and the similarity of the education background to every job. Most of the features are comprehensible and the feature importance results are easy to explain. More than 300 different features are derived for the modelling stage (a sketch of the target derivation and feature engineering follows). The loyalty and performance models share a similar feature engineering process. As an example of feature engineering, an individual's education qualification is compared with the job function to see how closely he/she is doing something he/she has studied before. Company information from previous jobs, such as company size and industry, also yields derived features. The idea of such feature engineering is to consider the various factors which could affect the fitness of a person to a job, and this fitness is reflected in terms of loyalty (length of stay).
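A minimal sketch of the target derivation and two of the engineered features is shown below. The column names and toy records are assumptions; the real pipeline derives more than 300 features.

```python
import pandas as pd

# Toy employment history; in practice this comes from the parsed
# individual profile data set.
history = pd.DataFrame({
    "person_id": [1, 1, 2],
    "start": pd.to_datetime(["2015-01-01", "2016-03-01", "2014-06-01"]),
    "end":   pd.to_datetime(["2015-10-01", "2019-03-01", "2018-06-01"]),
})

history["tenure_months"] = (history["end"] - history["start"]).dt.days / 30.44

# Target variable: did the person leave within the first year of the job?
history["left_within_1y"] = (history["tenure_months"] < 12).astype(int)

# Engineered features: job-change frequency and average tenure per person.
features = history.groupby("person_id").agg(
    n_jobs=("start", "count"),
    avg_tenure_months=("tenure_months", "mean"),
)
print(features)
```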

To minimise bias in the model, such as gender or ethnicity bias, the model does not include such attributes as single features in the modelling process. Moreover, the training data is re-sampled to have a more balanced distribution for such attributes, so that bias in the historical data is not carried forward into future predictions.
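One simple way to implement the re-sampling step is sketched below: the training data is down-sampled so that each group of a sensitive attribute is equally represented. The column names are assumptions, and the attribute is used only for balancing, never as a model feature.

```python
import pandas as pd

def balance_by_group(df: pd.DataFrame, group_col: str, seed: int = 0):
    """Down-sample every group to the size of the smallest group, so
    the sensitive attribute has a balanced distribution in training."""
    n_min = df[group_col].value_counts().min()
    return (df.groupby(group_col, group_keys=False)
              .apply(lambda g: g.sample(n=n_min, random_state=seed)))

# Toy imbalanced training set (2 vs 6 rows).
train = pd.DataFrame({"gender": ["F"] * 2 + ["M"] * 6, "y": range(8)})
balanced = balance_by_group(train, "gender")
print(balanced["gender"].value_counts())  # 2 of each group
```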

The loyalty model may implement a deep neural network algorithm for a fast model training process and accurate prediction results. The loyalty model utilises more than 300 different features, such that tree-based methods such as Random Forest or Gradient Boosting (GBM) would take too long to train, as there are too many combinatorial possibilities when comparing the entropy change. The deep neural network algorithm is able to nonlinearly transform the 300 different features into significantly fewer features, depending on the number of hidden layers and the number of nodes in each layer, and converges much faster. Moreover, a deep neural network is able to fit hard-to-guess nonlinear statistical behaviour because of the intrinsic nature of the algorithm, and can provide better model fitting results with proper overfitting prevention. Further, two or more models may be blended to provide a better or faster fit. Examples of models that may be used include continuous linear models, over-time-horizon linear models, Random Forest and neural network models.

With domain knowledge in recruitment, the model is optimised through a weighted cost function which gives different importance to false positive and false negative predictions. The model training process needs to minimise a cost function for convergence; e.g. if accuracy is chosen, the cost function is the number of incorrect matches. However, in our situation it is not appropriate to choose accuracy, because on average only 20-30% of people leave within the first year. The model might give a very high false negative rate if it simply predicts that almost everyone is going to stay. Instead, we use a customised F1-score cost function so that we balance the false positive and false negative predictions.
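The sketch below outlines such a network, using Keras as an assumed framework. The layer sizes, dropout rate, and the class weights (a simple stand-in for the customised F1-style cost described above) are all illustrative.

```python
import numpy as np
from tensorflow import keras

n_features = 300  # engineered loyalty features
# Synthetic stand-in data: ~25% positives (leavers), as in the text.
x = np.random.rand(1000, n_features).astype("float32")
y = (np.random.rand(1000) < 0.25).astype("float32")

model = keras.Sequential([
    keras.layers.Input(shape=(n_features,)),
    keras.layers.Dense(64, activation="relu"),    # compresses 300 features
    keras.layers.Dropout(0.3),                    # overfitting prevention
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),  # P(leave within a year)
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Up-weight the rare positive class so the model cannot win by
# predicting "stays" for everyone (cf. the false-negative problem).
model.fit(x, y, epochs=5, batch_size=32,
          class_weight={0: 1.0, 1: 3.0}, verbose=0)
print(model.predict(x[:3], verbose=0).ravel())  # leaving-risk scores
```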

The loyalty model may be considered a predictive model; when the target variable is known, the model is able to outperform normal human judgement. In addition, feedback provided by the hiring organisation on whether the hired employee leaves within one year (or the chosen time period used in the model) can be used for fine-tuning the loyalty model. Thus, the loyalty model allows organisations to hire individuals who are more likely to stay with the organisation beyond a certain time period, and minimises the organisation's costs in hiring and retraining people to fill positions vacated by high employee turnover.

3.3 Performance Model

Overview

The productivity score measures how productive an employee is, or how productive a potential employee would be. For the former, this can be obtained at least in part from the organisation data, based on the performance review of the employee, including input from a supervisor or manager. For a potential employee, the score could be derived from the candidate's past work experience, education, or a personal recommendation (these could be the individual characteristics identified in the individual profile data set), and from how these translate into future productivity. The productivity score is computed relative to a reference value. A default reference value could be the best performing individual in an organisation or industry, but it will be apparent that other reference values may be used as desired.

A productivity score for an employee or potential employee can also be obtained by determining or computing the productivity score by reference to a productivity threshold, using data from the individual's current and/or previous jobs (for an unemployed job candidate), organisation data, market data, linear models (categorical), and machine learning models. The productivity threshold may be any suitable reference point in the organisation or industry. For example, the reference could be the top performing individual in a job or task in the organisation or industry; an alternative reference could be the mean performance in the job or task. Thus, the productivity score is computed based on a selected reference. The productivity score can be determined based on the organisation data alone and/or together with the market data. This could involve analysing the work performance review of an employee and empirical scores provided by a supervisor. The productivity score may also be determined for potential employees based on their existing work, though this may be more limited due to the limited information for a specific industry or job type. The productivity score of a job candidate can also reflect the predicted future performance of the individual, and can be based on different key performance indicators.

The method (300) used to calculate the productivity score is similar to that used for the loyalty score. The difference lies in the individual characteristics identified, the training set used from the job demographic profile data set, and the type of statistical models used. For example, the job demographic profile data set may comprise the time period a person (or all people in the organisation and/or industry) spends in a first job position in an organisation before being promoted to a second job position in the organisation (i.e. the second job position is at a higher level than the first job position). The second job position may also be in another organisation, i.e. the person leaves for a higher level job at another organisation. If the key performance indicators of an organisation are available, these may be used instead of, or in combination with, the time period. Feature engineering may also be done to generate modified individual characteristics as per the loyalty model.

Model Design

In an example, an organisation's actual Key Performance Indicators (KPIs) are often not publicly available, which may be due in part to the sensitivity of the information. Instead, the expected promotion of an individual based on the work summary information is used. The average promotion time for individuals in a particular industry is used to assess whether the person was promoted more quickly than his/her peers in the same industry and/or job position. The dataset used for building the model is the same as the one mentioned above for building the loyalty model. Again, the profile information is used as base features (or individual characteristics) for building the performance model. In addition, we have also performed feature engineering to derive new features from the existing base features, for example the frequency of job change, education status during work, and the similarity of the education background to every job.

The work summary contains information on job title changes within any particular company. Such a change is considered a promotion within the company and is used as a proxy for the performance of the individual; all information that is not a promotion is excluded from the analysis. The time taken for an individual to be promoted is compared with the average rate of promotion for the industry and seniority of the individual. This gives the expected promotion for an individual. It is a binary indicator, where 1 means that the individual was promoted more quickly than his/her peers and 0 otherwise (a sketch follows). Thus, an individual who is promoted more quickly is expected to perform better than another individual in the same position.
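A minimal sketch of this promotion proxy is shown below. The column names are assumptions, and the industry benchmark, computed here from the toy records themselves, would in practice come from the market data set.

```python
import pandas as pd

# Toy promotion records derived from work summaries.
jobs = pd.DataFrame({
    "person_id":       [1, 2, 3],
    "industry":        ["tech", "tech", "finance"],
    "months_to_promo": [18, 30, 24],   # time to first title change
})

# Average promotion time per industry (stand-in for the market data).
benchmark = jobs.groupby("industry")["months_to_promo"].transform("mean")

# Binary target: 1 = promoted faster than industry peers, 0 otherwise.
jobs["promoted_quicker"] = (jobs["months_to_promo"] < benchmark).astype(int)
print(jobs[["person_id", "promoted_quicker"]])
```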

The performance model is a predictive model; when the target variable is known, the model is able to outperform normal human judgement. In addition, feedback provided by the organisation on the actual performance indicator of the individual who eventually took the job position can be used to fine-tune the predictive model.

3.4 Acceptability Model

Overview

In this example, a method 400 to compute and use an acceptability score for a job candidate is described. The job candidate can be an external candidate or an internal candidate within the organisation seeking a transfer. The acceptability score reflects the likelihood of a job candidate accepting a job offer from the organisation and depends in part on the individual characteristics in the individual profile data set, and the job demographic profile. For example, the acceptability model predicts the probability of the candidate accepting the job offer based on his/her current salary, expected salary and the offered salary. In general, a job candidate is less likely to accept a job offer which is below the market average and vice versa.

Model Design

The individual characteristics used for the acceptability score could be the same as or similar to those used for the loyalty score or productivity score. For example, some individual characteristics could include the salary of the employee (the salary here is taken to be the financial compensation to the employee, including any fixed or variable wage component, like bonuses, commissions, and options), work benefits (e.g. leave, opportunity for professional development, flexible work arrangements), commuting distance and time from home, work satisfaction level, relationships with colleagues, relationships with clients, work environment, and diversity and other characteristics of the workforce in the firm. The individual data includes data on the individual's current job and previous job(s). The industry data may also include the individual's current and previous company and industry. In addition, data on the new (i.e. hiring) organisation or company and its industry may be used in the analysis.

In an example, information on the current salary, expected salary, whether an offer was given to the candidate (yes/no), whether the offer was accepted by the candidate (yes/no), the reason for rejection if the offer was rejected, and the salary offered to the candidate is collected to form a training set for building a salary model that predicts whether a new hire will accept the offer at a particular offered salary.

Since the target outcome is dichotomous (offer accepted or not), linear models (categorical) are used to build the salary data model (a sketch follows). The outcome is a probability score giving the likelihood that the candidate accepts the job offer, termed an acceptability score.
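A minimal sketch of such a categorical (logistic) salary model is shown below. The toy offers and the derived offer-to-expectation ratio feature are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Features per row: [current salary, expected salary, offered salary]
# in $k, from the collected training set.
X = np.array([[60, 70, 75], [60, 70, 62], [90, 100, 95],
              [50, 65, 55], [80, 85, 90], [70, 90, 72]], dtype=float)
y = np.array([1, 0, 1, 0, 1, 0])  # 1 = offer accepted

# Derived feature: offer relative to expectation tends to carry the
# most signal (an assumed, illustrative engineering choice).
ratio = (X[:, 2] / X[:, 1]).reshape(-1, 1)
features = np.hstack([X, ratio])

clf = LogisticRegression().fit(features, y)

# Acceptability score for a new candidate: current 65, expects 80,
# offered 78.
candidate = np.array([[65, 80, 78, 78 / 80]])
print(f"acceptability score: {clf.predict_proba(candidate)[0, 1]:.2f}")
```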

Furthermore, the acceptability score can be used to perform causal analysis like the loyalty score and productivity score. For example, it can be determined using a causal data model whether the time interval between the last interview date and job offer date has an impact on whether a job offer is accepted. In another example, another causal model could be about the number of rounds of negotiation after the job is offered and how that impacts the acceptability.

3.5 Causal Models

Overview

The loyalty model, performance model and benchmark models described thus far provide predictions at an individual level, and may be further utilised with causal models to determine the impact of a treatment on an outcome at a group level, for example the impact of performance on loyalty once the individual loyalty and performance scores have been obtained. In causal analysis, interactive impact is important. The reasons and the impact of those factors depend not only on the candidate's information but also on the details of the job. Therefore, some job information is needed for the feature engineering and modelling.

Model Design

The method (600), shown in Figure 6, comprises:

(a) identifying a plurality of individual characteristics from an individual profile data set (block 605);

(b) identifying a job demographic profile data set, wherein the job demographic profile data set comprises historical data associated with the plurality of individual characteristics of the job position (block 610);

(c) building a data model based on the job demographic profile data set (block 615);

(d) inputting the identified plurality of individual characteristics into the data model (block 620);

(e) computing a score for the individual based on the input into the data model, wherein the score is used to determine the suitability of the individual for the job position (block 625);

(f) building a causal model to identify a causal relationship between at least one of the plurality of individual characteristics and the computed score (block 630); and

(g) computing a second score indicating the elasticity of the score (block 635).

This allows the method 600 to determine how a change in an individual characteristic will affect the loyalty score, i.e. the elasticity of the loyalty score. Although the loyalty model may show how an individual characteristic affects the loyalty score, that alone is not the correct way of measuring the impact of a treatment on an outcome, since the sample is biased. To make the sample unbiased, we have to select a subsample in which the individuals have the same characteristics, except that one group is given the treatment and the other acts as the control. In this scenario, we can measure the true impact of a treatment on an outcome.

Based on how the loyalty score varies with changes in the individual characteristics, recommendations can be provided on how to improve the loyalty score of the individual. In particular, this provides a retention strategy for an employee whose loyalty score falls below the retention threshold. For example, it can be determined how an immediate or future increase in salary will affect the loyalty score of the employee. Other benefits or incentives may also be considered, for example a promotion, additional leave benefits or a study award. Thus, after the loyalty score is obtained for an individual, by further identifying a causal relationship between the loyalty score and at least one of the individual characteristics, a second score may be computed to reflect the elasticity of the loyalty score.

The method 600 may further comprise generating an alert when a change in at least one of the individual characteristics causes the loyalty score to fall below (or rise above) a retention threshold. This is generally only applicable to an existing employee of the organisation. The alert can be sent to a supervisor or the human resources department. The retention threshold may be determined from the organisation and/or market data, based on the probability that an employee with a certain loyalty score leaves. This serves to highlight to a supervisor or human resources department that an employee has a high likelihood of leaving and that additional action may be required. It can also be used to indicate that the measures taken are effective in reducing the likelihood of the employee leaving.

After computing the loyalty score, a causal relationship between at least one of the individual characteristics and the loyalty score is identified and analysed to determine how the individual characteristic affects the loyalty score. For example, the identification and analysis can be done by a causal relationship identification module. The analysis of the causal relationship can be done using at least one of the Rubin causal model, the method of instrumental variables, and the difference-in-differences method. For example, the effects of changes in salary (or financial remuneration), a promotion, and/or the provision of training (or skills development) are analysed to determine how the change will affect the individual's loyalty score.

For example, the following approach can be used in analysing the causal relationship: the difference between what was expected and what was predicted is called the residual error, calculated as: residual error = expected − predicted. Just like the input observations themselves, the residual errors from a time series can have temporal structure like trends, bias, and seasonality. Any temporal structure in the time series of residual forecast errors is useful as a diagnostic, as it suggests information that could be incorporated into the predictive model. An ideal model would leave no structure in the residual error, just random fluctuations that cannot be modelled.

Structure in the residual error can also be modelled directly. There may be complex signals in the residual error that are difficult to directly incorporate into the model. Instead, you can create a model of the residual error time series and predict the expected error for your model. The predicted error can then be subtracted from the model prediction and in turn provide an additional lift in performance. A simple and effective model of residual error is an autoregression. This is where some number of lagged error values are used to predict the error at the next time step. These lag errors are combined in a linear regression model, much like an autoregression model of the direct time series observations.
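The sketch below demonstrates this residual-error autoregression on a synthetic series with autocorrelated errors, using statsmodels' AutoReg as one possible implementation; the lag order and the synthetic data are illustrative.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(0)
n = 200

# Synthetic residual errors with temporal (AR(1)-like) structure, as
# described above: each error depends on the previous one plus noise.
noise = rng.normal(0, 0.3, size=n)
residuals = np.zeros(n)
for t in range(1, n):
    residuals[t] = 0.7 * residuals[t - 1] + noise[t]

# Autoregression on the residual series: lagged errors predict the
# next error at the following time step.
ar = AutoReg(residuals, lags=5).fit()
next_error = ar.predict(start=n, end=n)[0]

# Adjust the main model's forecast by the predicted error
# (residual = expected - predicted, so the correction is additive).
model_prediction = 42.0  # stand-in for the main model's next forecast
corrected = model_prediction + next_error
print(round(next_error, 3), round(corrected, 3))
```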

Based on the productivity score, the retention strategy provided to improve the loyalty score may be varied to reflect the productivity of the employee, preferably so as to retain employees with high productivity scores. Counterfactual analysis can also show different scenarios for retaining employees with different productivity scores, and the costs associated with raising their loyalty scores by the same extent or above a certain threshold, such as the retention threshold. The system may also be used to show how changes in the salary of the employee, or a promotion given to the employee, may change the employee's productivity.

In an example, a method (700) to determine the causal relationship of performance on loyalty is shown in Figure 7. The loyalty model (705) and performance model (710) are built and may be merged to produce one dataset containing the treatment as well as the outcome (block 715). The propensity score is calculated (block 720), one-to-one matching is performed (block 725), and statistical analysis is done on the matched data (block 730). The method (700) comprises merging a plurality of the data models (for example the loyalty model and the performance model explained above) to generate a second causal data model. The method (700) may further comprise retrieving a plurality of data models from a database (200).

The second causal data model may be built by: calculating a propensity score for each individual in the merged data set; matching the propensity scores of all individuals in the merged data set to generate a propensity score matching (PSM) data set comprising pairs of individuals, each pair comprising a first individual who has received the treatment and a second individual who acts as a control; and performing statistical analysis to determine the impact of the treatment.

Propensity Score Matching (PSM) is used to estimate the impact of promotion (the treatment, in this case) on loyalty (the outcome). PSM is generally used in observational studies where random assignment of treatment to subjects is not possible. PSM removes selection bias between the treatment and control groups. The propensity score is simply the probability of receiving treatment, given the covariates.

If the treatment is defined by A = 1 (promoted) and the control by A = 0 (not promoted), then formally the propensity score is defined by

π_i = P(A = 1 | X_i)

i.e. the propensity score for person i is a function of that person's covariates, and is indexed by i because person i has a unique set of covariates X_i. It is the probability of treatment given that person's particular set of covariates.

The propensity score is matched to achieve balance. The match can be made either on the entire set of covariates, by taking the distance between them, or simply on the propensity score. In a randomised trial, the probability of treatment given the covariates does not depend on the covariates, and is a known constant (e.g. 0.5). In an observational study, however, the propensity score is unknown. Nevertheless, the propensity score depends on the observational data, i.e. A and X, both of which are available in the data set, so the propensity score is estimated by treating the treatment as the outcome. Since the treatment is binary, we use logistic regression to estimate the propensity scores, i.e. to calculate the propensity score (block 720). The model is fitted to obtain the predicted probabilities or fitted values for each subject.

The propensity score is a probability, but it is unknown in an observational study. The model is fitted to estimate the probability of a person receiving the treatment. This is done by using all the features of a person as input and the treatment as the outcome (usually binary). Hence, a classification model is built to estimate the probabilities. Once the probabilities are obtained, distance measures like nearest neighbours can be used to determine pairs of treated and control subjects.

The propensity score is a scalar, so each person has a single propensity score value, a single number between zero and one. This greatly simplifies the matching problem, as only one variable needs to be matched as opposed to a whole set of variables. Essentially, the propensity score summarises all the covariates (X), and the matching is done on that summary. The propensity score calculation is an intermediate step used to identify pairs of individuals having the same characteristics, except that one receives the treatment while the other is in the control group.

Finally, the matching is performed. Matching may reduce the sample size because, for each person in the treatment group, we try to find a person in the control group whose propensity score matches (block 725). Once the matching is completed, the result is a dataset comprising pairs of individuals who have the same characteristics, except that one in the pair is given the treatment and the other is the control.

A paired t-test is performed on the outcome to determine the mean impact of the treatment on the treated (block 730). The null hypothesis assumes that the true mean difference between the paired samples, i.e. between the group that was promoted and the group that was not, is zero; under this model, all observable differences are explained by random variation. Conversely, the alternative hypothesis assumes that the true mean difference between the paired samples is not equal to zero. Statistical significance is determined from the p-value: if the p-value is < 0.05, the null hypothesis (as stated above) can be rejected and the alternative hypothesis (as stated above) accepted. Thus, this shows whether there is a statistically significant difference (or impact) of promotion (in the form of the performance model as explained above) on loyalty, i.e. whether the impact is random or not. A sketch of the full pipeline follows.
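The sketch below walks through blocks 720-730 on synthetic data: a logistic regression estimates the propensity scores, each treated subject is matched to its nearest control on that score (with replacement, for simplicity), and a paired t-test is run on the matched outcomes. The data-generating process is an illustrative assumption.

```python
import numpy as np
from scipy.stats import ttest_rel
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 4))                         # covariates
# Treatment assignment depends on covariates (observational setting).
treated = (rng.random(n) < 1 / (1 + np.exp(-X[:, 0]))).astype(int)
# Outcome with a true treatment effect of 0.8.
loyalty = 0.5 * X[:, 1] + 0.8 * treated + rng.normal(0, 1, n)

# Block 720: propensity score = fitted probability of treatment.
ps = LogisticRegression().fit(X, treated).predict_proba(X)[:, 1]

# Block 725: one-to-one nearest-neighbour matching on the score.
t_idx = np.where(treated == 1)[0]
c_idx = np.where(treated == 0)[0]
nn = NearestNeighbors(n_neighbors=1).fit(ps[c_idx].reshape(-1, 1))
_, match = nn.kneighbors(ps[t_idx].reshape(-1, 1))
matched_controls = c_idx[match.ravel()]

# Block 730: paired t-test on the matched outcomes.
effect = np.mean(loyalty[t_idx] - loyalty[matched_controls])
stat, p = ttest_rel(loyalty[t_idx], loyalty[matched_controls])
print(f"mean effect on treated: {effect:.2f}, p-value: {p:.4f}")
```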

3.6 Benchmark Model

Overview

Figure 5 illustrates another embodiment of the invention. In this embodiment, the individual is compared against the market or industry data, in particular a pool of profiles (or people) who are working and/or have worked in a similar job and/or industry, to find the similarity of the individual compared to the industry norm.

In the Benchmark Model, a similarity score is determined by how similar a person is to people who have ever been qualified for this type of job position in our database. This distinguishes a potential applicant based on common traits found in a group of individuals working in the same job function/industry, and indicates whether the candidate has a stereotypical profile in the industry or is an outlier; being more similar to the stereotypical profile adds more credibility to the profile. In many situations, an applicant might not be a good job fit even though they might seem to be based on their job profile/CV. By using the power of big data, we match an applicant against a pool of profiles of people who have ever worked in a similar job/industry and find the applicant's similarity to this pool of candidates.

Workflow Design

Figure 5 further illustrates how the system and method 500 work. The processor 820 identifies a plurality of individual characteristics from the individual profile data set 215. The market data set 210 is subset by the processor 820 to generate a job demographic profile data set, which is used as a benchmark for the applicant's profile. The profile information is used as base features for calculating the Benchmark score. In addition, we have also performed feature engineering to derive new features from the existing base features, for example the frequency of job change, education status during work, and the similarity of the education background to every job. By identifying different individual characteristics and the associated job demographic profile, a different job demographic profile data set is obtained. Inputting the identified individual characteristics allows a similarity score between the applicant and the industry profile for the job to be computed. In particular, the similarity score, which has a value between 0 and 1 inclusive, may be computed using weighted cosine similarity.

The idea behind using a weightage here is to give more importance to items appearing frequently in the base dataset used for calculating the benchmark score. So, if a particular job profile, like software engineer, appears more often in the market data set, it is given a higher weightage than a more obscure profession. This gives more credibility to the Benchmark score.

Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space that measures the cosine of the angle between them. The cosine of 0° is 1, and it is less than 1 for any other angle in the interval (0, π] radians. It is thus a judgment of orientation and not magnitude: two vectors with the same orientation have a cosine similarity of 1, two vectors oriented at 90° relative to each other have a similarity of 0, and two vectors diametrically opposed have a similarity of -1, independent of their magnitude. The cosine similarity is particularly useful in positive space, where the outcome is neatly bounded in [0, 1]. The name derives from the term "direction cosine": unit vectors are maximally "similar" if they are parallel and maximally "dissimilar" if they are orthogonal (perpendicular). This is analogous to the cosine, which is unity (its maximum value) when the segments subtend a zero angle and zero (uncorrelated) when the segments are perpendicular.

The cosine of the angle between two non-zero vectors A and B can be derived using the Euclidean dot product formula:

similarity = cos(θ) = (A · B) / (||A|| ||B||)

This formula takes the prepared training array (from the base dataset) and a test array (the incoming new CV), and outputs the similarity score for each item in the test set against each item in the training dataset (a similarity score matrix). A weight matrix is generated by calculating, for each item in the training dataset, its average similarity score against all other items in the training dataset. This weight matrix is multiplied with the similarity score matrix, and the product is normalised by the sum of the weights in the weight matrix. The final result is an array of scores, one for each item in the test dataset. The similarity score may be used alone or in conjunction with the scores obtained from the other methods described herein to determine the suitability of the candidate and to rank multiple candidates. The similarity score may be viewed as a distance score between two profiles: the individual applying for the job position and the stereotypical profile of people working in the job position.
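A minimal sketch of this weighted cosine similarity calculation is shown below. The toy profile vectors are assumptions, while the weighting rule follows the description above: each training profile is weighted by its average similarity to the rest of the pool, and the weighted scores are normalised by the sum of the weights.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity: (A . B) / (||A|| ||B||)."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy numeric profiles for one job function (the training pool).
train = np.array([[1.0, 0.2, 0.0],
                  [0.9, 0.3, 0.1],
                  [0.2, 0.9, 0.8]])
test = np.array([0.8, 0.25, 0.05])   # incoming new CV

# Similarity of the new CV to every profile in the pool.
sims = np.array([cosine(test, t) for t in train])

# Weight each pool profile by its average similarity to the others,
# so common (stereotypical) profiles count more than outliers.
weights = np.array([
    np.mean([cosine(t, u) for j, u in enumerate(train) if j != i])
    for i, t in enumerate(train)
])

# Weighted scores normalised by the sum of the weights.
benchmark_score = float(sims @ weights / weights.sum())
print(f"benchmark similarity score: {benchmark_score:.2f}")
```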

3.7 Hireability Scoring (Hireability score)

The overall scoring function is a hybrid of the above-mentioned models, and further utilises domain knowledge and user decisions to refine the final scoring function. As an example, the relevance score (and its data models) may be combined with one or more of the loyalty score/model, productivity score (performance model) and acceptability score (salary model). In particular, all three predictive models are combined together with the relevance score to generate a hybrid model.

The job requirements may be grouped into an essential component, a variable component and an optional component, as explained above. The method may apply a step-like function to the essential component to simulate the importance of the essential requirements. For a person's skills that match the optional component, a smaller weightage is given when computing their contribution to the score.

The most commonly used function would be a linear function, e.g. y = ax. A step function is a non-linear function resembling a staircase, e.g. when x is in [0, 2), y = 1, and when x is in [2, 4), y = 2. In another example, a logistic function is used to approximate the step function behaviour (see the sketch below).

The linear function has a constant coefficient, so its growth is proportional. The logistic function, on the other hand, is an S-shaped function which shows a smooth but sharp change within a small range, while at the two ends of the domain the change is insignificant. The appropriate function is chosen based on the statistical behaviour required to obtain the desired results.
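The sketch below contrasts the three shapes discussed: a linear function, a staircase step function, and a logistic approximation of the step. The steepness constant is an illustrative assumption.

```python
import math

def linear(x, a=0.5):
    """Proportional growth: constant coefficient."""
    return a * x

def step(x):
    """Staircase on [0, 4): y = 1 on [0, 2), y = 2 on [2, 4)."""
    return 1.0 if 0 <= x < 2 else 2.0

def logistic_step(x, k=8.0, x0=2.0):
    """Smooth approximation of the step centred at x0: nearly flat at
    both ends, with a sharp change in a small range around x0."""
    return 1.0 + 1.0 / (1.0 + math.exp(-k * (x - x0)))

for x in (0.5, 1.9, 2.1, 3.5):
    print(x, linear(x), step(x), round(logistic_step(x), 3))
```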

While analysing a CV, people do not treat every piece of information about the candidate equally. For each job, there are certain critical requirements and some nice-to-have skills/experience. Therefore, different approaches are required to evaluate each section of the CV, and using such approaches yields more reasonable results. While adopting different approaches, we provide a consistent and appropriate way to understand and summarise each section of the CV.

The method may be further optimised to fine-tune the overall score calculation by using machine learning models with the scores from each section as the input and the users' (hiring side) decisions as the final output. The users' decisions are collected to understand which section (or individual characteristic) is more critical, and the model is fine-tuned for more accurate results. The bias is reduced through a large user base, and the final relevance calculation becomes more accurate.

Table 1: Machine Learning Models and Functions

Table 1 shows an example of the different statistical models and functions that may be used to compute the relevance score in the various models described herein. The different statistical models may be blended to obtain each of the relevance, loyalty, and productivity scores. One way to obtain the models for computing the relevance score is stacked generalisation; another is blending (or stacked ensembling). The combined models allow the system and method to compute the loyalty score, which reflects the likelihood that the individual will leave at a point in time or over a period of time. For example, the loyalty score can be obtained using the linear models (categorical) and machine learning models such as random forest and neural networks, as shown in Table 1. More than one algorithm may be used to develop a model analysing part of the scope, and the output of each model is combined to form the overall score.

There are many different machine learning algorithms available; each has different model fitting behaviour and its own pros and cons in terms of accuracy and interpretability. The linear model mentioned here is usually helpful for causal analysis, while non-linear machine learning algorithms, such as Random Forest-based and support vector machine (SVM)-based algorithms, can give better prediction results for classification problems with non-linear behaviour. The logic behind model selection is based on the statistical distribution of the training data, the domain knowledge, and the evaluation of the model results.

Model selection is an iterative process. In order to select the right model, the relationship between the features and the target variable is checked. One method is to plot a single feature against the target variable and check whether the relation is linear. Alternatively, the correlation coefficient between the two variables may be determined; the correlation coefficient varies between -1 and +1, with values near -1 or +1 indicating linearity. Since repeating this process for each feature can be tedious, a linear model may be built instead to check the accuracy. If the outcome is not good, it could mean either that the features are not good or that the relation between the features and the target is non-linear. The training data may subsequently be used to build a non-linear model such as an SVM, a neural network or a random forest.

The scoring is based on the output of statistical models. For categorical outcomes, the models output a probability value. So if a person has a loyalty of 60%, it means that the probability of the person staying in the company for the next 12 months is 0.60. Multiple models are used to check the consistency of the model outcome across different machine learning models. Another reason for using a selection of models is that models differ in how well they find the relation between the features and the outcome. If the relation between a feature and an outcome is linear, a linear model will suffice; but if the relation is non-linear, using a linear model to fit the data would lead to inaccurate results, whereas models like neural networks are better at capturing non-linear relationships. Blending of the models may be done as required, for example for the ensemble matching (or relevance) score: multiple models are run and the scores from these models are combined, either through voting or weighting, to produce one score (a sketch follows).
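A minimal sketch of combining model outputs by weighting and by voting is shown below. The scores and weights are illustrative assumptions; in practice, the weights would be learned from user decisions.

```python
# Probability outputs from the individual models (illustrative values).
model_scores = {
    "linear":        0.62,
    "random_forest": 0.71,
    "neural_net":    0.68,
}
# Assumed blending weights; in practice these are fine-tuned from the
# users' (hiring side) decisions.
weights = {"linear": 0.2, "random_forest": 0.4, "neural_net": 0.4}

# Weighted blend of the model scores.
blended = sum(model_scores[m] * weights[m] for m in model_scores)
print(f"blended score: {blended:.3f}")

# Voting alternative: majority of models above a 0.5 threshold.
votes = sum(s > 0.5 for s in model_scores.values())
print("majority vote:", int(votes > len(model_scores) / 2))
```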

The methods described herein are provided on a talent management platform, and serve to change and/or enhance the way recruitment is done by optimising the six KPIs of hiring:

1. Quality of hire

2. Cost to hire

3. Time to hire

4. Retention of hire

5. Diversity of hire

6. Candidate's experience of hiring process

Recruiters care about job fit, whether the candidate will stay in the job, and whether the candidate will perform in the job, before hiring the candidate. Getting constant feedback and listening to the pain points recruiters face in their day-to-day activities has been the motivation behind building the loyalty and performance models.

In addition, the methods, models, scores, and system described herein help make the recruitment process much more efficient, with better hiring quality. The workflow design of the platform provides an effective and interactive way for stakeholders to manage the information and all relevant processes. The insights, and the ease provided by modelling and automation, make the method/model genuinely helpful.

The models change the way people do the hiring and the screening of candidates. They combine different unique elements in an unconventional way to evaluate every candidate from several perspectives. The workflow design and the method provide secure and fast data access and analytical solution. The methods described herein apply specific techniques to provide more accurate results, for example by correcting for skewed data sets, and bias in the models.

Certain embodiments are described herein as including logic, modules or mechanisms. Modules may comprise either software modules (for example, code embodied on a computer readable medium) or hardware implemented modules which are configured or arranged in a certain manner to perform certain operations. For example, one or more computer systems (or one or more processors) may be configured by software as a hardware implemented module to perform the described operations. A hardware implemented module may be implemented mechanically or electronically. For example, a hardware implemented module may comprise dedicated circuitry or logic that is permanently configured (like a field programmable gate array or an application specific integrated circuit), or programmable logic or circuitry.

In various embodiments, modules or software can be used to practice certain aspects of the invention. For example, software-as-a-service (SaaS) models or application service provider (ASP) models may be employed as software application delivery models to communicate software applications to clients or other users. Such software applications can be downloaded through an Internet connection, for example, and operated either independently (e.g., downloaded to a laptop or desktop computer system) or through a third-party service provider (e.g., accessed through a third-party web site). In addition, cloud computing techniques may be employed in connection with various embodiments of the invention. In certain embodiments, a "module" may include software, firmware, hardware, or any reasonable combination thereof.

Moreover, the processes associated with the present embodiments may be executed by programmable equipment, such as computers. Software or other sets of instructions that may be employed to cause programmable equipment to execute the processes may be stored in any storage device, such as a computer system (non-volatile) memory. Furthermore, some of the processes may be programmed when the computer system is manufactured or via a computer-readable memory storage medium.

A "computer," "computer system," "computing apparatus," "component," or "computer processor" may be, for example and without limitation, a processor, microcomputer, minicomputer, server, mainframe, laptop, personal data assistant (PDA), wireless e-mail device, smart phone, mobile phone, electronic tablet, cellular phone, pager, processor, fax machine, scanner, or any other programmable device or computer apparatus configured to transmit, process, and/or receive data. Computer systems and computer- based devices disclosed herein may include memory for storing certain software applications used in obtaining, processing, and communicating information. It can be appreciated that such memory may be internal or external with respect to operation of the disclosed embodiments. The memory may also include any means for storing software, including a hard disk, an optical disk, floppy disk, ROM (read only memory), RAM (random access memory), PROM (programmable ROM), EEPROM (electrically erasable PROM) and/or other computer-readable memory media. In various embodiments, a "host," "engine," "loader," "filter," "platform," or "component" may include various computers or computer systems, or may include a reasonable combination of software, firmware, and/or hardware.

In various embodiments of the present invention, a single component may be replaced by multiple components, and multiple components may be replaced by a single component, to perform a given function or functions. Except where such substitution would not be operative to practice embodiments of the present invention, such substitution is within the scope of the present invention. Any of the servers described herein, for example, may be replaced by a "server farm" or other grouping of networked servers (e.g., a group of server blades) that are located and configured for cooperative functions. It can be appreciated that a server farm may serve to distribute workload between/among individual components of the farm and may expedite computing processes by harnessing the collective and cooperative power of multiple servers. Such server farms may employ load-balancing software that accomplishes tasks such as, for example, tracking demand for processing power from different machines, prioritizing and scheduling tasks based on network demand, and/or providing backup contingency in the event of component failure or reduction in operability.

In general, it will be apparent to one of ordinary skill in the art that various embodiments described herein, or components or parts thereof, may be implemented in many different embodiments of software, firmware, and/or hardware, or modules thereof. The software code or specialized control hardware used to implement some of the present embodiments is not limiting of the present invention. For example, the embodiments described hereinabove may be implemented in computer software using any suitable computer programming language such as .NET, SQL, MySQL, or HTML using, for example, conventional or object-oriented techniques. Programming languages for computer software and other computer-implemented instructions may be translated into machine language by a compiler or an assembler before execution and/or may be translated directly at run time by an interpreter. Examples of assembly languages include ARM, MIPS, and x86; examples of high level languages include Ada, BASIC, C, C++, C#, COBOL, Fortran, Java, Lisp, Pascal, Object Pascal; and examples of scripting languages include Bourne script, JavaScript, Python, Ruby, PHP, and Perl. Various embodiments may be employed in a Lotus Notes environment, for example. Such software may be stored on any type of suitable computer-readable medium or media such as, for example, a magnetic or optical storage medium. Thus, the operation and behaviour of the embodiments are described without specific reference to the actual software code or specialized hardware components. The absence of such specific references is feasible because it is clearly understood that artisans of ordinary skill would be able to design software and control hardware to implement the embodiments of the present invention based on the description herein with only a reasonable effort and without undue experimentation.

Various embodiments of the systems and methods described herein may employ one or more electronic computer networks to promote communication among different components, transfer data, or share resources and information. Such computer networks can be classified according to the hardware and software technology used to interconnect the devices in the network, such as optical fiber, Ethernet, wireless LAN, HomePNA, power line communication, or G.hn. The computer networks may also be embodied as one or more of the following types of networks: local area network (LAN); metropolitan area network (MAN); wide area network (WAN); virtual private network (VPN); storage area network (SAN); or global area network (GAN), among other network varieties. As employed herein, an application server may be a server that hosts an API to expose business logic and business processes for use by other applications. Examples of application servers include J2EE and Java EE 5 application servers such as WebSphere Application Server. Other examples include WebSphere Application Server Community Edition (IBM), Sybase Enterprise Application Server (Sybase Inc.), WebLogic Server (BEA), JBoss (Red Hat), JRun (Adobe Systems), Apache Geronimo (Apache Software Foundation), Oracle OC4J (Oracle Corporation), Sun Java System Application Server (Sun Microsystems), and SAP NetWeaver AS (ABAP/Java). Application servers may also be provided in accordance with the .NET framework, including the Windows Communication Foundation, .NET Remoting, ADO.NET, and ASP.NET, among several other components. For example, a JavaServer Page (JSP) is translated by the web container into a servlet, which executes within the container and is functionally comparable to a CGI script. JSPs can be used to create HTML pages by embedding references to the server logic within the page. The application servers may mainly serve web-based applications, while other servers can perform as session initiation protocol servers, for instance, or work with telephony networks. Specifications for enterprise application integration and service-oriented architecture can be designed to connect many different computer network elements. Such specifications include the Business Application Programming Interface, Web Services Interoperability, and the Java EE Connector Architecture.
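As a non-limiting illustration of the servlet model just described, the following minimal Java servlet (assuming the standard javax.servlet API) is the kind of class a web container executes, and the form into which a JSP page is ultimately translated. The class name ScoreServlet and the request parameter are hypothetical.

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Minimal servlet sketch: server-side logic that emits an HTML page,
    // much as a JSP embeds references to server logic within the page.
    public class ScoreServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            String candidate = req.getParameter("candidate"); // hypothetical parameter
            resp.setContentType("text/html");
            resp.getWriter().println(
                    "<html><body>Score requested for candidate "
                    + candidate + "</body></html>");
        }
    }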

Embodiments of the methods and systems described herein may divide functions between separate CPUs, creating a multiprocessing configuration. For example, multiprocessor and multi-core (multiple CPUs on a single integrated circuit) computer systems with co-processing capabilities may be employed. Also, multitasking may be employed as a computer processing technique to handle the concurrent execution of multiple computer programs.
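For illustration only, the following Java sketch shows one common multiprocessing/multitasking pattern: a fixed thread pool sized to the available processor cores divides independent work items among them, and the operating system schedules the resulting threads. The task decomposition shown is hypothetical.

    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Sketch: divide independent partial sums across the available cores.
    public class ParallelWork {
        public static void main(String[] args) throws Exception {
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);

            List<Callable<Long>> tasks = List.of(
                    () -> partialSum(0, 1_000_000),
                    () -> partialSum(1_000_000, 2_000_000),
                    () -> partialSum(2_000_000, 3_000_000));

            long total = 0;
            for (Future<Long> f : pool.invokeAll(tasks)) {
                total += f.get(); // blocks until each partial result is ready
            }
            pool.shutdown();
            System.out.println("total = " + total);
        }

        private static long partialSum(long from, long to) {
            long s = 0;
            for (long i = from; i < to; i++) s += i;
            return s;
        }
    }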

In various embodiments, the computer systems, data storage media, or modules described herein may be configured and/or programmed to include one or more of the above-described electronic, computer-based elements and components, or computer architecture. In addition, these elements and components may be particularly configured to execute the various rules, algorithms, programs, processes, and method steps described herein. The description is merely exemplary in nature and is not intended to limit the invention or the application and uses of the invention. Furthermore, there is no intention to be bound by any theory presented in the preceding background of the invention or the following detailed description. It will be understood by those skilled in the art that many variations or modifications in details of design or construction may be made without departing from the present invention.

Other technical advantages may become readily apparent to one of ordinary skill in the art after review of the following figures and description.