Title:
ANALYSIS AND COMPARISON OF CHARACTER-CODED DIGITAL DATA, IN PARTICULAR FOR JOB MATCHING
Document Type and Number:
WIPO Patent Application WO/2021/089129
Kind Code:
A1
Abstract:
Computer-implemented job matching is disclosed. Based on a combination of pattern matching and concept extraction from natural language information, a template is automatically amended with an abstract concept of a concept network. Concept extraction is based on an association of a natural language expression with an abstract concept of a concept network. Pattern matching is used to discriminate between relevant and irrelevant natural language expressions based on the context of a natural language expression. For each of multiple different combinations of a candidate and a job, a numerical matching value based on corresponding candidate and job templates is automatically calculated. An ordered list associated with the multiple different combinations is displayed via a visualization means, or sent to a user device for display via a visualization means. The list comprises an ordering based on the numerical matching values.

Inventors:
DEVOS GEERT (BE)
PLATTEAU FRANK (BE)
Application Number:
PCT/EP2019/080286
Publication Date:
May 14, 2021
Filing Date:
November 05, 2019
Assignee:
NALANTIS NV (BE)
NALANTIS HOLDING LTD (CN)
International Classes:
G06Q10/10; G06F40/10
Domestic Patent References:
WO2019106437A2    2019-06-06
Foreign References:
US20090276415A1    2009-11-05
US20150317610A1    2015-11-05
US20160232160A1    2016-08-11
Other References:
SPEER, CHIN, HAVASI: "ConceptNet 5.5: An Open Multilingual Graph of General Knowledge", PROCEEDINGS OF THE THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE (AAAI-17), 2017, pages 4444 - 4451
Attorney, Agent or Firm:
CRABBE, Ellen (BE)
Claims:
CLAIMS

1. Computer-implemented method for job matching, comprising the steps of:

• automatically amending for each candidate of one or more candidates a candidate template based on character-coded digital data comprising candidate information expressed in a natural language;

• automatically amending for each job of one or more jobs a job template based on character-coded digital data comprising job information expressed in a natural language;

• automatically calculating for each of multiple different combinations of a candidate and a job a numerical matching value based on corresponding candidate and job templates; and

• displaying via a visualization means, or sending to a user device for display via a visualization means, an ordered list associated with the multiple different combinations, which list comprises an ordering based on the numerical matching values, wherein a template is automatically amended via retrieving an abstract concept of a concept network based on a combination of pattern matching and concept extraction from the natural language information and inserting the abstract concept in the template, wherein concept extraction is based on an association of a natural language expression with an abstract concept of a concept network, wherein pattern matching is used to discriminate between relevant and irrelevant natural language expressions based on the context of a natural language expression.

2. Computer-implemented method according to any one of the preceding claims, comprising the steps of: providing a job batch comprising a plurality of digital job documents, wherein each digital job document comprises character-coded digital data comprising job information expressed in a natural language; and automatically amending for each digital job document of the job batch a job template.

3. Computer-implemented method according to any one of the preceding claims, comprising the steps of: providing a candidate batch comprising a plurality of digital candidate documents, wherein each digital candidate document comprises character-coded digital data comprising candidate information expressed in a natural language; and automatically amending for each digital candidate document of the candidate batch a candidate template.

4. Computer-implemented method according to any one of the preceding claims, comprising verifying an amended template via:

• automatically generating from the amended template verification information expressed in a natural language, thereby converting each abstract concept to a natural language expression;

• displaying via a visualization means, or sending to a user device for display via a visualization means, the verification information;

• obtaining via a user input device, or receiving from a user device, confirmation and/or correction data based on the displayed verification information.

5. Computer-implemented method according to any one of the preceding claims, comprising the steps of: obtaining filter data, preferably via a graphical user interface or from a user device based on input via a graphical user interface; and displaying via the visualization means, or sending to the user device for display via a visualization means, an adjusted ordered list, wherein the adjusted ordered list is based on the filter data.

6. Computer-implemented method according to preceding claim 5, wherein the filter data is based on one or more abstract concepts of the concept network.

7. Computer-implemented method according to preceding claim 6, wherein the filter data is obtained via a graphical user interface or from a user device based on input via a graphical user interface, wherein the graphical user interface is associated with the visualization means, wherein the filter data is based on spatial reconfiguration, such as drag and drop, of a displayed abstract concept to a filter region displayed via the visualization means.

8. Computer-implemented method according to any one of the preceding claims, wherein pattern matching is used to discriminate between relevant and irrelevant natural language expressions based on phrasal analysis and/or semantic analysis of a phrase and/or paragraph containing a natural language expression.

9. Computer-implemented method according to any one of the preceding claims, wherein the concept network comprises a multitude of interconnected abstract concepts, and for each concept an expression in each of multiple natural languages.

10. Computer-implemented method according to preceding claim 9, wherein a natural language expression is connected to an abstract concept, preferably by a synonym connection.

11. Computer-implemented method according to any one of the preceding claims 9 and 10, wherein a natural language expression is not directly connected to another natural language expression.

12. Computer-implemented method according to any one of the preceding claims 9 to 11, wherein an expression in a first natural language is automatically added to the concept network, preferably via machine learning, based on translation pairs of documents of a corpus, wherein a translation pair of documents pertains to the first and a second natural language, said automatic adding of an expression comprising the steps of:

• detecting in a first document in the first natural language a first group of expressions, wherein an unknown expression of the first group is not comprised in the concept network, wherein at least two known expressions of the first group are comprised in the concept network;

• determining in a corresponding second document in the second natural language a corresponding second group of expressions, based on locations in the documents and/or the abstract concepts associated with the known expressions of the first group;

• determining a target expression in the second group corresponding to the unknown expression of the first group; and

• connecting the unknown expression of the first group to the abstract concept associated with the target expression of the second group.

13. Computer-implemented method according to any one of the preceding claims 9 to 12, wherein said multiple natural languages comprise at least English, German, French, Chinese, Japanese, Spanish, Portuguese, Swedish, Danish, Italian and Dutch.

14. Computer-implemented method according to any one of the preceding claims, wherein the concept network comprises a hierarchy of abstract occupation concepts.

15. Computer-implemented method according to preceding claim 14, wherein the hierarchy of occupation concepts is based on the Standard Occupational Classification (SOC) System.

16. Computer-implemented method according to any one of the preceding claims, wherein the concept network comprises a hierarchy of abstract competence concepts.

17. Computer-implemented method according to preceding claim 16, wherein the hierarchy of competence concepts is based on the International Standard Classification of Occupations (ISCO).

18. Computer-implemented method according to claim 14 or 15 and according to claim 16 or 17, wherein an occupation concept is connected to one or more competence concepts, preferably by a connotation connection.

19. Computer-implemented method according to preceding claim 18, comprising the step of automatically detecting a new occupation concept based on a corpus of documents, preferably via machine learning, by detecting clusters, preferably recurring clusters, of competence concepts in the corpus which correspond to an insufficient extent to a common occupation concept.

20. Computer-implemented method according to any one of the preceding claims, wherein the concept network comprises a hierarchy of abstract qualification concepts.

21. Computer-implemented method according to preceding claim 20, wherein a qualification concept is connected to one or more competence concepts.

22. Computer-implemented method according to preceding claim 21, wherein a candidate is associated with competence concepts, wherein a job is associated with competence concepts, wherein a gap in competence concepts between the candidate and the job is automatically determined, wherein a qualification concept is automatically determined to at least in part fill the gap in competence concepts, wherein a suggestion in a natural language of the determined qualification concept in association with the job is displayed via a visualization means, or sent to a user device for display via a visualization means.

23. Computer-implemented method according to any one of the preceding claims 21 and 22, wherein a candidate is associated with competence concepts, wherein a job is associated with competence concepts, wherein the calculation of a numerical matching value for a candidate and a job depends at least in part on the competence concepts associated with the candidate and the competence concepts associated with the job.

24. A computer system for job matching, wherein the computer system comprises means configured to perform a method according to any one of the preceding claims 1 to 23.

25. A computer program for job matching, wherein the computer program comprises instructions which, when the program is executed by a computer, cause the computer to carry out a method according to any one of the preceding claims 1 to 23.

Description:
ANALYSIS AND COMPARISON OF CHARACTER-CODED DIGITAL DATA, IN PARTICULAR FOR JOB MATCHING

FIELD OF THE INVENTION

The present invention relates to computer-implemented analysis and comparison of character-coded digital data, in particular lexical, phrasal and semantic analysis (G06F 17/27), as well as natural language processing and generation (G06F 17/28).

BACKGROUND

WO 2019/106 437 A2 discloses a computer-implemented method for matching bids for work (e.g. job postings) with offers for work (e.g. resumes). Natural language processing techniques are utilized to interpret work-specific terminology from the bids for work and/or the offers for work. The bids for work are matched with the offers for work based on a predefined distance measure between the bids for work and the offers for work.

WO 2019/106 437 A2 discloses a compartmentalized set of vocabulary terms (Jobzi Ontology), which are contextualized, placed in a hierarchy, synonym-enriched and relationship-oriented. Relationships between terms may take a variety of forms, such as 'is a synonym to', 'is a part of', 'is a', etc.

WO 2019/106 437 A2 discloses an association (Meta Work) of a job title with corresponding information such as knowledge, skills, courses, and professional requirements including experience and salary. Skills are weighted, denoting the importance of a skill for the job title. Matching is based on weighted skills.

WO 2019/106 437 A2 discloses penalizing lapse of experience, e.g. having worked as a "truck driver" 10 years ago vs. currently working as a "truck driver".

WO 2019/106 437 A2 is however not suitable for parsing documents comprising narrative information, such as "I am a marketing director, reporting to the general manager", as is often encountered in a job application letter. With the methodology of WO 2019/106 437 A2, it would be unclear whether "marketing director", "general manager", none, or both are relevant. WO 2019/106 437 A2 discloses automatic parsing of resumes and job postings, but remains silent on verification of the parsed information.

WO 2019/106 437 A2 discloses an ontology, but remains silent on the creation of the ontology. On a rapidly evolving job market, with new job titles and skills emerging and subsiding, maintaining the ontology is labor-intensive.

WO 2019/106 437 A2 discloses an ontology, but remains silent on the use of multiple languages.

Speer, Chin and Havasi, "ConceptNet 5.5: An Open Multilingual Graph of General Knowledge", in Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), pages 4444 - 4451 (2017), discloses a multilingual concept network. A concept network is a knowledge graph which connects words and phrases with labeled edges (e.g. 'Synonym', 'FormOf'). Multilingual functionality is realized in ConceptNet by labelling terms with a language (e.g. 'en', 'fr', 'it') and linking words in different languages, e.g. via a 'synonym' relation, e.g. 'polyglotte (fr)' being a 'synonym' of 'multilingual (en)'.

A problem with the concept network of Speer (2017) is that natural language expressions in different languages which are designated as synonyms may each cover one or more typical common meanings in their own language, so that particulars or partial overlaps of these meanings can be lost in translation.

The present invention aims to resolve at least some of the above-mentioned problems.

SUMMARY OF THE INVENTION

In a first aspect, the present invention provides a computer-implemented method for job matching, according to claim 1.

In a second aspect, the present invention provides a computer system for job matching, wherein the computer system, such as a server, comprises means, such as a processor, configured to perform the method according to the first aspect.

In a third aspect, the present invention provides a computer program for job matching, wherein the computer program comprises instructions which, when the computer program is executed by a computer, such as a computer system according to the second aspect, cause the computer to carry out the method according to the first aspect. The present invention may further provide a tangible non-transitory computer-readable data carrier, such as a Compact Disk, a Hard Disk Drive or a Solid State Drive, comprising the computer program.

The three aspects of the present invention are interrelated. Therefore, every feature disclosed above or below may pertain to each of the aspects of the present invention, even if the feature has been disclosed in conjunction with a particular aspect.

The pattern matching allows for processing natural language information in narrative form, thereby discriminating between relevant and irrelevant natural language expressions. For example, in the expression "I am a marketing director, reporting to the general manager, and closely collaborating with a graphical designer", three natural language expressions (e.g. 'marketing director (en)'; 'general manager (en)'; and 'graphical designer (en)') related to abstract occupation concepts (e.g. Marketing_Director_OCC; General_Manager_OCC; and Graphical_Designer_OCC) may be identified. Via pattern matching, it may be assessed that the occupation of the person associated with the narrative, typically a candidate, is marketing director and not general manager or graphical designer. This has to be contrasted with a template filled out in a natural language, where a priori associations are evident via fields of the template, i.e. a natural language expression in the field for current occupation being natural language information of the candidate's current occupation.
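The discrimination described above can be illustrated with a minimal sketch. It assumes a tiny hypothetical lexicon (`LEXICON`) and two hand-written context patterns (`OWN_ROLE`); neither is taken from the disclosure, and a practical system would rely on richer phrasal and semantic analysis than regular expressions.

```python
import re

# Hypothetical lexicon: natural language expression -> abstract occupation concept.
LEXICON = {
    "marketing director": "Marketing_Director_OCC",
    "general manager": "General_Manager_OCC",
    "graphical designer": "Graphical_Designer_OCC",
}

# Hypothetical context patterns marking the author's own occupation.
# "{expr}" is replaced by the (escaped) natural language expression.
OWN_ROLE = [r"\bI am an? {expr}\b", r"\bI work as an? {expr}\b"]


def extract_own_occupations(text):
    """Return concepts whose expression occurs in an 'own role' context."""
    found = []
    for expr, concept in LEXICON.items():
        for pattern in OWN_ROLE:
            regex = pattern.format(expr=re.escape(expr))
            if re.search(regex, text, flags=re.IGNORECASE):
                found.append(concept)
                break  # one matching context is enough for this expression
    return found


narrative = (
    "I am a marketing director, reporting to the general manager, "
    "and closely collaborating with a graphical designer"
)
print(extract_own_occupations(narrative))
```

All three expressions are present in the narrative, but only the one occurring in an "I am a ..." context is retained as the occupation of the author.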

The present invention in particular provides for automatically at least partially filling out a template with abstract concepts of a concept network based on natural language information in narrative form, such as a job application letter or a job posting, based on a combination of pattern matching and concept extraction from the natural language information in narrative form. The natural language information in narrative form is provided as character-coded digital data. This automation mitigates the need for manually filling out a variety of different templates. Template amendment may be carried out for a candidate and/or a job, such as a job application letter of a candidate or a job posting of a job.

Further advantages of the invention, and in particular of preferred embodiments, are disclosed in the detailed description below.

DETAILED DESCRIPTION OF THE INVENTION

The present invention concerns a computer-implemented method, a computer system, and a computer program for job matching. The present invention has been summarized in the corresponding section above. In what follows, the present invention is described in detail, preferred embodiments are discussed, and the present invention is illustrated by means of non-limiting examples.

Unless otherwise defined, all terms used in disclosing the invention, including technical and scientific terms, have the meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. By means of further guidance, term definitions are included to better appreciate the teaching of the present invention.

"A", "an", and "the" as used herein refer to both singular and plural referents unless the context clearly dictates otherwise. By way of example, "a compartment" refers to one or more than one compartment.

"Comprise", "comprising", "comprises" and "comprised of" as used herein are synonymous with "include", "including", "includes" or "contain", "containing", "contains" and are inclusive or open-ended terms that specify the presence of what follows (e.g. component) and do not exclude or preclude the presence of additional, non-recited components, features, elements, members, steps, known in the art or disclosed therein.

"Based on" as used herein is synonymous with "based at least in part on" and is an inclusive or open-ended term that specifies the presence of what follows and does not exclude or preclude the presence of additional, non-recited components, features, members, steps, known in the art or disclosed therein.

For each candidate of one or more candidates, preferably multiple candidates, a candidate template is automatically amended based on character-coded digital data comprising candidate information expressed in a natural language. For each job of one or more jobs, preferably multiple jobs, a job template is automatically amended based on character-coded digital data comprising job information expressed in a natural language. For each of multiple different combinations of a candidate and a job, a numerical matching value is automatically calculated based on corresponding candidate and job templates, preferably based on predetermined heuristic rules, more preferably based on a predefined distance measure between the candidate and job templates. An ordered list associated with the multiple different combinations is displayed via a visualization means, or sent to a user device for display via a visualization means. The list comprises an ordering based on the numerical values.
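As an illustration of the calculation and ordering described above, the following sketch scores every candidate/job combination and sorts the result. The Jaccard similarity between concept sets is an assumption made for this example only; the disclosure merely requires a numerical matching value, preferably based on heuristic rules or a predefined distance measure. The identifiers `c1`, `c2` and `j1` are hypothetical.

```python
from itertools import product


def matching_value(candidate_concepts, job_concepts):
    """Jaccard similarity in [0, 1]; an assumed stand-in for the matching value."""
    union = candidate_concepts | job_concepts
    if not union:
        return 0.0
    return len(candidate_concepts & job_concepts) / len(union)


def ranked_matches(candidates, jobs):
    """Score each (candidate, job) combination and order by descending value."""
    scored = [
        (cand_id, job_id, matching_value(cand, job))
        for (cand_id, cand), (job_id, job) in product(candidates.items(), jobs.items())
    ]
    return sorted(scored, key=lambda row: row[2], reverse=True)


# Templates reduced to sets of abstract concepts (hypothetical data).
candidates = {
    "c1": {"Marketing_Director_OCC", "Initiative"},
    "c2": {"Biomedical_Engineering_WE"},
}
jobs = {"j1": {"Marketing_Director_OCC", "Initiative", "Biomedical_Engineering_WE"}}

for cand_id, job_id, value in ranked_matches(candidates, jobs):
    print(cand_id, job_id, round(value, 2))
```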

A template is automatically amended via retrieving an abstract concept of a concept network based on a combination of pattern matching and concept extraction from the natural language information comprised in the character-coded digital data and inserting the abstract concept in the template. A candidate template can be automatically amended via retrieving an abstract concept of a concept network based on a combination of pattern matching and concept extraction from candidate information expressed in a natural language, in particular in narrative form, comprised in character-coded digital data and inserting the abstract concept in the candidate template. A job template can be automatically amended via retrieving an abstract concept of a concept network based on a combination of pattern matching and concept extraction from job information expressed in a natural language, in particular in narrative form, comprised in character-coded digital data and inserting the abstract concept in the job template. Concept extraction is based on an association of a natural language expression with an abstract concept of a concept network. Pattern matching is used to discriminate between relevant and irrelevant natural language expressions based on the context of a natural language expression, in particular based on phrasal analysis and/or semantic analysis of a phrase and/or paragraph containing a natural language expression.

A "concept network" as used herein is synonymous with "ontology", and refers to a knowledge graph comprising nodes and labeled edges. A "node" of the concept network represents an abstract concept or an expression in a natural language. Preferably, an abstract concept is a natural language-independent concept, in the sense that it is not utilized in any natural language as such. For example, Marketing_Director_OCC is an abstract occupation concept for 'marketing director (en)'. The former is not utilized in any natural language as such, while the latter can be utilized in the natural language English. A non-limiting list of examples of abstract concept types comprises occupation concepts, competence concepts, work experience concepts, and qualification (or education) concepts. An "edge" or "connection" of the concept network connects two nodes, and comprises a relationship type. A relationship can be symmetric or asymmetric. A non-limiting list of examples of symmetric relationship types comprises 'antonym', 'distinct from', 'etymologically related to', 'located near', 'related to', 'similar to' and 'synonym'. A non-limiting list of examples of asymmetric relationship types comprises 'at location', 'capable of', 'causes', 'causes desire', 'created by', 'defined as', 'derived from', 'desires', 'entails', 'external URL', 'form of', 'has a', 'has context', 'has first subevent', 'has last subevent', 'has prerequisite', 'has property', 'instance of', 'is a', 'made of', 'manner of', 'motivated by goal', 'obstructed by', 'part of', 'receives action', 'sense of', 'symbol of' and 'used for'. For example, "concept network" is a 'synonym' to "ontology"; "copper" 'is a' "metal"; and "wheel" is a 'part of' a "car". Particular relationships utilized in the concept network of the present invention are 'connotation', 'parent', 'child', and 'has domain' relationships.
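A knowledge graph with labeled edges of this kind could be represented as in the sketch below. The class `ConceptNetwork` and its methods are hypothetical names chosen for illustration; the disclosure does not prescribe a storage format.

```python
from collections import defaultdict


class ConceptNetwork:
    """Minimal labeled graph: nodes are strings, edges carry a relationship type."""

    def __init__(self):
        # node -> set of (relationship type, target node)
        self._edges = defaultdict(set)

    def add_edge(self, source, relation, target, symmetric=False):
        """Add a labeled edge; a symmetric relationship is stored in both directions."""
        self._edges[source].add((relation, target))
        if symmetric:
            self._edges[target].add((relation, source))

    def neighbors(self, node, relation):
        """Return the nodes reachable from `node` via edges of type `relation`."""
        return {target for rel, target in self._edges.get(node, ()) if rel == relation}


net = ConceptNetwork()
# Symmetric relationship between a natural language expression and an abstract concept.
net.add_edge("marketing director (en)", "synonym", "Marketing_Director_OCC", symmetric=True)
# Asymmetric relationship: "copper" 'is a' "metal", but not the reverse.
net.add_edge("copper", "is a", "metal")

print(net.neighbors("Marketing_Director_OCC", "synonym"))
print(net.neighbors("metal", "is a"))
```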

Preferably, the concept network comprises a multitude of interconnected abstract concepts. Preferably, the concept network comprises for each concept an expression in a natural language. Preferably, the concept network comprises for each concept an expression in each natural language of multiple natural languages. Preferably, the multiple natural languages comprise at least two of English, German, French, Chinese, Japanese, Spanish, Portuguese, Swedish, Danish, Italian and Dutch. Preferably, the multiple natural languages comprise at least English, German, French, Chinese, Japanese, Spanish, Portuguese, Swedish, Danish, Italian and Dutch.

This is advantageous as it allows matching a resume or job application letter in a first natural language with a job posting in a second natural language different from the first natural language. In multilingual countries or regions, such as Belgium or Switzerland, or in multilingual environments, such as English being a preferred scientific communication language irrespective of the natural language of the locality, this avoids the need to manually or automatically translate a resume, job application letter or job posting, before matching can be performed.

Preferably, a natural language expression is connected to an abstract concept. Preferably, a natural language expression is connected to an abstract concept by a synonym connection. Preferably, a natural language expression is connected to one or more abstract concepts. Preferably, a natural language expression is only connected to abstract concepts. Preferably, a natural language expression is not connected to another natural language expression.

This is advantageous as it mitigates faulty translations due to only partially overlapping meanings for different natural language expressions, both for the same or different natural languages. Where a natural language expression can have multiple meanings, e.g. a 'handyman (en)' being a Sailor_OCC or a Repairer_OCC, the natural language expression is connected to each of the abstract occupation concepts, but most preferably not to other natural language expressions. Via pattern matching and/or contextual information, the correct abstract occupation concept can be identified. By traversing the concept network via abstract concepts, faulty translations or synonym usage can be mitigated.
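The sketch below illustrates expressions connected only to abstract concepts, with an ambiguous expression resolved from context and a second language reached via the concept rather than via a direct expression-to-expression link. The Dutch entries, the cue words in `CONTEXT_CUES` and the word-overlap scoring are assumptions made for this example.

```python
# Expressions connect only to abstract concepts, never to each other.
# The Dutch entries are illustrative assumptions.
EXPRESSION_TO_CONCEPTS = {
    ("handyman", "en"): {"Sailor_OCC", "Repairer_OCC"},
    ("klusjesman", "nl"): {"Repairer_OCC"},
    ("matroos", "nl"): {"Sailor_OCC"},
}

# Hypothetical context cues per concept, standing in for real contextual analysis.
CONTEXT_CUES = {
    "Sailor_OCC": {"ship", "deck", "vessel"},
    "Repairer_OCC": {"repair", "plumbing", "maintenance"},
}


def disambiguate(expression, lang, sentence):
    """Pick the concept whose cue words overlap most with the sentence."""
    concepts = EXPRESSION_TO_CONCEPTS.get((expression, lang), set())
    words = set(sentence.lower().split())
    # Sort by descending cue overlap, then by name for a deterministic result.
    ranked = sorted(concepts, key=lambda c: (-len(CONTEXT_CUES.get(c, set()) & words), c))
    return ranked[0] if ranked else None


def expressions_for(concept, lang):
    """Traverse from an abstract concept to its expressions in a given language."""
    return sorted(
        expr
        for (expr, expr_lang), concepts in EXPRESSION_TO_CONCEPTS.items()
        if expr_lang == lang and concept in concepts
    )


concept = disambiguate("handyman", "en", "Worked as handyman on deck of a cargo ship")
print(concept, expressions_for(concept, "nl"))
```

Because the Dutch expression is looked up from the disambiguated concept, the other meaning of 'handyman (en)' does not leak into the result.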

Preferably, the concept network comprises a hierarchy of abstract occupation concepts. Preferably, abstract concepts of a hierarchy are connected via parent/child connections. Preferably, the hierarchy of occupation concepts is based on the Standard Occupational Classification (SOC) System. Preferably, the concept network comprises a plurality of abstract competence concepts. Preferably, the concept network comprises a hierarchy of abstract work experience concepts. Preferably, the plurality of competence concepts and/or the hierarchy of work experience concepts is based on the International Standard Classification of Occupations (ISCO). Preferably, an occupation concept is connected to one or more competence and/or work experience concepts. Preferably, an occupation concept is connected to one or more competence and/or work experience concepts via a connotation connection. Preferably, the concept network comprises a hierarchy of abstract qualification concepts. Preferably, a qualification concept is connected to one or more competence concepts. Preferably, a qualification concept is connected to one or more competence concepts via a connotation connection.

In a preferred embodiment, a job batch comprising a plurality of digital job documents is provided and processed. Each digital job document comprises character-coded digital data comprising job information expressed in a natural language. For each digital job document of the job batch, a job template is automatically amended, as described above.

In a preferred embodiment, a candidate batch comprising a plurality of digital candidate documents is provided and processed. Each digital candidate document comprises character-coded digital data comprising candidate information expressed in a natural language. For each digital candidate document of the candidate batch, a candidate template is automatically amended, as described above.

In an embodiment, an amended template is manually verified. Preferably, from the amended template verification information is automatically generated. Preferably, the verification information is expressed in a natural language. Preferably, each abstract concept is thereby converted to a natural language expression. Preferably, the verification information is displayed via a visualization means, or sent to a user device for display via a visualization means. Preferably, confirmation and/or correction data based on displayed verification information is obtained via a user input device, or received from a user device.

In a preferred embodiment, filter data is obtained. Preferably, the filter data is obtained via a graphical user interface or from a user device based on input via a graphical user interface. An adjusted ordered list is displayed via the visualization means, or sent to the user device for display via a visualization means. The adjusted ordered list is based on the filter data. Preferably, the filter data is based on one or more abstract concepts of the concept network. Preferably, a user can filter for items in the ordered list based on one or more abstract concepts of the concept network. Preferably, the graphical user interface is associated with the visualization means. Preferably, the filter data is based on spatial reconfiguration, such as drag and drop, preferably drag and drop with a cursor device, of a displayed abstract concept to a filter region displayed via the visualization means.
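The verification and filtering described above could be sketched as follows. The mapping `PREFERRED_EXPRESSION`, the representation of a template as a dictionary of topics, and all function names are hypothetical; user interaction (display, drag and drop) is outside the scope of the sketch and is represented only by its resulting data.

```python
# Hypothetical preferred expression per (abstract concept, language).
PREFERRED_EXPRESSION = {
    ("Marketing_Director_OCC", "en"): "marketing director",
    ("Initiative", "en"): "initiative",
}


def verification_lines(template, lang):
    """Render each abstract concept of a template as a natural language line."""
    lines = []
    for topic, concepts in template.items():
        for concept in sorted(concepts):
            # Fall back to the concept identifier when no expression is known.
            label = PREFERRED_EXPRESSION.get((concept, lang), concept)
            lines.append(f"{topic}: {label}")
    return lines


def apply_corrections(template, rejected):
    """Return a copy of the template without the concepts the user rejected."""
    return {topic: concepts - rejected for topic, concepts in template.items()}


def filter_ordered_list(ordered, concepts_by_candidate, required):
    """Keep rows whose candidate has every required concept; order is preserved."""
    return [row for row in ordered if required <= concepts_by_candidate[row[0]]]


template = {"Experience": {"Marketing_Director_OCC"}, "Competences": {"Initiative"}}
print(verification_lines(template, "en"))
print(apply_corrections(template, rejected={"Initiative"}))
```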

In a preferred embodiment, a new occupation concept is automatically detected based on a corpus of documents, preferably via machine learning, by detecting clusters, preferably recurring clusters, of competence concepts in the corpus which correspond to an insufficient extent to a common occupation concept present in the concept network.
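A deliberately simple sketch of this detection is given below. It replaces machine learning with plain counting: a set of competence concepts that recurs across documents and overlaps poorly with every known occupation is flagged. The thresholds `min_count` and `max_overlap`, the Jaccard overlap and all concept names are assumptions of this example.

```python
from collections import Counter


def jaccard(a, b):
    """Overlap of two sets in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_new_occupations(documents, occupations, min_count=2, max_overlap=0.5):
    """Flag recurring competence clusters that no known occupation covers well.

    documents: iterable of sets of competence concepts, one set per document.
    occupations: occupation concept -> set of connected competence concepts.
    """
    counts = Counter(frozenset(doc) for doc in documents if doc)
    flagged = []
    for cluster, count in counts.items():
        if count < min_count:
            continue  # not a recurring cluster
        best = max((jaccard(cluster, comps) for comps in occupations.values()), default=0.0)
        if best < max_overlap:
            flagged.append(set(cluster))
    return flagged


occupations = {"Data_Analyst_OCC": {"Statistics", "SQL"}}
documents = [
    {"Statistics", "SQL"},
    {"Statistics", "SQL"},
    {"Prompt_Design", "Model_Evaluation"},
    {"Prompt_Design", "Model_Evaluation"},
    {"Welding"},
]
print(candidate_new_occupations(documents, occupations))
```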

A candidate may be associated with competence concepts. A candidate may be associated with competence concepts directly, for example when competences are explicitly mentioned in natural language expressions in the character-coded digital data (e.g. a job application letter or resume), or indirectly, for example as implied by a qualification concept (e.g. education) or an occupation concept from previous employment.

A job may be associated with competence concepts. A job may be associated with competence concepts directly, for example when competences are explicitly mentioned in natural language expressions in the character-coded digital data (e.g. job posting), or indirectly, for example as implied by a qualification concept (e.g. education) as required or an occupation concept as required work experience.

In a preferred embodiment, a gap in competence concepts between a candidate and a job is automatically determined. A qualification concept is automatically determined to at least in part fill the gap in competence concepts. A suggestion in a natural language of the determined qualification concept in association with the job is displayed via a visualization means, or sent to a user device for display via a visualization means.

In a preferred embodiment, the calculation of a numerical matching value for a candidate and a job depends at least in part on the competence concepts and/or work experience concepts associated with the candidate and the competence concepts and/or work experience concepts associated with the job.
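The gap determination and qualification suggestion can be illustrated with set operations. In this sketch the gap is the set of competence concepts required by the job and absent from the candidate, and the suggested qualification is the one covering the largest part of the gap; this selection rule and all concept names are assumptions made for illustration.

```python
def competence_gap(candidate_competences, job_competences):
    """Competence concepts required by the job that the candidate lacks."""
    return job_competences - candidate_competences


def suggest_qualification(gap, qualifications):
    """Return the qualification covering most of the gap, or None if none helps.

    qualifications: qualification concept -> set of connected competence concepts.
    """
    best, best_cover = None, 0
    for name in sorted(qualifications):  # sorted for a deterministic tie-break
        cover = len(qualifications[name] & gap)
        if cover > best_cover:
            best, best_cover = name, cover
    return best


candidate = {"SQL"}
job = {"SQL", "Statistics", "Python"}
qualifications = {
    "Statistics_Certificate_QUAL": {"Statistics"},
    "Data_Science_MSc_QUAL": {"Statistics", "Python"},
}

gap = competence_gap(candidate, job)
print(sorted(gap), suggest_qualification(gap, qualifications))
```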

In a preferred embodiment, an expression in a first natural language is automatically added to the concept network, preferably via machine learning, based on translation pairs of documents of a corpus. Preferably, the first natural language is a 'novel' natural language, for which the incorporation into the concept network is incomplete. A translation pair of documents pertains to the first and a second natural language, i.e. a document in the first natural language and a document in the second natural language, both documents being translations of one another. In a first document in the first natural language a first group of expressions is detected. An 'unknown' expression of the first group is not comprised in the concept network. At least two 'known' expressions of the first group are comprised in the concept network. In a corresponding second document in the second natural language a corresponding second group of expressions is detected, based on locations within the (first and second) documents and/or the abstract concepts associated with the known expressions of the first group. A target expression in the second group corresponding to the unknown expression of the first group is determined. The unknown expression of the first group is connected to the abstract concept associated with the target expression of the second group.
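The steps above could be sketched as follows, under strong simplifying assumptions: each document of the translation pair is already reduced to a list of expressions forming one group, the second group is determined from the abstract concepts of the known expressions only (not from locations), and an expression is learned only when exactly one candidate remains on each side. The Dutch and English entries are illustrative.

```python
def learn_expression(first_group, second_group, first_lexicon, second_lexicon):
    """Connect an unknown expression of the first language to an abstract concept.

    first_group / second_group: expressions from a translation pair of documents.
    first_lexicon / second_lexicon: expression -> abstract concept, per language.
    Returns (unknown expression, concept), or None when the case is ambiguous.
    """
    unknown = [e for e in first_group if e not in first_lexicon]
    known_concepts = {first_lexicon[e] for e in first_group if e in first_lexicon}
    if len(unknown) != 1 or len(known_concepts) < 2:
        return None  # exactly one unknown and at least two known expressions are needed

    # Second-language expressions matching the known concepts anchor the group.
    anchors = [e for e in second_group if second_lexicon.get(e) in known_concepts]
    leftovers = [e for e in second_group if second_lexicon.get(e) not in known_concepts]
    if len(anchors) < 2 or len(leftovers) != 1:
        return None
    target = leftovers[0]
    if target not in second_lexicon:
        return None  # the target expression must itself be in the concept network
    return unknown[0], second_lexicon[target]


# Hypothetical lexicons: Dutch is the 'novel' language, English is established.
nl_lexicon = {"verkoop": "Sales_COMP", "boekhouding": "Accounting_COMP"}
en_lexicon = {
    "sales": "Sales_COMP",
    "accounting": "Accounting_COMP",
    "procurement": "Procurement_COMP",
}

learned = learn_expression(
    ["verkoop", "inkoop", "boekhouding"],
    ["sales", "procurement", "accounting"],
    nl_lexicon,
    en_lexicon,
)
print(learned)
```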

This is advantageous, as it allows for bootstrap acquisition of a novel language in the concept network. Starting from an initial set of natural language expressions in the novel language, added via machine translation and/or manually, further natural language expressions in the novel language can be automatically added as disclosed above.
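The linking of an unknown expression via a translation pair can be sketched as follows. This is a simplified illustration under stated assumptions: the concept network is modelled as a dictionary from (language, expression) pairs to abstract concepts, the expression groups have already been detected, and the target expression is identified solely through the abstract concepts of the known expressions (locations within the documents are not used). The function name, the Dutch expressions and the concept names "Project" and "Metabolomics" are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (language, expression) -> abstract concept
Network = Dict[Tuple[str, str], str]


def link_unknown(
    network: Network,
    first_lang: str,
    first_group: List[str],
    second_lang: str,
    second_group: List[str],
) -> Optional[str]:
    """Connect the single unknown expression of the first group to a concept.

    Returns the concept, or None if the groups do not allow an unambiguous link.
    """
    unknown = [e for e in first_group if (first_lang, e) not in network]
    known = [e for e in first_group if (first_lang, e) in network]
    # Require exactly one unknown and at least two known expressions.
    if len(unknown) != 1 or len(known) < 2:
        return None
    known_concepts = {network[(first_lang, e)] for e in known}
    # Target candidates: second-group expressions whose concept is not
    # already accounted for by a known expression of the first group.
    targets = [
        e
        for e in second_group
        if (second_lang, e) in network
        and network[(second_lang, e)] not in known_concepts
    ]
    if len(targets) != 1:
        return None
    concept = network[(second_lang, targets[0])]
    network[(first_lang, unknown[0])] = concept
    return concept


network: Network = {
    ("en", "monitor"): "Initiative",
    ("en", "project"): "Project",
    ("en", "metabolomics"): "Metabolomics",
    ("nl", "opvolgen"): "Initiative",
    ("nl", "project"): "Project",
}
linked = link_unknown(
    network,
    "nl", ["opvolgen", "metabolomica", "project"],
    "en", ["monitor", "metabolomics", "project"],
)
```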

The invention is further described by the following non-limiting examples which further illustrate the invention, and are not intended to, nor should they be interpreted to, limit the scope of the invention.

EXAMPLE

A candidate uploads via his user device a document (e.g. resume or job application letter) to a server according to the second aspect of the present invention. The document comprises character-coded digital data, which is analyzed on the server, and based on which a candidate template associated with the candidate is amended. The document and the candidate template are stored in a database of the server.

Consider, for example, that the document comprises the sentence "I have closely monitored the metabolomics project." in the natural language English. Via pattern matching and concept extraction, the concepts "Initiative" and "Biomedical_Engineering_WE" are retrieved from the document and added to the candidate template. Via concept extraction, the natural language expressions 'monitor (en)' and 'metabolomics project (en)' are identified in the sentence, as well as their respective associated concepts "Initiative" and "Biomedical_Engineering_WE" from the concept network. Via pattern matching, a degree of certainty is obtained that the identified concepts relate to the candidate, and if the degree of certainty is sufficiently high, the identified concepts are added to the candidate template. In this way, the document is fully analyzed.
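The analysis of the example sentence can be sketched as follows. The sketch assumes a concept network reduced to a dictionary from English expressions to concepts, and a small set of context patterns with associated degrees of certainty. The patterns, the certainty values, the default certainty and the threshold of 0.7 are hypothetical and chosen for illustration only; the expressions and concepts are taken from the example above.

```python
import re
from typing import Dict, List, Set

# Expression (en) -> abstract concept, as in the example sentence.
CONCEPT_NETWORK: Dict[str, str] = {
    "monitor": "Initiative",
    "metabolomics project": "Biomedical_Engineering_WE",
}

# Hypothetical context patterns: (pattern, degree of certainty that the
# sentence relates to the candidate).
CONTEXT_PATTERNS = [
    (re.compile(r"\bI (have|had|am|was)\b"), 0.9),
    (re.compile(r"\b(my colleague|our client)\b", re.IGNORECASE), 0.2),
]
DEFAULT_CERTAINTY = 0.5  # assumed value when no pattern matches


def extract_concepts(sentence: str) -> List[str]:
    """Concept extraction: concepts whose expression occurs in the sentence."""
    lowered = sentence.lower()
    return [c for expression, c in CONCEPT_NETWORK.items() if expression in lowered]


def certainty(sentence: str) -> float:
    """Pattern matching: lowest certainty among the matching context patterns."""
    matched = [value for pattern, value in CONTEXT_PATTERNS if pattern.search(sentence)]
    return min(matched) if matched else DEFAULT_CERTAINTY


def amend_template(sentence: str, template: Set[str], threshold: float = 0.7) -> None:
    """Add the extracted concepts only if the certainty is sufficiently high."""
    if certainty(sentence) >= threshold:
        template.update(extract_concepts(sentence))


candidate_concepts: Set[str] = set()
amend_template("I have closely monitored the metabolomics project.", candidate_concepts)

other_concepts: Set[str] = set()
amend_template("My colleague monitored the metabolomics project.", other_concepts)
```

The second call shows the discriminating role of the context: the same expressions are present, but the concepts are not added because the sentence does not appear to relate to the candidate.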

The candidate template comprises a details part and a concepts part. The details part comprises fields pertaining to candidate details, such as, for example, name, address, pointers to uploaded files, language, seniority, work class, academic level, domain of work and/or phone number. These details are most preferably also automatically obtained based on pattern matching. The details part may in particular comprise geographic coordinates, such as latitude and longitude, to allow for distance filtering for job matching. Geographic coordinates may be retrieved automatically based on an address of the candidate. The details part may in particular comprise a candidate ID (for database retrieval), which is preferably a character string comprising alphanumeric characters and/or dashes. The concepts part is divided into a plurality of topics, including for example Experience, Language, Education and Competences. The concepts retrieved from the document have been added to the corresponding topics. For example, "Biomedical_Engineering_WE" denotes work experience in biomedical engineering, and is therefore added to the Experience topic. Each concept may comprise a weight. A work experience concept may comprise a weight based on length of experience, preferably adjusted for penalizing lapse of experience. A competence concept may comprise a weight based on weights of associated concepts, e.g. work experience concepts, and/or prevalence in the document.
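A possible in-memory representation of such a template, together with a weight for a work experience concept, is sketched below. The class name, the field names, the example candidate ID and the exponential lapse penalty with a five-year half-life are assumptions made for illustration; the description only states that the weight is based on length of experience and is preferably adjusted to penalize lapse of experience.

```python
from dataclasses import dataclass, field
from typing import Dict

TOPICS = ("Experience", "Language", "Education", "Competences")


@dataclass
class CandidateTemplate:
    # Details part: e.g. name, address, language, latitude, longitude, candidate ID.
    details: Dict[str, object] = field(default_factory=dict)
    # Concepts part: topic -> {concept: weight}.
    concepts: Dict[str, Dict[str, float]] = field(
        default_factory=lambda: {topic: {} for topic in TOPICS}
    )


def experience_weight(
    years_of_experience: float,
    years_since_last_use: float,
    half_life: float = 5.0,  # assumed: the weight halves every five years of lapse
) -> float:
    """Weight based on length of experience, penalized for lapse of experience."""
    return years_of_experience * 0.5 ** (years_since_last_use / half_life)


template = CandidateTemplate(details={"candidate_id": "a1b2-c3d4", "language": "en"})
template.concepts["Experience"]["Biomedical_Engineering_WE"] = experience_weight(
    years_of_experience=4, years_since_last_use=5
)
```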

The template preferably also comprises in association with each concept the corresponding natural language expression of the document and/or a corresponding reference to a particular corresponding part of the document. A webpage is generated comprising verification information. The verification information is expressed in natural language and derived from the candidate template. Each concept of the template is thereto converted to a natural language expression corresponding to the natural language denoted in the details part of the candidate template. The webpage is sent to the user device of the candidate and displayed via a screen of the user device. The webpage is configured to, at the option of the candidate, amend natural language expressions via a user input device of the user device. The webpage furthermore comprises a confirmation button to confirm the originally represented and/or amended natural language expressions of the webpage. This allows for automatically generating an at least partially filled out template, and only manually editing where needed. Preferably, corrected information is retained for automatic improvement, preferably via machine learning, of document analysis.

The candidate may then perform a search for jobs, based on the candidate template. For a combination of the candidate template and a job template, a numerical matching value is obtained. An ordered list is generated, comprising job identifications and/or job pointers, ordered according to corresponding numerical matching values. A webpage comprising the ordered list is generated, sent to the user device, and presented to the candidate via the screen of the user device.
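The calculation of numerical matching values and the generation of the ordered list can be sketched as follows. The scoring formula is an assumption: the description does not fix one, so a simple weighted coverage of the job concepts by the candidate concepts is used here. The job identifications and weights are hypothetical.

```python
from typing import Dict, List, Tuple

Concepts = Dict[str, float]  # concept -> weight


def matching_value(candidate: Concepts, job: Concepts) -> float:
    """Fraction of the total job concept weight covered by the candidate (0 to 1)."""
    total = sum(job.values())
    if total == 0:
        return 0.0
    covered = sum(min(candidate.get(concept, 0.0), weight) for concept, weight in job.items())
    return covered / total


def ordered_jobs(candidate: Concepts, jobs: Dict[str, Concepts]) -> List[Tuple[str, float]]:
    """Job identifications with their matching values, best match first."""
    scored = [(job_id, matching_value(candidate, concepts)) for job_id, concepts in jobs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidate = {"Initiative": 1.0, "Biomedical_Engineering_WE": 2.0}
jobs = {
    "job-1": {"Initiative": 1.0, "Leadership": 1.0},
    "job-2": {"Biomedical_Engineering_WE": 1.0, "Initiative": 1.0},
}
ranking = ordered_jobs(candidate, jobs)
```

Capping each term at the job weight keeps the value between 0 and 1, so that surplus experience in one concept does not compensate for a missing concept.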

The webpage furthermore comprises functionality to set or unset filters and/or to reorder the list, such as, for example, based on distance or numerical matching value.
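A distance filter based on the geographic coordinates of the details parts can be sketched with the haversine great-circle formula. The choice of this formula, the mean Earth radius of 6371 km, the job identifications and the example coordinates (approximate city centres) are assumptions for illustration.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def filter_by_distance(
    candidate_location: Tuple[float, float],
    job_locations: Dict[str, Tuple[float, float]],
    max_km: float,
) -> List[str]:
    """Job identifications within max_km of the candidate, nearest first."""
    lat, lon = candidate_location
    distances = {
        job_id: haversine_km(lat, lon, job_lat, job_lon)
        for job_id, (job_lat, job_lon) in job_locations.items()
    }
    nearby = [job_id for job_id, d in distances.items() if d <= max_km]
    return sorted(nearby, key=lambda job_id: distances[job_id])


brussels = (50.8503, 4.3517)
job_locations = {
    "job-antwerp": (51.2194, 4.4025),
    "job-paris": (48.8566, 2.3522),
}
within_100_km = filter_by_distance(brussels, job_locations, max_km=100.0)
```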

Upon selection of a job identification and/or pointer in the ordered list on the webpage, information specifically pertaining to the job is displayed, such as, for example, a numerical matching value for each concept category, an overall numerical matching value, a job posting or pointer thereto, and information pertaining to the job template. The job template may also comprise a details part and a concepts part. The details part comprises fields pertaining to job details, such as, for example, company name, address, pointers to uploaded files, language, seniority, work class, academic level, domain of work and/or phone number. These details are most preferably also automatically obtained based on pattern matching. The details part may in particular comprise geographic coordinates, such as latitude and longitude, to allow for distance filtering for job matching. Geographic coordinates may be retrieved automatically based on an address of the job. The details part may in particular comprise a job ID (for database retrieval), which is preferably a character string comprising alphanumeric characters and/or dashes. The concepts part is divided into a plurality of topics, including for example Experience, Language, Education and Competences. Each concept may comprise a weight.

If certain competences or education are missing for the candidate to comply fully with the job posting, suggestions for further qualification and/or education may also be presented to the candidate.