Title:
PSEUDONYMISATION AND REVERSAL OF PERSONALLY IDENTIFIABLE INFORMATION
Document Type and Number:
WIPO Patent Application WO/2019/132645
Kind Code:
A1
Abstract:
The present invention relates to a system and method for pseudonymisation and reversal of personally identifiable information for privacy protection. The system comprises a service requestor (100a) for initiating a request; and a service provider (101) for performing pseudonymisation upon receipt of an authorized request from the service requestor and returning the outcome of the pseudonymisation to the service requestor. The service provider (101) further comprises provider components, whereby the service provider performs a first pseudonymisation by passing a personally identifiable information (PII) input through a zero knowledge (ZK) function using a first key; and performs a second pseudonymisation by passing the result of the first pseudonymisation through a ZK function using a second key.

Inventors:
GOH ALWYN (MY)
LEE KAY WIN (MY)
NG KANG SIONG (MY)
POH GEONG SEN (MY)
MOHAMAD MOESFA SOEHEILA (MY)
Application Number:
PCT/MY2018/050079
Publication Date:
July 04, 2019
Filing Date:
November 19, 2018
Assignee:
MIMOS BERHAD (MY)
International Classes:
H04L9/32; H04L9/08
Foreign References:
US20110202767A1 2011-08-18
US20120239942A1 2012-09-20
US20080165005A1 2008-07-10
US20170177798A1 2017-06-22
US20140006553A1 2014-01-02
Attorney, Agent or Firm:
MIRANDAH ASIA (MALAYSIA) SDN BHD (MY)
Claims:
CLAIMS

1. A system (100) for pseudonymisation and reversal of personally identifiable information for privacy protection comprising:

a service requestor (100a) for initiating a request;

a service provider (101) for performing pseudonymisation upon receipt of an authorized request from the service requestor (100a) and returning outcome of pseudonymisation to the service requestor (100a);

characterized in that

the service provider (101) further comprises provider components within the service provider (101); wherein the service provider (101) performs a first pseudonymisation by taking a personally identifiable information, PII input and undergo a zero knowledge, ZK function using a first key; and

performs a second pseudonymisation using result of first pseudonymisation and undergo the ZK function using a second key.

2. A method (200) for pseudonymisation and reversal of personally identifiable information for privacy protection comprising steps of:

providing information comprising personally identifiable information, PII elements, operative subset of PII, authentication credentials by a service requestor to a service provider (202);

verifying information received from the service requestor by the service provider to confirm authorization of the service requestor to proceed with request by the service requestor (204);

proceeding with pseudonymisation computation on input of PII elements or subset thereof of the PII which would result in an assigned corresponding virtual person identifier, VPI element (206);

assigning VPI as a pseudonymous identifier for PII and proceeding with a reversal to obtain a real person identifier, RPI element (208); and

providing both VPI and RPI elements to the service requestor (210);

characterized in that assigning VPI as a pseudonymous identifier for PII and proceeding with a reversal to obtain a real person identifier, RPI element (208) further comprises steps of (300):

aggregating PII elements for private information identifiable by one of PII or RPI and said information is placed in a real person table, RPT within a database (302);

aggregating de-identified information, DII elements which is deemed associated with pseudonym of subject identifiable as virtual person identifier, VPI and said elements are placed in a virtual person table, VPT within the same database (304);

computing VPI from PII through a first zero knowledge, ZK function (306); and

computing RPI from VPI through a second zero knowledge, ZK function (308).

3. The method (200) according to Claim 2, wherein computing VPI from PII through a first zero knowledge (ZK) function further comprises computing VPI as output from PII as input containing additional input of a first master key to perform pseudonymisation using a first hashed message authentication code, HMAC (306).

4. The method (200) according to Claim 3, wherein said PII and VPI constitute a unique input-output, IO pair of mutual correspondence and said PII cannot be recovered from said VPI using any computation.

5. The method (200) according to Claim 2, wherein computing RPI from VPI through a second zero knowledge (ZK) function further comprises computing RPI as output from VPI as input containing additional input of a second master key using HMAC computation to perform reverse pseudonymisation (308).

6. The method (200) according to Claim 5, wherein said VPI and RPI constitute a unique IO pair of mutual correspondence and said VPI cannot be recovered from said RPI using any computation.

7. The method (200) according to Claim 2, wherein aggregating PII elements which comprises private information to be protected (302) further comprises steps of (400):

computing VPI from PII using a first key, key 1 when a PII record is created (402); and

computing RPI from VPI using a second key, key 2 (404).

8. The method (200) according to Claim 2, wherein aggregating de-identified information, DII elements which comprises all information of interest other than PII deemed to be associated with pseudonym of subject identifiable as virtual person identifier, VPI element is placed in a virtual person table, VPT within the same database (304) further comprises computing VPI from PII and storing said VPI in the VPT when a user with authorization creates a DII record.

9. The method (200) according to Claim 3, wherein using HMAC computation (306) to perform pseudonymisation is applicable to an individual subject of interest and a group aggregation of multiple subjects, said HMAC computation further comprises generating pseudonymisation keys.

10. The method (200) according to Claim 9, wherein generating pseudonymisation keys further comprises steps (500):

specifying unique group identifier, GI for each group aggregation of multiple subjects (502);

computing virtual group key, VGK as pseudonymisation key at each level of aggregation (504);

computing VPI specific to a particular subject of interest (506);

computing RGK as reverse pseudonymisation key at each equivalent level of aggregation (508); and

computing RPI specific to same subject of interest (510).

11. The method (200) according to Claim 9, wherein generating pseudonymisation keys further comprises protecting HMAC keys by:

accessing record of interest, as specified via particular identifier by service requestor (100a);

assessing whether service requestor possesses authority to request present HMAC computation of interest by service provider (101); and

undertaking HMAC computations internal to HSM by service provider (101) by requesting first identifier as input, and second identifier as output resulting in association of a first record to a second record.

12. The method (200) according to Claim 8, wherein computing VPI from PII and storing said VPI in the VPT when a user with authorization creates a DII record further comprises associating a particular PII record with subject of interest to a pseudonymous DII record; by

accessing PII record of interest by service requestor (100a);

undertaking pseudonymisation computations by service provider (101) with access to first HMAC keys at every level of aggregation;

undertaking assessment as to whether service requestor (100a) possesses authority to request pseudonymisation; and

associating PII record of interest to DII record with positive outcome of the assessment.

13. The method (200) according to Claim 8, wherein computing VPI from PII and storing said VPI in the VPT when a user with authorization creates a DII record further comprises associating a particular pseudonymous DII record of interest to PII record by

accessing DII record of interest by service requestor (100a); and

undertaking pseudonymisation computations through access to second HMAC keys at every level of aggregation by service provider (101) with service provider being able to undertake assessment as to whether service requestor (100a) possesses authority to request reverse pseudonymisation; and

associating DII record of interest to PII with positive outcome of the assessment.

Description:
PSEUDONYMISATION AND REVERSAL OF PERSONALLY IDENTIFIABLE INFORMATION

FIELD OF INVENTION

The present invention relates to a system and method for pseudonymisation and reversal of personally identifiable information. In particular, the present invention provides a mechanism to separate personally identifiable information (PII) from other personal data and allows authorized users to perform re-identification.

BACKGROUND ART

In the era of Big Data, increasing concern over data protection, privacy and the associated rights of individuals has led to the enactment of data privacy legislation in various countries. The most feared form of attack on data privacy is data leakage by the database administrators themselves. A database administrator may not be authorized to access certain data; however, the administrator may nonetheless have full access to the database.

The easiest solution is to encrypt the index of the data using a symmetric key. However, this conventional method allows whichever user is authorized to perform the forward pseudonymisation to also perform the reverse operation, which undermines data security. Another challenge in safeguarding personal data through anonymisation is the risk of re-identification, brought about by the expanding pool of publicly available data that may be correlated to identify specific individuals.

United States Patent No. 9,130,949 B2 (hereinafter referred to as the US 949 B2 Patent) entitled “Anonymizing apparatus and anonymizing method” having a filing date of 27 June 2012 (Patentee: Fujitsu Ltd) discloses an anonymising apparatus inclusive of a processor configured to receive an instruction from a first apparatus, with outcome of an anonymised part of a first data based on a type of application programme that causes the first apparatus to send the instruction, and then transmit the anonymised first data to the first apparatus. The invention as disclosed in the US 949 B2 Patent provides for anonymisation which is undertaken by means of hashed message authentication code (HMAC) computations and is subsequently irreversible.

United States Patent No. US 9,355,258 B2 (hereinafter referred to as the US 258 B2 Patent) entitled “System and method for database privacy protection” having a filing date of 28 September 2011 (Patentee: Tata Consultancy Services Ltd) discloses a system and a method for privacy preservation of sensitive attributes stored in a database, and furthermore reduces the complexity and enhances privacy preservation of the database by determining the distribution of sensitive data based on Kurtosis measurement. The invention as disclosed in the US 258 B2 Patent enables an irreversible outcome of anonymity by means of statistical analysis and transformation of attributes.

United States Patent No. US 8,627,483 B2 (hereinafter referred to as the US 483 B2 Patent) entitled “Data anonymization based on guessing anonymity” having a filing date of 18 December 2008 (Patentee: Accenture Global Services Ltd) defines privacy in terms of a guessing game based on guessing inequality; with the guessing anonymity of a sanitised record to be defined by the number of guesses an attacker needs to be able to correctly guess an original record used to generate a corresponding sanitised record. The invention as disclosed in the US 483 B2 Patent enables an irreversible outcome of anonymity by means of optimisation analysis and insertion of random noise into attribute valuations.

With reference to the above-mentioned citations, there is indeed a need for an improved system and method for pseudonymisation to enable authorised recovery of the personally identifiable information (PII) to de-identified information (DII) association and reversal thereof under stringent security conditions.

SUMMARY OF INVENTION

The present invention relates to a system and method for pseudonymisation and reversal of personally identifiable information. In particular, the present invention provides a mechanism to separate personally identifiable information (PII) and de-identified information (DII) elements, as respectively situated in the real person tablespace (RPT) and the virtual person tablespace (VPT) in the database of interest, by means of hashed message authentication code (HMAC) or equivalent zero knowledge (ZK) computations.

One aspect of the invention provides a system (100) for pseudonymisation and reversal of personally identifiable information for privacy protection. The system comprises a service requestor (100a) for initiating a request; and a service provider (101) for performing pseudonymisation upon receipt of an authorized request from the service requestor and returning the outcome of the pseudonymisation to the service requestor. The service provider (101) further comprises provider components within the service provider; the service provider performs a first pseudonymisation by passing a personally identifiable information (PII) input through a zero knowledge (ZK) function using a first key; and performs a second pseudonymisation by passing the result of the first pseudonymisation through a ZK function using a second key.

Another aspect of the invention provides a method (200) for pseudonymisation and reversal of personally identifiable information for privacy protection. The method comprises the steps of providing information comprising personally identifiable information (PII) elements, operative subset of PII, and authentication credentials by a service requestor to a service provider (202); verifying information received from the service requestor by the service provider to confirm authorization of the service requestor to proceed with the request (204); proceeding with pseudonymisation computation on input of PII elements or a subset thereof, which results in an assigned corresponding virtual person identifier (VPI) element (206); assigning VPI as a pseudonymous identifier for PII and proceeding with a reversal to obtain a real person identifier (RPI) element (208); and providing both VPI and RPI elements to the service requestor (210). Assigning VPI as a pseudonymous identifier for PII and proceeding with a reversal to obtain a real person identifier (RPI) element (208) further comprises the steps (300) of aggregating PII elements, which comprise private information to be protected and are identifiable by one of PII or RPI, and placing them in a real person table (RPT) within a database (302); aggregating de-identified information (DII) elements, which comprise all information of interest other than PII deemed to be associated with the pseudonym of the subject and are identifiable by the virtual person identifier (VPI) element, and placing them in a virtual person table (VPT) within the same database (304); computing VPI from PII through a first zero knowledge (ZK) function (306); and computing RPI from VPI through a second zero knowledge (ZK) function (308).

A further aspect of the invention provides that computing VPI from PII through a first zero knowledge (ZK) function further comprises computing VPI as output from PII as input, with additional input of a first master key, to perform pseudonymisation using a first hashed message authentication code (HMAC) (306); said PII and VPI constitute a unique input-output (IO) pair of mutual correspondence and said PII cannot be recovered from said VPI using any computation.

Yet another aspect of the invention provides that computing RPI from VPI through a second zero knowledge (ZK) function further comprises computing RPI as output from VPI as input, with additional input of a second master key, using HMAC computation to perform reverse pseudonymisation (308); said VPI and RPI constitute a unique IO pair of mutual correspondence and said VPI cannot be recovered from said RPI using any computation.

Still another aspect of the invention provides that aggregating PII elements, which comprise private information to be protected and are identifiable by one of PII or RPI, and placing them in a real person table (RPT) within a database (302) further comprises the steps (400) of computing VPI from PII using a first key, key 1, when a PII record is created (402); and computing RPI from VPI using a second key, key 2 (404).

Another aspect of the invention provides that aggregating de-identified information (DII) elements, which comprise all information of interest other than PII deemed to be associated with the pseudonym of the subject and are identifiable by the virtual person identifier (VPI) element, and placing them in a virtual person table (VPT) within the same database (304) further comprises computing VPI from PII and storing said VPI in the VPT when a user with authorization creates a DII record.

Still another aspect of the invention provides that using HMAC computation (306) to perform pseudonymisation is applicable to an individual subject of interest and a group aggregation of multiple subjects, said HMAC computation further comprises generating pseudonymisation keys.

Yet another aspect of the invention provides that generating pseudonymisation keys further comprises the steps (500) of specifying a unique group identifier (GI) for each group aggregation of multiple subjects (502); computing a virtual group key (VGK) as pseudonymisation key at each level of aggregation (504); computing VPI specific to a particular subject of interest (506); computing a real group key (RGK) as reverse pseudonymisation key at each equivalent level of aggregation (508); and computing RPI specific to the same subject of interest (510).

Another aspect of the invention provides that generating pseudonymisation keys further comprises protecting HMAC keys by accessing the record of interest, as specified via a particular identifier by the service requestor (100a); assessing whether the service requestor possesses authority to request the present HMAC computation of interest by the service provider (101); and undertaking HMAC computations internal to the HSM by the service provider (101) by requesting a first identifier as input, and a second identifier as output, resulting in association of a first record to a second record.

Still another aspect of the invention provides that computing VPI from PII and storing said VPI in the VPT when a user with authorization creates a DII record further comprises associating a particular PII record with subject of interest to a pseudonymous DII record by accessing the PII record of interest by the service requestor (100a); undertaking pseudonymisation computations by the service provider (101) with access to first HMAC keys at every level of aggregation; undertaking assessment as to whether the service requestor (100a) possesses authority to request pseudonymisation; and associating the PII record of interest to the DII record on positive outcome of the assessment.

Yet another aspect of the invention provides that computing VPI from PII and storing said VPI in the VPT when a user with authorization creates a DII record further comprises associating a particular pseudonymous DII record of interest to a PII record by accessing the DII record of interest by the service requestor (100a); undertaking pseudonymisation computations through access to second HMAC keys at every level of aggregation by the service provider (101), with the service provider being able to undertake assessment as to whether the service requestor (100a) possesses authority to request reverse pseudonymisation; and associating the DII record of interest to the PII record on positive outcome of the assessment.

The present invention consists of features and a combination of parts hereinafter fully described and illustrated in the accompanying drawings, it being understood that various changes in the details may be made without departing from the scope of the invention or sacrificing any of the advantages of the present invention.

BRIEF DESCRIPTION OF ACCOMPANYING DRAWINGS

To further clarify various aspects of some embodiments of the present invention, a more particular description of the invention will be rendered by references to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail through the accompanying drawings.

FIG. 1.0 is a diagram illustrating an architecture of the system of the present invention.

FIG. 2.0 is a flowchart illustrating the general methodology of the present invention.

FIG. 2.0a is a diagram illustrating the interaction between the service requestor and the service provider in computing RPI from PII.

FIG. 2.0b is a diagram illustrating the interaction between the service requestor and the service provider in computing VPI from PII.

FIG. 2.0c is a diagram illustrating the interaction between the service requestor and the service provider in computing RPI from VPI.

FIG. 3.0 is a flowchart illustrating the steps involved in assigning VPI as a pseudonymous identifier for PII and proceeding with a reversal to obtain a real person identifier (RPI) element.

FIG. 4.0 is a flowchart illustrating the steps involved in aggregating PII elements, which comprise private information to be protected and are identifiable by one of PII or RPI, and placing them in a real person table (RPT) within a database.

FIG. 4.0a illustrates a real person table and virtual person table at database level.

FIG. 5.0 and FIG. 5.0a are flowcharts illustrating the steps involved in generating pseudonymisation keys.

FIG. 6.0 illustrates a real person table and virtual person table at database level for multi-tenancy model.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention relates to a system and method for pseudonymisation and reversal of personally identifiable information. The present invention particularly relates to a system and method for separation of personally identifiable information (PII) and de-identified information (DII) elements; such that pseudonymous traversal from a particular PII to the corresponding DII is undertaken by means of a zero knowledge (ZK) computation with input of the PII, or a subset thereof that uniquely identifies the entirety of the data element, and output of the virtual person identifier (VPI), which uniquely identifies the entirety of the corresponding DII element; and also such that the reverse-pseudonymous traversal from a particular DII to the corresponding PII is undertaken by means of a ZK computation with input of the VPI, which uniquely identifies a particular DII element, and output of the RPI, which uniquely identifies the entirety of the corresponding PII element. Hereinafter, this specification will describe the present invention according to the preferred embodiments. It is to be understood that limiting the description to the preferred embodiments of the invention is merely to facilitate discussion of the present invention, and that modifications are envisioned which do not depart from the scope of the appended claims.

Reference is first made to FIG. 1.0 which illustrates an architecture of the proposed system, comprising a service requestor (100a) and a service provider (101). The service requestor (100a) can either be a human or an application. The service requestor (100a) initiates a request while the service provider (101) receives the request from the authorized service requestor (100a), undertakes one of the pseudonymisation or reverse-pseudonymisation computations as requested, and then returns the computation output to the service requestor (100a). The service provider (101) further comprises provider components within the service provider (101), whereby the service provider performs a first pseudonymisation by passing a personally identifiable information (PII) input through a zero knowledge (ZK) function using a first key; and further performs a second pseudonymisation by passing the result of the first pseudonymisation through the ZK function using a second key.

Reference is now made to FIG. 2.0. FIG. 2.0 is a flowchart illustrating the general methodology of the present invention. As illustrated in FIG. 2.0, the method (200) for pseudonymisation and reversal of personally identifiable information for privacy protection is initiated by first providing information comprising personally identifiable information (PII) elements, operative subset of PII, and authentication credentials by a service requestor to a service provider (202). Subsequently, information received from the service requestor is verified by the service provider to confirm authorization of the service requestor to proceed with the request (204). Thereafter, pseudonymisation computation is performed on input of PII elements or a subset thereof, which results in an assigned corresponding virtual person identifier (VPI) element (206). VPI is assigned as a pseudonymous identifier for PII, and a reversal is thereafter performed to obtain a real person identifier (RPI) element (208). Both VPI and RPI elements are provided to the service requestor (210).

Reference is now made to FIG. 2.0a, which is a diagram illustrating the interaction between the service requestor and the service provider in computing RPI from PII. As illustrated in FIG. 2.0a, the interaction commences with submission by the service requestor (100a) of a particular PII or operative subset thereof to the service provider (101), together with other information deemed necessary, inclusive of authentication credentials. The service provider (101) would then undertake access control by verifying the received authentication credentials and the associated authorization capabilities, in order to decide whether to accede to the particular service request. On accession, the service provider would then undertake the pseudonymisation computation on input of the particular PII, or subset thereof. The pseudonymisation computation is performed by passing the PII input through a ZK function; for example, HMAC is performed using a first key (202a). This is followed by another pseudonymisation computation, which passes the result of the first pseudonymisation computation through a ZK function using a second key (203a). The pseudonymisation computation results in output of the corresponding VPI; subsequently, the reverse-pseudonymisation computation on input of the particular VPI results in output of the corresponding RPI. Both VPI and RPI are then returned to the service requestor, which would then append the RPI to the particular PII element in the real person table (RPT), and the VPI to the corresponding DII element in the VPT. The pseudonymisation and reverse-pseudonymisation computations can only be undertaken on access to the respective pseudonymisation and reverse-pseudonymisation keys (202a, 203a), which are in turn subject to their respective access controls. For embodiments in which a higher degree of security is required, these keys (202a, 203a) can reside within a hardware security module (HSM) element, with the ZK computation undertaken interior to such element.
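The two chained computations described for FIG. 2.0a can be sketched as follows. This is a minimal illustration only: the specification names HMAC as an example ZK function but does not fix a hash algorithm, so the use of SHA-256, the hexadecimal encoding of identifiers, the function names and the demonstration keys are all assumptions made for this sketch.

```python
import hashlib
import hmac
from typing import Tuple


def zk(key: bytes, message: bytes) -> str:
    """Keyed one-way function; HMAC-SHA256 is an assumed instantiation."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def pseudonymise(pii: str, key1: bytes, key2: bytes) -> Tuple[str, str]:
    """Return (VPI, RPI) for a PII string (hypothetical helper)."""
    # First computation: PII -> VPI under the pseudonymisation key.
    vpi = zk(key1, pii.encode("utf-8"))
    # Second computation: VPI -> RPI under the reverse-pseudonymisation key.
    rpi = zk(key2, vpi.encode("ascii"))
    return vpi, rpi


# Demonstration keys only; real keys would be generated randomly and,
# per the description, could be held inside an HSM.
key1 = b"demo-pseudonymisation-key"
key2 = b"demo-reverse-pseudonymisation-key"
vpi, rpi = pseudonymise("ID-0001|Jane Doe", key1, key2)
```

Because each step is keyed, a party holding only the stored VPI and RPI values cannot recompute either of them from the other record without the corresponding key.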

Reference is now made to FIG. 2.0b which is a diagram illustrating the interaction between the service requestor and the service provider in computing VPI from PII. As illustrated in FIG. 2.0b, the interaction mechanism commences with submission by the service requestor (100a) of a particular PII or operative subset thereof to the service provider (101), together with other information deemed necessary to undertake access control associated with such service request. On accession, the service provider (101) would then undertake the pseudonymisation computation by passing the particular PII, or subset thereof, through a ZK function, for example HMAC using the first key (202b). The pseudonymisation operation results in output of the corresponding VPI, which is then returned to the service requestor. The service requestor (100a) would subsequently be able to identify and access the DII element identified by the VPI corresponding to the submitted PII or subset thereof, with such pseudonymisation computation only possible on access to the particular pseudonymisation key (202b), as subject to the requisite access control.
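The request flow of FIG. 2.0b, in which the service provider checks the requestor before touching the pseudonymisation key, might look like the sketch below. The class name, the shared-secret credential scheme and the use of HMAC-SHA256 are assumptions for illustration; the specification does not prescribe how credentials are verified.

```python
import hashlib
import hmac
from typing import Dict


class ServiceProvider:
    """Hypothetical provider that guards the pseudonymisation key."""

    def __init__(self, pseudonymisation_key: bytes, authorised: Dict[str, str]):
        self._key = pseudonymisation_key
        # Requestor id -> expected credential (assumed scheme).
        self._authorised = authorised

    def _is_authorised(self, requestor_id: str, credential: str) -> bool:
        expected = self._authorised.get(requestor_id)
        if expected is None:
            return False
        # Constant-time comparison avoids leaking the credential via timing.
        return hmac.compare_digest(expected.encode(), credential.encode())

    def compute_vpi(self, requestor_id: str, credential: str, pii: str) -> str:
        # Access control comes first; the key is used only on accession.
        if not self._is_authorised(requestor_id, credential):
            raise PermissionError("requestor is not authorised to pseudonymise")
        return hmac.new(self._key, pii.encode("utf-8"), hashlib.sha256).hexdigest()


provider = ServiceProvider(b"demo-key", {"clinic-app": "s3cret-token"})
vpi = provider.compute_vpi("clinic-app", "s3cret-token", "ID-0001|Jane Doe")
```

The requestor never sees the key; it receives only the VPI, which it can then use to locate the matching DII element.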

Reference is now made to FIG. 2.0c which is a diagram illustrating the interaction between the service requestor and the service provider in computing RPI from VPI. As illustrated in FIG. 2.0c, the interaction mechanism commences with submission by the service requestor (100a) of a particular VPI to the service provider (101), together with other information deemed necessary to undertake access control (402) associated with such service request. On accession, the service provider (101) would then undertake the reverse-pseudonymisation computation by passing the particular VPI through a ZK function, for example HMAC using the second key (203c), resulting in output of the corresponding RPI, which is then returned to the service requestor. The service requestor (100a) would subsequently be able to identify and access the PII element identified by the RPI corresponding to the submitted VPI. Such reverse-pseudonymisation computation is only possible on access to the particular reverse-pseudonymisation key (203c), as subject to the requisite access control. Reverse-pseudonymisation is typically a more sensitive operation relative to pseudonymisation, as can be realised by the comparative stringency of the applicable access controls.

Reference is now made to FIG. 3.0 which is a flowchart illustrating the steps involved in assigning VPI as a pseudonymous identifier for PII and proceeding with a reversal to obtain a real person identifier (RPI) element. As illustrated in FIG. 3.0, assigning VPI as a pseudonymous identifier for PII and proceeding with a reversal to obtain a real person identifier (RPI) element (208) further comprises the steps (300) of first aggregating PII elements, which comprise private information to be protected. Aggregation of the PII elements enables the private information to be identifiable by one of PII or RPI, and said information is placed in a real person table (RPT) within a database (302). Subsequently, de-identified information (DII) elements, which comprise all information of interest other than PII and are deemed to be associated with the pseudonym of the subject identifiable by the virtual person identifier (VPI) element, are aggregated (304). The aggregated DII elements are placed in a virtual person table (VPT) within the same database (304). Thereafter, VPI is computed from PII and said VPI is stored in the VPT when a user with authorization creates a DII record (305). Thereafter, VPI is computed from PII through a first zero knowledge (ZK) function (306); and RPI is computed from VPI through a second zero knowledge (ZK) function (308). Computing VPI from PII through the first zero knowledge (ZK) function further comprises computing VPI as output from PII as input, with additional input of a first master key, to perform pseudonymisation using a first hashed message authentication code (HMAC) (306); said PII and VPI constitute a unique input-output (IO) pair of mutual correspondence and said PII cannot be recovered from said VPI using any computation. Computing RPI from VPI through the second zero knowledge (ZK) function further comprises computing RPI as output from VPI as input, with additional input of a second master key, using HMAC computation to perform reverse pseudonymisation (308); said VPI and RPI constitute a unique IO pair of mutual correspondence and said VPI cannot be recovered from said RPI using any computation.

Reference is now made to FIG. 4.0 which is a flowchart illustrating the steps involved in aggregating PII elements which comprise private information to be protected. As illustrated in FIG. 4.0, aggregating PII elements (302) further comprises steps (400) whereby VPI is first computed from PII using a first key, key 1, when a PII record is created (402); and RPI is subsequently computed from VPI using a second key, key 2 (404).

Reference is now made to FIG. 4.0a which illustrates the pseudonymisation traversal in the database of interest from a particular PII element (401a) in the RPT of the database to the corresponding DII element in the VPT, and the reverse-pseudonymisation traversal from a particular DII element to the corresponding PII element. The pseudonymisation traversal commences with the associated computation on inputs of the particular PII, or subset thereof, and additionally the associated pseudonymisation key, and results in output of the corresponding VPI, which indicates the corresponding DII as the correct destination for the traversal of interest. The reverse-pseudonymisation traversal commences with the associated computation on inputs of the particular VPI and additionally the associated reverse-pseudonymisation key, and results in output of the corresponding RPI, which indicates the corresponding PII as the correct destination for the traversal of interest. The pseudonymisation and reverse-pseudonymisation traversals are therefore only possible with access to the respective pseudonymisation and reverse-pseudonymisation keys, as subject to the requisite access control. Access to the database contents without access to the requisite keys would therefore yield no information pertaining to correspondence of RPT and VPT elements.
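
The two traversals of FIG. 4.0a may be illustrated with the following minimal sketch, in which the RPT and VPT are modelled as in-memory dictionaries. The table layout, function names, keys and record contents are assumptions made for illustration and do not appear in the specification.

```python
import hashlib
import hmac


def zk(key: bytes, message: bytes) -> str:
    """Keyed one-way function (HMAC-SHA256 assumed), hex-encoded for use as a table key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


# Hypothetical keys held by the service provider.
KEY1 = b"example-pseudonymisation-key"
KEY2 = b"example-reverse-pseudonymisation-key"

rpt = {}  # real person table: RPI -> PII
vpt = {}  # virtual person table: VPI -> DII


def create_records(pii: str, dii: dict) -> None:
    """Compute VPI from PII (key 1), then RPI from VPI (key 2), and store both rows."""
    vpi = zk(KEY1, pii.encode())
    rpi = zk(KEY2, vpi.encode())
    rpt[rpi] = pii
    vpt[vpi] = dii


def pii_to_dii(pii: str, key1: bytes) -> dict:
    """Pseudonymisation traversal: PII -> VPI -> DII row in the VPT."""
    return vpt[zk(key1, pii.encode())]


def dii_to_pii(vpi: str, key2: bytes) -> str:
    """Reverse-pseudonymisation traversal: VPI -> RPI -> PII row in the RPT."""
    return rpt[zk(key2, vpi.encode())]


create_records("Jane Doe|1980-01-01", {"diagnosis": "example"})
```

Neither table stores the identifier of the other, so a reader holding only the table contents cannot link a VPT row to an RPT row; in this sketch a lookup with a wrong key simply fails to find a row.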

Reference is now made to FIG. 5.0 and FIG. 5.0a, which are flowcharts illustrating the steps involved in generating pseudonymisation keys. As illustrated in FIG. 5.0, in generating pseudonymisation keys (500), a unique group identifier (GI) is specified for each group aggregation of multiple subjects (502). Thereafter, a virtual group key (VGK) is computed as pseudonymisation key at each level of aggregation (504), followed by computation of the VPI specific to a particular subject of interest (506). Subsequently, a real group key (RGK) is computed as reverse-pseudonymisation key at each equivalent level of aggregation (508). Finally, the RPI specific to the same subject of interest is computed (510). As illustrated in FIG. 5.0a, each step of the key generation process corresponds to a particular level of the hierarchy in the group of interest. Undertaking a particular key generation computation requires specification of at least one VGK for each level of the group hierarchy, and correspondingly at least one RGK for the corresponding level of hierarchy for the reverse-pseudonymisation computation. A particular VGK at one particular level of the hierarchy is obtained as the output of a ZK computation, with inputs of the VGK of the preceding level and additionally the GI of the particular level. In FIG. 5.0a, VGK computations at the first level of the hierarchy require inputs of the root pseudonymisation key (502) and one or more GI valuations (502a, 502b), and result in outputs of the corresponding VGK valuations (504a, 504b). Equivalently, the VGK computations at the last level of the hierarchy require inputs of the penultimate pseudonymisation key (603) and one or more GI valuations (506a, 506b), and result in outputs of the corresponding VGK valuations.
The VPI of a particular user is then obtained as the output of the last ZK computation in the proposed sequence of equivalent computations, with input of the particular VGK obtained in the last step of the computation sequence and additionally the PII of the particular user. Correspondingly, a particular RGK at one particular level of the hierarchy is obtained as the output of the equivalent ZK computation, with inputs of the RGK of the preceding level and additionally the GI of the particular level. The RPI of a particular user is then equivalently obtained as the output of the last ZK computation in the proposed sequence of equivalent computations, with input of the particular RGK obtained in the last step of the computation sequence and additionally the VPI of the particular user. The proposed method of computation allows for the aggregation of multiple groups within the same database, with the possibility of a particular individual human subject appearing as multiple PII elements, each uniquely characterized by a particular GI valuation, with multiple corresponding DII elements.
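
The chained key generation of FIG. 5.0a may be sketched as follows, under the assumption that each ZK computation is an HMAC-SHA256 keyed by the preceding level's group key and applied to the GI of the current level. The function names, root keys and GI values are hypothetical.

```python
import hashlib
import hmac
from typing import Sequence


def zk(key: bytes, message: bytes) -> bytes:
    """Keyed one-way function; HMAC-SHA256 is an assumed choice."""
    return hmac.new(key, message, hashlib.sha256).digest()


def derive_group_key(root_key: bytes, group_ids: Sequence[bytes]) -> bytes:
    """Walk the hierarchy: the key at each level is zk(key of preceding level, GI of this level)."""
    key = root_key
    for gi in group_ids:
        key = zk(key, gi)
    return key


def compute_vpi(root_vk: bytes, group_ids: Sequence[bytes], pii: bytes) -> bytes:
    """Last step of the VGK chain: zk(last-level VGK, PII)."""
    return zk(derive_group_key(root_vk, group_ids), pii)


def compute_rpi(root_rk: bytes, group_ids: Sequence[bytes], vpi: bytes) -> bytes:
    """Last step of the RGK chain: zk(last-level RGK, VPI)."""
    return zk(derive_group_key(root_rk, group_ids), vpi)


# Hypothetical root keys and two-level group paths.
ROOT_VK = b"root-pseudonymisation-key"
ROOT_RK = b"root-reverse-pseudonymisation-key"
path_a = [b"hospital-1", b"ward-3"]
path_b = [b"hospital-2", b"ward-3"]
pii = b"Jane Doe|1980-01-01"

vpi_a = compute_vpi(ROOT_VK, path_a, pii)
vpi_b = compute_vpi(ROOT_VK, path_b, pii)
rpi_a = compute_rpi(ROOT_RK, path_a, vpi_a)
```

Because the GI of every level enters the chain, the same subject under two different group paths receives two unrelated VPI values, which is what permits multiple groups to share one database.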

Generating pseudonymisation keys further comprises protecting the HMAC keys, whereby the record of interest, as specified via a particular identifier, is accessed by the service requestor (100a), and the service provider (101) assesses whether the service requestor (100a) possesses authority to request the present HMAC computation of interest. Subsequently, HMAC computations internal to the HSM are undertaken by the service provider (101), with a first identifier requested as input and a second identifier produced as output, resulting in association of a first record to a second record.

Computing VPI from PII and storing said VPI in the VPT when a user with authorization creates a DII record further comprises associating a particular PII record of the subject of interest to a pseudonymous DII record, by the service requestor (100a) accessing the PII record of interest. Thereafter, pseudonymisation computations are undertaken by the service provider (101) with access to the first HMAC keys at every level of aggregation; an assessment is undertaken as to whether the service requestor (100a) possesses authority to request pseudonymisation; and the PII record of interest is associated to the DII record on a positive outcome of the assessment. Computing VPI from PII and storing said VPI in the VPT when a user with authorization creates a DII record further comprises associating a particular pseudonymous DII record of interest to a PII record, by the service requestor (100a) accessing the DII record of interest. Reverse-pseudonymisation computations are undertaken by the service provider (101) through access to the second HMAC keys at every level of aggregation, with the service provider (101) undertaking an assessment as to whether the service requestor (100a) possesses authority to request reverse pseudonymisation; and the DII record of interest is associated to the PII record on a positive outcome of the assessment.
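
The authority assessment that precedes each computation may be sketched as follows. The class name, the permission labels, the requestor names and the dictionary-based access-control list are assumptions for illustration; in the described system the keys would reside inside the HSM of the service provider rather than in program memory.

```python
import hashlib
import hmac


class ServiceProvider:
    """Holds both HMAC keys and releases a result only to an authorised requestor."""

    def __init__(self, key1: bytes, key2: bytes, acl: dict):
        self._key1 = key1  # first (pseudonymisation) key
        self._key2 = key2  # second (reverse-pseudonymisation) key
        self._acl = acl    # requestor id -> set of permitted operations

    def _check(self, requestor: str, operation: str) -> None:
        if operation not in self._acl.get(requestor, set()):
            raise PermissionError(f"{requestor} may not request {operation}")

    def pseudonymise(self, requestor: str, pii: bytes) -> bytes:
        self._check(requestor, "pseudonymise")
        return hmac.new(self._key1, pii, hashlib.sha256).digest()

    def reverse_pseudonymise(self, requestor: str, vpi: bytes) -> bytes:
        self._check(requestor, "reverse")
        return hmac.new(self._key2, vpi, hashlib.sha256).digest()


# Hypothetical requestors: the clerk may only pseudonymise, the auditor may do both.
provider = ServiceProvider(
    b"example-key-1",
    b"example-key-2",
    {"clerk": {"pseudonymise"}, "auditor": {"pseudonymise", "reverse"}},
)
vpi = provider.pseudonymise("clerk", b"Jane Doe|1980-01-01")
rpi = provider.reverse_pseudonymise("auditor", vpi)
```

Granting the "reverse" permission to fewer requestors than "pseudonymise" is one simple way of expressing the greater stringency applied to reverse-pseudonymisation.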

Reference is now made to FIG. 6.0 which illustrates the pseudonymisation traversal in the database of interest from a particular PII element (600a) in a multi-level group hierarchy, with a particular group at a particular level designated by a particular GI specification (601), in the RPT (602) of the database, to the corresponding DII element (603) in the VPT (604), as similarly designated by the same GI specification (601). FIG. 6.0 also illustrates the reverse-pseudonymisation traversal from a particular DII element (605) with a particular GI specification to the corresponding PII element, as designated by the same GI specification. The pseudonymisation traversal commences with computation of the associated pseudonymisation key via the sequence of FIG. 5.0a. The pseudonymisation computation then proceeds on inputs of the particular PII, or subset thereof, and additionally the pseudonymisation key as previously computed, and results in output of the corresponding VPI, which indicates the corresponding DII as the correct destination for the traversal of interest. The reverse-pseudonymisation traversal commences with equivalent computation of the associated reverse-pseudonymisation key via the sequence of FIG. 5.0a. The reverse-pseudonymisation computation then proceeds on inputs of the particular VPI and additionally the reverse-pseudonymisation key as previously computed, and results in output of the corresponding RPI, which indicates the corresponding PII as the correct destination for the traversal of interest. The pseudonymisation and reverse-pseudonymisation traversals are therefore only possible with access to the respective computation sequences, and the pseudonymisation and reverse-pseudonymisation keys required therein, as subject to the requisite access control.

The present invention describes a multi-tenancy model comprising a real person table and a virtual person table. The real person table stores RPI and PII elements, whereas the virtual person table stores VPI elements. The multi-tenancy model allows two records of the same PII which is indexed by RPI to be created using two different group keys. The present invention enables reverse-pseudonymisation of personal identifying data while providing different KDF sequential computations for traversals from PII to DII and from DII to PII.
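
As a non-limiting illustration of the multi-tenancy model, the following sketch registers the same PII under two groups, each with its own pair of group keys, and obtains two distinct rows in each table. The group names, key values and record contents are hypothetical, and HMAC-SHA256 is an assumed choice of ZK function.

```python
import hashlib
import hmac


def zk(key: bytes, message: bytes) -> str:
    """Keyed one-way function (HMAC-SHA256 assumed), hex-encoded for use as a table key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


# Hypothetical (pseudonymisation key, reverse-pseudonymisation key) per group.
GROUP_KEYS = {
    "clinic-a": (b"vgk-clinic-a", b"rgk-clinic-a"),
    "clinic-b": (b"vgk-clinic-b", b"rgk-clinic-b"),
}

rpt = {}  # real person table: RPI -> PII
vpt = {}  # virtual person table: VPI -> DII


def register(group: str, pii: str, dii: dict) -> None:
    """Create one RPT row and one VPT row for the subject under the given group."""
    vgk, rgk = GROUP_KEYS[group]
    vpi = zk(vgk, pii.encode())
    rpi = zk(rgk, vpi.encode())
    rpt[rpi] = pii
    vpt[vpi] = dii


pii = "Jane Doe|1980-01-01"
register("clinic-a", pii, {"visit": "2018-03-01"})
register("clinic-b", pii, {"visit": "2018-09-15"})
```

After both registrations the RPT holds two rows carrying the same PII under different RPI values, and the VPT holds two unlinked DII rows, one per tenant.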

Unless the context requires otherwise or specifically stated to the contrary, integers, steps or elements of the invention recited herein as singular integers, steps or elements clearly encompass both singular and plural forms of the recited integers, steps or elements.

Throughout this specification, unless the context requires otherwise, the word “comprise”, or variations such as “comprises” or “comprising”, will be understood to imply the inclusion of a stated step or element or integer or group of steps or elements or integers, but not the exclusion of any other step or element or integer or group of steps, elements or integers. Thus, in the context of this specification, the term “comprising” is used in an inclusive sense and thus should be understood as meaning “including principally, but not necessarily solely”.