Title:
A METHOD FOR OPTIMIZING PROCESSING OF RESTRICTED-ACCESS DATA
Document Type and Number:
WIPO Patent Application WO/2013/113607
Kind Code:
A1
Abstract:
The invention provides for a computer-implemented method for processing restricted-access data of distributed data for providing data sets of the distributed data to a user (410) of an instance of a software application (408), a back-end system infrastructure for the software application comprising a centralized database (302) persistently storing non-restricted access data of the distributed data and at least one local system for persistently storing restricted-access data of the distributed data, each local system being associated with a respective first set of areas and/or entities and comprising at least one federated database (300) for providing a federated view (306) of the non-restricted access data and the restricted access data, the method comprising the steps of - routing by a routing entity a request for a data set of the distributed data from the user (410) and/or the instance of the software application (408) to the at least one local system or to the centralized database, wherein the routing is based on a matching between the first and the second set of areas and/or the first and the second set of entities, the user and/or instance being associated with a second set of areas and/or entities, wherein the requested data set is constituted from a first and a second part of data, - in case of a routing to the at least one local system, receiving the request at the federated database, retrieving at the federated database the first part of the data set from the centralized database (302) where the first part is stored as the non-restricted-access data of the distributed data, and retrieving at the federated database a second part of the data set from the at least one local system where the second part is stored as the restricted-access data of the distributed data.

Inventors:
MAIER ALBERT (DE)
SCHUETZNER JOHANNES (DE)
RECH THOMAS (DE)
SEEMANN VOLKER (DE)
Application Number:
PCT/EP2013/051310
Publication Date:
August 08, 2013
Filing Date:
January 24, 2013
Assignee:
IBM (US)
IBM DEUTSCHLAND (DE)
International Classes:
G06F17/30
Foreign References:
US20110191862A1 2011-08-04
Other References:
"IBM WebSphere, Administration Guide for Federated Systems, Version 9", 1 January 2006 (2006-01-01) - 31 December 2006 (2006-12-31), INTERNET, pages 1 - 362, XP055058352, Retrieved from the Internet [retrieved on 20130403]
RAFAE BHATTI: "Federated Access Control in Distributed Data Warehouse in Applications", IP.COM, 6 November 2008 (2008-11-06)
F. KASTNER: "Access Controls of Federated Database Environments - The taxonomy of Design Choices", COMMUNICATIONS AND MULTIMEDIA SECURITY. PROCEEDINGS OF THE IFIP TC
Attorney, Agent or Firm:
KUISMA, Sirpa (IBM-Allee 1, Ehningen, DE)
Claims:
CLAIMS

1. A computer-implemented method for processing restricted-access data of distributed data for providing data sets of the distributed data to a user (410) of an instance of a software application (408), a back-end system infrastructure for the software application comprising a centralized database (302) persistently storing non-restricted access data of the distributed data and at least one local system for persistently storing restricted-access data of the distributed data, each local system being associated with a respective first set of areas and/or entities and comprising at least one federated database (300) for providing a federated view (306) of the non-restricted access data and the restricted access data, the method comprising the steps of

- routing by a routing entity a request for a data set of the distributed data from the user (410) and/or the instance of the software application (408) to the at least one local system or to the centralized database, wherein the routing is based on a matching between the first and the second set of areas and/or the first and the second set of entities, the user and/or instance being associated with a second set of areas and/or entities, wherein the requested data set is constituted from a first and a second part of data,

- in case of a routing to the at least one local system, receiving the request at the federated database, retrieving at the federated database the first part of the data set from the centralized database (302) where the first part is stored as the non-restricted-access data of the distributed data, and retrieving at the federated database a second part of the data set from the at least one local system where the second part is stored as the restricted-access data of the distributed data.

2. The method of claim 1, wherein in case of a routing to the centralized database, only the first part of the data is retrieved at the centralized database.

3. The method of claim 2, wherein in case of the routing to the centralized database (302), the method further comprises providing at the centralized database (302) a view of only the first part to the user (410).

4. The method of claim 1, wherein in case of the routing to the at least one local system, the method further comprises providing at the federated database (300) the federated view (306) of the first and second part to the user and/or instance.

5. The method of claim 4, wherein the first part comprises first objects with first attributes (314a, 316a) and the second part comprises second objects with second attributes (314b), wherein the first objects correspond to first primary keys (312a) of first tables (308) stored persistently in the centralized database (302) and the second objects correspond to second primary keys (312b) of second tables (310) stored persistently in the local system, wherein the first tables (308) comprise a first column with the first primary keys (312a) and first successive columns with the first attributes (314a, 316a) associated with the first objects, wherein the second tables (310) comprise a second column with the second primary keys (312b) and second successive columns with the second attributes (314b) associated with the second objects (312b), wherein the first attributes (314a, 316a) are marked by indicators (318), the indicators (318) indicating if the first attributes (314a, 316a) are accessible to the user (410), wherein the generation of the federated view (306) comprises the method steps of

- generating a third column of third primary keys (312c) and third successive columns with third attributes (314c, 316c) associated with the third primary keys (312c) in third tables (306) in the federated database (300),

- determining for a given first attribute of the first attributes (314a, 316a) if the indicator (318) of said given first attribute indicates that this given first attribute is accessible to the local system,

- in case of the given first attribute being accessible to the local system, generating the third primary keys (312c) by performing a union process of the first primary keys (312a) and the second primary keys (312b) and generating the third attributes (314c, 316c) by a union process of the first (314a, 316a) and second attributes (314b).

6. The method of any one of the previous claims, wherein the federated database and the centralized database (302) are relational and/or object-oriented databases.

7. The method of any one of the previous claims, wherein the user and/or instance is associated with the second set of areas and/or second entities by an authorization scheme.

8. The method of any one of the previous claims, wherein the request comprises information about a current location of the user, wherein the routing is only performed to the at least one local system in case of a matching of the location with the first set of areas.

9. The computer-implemented method of any of the previous claims 4-8, wherein the system infrastructure is a three-tier architecture (400), the three-tier architecture (400) comprising:

- a presentation tier (402), the instance of the software application (408) being part of the presentation tier (402), wherein the presentation tier (402) is adapted to visualize the federated view (306),

- a middle tier (404), the middle tier (404) comprising at least one local application server (426) dedicated to the at least one local system, at least one global application server (412) dedicated to the centralised database and an application server assignment manager, wherein the application server assignment manager is the routing entity,

- a data tier (406), the data tier (406) comprising the centralized database (302) and the at least one federated database (300).

10. The computer-implemented method of any of the previous claims 4 to 8, wherein the system infrastructure is a two-tier architecture, the two-tier architecture comprising:

- a presentation tier (402), the instance of the software application (408) being part of the presentation tier (402), wherein the presentation tier (402) is adapted to visualize the federated view (306), wherein the instance of the software application (408) is the routing entity,

- a data tier (406), the data tier (406) comprising the centralized database (302) and the at least one federated database (300).

11. A system for processing restricted-access data of distributed data for providing data sets of the distributed data to a user (410) of an instance of a software application (408), wherein a back-end system infrastructure for the software application comprises a centralized database (302) persistently storing non-restricted access data of the distributed data, wherein the system comprises at least one local system for persistently storing restricted-access data of the distributed data, the at least one local system being comprised in the back-end system infrastructure, each local system being associated with a respective first set of areas and/or entities and comprising at least one federated database (300) adapted for providing a federated view of the non-restricted access data and the restricted access data, wherein

- a routing entity is adapted for routing a request for a data set of the distributed data from the user (410) and/or the instance of the software application (408) to the at least one local system or to the centralized database, wherein the routing is based on a matching between the first and the second set of areas and/or the first and the second set of entities, the user and/or instance being associated with a second set of areas and/or entities, wherein the requested data set is constituted from a first and a second part of data,

- the federated database is operable for receiving the request in case of a routing to the at least one local system, retrieving the first part of the data set from the centralised database (302) where the first part is stored as the non-restricted-access data of the distributed data, and retrieving at the federated database a second part of the data set from the at least one local system where the second part is stored as the restricted-access data of the distributed data,

- the centralized database is operable for retrieving only the first part of the data in case of a routing to the centralized database.

12. A computer-implemented method for processing restricted-access data of distributed data for providing data sets of the distributed data to a user (410) of an instance of a software application (408), a back-end system infrastructure for the software application comprising at least one federated database (300) persistently storing non-restricted access data of the distributed data and at least one local system comprising at least one local component database for persistently storing restricted-access data of the distributed data, each local system being associated with a respective first set of areas and/or entities, wherein the federated database (300) is further adapted for providing a federated view of the non-restricted access data and the restricted access data, the method comprising

- routing a request for a data set of the distributed data from the user (410) and/or the instance of the software application (408) to the at least one federated database (300), wherein the requested data set is constituted from a first and a second part of data,

- receiving the request at the federated database, retrieving at the federated database the first part of the data set stored in the federated database as the non-restricted-access data of the distributed data,

- based on a matching between the first and the second set of areas and/or the first and the second set of entities, retrieving at the federated database via a secure network a second part of the data set from the at least one local system where the second part is stored as the restricted-access data of the distributed data.

13. A computer program product comprising computer executable instructions to perform the method steps as claimed in any of the previous method claims.

14. A system for processing restricted-access data of distributed data for providing data sets of the distributed data to a user (410) of an instance of a software application (408), wherein the system comprises at least one federated database (300) being part of a back-end system infrastructure for the software application, wherein the at least one federated database (300) is adapted for persistently storing non-restricted access data of the distributed data and for providing a federated view of the non-restricted access data and restricted access data, wherein the back-end system infrastructure further comprises at least one local system comprising at least one local component database for persistently storing the restricted-access data of the distributed data, each local system being associated with a respective first set of areas and/or entities, wherein the software application (408) is adapted for routing a request for a data set of the distributed data from the user (410) and/or the instance of the software application (408) to the at least one federated database (300), wherein the requested data set is constituted from a first and a second part of data,

the federated database is operable for receiving the request, retrieving at the federated database the first part of the data set stored in the federated database as the non-restricted-access data of the distributed data, the federated database is operable for retrieving at the federated database via a secure network a second part of the data set from the at least one local system where the second part is stored as the restricted-access data of the distributed data, wherein the retrieving is based on a matching between the first and the second set of areas and/or the first and the second set of entities.

Description:
DESCRIPTION

A METHOD FOR OPTIMIZING PROCESSING OF RESTRICTED-ACCESS DATA

Field of the invention

The present invention relates to the field of processing data in a global software application in which the data to be processed are stored in a distributed manner across multiple local component databases.

Background art

Many countries have data privacy laws that forbid processing and persistently storing certain kinds of data outside the respective country. For example, German law only allows a customer's contact data to be transferred outside of the EU/EEA region if the respective customer has given explicit consent. This constitutes a major hurdle for the implementation of a "Globally Integrated Enterprise" (GIE) strategy, which is characterized by an integration of regional business processes into global processes and presupposes that globally distributed data can be processed without any restrictions.

US 2011/0191862 A1 discloses a system and method for restricting access to requested data based on user location. The method comprises receiving a data request and determining origin location information of the data request from a source providing information having accuracy to the predetermined standard. The method further comprises retrieving one or more policies associated with the requested data, comparing the origin location information with the policies, and dynamically adjusting access restrictions to the requested data based on the comparison.

The publication "Federated Access Control in Distributed Data Warehouse in Applications" by Rafae Bhatti et al., published on IP.com on 6 November 2008, discloses a method of automatically routing a query according to the access privileges of a user by a middleware in the federated database system. The publication describes an access control scheme that applies the semantics of the Label-Based Access Control (LBAC) model to a federated database using a novel security tacking scheme that allows efficient maintenance of LBAC policies. The approach is based on the idea of enforcing the access control without the database having to maintain a separate account for each individual user in the federated system. The publication discloses a middleware design that transparently rewrites a data query according to the access privileges of the user at the application level, such that the query, when executed on the remote database, returns only the data consistent with the authorization of the user.

The publication "Access Controls of Federated Database Environments - The taxonomy of Design Choices" by F. Kastner et al., published in "Communications and Multimedia Security. Proceedings of the IFIP TC", discloses a taxonomy of the major design choices concerning access control and database federations, the taxonomy being organized in the categories granularity, authorization, and access control.

Summary of the invention

It is an objective of embodiments of the invention to provide for a computer-implemented method for processing restricted-access data of distributed data for providing data sets of the distributed data to a user of an instance of a software application. This objective is achieved by the subject matter of the independent claims; advantageous embodiments are described in the dependent claims.

In a first aspect, the invention relates to a computer-implemented method for processing restricted-access data of distributed data for providing data sets of the distributed data to a user of an instance of a software application, a back-end system infrastructure for the software application comprising a centralized database persistently storing non-restricted access data of the distributed data and at least one local system for persistently storing restricted-access data of the distributed data, each local system being associated with a respective first set of areas and/or entities and comprising at least one federated database for providing a federated view of the non-restricted access data and the restricted access data, the method comprising the steps of

- routing by a routing entity a request for a data set of the distributed data from the user and/or the instance of the software application to the at least one local system or to the centralized database, wherein the routing is based on a matching between the first and the second set of areas and/or the first and the second set of entities, the user and/or instance being associated with a second set of areas and/or entities, wherein the requested data set is constituted from a first and a second part of data,

- in case of a routing to the at least one local system, receiving the request at the federated database, retrieving at the federated database the first part of the data set from the centralized database where the first part is stored as the non-restricted-access data of the distributed data, and retrieving at the federated database a second part of the data set from the at least one local system where the second part is stored as the restricted-access data of the distributed data.

Said embodiments may be advantageous, because privacy-protected personal information data can be integrated in a common processing by a local instance of a globally implemented software application. At the same time, this data handling is compliant with local data privacy laws. Privacy-protected personal information data remains persistently stored in local systems. Users are prevented from storing such privacy-protected personal information data in a centralized database and from thereby violating data privacy laws. The invention is also applicable to restricted-access data other than privacy-protected personal information to which access is restricted based on location.
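The routing and retrieval steps of the first aspect can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class and function names (`CentralizedDatabase`, `LocalSystem`, `route`) are hypothetical, in-memory dictionaries stand in for real databases, and "matching" is assumed here to mean that the user's set of areas and the local system's set of areas share at least one element.

```python
from dataclasses import dataclass, field


@dataclass
class CentralizedDatabase:
    """Stores only the non-restricted (first) part of each data set."""
    rows: dict = field(default_factory=dict)

    def fetch(self, key):
        return dict(self.rows.get(key, {}))


@dataclass
class LocalSystem:
    """Stores the restricted (second) part and plays the federated-database role."""
    areas: frozenset                      # first set of areas (assumed representation)
    central: CentralizedDatabase
    restricted_rows: dict = field(default_factory=dict)

    def fetch(self, key):
        # First part comes from the centralized database,
        # second part from the local restricted store.
        first_part = self.central.fetch(key)
        second_part = self.restricted_rows.get(key, {})
        return {**first_part, **second_part}


def route(user_areas, key, local_systems, central):
    """Route to a local system whose areas match the user's areas,
    otherwise to the centralized database (assumed matching rule)."""
    for system in local_systems:
        if system.areas & user_areas:
            return system.fetch(key)
    return central.fetch(key)


central = CentralizedDatabase({"C1": {"segment": "retail"}})
local_de = LocalSystem(frozenset({"DE"}), central, {"C1": {"email": "a@example.com"}})

print(route({"DE"}, "C1", [local_de], central))  # both parts of the data set
print(route({"US"}, "C1", [local_de], central))  # first part only
```

In this sketch a request from a user associated with area "DE" receives both parts of the data set, whereas a request from a user associated only with "US" is answered by the centralized database and never touches the restricted store.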

It should be noted that the at least one local system may additionally comprise at least one local component database. The restricted-access data of the distributed data may be persistently stored in a distributed manner across the at least one federated database and/or the at least one local component database.

For visualizing restricted-access data and non-restricted-access data in one common federated view, the front end of the global software application does not need to be modified. One of the key benefits of the generation of a federated view integrating non-restricted-access data stored in a centralized database and restricted-access data persistently stored in the federated database is that there is no impact for the individual user at the software application level, especially not in the presentation layer. Thus, the individual user does not need any additional training. This approach also allows a software provider or system integrator who wants to offer a solution that supports locally-restricted-access data to implement this solution in a cost-effective way by adapting an existing solution at the database layer only, without touching any code in the presentation layer.

Ensuring that restricted-access data is prevented from being stored in a centralized database has the additional advantage that a double storage of data is excluded, thus increasing the degree of consolidation of data. Total cost of ownership may be reduced. Also, a generation of new user IDs is not necessary: the user will get a view of data sourced from several databases without the need to create a new user ID for each of the databases. Furthermore, no additional APIs need to be implemented. Time to value will also be significantly reduced.

It should be noted that the present invention does not use an authorization scheme which is applied at an object level at the individual database. On the contrary, data storage at an object level can be performed irrespective of any location-based user assignments. Thus, a modification of a data storage scheme at object level is not necessary. Further, a modification of software at a business application level is also not necessary.

In this context, locally-restricted-access data of distributed data is data which, according to country-specific data privacy laws, is only accessible if there is official allowance by the country-specific government or if the owner of the privacy-protected data has given explicit permission for it to be disclosed to a special group of members of another country, another legal entity or another organization. As an example, locally-restricted-access data might be privacy-protected personal information data, such as email address, date of birth or local address. Locally-restricted-access data can also mean certain sensitive data that, e.g., an organization or a company does not want to be processed outside its own organization or outside a certain region for reasons other than having to obey the law. As mentioned above, the invention is also applicable to restricted-access data other than privacy-protected personal information to which access is restricted based on location and due to restrictions dictated by law.

Further, distributed data is data which may be persistently stored in multiple local systems, each localized at different locations such as different countries, different legal entities or different organizations. Each local system may comprise at least one federated database and optionally one or more local component databases. The multiple local systems are independent from each other. A federated database system is a type of meta-database management system (DBMS) which comprises at least one federated database and which may transparently integrate multiple local component databases into the federated database. The local component databases may be interconnected via a computer network and may be geographically decentralized. Since the local component systems remain autonomous, a federated database system is an alternative to the (sometimes daunting) task of merging together several disparate databases. A federated database may thus be a fully integrated, logical composite of all constituent local component databases in a federated database system.

In accordance with an embodiment of the invention, in case of a routing to the centralized database, only the first part of the data is retrieved at the centralized database. This ensures that the access-restricted data remain at the respective local system. In case of the routing to the centralized database, in accordance with an embodiment the method further comprises providing at the centralized database a view of only this first part to the user. Thus, privacy protection based on a geographical location is ensured. As soon as the request is directed to the centralized database, the user will not be able to access the restricted-access data, which remain at the respective local system.

In accordance with an embodiment of the invention, in case of the routing to the at least one local system, the method further comprises providing at the federated database the federated view of the first and second part to the user and/or instance. In this case, the user is provided with both the requested non-restricted access data and the requested restricted access data.

In accordance with an embodiment of the invention, the first part comprises first objects with first attributes and the second part comprises second objects with second attributes, wherein the first objects correspond to first primary keys of first tables stored persistently in the centralized database and the second objects correspond to second primary keys of second tables stored persistently in the local system, wherein the first tables comprise a first column with the first primary keys and first successive columns with the first attributes associated with the first objects, wherein the second tables comprise a second column with the second primary keys and second successive columns with the second attributes associated with the second objects, wherein the first attributes are marked by indicators, the indicators indicating if the first attributes are accessible to the user. According to the embodiment, the generation of the federated view comprises the method steps of

- generating a third column of third primary keys and third successive columns with third attributes associated with the third primary keys in third tables in the federated database,

- determining for a given first attribute of the first attributes if the indicator of said given first attribute indicates that this given first attribute is accessible to the local system,

- in case of the given first attribute being accessible to the local system, generating the third primary keys by performing a union process of the first primary keys and the second primary keys and generating the third attributes by a union process of the first and second attributes.

Said embodiment may be advantageous, since the indicator in a global table of a centralized database indicates that the data for each object is to be obtained from a local system in order to generate the federated view. Such an indicator may be a simple Y/N flag (meaning YES/NO) associated with an attribute of an object. For example, by means of the indicator Y associated with attributes of objects in a global table of the centralized database system, the local system gets the information that those attributes are available and in this way accessible in the local table of the local database component. By means of the indicator N associated with attributes of objects in a global table of the centralized database system, in contrast, the local system gets the information that those attributes are not available and in this way non-accessible in the local table of the centralized database. The Y/N setting of the indicators might be done, e.g., in compliance with country-specific data privacy laws.
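The generation of the federated view can be sketched in Python as follows. The sketch makes several assumptions that are not stated in the text: tables are modelled as dictionaries keyed by primary key, the indicator is kept per attribute of the global table, and the flag "Y" is read as "this global attribute is accessible to the local system", following the wording of the method steps above. The function name `build_federated_view` and the sample data are hypothetical.

```python
def build_federated_view(global_table, indicators, local_table):
    """Build a federated view from a global and a local table.

    global_table: {primary_key: {attribute: value}}  (centralized database)
    indicators:   {attribute: "Y" or "N"}            (flag per global attribute)
    local_table:  {primary_key: {attribute: value}}  (local system)
    """
    # Only global attributes flagged "Y" take part in the view (assumed reading).
    visible = [name for name, flag in indicators.items() if flag == "Y"]

    # Third primary keys: union of the first and second primary keys.
    keys = sorted(set(global_table) | set(local_table))

    # Third attributes: union of the visible global and the local attributes.
    local_attrs = sorted({a for row in local_table.values() for a in row})
    columns = visible + [a for a in local_attrs if a not in visible]

    view = {}
    for key in keys:
        merged = {a: global_table.get(key, {}).get(a) for a in visible}
        merged.update(local_table.get(key, {}))
        # Attributes with no value for this key are filled with None.
        view[key] = {a: merged.get(a) for a in columns}
    return view


global_table = {1: {"segment": "retail", "score": 7}, 2: {"segment": "b2b", "score": 3}}
indicators = {"segment": "Y", "score": "N"}
local_table = {1: {"email": "a@example.com"}, 3: {"email": "c@example.com"}}

for key, row in build_federated_view(global_table, indicators, local_table).items():
    print(key, row)
```

With the sample data, the view contains the keys 1, 2 and 3 with the columns "segment" and "email"; the attribute "score" is left out because its indicator is "N". A relational implementation would typically express the same idea as a view over a join or union of the two tables.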

In this context, an object could be, e.g., a business object like a customer ID with which attributes such as privacy data (name, address, e-mail address, etc.) have been associated. It should be noted that an authorization concept defining which of the users A, B or C are allowed to see locally-restricted data may be realized at application level, e.g. by using "external authorization information", such as checking the users' authorization via an administrative table that contains authorization assignments.

By marking on attribute level by means of an indicator whether access to single attributes of an object is allowed or denied, a high degree of differentiation concerning accessibility to locally-restricted-access data, stored persistently on a local autonomous component database, is achieved. Although the federated view is generated by a highly differentiated retrieval according to predefined access rights, the generated federated view will have a homogeneous and integrative appearance for the user in a respective instance of the software application.

In accordance with an embodiment of the invention, the federated database and the centralized database are relational and/or object-oriented databases. Thus, there is no restriction concerning the structure of the involved databases. Relational as well as object-oriented databases may be used for realizing the concept of database federations. Also a combination of different kinds of databases is possible: data from a relational database as well as data from an object-oriented database can be combined as data pools for generating a homogeneous and integrative federated view. For the user there will be no difference in the visualization of the federated view.

In accordance with an embodiment of the invention, the user and/or instance is associated with the second set of areas and/or second entities by an authorization scheme. For example, the authorization scheme may be employed at a tier of the software application, wherein the authorization scheme determines a presence of an authorization for the user to access the at least one local system, wherein in the case the authorization is present, the request is routed to the local system.

Said embodiments may be advantageous, because at the tier of the software application there will be a control mechanism by authorizing a user, preferably a special user group, to access locally-restricted-access data. This constitutes a simple control mechanism based on a pre-defined authorization concept. This authorization scheme is applicable to the user as well as to the respective local instance of the software application. Only in case the user and the instance of the software application are allowed to access locally-restricted-access data, the data request, initiated by the user in the respective instance of the software application, will be routed to the respective local system in accordance with a preconfigured location based relationship between the user and the at least one local system.
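An authorization check of this kind could, for instance, be backed by administrative tables that hold the authorization assignments. The following Python sketch is illustrative only: the table names, user names, instance names and system identifiers are hypothetical, and falling back to the centralized database when an authorization is missing is an assumption made here.

```python
# Hypothetical administrative tables (names and contents are illustrative).
USER_AUTHORIZATIONS = {"alice": {"local-de"}, "bob": set()}
INSTANCE_AUTHORIZATIONS = {"app-frankfurt": {"local-de"}, "app-boston": set()}


def select_target(user, instance, local_system):
    """Route to the local system only if both the user and the
    application instance are authorized for it; otherwise fall back
    to the centralized database (assumed fallback)."""
    user_ok = local_system in USER_AUTHORIZATIONS.get(user, set())
    instance_ok = local_system in INSTANCE_AUTHORIZATIONS.get(instance, set())
    return local_system if user_ok and instance_ok else "central"


print(select_target("alice", "app-frankfurt", "local-de"))
print(select_target("alice", "app-boston", "local-de"))
```

Keeping the assignments in tables outside the data stores matches the idea that neither the data storage scheme at object level nor the business application needs to be modified.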

According to some embodiments, the request comprises information about a current location of the user, wherein the routing is only performed to the at least one local system in case of a matching of the location with the first set of areas. Preferably, a dynamic checking of the current location of the user will be performed by the routing entity. For example in case the user is employing a mobile phone application in order to send the request, i.e. in case a mobile phone application provides the instance of the software application, it can be ensured that access to the access restricted data of a given local system is only granted in case of a respective location matching. If the user has moved to a different location not matching the areas assigned to this local system, the access will be automatically denied.
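The dynamic location check can be sketched as a per-request test. In this hypothetical Python example, the request is a dictionary carrying the user's current location, `LOCAL_SYSTEM_AREAS` stands in for the first set of areas of each local system, and a denied request is represented by `None`; all of these names and representations are assumptions.

```python
# Hypothetical mapping of local systems to their first set of areas.
LOCAL_SYSTEM_AREAS = {"local-de": {"DE", "AT"}, "local-fr": {"FR"}}


def route_request(request, local_system):
    """Grant access to the local system only while the current
    location in the request matches one of the system's areas."""
    areas = LOCAL_SYSTEM_AREAS.get(local_system, set())
    if request.get("location") in areas:
        return local_system
    return None  # access denied


# The same user sends two requests from different locations.
print(route_request({"user": "alice", "location": "DE"}, "local-de"))
print(route_request({"user": "alice", "location": "US"}, "local-de"))
```

Because the check is evaluated for every request, a user who moves out of the matching areas loses access without any change to the stored authorizations.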

According to some embodiments, the system infrastructure is a three-tier architecture, the three-tier architecture comprising:

- a presentation tier, the instance of the software application being part of the presentation tier, wherein the presentation tier is adapted to visualize the federated view,

- a middle tier, the middle tier comprising at least one local application server dedicated to the at least one local system, at least one global application server dedicated to the centralised database and an application server assignment manager, wherein the application server assignment manager is the routing entity,

- a data tier, the data tier comprising the centralized database and the at least one federated database.

Within the context of this three-tier architecture, a preconfigured relationship between the user and the at least one local application server means that at the level of the application server there may already exist a predefined configuration of how the request of a specific user, preferably a specific user group, will be routed to the respective local systems.

According to some embodiments, the system infrastructure is a two-tier architecture, the two-tier architecture comprising:

- a presentation tier, the instance of the software application being part of the presentation tier, wherein the presentation tier is adapted to visualize the federated view, wherein the instance of the software application is the routing entity,

- a data tier, the data tier comprising the centralized database and the at least one federated database.

Said embodiments of the invention may be advantageous, as they are applicable to diverse types of infrastructure architecture. Embodiments of the invention may be applicable not only to the 3-tier architecture that is described in the preferred embodiment, but also to the 2-tier architecture. Vendors of applications of both types could therefore use the invention, as could system integrators working with applications of both types.

In a further aspect, the invention relates to a system for processing restricted-access data of distributed data for providing data sets of the distributed data to a user of an instance of a software application, wherein a back-end system infrastructure for the software application comprises a centralized database persistently storing non-restricted access data of the distributed data, wherein the system comprises at least one local system for persistently storing restricted-access data of the distributed data, the at least one local system being comprised in the back-end system infrastructure, each local system being associated with a respective first set of areas and/or entities and comprising at least one federated database adapted for providing a federated view of the non-restricted access data and the restricted access data, wherein

- a routing entity is adapted for routing a request for a data set of the distributed data from the user and/or the instance of the software application to the at least one local system or to the centralized database, wherein the routing is based on a matching between the first and the second set of areas and/or the first and the second set of entities, the user and/or instance being associated with a second set of areas and/or entities, wherein the requested data set is constituted from a first and a second part of data,

- the federated database is operable for receiving the request in case of a routing to the at least one local system, retrieving the first part of the data set from the centralised database where the first part is stored as the non-restricted-access data of the distributed data, and retrieving at the federated database a second part of the data set from the at least one local system where the second part is stored as the restricted-access data of the distributed data,

- the centralized database is operable for retrieving only the first part of the data in case of a routing to the centralized database.

In a second aspect, the invention relates to a computer-implemented method for processing restricted-access data of distributed data for providing data sets of the distributed data to a user of an instance of a software application, a back-end system infrastructure for the software application comprising at least one federated database persistently storing non-restricted access data of the distributed data and at least one local system comprising at least one local component database for persistently storing restricted-access data of the distributed data, each local system being associated with a respective first set of areas and/or entities, wherein the federated database is further adapted for providing a federated view of the non-restricted access data and the restricted access data, the method comprising

- routing a request for a data set of the distributed data from the user and/or the instance of the software application to the at least one federated database, wherein the requested data set is constituted from a first and a second part of data,

- receiving the request at the federated database, retrieving at the federated database the first part of the data set stored in the federated database as the non-restricted-access data of the distributed data,

- based on a matching between the first and the second set of areas and/or the first and the second set of entities, retrieving at the federated database via a secure network a second part of the data set from the at least one local system where the second part is stored as the restricted-access data of the distributed data.

Said embodiments may be advantageous, as the generation of a federated view can now be performed on a centralized database, thereby increasing the degree of centralization of the system infrastructure while the restricted-access data remains persistently stored in local systems. For example, depending on the respective country-specific laws, a violation of country-specific data privacy laws could thus be prevented. The centralization of the generation of the federated view is made feasible by transferring restricted-access data from remote tables in local systems via a secure network to a centralized federated view in a centralized database. The degree of complexity of the system infrastructure will be reduced, because it is sufficient to have only one federated system as a global database management system. All applications of the overall software system will run against this centralized federated system.

In a further aspect, the invention relates to a computer program product comprising computer executable instructions to perform the method steps as described above.

In a further aspect, the invention relates to a system for processing restricted-access data of distributed data for providing data sets of the distributed data to a user of an instance of a software application, wherein the system comprises at least one federated database being part of a back-end system infrastructure for the software application, wherein the at least one federated database is adapted for persistently storing non-restricted access data of the distributed data and for providing a federated view of the non-restricted access data and restricted access data, wherein the back-end system infrastructure further comprises at least one local system comprising at least one local component database for persistently storing the restricted-access data of the distributed data, each local system being associated with a respective first set of areas and/or entities, wherein

- the software application is adapted for routing a request for a data set of the distributed data from the user and/or the instance of the software application to the at least one federated database, wherein the requested data set is constituted from a first and a second part of data,

- the federated database is operable for receiving the request, retrieving at the federated database the first part of the data set stored in the federated database as the non-restricted-access data of the distributed data,

- the federated database is operable for retrieving at the federated database via a secure network a second part of the data set from the at least one local system where the second part is stored as the restricted-access data of the distributed data, wherein the retrieving is based on a matching between the first and the second set of areas and/or the first and the second set of entities.

As will be appreciated by one skilled in the art, the above described embodiments relating to the first aspect of the method for optimizing processing of locally-restricted-access data can also be applied in an analogous manner to the second aspect of the method for optimizing processing of locally-restricted-access data and the respective systems.

As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a method, system or computer program product. Accordingly, if not explicitly stated otherwise, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a 'module' or 'system'. Any combination of one or more computer-readable medium(s) may be utilized.

Brief description of the drawings

The above and other items, features and advantages of the invention will be better understood by reading the following more particular description of embodiments of the invention in conjunction with the Figs, wherein:

Fig. 1 is a flowchart of a method for providing restricted-access data of distributed data in a federated view,

Fig. 2 is a flowchart of a method for processing restricted-access data of distributed data with a centralized database acting as a federated database generating federated views,

Fig. 3 is a block diagram illustrating the generation of a federated view in a local system,

Fig. 4 is a block diagram showing a three-tier architecture of a system infrastructure.

Fig. 1 is a flowchart depicting the steps of the herein disclosed method. A local system comprising a database with locally-restricted-access data is provided in step 100. For example, the locally-restricted-access data may be persistently stored in at least one local federated database or in a local component database of the local system that is accessible via the at least one local federated database. Non-locally-restricted-access data is provided in a centralized database in step 102. A request for data which has been initiated by a user within an instance of a software application is received in step 104. In step 106 it is checked whether the user has been authorized to access locally-restricted-access data.

In a two tier architecture, the authorization check may already be performed at the instance of the software application as routing instance. In a three tier architecture, the authorization may be checked at an application server assignment manager as the routing instance. If the user has not been explicitly authorized to access locally-restricted-access data in step 108, the request is assigned to, for example, a global application server in step 116. The global application server is dedicated to the centralized database. After that the request is provided from the global application server to the centralized database in step 118. The request is processed at the centralized database in step 120, with the processing being restricted to data stored in the centralized database. If the user has been authorized in step 110 to access locally-restricted-access data, step 124 is carried out.

If the local instance of the software application has been authorized to access locally-restricted-access data in step 122, the request is assigned in step 124 to at least one local system to which a local application server is dedicated. The local application server is dedicated to a respective local federated database. Thus, the request is routed to the local federated database in step 126. Then the request is processed at the local federated database in step 128, where the processing is performed using data stored persistently in at least one local system and a centralized database. In step 130 a federated view of the processed data is provided by the local federated database.
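The routing decision of Fig. 1 can be summarized in a few lines. The following Python sketch is a simplification under stated assumptions: the function name `route_fig1` and the returned dictionary layout are hypothetical, and the sketch assumes that a request falls back to the global path whenever either the user or the instance lacks authorization.

```python
def route_fig1(user_authorized: bool, instance_authorized: bool) -> dict:
    """Return the application server, database and data sources that
    handle a request, following the two branches of Fig. 1."""
    if user_authorized and instance_authorized:
        # Steps 124 to 130: local path, federated view over both stores.
        return {
            "server": "local application server",
            "database": "local federated database",
            "sources": ["local system", "centralized database"],
        }
    # Steps 116 to 120: global path, processing restricted to central data.
    return {
        "server": "global application server",
        "database": "centralized database",
        "sources": ["centralized database"],
    }


print(route_fig1(True, True)["database"])
print(route_fig1(False, True)["database"])
```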

Fig. 2 is a flowchart illustrating the workflow of a request for data in the case the centralized database acts as a federated database providing federated views on data stored in a centralized database and in at least one local system. It has to be noted that the approach illustrated in Fig. 2 should only be employed in case any local systems storing access-restricted data can be accessed via secure network connections. For example, encrypted channels using end-to-end encryption may be used (e.g. using https). Additionally or alternatively, the data may be encrypted using symmetric or asymmetric cryptography.

In Fig. 2, in step 100, a local system is provided comprising a storage for locally-restricted-access data. Further, in step 102 the centralized database is provided comprising the non-locally-restricted-access data. In step 104 a request for data is received, which for example has been initiated by a user in a local instance of the software application.

In contrast to Fig. 1, in case there is a secure network between the centralized database and the at least one local autonomous database component, the request is always assigned to at least one global application server in step 200. In this case it is assumed that the system infrastructure comprises a three-tier architecture, wherein the global application server is dedicated to the centralized database.

In accordance with a preconfigured relationship between the user, who has initiated the data request within the instance of the software application, the data request is routed in step 202 to the centralized database. The centralized database acts as a federated database providing federated views on the data stored persistently in the at least one local system and in the centralized database. To fill the columns in the federated database with attributes of objects, a check may be made in step 204 whether the user who has initiated the data request is authorized to access attributes of objects which are stored persistently in at least one remote table in the respective local system. A check of which of the users A, B or C are allowed to see locally-restricted data may be realized by checking the users' authorization via an administrative table that contains authorization assignments. The checking will be performed based on a current location of the user. The current location may be given by a fixed assignment of, for example, a user ID to a given location. Alternatively, the current spatial location of the user may be comprised in the request. In both cases, the above mentioned check is made at the centralized database.

If the user is not authorized in step 206 to access attributes of objects persistently stored in remote tables in at least one local system, these attributes will remain invisible for the user. Thus, in step 208 only attributes of the corresponding objects are retrieved from the global table in the centralized database, acting as the federated database.

In case the user is explicitly authorized in step 210 to access these attributes of the objects in the at least one local system, these are retrieved from the federated view in step 212 by transferring those attributes via a secure network for providing the federated view at the centralized database.
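The authorization check of steps 204 to 212 can be illustrated as a lookup in an administrative table followed by a choice between the global and the remote attribute value. The following Python sketch rests on assumptions: the tables `AUTHORIZATIONS` and `FIXED_LOCATION`, the function names and the sample values are hypothetical, and the secure network transfer is not modeled.

```python
# Hypothetical administrative table: user ID -> areas whose
# locally-restricted data the user is allowed to see.
AUTHORIZATIONS = {"A": {"DE"}, "B": {"US"}, "C": set()}

# Hypothetical fixed assignment of user IDs to locations.
FIXED_LOCATION = {"A": "DE", "B": "US", "C": "DE"}


def is_authorized(user_id, local_area, request_location=None):
    """Step 204: use the location carried in the request if present,
    otherwise the fixed assignment, and check the administrative table."""
    if request_location is not None:
        location = request_location
    else:
        location = FIXED_LOCATION.get(user_id)
    return location == local_area and local_area in AUTHORIZATIONS.get(user_id, set())


def visible_attribute(user_id, local_area, global_value, remote_value,
                      request_location=None):
    """Return the remote (restricted) value for authorized users (step 212)
    and the value from the global table otherwise (step 208)."""
    if is_authorized(user_id, local_area, request_location):
        return remote_value
    return global_value


print(visible_attribute("A", "DE", "", "Alice"))  # authorized user
print(visible_attribute("C", "DE", "", "Alice"))  # no authorization entry
```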

Fig. 3 is a block diagram illustrating a federated view 306 in a local federated database 300. The view was generated by retrieving locally-restricted-access data from a local table 310 of a local system 300.

Primary keys 312a (CLIENT and ADDRNUM) of a global table 308 (ADRC) of a centralized database 302 are matched with primary keys 312b (CLIENT and ADDRNUM) of local table 310 (LOCAL.ADRC). The primary keys 312a and 312b, after a generation of a set union, are the base for the primary keys 312c in the federated view 306 (ADRC) in the federated database 300. The primary keys 312a, 312b and 312c correspond to objects, to which attributes 314a and 316a, 314b, and 314c and 316c are assigned in the global table 308, the local table 310 and the federated view 306, respectively.

In the illustrated example, in Fig. 3, the attributes 316a OTHERS are accessible to all local systems of the global software application. For attributes 314, in contrast, an indicator 318, in this case PI_FLAG, is used to determine which values have to be retrieved from the local table 310 and which values have to be retrieved from the global table 308. In the federated view 306 ADRC the respective attribute 314a NAME_TEXT will only be visible as attribute 314b NAME_TEXT, in case the indicator 318 PI_FLAG has been set to Y. Otherwise the respective attribute 314c NAME_TEXT of the federated view 306 will be retrieved from the global table 308 ADRC from the column containing the attributes 314a NAME_TEXT of the respective object 312a.

In the illustrated example, in Fig. 3, the column 316c OTHERS of the federated view 306 in the federated database 300 is identical with the column 316a OTHERS in the global table 308 ADRC in the centralized database 302, because the column 316a containing the attributes OTHERS of the respective object 312a has been defined, in this case, as free for access to the local system of the global software application.

The following exemplary instructions illustrate how to set up a federated database to support locally-restricted-access data (with DB2 being a commercial relational database management system (RDBMS) of the IBM corporation):

Setup of DB2 for Linux, Unix, and Windows

The description below assumes that a new database system instance has been created and that there is a script that creates a nickname on the DB2 system for each end user table on the remote database system, which in our case is DB2 for z/OS.

1. Enable federation by setting the DBMS configuration parameter FEDERATED to YES.

2. Create a new database and connect to it.

3. Register the DRDA wrapper to enable access to remote DB2 systems: ...

4. Register the remote DB2 for z/OS system as server: ...

5. Register a user mapping to map the DB2 connect user for the local db system to the DB2 connect user for the remote db system. A user mapping is an association between a federation server authorization ID and a data source user ID and password. By default, user mappings are stored in the catalog on the federated server, but they could also be stored in an external repository, such as on an LDAP server.

6. Update data source statistics at the remote system (via DB2 RUNSTATS). This step is needed for good performance of federated queries. At nickname creation time the remote statistics are shipped to the local system and exploited by the query optimizer. The local statistics can be updated by using the SYSPROC.NNSTAT stored procedure.

7. Register nicknames: this is done by a script that accesses the DB2 for z/OS system catalog, and for each user table this script has to create an equally named nickname on the DB2 for LUW system, e.g.

CREATE NICKNAME <SCHEMA>.KNA1 FOR OS390.<SCHEMA>.DEPARTMENT (where OS390 is the name of the server pointing to the remote DB2/z database).

8. For each table that contains PI data the following steps have to be performed:

- Drop the nickname that has been created in the previous step (it has to be replaced by a federated view)

- Create a nickname for the remote table in schema "REMOTE"

- Create a corresponding local table consisting of all key columns and all PI columns. Use the same table name, but use "LOCAL" as schema name

- Create a federated view in the original schema that joins the data. For all rows that don't contain PP-PI data the row is taken from the remote table. For all rows that contain PP-PI data the remote data is joined with the local data and all the local columns are used in the result set instead of the remote ones (that have some default value, e.g. "blank").

Example: to illustrate the idea let's assume that table ADRC has 5 columns, with CLIENT and ADDRNUMBER as primary key and 3 additional columns, NAME_TEXT, OTHERS and PI_FLAG. Let's also assume we need to protect the column NAME_TEXT and PI_FLAG is set to 'Y' for those addresses that have to be protected and to 'N' otherwise. The following SQL statements would be needed in this case:

- DROP NICKNAME <SCHEMA>.ADRC

- CREATE NICKNAME REMOTE.ADRC FOR OS390.<SCHEMA>.DEPARTMENT

- CREATE TABLE LOCAL.ADRC (CLIENT VARCHAR(9), ADDRNUMBER VARCHAR(30), NAME_TEXT VARCHAR(50))

- CREATE VIEW <SCHEMA>.ADRC AS (SELECT * FROM REMOTE.ADRC WHERE PI_FLAG = 'N') UNION ALL (SELECT L.CLIENT, L.ADDRNUMBER, L.NAME_TEXT, R.OTHERS, R.PI_FLAG FROM REMOTE.ADRC R, LOCAL.ADRC L WHERE R.CLIENT = L.CLIENT AND R.ADDRNUMBER = L.ADDRNUMBER AND R.PI_FLAG = 'Y')

9. The business application running on those databases might have application-specific configuration/catalog tables that, for performance and/or functional reasons, should preferably be kept locally in both database systems and not be replaced by a nickname.
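The effect of the federated view on the ADRC example can be tried out without a DB2 installation. The following Python sketch uses the standard-library `sqlite3` module as a stand-in, which is an assumption of this example: SQLite has no nicknames or federation, so the remote and local tables are emulated as two ordinary tables named `remote_adrc` and `local_adrc`, the parentheses around the two SELECT branches are dropped because SQLite does not accept them in a compound statement, and the sample rows are invented.

```python
import sqlite3


def build_demo_database():
    """Create in-memory stand-ins for REMOTE.ADRC, LOCAL.ADRC and the view."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE remote_adrc (CLIENT TEXT, ADDRNUMBER TEXT,
                                  NAME_TEXT TEXT, OTHERS TEXT, PI_FLAG TEXT);
        CREATE TABLE local_adrc (CLIENT TEXT, ADDRNUMBER TEXT, NAME_TEXT TEXT);

        -- Same logic as the view above: unprotected rows come from the
        -- remote table, protected rows take NAME_TEXT from the local table.
        CREATE VIEW adrc AS
            SELECT * FROM remote_adrc WHERE PI_FLAG = 'N'
            UNION ALL
            SELECT L.CLIENT, L.ADDRNUMBER, L.NAME_TEXT, R.OTHERS, R.PI_FLAG
            FROM remote_adrc R, local_adrc L
            WHERE R.CLIENT = L.CLIENT
              AND R.ADDRNUMBER = L.ADDRNUMBER
              AND R.PI_FLAG = 'Y';

        -- Invented sample data: address 1 is protected, address 2 is not.
        INSERT INTO remote_adrc VALUES ('100', '1', '', 'x', 'Y');
        INSERT INTO remote_adrc VALUES ('100', '2', 'Public Co', 'y', 'N');
        INSERT INTO local_adrc VALUES ('100', '1', 'Alice');
    """)
    return con


def query_view(con):
    """Return all rows of the emulated federated view, ordered by address."""
    return con.execute("SELECT * FROM adrc ORDER BY ADDRNUMBER").fetchall()


for row in query_view(build_demo_database()):
    print(row)
```

The protected row carries a blank NAME_TEXT in the emulated remote table, so the name only appears through the view, where it is taken from the local table.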

Fig. 4 is a block diagram showing the system infrastructure in the form of a three-tier architecture. The three-tier architecture 400 comprises a presentation tier 402, a middle tier 404 and a data tier 406.

The presentation tier 402 comprises business applications like software application 408. In the presentation tier 402 users 410 are able to initiate certain requests for data. As a response to their requests the users 410 are also able to see the results of their data requests in the software application 408 in the presentation tier 402.

The presentation tier 402 is connected with the middle tier 404. The application servers 412 of the middle tier 404 are parts of a global data center 420, therefore named in the following as global application servers. The application servers 426 are part of a local data center 422, therefore in the following those application servers 426 of the middle tier 404 are named as local application servers 426.

The global application servers 412 are connected and designated to a centralized database 302 of the data tier 406. The centralized database 302 is also part of the global data center 420. The local application servers 426 of the middle tier 404 are connected and designated to a local federated database 300. The federated database 300 is part of the respective local data center (i.e. local system) 422. The centralized database 302 and the respective local federated database 300 are connected by a connection 418.

The local federated database 300 is able to generate federated views with data stored persistently in the local system and data stored in the centralized database 302. For example, the centralized database 302 is used for persistently storing non-restricted access data of distributed data and the local system 422 may comprise at least one federated database 300 and optionally at least one local component database (not shown here) for persistently storing restricted-access data of the distributed data.

An application server assignment manager 424 in the middle tier 404 is able to route a data request of the user to the local or the global application server(s). The routing of the data request to the at least one local application server 426 is performed in accordance with a preconfigured relationship between the user and the at least one local application server 426. The preconfigured location based relationship between the user 410 and the respective application server comprises a predefined configuration of a certain processing of a data request. The relationship thus depends on the location of the user and the respective area to which the application server 426 is assigned. In case of a match, the routing is performed to the application server 426. In case of a mismatch, the application server assignment manager routes the request to the application server(s) 412.

By using a concept of user groups, it can be defined in advance at application level, within a certain administration tool, which data requests of a user group will be routed to which of the global application servers 412 or to which of the local application servers 426. It can thus be defined at application level which local users or user groups are allowed to access a special local application server 426. This can be done in a country-specific manner. Preferably this authorization scheme is in accordance with the setting of the indicators.
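Such a user-group configuration can be pictured as a lookup table maintained in the administration tool. The following Python sketch is purely illustrative: the table `LOCAL_ASSIGNMENTS`, the group and server names, and the function `assign_server` are hypothetical and do not describe any actual administration tool.

```python
# Hypothetical configuration: (user group, country) -> local application
# server. Groups without an entry are served by a global application server.
LOCAL_ASSIGNMENTS = {
    ("hr_staff", "DE"): "local-app-server-de",
    ("hr_staff", "FR"): "local-app-server-fr",
}

GLOBAL_SERVER = "global-app-server"


def assign_server(user_group, country):
    """Return the application server preconfigured for this user group and
    country, falling back to the global application server."""
    return LOCAL_ASSIGNMENTS.get((user_group, country), GLOBAL_SERVER)


print(assign_server("hr_staff", "DE"))
print(assign_server("sales", "DE"))
```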

In another embodiment the application server assignment manager 424 can be the SAP Load Balancer tool, after having been modified according to the purposes of this invention.

In the embodiment shown in Fig. 4, only data which is not privacy protected is stored in the centralized database 302. Users who are not allowed to access privacy protected personal information have access only to this data stored in the centralized database, via corresponding routing by the global application servers 412. Those users will only see default values when trying to access privacy protected personal information.

Requests of local users, who are allowed to access privacy protected personal information stored on a local system, will be routed to one of the local application servers 426 which has been directed to use the local federated database 300 providing a federated view containing also privacy protected personal information. When requesting data, those local users who are authorized to see privacy protected personal information will see the same screen as those users who are not allowed to do so, except that the screen now contains privacy protected personal information data coming from the local system as well as non-privacy protected personal information coming from the centralized database 302.

To enable this behavior of combined and integrated processing of data in accordance with the individual authorization of the user to access privacy protected personal information, the federation capabilities of the local database management systems are exploited. The federation of data is to correlate data from local tables and remote data sources, as if all the data is stored locally in the federated database. The federated database 300 basically comprises nicknames pointing to the global table 308 as well as federated views 306 that join remote personal information data with local privacy protected personal information. By means of the nicknames, the remote personal information data can be retrieved.

The architecture shown in Fig. 4 can be used for several privacy protected personal information processing database schemes with locally distributed databases. In this case each location needs its own local application server(s) 426. All privacy protected personal information data is stored in a local database. The database management system (DBMS) has to be a federated DBMS supporting federated views with a DBMS that hosts the centralized database.

Privacy protected personal information data will never be stored in the centralized database and will never leave the local system environment. Only data that is not privacy protected personal information is stored in the centralized database.

List of reference numerals

300 local federated database

302 centralized database

306 federated view

308 global table

310 local table

312a, 312b, 312c primary keys depicting objects

314a, 314b, 314c privacy relevant attributes

316a, 316b, 316c attributes that are not privacy relevant

318 indicators

400 three-tier architecture of the system infrastructure

402 presentation tier

404 middle tier

406 data tier

408 instances of the global software application

410 users

412 global application servers

418 connection between the centralized database and the federated database

420 global data center

422 local data center

424 application server assignment manager

426 local application servers