Title:
SECURE AND TRANSPARENT DATABASE TRANSACTIONS IN UNTRUSTED ENVIRONMENTS USING PERMISSIONED, MINIMALLY INVASIVE BLOCKCHAIN TECHNOLOGY
Document Type and Number:
WIPO Patent Application WO/2020/225224
Kind Code:
A1
Abstract:
A computer-implemented method for processing a database transaction in a network of database nodes, comprising the steps: Receiving, at a first database node, instructions specifying a database transaction from a database client; determining an effect of the received instructions; among all nodes, forming a consensus on the determined effect.

Inventors:
DITTRICH JENS (DE)
SHARMA ANKUR (DE)
SCHUHKNECHT FELIX MARTIN (DE)
Application Number:
PCT/EP2020/062347
Publication Date:
November 12, 2020
Filing Date:
May 04, 2020
Assignee:
UNIV SAARLAND (DE)
International Classes:
G06F21/64; G06F16/27; H04L9/32
Foreign References:
CN108830733A2018-11-16
US20170213209A12017-07-27
Other References:
ANONYMOUS: "Architecture Origins - hyperledger-fabricdocs master documentation", 5 April 2019 (2019-04-05), XP055701052, Retrieved from the Internet [retrieved on 20200604]
Attorney, Agent or Firm:
BACH, Alexander (DE)
Claims:

1. Computer-implemented method for processing a database transaction in a network of database nodes, comprising the steps:

Receiving, at a first database node, instructions specifying a database transaction from a database client; determining an effect of the received instructions; among all nodes, forming a consensus on the determined effect.

2. The method of claim 1, wherein the received instructions are cryptographically signed by the client.

3. The method of one of the preceding claims, further comprising the step of:

cryptographically signing the received instructions at the first database node.

4. The method of one of the preceding claims, wherein determining an effect of the received instructions comprises obtaining state information on a state of the first database node that occurs when the database transaction commits.

5. The method of one of the preceding claims, wherein the step of determining comprises executing or simulating the received instructions.

6. The method of claim 5, wherein simulating comprises the step of validating integrity constraints of the database transaction.

7. The method of claim 4, wherein the state information comprises a hash value.

8. The method of claim 7, wherein the hash value is obtained based on a first hash value (@H1) representing an initial state of the first database node and a second hash value (@H2) representing a state of the first database node after the database transaction has been executed.

9. The method of claim 7, wherein the hash value is obtained based on a first hash value (@H1) representing an initial state of the first database node and a second hash value (@H2) representing a state of the first database node after a block of several database transactions has been executed.

10. The method of claim 9, wherein the size of the block of database transactions is agreed by the participating database nodes.

11. The method of claim 5, wherein simulating comprises the steps of:

Creating a checkpoint/savepoint (@old_state) of the state to undo the effects of the transaction;

Executing the actual transaction locally; and

Rolling back to the checkpoint in order to undo the effects of the transaction.

12. The method of claim 4, wherein the state information is signed.

13. The method of one of the preceding claims, further comprising the step of receiving state information from a second database node.

14. The method of claim 13, further comprising the step of creating a chained transaction, based on the received state information.

15. The method of one of the preceding claims, wherein the transaction is only committed if the state information of the transaction is consistent with the state information received from the second database node.

16. Computer-implemented method for recovering from an inconsistent state, comprising the steps of detecting that a state deviates from a consensus state; and recovering to a consensus state.

17. The method of claim 16, further comprising recovering from a local checkpoint, using a local transaction history.

18. Database server, implementing a method according to one of claims 1 to 16.

19. A method of converting one or more existing databases for use with a method according to one of claims 1 to 16, comprising the steps:

Creating one or more chained tables, according to a defined table schema;

Storing one or more endorsement constraints, specifying one or more organizations required to endorse a transaction to be carried out on the one or more chained tables.

20. The method of claim 19, further comprising the step of storing one or more integrity constraints specifying additional properties of the one or more chained tables.

21. The method of claim 19, further comprising the step of storing one or more punishment constraints, specifying conditions under which organizations that violate integrity constraints are suspended from participating in transactions.

22. The method according to one of claims 19 to 21, wherein the constraints are stored in a separate configuration file.

23. A method for ordering transactions on chained database tables.

24. A chained database, obtained by converting an existing database according to a method of one of claims 19 to 22.

25. The method of claim 1, wherein each node determines whether its determined effect matches the determined consensus.

26. The method of claim 25, wherein the effect is committed to a ledger, if the effect matches the consensus.

27. The method of claim 26, wherein the node initiates a recovery, if the effect does not match the consensus.

28. The method of claim 1, wherein all database nodes initiate a recovery, when no consensus on the effects is determined.

Description:
SECURE AND TRANSPARENT DATABASE TRANSACTIONS IN UNTRUSTED ENVIRONMENTS

USING PERMISSIONED, MINIMALLY INVASIVE BLOCKCHAIN TECHNOLOGY

The present invention relates to secure and transparent business processes in untrusted environments using permissioned, minimally invasive blockchain technology.

Technical Background and Prior Art

Blockchain technology enables trusted transactions to be performed in untrusted environ ments. The basic idea is to distribute the notary's task over the entire network and have each transaction validated by all participants. Together they decide whether a transaction is valid for all participating organizations. If yes, it is ensured that all organizations involved execute the transaction locally in exactly the same way.

A large number of blockchain systems implement the technology in a wide variety of ways. Public and permissioned blockchains must be distinguished: while public blockchain systems such as Bitcoin and Ethereum allow arbitrary unknown participants to enter and leave the network, permissioned blockchain systems restrict participation to specified organizations. Permissioned systems such as Hyperledger Fabric are therefore ideal for making transactions between companies and departments trustworthy.

However, all existing blockchain systems, regardless of whether they are public or private, suffer from at least one of the following problems that severely limit their usability.

First, blockchain systems currently see themselves as completely independent systems and do not allow seamless integration into existing infrastructures. Typically, organizations already have various database management systems in place that are full of legacy data and accessed by numerous top-level applications. If these organizations want to use blockchain technology, they need to add another new system to their already complex infrastructure. This new blockchain system must be installed, administered, and integrated into the application landscape in a complex manner. If necessary, data must be exchanged laboriously between the existing databases and the new blockchain. Thus, the initial and operating costs for organizations that want to use blockchain technology are currently immense.

Second, blockchain systems typically lack support for classical query languages such as SQL (Structured Query Language) and for established data models such as the relational model. Blockchain systems usually either focus on a very specific application (such as financial transactions, e.g. Ripple) or support arbitrarily complex Turing-complete programs, also called Smart Contracts. These Smart Contracts must be written in exotic languages like Solidity (used by Ethereum, for example) and work on unstructured data. This is in contrast to the reality within companies using relational database management systems such as IBM DB2, Microsoft SQL Server, Oracle, MySQL, or PostgreSQL, all of which use SQL. Therefore, it is currently difficult and time-consuming to integrate blockchain technology into existing environments, as top-level applications have to be adapted to the data model and the query language on a large scale.

Third, blockchain systems currently suffer from a low transaction throughput on the order of several hundred transactions per second. In comparison, (distributed) database management systems achieve a throughput that is up to two orders of magnitude higher. One reason is that current blockchain systems tend to reinvent the wheel by re-implementing the complete processing system. Instead, existing components that have been available and optimized in database management systems for years could be used.

Object of the Invention

It is therefore an object of the invention to provide more efficient methods and systems for secure and transparent database transactions in untrusted environments.

Summary of the Invention

This object is achieved by the methods and systems according to the independent claims. Advantageous embodiments are defined in the dependent claims.

According to a first aspect, the invention comprises a computer-implemented method for processing a database transaction in a network of database nodes, comprising the steps: receiving, at a first database node, instructions specifying a database transaction from a database client; determining an effect of the received instructions; among all nodes, forming a consensus on the determined effect. The received instructions may be cryptographically signed by the client. The method may further comprise the step of cryptographically signing the received instructions at the first database node. Determining an effect of the received instructions may comprise obtaining state information on a state of the first database node that occurs when the database transaction commits. The step of determining may comprise simulating the received instructions. Simulating may comprise validating integrity constraints of the database transaction. Simulating may further comprise the steps of: creating a checkpoint/savepoint (@old_state) of the state to undo the effects of the transaction; executing the actual SQL transaction locally; and rolling back to the checkpoint in order to undo the effects of the transaction. The state information may be digitally signed. The state information may comprise a hash value. The hash value may be obtained based on a first hash value (@H1) representing an initial state of the first database node and a second hash value (@H2) representing a state of the first database node after the database transaction has been executed. The hash value may also be obtained based on a first hash value (@H1) representing an initial state of the first database node and a second hash value (@H2) representing a state of the first database node after a block of several database transactions has been executed. The size of the block of database transactions may be agreed by the participating database nodes.

The method may further comprise the step of receiving state information from a second database node. The method may further comprise the step of creating a chained transaction based on the received state information. The transaction may only be committed if the state information of the transaction is consistent with the state information received from the second database node.

Each node may determine whether its determined effect matches the determined consensus. The effect may be committed to a ledger if the effect matches the consensus. Otherwise, if the effect does not match the consensus, the node may initiate a recovery. All database nodes may initiate a recovery when no consensus on the effects is determined.

According to a second aspect, the invention also comprises a computer-implemented method for recovering from an inconsistent state, comprising the steps of detecting that a state deviates from a consensus state; and recovering to a consensus state. The method may comprise recovering from a local checkpoint, using a local transaction history. Alternatively, recovery comprises recovering from a remote checkpoint and a remote transaction history. In the latter case, the recovery may further comprise performing another consensus round to determine whether the remote data can be used for recovery. The recovery method may also use certain recovery facilities of the underlying database system, if available. For example, the checkpointing mechanism could be used if it is available; it may also be performed "manually", i.e. the recovery mechanism may use plain instructions of the underlying database without relying on high-level abstractions such as a rollback primitive. In this way, it becomes possible to support generic database systems that do not implement a rollback operation as such. In addition, there are more options and possibilities for controlling the recovery.

According to a third aspect, the invention also comprises a database server, implementing a method according to the first and second aspects.

According to a fourth aspect, the invention also comprises a method of converting one or more existing databases for use with a method according to the first and/or the second aspect, comprising the steps: creating one or more chained tables, according to a defined table schema; storing one or more endorsement constraints, specifying one or more organizations required to endorse a transaction to be carried out on the one or more chained tables. Converting may further comprise the step of storing one or more integrity constraints specifying additional properties of the one or more chained tables. The method may further comprise the step of storing one or more punishment constraints, specifying conditions under which organizations that violate integrity constraints are suspended from participating in transactions. The constraints may be stored in a separate configuration file.

According to a fifth aspect, the invention also comprises a method for ordering transactions on chained database tables.

According to a sixth aspect, the invention also comprises a chained database, obtained by converting an existing database using a method according to the fourth aspect.

The term ChainifyDB will be used in the following as a shorthand in order to designate collectively the methods, systems and databases according to the various embodiments of the invention.

Brief Description of the Figures

Fig. 1 shows a first standard scenario according to a first embodiment of the invention, in which all organizations are in the same state and all organizations behave honestly.

Fig. 2 shows a second safety-critical scenario according to the first embodiment of the invention, in which the state of an organization deviates, and all organizations behave honestly.

Fig. 3 shows a third safety-critical scenario according to the first embodiment of the invention, in which the state of an organization deviates, and the organization behaves maliciously.

Fig. 4 shows the handling of a violated integrity constraint in a fourth safety-critical scenario according to the first embodiment of the invention.

Fig. 5a-e show a sequence of diagrams illustrating an example of a recovery mechanism for a chained database according to the first embodiment of the invention. In figure 5a, organization A intends to create a checkpoint, but does not reach consensus. In figure 5b, organization B intends to create a checkpoint and reaches consensus to do so. In figure 5c, organization C detects a corruption of its state. In figure 5d, organization C rolls back to the last checkpoint, commits all missing transactions, and asks for agreement. In figure 5e, organization C fully recovers and reaches the latest checkpoint.

Fig. 6a-b show (a) the order-execute-consensus processing model (OEC) and (b) the Whatever-LedgerConsensus processing model (WLC) for database transactions on shared tables according to a second embodiment of the invention.

Fig. 7 shows a schematic overview of a concrete system instantiating the Whatever-LedgerConsensus model (WLC).

Fig. 8 shows a logical tuple-wise per-block digest computation on an example table.

Fig. 9 shows ChainifyDB's checkpointing mechanism, wherein a checkpoint is created after every three blocks.

Fig. 10 shows ChainifyDB's recovery using checkpoints. As block 46 is non-consenting, it has to enter the recovery phase. It will first try to recover using the most recent local checkpoint. This fails in this example and hence recovery from an older checkpoint is performed.

Fig. 11 shows that transaction agreement can be regarded as running a separate pre-WLC phase on transaction agreement before executing the actual transaction.

Fig. 12 shows a topological sort of the dependency graph with k = 9 transactions, yielding four execution stages.

Fig. 13 presents another visualization of the round-based processing of the WLC model.

Fig. 14 shows a possible implementation of the W-phase in pseudocode.

Fig. 15 shows a possible implementation of the LC-phase in pseudocode.

Fig. 16 shows an example of a checkpoint-based recovery in an organization.

Fig. 17 shows a concrete implementation of the W-phase in pseudo-code.

Fig. 18 shows a running example of how the transaction flows through ChainifyDB.

Fig. 19a shows the throughput of successful transactions for the heterogeneous setup.

Fig. 19b shows the throughput of standalone MySQL and PostgreSQL for a varying number of clients.

Fig. 20a shows robustness and recovery of the inventive system under the Any-2 consensus policy in a typical heterogeneous setup.

Fig. 20b shows robustness and recovery of the inventive system under the Any-2 consensus policy in a homogeneous setup.

Fig. 21 shows a breakdown of the cost of all cryptographic computations.

Fig. 22 shows the results of varying the block size.

Detailed Description

In contrast to the permissioned blockchain systems that exist on the market, the present invention builds upon an existing infrastructure. This existing infrastructure is composed of a set of local databases operated by different data management organizations. In order to guarantee privacy of data, private tables and private transactions are isolated from data that is shared with other organizations via the inventive system: private tables are strictly separated from shared data by the access rights system of the underlying database management systems. This means that an organization has no way of accessing the private data of another organization.

In order to guarantee trusted data exchange, the inventive system provides chained tables and chained transactions. After installing the inventive system, organizations can jointly create so-called chained tables on which chained transactions that guarantee trustworthiness can be executed. For a chained table, four properties must be specified: (a) the table schema that describes the properties of the table; (b) the endorsement constraints that specify which organizations must endorse a transaction working on the table. In the case of two organizations A and B, a reasonable condition would be that both A and B must endorse each transaction; (c) optional integrity constraints that specify additional conditions on the table; and (d) optional punishment constraints that determine under which conditions organizations that violate integrity constraints are suspended. These properties are specified in a configuration file, on which all organizations have to agree beforehand.
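As an illustration, the four properties for the chained table Purchases used in the examples below might be specified as follows. The concrete syntax and names are hypothetical and merely mirror the agreement-policy notation used further below:

TableSchema(Purchases) = { ID: int PRIMARY KEY, price: int }
EndorsementPolicy(Purchases) = { OrganizationA, OrganizationB }
IntegrityConstraint(Purchases) = { price >= 0 }
PunishmentConstraint(Purchases) = { suspend after 3 aborted chained transactions }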

Endorsement is a central component of the security concept: every organization that has to endorse a proposed transaction first executes it locally. A cryptographic hash value is calculated from the state of the database before and after the execution of the proposed transaction. This hash value is additionally signed cryptographically by the respective organization. If the calculated hash values of the individual organizations do not match, the proposed transaction can already be rejected at this point. If they agree, the signed hash values of all organizations involved and the original proposed transaction are bundled in the form of an endorsed transaction. This is then sent to all organizations of the network, which now have to verify the transaction cryptographically, independently of each other. This is necessary because the transaction may have been manipulated by one of the endorsing organizations. For cryptographic verification, each organization now executes the original transaction and calculates a cryptographic hash value of the state before and after execution. In the next step, each organization verifies the signatures contained in the transaction by comparing the hash values contained in the signatures with its locally calculated hash value. If all values match, the transaction is valid and the change is committed. Otherwise, the execution effect is reversed and the transaction is discarded.

In addition to the endorsement conditions, optional integrity constraints can also be formulated that specify additional security conditions on the table and create trust between organizations. For example, for a chained table that is to store bank accounts without overdrafts, it could be specified that no submitted transaction may set the account balance to a negative value. Integrity constraints can also be formulated in relation to private tables. These integrity constraints are checked by the endorsing organizations during simulation.

The inventive system also establishes a global order among all transactions. This is realized by a dedicated ordering service. This ordering service can be realized as a single trusted instance or using a consensus mechanism.

In summary, chained tables and chained transactions enable trustworthy data exchange. This is guaranteed by the fact that each transaction must first be endorsed using cryptographic methods and then verified by all organizations. In addition, the integrity of each transaction with respect to the terms of individual organizations is ensured in order to protect their individual requirements.

In order to protect against malicious behavior of organizations, the system suspends organizations whose endorsements repeatedly cannot be verified by the rest of the network. Such suspension may also apply to organizations that repeatedly propose transactions that violate integrity conditions. Collectively agreed punishment constraints regulate precisely under which conditions an organization is suspended. During a suspension, the organization cannot introduce any new transactions. If an organization has been suspended, its status can be cleared by the other participating organizations.

ChainifyDB can thus be used in untrusted environments where organizations are potentially malicious. In the case of a malicious organization, ChainifyDB continues to guarantee the execution of trusted transactions. Hereby, two types of attacks on the system may be distinguished: (a) An organization tries to introduce changes into a chained table that are not in the interest of the other organizations. If a transaction violates the integrity conditions of the chained table it is working on, ChainifyDB will abort the transaction and no changes will be made to the dataset. If an organization repeatedly submits chained transactions that need to be aborted, the organization may be suspended under the terms of the punishment constraints. (b) An organization could attempt to spy out private data from private tables of another organization by repeatedly introducing transactions that "test" integrity conditions. For example, an organization might want to find out the internally set maximum price of the trading partner for a trade by repeatedly introducing chained transactions that gradually lower the trade price until a transaction is accepted by the system. This scenario can also be intercepted by punishment constraints.

In summary, ChainifyDB behaves robustly in the event of a malicious organization and suspends it so that the system remains usable for the remaining organizations.

In order to guarantee reliability, the inventive system also provides recovery from a damaged or abnormal condition. It is possible that the replica of a chained table within an organization is corrupted, e.g. by an administrator making manual changes to the database. Other private blockchain systems do not provide an automated way to restore a consistent state. In ChainifyDB, however, an organization has the ability to restore the state of a chained table with the help of the other organizations, by using a consensus mechanism to determine and replicate the last true state.

In summary, ChainifyDB enables secure and automatic recovery of a consistent state after shared data has been corrupted or tampered with.

In order to provide seamless integration, ChainifyDB builds on existing local database management systems. During installation, these database management systems remain fully online and can still be queried, as ChainifyDB only needs to connect to the underlying systems.

ChainifyDB therefore integrates trusted transactions without limiting the usability of the database management systems. ChainifyDB is thus oriented towards the needs of users to make security concepts usable in existing systems without hurdles.

In particular, chained transactions are written in traditional SQL and do not differ from traditional private transactions. ChainifyDB can work with almost any relational database management system, because it has only the following minimum requirements for a DBMS: (1) support of the SQL-92 standard; (2) support of transactions; (3) support of the isolation level "repeatable read". These requirements are met by widely used systems such as Oracle, IBM DB2, Microsoft SQL Server, PostgreSQL and MySQL.
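For illustration, the third requirement corresponds to a single standard SQL statement that a ChainifyDB connection could issue on any of the listed systems; a minimal sketch:

-- request the isolation level required by ChainifyDB for the current transaction
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;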

Moreover, organizations that perform chained transactions do not need to agree on a specific DBMS to store the chained tables. Any organization can use a different relational DBMS, as long as it meets the above requirements. For example, organization A can run ChainifyDB on top of IBM DB2, while organization B combines it with PostgreSQL.

Figure 1 shows a first standard scenario in which all organizations are in the same state and all organizations behave honestly.

In step 1, organization B, the "client organization", proposes a transaction ("proposed transaction") that intends to change the chained table Purchases. This proposed transaction contains the following SQL code ("the original body"):

UPDATE Purchases SET price = 20000
WHERE ID = 42

In step 2, the proposed transaction must be forwarded to all organizations that have to endorse the proposed transaction. These endorsing organizations are specified in the endorsement policy of the chained table Purchases. In this example, the endorsement policy states that both organization A and organization B must endorse the proposed transaction. Therefore, organization B, which is the client organization, sends the proposed transaction to organization A.

In step 3, all endorsing organizations now have to simulate the proposed transaction locally. To do so, they embed the original body of the proposed transaction in a simulating transaction that looks as follows, and execute it:

BEGIN;
-- check integrity constraints
IF INTEGRITY_CONSTRAINTS_VIOLATED()
    ROLLBACK;
-- compute the hash of the state on which the simulation happens
SELECT @H1 = HASHING_UDF(T, "ID=42") FROM T;
-- create a savepoint for old_state
SAVEPOINT @old_state;
-- execute the original body of the proposed transaction
UPDATE Purchases SET price = 20000 WHERE ID = 42;
-- compute the hash of the state of the database after the transaction is executed
SELECT @H2 = HASHING_UDF(T, "ID=42") FROM T;
-- rollback to old_state to revert the effects of the transaction
ROLLBACK TO SAVEPOINT @old_state;
-- return the merged hash of initial and final state
SELECT @H = MERGE_HASH_UDF(@H1, @H2);
-- sign the hash using the private key of the organization
SELECT @S = SIGN(@H, privateKey);
COMMIT;

In step 4, within the simulating transaction, it is first checked whether the integrity constraints are violated. If yes, the organization aborts the transaction. If they are not violated, the simulating transaction first captures the current state of the database in form of a cryptographic hash (@H1) and creates a savepoint (@old_state). Then, the original body is executed, which potentially changes the database. Therefore, a second cryptographic hash (@H2) is computed to capture the new database state. As the effects of the transaction are only simulated at this point in time, the effects of the execution of the original body are now rolled back to the previously created savepoint @old_state. Finally, the two hashes @H1 and @H2 are merged into a single hash @H, which is additionally cryptographically signed (@S) using the private key of the organization (asymmetric encryption).

In step 5, organization A returns its pair of hash and signature (@HA, @SA) to the client organization B, which itself computed its pair (@HB, @SB).

In step 6, the client organization compares @HA and @HB. If they differ, the client organization aborts the transaction. In this case, they are equal and therefore, the client organization forms an endorsement. An endorsement contains the original body as well as all signatures (@SA and @SB).

In step 7, the client organization sends the endorsement to all other organizations, i.e. organization A and organization C, as all organizations of the network have to validate the endorsement in the following.

In step 8, to validate the endorsement, each organization now embeds the original body, as well as the received signatures, in the following validation transaction, and executes it:

BEGIN;
-- compute the hash of the state on which the validation happens
SELECT @H1 = HASHING_UDF(T, "ID=42") FROM T;
-- create a savepoint for old_state
SAVEPOINT @old_state;
-- execute the original body
UPDATE Purchases SET price = 20000 WHERE ID = 42;
-- compute the hash of the state of the database after the transaction is executed
SELECT @H2 = HASHING_UDF(T, "ID=42") FROM T;
-- return the merged hash of initial and final state
SELECT @H = MERGE_HASH_UDF(@H1, @H2);
-- restore the hashes from the received signatures using the public keys of the organizations
SELECT @HA = decrypt(@SA, publicKeyA);
SELECT @HB = decrypt(@SB, publicKeyB);
-- if there is any difference between the hashes, roll back the effects
IF @HA != @H or @HB != @H
    ROLLBACK TO SAVEPOINT @old_state;
COMMIT;

In step 9, within the validation transaction, a cryptographic hash is again computed on the state before (@H1) and after (@H2) the execution of the original body. These hashes are then merged (@H). To perform the actual validation, the received signatures (@SA and @SB) are decrypted to the hashes @HA and @HB using the public keys of the respective endorsing organizations. Next, it is compared whether the locally computed hash (@H) equals the hashes (@HA and @HB) that have been decrypted from the received signatures. If they do not match, the effects are rolled back to the savepoint (@old_state). In this example, they match and therefore, the effects are committed.

Figure 2 shows a safety-critical scenario, wherein the state of an organization deviates, and all organizations behave honestly. Steps 1 to 5 are executed as in the scenario shown in figure 1. In step 6, the client organization compares @HA and @HB. If they differ, the client organization aborts the transaction. This is the case here and therefore, the client organization does not form an endorsement and aborts the proposed transaction. Consequently, the proposed transaction is also not processed by any other organization.

Figure 3 shows a safety-critical scenario, wherein the state of an organization deviates, and the organization additionally behaves maliciously. Steps 1 to 5 are executed as in the scenario shown in figure 1. In step 6, the client organization compares @HA and @HB. If they differ, the client organization has to abort the transaction. However, since organization B acts maliciously, it ignores the differing hash values and wants to ensure that the proposed transaction is committed by the other organizations despite the differences. Thus, it forms an endorsement containing the original body as well as all signatures (@SA and @SB). Steps 7 and 8 are then executed as described in relation to figure 1. In step 9, within the validation transaction, a cryptographic hash is again computed on the state before (@H1) and after (@H2) the execution of the original body. These hashes are then merged (@H). To perform the actual validation, the received signatures (@SA and @SB) are decrypted to the hashes @HA and @HB using the public keys of the respective endorsing organizations. Next, it is compared whether the locally computed hash (@H) equals the hashes (@HA and @HB) that have been decrypted from the received signatures. If they do not match, the effects are rolled back to the savepoint (@old_state). In this example, @H does not match @HB, as organization B worked on a deviated database state. Therefore, the proposed transaction is not committed by any honest organization.

Figure 4 shows a safety-critical scenario, in particular the handling of a violated integrity constraint. Steps 1 to 3 are executed as in the scenario shown in figure 1. In step 4, within the simulating transaction, it is first checked whether the integrity constraints are violated. In the case of organization A, all integrity constraints are respected. In the case of organization B, not all integrity constraints are respected, as Purchases.price <= Orders.amount * Orders.pricePerPiece is violated. Therefore, organization B aborts the proposed transaction.

Figures 5a to 5e show a sequence of diagrams illustrating an example of a recovery mechanism for a chained database according to an embodiment of the invention.

As stated above, there is an ordering service, which establishes a global order among all transactions. The ordering service guarantees that every organization receives the same transactions in the same order. The recovery mechanism of the invention relies on this property.

A checkpoint represents a globally agreed-on state of the database, which can be used by an organization to recover from a corrupted state. For example, in the situation of figure 5a, a new checkpoint should be generated after every three committed transactions. The last checkpoint has been created successfully after transaction T3, where H1 = 42 represents a cryptographic hash of the database state. Organization A committed the transactions T4, T5, and T6 and now intends to create a new checkpoint. It computes its hash HA2 = 67 and asks the remaining organizations, using a consensus mechanism (e.g. Paxos), whether they agree with the hash. Since they have not received the transactions T4 to T6 yet, they cannot agree and therefore, no checkpoint can currently be created.

In figure 5b, organization B now receives the three transactions T4, T5, and T6 as well and commits them. It computes its hash HB2 = 67 and starts a new consensus round. Organization A agrees with the hash, as HA2 = 67 = HB2. Organization C still has not received the three transactions and therefore does not agree. However, since the majority of the network (two out of three) computed the same hash, a globally agreed-on checkpoint is created.

In figure 5c, organization C finally commits the three transactions T4, T5, and T6 as well. However, before committing, the state of the database has been corrupted. Consequently, committing the three transactions on the corrupted state leads to the hash HC2 = 99, which differs from the hash H2 = 67 of the globally agreed-on state of the checkpoint. Thus, organization C has to recover the correct state.

In figure 5d, organization C first rolls back its database state to the last available checkpoint, i.e. the one represented by H1 = 42. Next, it pulls the rolled-back transactions from another organization, applies them, and computes a hash HC2 = 67. After that, it asks all other organizations whether a majority of them agrees to this hash. In the present example, all of them agree.

In figure 5e, organization C has fully recovered and reached the latest checkpoint represented by the hash H2 = 67. It is now ready to commit new transactions.

In order to eliminate the need for assumptions on the execution of transactions in every participating organization, as well as assumptions on their resilience against tampering with local data when processing database transactions on shared tables, the invention proposes to push the consensus-phase towards the end of the processing pipeline.

Figure 6a shows a preferred model according to a second embodiment of the invention, termed the order-execute-consensus model (OEC). The consensus-phase only occurs at the end of the pipeline, after both the order-phase and the execute-phase. As consensus is reached on the effects of the execute-phase, instead of on the order established by the order-phase, no assumptions must be made on the order-phase and the execute-phase. From this perspective, the inventive methods and systems are able to abstract from the concrete order-phase and execute-phase: whatever happens in these phases, all differences in the produced effects may be detected in the consensus-phase afterwards. This results in a new, highly flexible processing model, termed the Whatever-LedgerConsensus model, or WLC ("We'll see!") for short.

Figure 6b is a schematic overview of the Whatever-LedgerConsensus model (WLC) according to the second embodiment of the invention. No assumptions are made on the behavior of the whatever-phase. In the consensus-phase, consensus is reached on the effects of the whatever-phase:

(1) Whatever-phase: Each organization does whatever it deems necessary to pass the LC-phase later on.

(2) LedgerConsensus-phase: A consensus round is performed on the effects of the whatever-phase to check whether consensus can be reached on them. If yes, the effects are committed to a ledger. If an organization is non-consenting, it must perform a recovery. If no consensus is reached at all, all organizations must try to recover.

The core idea of WLC is not to seek consensus on the actions that should be done, but rather to seek consensus on the effects of the actions after they have been performed. This allows dropping assumptions on the concrete transaction processing behavior of the participating organizations. It also allows detecting any external tampering with the data. The WLC model implies that if consensus on the effects of certain actions cannot be reached, the organizations must not commit the effects of their actions. WLC is also more powerful than classical 2PC- or 3PC-style protocols in the sense that those protocols still assume deterministic behavior of the organizations without looking back at the produced effects. WLC simply does not care about whether organizations claim to be prepared for a transaction and what they claim to commit. In contrast, the WLC processing model solely looks at the outcome. If consensus on the outcome can be reached, it does not matter anymore which actions those organizations used to get to the same outcome in the first place. In summary, the outcome is measured rather than the promises.

In order to formalize the processing model of WLC in detail, let O_1, ..., O_k be the k participating organizations in the network. As always, these organizations do not trust each other. Still, they want to process a mutual sequence of inputs, which is fed into the system batch-wise, round by round, as visualized in figure 6a. The following description shows the W-phase and the LC-phase in round t.

Whatever (W)-phase: Although the same sequence of inputs enters the system, each organization might actually receive a potentially different set of actions for processing, as no assumptions are made on what exactly happens in the W-phase. For instance, an ordering service could distribute different actions to different organizations for whatever reason. Thus, A_{i,1}, ..., A_{i,t} is defined as the sequence of actions that the individual organization O_i receives up to round t.

For the purpose of the following explanations, an effect function F_i() is used for designating the effect E_{i,t} of an action. Only if the effect contains the state can an action be applied to the effect to generate a new effect. For each organization O_i, the accumulated effect E_{i,t} may be determined as:

E_{i,t} = F_i(E_{i,t-1}, A_{i,t})

with E_{i,0} = ∅ being the initial empty effect. The iterative construction of this function is later used to create a blockchain-style chaining of effects. The term E_{i,t} = F_i(∅, [A_{i,1}, ..., A_{i,t}]) is used as a shorthand; still, there is a separate effect function call for each of the t actions.

LedgerConsensus (LC)-phase: On the accumulated effects E_{i,t} for 1 ≤ i ≤ k, a consensus must be reached (and not necessarily on the entire state!). Otherwise, the system is not allowed to proceed to round t+1. To decide whether consensus is reached, a consensus policy c specifies how many organizations must at least have reached the same accumulated effect.

Thus, consensus in round t is reached if

∃ E : |{ i ∈ {1, ..., k} : E_{i,t} = E }| ≥ c.

The effect on which consensus has been reached is denoted as E_{cons,t}. If consensus has been reached, each organization O_i has to decide on its own whether its local effect E_{i,t} matches the consensus effect E_{cons,t}. If it matches, the effect can be committed to a ledger and round t ends. Otherwise, O_i cannot proceed to round t+1 and tries to recover.

If no consensus can be reached at all, i.e. E_{cons,t} = undefined, then no organization can proceed to round t+1. In this case, all organizations try to recover.

Recovery must be performed on an organization O_i if E_{i,t} ≠ E_{cons,t}. The reasons for this can vary: it could be that O_i simply interpreted A_{i,t} differently than the others, or that non-determinism is hidden in A_{i,t}. It could also be that someone, e.g. an administrator, tampered with E_{i,t}.

Irrespective of the cause, during recovery a new effect E'_{i,t} has to be computed. If E'_{i,t} ≠ E_{i,t}, then the computed effect differs from the original effect, which was used for consensus, and O_i has a chance to recover. If now E'_{i,t} = E_{cons,t}, then O_i has recovered and can proceed with round t+1. If not, it cannot recover and is excluded from the network.

If no consensus has been reached at all, i.e. E_{cons,t} = undefined, a new consensus round is performed on the new effects. The network can recover if

∃ E' : |{ i ∈ {1, ..., k} : E'_{i,t} = E' }| ≥ c.

There are tradeoffs and optimizations involved in practical recovery. For example, instead of starting with an empty state and replaying the entire sequence of actions, there are many more options. Table 1 shows the two dimensions of Whatever recovery (accessibility of effects vs. accessibility of actions) and their implications on the classes of recovery algorithms possible:

Table 1: The 2 x 3 Whatever Recovery Landscape

                      | Actions not accessible      | Blackbox actions        | Whitebox actions
State not contained   | no recovery possible        | full replay             | optimized full replay
State contained       | overwrite with a consenting | partial replay from     | optimized partial replay
                      | remote effect               | a snapshot              | from a snapshot

The table has an effects dimension and an actions dimension, resulting in six different classes of recovery algorithms. The effects dimension distinguishes whether the state is not contained or contained in the effect. The actions dimension distinguishes between actions that are not accessible, blackbox actions (access exists, but the semantics are not understood in any way), and whitebox actions (access exists and the semantics are understood, i.e. the individual operations carried out are visible). It is understood that each cell in the above table includes all cells with weaker accessibility levels (i.e. all cells that are further left and/or up).

First, if actions are not accessible, a non-consenting organization cannot be recovered when the state is not contained. Otherwise, when the state is contained, the states computed by the individual organizations can be accessed, but without understanding their semantics. As actions cannot be accessed, they cannot be applied in any way. In order to recover organization O_i in round t, its local effect E_{i,t} is overwritten with the effect E_{l,t} of any other organization O_{l≠i} that matches the consensus effect E_{cons,t}:

E'_{i,t} = E_{l,t}.

If E_{cons,t} is undefined, i.e. no consensus was reached in round t, then O_i cannot recover.

Second, if the sequence of blackbox actions can be accessed but the state is not contained, organization O_i is recovered in round t by (blindly) replaying the entire history of blackbox actions A_{i,1}, ..., A_{i,t} from the very beginning to restore the accumulated effect:

E'_{i,t} = F_i(∅, [A_{i,1}, ..., A_{i,t}]).

If E'_{i,t} = E_{cons,t}, then O_i may rejoin the network.

If the state is contained, organization O_i is recovered in round t by performing a partial replay, i.e. by starting with an older effect E_{i,s} with s < t and by partially replaying the blackbox actions A_{i,s+1}, ..., A_{i,t} (redo in database lingo):

E'_{i,t} = F_i(E_{i,s}, [A_{i,s+1}, ..., A_{i,t}]).

If E'_{i,t} = E_{cons,t}, then O_i may rejoin the network.

Third, if the sequence of whitebox actions is accessible and the state is not contained, organization O_i is recovered in round t by replaying the entire history of actions from the very beginning to restore the accumulated effect. However, as whitebox actions are available, the replay can be optimized: for example, if it is known that an action A_{i,t} consists of a sequence of three transactions A_{i,t} = [T1, T2, T3], where all three transactions update the same record, then T1 and T2 can be safely dropped and A'_{i,t} = [T3] can be used for the replay. Further possible optimizations include changing the order of operations and allowing for parallel execution of non-conflicting operations, as will be shown later. Thus, the effect is restored using the optimized actions A'_{i,1}, ..., A'_{i,t}:

E'_{i,t} = F_i(∅, [A'_{i,1}, ..., A'_{i,t}]).

If E'_{i,t} = E_{cons,t}, then O_i may rejoin the network.

If the state is contained and access to whitebox actions is possible, organization O_i is recovered in round t by starting with an older effect E_{i,s} with s < t and by partially replaying the optimized actions A'_{i,s+1}, ..., A'_{i,t} (redo in database lingo):

E'_{i,t} = F_i(E_{i,s}, [A'_{i,s+1}, ..., A'_{i,t}]).

If E'_{i,t} = E_{cons,t}, then O_i may rejoin the network.

Here, the concepts of 'effect' and 'action' are used to abstract from the details of a concrete implementation. For example, it is not strictly necessary to materialize an effect physically: replaying the entire history of actions is sufficient to restore an effect. This is similar to log-only databases ("the log is the database") that regard the database store as a performance-optimized representation of the database log. In the present invention, effects are materialized in the form of database states and snapshots thereof to enable high-performance transaction processing and recovery. Actions represent blocks of transactions, which modify the database state and thereby generate new effects.

Figure 7 shows a schematic overview of a concrete ChainifyDB system instantiating the Whatever-LedgerConsensus model (WLC). The implementation operates in rounds and consumes, in every round, a batch of input transactions which have been proposed by clients to the system.

In its simplest variant, ChainifyDB instantiates the W-phase of the WLC model with two subphases: the order-subphase and the execute-subphase. In the order-subphase of round t, a batch of input transactions is globally ordered and grouped into a block. An action corresponds here to a block of transactions, i.e. A_{i,t} = [T1, T2, T3]. Packing transactions into blocks is merely a performance optimization in order to amortize the costs for consensus later on; there is no conceptual need to form blocks. In strong contrast to prior-art systems, a consensus round is not performed on the established order afterwards, even though the ordering service is not trusted in any way. Instead, the system simply takes whatever the ordering service outputs.

In the execute-subphase, each organization O_i receives an action A_{i,t} = [T1, T2, T3] produced by the order-phase. Each transaction of that block that has valid signatures is then executed against the local relational database. This execution potentially updates the database and thereby produces an effect E_{i,t}. For now, it is simply assumed that the effect captures all modifications done to the database by the valid transactions of the block.

Again, in contrast to the prior art, deterministic execution across all organizations is not assumed: the different DBMSs of two organizations could have interpreted a transaction slightly differently, or two organizations could have received different blocks from the order-phase altogether.

In the ledger-consensus phase, all organizations have to reach consensus on the effects produced by the whatever-phase of the individual organizations. Thus, in a consensus round, which will be described in detail later, the organizations first try to agree on one particular effect. Then, each organization whose effect is consenting commits it to its local ledger and proceeds with round t+1. Again, only if consensus on an effect is reached is it considered globally committed. If the effect of an organization is non-consenting, the organization must at least try to recover from this situation in the recovery-subphase. This is done using a variant of Optimized Partial Replay from a (Logical) Snapshot, described above.

The execution of a block on the local database produces an effect as a side-product of execution. On this effect, the consensus round is performed. It is also eventually committed to the ledger.

Figure 8 shows a logical tuple-wise per-block digest computation on an example table named 'Foo'. All changes are automatically tracked and digested through SQL-99 triggers. In ChainifyDB, SQL-99 compliant relational DBMSs are assumed to keep the state at each organization because, on the one hand, one wants to allow existing organizations with existing DBMS products to build WLC networks easily with blockchain-style guarantees. On the other hand, one can utilize SQL-99 triggers to realize a vendor-independent digest versioning mechanism that specifically versions the data of ChainifyDB in form of a digest table.

The digest table is computed per block. Every shared table in the system is instrumented with a trigger mechanism that populates this digest automatically as follows: for every tuple changed by an INSERT, UPDATE, or DELETE statement, a corresponding digest tuple is created. A digest tuple has the schema [PK: <as in Foo>, serial: int, hash: int, T], where PK is the primary key of the original table (which may of course also be a compound key). The value serial is a strictly monotonically increasing counter used to distinguish entries with the same PK (every new version of a tuple increases this counter). The value hash is the digest of the values of the tuple after it was changed (for a delete: its state before removing it from the table), and the value T is the type of change that was performed, i.e. (I)nsert, (U)pdate, or (D)elete. In contrast, the information on how to undo/redo changes is not preserved in those tuples, as that information is simply not needed.
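A minimal sketch of such a trigger for UPDATE statements on table Foo is given below in SQL-99-style syntax. The exact trigger syntax varies by vendor, the column names a, b, c of Foo are hypothetical, and HASH_UDF stands for a user-defined function computing the tuple digest:

CREATE TRIGGER Foo_digest_after_update
AFTER UPDATE ON Foo
REFERENCING NEW ROW AS n
FOR EACH ROW
    -- create one digest tuple per changed tuple
    INSERT INTO Foo_Digest (PK, serial, hash, T)
    VALUES (n.PK,
            -- next version counter for this primary key (starts at 0)
            (SELECT COALESCE(MAX(serial) + 1, 0) FROM Foo_Digest WHERE PK = n.PK),
            -- digest of the tuple values after the change
            HASH_UDF(n.PK, n.a, n.b, n.c),
            -- type of change: (U)pdate
            'U');

Analogous triggers for INSERT and DELETE would write 'I' and 'D' digest tuples, respectively (for a delete, hashing the OLD ROW instead of the NEW ROW).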

In the example, a block of three transactions is processed, starting with a particular state of table Foo and an empty digest table Foo_Digest. First, an update is performed on the tuple with PK=4. As a result, the tuple in Foo is changed to (4, 42, z, 7.9) and a new digest tuple (4, 0, 3B50, U) is inserted into Foo_Digest. After that, a delete of the record with PK=3 is performed. The tuple in Foo is deleted and a new digest tuple (3, 0, B2F9, D) is inserted into Foo_Digest. The processing proceeds until all three transactions TA1 to TA3 available in this particular block have been processed. Processing of the next block starts (again) with an empty digest table. Although the digest table captures all changes done to a table by the last block of transactions, it does not represent the effect yet. The actual effect is represented in form of a so-called LedgerBlock, which comprises the following fields:

1. TA list: The list of transactions that were part of this block. Each transaction is represented by its SQL code.

2. TA successful: A bitlist flagging the successfully executed transactions. This is important as transactions may of course fail, and that behavior should be part of consensus across all organizations.

3. hash digest: A hash of the logical contents of the digest table. In the present case, this is a hash over the hash values present in the digest table: the hash values are concatenated in lexicographical [PK, serial] order and then input into SHA-256 (see the sketch after this list).

4. hash previous LedgerBlock: A hash value of the entire contents of the previous LedgerBlock appended to the ledger, again in the form of a SHA-256 hash. This backward chaining of hashes allows anyone to verify the integrity of the ledger in linear time.
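A minimal sketch of the hash-digest computation of field 3, assuming a PostgreSQL-style ordered STRING_AGG and a hypothetical SHA256_UDF:

-- concatenate all digest hashes in lexicographical [PK, serial] order
-- and hash the concatenation with SHA-256
SELECT SHA256_UDF(STRING_AGG(CAST(hash AS VARCHAR), '' ORDER BY PK, serial)) AS hash_digest
FROM Foo_Digest;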

This LedgerBlock now leaves the W-phase and enters the LC-phase to determine whether consensus can be reached.

In the permissioned setup according to the invention, one may safely assume that all organizations of the network are known at any time and that no organization can join the network during a consensus round. This allows using a lightweight voting algorithm for this purpose, instead of having to rely on more heavyweight consensus algorithms. To determine whether consensus was reached, a consensus policy c must be specified by all organizations in advance, during the bootstrapping process of the network. The constant c specifies how many organizations must have reached the same effect.

In the first step of the consensus algorithm, the individual organization has to count how often each LedgerBlock occurred in the network. To do so, it requests the so-called LedgerBlockHashes from all other organizations and counts the occurrences, including its own local LedgerBlockHash. This LedgerBlockHash is essentially a compressed form of the contents of a LedgerBlock. Consensus is reached if two conditions hold: (a) there must be a LedgerBlockHash that occurred at least c times; (b) this consensus LedgerBlockHash equals the local LedgerBlockHash. If both hold, the organization can append/commit its LedgerBlock to its local ledger. If an organization is non-consenting, it must enter recovery as described in connection with figure 7.
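The vote counting itself can be expressed as a simple SQL aggregation. A minimal sketch, assuming the hashes received from all organizations (plus the local one) have been collected in a hypothetical table ReceivedHashes(org, block_no, ledger_block_hash), with a consensus policy of c = 2:

-- a LedgerBlockHash reaches consensus if it occurs at least c times
SELECT ledger_block_hash, COUNT(*) AS votes
FROM ReceivedHashes
WHERE block_no = 46
GROUP BY ledger_block_hash
HAVING COUNT(*) >= 2;  -- consensus policy c = 2

If the returned consensus hash equals the local LedgerBlockHash, the organization appends its LedgerBlock to the local ledger; otherwise it enters recovery.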

For recovery, ChainifyDB implements the Optimized Partial Replay from a (Logical) Snapshot already described. Again, like the digests, this approach is DBMS-independent: access to the source code of the DBMS is not required.

Figure 9 shows the normal operation mode of a single organization that is consenting: a checkpoint is created by creating a snapshot of table Foo after every k blocks (k = 3 in the example).

Snapshots are created on the SQL level, either through a non-maintained materialized view or by a CREATE TABLE command. If the source code of the DBMS and the operating system were available, the snapshotting support of the operating system could be exploited at this point as well; however, the invention does not make this assumption in the present design. The snapshot is created for all tables that were changed since the last consistent snapshot. Creating such a snapshot is surprisingly fast: snapshotting the accounts table with 1,000,000 rows, which is part of the Smallbank benchmark used in the experimental evaluation, takes only 827 ms in total on a machine of type 2 running PostgreSQL. Evidently, there is no need to store this checkpoint in the ledger: as the information contained in a checkpoint can be fully recomputed from the ledger, it has to pass the LC-phase again in any case.
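Both snapshot variants mentioned above are plain SQL. A minimal sketch for table Foo after block 44; the snapshot names are illustrative and the materialized-view syntax is PostgreSQL-style:

-- variant 1: snapshot via a plain table copy
CREATE TABLE Foo_block_44 AS SELECT * FROM Foo;

-- variant 2: snapshot via a materialized view that is never refreshed,
-- so it stays frozen at the state of block 44
CREATE MATERIALIZED VIEW Foo_block_44_mv AS SELECT * FROM Foo;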

Figure 10 shows an organization switching to recovery mode. The figure must be read from top to bottom and from left to right, where numbers indicate the continuation of different lines. The organization is in normal (consenting) mode for all blocks shown up to and including block 45. For block 46, the organization is non-consenting. Hence, it must enter the recovery phase.

First, the table Foo has to be reset to the state of the latest consistent snapshot, taken at block 44. Then block 45, which is consenting, is replayed. Then block 46, which is unfortunately again non-consenting, is replayed. In this situation, it must be assumed that this local snapshot has an issue, i.e. that it was corrupted externally. Thus, the local table Foo is reset to the second-latest snapshot, taken at block 41. Now all blocks are replayed, starting from block 42 up to block 46. This time, block 46 is consenting. In the implementation, three committed snapshots are kept per organization. If replaying from all of these snapshots does not lead to a consenting organization, one can try to replay the entire history. If even this fails, one has to assume that the ordering service acts maliciously, and can try again after starting a fresh ordering service, possibly run by a different organization. The following sketch summarizes this procedure.
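In Go-style pseudocode, this recovery loop could be sketched as follows; ResetTo, ReplayBlock and Consenting are stubs standing in for assumed functionality and are not part of any actual API:

package chainifydb

// Stubs assumed for this sketch: resetting the affected tables to a
// snapshot, replaying one block, and running the consensus check.
func ResetTo(snapshotBlock int)  {}
func ReplayBlock(block int)      {}
func Consenting(block int) bool  { return true }

// RecoverFromSnapshots tries increasingly older snapshots (newest
// first), replays all blocks since the chosen snapshot, and checks
// whether the failed block is consenting afterwards. If no snapshot
// helps, the caller can fall back to replaying the entire history.
func RecoverFromSnapshots(snapshots []int, failedBlock int) bool {
	for _, snap := range snapshots {
		ResetTo(snap)
		for b := snap + 1; b <= failedBlock; b++ {
			ReplayBlock(b)
		}
		if Consenting(failedBlock) {
			return true // the organization is consenting again
		}
	}
	return false
}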

Evidently, there is no 100% guarantee that any of these measures will lead to a consenting organization. Severe problems such as a hardware error (in particular an error that is not detected and transparently fixed by the hardware or operating system itself) are out of reach for repair by ChainifyDB. The important point here is that the problem is detected early on (after every block of transactions, and not only eventually), that an attempt is made to "rehabilitate" the organization through recovery, and that it is reintegrated into the network.

Depending on the intended use, various optimizations may be applied to the described methods and systems.

First, there may be situations where a plain execution of arbitrary transactions against the local DBMSs of the individual organizations, without any restrictions, is undesired. For example, a transaction such as

INSERT INTO Trades (TID, product, amount, totalprice)
VALUES (42, 'Gearbox', 5, 60000);

is only meaningful if certain integrity constraints hold. The selling organization must have at least 5 gearboxes in stock and the buying organization must have funds of at least 60000. Instead of enforcing such integrity constraints locally, it is proposed to prepend an optional agreement subphase to the pipeline, through which any proposed transaction must go first. Only if all involved organizations agree to the proposed transaction may the transaction enter the subsequent order phase.

In order to enable this optional agreement phase, two steps must be carried out by the organizations: First, an agreement policy must be installed in consent when creating the shared table. It specifies which organizations have to agree upon a transaction operating on that table. For the trading example, the policy of the table Trades could look as follows, enforcing both involved organizations to decide for agreement:

AgreementPolicy(Trades) = { SellingOrganization, BuyingOrganization }

Second, each organization has to implement its individual integrity constraints, which are evaluated against each proposed transaction. The skilled person will understand that these constraints could be formulated to compare the transaction against local data. For example, the agreements for the selling organization and the buying organization could look as follows:

SellingOrganization.agree(T) =
{ return T.amount <= Stocks.amount WHERE T.product == Stocks.product }

BuyingOrganization.agree(T) =
{ return T.totalprice <= Fund.availableMoney }

Only if both organizations agree to a transaction operating on the Trades table is it passed on for further processing.
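For illustration, the selling organization's agreement check could be realized in Go against its local database instance as sketched below; the Trade type, the table layout, and the PostgreSQL-style placeholder are assumptions of this sketch:

package chainifydb

import "database/sql"

// Trade mirrors the relevant values of the proposed transaction; this
// layout is an assumption made for the sketch.
type Trade struct {
	Product    string
	Amount     int
	TotalPrice int
}

// SellingOrgAgree evaluates the local integrity constraint from above:
// agree only if at least the requested amount is in stock.
func SellingOrgAgree(db *sql.DB, t Trade) (bool, error) {
	var stock int
	err := db.QueryRow(
		"SELECT amount FROM Stocks WHERE product = $1", t.Product,
	).Scan(&stock)
	if err != nil {
		return false, err
	}
	return t.Amount <= stock, nil
}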

Figure 11 further illustrates how an agreement can be considered a different WLC iteration, where the participants agree upon the string describing the SQL transaction to be executed, rather than upon the outcome of running that transaction. In WLC 1, the organizations must first agree upon a trade to be done (represented as an SQL transaction). If the organizations agree, an SQL string is inserted into a table PlannedTrades:

INSERT INTO PlannedTrades (TID, SQL_string)
VALUES (42, 'UPDATE line SET ...');

If there is consensus that the tuple was inserted, i.e. that this trade should be done, the corresponding SQL string is sent to WLC 2, which executes the SQL string. In summary, mapping subphases to incremental rounds of the WLC model enables interesting abstractions.

Moreover, executing all valid transactions of a block in a sequential fashion wastes performance if the underlying system is able to execute transactions in parallel. One could simply submit all valid transactions of a block to the underlying (potentially parallel) database system in one batch and let it execute them concurrently. However, while this strategy leverages the performance of the underlying system, it is very likely that every DBMS schedules the same batch of transactions differently for parallel execution. Consequently, the commit order of the transactions likely differs across organizations, thus increasing the likelihood of non-consent. When a block of transactions is received by the execute-subphase, it is therefore more efficient to first identify all existing conflict dependencies between transactions. This allows forming mini-batches of transactions that can be executed safely in parallel, as they have no conflicting dependencies. This process can be decomposed into three phases:

First, for a block of transactions, a semantic analysis of each transaction is carried out. Effectively, this means parsing the SQL statements and extracting the read and write set of each transaction. These read and write sets are realized as intervals on the accessed columns to support capturing both point-query and range-query accesses. E.g., for the following two single-statement transactions:

T1: UPDATE Foo SET A = 5 WHERE PK = 100;
T2: UPDATE Foo SET A = 8 WHERE PK > 42;

the extracted intervals are:

T1: A is updated where PK is in [100, 100]
T2: A is updated where PK is in [42, infinity]

Second, a dependency graph for the block can then be created with the obtained intervals. For two transactions having a read-write, write-write, or write-read conflict, a corresponding edge is added to the graph. As transactions are inserted into the dependency graph in the execution order given by the block, no cycles can occur in the graph.

Extending the example from the semantic analysis phase and assuming that T1 has been added to the dependency graph already, it can be determined by inspecting T2 that PK [42, infinity] overlaps with PK [100, 100] of T1. As T2 is an update transaction, there is a conflict between T2 and T1, and a dependency edge from T1 to T2 is added. Figure 9 presents an example dependency graph for 9 transactions.

Third, in order to start executing the transactions in parallel, topological sorting is performed, i.e. traversing the execution stages of the graph that are implicitly given by the dependencies.

Figure 11 shows a topological sort of the dependency graph with k = 9 transactions, yielding four execution stages. Within one stage, all transactions can be executed in parallel, as no dependencies exist between those transactions. The actual parallel execution on the underlying database system is realized using k client connections to the DBMS. To execute the transactions within an execution stage in parallel, the k clients pick transactions from the stage and submit them to the underlying system. As the presented method is conflict-free, it guarantees the generation of serializable schedules. Therefore, concurrency control on the underlying system can essentially be switched off, which can be done by setting the isolation level of the underlying system to READ UNCOMMITTED in order to get the best performance out of the DBMS.
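The following simplified Go sketch combines the three phases for write-write conflicts on a single key column: the per-transaction intervals are assumed as given, overlapping intervals yield dependency edges, and each transaction is assigned the earliest stage after all conflicting predecessors, which corresponds to the topological sort described above. The real analysis additionally distinguishes read and write sets.

package chainifydb

import "math"

// Interval is the closed PK range written by a transaction, e.g.
// [100, 100] for "PK = 100" and [42, +inf] for "PK > 42" (simplified).
type Interval struct{ Lo, Hi float64 }

func overlaps(a, b Interval) bool { return a.Lo <= b.Hi && b.Lo <= a.Hi }

// Stages assigns each transaction of a block (given in block order) an
// execution stage: one stage after its latest conflicting predecessor.
// Transactions sharing a stage have no dependencies between each other
// and can safely run in parallel.
func Stages(writes []Interval) []int {
	stage := make([]int, len(writes))
	for i := range writes {
		for j := 0; j < i; j++ { // edges only go from earlier to later
			if overlaps(writes[i], writes[j]) && stage[j]+1 > stage[i] {
				stage[i] = stage[j] + 1
			}
		}
	}
	return stage
}

// Example from above: T1 writes PK [100,100], T2 writes PK [42,+inf].
// Stages yields [0 1]: T2 conflicts with T1 and runs one stage later.
func ExampleStages() []int {
	return Stages([]Interval{{100, 100}, {42, math.Inf(1)}})
}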

An exemplary system architecture of a ChainifyDB network according to the present embodiment of the invention comprises n (e.g. 3) participating organizations, an untrusted OrderingServer, and a set of Kafka nodes that act as a message broker to distribute the blocks created by the OrderingServer. An exemplary implementation of the ChainifyDB system was achieved using Golang and C++ as primary languages in just under 11,000 LOC, excluding the dependencies and the generated code. Unless stated otherwise, the SHA256 hashing algorithm from the crypto package and the Sign and Verify algorithms from the elliptic package of Golang 1.10.4 were used. All messages exchanged between different components of the ChainifyDB network were signed to preserve the integrity of the messages flowing in the system. Unless stated otherwise, RPC calls were used to communicate between different components of the system.
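In current Go versions, these Sign and Verify primitives live in the crypto/ecdsa package, which builds on crypto/elliptic. A minimal sketch of how component messages could be signed and verified looks as follows; the curve choice and the message layout are assumptions of this sketch:

package chainifydb

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/sha256"
	"math/big"
)

// NewKey generates an ECDSA key over the P-256 curve from Go's
// elliptic package.
func NewKey() (*ecdsa.PrivateKey, error) {
	return ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
}

// SignMessage hashes a message with SHA256 and signs the digest.
func SignMessage(priv *ecdsa.PrivateKey, msg []byte) (r, s *big.Int, err error) {
	digest := sha256.Sum256(msg)
	return ecdsa.Sign(rand.Reader, priv, digest[:])
}

// VerifyMessage checks a signature produced by SignMessage.
func VerifyMessage(pub *ecdsa.PublicKey, msg []byte, r, s *big.Int) bool {
	digest := sha256.Sum256(msg)
	return ecdsa.Verify(pub, digest[:], r, s)
}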

Each participating organization runs one or more instances of the ChainifyServer, AgreementServer, ExecutionServer, and ConsensusServer.

The ChainifyServer receives the signed transaction in the form of a proposal from the client. Each ChainifyServer stores the policy that defines the set of organizations that must agree to this transaction. Using this policy, the ChainifyServer communicates with the AgreementServers of the responsible organizations to collect all the agreements required for this transaction. It then packs the original proposal and all the agreements into a ChainedTransaction. If all agreements are valid, it forwards the ChainedTransaction to the OrderingServer. Each organization can run multiple instances of the ChainifyServer to scale the number of client requests handled by this organization linearly.

The AgreementServer component is responsible for validating the local integrity constraints of the organization. It receives the transaction sent by the ChainifyServer via an RPC call, decides whether this transaction passes the local integrity check, and responds with a signed yes or no. The organization can run multiple instances of the AgreementServer to scale the agreement phase linearly as well.

The OrderingServer component receives the ChainedTransactions from the ChainifyServers and packs them into a block in FIFO order. If the block has reached a sufficient size or a certain amount of time has elapsed, the block is cut and forwarded to Apache Kafka. The Kafka service then delivers the block to all ExecutionServers. This component can also be scaled linearly to accommodate a high frequency of incoming chained transactions.
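This size-or-time block-cutting behavior can be sketched in Go as follows; the channel-based interface and the time.After-based timeout are simplifying assumptions of this sketch, not the actual OrderingServer code:

package chainifydb

import "time"

// CutBlocks packs incoming ChainedTransactions (opaque bytes here)
// into blocks in FIFO order. A block is cut when it reaches maxSize or
// when no transaction arrives within the timeout.
func CutBlocks(in <-chan []byte, out chan<- [][]byte, maxSize int, timeout time.Duration) {
	var block [][]byte
	for {
		select {
		case tx, ok := <-in:
			if !ok { // input closed: flush the remainder and stop
				if len(block) > 0 {
					out <- block
				}
				close(out)
				return
			}
			block = append(block, tx)
			if len(block) >= maxSize {
				out <- block
				block = nil
			}
		case <-time.After(timeout):
			if len(block) > 0 { // time elapsed: cut whatever has queued
				out <- block
				block = nil
			}
		}
	}
}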

The ExecutionServer component fetches the blocks created by the OrderingServer and executes them in order on the underlying database system. The ExecutionServer first verifies all the signatures to make sure that no one in the transaction pipeline has tampered with the transaction. If the signatures and the agreements are valid, the transaction is marked as valid, and as invalid otherwise.

The ExecutionServer then forwards the block to the GraphGenerator, implemented in C++, which generates an efficient execution graph for this block as described above. After receiving the optimized execution graph, it is executed on the database system in parallel. During the execution, ChainifyDB collects the digest of the execution using SQL-99 triggers. After the execution of all transactions in the block, the ExecutionServer generates the LedgerBlock and the corresponding LedgerBlockHash as described above, and forwards the LedgerBlockHash to the ConsensusServer for the consensus round. The organization can run multiple instances of this component to efficiently replicate the effects of the transactions on different database systems inside the organization.

The ConsensusServer component receives the LedgerBlock and the LedgerBlockHash from the ExecutionServer. It then communicates with the ConsensusServers of the other organizations to check whether a particular LedgerBlockHash reaches consensus. On reaching consensus, the ConsensusServer appends the corresponding LedgerBlock to the ledger. Otherwise, the organization performs recovery.

In summary, all ChainifyDB components up to the ConsensusServer fall into the W-phase, while the ConsensusServer itself falls into the LC-phase. Purely following the WLC model, the transactions of the block could be excluded from the consensus round and from the ledger, ignoring how an organization reaches a state. However, keeping the information about the transactions per block enables a more sophisticated recovery phase.

Figure 13 presents another visualization of the round-based processing of the WLC model from the perspective of round t. The W-phase is a complete blackbox and is implemented by the individual organizations.

Round-based processing: Let O1, ..., Ok be k independent organizations in the network, which do not fully trust each other. In the case of three participants, k = 3. Despite their independence and potential lack of trust, they have a mutual interest in performing some kind of round-based processing, e.g. processing one transaction per round.

W-phase: By default, the model treats the W-phase as a complete blackbox and makes no assumptions on its internal behavior. In fact, each organization is even allowed to implement its W-phase differently. The only requirement is that, as a side-product of executing the W-phase Wi of organization Oi in round t, a so-called effect Ei,t must be produced. The idea of this effect is to externally represent a digest of the processing that just happened in the W-phase.

Algorithm 1 in figure 14 shows how flexible the W-phase is. First, a proposed transaction is picked from the pool of proposed transactions. This could be done by one of the participants, who then notifies the others about the picked transaction. Second, the picked transaction is executed locally by the DBMS of every participant. The effect captures the changes done by the transaction to the database.

LC-phase: As a result of the W-phase of round t, the k organizations produce their k effects E1,t to Ek,t. These are passed to the LC-phase to check whether a consensus can be reached on them. A lightweight voting algorithm may be used instead of having to rely on more heavyweight consensus algorithms, since all organizations of the network are known at any time and no organization can join the network during a round in the inventive setup. In addition to the effects, a consensus policy P = {C1, C2, ...} is passed to the LC-phase.

A combination Ci = {p1, ..., pl} with l ≤ k and pi ≠ pj specifies which organizations are expected to reach the same effect. If the effects are the same for at least one of the combinations {C1, C2, ...}, then consensus is reached. For example, P = {{1, 2}, {1, 3}} specifies that either the organizations O1 and O2 or the organizations O1 and O3 must have the same effect to reach consensus. In the case where all three participants have to reach consensus, the consensus policy would be P = {{1, 2, 3}}.
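Evaluating such a policy can be sketched in Go as follows, with effects compared via fixed-size hashes and organizations addressed by index; both representations are assumptions of this sketch:

package chainifydb

// PolicyConsensus tests the combinations of a consensus policy
// P = {C1, C2, ...} one by one: consensus is reached as soon as all
// organizations of one combination produced the same effect, and that
// effect is returned as the consensus effect Econs,t of the round.
func PolicyConsensus(effects map[int][32]byte, policy [][]int) ([32]byte, bool) {
	for _, combination := range policy {
		candidate := effects[combination[0]]
		same := true
		for _, org := range combination[1:] {
			if effects[org] != candidate {
				same = false
				break
			}
		}
		if same {
			return candidate, true
		}
	}
	return [32]byte{}, false // no combination matched: no one proceeds
}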

Figure 15 shows the logic of the LC-phase in algorithm 2. It essentially tests the passed combinations one by one and tries to reach consensus on one combination. If no consensus is reached, then no one is allowed to start the next round. In such a case, all organizations can try to recover.

If consensus on an effect is reached, it is returned as Econs,t. Now, each organization whose effect equals the consensus effect is allowed to proceed to the next round. Formally, each organization Oi has to check whether Ei,t = Econs,t to be allowed to proceed to round t + 1. If the locally computed effect differs from the consensus effect, then the organization must not proceed. In such a case, the organization can likewise try to recover.

Figure 16 shows checkpoint-based recovery of organization Ok. As described, the inventive model allows treating the W-phase as a complete blackbox. This is possible as the proceeding is only determined by whether a consensus is reached on the produced effects of the W-phase, and not by its internal behavior. However, if no information about the internal behavior of the W-phase is available, there are only limited options for recovering from a situation where the system as a whole or an individual organization cannot proceed to the next round. In order to address this problem, the W-phase is optionally allowed to expose information about its internal behavior, in order to enable a more sophisticated recovery.

More particularly, it is known in the present case that the W-phase maintains and modifies a local database. Thus, the database captures the current state Si,t of organization Oi after round t. Having access to the state enables recovery via checkpointing, as visualized in figure 16: it is assumed that consensus on the effect Econs,t was reached, but the effect Ek,t of organization Ok is different. If organization Ok creates a checkpoint every second round in the form of a snapshot of the state, it can try to recover by replaying the two intermediate transactions on the most recent checkpoint before t, namely Sk,t-2. This leads to a potentially new state S'k,t with a corresponding new effect E'k,t. If now E'k,t = Econs,t, recovery was successful.

Figure 17 shows a concrete implementation of the W-phase in pseudo-code. While the order-subphase essentially only comprises receiving the next TransactionBlock, the task of the execute-subphase is to execute the valid and agreed transactions and to build the LedgerBlock as a side-product.

Figure 18 shows a running example of how a transaction flows through a concrete ChainifyDB system architecture according to the present embodiment of the invention.

The concrete system architecture comprises three loosely coupled components: ChainifyServer, ExecutionServer, and CommitServer. A single organization of a ChainifyDB network is allowed to run as many instances of these components as it wants, as they can be scaled in order to sustain the incoming workload. The ChainifyServer component is the entry point of a transaction proposal into the ChainifyDB network. It is responsible for authenticating the client, validating the integrity of the transaction proposal, and creating a so-called ChainedTransaction, which wraps the transaction proposal and transaction-specific metadata. Further, the ChainifyServer is able to perform an optional agreement round to filter out proposed transactions early on.

The ChainifyServer transmits the ChainedTransaction to the OrderingServer component (W-phase). The OrderingServer then forms a TransactionBlock from a batch of received ChainedTransactions and forwards it to every ExecutionServer in the network. Again, the OrderingServer in ChainifyDB is entirely untrusted. If the OrderingServer is malicious in any way (e.g. it sends different blocks to different ExecutionServers), the organizations will detect that in the LC-phase of the CommitServer and elect a new OrderingServer.

Once the ExecutionServer component (W-phase) receives the TransactionBlock from the OrderingServer, it computes an execution graph for the block using transaction dependency analysis. The ExecutionServer then executes the graph with high parallelism and builds the LedgerBlock as effect. It finally forwards the LedgerBlock to the local CommitServer.

The CommitServer (LC-phase) then runs the voting-based consensus algorithm on the LedgerBlock to verify whether consensus can be reached with respect to the consensus policy. If consensus is reached, the LedgerBlock is appended to the ledger. If not, this node must recover.

In step 1, the client creates a Proposal from an SQL transaction. This Proposal contains the original transaction as an SQL string along with metadata, such as the client ID and a signature of the transaction. The client then sends the Proposal to the ChainifyServer of organization O1.

In step 2, the ChainifyServer performs the previously mentioned optional agreement round to filter out all Proposals which will not reach consensus as early as possible. To do so, the ChainifyServer of O1 first accesses the authentication policy of the client, the public key of the client, and the consensus policy. The consensus policy is P = {{1, 2, 3}} in the example, so all three organizations have to agree on the proposal. If the ChainifyServer of O1 successfully authenticates the client and the verification of the Proposal is successful, it forwards the Proposal to O2 and O3 as specified in the consensus policy. Every ChainifyServer in the network now tests for agreement individually and prepares a signed AgreementResponse, which contains whether the organization agrees as well as the proposal itself. This ensures that a malicious ChainifyServer cannot use an AgreementResponse of another Proposal as an AgreementResponse for this Proposal. The ChainifyServers of O2 and O3 then send their AgreementResponses to the ChainifyServer of O1.

In step 3, as all three organizations agreed upon the Proposal, the ChainifyServer of O1 creates a ChainedTransaction from the original Proposal and the three AgreementResponses, and sends it to the OrderingServer, which queues this ChainedTransaction.

In step 4, when a certain number of ChainedTransactions are queued, the OrderingServer produces a TransactionBlock from these ChainedTransactions. The dispatch service of the OrderingServer then forwards the TransactionBlock to every ExecutionServer in the network.

In step 5, the ExecutionServer of each organization computes the near-optimal execution graph and executes the ChainedTransactions in the TransactionBlock in parallel. As a side-product, the ExecutionServer computes the LedgerBlock as effect.

In step 6, the ExecutionServer forwards the LedgerBlock to the local CommitServer.

In step 7, the CommitServers perform the consensus round according to the consensus policy. This involves all three CommitServers.

In step 8, once they reach consensus, each CommitServer appends the corresponding LedgerBlock to its ledger.

To evaluate ChainifyDB experimentally, the following system setup and workload are used. Type 1 (small): two quad-core Intel Xeon CPU E5-2407 running at 2.2 GHz, equipped with 48GB of DDR3 RAM. Type 2 (large): two hexa-core Intel Xeon CPU X5690 running at 3.47 GHz, equipped with 192GB of DDR3 RAM. Unless stated otherwise, a heterogeneous network consisting of three independent organizations O1, O2, and O3 is used. Organization O1 owns two machines of type 1, where PostgreSQL 11.2 is running on one of these machines. Organization O2 owns two machines of type 1 as well, but MySQL 8.0.18 is running on one of them. Finally, organization O3 owns two machines of type 2, where again PostgreSQL is running on one of the machines. The individual components of ChainifyDB are automatically distributed across the two machines of each organization. Additionally, there is a dedicated machine of type 2, which represents a client firing transactions at ChainifyDB. In addition, a type 2 machine solely runs the ordering service. Two different options are configured as consensus policy: in the first option, Any-2, c = 2 is set such that at least two out of three organizations have to produce the same effect to reach consensus. In the second option, All-3, c = 3 is set and consensus is reached only if all three organizations produce the same effect. In any case, all three organizations have to agree to every transaction. Besides, empirical evaluation revealed a block size of 4096 transactions to be a good fit. Parallel transaction execution is activated. As workload, transactions from Smallbank are used, which simulate a typical asset transfer scenario. To bootstrap the test, a checking account and a savings account are created for each of 100,000 users and initialized with random balances. The workload comprises the following four transactions: TransactSavings and DepositChecking increase the savings account and the checking account, respectively, by a certain amount. SendPayment transfers money between two given checking accounts. WriteCheck decreases a checking account by a certain amount. During a single run, these four transactions are repeatedly fired at a fire rate of 4096 transactions per client, where one of the transactions is uniformly picked at random. For each picked transaction, the accounts to access are determined based on a Zipfian distribution with an s-value of 1.1 and a v-value of 1, unless stated otherwise.
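For illustration, this skewed account selection maps directly onto Go's rand.Zipf generator, whose parameters s, v and imax match the values given above; the transaction names are reproduced from the workload description, everything else is an assumption of this sketch:

package chainifydb

import "math/rand"

// SmallbankPicker returns a generator that uniformly picks one of the
// four Smallbank transactions and draws the accessed account from a
// Zipfian distribution with s = 1.1 and v = 1 over 100,000 accounts,
// mirroring the experimental setup described above.
func SmallbankPicker(seed int64) func() (tx string, account uint64) {
	r := rand.New(rand.NewSource(seed))
	zipf := rand.NewZipf(r, 1.1, 1, 100000-1) // account IDs 0..99,999
	mix := []string{"TransactSavings", "DepositChecking", "SendPayment", "WriteCheck"}
	return func() (string, uint64) {
		return mix[r.Intn(len(mix))], zipf.Uint64()
	}
}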

The experimental evaluation of ChainifyDB starts by inspecting the throughput of successful transactions that make it through the system. The throughput of ChainifyDB is first inspected in a heterogeneous setup under the two different consensus policies Any-2 and All-3. Additionally, the following two PBS baselines are shown: (a) vanilla Fabric v1.2, probably the most prominent PBS system currently; (b) Fabric++, an improved version of Fabric v1.2. Both Fabric and Fabric++ are also distributed across the same set of machines, and the block size is set to 1024.

Figure 19(a) shows the throughput of successful transactions for the heterogeneous setup. On the x-axis, the number of clients firing transactions concurrently is varied from three clients to 24 clients. On the y-axis, the average throughput of successful transactions is shown, excluding a ramp-up phase of the first five seconds. The Any-2 strategy shows a significantly higher throughput than Fabric++ with up to almost 5000 transactions per second. In comparison, Fabric++ achieves only around 1000 transactions per second, although it makes considerably more assumptions than the inventive system: first, it assumes the ordering service to be trustworthy; second, it assumes deterministic execution and therefore does not perform any consensus round on the state. In addition, there is a large performance gap between the Any-2 and the All-3 strategy. The reason for this lies in the heterogeneous setup used. The two organizations running PostgreSQL are able to process the workload significantly faster than the single organization running MySQL. Thus, under the Any-2 strategy, the two organizations using PostgreSQL are able to lead the progress without having to wait for the significantly slower third organization.

Figure 19(b) shows the throughput of standalone MySQL and PostgreSQL for a varying number of clients. The same workload as in figure 19(a) is fired, this time using OLTP-Bench, which follows a uniform distribution. Under the All-3 strategy, the progress is throttled by the slowest organization, running MySQL. The difference in processing speed also becomes visible when inspecting the throughput of the stand-alone single-instance database systems in figure 19(b) under the same workload. Both systems are configured with a buffer size of 2GB to keep the working set in main memory. Apparently, PostgreSQL significantly outperforms MySQL under this workload, independent of the number of clients.

By comparing figure 19(a) and figure 19(b) side by side, one may also see that ChainifyDB introduces only negligible overhead over the raw database systems. In fact, for 3, 6, and 12 clients, ChainifyDB under the Any-2 policy actually produces a slightly higher throughput than raw PostgreSQL. The reasons for this lie in the optimized parallel transaction execution, which exploits the batch-wise inflow of transactions and executes the transactions at the lowest possible isolation level.

Table 2 shows the average throughput of successful transactions for ChainifyDB (Any-2) under Smallbank following a Zipf distribution and a uniform distribution:

Table 2

More particularly, the throughput under a uniform distribution, with up to 6144 transactions per second, is even higher than under the skewed Zipf distribution, as it allows for a higher degree of parallelism during execution due to fewer conflicts between transactions.

Apart from the transaction processing performance, the robustness and recovery capabilities are crucial properties of the inventive methods and systems as well. To put these capabilities to the test, in the following experiment the network is disturbed in two different ways: first, the database of one organization is forcefully corrupted, in order to see whether the network is able to detect this and recover from it. Afterwards, one organization is brought down entirely, in order to see whether the network is able to continue progressing. The experiment is set up as follows: in the first phase, the network is sustained with transactions of the Smallbank workload. These do not cause the organizations to deviate in any way. Consequently, this phase essentially resembles the standard processing situation. Then, after a certain amount of time, an update is manually injected into a table of organization O1 in order to see how fast O1 is able to recover from the deviation. This update is not performed through a transaction, but externally, by directly modifying the table in the database. Finally, a complete failure of one organization is simulated by removing it from the network. The remaining two organizations then have to reach consensus to be able to progress under the Any-2 policy.

Figure 20(a) shows robustness and recovery of the inventive system under the Any-2 consensus policy in a typical heterogeneous setup. Plotted on the x-axis is the time of commit for each block. On the y-axis, the corresponding block IDs are plotted. Every five committed blocks, each organization creates a local checkpoint. The diagram shows that the organizations O1 and O3, which run PostgreSQL, progress much faster than organization O2 running MySQL. Shortly after the update has been applied to O1, it detects the deviation in the consensus round and starts recovery from the most recent checkpoint. Interestingly, this also stops the progression of organization O3, as O3 is not able to reach consensus anymore according to the Any-2 policy: O1 is busy with recovery and O2 is too far behind. As soon as O1 recovers, which takes around 17 seconds, O3 also restarts progressing, as consensus can be reached again. Both O1 and O3 progress until O3 is caused to fail. Now O1 cannot progress anymore, as O3 is not reachable and O2 is still too far behind due to its slow underlying database system. Thus, O1 halts until O2 has caught up. As soon as this is the case, both O1 and O2 continue progressing at the speed of the slower organization, namely O2.

Figure 20(b) shows robustness and recovery of the inventive system under the Any-2 consensus policy in a homogeneous setup, where all organizations run PostgreSQL. Thus, this time there is no drastically slower organization throttling the network. Again, at a certain point in time, the database of organization O1 is externally corrupted by performing an update, and O1 starts to recover from the most recent checkpoint. In contrast to the previous experiment, this time the recovery of O1 does not negatively influence any other organization: O2 and O3 can still reach consensus under the Any-2 policy and continue progressing, as neither of the two is behind the other. Recovery itself takes only around 4 seconds this time, and in this case another organization is ready to perform a consensus round right after recovery. When organization O3 fails, O2 has to halt processing for a short amount of time, as organization O1 has to catch up.

In summary, these experiments show (a) that state deviation can be detected and recovered from, (b) that the network can progress in the presence of organization failures, (c) that all organizations respect the consensus policy at all times, and (d) that recovery neither penalizes the individual organizations nor the entire network significantly.

Figure 21 shows a breakdown of the cost of all cryptographic computation, such as signing and validation, that happens at several stages in the pipeline. The overhead caused by cryptographic computation is surprisingly small. Under the Any-2 policy, turning on all cryptographic components decreases the throughput by only 7% for parallel execution. Under the All-3 policy, the decrease is only 8.5%. While the cryptographic components of the inventive method thus have only a small negative effect, parallel transaction execution improves the throughput by up to 5 times.

Figure 22 shows the results of varying the block size, which is an important configuration parameter in any blockchain system. The block size is varied from 256 up to 4096 transactions per block, in logarithmic steps, and the average throughput of successful transactions is reported. Under both the Any-2 and the All-3 policy, the throughput increases with the block size. This increase is mainly caused by the parallel transaction execution mechanism, which analyzes the whole block of transactions and schedules them for parallel conflict-free execution. In summary, ChainifyDB provides a powerful set of features, outlined in the following list of contributions:

(1) Synchronization of local data between untrusted organizations without an intermediary. ChainifyDB keeps local databases, which are maintained by individual organizations, synchronized. Although these organizations potentially do not trust each other, ChainifyDB ensures that transaction processing is respected equally across all databases in the network. ChainifyDB achieves this without introducing an intermediary. For example, if Dr. Bob adds a prescription of medicine M1 to the patient file of Alice, ChainifyDB guarantees that the prescription is reflected in exactly the same way in Alice's local patient files at Dr. Carol and Dr. Dave.

(2) Agreement on transactions. ChainifyDB processes a transaction only if an agreement of all involved organizations is reached. Each organization can individually specify constraints under which a transaction is agreeable. In the example, the cardiologist could specify the constraint that only he is allowed to prescribe heart-related medicine. If one of the other doctors were to propose prescribing such a heart-related medicine, the corresponding transaction would not reach agreement and the patient file would remain unchanged.

(3) Seamless and minimally invasive integration. Each organization is able to install ChainifyDB on top of its established data management infrastructure as a separate layer, without modifying the underlying data management software in any way. Together, the organizations then form a network. Assuming that the patient file management system of the doctors comprises a frontend, which communicates with some kind of DBMS backend, the ChainifyDB layer would be installed right in between.

(4) Convenience and accessibility. ChainifyDB reuses the already existing access method and query language of the underlying data management system. In this way, the well-known way of operating on the data can be kept. For example, if the backend is a relational DBMS, which is queried by the frontend via SQL, then ChainifyDB communicates via SQL as well.

(5) Robustness via recovery. ChainifyDB provides mechanisms that allow an organization to recover after a deviation has happened. Such a deviation can have various causes, including data corruption and malicious behavior. In the example, a doctor could accidentally lose or corrupt Alice's patient record because his hard disk is damaged. ChainifyDB allows him to recover from this scenario by restoring the data from a local checkpoint.

Most notably, the previously mentioned features are not realizable with the existing processing model of permissioned blockchain systems. A solution is provided by the new and highly flexible processing model powering ChainifyDB:

(6) Whatever-LedgerConsensus model (WLC). Instead of reaching consensus on the order of transactions before the execution, the WLC model reaches consensus on the effects generated by the execution. This processing model allows providing the desired features of ChainifyDB in heterogeneous setups consisting of multiple different DBMSs. Further, it allows handling access bypassing and enables recovery from the deviation of an organization.

(7) Extensive experimental evaluation of ChainifyDB. In comparison with the comparable state-of-the-art permissioned blockchain systems Fabric and Fabric++, ChainifyDB achieves an up to 6x higher throughput. Further, the inventors have shown that ChainifyDB is able to fully utilize the performance of the underlying database systems, and have demonstrated its recovery capabilities experimentally.