Managing Multi-Party Contracts on the Blockchain: The Curious Case of Batching— Part I

Lessons learnt, pitfalls, and gotchas… and a valuable exercise in system design.

Christian Vecchiola
11 min read · Sep 20, 2021

Recently, I had the opportunity to contribute to the design and implementation of a rather complex solution, which manages business contracts by leveraging distributed ledger technologies, in particular Hyperledger Fabric. What seemed to be a simple engineering exercise in extending the existing functionalities to "execute many at once" turned out to be a rather challenging task from the perspective of correctness, system integration, and usability. Here is the story.

“There is an essential complexity in software systems that acts very much like energy, you cannot increase or reduce it, rather you can only mutate it into different forms.” — Myself

Hello Reader! This is the first part of a series of articles that will discuss and investigate this interesting problem. Part I introduces the business context and some of the interesting challenges that batching brings. The next article will dive into the technical details and challenges of implementing batching for a multi-party process on systems that are transactional in nature. Buckle up and let's get started!

The Business Context

The platform we were developing provides organisations with the capability of managing contracts, and relies on distributed ledger technologies to keep a full audit of the operations performed on contracts and to execute some of the business logic.

The particular nature of the contract is not relevant to the discussion; we can assume that:

  • a contract involves agreement from multiple parties (organisations);
  • a contract is created, cancelled, amended, and may expire;
  • every operation performed on the contract is subject to the approval of the engaged parties (organisations) and each party may also have an internal approval process.

As a result, the execution of a single business function — cancellation for instance — is already a complex multi-step and multi-party process.

The overall implementation comprises a considerable portion of "off-chain" logic, which is as important as the blockchain component and contributes to the complexity of the problem. Figure 1 provides an overview of the process implemented to execute one operation and the interaction between on-ledger and off-ledger components.

Figure 1 — Operations are executed by means of requests, whose workflow tracks the contribution (and decision) of each party. If the workflow is successful, the associated operation on the contract is executed.

The figure shows an important design decision we have adopted. Since anything that happens to a contract is executed through a request, requests become "first-class citizens" in the system. This means that we manage requests through their own life-cycle, and we identify and distinguish them from the contract they operate on. As a result, they are created, stored, and managed in the system as entities of their own, since they embody most (if not all) of the business processes related to a contract.

The advantage of recording requests and their life-cycle (and not only the result of their approval on the contract) is the ability to keep a much finer and more granular transaction log of what happened. We not only have a full record of the requests that, once successfully approved, modify or create a contract, but also a full record of those requests that were not successful and, more importantly, why. These insights are rather valuable, since they provide the context and the details that bound parties into a contract, or that modified the instrument that binds them and records their commitments. The context and details that led to actions are often as important as the actions themselves. Hence, the practice of recording requests is quite common in some application domains, if not required.
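To make this more concrete, here is a minimal sketch, in Go, of what modelling requests as first-class entities could look like. The names (Request, Approval, RequestStatus) are hypothetical and not the platform's actual model; the point is that each request carries its own identifier, status, and the decisions cast by the parties, so even rejected requests retain their full context:

```go
// Package requests sketches a request modelled as a first-class entity.
package requests

import "time"

// RequestStatus tracks where a request sits in its life-cycle.
type RequestStatus string

const (
	RequestPending  RequestStatus = "PENDING"
	RequestApproved RequestStatus = "APPROVED"
	RequestRejected RequestStatus = "REJECTED"
)

// Approval records one party's decision, so unsuccessful requests
// retain the context of who rejected them and why.
type Approval struct {
	Party     string
	Approved  bool
	Reason    string
	DecidedAt time.Time
}

// Request is stored and audited as an entity of its own, identified
// separately from the contract it operates on.
type Request struct {
	ID         string
	ContractID string
	Operation  string // e.g. "CREATE", "AMEND", "CANCEL"
	Status     RequestStatus
	Approvals  []Approval
}
```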

The management of contracts, their provenance, and the detailed auditing of any decision that changed such contracts form the core capability of the platform and the starting point of our implementation.

Challenges in Bringing the "Old World" into the New Platform

Until now, we have a platform that digitises the life-cycle of multi-party contracts, ensures their integrity, records their provenance, and provides us with a fully accountable audit of the events that determined their evolution. But we also have a self-contained system that can only operate on those contracts that are native to it. While this allows us to go forward and use the platform for any new contract being created, it proves to be a strong limitation for existing contracts, which are unable to make use of the platform's services. …and guess what? There are plenty of them.

Organisations that engage with the platform may have tens, hundreds, if not thousands of contracts that would need to be migrated onto the platform. Some of these contracts may run over a long term. Hence, the option of waiting for their termination is not viable, nor is the option of terminating the existing ones and recreating them in the new digital form advisable.

Therefore, marrying the old world with the new technology calls for designing and implementing a capability for migrating existing contracts onto the platform in large numbers, with some sort of bulk operation. One thing to observe is that there are two competing forces at play here: on one side, we want to retain the properties of integrity and full auditability of the contracts and the operations performed on them; on the other side, we want to develop a functionality that is meant to be effective for large volumes, which somewhat suggests mechanisms of aggregation and a less granular approach.

Let's quickly review some of the key requirements for this functionality and how these requirements impact our decision making.

Business Requirements

We have two main objectives from a functionality standpoint.

Pump Up the Volume. We need to provide the ability to process large quantities of items: we could still decide on the size of the batch to process, but it is reasonable to think that the system would need to cater for batches at least one or two orders of magnitude larger than a single operation. As the volume increases, we also want to retain a certain level of responsiveness, even though there is an understanding that bulk operations are not meant to be as responsive as other functions and will most likely be executed in the background. This means that our design should not get in the way of, or affect the performance of, other functions that are meant to be more responsive.

All the Bells and Whistles. The initial thought of having a simple, "back-office like" operation to import existing contracts quickly faded away. The project stakeholders recognised that there was more to it than "we want to be able to have these contracts on the platform, let's just put them there." Something originally conceived as a self-contained capability to be carried out as a background job became a more sophisticated integration exercise, as end users needed to receive notifications about such events, retain a proper audit, and participate in the process to ensure the correctness of the information being entered into the system. As a result, importing a contract became more and more similar to the other functions already supported, and it was clear that the granularity of the operation would need to cater for a single contract.

System Design Considerations

The problem is rather interesting and raises some observations on system design that are worth sharing.

On Coherence and Correctness. Contracts coming into existence on the platform via a bulk import operation should result in artefacts that exhibit the same properties as the existing ones. One of the core tenets of the platform is to ensure full accountability of actions and, more importantly, that organisations entered into a commitment willingly and in full awareness. This is the reason why the creation of a contract is subject to a multi-party approval process. As a result, we cannot bring contracts into existence via a unilateral action. Rather, the import operation needs to be supported by the same degree of accountability that the creation of a contract entails. This somewhat reinforces the second business requirement previously outlined, but it originates from the need to create a system that is coherent in all its aspects and, as a result, secure and sound.

Being a Good Citizen. As the import operation shares more and more functionality with the already implemented flows, we would like to maintain coherence and adherence to the existing patterns, to ensure that we don't build a completely new system but leverage as much as possible what we already know works as expected. For instance, the approval process that the import operation needs to undergo is essentially identical to that of the other operations, minus a couple of configurable business rules. Hence, it makes sense to extend that process to cater for the variations. At the same time, we also want to be sure that the change is additive and does not drastically destabilise the existing capabilities. This objective is often in tension with the previous one.

How Deep? The matter of throughput comes to light when we need to design and implement a capability that operates on large numbers of entities. To provide an effective solution to this problem, we may want to investigate what type of support we need to implement across the solution stack. In other words, how many layers of our solution need to be aware of the concept of a batch (i.e. a group of contracts acted upon as a whole) and be able to manage it? Is this something that we can keep at the periphery of the system, or something that needs to penetrate deeply across layers? As a rule of thumb, the further we push the ability to batch down the stack, the more throughput we may obtain, but the more complex and sizeable the work becomes. We can decide to expose the capability at the UI or API level, or down to the smart contract layer. For instance, pushing 100 operations into a single transaction can help speed up the processing of large volumes, but it does require enabling the smart contract with such a capability.
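As an illustration of the "deep" end of the spectrum, here is a minimal sketch of a Hyperledger Fabric chaincode (using the Go contract API) that exposes a batched import. The names ContractRecord, ImportContract, and ImportContracts are hypothetical, not the platform's actual chaincode; the point is that the whole batch is written within a single transaction, while each contract still lives under its own key:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"

	"github.com/hyperledger/fabric-contract-api-go/contractapi"
)

// ContractRecord is a hypothetical, minimal shape for an imported contract.
type ContractRecord struct {
	ID      string   `json:"id"`
	Parties []string `json:"parties"`
}

// ImportContract is a hypothetical smart contract exposing a batched import.
type ImportContract struct {
	contractapi.Contract
}

// ImportContracts persists a whole batch within a single transaction:
// the write set commits (or fails) as one unit, while each contract
// is still stored under its own key, preserving per-contract granularity.
func (c *ImportContract) ImportContracts(ctx contractapi.TransactionContextInterface, batchJSON string) error {
	var batch []ContractRecord
	if err := json.Unmarshal([]byte(batchJSON), &batch); err != nil {
		return fmt.Errorf("invalid batch payload: %v", err)
	}
	for _, record := range batch {
		value, err := json.Marshal(record)
		if err != nil {
			return err
		}
		if err := ctx.GetStub().PutState("contract:"+record.ID, value); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	chaincode, err := contractapi.NewChaincode(&ImportContract{})
	if err != nil {
		log.Panicf("error creating chaincode: %v", err)
	}
	if err := chaincode.Start(); err != nil {
		log.Panicf("error starting chaincode: %v", err)
	}
}
```

Note the trade-off this sketch makes visible: a single transaction commits or fails as a unit, which is exactly the kind of coupling between batch semantics and per-contract granularity that the rest of this discussion wrestles with.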

It is becoming clearer and clearer that the initial simplicity of "more of the same" quickly fades away the moment we need to equip the bulk import capability with features that make it secure, consistent with the platform design, and actually usable within an enterprise context. It is worth observing that we have touched on only some of the complexity hidden in such a process, and have not discussed in detail other significant matters relating to de-duplication and reconciliation when multiple parties import the same contract.

For the purpose of this discussion, we will concentrate on the challenges that batching a multi-party operation brings, without discussing de-duplication in detail.

The Elusive Nature of Batches of Contracts

When we think about executing an operation in bulk, we often think about executing the "same" operation on rather similar records. This is the principle that typically determines grouping, and the motivation for considering — from a system design perspective — a group of entities as an entity of its own, which can be tracked as it travels and gets processed through the system.

Enter the exciting world of multi-party contracts, the main instrument managed by the platform in question. In the most common case, the number of distinct parties involved in a contract is three. This number maps to the different roles the parties can take in the contract. The underlying relationship is therefore many-to-many: a party can have multiple contracts, each with multiple different counterparties.

Moreover, the roles are not symmetric: let's suppose that the roles are A, B, and C. A tends to have the largest number of contracts, B is second in line, and C last. A contract can be any combination of <A,B,C>, and every party is only meant to see the contracts it participates in. This means that, for any contract accessible to a party, that party covers one of these three roles — most likely the same role in each contract, but not necessarily.
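A minimal sketch of this visibility rule, with hypothetical types, could look as follows: a party sees a contract if and only if it fills one of the three roles, whichever that happens to be.

```go
// Package visibility sketches the rule that a party only sees
// the contracts it participates in.
package visibility

// Roles captures the three asymmetric roles a contract binds together;
// the names A, B, and C follow the discussion above.
type Roles struct {
	A, B, C string // organisation identifiers filling each role
}

// Contract is a hypothetical minimal record carrying only the role assignments.
type Contract struct {
	ID    string
	Roles Roles
}

// visibleTo filters a set of contracts down to those the given party
// participates in, regardless of which of the three roles it covers.
func visibleTo(party string, contracts []Contract) []Contract {
	var visible []Contract
	for _, c := range contracts {
		if c.Roles.A == party || c.Roles.B == party || c.Roles.C == party {
			visible = append(visible, c)
		}
	}
	return visible
}
```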

What does all of this have to do with batching? …you may ask.

Well… it turns out it has significant implications for grouping, if we want to make the batching capability really useful. For instance, it is unlikely that we can find large numbers of entities where the three parties are exactly the same organisations. Most likely, the contracts in a batch will have only one organisation in common — the batching party itself — while the remaining roles may be filled by a rather variable number of parties. Constraining a batch to have exactly the same parties (in the same roles or not) would be rather limiting in its usefulness for handling large volumes.

If we use the organisation currently performing the operation as the criterion of aggregation, what we see is that a batch, once created, gets fragmented and redistributed into other batches based on the party performing the next operation. Somewhat like Schrödinger's cat, the batch entity changes its composition based on who is observing it.

Figure 2 — Explosion and regrouping of batches as they travel through the approval process. The figure shows only one possible sequence of approvals.

The figure above explains this concept by contextualising the life of a batch as it travels through the three-step process required for its completion:

  • Starting from the left (i.e. Stage 1), three distinct organisations submit a batch, each comprising a set of contracts to be imported that all have in common the organisation that submits them (i.e. A1, A3, and C1).
  • After submission, the contracts need to be repackaged with different grouping criteria, because other organisations need to cast their approval or rejection of importing the contract. The figure shows a particular grouping, corresponding to the view seen by B1, B2, and B3.
  • The final step completes the process by allowing the remaining organisations for each contract to cast their approval, thus leading to the grouping shown in Stage 3, which sees C1, C2, C3, C4, A1, and A2 doing their part. Notice that both A1 and C1 appear again in Stage 3 because they are party to contracts that they did not originally submit.

It is worth observing that during Stage 2 we only see approvals; otherwise, some of these contracts would not have made it to Stage 3.

The most important thing to grasp here is that, at every step of the process, the batch is exploded and regrouped according to the party that needs to cast the approval or rejection. This means that batch identifiers are not unique across parties: each party assigns its own identifiers to the batches it sees along the way.
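A minimal sketch of this regrouping step, with hypothetical types and illustrative data, could look as follows: incoming batches are exploded and rebuilt keyed by the next party that has to act, so each party naturally ends up with its own batch (and its own batch identifier):

```go
package main

import "fmt"

// PendingContract is a hypothetical record of an import still awaiting decisions.
type PendingContract struct {
	ID               string
	PendingApprovers []string // parties that still need to approve or reject
}

// regroup explodes incoming batches and rebuilds them keyed by the next
// party that has to act, mirroring the stage transitions in Figure 2.
func regroup(contracts []PendingContract) map[string][]PendingContract {
	batches := make(map[string][]PendingContract)
	for _, c := range contracts {
		if len(c.PendingApprovers) == 0 {
			continue // fully approved: nothing left to group
		}
		next := c.PendingApprovers[0]
		batches[next] = append(batches[next], c)
	}
	return batches
}

func main() {
	// Illustrative contracts submitted by one party in Stage 1; the
	// regrouped output corresponds to the views seen by the Stage 2 approvers.
	submitted := []PendingContract{
		{ID: "k1", PendingApprovers: []string{"B1", "C1"}},
		{ID: "k2", PendingApprovers: []string{"B1", "C2"}},
		{ID: "k3", PendingApprovers: []string{"B2", "C3"}},
	}
	for party, batch := range regroup(submitted) {
		fmt.Printf("batch for %s: %d contract(s)\n", party, len(batch))
	}
}
```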

This additional complexity comes with an opportunity. As we explode and recreate batches at every step along the way, it becomes more and more important to enable the import operation at the single-contract level: for a particular party, a batch may eventually be composed of a single contract, which still needs to be acted upon. This is the case, for instance, for A1, A2, and C1 in Stage 3 of the previous example.

The solution design we are going to define needs to cater for this flexibility, possibly at all levels where the batch capability is implemented. This is to ensure that only the entities that parties have access to are accessed… across the entire stack. Awesome.

Time to Take a Break

In this part, we have discussed the business context and rationale for adding a batching capability to a platform that leverages blockchain technologies. We have focused mainly on canvassing the problem and discussing some of the challenges that developing this capability brings within the highlighted business context. We have tackled the problem from a requirements perspective and tried to define the desired properties of batching.

In the next part, we will discuss the challenges that batching brings from an implementation perspective, more specifically when integrated with systems such as Hyperledger Fabric that are transactional in nature. Stay tuned!

