Distributed Transactions: Overview

The basics and overview of Distributed transactions

A distributed transaction is a single transaction that spans multiple database systems or other resources. The changes it makes must be committed in every participating system or resource, or rolled back in all of them, so that no participant is left with a partial result. This article defines distributed transactions and describes how they work.

What are Transactions?

Transactions are a basic notion in database systems and other data-manipulation systems. A transaction is a unit of work that consists of one or more data operations. In a nutshell, a transaction is a collection of commands that either complete entirely or fail altogether.

Properties of a Transaction

The key properties of a transaction are atomicity, consistency, isolation, and durability.

Atomicity

Atomicity refers to the property of a transaction that ensures that either all or none of the operations in the transaction are performed. This means that if an error happens during transaction execution, all changes made by the transaction are undone, and the system is returned to a consistent state.
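As a rough illustration of atomicity on a single system, the sketch below uses Python's built-in sqlite3 module. The accounts table, the account names, and the "no negative balance" rule are assumptions made up for this example; the point is only that a failure partway through leaves no partial change behind.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 100), ("bob", 50)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; return True if the transfer committed."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes it.
        with conn:
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)).fetchone()
            if balance < 0:  # assumed business rule for this example
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dict of id -> balance."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))
```

When the transfer fails, both UPDATE statements are undone together, so the balances stay exactly as they were before the call.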

Consistency

Consistency is the property that ensures a transaction leaves the system in a consistent state, that is, a state in which all of the system's rules and constraints are met.

Isolation

Isolation is the property that ensures the changes performed by a transaction are not visible to other transactions until the transaction is committed. This means that other transactions cannot see the intermediate states of the data while the transaction is running.

Durability

Durability is the property that guarantees a committed transaction's modifications will survive a system failure, such as a crash.

When is a Transaction Distributed?

Transactions are comparatively straightforward to implement in a single-system environment. The system records the transaction's changes in a log called the transaction log and, depending on the outcome of the transaction, either commits the changes to the data store or rolls them back.

However, the problem becomes more complicated in a distributed system, where multiple systems or resources take part in a single transaction, known as a distributed transaction. The transaction must produce results that are consistent and durable across all the systems or resources involved.

In such cases, the changes made in one system or resource must be matched in all the other participants: either every participant commits the transaction's changes, or every participant rolls them back.

Requirements for distributed transactions

There are two important requirements for distributed transactions:

  • Consistency: all participating databases reflect the same outcome of the transaction, so none of them is left with stale or partial data.

  • Termination: the distributed transaction is either fully executed or not executed at all. If a distributed transaction fails, it must fail for every database that participated in it.

Importance of Distributed Transactions

Distributed transactions become crucial when a business process involving several systems or resources must be atomic, that is, when either all of its changes are committed or none of them are. For instance, a bank transfer between two different banking systems needs a distributed transaction to guarantee that the transfer either completes in both systems or is reversed in both if an error occurs.

Distributed transactions can also help in payment processing, for example when validating and charging a credit card. Billing information is typically kept in a separate database from credit card information, and a distributed transaction keeps the data in these two databases in step.

Challenges of Implementing Distributed Transactions

Implementing distributed transactions involves two major challenges:

Consistency Problem

A major hurdle in implementing distributed transactions is making sure the changes performed by the transaction are consistent across all systems or resources involved. The issue is commonly referred to as the "consistency problem."

Durability Problem

Ensuring that the transaction's modifications will survive the failure of any participating system is another difficult task. The issue is commonly referred to as the "durability problem."

Techniques and Protocols for Implementing Distributed Transactions

To solve the consistency and durability problems, various techniques and protocols have been developed for implementing distributed transactions.

Two-Phase Commit Protocol (2PC)

In this protocol, a coordinating entity called the transaction coordinator drives the commit, and each participant records the changes made by the transaction in its transaction log. In the first phase, the coordinator sends each system or resource involved in the transaction a request to prepare to commit its modifications. In the second phase, once the coordinator has determined that all systems or resources are ready to commit, it issues a commit request to all of them. If any system or resource reports that it cannot commit, or does not respond, the coordinator issues a rollback request instead, and every participant undoes its changes.
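The two phases can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production implementation: the Participant class, its can_commit flag, and the two_phase_commit function are hypothetical names invented for this example, and the sketch leaves out timeouts, durable logging, and recovery from a coordinator crash.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """A hypothetical in-memory resource taking part in the protocol."""
    name: str
    can_commit: bool = True  # simulates the vote this participant will cast
    committed: dict = field(default_factory=dict)
    pending: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def prepare(self, changes):
        # Phase 1: stage the changes and vote yes, or vote no.
        if not self.can_commit:
            self.log.append("vote-abort")
            return False
        self.pending = dict(changes)
        self.log.append("prepared")
        return True

    def commit(self):
        # Phase 2 (commit decision): make the staged changes permanent.
        self.committed.update(self.pending)
        self.pending = {}
        self.log.append("committed")

    def rollback(self):
        # Phase 2 (abort decision): discard the staged changes.
        self.pending = {}
        self.log.append("rolled-back")


def two_phase_commit(participants, changes):
    """Coordinator: return True if everyone committed, False if all rolled back."""
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare(changes) for p in participants]
    # Phase 2: commit only if the vote was unanimous.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

A single "no" vote is enough to roll back every participant, including those that had already prepared successfully.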

Several variations and extensions of the two-phase commit protocol exist. One example is the three-phase commit protocol, which adds an extra phase to reduce the chance that participants are left blocked when the coordinator fails. These variations address specific problems or improve the efficiency of the protocol.

Optimistic Concurrency Control (OCC)

In this technique, each system or resource participating in the transaction maintains a version number for the data. A transaction notes the version number when it reads the data and checks it again when it attempts to update the data. If the version number has not changed, the transaction updates the data and increments the version number. If the version number has changed, another transaction has modified the data in the meantime, so the transaction rolls back its changes and retries the update.
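The version check described above can be sketched as a compare-and-set loop. The VersionedStore class and the optimistic_update helper are hypothetical names for this example, and the store is a plain in-memory dictionary rather than a real database.

```python
import threading


class VersionedStore:
    """A hypothetical in-memory store that keeps a version number per key."""

    def __init__(self):
        self._data = {}  # key -> (value, version)
        self._lock = threading.Lock()

    def read(self, key):
        """Return (value, version); a missing key reads as (None, 0)."""
        with self._lock:
            return self._data.get(key, (None, 0))

    def compare_and_set(self, key, new_value, expected_version):
        """Write only if the version is unchanged since it was read."""
        with self._lock:
            _, current = self._data.get(key, (None, 0))
            if current != expected_version:
                return False  # someone else updated the key first
            self._data[key] = (new_value, current + 1)
            return True


def optimistic_update(store, key, update_fn, max_retries=5):
    """Read, compute, and write back, retrying when the version has moved."""
    for _ in range(max_retries):
        value, version = store.read(key)
        if store.compare_and_set(key, update_fn(value), version):
            return True
    return False
```

No lock is held between the read and the write; a conflict is only detected at write time, which is why this approach suits workloads where conflicts are rare.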

XA Standard

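The XA standard is a specification, published by X/Open (now The Open Group), for implementing distributed transactions. It defines the XA interface between a transaction manager and the resource managers, such as databases, that take part in a transaction. The interface provides a set of functions to start and end work on a transaction and to prepare, commit, and roll it back, which lets the transaction manager drive a two-phase commit across the participating resources.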

Sagas Pattern

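One way to execute distributed transactions is through the use of the Sagas pattern, which entails slicing up the transaction into a sequence of smaller, self-contained local transactions. Each local transaction is committed independently by one participant and is paired with a compensating action that undoes its effect. If a later step fails, the compensating actions for the steps already completed are run, so the overall transaction is reversed without a global lock or a coordinator-driven rollback.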
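A saga can be sketched as a list of (action, compensation) pairs. The run_saga function and the order-processing steps in demo are hypothetical and exist only to illustrate the idea; a real saga would also persist its progress so that it can resume after a crash.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, run the compensations of the steps that already
    completed, in reverse order, and return False. Return True otherwise.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


def demo():
    """Hypothetical order flow in which the payment step fails."""
    events = []

    def charge_card():
        raise RuntimeError("card declined")

    steps = [
        (lambda: events.append("reserve stock"),
         lambda: events.append("release stock")),
        (charge_card,
         lambda: events.append("refund card")),
    ]
    ok = run_saga(steps)
    return ok, events
```

In the demo, the stock reservation is released because it had already completed, while the refund is never run because the charge itself never succeeded.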

Eventual Consistency Model

The eventual consistency model relaxes the consistency constraints of the transaction. The participating systems or resources are allowed to disagree temporarily, on the understanding that they will eventually converge on a consistent state once updates stop arriving.
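One simple way to see convergence is a pair of replicas that accept writes independently and later exchange state. The sketch below assumes a last-writer-wins rule with timestamps of the form (counter, replica_id); that rule, the Replica class, and its method names are choices made for this example, and last-writer-wins is only one of several possible conflict-resolution strategies (it silently discards the older of two conflicting writes).

```python
class Replica:
    """A hypothetical replica that resolves conflicts by last-writer-wins."""

    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        """Keep the write only if it is newer than what is already stored."""
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)

    def read(self, key):
        """Return the stored value, or None if the key is unknown."""
        entry = self.store.get(key)
        return None if entry is None else entry[1]

    def merge(self, other):
        """Pull in another replica's state, applying the same newest-wins rule."""
        for key, (timestamp, value) in other.store.items():
            self.write(key, value, timestamp)
```

Two replicas that have accepted different writes return different values until they merge in both directions, after which both return the value with the highest timestamp.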

Summary

Distributed transactions are necessary because work that spans multiple databases can fail partway through, due to network interruptions or other issues, and leave the participants in disagreement. They are an important concept in database systems and other distributed systems, and various techniques and protocols have been developed to solve the consistency and durability problems involved in implementing them.