Two-Phase Commit (2PC): Achieving Atomicity Across Distributed Systems

Microservices architectures frequently reach a point where a business operation must either commit completely across multiple services or not happen at all. Charging a customer and creating an order must be atomic. Transferring funds between accounts must be atomic. Provisioning a resource and recording the billing event must be atomic.

When everything lives in one database, ACID transactions solve this. When the operations are split across services — each with their own database — the question becomes: how do you guarantee that either all writes commit or none do, without a single database transaction to enforce it?

Two-Phase Commit (2PC) is the classical answer to this problem. It predates microservices by decades, was designed for distributed databases in the 1970s, and provides strong atomic guarantees across multiple independent data stores. It is also one of the most misunderstood protocols in distributed systems — frequently misapplied, and frequently dismissed without understanding the exact conditions under which it fails.

Understanding 2PC matters for two distinct reasons.

First, it is still used in production. Database XA transactions, distributed SQL systems like CockroachDB and Google Spanner, and some message brokers implement variants of 2PC internally. Engineers working with these systems encounter its failure modes and performance characteristics whether they know it or not.

Second, it is the baseline against which every alternative protocol is measured. SAGA patterns, eventual consistency, and outbox patterns are all answers to the limitations of 2PC. You cannot fully understand why those alternatives exist without understanding what 2PC provides and why that provision comes at a cost.

XA Transactions

XA is the standard interface for 2PC across heterogeneous resource managers (databases, message brokers). Java's JTA (Java Transaction API) exposes XA. MySQL, PostgreSQL, and many enterprise databases support the XA protocol. Laravel does not expose XA natively, but PHP's PDO can use XA through direct SQL statements.

Consider a banking application performing an interbank transfer: debit £500 from Account A in Bank-DB-1, credit £500 to Account B in Bank-DB-2. Both databases are separate PostgreSQL instances.

Without atomic coordination, two failure modes threaten correctness:

Money creation — Credit succeeds, debit fails. £500 appears in Account B without leaving Account A.
Money destruction — Debit succeeds, credit fails. £500 leaves Account A but never arrives in Account B.

2PC prevents both outcomes. It ensures that either both operations commit together, or neither commits.

The protocol drives this through a coordinator process and two distinct phases:

Phase 1: Prepare

Coordinator → Bank-DB-1: "Can you commit DEBIT(A, 500)?"
Bank-DB-1: Acquires row lock on Account A, writes to WAL (but does not commit)
Bank-DB-1 → Coordinator: "YES — ready to commit"

Coordinator → Bank-DB-2: "Can you commit CREDIT(B, 500)?"
Bank-DB-2: Acquires row lock on Account B, writes to WAL (but does not commit)
Bank-DB-2 → Coordinator: "YES — ready to commit"

Phase 2: Commit

Coordinator: All participants voted YES
Coordinator → Bank-DB-1: "COMMIT"
Bank-DB-1: Commits the debit, releases lock
Bank-DB-1 → Coordinator: "COMMITTED"

Coordinator → Bank-DB-2: "COMMIT"
Bank-DB-2: Commits the credit, releases lock
Bank-DB-2 → Coordinator: "COMMITTED"

If any participant votes NO in Phase 1 (or doesn't respond):

Coordinator: At least one NO received
Coordinator → Bank-DB-1: "ROLLBACK"
Bank-DB-1: Rolls back, releases lock
Coordinator → Bank-DB-2: "ROLLBACK"
Bank-DB-2: Rolls back, releases lock

The result is a fully atomic operation across two databases with strong consistency guarantees.

The coordinator is the central authority in 2PC. It owns the transaction state log, drives both phases, and makes the final commit/rollback decision. The coordinator can be:

An application server (the service that initiates the transaction)
A dedicated transaction manager (e.g., Atomikos, Narayana)
A database proxy or middleware layer

The coordinator must durably log its decision before broadcasting it to participants. If it logs COMMIT and then crashes before notifying all participants, on recovery it reads the log and resumes broadcasting the commit. Without this durable log, a coordinator crash between phases leaves participants in a permanently uncertain state.

A participant in the prepared state is in a limbo condition: it has done enough work to commit, but has not yet been told to do so. During this window, the participant must:

Hold all locks acquired for the transaction
Persist enough information to either commit or rollback on coordinator instruction
Respond to coordinator retries if the coordinator crashed and recovered

The prepared state is a resource commitment. Locks are held. WAL entries are written. The participant is ready but waiting. The duration of this wait determines the performance cost of 2PC.

A correct 2PC implementation requires careful durability at the coordinator:

1. Coordinator writes "PREPARE" record to durable log
2. Coordinator sends Prepare to all participants
3. Coordinator collects all votes (with timeout)
4. If all YES: coordinator writes "COMMIT" to durable log
5. Coordinator sends Commit to all participants
6. On all ACKs: coordinator writes "COMPLETE" to log

If coordinator crashes at step 2: on recovery, re-send Prepare (participants are idempotent)
If coordinator crashes at step 4: on recovery, read log — "PREPARE" logged but no "COMMIT" → send ROLLBACK
If coordinator crashes at step 5: on recovery, read log — "COMMIT" logged → re-send Commit

This durability requirement is why 2PC implementations use write-ahead logs at both the coordinator and participant level.

2PC is inherently synchronous. Every message in both phases waits for a response before proceeding. The total transaction time is bounded by:

T_total = T_prepare(1) + T_prepare(2) + ... + T_prepare(N) + T_commit(1) + T_commit(2) + ... + T_commit(N)

In the worst case (sequential communication), this is the sum of all participant round-trip times multiplied by two. In a system with 5 participants each taking 10ms, the minimum 2PC overhead is 100ms — before any actual business logic.

2PC has a well-known failure mode called the blocking problem: if the coordinator fails after sending Prepare but before sending the final Commit or Rollback, participants are stuck.

A participant that voted YES is committed to awaiting the coordinator's decision. It cannot unilaterally commit (the other participants may have voted NO). It cannot unilaterally rollback (the coordinator may have logged COMMIT before crashing). It holds its locks indefinitely and waits.

This is the blocking problem. The system is in a state where forward progress requires coordinator recovery. If the coordinator's disk is corrupted, if the coordinator machine is physically destroyed, participants may be blocked permanently. No coordination among participants alone can resolve the uncertainty — they don't know what the coordinator decided.

Three-Phase Commit (3PC) was designed to address the blocking problem by adding a pre-commit phase that allows participants to detect and resolve coordinator failures. In practice, 3PC is rarely implemented because it doubles the communication overhead and only solves blocking in the absence of network partitions — which don't actually meet the conditions where blocking most commonly occurs.

Locks acquired during the Prepare phase are held until the Commit phase completes. In a high-throughput system, this creates contention: concurrent transactions attempting to access the same rows are blocked for the entire duration of the 2PC round-trip.

The impact compounds with participant count. A 2PC transaction across 5 services holds locks across 5 databases simultaneously. The probability that any one of those rows is in contention with another transaction grows with the number of rows locked and the number of concurrent transactions.

In a system processing 10,000 transactions per second, even 50ms average lock hold time during 2PC creates significant queue depth. The system doesn't scale horizontally because locks are a shared serialization point.

Under CAP, 2PC is a CP protocol. When a network partition occurs, participants in the minority partition cannot receive the coordinator's decision. They remain blocked with locks held until the partition heals and the coordinator's message arrives.

This is the fundamental incompatibility between 2PC and cloud-native microservices: cloud infrastructure has network events. Partitions between availability zones, between Kubernetes pods, between container hosts — these happen in any distributed system of meaningful scale. 2PC converts every network event into a blocking condition.

Characteristic	2PC	SAGA
Consistency model	Strong (linearizable)	Eventual
Lock hold duration	Entire transaction duration	Per local transaction only
Coordinator failure impact	Blocking across all participants	Orchestrator retries, no blocking
Network partition behavior	Blocking (CP)	Continues (AP)
Throughput scaling	Degrades with participant count	Scales with parallelism
Implementation complexity	Lower (protocol handles atomicity)	Higher (compensation logic required)

Here is a direct XA transaction implementation across two MySQL databases using PHP's PDO:

class InterBankTransferService
{
    public function transfer(string $fromAccount, string $toAccount, int $amountCents): void
    {
        $xid = $this->generateXid();

        $sourceDb = $this->getConnection('bank_db_1');
        $targetDb = $this->getConnection('bank_db_2');

        try {
            // Phase 1: Prepare both participants
            $sourceDb->exec("XA START '{$xid}'");
            $sourceDb->exec("UPDATE accounts SET balance = balance - {$amountCents}
                             WHERE account_id = '{$fromAccount}'
                             AND balance >= {$amountCents}");
            $affected = $sourceDb->query('SELECT ROW_COUNT()')->fetchColumn();

            if ((int) $affected === 0) {
                throw new InsufficientFundsException("Account {$fromAccount} has insufficient funds");
            }

            $sourceDb->exec("XA END '{$xid}'");
            $sourceDb->exec("XA PREPARE '{$xid}'");

            $targetDb->exec("XA START '{$xid}'");
            $targetDb->exec("UPDATE accounts SET balance = balance + {$amountCents}
                             WHERE account_id = '{$toAccount}'");
            $targetDb->exec("XA END '{$xid}'");
            $targetDb->exec("XA PREPARE '{$xid}'");

            // Both prepared — Phase 2: Commit
            // Coordinator durably logs decision here before committing
            $this->logCoordinatorDecision($xid, 'COMMIT');

            $sourceDb->exec("XA COMMIT '{$xid}'");
            $targetDb->exec("XA COMMIT '{$xid}'");

            $this->logCoordinatorDecision($xid, 'COMPLETE');

        } catch (InsufficientFundsException $e) {
            $this->rollbackXa($sourceDb, $targetDb, $xid);
            throw $e;
        } catch (\Throwable $e) {
            // Log coordinator decision before rolling back
            $this->logCoordinatorDecision($xid, 'ROLLBACK');
            $this->rollbackXa($sourceDb, $targetDb, $xid);
            throw $e;
        }
    }

    private function rollbackXa(\PDO $source, \PDO $target, string $xid): void
    {
        // Attempt rollback on both; catch exceptions to ensure both are attempted
        try { $source->exec("XA ROLLBACK '{$xid}'"); } catch (\Throwable) {}
        try { $target->exec("XA ROLLBACK '{$xid}'"); } catch (\Throwable) {}
    }

    private function generateXid(): string
    {
        // XA transaction IDs must be unique across the coordinator
        return sprintf('%s-%s', config('app.xa_gtrid'), bin2hex(random_bytes(8)));
    }

    private function logCoordinatorDecision(string $xid, string $decision): void
    {
        DB::table('xa_coordinator_log')->upsert(
            ['xid' => $xid, 'decision' => $decision, 'decided_at' => now()],
            ['xid'],
            ['decision', 'decided_at']
        );
    }
}

The logCoordinatorDecision call before the commit broadcast is critical. If the process crashes after logging COMMIT but before calling XA COMMIT on the databases, the recovery process can read the log and complete the commits. Without this log, a crash between the two XA COMMIT calls leaves the system with one database committed and one in prepared state indefinitely.

class XaRecoveryJob implements ShouldQueue
{
    public function handle(): void
    {
        // Find prepared XA transactions not yet completed
        $prepared = $this->findPreparedXaTransactions();

        foreach ($prepared as $xid) {
            $decision = DB::table('xa_coordinator_log')
                ->where('xid', $xid)
                ->value('decision');

            if ($decision === 'COMMIT') {
                // Coordinator decided to commit — complete it
                $this->commitXaTransaction($xid);
            } elseif ($decision === 'ROLLBACK' || $decision === null) {
                // Coordinator decided to rollback, or crashed before deciding
                $this->rollbackXaTransaction($xid);
            }
        }
    }

    private function findPreparedXaTransactions(): array
    {
        // MySQL XA RECOVER returns all prepared transactions on this node
        $rows = DB::select('XA RECOVER');
        return array_column($rows, 'data');
    }
}

In a standard 2PC implementation, the coordinator is stateful and its failure causes blocking. Production 2PC deployments must make the coordinator highly available: replicated state, leader election, and a recovery process that can resume in-progress transactions. This operational overhead is significant and frequently underestimated.

A participant that has prepared but not yet received a commit or rollback is in an in-doubt state. In-doubt transactions hold locks. If the coordinator is unavailable for a prolonged period, DBA intervention is required: an administrator must manually query XA RECOVER on each participant and make a decision based on the coordinator log.

In-doubt transactions that persist for hours or days — because the coordinator log was lost — require manual reconciliation of application state. This is the worst failure mode of 2PC.

If a participant acknowledges PREPARED but the coordinator's log is corrupted before it records the decision, the coordinator may choose to rollback on recovery. The participant that prepared is told to rollback. But if that participant's WAL was also corrupted and it independently chose to commit (incorrectly), you have a participant that committed while the coordinator decided to rollback. This is a phantom commit — data that exists in one database with no corresponding committed transaction.

Preventing phantom commits requires durable coordinator logs that survive the failure scenarios that corrupt participant WALs — typically, the coordinator log lives on a separate, more durable storage system.

Publishing a message to a broker (Kafka, RabbitMQ) as part of a 2PC transaction is problematic. Most message brokers do not support the XA protocol. You cannot atomically commit a database write and a message publish in a single 2PC transaction. This is one of the primary drivers for the Outbox Pattern — a way to achieve atomicity between a database write and a message publication without 2PC.

2PC fundamentally limits horizontal scaling. Adding more participant nodes does not increase throughput because the bottleneck is lock contention during the prepared state. More participants mean more locks held simultaneously, more network round-trips per transaction, and higher probability of participant failure during a transaction.

In practice, 2PC systems hit throughput ceilings at modest transaction rates. Netflix's internal research showed 2PC as unviable above a few thousand transactions per second across multiple services. SAGA-based systems, by contrast, scale with the throughput of each individual service.

Cross-region 2PC is prohibitively expensive. A prepare round-trip from US-East to EU-West takes 70ms per message. A full 2PC cycle across regions takes a minimum of 280ms (4 message round-trips) before any actual business logic. Synchronous cross-region transactions are architecturally incompatible with low-latency user-facing operations.

Systems that require cross-region consistency typically use asynchronous replication with careful application-level conflict detection — not 2PC.

Two-Phase Commit is a correct and well-specified protocol for distributed atomic transactions. Its correctness guarantees are real: when it works, both operations commit or neither does. The problem is that "when it works" carries significant conditions that don't hold in cloud-native distributed systems.

Use 2PC when:

You have 2–3 participants maximum
All participants are in the same datacenter / low-latency network
Transaction volume is modest (hundreds per second, not thousands)
The infrastructure is stable and coordinator failure is a recoverable edge case
The consistency guarantee justifies the throughput and availability cost
You're working within a single organisation's tightly controlled infrastructure

Do not use 2PC when:

Services are owned by different teams or cross organisational boundaries
You need horizontal scalability
Participants are geographically distributed
The system must remain available during network events
Transaction volume exceeds a few thousand per second

The honest assessment of 2PC in modern microservices is this: for the vast majority of distributed business operations, SAGAs with carefully designed compensating transactions and the Outbox Pattern provide better operational characteristics with acceptable consistency tradeoffs. 2PC remains the right answer for a narrow but real category of use cases — financial systems with strict atomicity requirements across a small number of tightly coupled components, often within a single database cluster or distributed SQL system that implements 2PC internally.

Understanding 2PC is not about knowing when to use it. It is about understanding the atomicity guarantee well enough to know precisely when the alternative patterns are actually safe — and when they are not.

Two-Phase Commit (2PC):
Achieving Atomicity Across Distributed Systems

Distributed Transactions in Microservices: Why Consistency Becomes Difficult

CAP Theorem Explained: The Tradeoffs Behind Distributed Systems

Understanding the SAGA Pattern: Coordinating Long-Running Transactions

Designing a complex system?