CodeWithAbdessamad

Distributed Transactions: Protocols

When building distributed systems, ensuring data consistency across multiple services is critical. Traditional transactional guarantees from single-machine databases (like ACID) don’t directly translate to distributed environments. This section explores two foundational protocols for managing distributed transactions: Two-Phase Commit (2PC) and the Saga Pattern. We’ll dive deep into their mechanics, trade-offs, and real-world implementations—equipping you to choose the right approach for your system.


Two-Phase Commit (2PC)

Two-Phase Commit (2PC) is the classic distributed transaction protocol that guarantees atomicity across a network of distributed services. It operates in two distinct phases: voting and commit, ensuring all participants either fully commit the transaction or roll it back. This protocol is foundational but has significant limitations in modern distributed systems.

How 2PC Works: A Step-by-Step Breakdown

Imagine a distributed transaction that involves three services: OrderService, PaymentService, and InventoryService. The coordinator (a dedicated service) manages the transaction flow:

  1. Prepare Phase:

The coordinator sends a PREPARE message to all participants to check readiness. Each participant:

– Validates local data (e.g., checks if inventory is sufficient)

– Durably records the pending change and locks the affected resources, so that it can later either commit or abort

– Replies with VOTE_COMMIT if ready; otherwise, VOTE_ABORT

  2. Commit Phase:

– If all participants vote VOTE_COMMIT, the coordinator sends a COMMIT message to all.

– If any participant votes VOTE_ABORT (or fails to reply), the coordinator sends an ABORT message to all.

This ensures the transaction is either fully completed or fully rolled back—no partial state exists.
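
The vote-then-decide flow above can be sketched as a small participant state machine. This is a minimal illustration under stated assumptions, not a production implementation: the `Participant` class, its state names, the `canCommit` callback, and `runTwoPhaseCommit` are all hypothetical names introduced here.

```javascript
// Hypothetical participant for illustration; names are not from any library.
class Participant {
  constructor(name, canCommit) {
    this.name = name;
    this.canCommit = canCommit; // () => boolean, stands in for local validation
    this.state = 'INIT';
  }

  // Phase 1: the coordinator sends PREPARE; the participant replies with a vote.
  onPrepare() {
    if (this.state !== 'INIT') {
      throw new Error(`${this.name}: PREPARE received in state ${this.state}`);
    }
    if (this.canCommit()) {
      this.state = 'PREPARED'; // promises to commit if told to
      return 'VOTE_COMMIT';
    }
    this.state = 'ABORTED'; // a participant that votes no can abort on its own
    return 'VOTE_ABORT';
  }

  // Phase 2: apply the coordinator's decision.
  onDecision(decision) {
    if (decision === 'COMMIT') {
      if (this.state !== 'PREPARED') {
        throw new Error(`${this.name}: cannot commit from ${this.state}`);
      }
      this.state = 'COMMITTED';
    } else {
      this.state = 'ABORTED';
    }
  }
}

// Coordinator logic: commit only if every vote is VOTE_COMMIT.
function runTwoPhaseCommit(participants) {
  const votes = participants.map((p) => p.onPrepare());
  const decision = votes.every((v) => v === 'VOTE_COMMIT') ? 'COMMIT' : 'ABORT';
  participants.forEach((p) => p.onDecision(decision));
  return decision;
}
```

A single VOTE_ABORT is enough to move every participant to ABORTED, including those that had already voted to commit.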

Real-World Example: Order Processing

Here’s a simplified implementation using a mock coordinator and participants. The OrderService initiates a transaction to place an order:

<code class="language-javascript">// Coordinator (simplified 2PC: no timeouts, retries, or durable decision log)
class TransactionCoordinator {
  constructor(orderService, paymentService, inventoryService) {
    this.orderService = orderService;
    this.paymentService = paymentService;
    this.inventoryService = inventoryService;
  }

  // Phase 1: collect votes; a participant that throws counts as VOTE_ABORT
  async prepare(orderId) {
    const results = await Promise.allSettled([
      this.orderService.prepare(orderId),
      this.paymentService.prepare(orderId),
      this.inventoryService.prepare(orderId)
    ]);
    return results.every(
      result => result.status === 'fulfilled' && result.value === 'VOTE_COMMIT'
    );
  }

  // Phase 2: send the same decision to every participant
  async commit(orderId, commit = true) {
    if (commit) {
      await this.orderService.commit(orderId);
      await this.paymentService.commit(orderId);
      await this.inventoryService.commit(orderId);
    } else {
      await this.orderService.rollback(orderId);
      await this.paymentService.rollback(orderId);
      await this.inventoryService.rollback(orderId);
    }
  }
}

// Participant (OrderService)
class OrderService {
  constructor(database) {
    this.database = database;
  }

  async prepare(orderId) {
    // Check order validity (e.g., no duplicate orders);
    // validateOrder is assumed to be defined on this class (omitted here)
    if (await this.validateOrder(orderId)) {
      return 'VOTE_COMMIT';
    }
    return 'VOTE_ABORT';
  }

  async commit(orderId) {
    // Apply order to database
    await this.database.saveOrder(orderId);
  }

  async rollback(orderId) {
    // Remove order from database; must be safe to call if nothing was saved
    await this.database.deleteOrder(orderId);
  }
}</code>

Key Observations:

  • Atomicity: All services either commit or roll back together.
  • Failure Handling: If a service fails during prepare (e.g., InventoryService unavailable), the transaction aborts immediately.
  • Latency: The coordinator’s round-trip communication adds overhead (especially in high-latency networks).
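
The failure-handling point can be made concrete with a vote collector that treats a slow or failed participant as an abort vote. This is a minimal sketch: `withTimeout`, `collectVotes`, and the 200 ms default are illustrative assumptions, and each participant is assumed to expose a `prepare(orderId)` method as in the example above.

```javascript
// Rejects if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('timeout')), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Collects votes; any rejection, timeout, or non-commit reply means abort.
async function collectVotes(participants, orderId, timeoutMs = 200) {
  const results = await Promise.allSettled(
    participants.map((p) =>
      // Promise.resolve().then(...) also captures synchronous throws
      withTimeout(Promise.resolve().then(() => p.prepare(orderId)), timeoutMs)
    )
  );
  return results.every(
    (r) => r.status === 'fulfilled' && r.value === 'VOTE_COMMIT'
  );
}
```

The timeout bounds how long the coordinator waits in the prepare phase. It does not help a participant that has already voted and is waiting for the decision.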

When 2PC Fails in Practice

While 2PC guarantees consistency, it struggles in modern systems:

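  • Coordinator failure and network partitions: If the coordinator crashes or becomes unreachable after participants have voted, they remain blocked in the prepared (in-doubt) state, holding their locks, until it recovers.
  • Long-running transactions: 2PC blocks participants (and the resources they have locked) for the duration of the transaction (e.g., 10+ seconds for complex orders).
  • Coordination overhead: Every transaction needs a round of messages to and from each participant, and the coordinator must track the state of each one, so the cost grows with the number of services involved.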
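
The blocking behavior described above for a failed coordinator can be reproduced in a few lines. In this sketch the `LockingParticipant` class, the shared lock set, and the `sku-42` resource name are all hypothetical; the coordinator "crash" is simulated by never sending a decision.

```javascript
// Hypothetical participant that holds a lock from PREPARE until a decision arrives.
class LockingParticipant {
  constructor(resource, locks) {
    this.resource = resource;
    this.locks = locks; // shared Set of currently locked resource names
    this.holdsLock = false;
    this.state = 'INIT';
  }

  prepare() {
    if (this.locks.has(this.resource)) {
      this.state = 'ABORTED';
      return 'VOTE_ABORT'; // resource is locked by another transaction
    }
    this.locks.add(this.resource);
    this.holdsLock = true;
    this.state = 'PREPARED'; // in doubt: cannot commit or abort on its own
    return 'VOTE_COMMIT';
  }

  decide(decision) {
    this.state = decision === 'COMMIT' ? 'COMMITTED' : 'ABORTED';
    if (this.holdsLock) {
      this.locks.delete(this.resource); // released only once a decision arrives
      this.holdsLock = false;
    }
  }
}

const locks = new Set();
const first = new LockingParticipant('sku-42', locks);
first.prepare();
// The coordinator crashes here: no COMMIT or ABORT ever reaches `first`.

// A second transaction on the same resource cannot make progress.
const second = new LockingParticipant('sku-42', locks);
const vote = second.prepare();
console.log(first.state, vote);
// → PREPARED VOTE_ABORT
```

Until some decision reaches `first`, the lock on the resource stays held, which is why coordinator recovery time directly bounds availability under 2PC.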

💡 Pro Tip: Use 2PC only for short-lived transactions with low-latency services. For most cloud-native systems, alternatives like the Saga Pattern are more practical.


Saga Pattern

The Saga Pattern is a modern alternative to 2PC that avoids holding locks across services. Instead of one atomic commit driven by a coordinator, it uses a sequence of local transactions, each paired with a compensating action, to achieve eventual consistency. This approach suits asynchronous microservices where 2PC’s overhead is prohibitive.

Core Principles of the Saga Pattern

  1. Saga Flow: A series of ordered local transactions (e.g., OrderService → PaymentService → InventoryService).
  2. Compensating Actions: For every transaction step, a compensating action that semantically undoes it (e.g., CancelOrder if payment fails).
  3. Event-Driven: Each step emits an event to track progress and trigger compensating actions.
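
These three principles can be sketched as a small saga runner. This is one possible shape, not a reference implementation: `runSaga`, the `{ name, action, compensate }` step format, and the in-memory `log` are assumptions made for illustration, and a real system would also persist progress so that a crashed saga can resume.

```javascript
// Runs steps in order. If a step fails, runs the compensations of the steps
// that already completed, newest first, and reports which step failed.
async function runSaga(steps, context) {
  const completed = [];
  for (const step of steps) {
    try {
      await step.action(context);
      completed.push(step);
    } catch (err) {
      for (const done of completed.reverse()) {
        await done.compensate(context);
      }
      return { status: 'COMPENSATED', failedStep: step.name, error: err.message };
    }
  }
  return { status: 'COMPLETED' };
}

// Example: the payment step fails, so only the order step is compensated.
const log = [];
const orderSaga = [
  {
    name: 'createOrder',
    action: async () => { log.push('order created'); },
    compensate: async () => { log.push('order cancelled'); },
  },
  {
    name: 'processPayment',
    action: async () => { throw new Error('Payment failed'); },
    compensate: async () => { log.push('payment refunded'); },
  },
  {
    name: 'reserveInventory',
    action: async () => { log.push('inventory reserved'); },
    compensate: async () => { log.push('inventory released'); },
  },
];

const sagaResult = runSaga(orderSaga, { orderId: 'order-1' });
sagaResult.then((result) => console.log(result.status, result.failedStep, log));
// → COMPENSATED processPayment [ 'order created', 'order cancelled' ]
```

Compensations run only for steps that completed, so the refund and inventory release are never invoked in this run.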

Real-World Example: Payment Saga

Consider a payment flow where an order is placed, payment is processed, and inventory is reserved. If payment fails, we compensate by canceling the order:

<code class="language-javascript">// Step 1: Create Order (OrderService)
async function placeOrder(order) {
  await orderService.createOrder(order);
  // Emit event: ORDER_CREATED
  return { orderId: order.id };
}

// Step 2: Process Payment (PaymentService)
async function processPayment(orderId) {
  const payment = await paymentService.charge(orderId);
  if (payment.success) {
    // Emit event: PAYMENT_SUCCESS
    return payment;
  }
  // Payment failed: nothing was charged, so only the order is compensated
  await cancelOrder(orderId, ['ORDER_CREATED']);
  throw new Error("Payment failed");
}

// Step 3: Reserve Inventory (InventoryService)
async function reserveInventory(orderId) {
  try {
    await inventoryService.reserve(orderId);
    // Emit event: INVENTORY_RESERVED
  } catch (err) {
    // Reservation failed: undo the payment and the order
    await cancelOrder(orderId, ['ORDER_CREATED', 'PAYMENT_SUCCESS']);
    throw err;
  }
}

// Compensation Workflow: undo only the completed steps, newest first
async function cancelOrder(orderId, completedSteps) {
  if (completedSteps.includes('INVENTORY_RESERVED')) {
    await inventoryService.release(orderId); // Compensate for INVENTORY_RESERVED
  }
  if (completedSteps.includes('PAYMENT_SUCCESS')) {
    await paymentService.refund(orderId);    // Compensate for PAYMENT_SUCCESS
  }
  if (completedSteps.includes('ORDER_CREATED')) {
    await orderService.cancelOrder(orderId); // Compensate for ORDER_CREATED
  }
}</code>

How It Handles Failure:

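If a step fails, the cancelOrder compensation workflow undoes the steps that already completed. Each event maps to a compensating action:

  1. ORDER_CREATED → CANCEL_ORDER (removes order)
  2. PAYMENT_SUCCESS → REFUND (reverses payment)
  3. INVENTORY_RESERVED → RELEASE_INVENTORY (returns stock)

For example, if processPayment fails, only ORDER_CREATED has been emitted, so only CANCEL_ORDER should take effect; compensations for steps that never completed must be skipped or be harmless no-ops.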
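
The event-to-compensation mapping can also be expressed as data, which makes it easy to compute what to undo from the events seen so far. A minimal sketch: the event names come from the comments in the example above, while the compensation names, the `COMPENSATIONS` map, and `planCompensation` are assumptions introduced here.

```javascript
// Which compensating action undoes which event (names assumed for illustration).
const COMPENSATIONS = new Map([
  ['ORDER_CREATED', 'CANCEL_ORDER'],
  ['PAYMENT_SUCCESS', 'REFUND'],
  ['INVENTORY_RESERVED', 'RELEASE_INVENTORY'],
]);

// Given the events emitted so far (oldest first), returns the compensating
// actions to run, newest first. Events without a compensation are ignored.
function planCompensation(emittedEvents) {
  return [...emittedEvents]
    .reverse()
    .filter((event) => COMPENSATIONS.has(event))
    .map((event) => COMPENSATIONS.get(event));
}

console.log(planCompensation(['ORDER_CREATED']));
// → [ 'CANCEL_ORDER' ]
console.log(planCompensation(['ORDER_CREATED', 'PAYMENT_SUCCESS']));
// → [ 'REFUND', 'CANCEL_ORDER' ]
```

Running compensations newest first mirrors how the forward steps built on one another: the refund is issued before the order it belongs to is cancelled.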

Why Saga Beats 2PC for Most Systems

| Factor | 2PC | Saga Pattern |
| --- | --- | --- |
| Network Latency | High (coordinator round-trips) | Low (local transactions) |
| Fault Tolerance | Coordinator failure blocks all participants | Failures recovered via compensations |
| Scalability | Poor (coordinator bottleneck) | Excellent (no single point of contention) |
| Use Case | Short, synchronous transactions | Asynchronous microservices |

Real-World Benefit:

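In a high-traffic e-commerce system, a Saga can reduce transaction latency compared to 2PC, because no step waits on a global commit decision, while still handling failures gracefully. Figures such as a 40–60% latency reduction, or a Saga-based payment flow processing 10k orders/sec during Black Friday without coordinator overload, are illustrative and depend heavily on the workload and infrastructure.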

When to Use Saga vs. 2PC

| Scenario | Choose 2PC | Choose Saga |
| --- | --- | --- |
| Short transactions (< 1s) | ✅ (low latency) | ❌ (overkill) |
| Single-service transactions | ❌ (a local ACID transaction suffices) | ❌ (not applicable) |
| High failure rates (e.g., cloud) | ❌ (coordinator failure blocks participants) | ✅ (recovers via compensations) |
| Eventual consistency acceptable | ❌ (stronger guarantee than needed) | ✅ (ideal) |

💡 Pro Tip: Start with Saga for all new microservices. Only use 2PC for legacy systems with strict ACID requirements.


Summary

  • Two-Phase Commit (2PC) guarantees atomicity through a coordinator-driven voting mechanism but suffers from high latency and poor scalability in distributed systems. It’s best reserved for short, synchronous transactions where strong consistency is non-negotiable.
  • Saga Pattern replaces 2PC with a sequence of local transactions and compensating actions, enabling eventual consistency without a central coordinator. It’s ideal for asynchronous microservices, handling failures gracefully, and scaling well in modern cloud environments.
  • Key Takeaway: For most distributed systems today, Saga Pattern is the pragmatic choice—it balances consistency, resilience, and performance without sacrificing flexibility. Reserve 2PC for edge cases where its simplicity outweighs the trade-offs.

Choose the right protocol for your system’s constraints, and you’ll build transactions that scale without breaking. 🌟