Event Sourcing Series Part 3: Saga Orchestration
September 15, 2025 · 7 min read
Saga Pattern, Microservices, Distributed Systems, Azure, Architecture
This is Part 3 of a 5-part series on Event Sourcing and Saga Orchestration with Azure.
Here's a scenario that broke my system: a customer places an order. Payment succeeds. But the inventory service is down. Now I have a paid order I can't fulfill, and no automatic way to handle it.
Welcome to the world of distributed transactions.
Why Traditional Transactions Don't Work
In a monolith, this is easy:
// All three writes share one database, so a single transaction covers them.
await using var transaction = await db.BeginTransactionAsync();
try
{
    await orderRepo.CreateAsync(order);
    await paymentRepo.ChargeAsync(payment);
    await inventoryRepo.ReserveAsync(items);
    await transaction.CommitAsync();
}
catch
{
    await transaction.RollbackAsync();
    throw;
}
In microservices, each service has its own database. There's no shared transaction. If payment succeeds but inventory fails, you can't roll back the payment: it already committed in a different database.
Two-phase commit (2PC) exists, but it's:
- Slow (locks held across network calls)
- Fragile (coordinator failure = stuck transactions)
- Not supported by most cloud services
Enter the Saga Pattern
A saga is a sequence of local transactions. Each step either succeeds or triggers compensating actions to undo previous steps.
Order Service → Payment Service → Inventory Service
      │                │                  │
      │                │                  ✗ (fails)
      │                │                  │
      │             Refund ◀──────────────┘
      │                │
Cancel Order ◀─────────┘
The key insight: instead of preventing inconsistency (transactions), we detect and recover from it (compensation).
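To make that concrete, here is a minimal in-memory sketch of the idea. `SagaStep` and `SagaRunner` are hypothetical names I made up for illustration, not part of any library, and a real orchestrator would also persist progress and retry failed compensations.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical types for illustration only.
public sealed record SagaStep(string Name, Func<Task> Execute, Func<Task> Compensate);

public static class SagaRunner
{
    // Runs the steps in order. If one fails, compensates the steps that
    // already completed, newest first, and returns false.
    public static async Task<bool> RunAsync(IReadOnlyList<SagaStep> steps)
    {
        var completed = new Stack<SagaStep>();
        foreach (var step in steps)
        {
            try
            {
                await step.Execute();
                completed.Push(step);
            }
            catch (Exception)
            {
                // The failed step never completed, so only earlier steps are undone.
                while (completed.Count > 0)
                {
                    await completed.Pop().Compensate();
                }
                return false;
            }
        }
        return true;
    }
}
```

With steps for order, payment, and inventory where the inventory step throws, this sketch would run the payment refund first and the order cancellation second, matching the diagram above.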
Choreography vs Orchestration
Two ways to coordinate sagas:
Choreography (Event-Driven)
Each service listens for events and reacts. No central coordinator.
┌──────────────┐   OrderCreated   ┌──────────────┐
│    Order     │ ───────────────▶ │   Payment    │
│   Service    │                  │   Service    │
└──────────────┘                  └──────┬───────┘
       ▲                                 │
       │ PaymentFailed                   │ PaymentSucceeded
       └─────────────────────────────────│─────────────────┐
                                         ▼                 │
                                  ┌──────────────┐         │
                                  │  Inventory   │         │
                                  │   Service    │         │
                                  └──────┬───────┘         │
                                         │                 │
                                         │ StockReserved   │
                                         ▼                 │
                                  ┌──────────────┐         │
                                  │   Shipping   │◀────────┘
                                  │   Service    │
                                  └──────────────┘
Pros:
- Loosely coupled
- Simple for small flows
- No single point of failure
Cons:
- Hard to understand the full flow
- Difficult to track saga state
- Adding new steps affects multiple services
- Debugging is a nightmare
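A tiny in-process sketch shows why the flow is hard to see in choreography. `InMemoryEventBus` and `ChoreographyDemo` are hypothetical stand-ins for a real broker and real services; the event names come from the diagram above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-process bus; a real system would use a message broker.
public sealed class InMemoryEventBus
{
    private readonly Dictionary<string, List<Action<string>>> _handlers = new();

    // Every event published, in order, kept only so the flow can be inspected.
    public List<string> Published { get; } = new();

    public void Subscribe(string eventName, Action<string> handler)
    {
        if (!_handlers.TryGetValue(eventName, out var list))
        {
            list = new List<Action<string>>();
            _handlers[eventName] = list;
        }
        list.Add(handler);
    }

    public void Publish(string eventName, string orderId)
    {
        Published.Add(eventName);
        if (!_handlers.TryGetValue(eventName, out var list)) return;
        // Copy so handlers can subscribe without breaking the iteration.
        foreach (var handler in list.ToArray()) handler(orderId);
    }
}

public static class ChoreographyDemo
{
    // Each "service" only knows the one event it reacts to.
    // The overall flow exists nowhere except in these subscriptions.
    public static InMemoryEventBus Wire()
    {
        var bus = new InMemoryEventBus();
        bus.Subscribe("OrderCreated", id => bus.Publish("PaymentSucceeded", id));  // payment service
        bus.Subscribe("PaymentSucceeded", id => bus.Publish("StockReserved", id)); // inventory service
        return bus;
    }
}
```

Publishing `OrderCreated` on the wired bus triggers the other two events in turn, yet no single class describes that sequence. That is the trade-off the cons list is pointing at.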
Orchestration (Central Coordinator)
A dedicated orchestrator tells each service what to do and handles failures.
                   ┌───────────────────┐
                   │    Order Saga     │
                   │   Orchestrator    │
                   └─────────┬─────────┘
                             │
       ┌─────────────────────┼─────────────────────┐
       │                     │                     │
       ▼                     ▼                     ▼
┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│   Payment    │      │  Inventory   │      │   Shipping   │
│   Service    │      │   Service    │      │   Service    │
└──────────────┘      └──────────────┘      └──────────────┘
Pros:
- Easy to understand flow
- Centralized failure handling
- Simple to add/modify steps
- Clear saga state tracking
Cons:
- Central point of failure (mitigate with durability)
- Orchestrator can become complex
- Tighter coupling to orchestrator
My recommendation: Use orchestration for anything non-trivial. The visibility and control are worth it.
Designing Compensating Actions
Every forward action needs a compensating action:
| Step | Forward Action | Compensating Action |
|---|---|---|
| 1 | Create Order (pending) | Cancel Order |
| 2 | Reserve Inventory | Release Inventory |
| 3 | Process Payment | Refund Payment |
| 4 | Ship Order | (Can't compensate - manual intervention) |
Rules for Compensation
- Compensations must be idempotent: they might run multiple times
- Compensations can fail: have retries and dead-letter handling
- Some actions can't be compensated: design for this (e.g., don't ship until payment is confirmed)
- Order matters: compensate in reverse order
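The "compensations can fail" rule is the one I see skipped most often, so here is a rough sketch of it. `CompensationRunner` and the `deadLetter` callback are hypothetical; in a real system the callback might write to a dead-letter queue or raise an alert, and the fixed delay would likely be a backoff.

```csharp
using System;
using System.Threading.Tasks;

public static class CompensationRunner
{
    // Tries the compensation up to maxAttempts times with a fixed delay.
    // Returns true on success. After the last failed attempt it hands the
    // step name and exception to deadLetter and returns false.
    public static async Task<bool> RunWithRetryAsync(
        string stepName,
        Func<Task> compensate,
        int maxAttempts,
        TimeSpan delay,
        Action<string, Exception> deadLetter)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await compensate();
                return true;
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Attempts remain: wait, then loop and try again.
                await Task.Delay(delay);
            }
            catch (Exception ex)
            {
                // Out of attempts: park the failure for manual handling.
                deadLetter(stepName, ex);
                return false;
            }
        }
    }
}
```

Because the compensation may run several times before it succeeds, this only works safely when the compensation itself is idempotent, which is why that rule comes first.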
Semantic vs Technical Rollback
Technical rollback: Undo the data changes
// Payment was $100, refund $100
await paymentService.Refund(transactionId, amount);
Semantic rollback: Apply a business correction
// Can't un-send email, but can send correction
await emailService.SendCancellationNotice(orderId);
The Order Saga Example
Let's design a complete order saga:
public class OrderSagaState
{
    public string SagaId { get; set; }
    public string OrderId { get; set; }
    public string CustomerId { get; set; }
    public List<OrderItem> Items { get; set; }
    public decimal TotalAmount { get; set; }

    // Step results
    public string PaymentTransactionId { get; set; }
    public string InventoryReservationId { get; set; }
    public string ShipmentTrackingNumber { get; set; }

    // State tracking
    public SagaStatus Status { get; set; }
    public string CurrentStep { get; set; }
    public string FailureReason { get; set; }
    public List<string> CompletedSteps { get; set; } = new();
}

public enum SagaStatus
{
    Started,
    Processing,
    Completed,
    Compensating,
    Failed,
    Compensated
}
The Saga Flow
       START
         │
         ▼
┌─────────────────┐
│  Create Order   │──── Success ────┐
│    (Pending)    │                 │
└────────┬────────┘                 │
         │ Failure                  ▼
         │                 ┌─────────────────┐
         │                 │     Reserve     │──── Success ────┐
         │                 │    Inventory    │                 │
         │                 └────────┬────────┘                 │
         │                          │ Failure                  ▼
         │                          │                 ┌─────────────────┐
         │                          │                 │     Process     │
         │                          │                 │     Payment     │
         │                          │                 └────────┬────────┘
         │                          │                          │
         │                          │                   Success│Failure
         │                          │                          │
         │                          │         ┌────────────────┴─────┐
         │                          │         │                      │
         │                          │         ▼                      │
         │                          │  ┌─────────────┐               │
         │                          │  │   Confirm   │               │
         │                          │  │    Order    │               │
         │                          │  └──────┬──────┘               │
         │                          │         │                      │
         │                          │         ▼                      │
         │                          │      SUCCESS                   │
         │                          │                                │
         ▼                          ▼                                ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                            COMPENSATION FLOW                            │
│                                                                         │
│          Refund Payment ──▶ Release Inventory ──▶ Cancel Order          │
│                 │                   │                   │               │
│                 ▼                   ▼                   ▼               │
│                               COMPENSATED                               │
└─────────────────────────────────────────────────────────────────────────┘
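One way to drive the compensation flow is to derive it from the steps that actually completed. The sketch below assumes step names like `CreateOrder` and `ProcessPayment` are what gets recorded as completed; those names, and `CompensationPlanner` itself, are my own placeholders.

```csharp
using System;
using System.Collections.Generic;

public static class CompensationPlanner
{
    // Forward step -> compensating action, following the table of
    // compensating actions. Shipping is absent on purpose: it can't be undone.
    private static readonly Dictionary<string, string> Compensations = new()
    {
        ["CreateOrder"] = "CancelOrder",
        ["ReserveInventory"] = "ReleaseInventory",
        ["ProcessPayment"] = "RefundPayment",
    };

    // Returns the compensations to run, newest completed step first.
    public static List<string> Plan(IReadOnlyList<string> completedSteps)
    {
        var plan = new List<string>();
        for (var i = completedSteps.Count - 1; i >= 0; i--)
        {
            if (!Compensations.TryGetValue(completedSteps[i], out var action))
            {
                throw new InvalidOperationException(
                    $"No compensation for '{completedSteps[i]}'; manual intervention needed.");
            }
            plan.Add(action);
        }
        return plan;
    }
}
```

If payment fails after the order was created and inventory reserved, the plan is to release inventory and then cancel the order. Nothing is refunded because the payment step never completed.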
Handling Edge Cases
Idempotency is Critical
Services will receive duplicate messages. Every operation must be idempotent:
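public async Task<PaymentResult> ProcessPayment(ProcessPaymentCommand cmd)
{
    // Check if this command was already processed
    var existing = await _db.Payments
        .FirstOrDefaultAsync(p => p.IdempotencyKey == cmd.IdempotencyKey);
    if (existing != null)
    {
        // Return the stored result instead of charging again
        return new PaymentResult(existing.TransactionId, existing.Status);
    }

    // Process payment...
    // Two concurrent duplicates can both pass the check above, so also put a
    // unique index on IdempotencyKey and treat a violation as "already processed".
}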
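The same "run once per key" guarantee can be sketched in memory, which is handy for tests. `IdempotentExecutor` is a hypothetical helper, and the dictionary stands in for a database table with a unique key. It is not a replacement for durable storage.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class IdempotentExecutor<TResult>
{
    // Lazy ensures the operation starts at most once per key, even when
    // two callers race on the same key.
    private readonly ConcurrentDictionary<string, Lazy<Task<TResult>>> _results = new();

    // Runs the operation for the first call with a given key. Later calls
    // with that key get the stored task, including a stored failure.
    public Task<TResult> ExecuteAsync(string idempotencyKey, Func<Task<TResult>> operation)
    {
        var lazy = _results.GetOrAdd(
            idempotencyKey,
            _ => new Lazy<Task<TResult>>(operation));
        return lazy.Value;
    }
}
```

Note the design choice: a failed operation is cached too, so a retry with the same key sees the same failure. Whether that is right depends on the operation; for payments I would rather return a recorded failure than silently charge on the second attempt.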
The Outbox Pattern
Make the database write and the message publish atomic by storing the message in the same transaction as the data change:
public async Task ReserveInventory(ReserveCommand cmd)
{
    await using var transaction = await _db.Database.BeginTransactionAsync();

    // 1. Update inventory
    var reservation = new InventoryReservation { ... };
    _db.Reservations.Add(reservation);

    // 2. Write to outbox (same transaction!)
    _db.Outbox.Add(new OutboxMessage
    {
        Id = Guid.NewGuid(),
        EventType = "InventoryReserved",
        Payload = JsonSerializer.Serialize(new InventoryReserved(...)),
        CreatedAt = DateTime.UtcNow
    });

    // Both rows commit together or not at all
    await _db.SaveChangesAsync();
    await transaction.CommitAsync();

    // 3. Background job publishes from outbox
}
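Step 3 is left to a background job, so here is a rough sketch of what one pass of that job might look like. `OutboxEntry`, its `PublishedAt` column, and `OutboxDispatcher` are assumptions of mine; the in-memory collection stands in for a query over the outbox table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical outbox row; PublishedAt is null until the message is sent.
public sealed class OutboxEntry
{
    public Guid Id { get; init; }
    public string EventType { get; init; } = "";
    public string Payload { get; init; } = "";
    public DateTime CreatedAt { get; init; }
    public DateTime? PublishedAt { get; set; }
}

public static class OutboxDispatcher
{
    // Publishes pending entries oldest first and marks each one only after
    // its publish succeeds. A crash between publish and mark means the entry
    // is sent again next pass, so delivery is at-least-once.
    public static async Task<int> DispatchPendingAsync(
        IEnumerable<OutboxEntry> outbox,
        Func<OutboxEntry, Task> publish)
    {
        var pending = outbox
            .Where(e => e.PublishedAt is null)
            .OrderBy(e => e.CreatedAt)
            .ToList();

        var sent = 0;
        foreach (var entry in pending)
        {
            await publish(entry);
            entry.PublishedAt = DateTime.UtcNow;
            sent++;
        }
        return sent;
    }
}
```

The at-least-once behavior is the reason consumers still need to deduplicate what they receive.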
The Inbox Pattern
Deduplicate incoming messages:
public async Task HandleMessage(ServiceBusReceivedMessage message)
{
    var messageId = message.MessageId;

    // Check if already processed
    if (await _db.ProcessedMessages.AnyAsync(m => m.MessageId == messageId))
    {
        _logger.LogWarning("Duplicate message {MessageId}, skipping", messageId);
        return;
    }

    // Process the message...

    // Mark as processed. Save this in the same transaction as the work above,
    // and keep MessageId unique, so a crash or a concurrent duplicate can't slip through.
    _db.ProcessedMessages.Add(new ProcessedMessage { MessageId = messageId });
    await _db.SaveChangesAsync();
}
Timeout Handling
What if a step never responds?
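public async Task ExecuteStepWithTimeout(Func<CancellationToken, Task> step, TimeSpan timeout)
{
    // The token is cancelled once the timeout elapses
    using var cts = new CancellationTokenSource(timeout);
    try
    {
        // Pass the token to the step so it can stop its own work.
        // WaitAsync (.NET 6+) only stops waiting; it does not stop the step.
        await step(cts.Token).WaitAsync(cts.Token);
    }
    catch (OperationCanceledException) when (cts.IsCancellationRequested)
    {
        // Step timed out - decide: retry or compensate?
        // The step may still have completed remotely, so a retry must be idempotent.
        throw new SagaStepTimeoutException();
    }
}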
Saga State Persistence
The orchestrator must survive restarts. Persist saga state:
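// Assumes the Microsoft.Azure.Cosmos (v3) SDK, where the container type is Container,
// and a container partitioned on the saga ID.
public class SagaStateRepository
{
    private readonly Container _container;

    public SagaStateRepository(Container container) => _container = container;

    public async Task SaveAsync(OrderSagaState state)
    {
        // Cosmos DB requires an "id" property on every item; map SagaId to it when serializing
        await _container.UpsertItemAsync(state,
            new PartitionKey(state.SagaId));
    }

    public async Task<OrderSagaState> LoadAsync(string sagaId)
    {
        // Point read by id and partition key
        var response = await _container.ReadItemAsync<OrderSagaState>(
            sagaId,
            new PartitionKey(sagaId));
        return response.Resource;
    }
}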
Coming Up Next
In Part 4, we'll implement this saga using Azure Durable Functions, which handles state persistence, retries, and timeouts for us.
This is Part 3 of a 5-part series on Event Sourcing and Saga Orchestration:
- Part 1: The Honest Truth About Event Sourcing
- Part 2: Event Sourcing with Azure - Building Blocks
- Part 3: Saga Orchestration - Distributed Transactions Done Right (You are here)
- Part 4: Implementing a Saga Orchestrator with Azure Durable Functions
- Part 5: Putting It All Together - Interview-Ready Knowledge