Consensus without the hype - Distributed systems notes

Consensus is one of those terms that attracts mythology: it sounds like a universal solution to distributed complexity. In practice, consensus is a very specific tool: a way for a group of machines to agree on a single sequence (or a single value) despite failures.

The purpose of this note is to cut through the romance. “Do we need consensus?” is an engineering question with a concrete answer: it depends on which invariants you need to preserve under failures, and where you are willing to place authority.

Quick takeaways

Consensus is about agreement under failuresNot about being fast, and not about eliminating all coordination.
It provides a single source of truthTypically a log of decisions: ordered, durable, and replicated.
The cost is latency and availability trade-offsCross-node quorum work affects tail latency and behaviour under partitions.
Many problems do not need consensusIf your invariant tolerates divergence, you can use weaker coordination.
Start from the invariant“Exactly once”, “one leader”, and “no double-spend” are the kinds of invariants that push you toward consensus.

Problem framing (what consensus is for)

Distributed systems fail in ways single-node systems do not: messages can be delayed, reordered, duplicated, or lost; machines can crash and later restart; networks can partition. If you need multiple machines to behave as if they were one authoritative machine for some decision, you need agreement in the presence of those failures.

Consensus gives you a replicated decision procedure. It does not prevent failures; it gives you rules for what to do when failures occur. The most practical “interface” of consensus is an ordered log: appending an entry means the group agrees that the entry exists at that position.

Diagram: quorum agreement produces an ordered log

Key concepts (definitions + mini examples)

Agreement, safety, and progress

Consensus is fundamentally a safety promise: two correct participants will not decide on different values for the same decision index. Many systems also want a liveness/progress promise: decisions eventually get made. Under partitions, you often have to choose which promise to prioritise.

Quorums and majority

Most practical consensus protocols rely on quorums (often majorities). The intuition: any two majorities overlap, so you can “carry” information forward across leadership changes and failures. The overlap is what prevents divergent histories from being simultaneously committed.

Leader-based consensus (common shape)

Many deployments use a leader to coordinate log appends. The leader proposes an entry; followers acknowledge; once a quorum acknowledges, the entry is committed. The details differ across protocols, but the shape is similar.

This shape matters for performance: it centralises coordination and adds a quorum round trip. That is why consensus is rarely the right tool for every write.

Practical checks (do you actually need it?)

1) Write down the invariant that must hold

Examples of invariants that tend to require strong coordination:

UniquenessAt most one leader at a time; at most one active lease for a key.
No double-spendA resource is consumed at most once, even under retries and failures.
Global orderingUpdates must be applied in a single agreed order (e.g., configuration changes).

If you cannot state an invariant, you likely do not need consensus.

2) Identify who is allowed to be wrong

In a partition, you may have to deny service to preserve safety. Ask: is it acceptable for part of the system to stop making progress to avoid inconsistency? If the answer is “no”, then you may need a different invariant, not a different protocol.

3) Consider weaker alternatives

Common alternatives include:

Single-writer by designRoute writes for a key to one shard and accept temporary unavailability during failover.
Idempotent operationsPermit retries and duplicates without violating invariants.
Commutative updatesDesign operations so order does not matter for correctness.

4) If you need consensus, isolate it

Use consensus for small, high-value decisions (leadership, configuration, membership, metadata), not for high-volume data writes. A common architecture is: consensus decides metadata; data operations remain local or sharded.

Common pitfalls

Using consensus as a databaseIt can store state, but it is expensive; isolate it to metadata and control-plane decisions.
Ignoring tail latencyQuorum work amplifies tail effects; build budgets and monitor p95/p99, not just averages.
Assuming partitions are rareEven brief partitions force your system to choose between safety and availability.
Confusing leader election with consensusElection is one instance; consensus is a repeated agreement mechanism.
Not specifying failure semanticsClients need to know what “uncertain commit” means and how to retry safely.

Related notes

Time, clocks, and orderingOrdering is what consensus provides; understanding causality helps interpret it.
Failure modes, retries, and idempotencyIdempotency is a key alternative to stronger coordination.
Model checking primerConsensus protocols are often reasoned about via small abstract models.
Queueing basics and latency budgetsQuorum traffic affects tail latency and budget allocation.

Back to topic • Back to Notes index