The Limits of Physical Time
Physical clock synchronization cannot solve all ordering problems in distributed systems. Even with perfectly synchronized clocks, events can occur faster than the clock's resolution, so many events end up sharing the same timestamp. More fundamentally, network transmission delays mean that when an event is timestamped at one machine, its timestamp may not reflect when it becomes visible elsewhere.
Consider a distributed database where two clients concurrently update the same record. What matters is not when the updates occurred in absolute time, but whether one update could have seen the other. If the updates are truly concurrent (neither saw the other), the system needs conflict resolution. If one happened after the other, the later one supersedes the earlier. This is a question of causality, not chronology.
The Happened-Before Relationship
Leslie Lamport’s 1978 paper revolutionized distributed systems by recognizing that physical time is not what matters for ordering events: what matters is the potential causal relationship between them. Lamport defined the happened-before relationship, written \(\rightarrow\), as a partial ordering on events. A partial ordering allows some pairs of elements to be incomparable: concurrent events have no ordering. This contrasts with a total ordering, where every pair must be ordered consistently.
Definition: For events a and b:
- If a and b occur on the same process and a occurs before b in that process’s execution, then \(a \rightarrow b\)
- If a is the event of sending a message and b is the event of receiving that message, then \(a \rightarrow b\)
- If \(a \rightarrow b\) and \(b \rightarrow c\), then \(a \rightarrow c\) (transitivity)
If neither \(a \rightarrow b\) nor \(b \rightarrow a\), we say a and b are concurrent, written \(a \parallel b\). Concurrent events happened independently with no potential causal influence.
The happened-before relationship captures potential causality: if \(a \rightarrow b\), then information from event a could have reached event b through the system’s communication channels.
Lamport Timestamps
Lamport timestamps assign each event a logical clock value such that if \(a \rightarrow b\), then timestamp(a) < timestamp(b).
Note the direction: we guarantee causally ordered events have increasing timestamps, but the converse is not true. If timestamp(a) < timestamp(b), we cannot conclude that \(a \rightarrow b\). The events might be concurrent.
Algorithm:
- Each process maintains a counter, initially zero
- On an internal event: increment the counter
- When sending a message: increment the counter and include its value in the message
- When receiving a message with timestamp T: set counter = max(counter, T) + 1
This ensures the happened-before property: if \(a \rightarrow b\) through a chain of events and messages, each step increases the counter, so timestamp(a) < timestamp(b).
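To make the rules concrete, here is a minimal sketch in Python; the class and method names (`tick`, `send`, `receive`) are illustrative, not from any particular library:

```python
class LamportClock:
    """A single logical counter per process (sketch of the algorithm above)."""

    def __init__(self):
        self.counter = 0

    def tick(self):
        # Internal event: just advance the counter.
        self.counter += 1
        return self.counter

    def send(self):
        # Send event: advance the counter and attach its value to the outgoing message.
        self.counter += 1
        return self.counter

    def receive(self, t_msg):
        # Receive event: jump past both our own counter and the message's timestamp.
        self.counter = max(self.counter, t_msg) + 1
        return self.counter
```

Each method returns the timestamp to assign to the event it records.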
Creating a total ordering: Lamport timestamps only provide a partial ordering, since distinct events can share the same timestamp. We can make every timestamp unique by combining Lamport timestamps with process IDs:
\((t_1, p_1) < (t_2, p_2)\) if \(t_1 < t_2\), or if \(t_1 = t_2\) and \(p_1 < p_2\)
This breaks ties by process ID, giving every pair of events a definite order even if they are concurrent.
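Pairing the counter with a process ID maps directly onto lexicographic tuple comparison; a small illustration with hypothetical process IDs:

```python
a = (4, "p1")   # Lamport timestamp 4 on process p1
b = (4, "p2")   # same timestamp on a different process
c = (5, "p1")

assert a < b                           # tie broken by process ID
assert b < c                           # smaller timestamp orders first
assert sorted([c, b, a]) == [a, b, c]  # every pair now has a definite order
```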
Limitation: Lamport timestamps cannot detect concurrency. If you observe timestamp(a) < timestamp(b), you cannot determine whether a causally precedes b or whether they are concurrent.
Vector Clocks
Vector clocks fully capture causal relationships and can detect concurrent events. Instead of maintaining a single counter, each process maintains a vector of counters: one for each process in the system.
For a process Pi, the entry Vi[j] represents “process i’s knowledge of how many events process j has executed.” Initially all entries are zero.
Algorithm:
- On an internal event at Pi: increment Vi[i]
- When sending a message from Pi: increment Vi[i] and include the entire vector in the message
- When receiving a message with vector Vmsg:
  - For all k: set Vi[k] = max(Vi[k], Vmsg[k])
  - Then increment Vi[i]

In words: each process increments its own position in the vector whenever it executes an event, and on receiving a message it takes the element-wise maximum of its own vector and the received one before incrementing its own entry.
This propagates knowledge: when you receive a message, you learn everything the sender knew.
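A minimal sketch of these rules, assuming a fixed set of n processes indexed 0..n-1 (the dynamic, dictionary-based representation discussed below under Implementation works the same way):

```python
class VectorClock:
    """Vector clock for process index i in a system of n processes (sketch)."""

    def __init__(self, n, i):
        self.i = i          # this process's own index
        self.v = [0] * n    # v[j] = how many events of process j we know about

    def tick(self):
        # Internal event: increment our own entry.
        self.v[self.i] += 1
        return list(self.v)

    def send(self):
        # Send event: increment our entry and attach a copy of the vector to the message.
        self.v[self.i] += 1
        return list(self.v)

    def receive(self, v_msg):
        # Receive event: element-wise max with the received vector, then increment our entry.
        self.v = [max(a, b) for a, b in zip(self.v, v_msg)]
        self.v[self.i] += 1
        return list(self.v)
```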
Comparing vector clocks: for vector timestamps Va and Vb taken from events a and b:
- \(a \rightarrow b\) if Va[i] ≤ Vb[i] for all i, and Va[j] < Vb[j] for at least one j
- a and b are the same event if Va[i] = Vb[i] for all i
- a and b are concurrent (\(a \parallel b\)) if neither of the above holds: some entries are greater in Va and others are greater in Vb
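These rules translate directly into a small comparison helper; a minimal sketch for equal-length vectors (function names are illustrative):

```python
def happened_before(va, vb):
    # va -> vb: every entry of va is <= the corresponding entry of vb,
    # and at least one entry is strictly smaller.
    return all(x <= y for x, y in zip(va, vb)) and any(x < y for x, y in zip(va, vb))

def concurrent(va, vb):
    # Neither ordering holds and the vectors are not identical.
    return va != vb and not happened_before(va, vb) and not happened_before(vb, va)
```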
Implementation: In practice, vector clocks are implemented as sets of (processID, counter) tuples rather than fixed-size arrays. This handles systems where processes join dynamically and not all processes communicate with all others. You only track processes you have heard from.
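Under that representation, the receive-side merge treats processes absent from either side as having counter zero; a minimal sketch using Python dicts keyed by process ID (an assumption, not a prescribed wire format):

```python
def merge(v_local, v_msg):
    # Element-wise max over dict-based vector clocks; unseen processes count as zero.
    keys = v_local.keys() | v_msg.keys()
    return {k: max(v_local.get(k, 0), v_msg.get(k, 0)) for k in keys}

# e.g. merge({"p1": 3}, {"p2": 1}) == {"p1": 3, "p2": 1}
```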
Scalability: Each process maintains O(n) state, where n is the number of processes. This works well for dozens to hundreds of processes, but beyond that the vector itself can come to dominate message size.
Hybrid Logical Clocks (HLC)
Hybrid logical clocks bridge the gap between physical and logical time. Physical clocks provide real-world time but drift and jump. Logical clocks provide perfect causality but lose the connection to wall-clock time. For some applications, like databases or version control systems, it’s useful to have both: human-friendly wall-clock time and the ability to track causal relationships.
An HLC timestamp consists of two components:
- L (logical component): A value close to physical time, representing the maximum physical time seen so far (from the local system clock or from a received message, whichever is greater)
- C (counter): Distinguishes events within the same clock tick
Together, (L, C) behaves like a logical clock while staying close to physical time.
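The update rules are not spelled out above, so the following is a hedged sketch roughly following the original HLC proposal; `physical_now_ms` is a hypothetical stand-in for the node's NTP-synchronized clock:

```python
import time

def physical_now_ms():
    # Hypothetical stand-in for the node's NTP-synchronized clock (millisecond precision).
    return int(time.time() * 1000)

class HLC:
    """Hybrid logical clock: timestamps are (L, C) pairs compared lexicographically."""

    def __init__(self):
        self.l = 0  # L: maximum physical time seen so far
        self.c = 0  # C: orders events that share the same L

    def now(self):
        # Timestamp a local or send event.
        pt = physical_now_ms()
        if pt > self.l:
            self.l, self.c = pt, 0   # physical clock moved forward: reset the counter
        else:
            self.c += 1              # same tick (or a backwards jump): bump the counter
        return (self.l, self.c)

    def update(self, msg_l, msg_c):
        # Merge a received timestamp (msg_l, msg_c), then timestamp the receive event.
        pt = physical_now_ms()
        new_l = max(self.l, msg_l, pt)
        if new_l == self.l and new_l == msg_l:
            new_c = max(self.c, msg_c) + 1
        elif new_l == self.l:
            new_c = self.c + 1
        elif new_l == msg_l:
            new_c = msg_c + 1
        else:
            new_c = 0                # the physical clock reading was strictly largest
        self.l, self.c = new_l, new_c
        return (self.l, self.c)
```

Because Python compares tuples lexicographically, the (L, C) pairs returned here can be compared directly with `<`, which is the comparison referred to in the properties below.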
Properties:
- Preserves happened-before: if \(a \rightarrow b\), then HLC(a) < HLC(b) (lexicographic comparison)
- Stays close to physical time: if clocks are synchronized within ε (e.g., via NTP), then L stays within ε of physical time
- Enables time-based queries while maintaining causality
HLC works with commodity servers that synchronize their clocks (typically via NTP).
Trade-off: HLCs are more complex than pure logical clocks, with timestamps that are slightly larger (two components), but they are an elegant solution for systems that need both causal consistency and time-based queries.
Process Identification
In distributed systems, “process ID” refers to a globally unique identifier for a computational entity, not a local Unix process ID. Common approaches to identifying a process include:
- Hostname + local PID: Simple, but breaks if a process crashes and restarts
- Node ID: Survives restarts but requires coordination
- IP address + port: Works for long-running services with stable addresses
The reincarnation problem: If a process crashes and restarts, should it be the same process or a new one? Most systems treat it as new to avoid causality violations from lost state.
Choosing a Clock
For most distributed systems, the choice is between vector clocks and hybrid logical clocks:
Use vector clocks when:
- You need to detect concurrent conflicting updates
- Causality is critical for correctness
- The number of processes is moderate (dozens to hundreds)
- Examples: replicated databases, CRDTs, version control
Use hybrid logical clocks when:
- You need both causality and approximate real time
- You want time-based queries with causal consistency
- Examples: distributed databases with MVCC
Lamport timestamps provide the conceptual foundation but are rarely used directly—systems that need ordering usually also need concurrency detection.
Matrix clocks exist for specialized applications requiring common-knowledge tracking. They require O(n²) space and are rarely used in practice.
What You Don’t Need to Study
- Matrix clock algorithm details or structure
- The publication years of Lamport’s paper, vector clocks, or the HLC proposal
- Specific database names beyond understanding representative examples
- How specific systems implement version control or use clocks
- Space complexity formulas (but understand O(1) for Lamport, O(n) for vector, O(n²) for matrix)