Clock Synchronization

UTC (Coordinated Universal Time) is the primary time standard used worldwide to track time. Time zones are defined as offsets from UTC. Computers track time and try to stay close to UTC. They accomplish this by periodically synchronizing their clocks with a more authoritative time source.

How Computers Keep Time

Computers maintain time using hardware and software components. When a computer boots, the operating system reads time from a battery-powered real-time clock (RTC), a chip designed for low power consumption to survive power outages, not for accuracy. Once the OS is running, it maintains a more accurate system clock by reading high-resolution hardware counters, such as the Intel timestamp counter (TSC) or the ARM Generic Timer.

Most systems represent time as a count of fixed units elapsed since an epoch: seconds since January 1, 1970, 00:00:00 UTC on Unix systems, or 100-nanosecond intervals since January 1, 1601 on Windows. This representation avoids time zone confusion and daylight saving time ambiguities, and it reduces time arithmetic to simple integer operations. The system clock that applications see is called wall time: it tracks UTC but can jump when synchronization corrections are applied.
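As a small illustration of epoch-based time, the sketch below converts Unix epoch seconds to UTC and shows that time arithmetic is plain subtraction. The helper names `epoch_to_utc` and `measure_elapsed` are made up for this example. The second helper uses Python's `time.monotonic()`, a separate clock that does not jump, which is the usual choice for measuring durations precisely because wall time can be stepped.

```python
import time
from datetime import datetime, timezone

def epoch_to_utc(seconds: float) -> datetime:
    """Convert seconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

def measure_elapsed(work) -> float:
    """Time a callable with the monotonic clock, which is unaffected by
    corrections applied to wall time."""
    start = time.monotonic()
    work()
    return time.monotonic() - start

# Epoch arithmetic is integer subtraction: one day is 86,400 seconds.
t0 = 0
t1 = 86_400
print(epoch_to_utc(t0).isoformat())  # 1970-01-01T00:00:00+00:00
print(epoch_to_utc(t1).isoformat())  # 1970-01-02T00:00:00+00:00
print(t1 - t0)                       # 86400
```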

Accuracy, Precision, and Resolution

Accuracy measures how close a measurement is to the true value. If your clock shows 12:00:00.005 and true UTC is 12:00:00.000, your clock has 5ms of error.

Resolution is the smallest time increment a clock can represent. A nanosecond-resolution clock can distinguish events 1 nanosecond apart. Higher resolution does not guarantee accuracy.

Precision is the consistency of repeated measurements. A clock consistently 5ms fast is precise but not accurate.

When we say “NTP achieves 10ms accuracy,” we mean clocks are within 10ms of true UTC, not that they measure time in 10ms increments.

Why Physical Clocks Drift

All physical clocks drift. A quartz oscillator’s frequency depends on temperature, manufacturing variations, atmospheric pressure, humidity, and aging. Consumer hardware typically drifts at 50-100 parts per million (ppm). At 100 ppm, a clock gains or loses about 8.6 seconds per day, and two clocks drifting in opposite directions diverge even faster. Without synchronization, distributed systems quickly lose any agreement on time.
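The arithmetic behind these drift figures can be checked with a few lines. This is only a unit conversion; the function name `drift_seconds_per_day` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def drift_seconds_per_day(ppm: float) -> float:
    """Seconds gained or lost per day by a clock whose rate error is `ppm`."""
    return ppm * 1e-6 * SECONDS_PER_DAY

for ppm in (50, 100):
    print(f"{ppm} ppm -> {drift_seconds_per_day(ppm):.2f} s/day")
# 50 ppm -> 4.32 s/day
# 100 ppm -> 8.64 s/day
```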

The Clock Model

A physical clock can be modeled as:

\[C(t) = \alpha t + \beta\]

where:

\(C(t)\) is the clock’s reading at true time \(t\)
\(\alpha\) represents the clock rate (ideally 1.0, but drift causes deviation)
\(\beta\) represents the offset from true time

Drift is the rate error – how fast a clock runs compared to true time. Offset is the instantaneous difference between a clock and true time. Even after perfect synchronization (zero offset), drift causes the offset to grow again. This is why periodic resynchronization is essential.
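A minimal sketch of the linear clock model, assuming a clock that starts perfectly synchronized (β = 0) but runs 100 ppm fast. The function names are illustrative. It shows the offset growing from zero purely because of drift.

```python
def clock_reading(t: float, alpha: float = 1.0, beta: float = 0.0) -> float:
    """C(t) = alpha * t + beta: the clock's reading at true time t (seconds)."""
    return alpha * t + beta

def clock_offset(t: float, alpha: float = 1.0, beta: float = 0.0) -> float:
    """Instantaneous difference between the clock and true time."""
    return clock_reading(t, alpha, beta) - t

# Assumed example values: zero initial offset, rate error of 100 ppm.
alpha = 1 + 100e-6
for hours in (0, 1, 24):
    t = hours * 3600
    print(f"{hours:2d} h: offset {clock_offset(t, alpha):.2f} s")
#  0 h: offset 0.00 s
#  1 h: offset 0.36 s
# 24 h: offset 8.64 s
```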

Clock Adjustment

When synchronization detects an offset, systems prefer slewing over stepping:

Slewing gradually adjusts the clock by making ticks slightly longer or shorter, maintaining monotonic time. This is preferred for small offsets (typically below 128ms).

Stepping instantly jumps to the correct time. This may be used for larger offsets (often ≥ 128ms). Stepping may break applications measuring durations or using timestamps (e.g., software build environments).
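The choice between the two strategies can be sketched as a simple threshold test. The 128 ms threshold is the typical value mentioned above. The 500 ppm maximum slew rate and both function names are assumptions made for this illustration, and real systems expose different limits and policies.

```python
STEP_THRESHOLD_S = 0.128   # typical threshold from the text (128 ms)
MAX_SLEW_RATE = 500e-6     # assumed illustrative limit: 500 ppm

def choose_adjustment(offset_s: float) -> str:
    """Return 'slew' for small offsets and 'step' for large ones."""
    return "slew" if abs(offset_s) < STEP_THRESHOLD_S else "step"

def slew_duration(offset_s: float, rate: float = MAX_SLEW_RATE) -> float:
    """Seconds needed to remove `offset_s` when the clock rate is changed by `rate`."""
    return abs(offset_s) / rate

for offset in (0.005, 0.100, 2.0):
    action = choose_adjustment(offset)
    if action == "slew":
        print(f"{offset} s: slew for about {slew_duration(offset):.0f} s")
    else:
        print(f"{offset} s: step")
```

Under these assumed numbers, removing a 100 ms offset by slewing takes roughly 200 seconds, which is why large offsets are stepped instead.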

Cristian’s Algorithm

The simplest synchronization approach sends a request to a time server and receives a timestamped reply. The challenge is network delay: by the time the response arrives, it no longer reflects the current time. Cristian’s algorithm assumes the delay is symmetric and the timestamp was generated at the midpoint of that delay.

Algorithm:

Client sends request at local time t₀
Server responds with timestamp T_S
Client receives reply at t₁
Client estimates time as \(T_S + \frac{t_1 - t_0}{2}\)

In reality, the server may have generated its timestamp before or after the midpoint of the round trip, so the estimate can be off by up to half the round-trip time. If we know the best-case (minimum) one-way transit time, we can tighten that bound, because the timestamp cannot have been generated within that minimum time of either end of the exchange.

Error bound: If the minimum one-way delay is t_min, the error is bounded by:

\[\epsilon \leq \frac{(t_1 - t_0) - 2t_{\min}}{2}\]

Clients can retry to find the lowest round-trip time, which yields the tightest error bound.
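The estimate and its error bound can be sketched as follows. Timestamps are in integer milliseconds to keep the arithmetic exact. The function names and sample values are invented for illustration, and `best_of` simply picks the exchange with the lowest round-trip time, as described above.

```python
from typing import List, Tuple

Exchange = Tuple[float, float, float]  # (t0, server_time, t1)

def cristian_estimate(t0: float, server_time: float, t1: float,
                      t_min: float = 0.0) -> Tuple[float, float]:
    """Return (estimated current time, error bound) for one exchange.

    t0, t1      : client clock readings at send and receive
    server_time : timestamp T_S returned by the server
    t_min       : minimum one-way network delay, if known
    """
    rtt = t1 - t0
    estimate = server_time + rtt / 2      # assume T_S was taken at the midpoint
    error_bound = (rtt - 2 * t_min) / 2   # tighter when t_min is known
    return estimate, error_bound

def best_of(exchanges: List[Exchange]) -> Exchange:
    """Pick the exchange with the smallest round-trip time."""
    return min(exchanges, key=lambda e: e[2] - e[0])

# Hypothetical exchange: 80 ms round trip, 10 ms minimum one-way delay.
print(cristian_estimate(10_000, 12_500, 10_080, t_min=10))  # (12540.0, 30.0)
```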

Additive errors: When machines synchronize in chains (A from B, B from C), errors accumulate: A’s total error relative to C is ε_A + ε_B. This is why systems try to avoid deep synchronization hierarchies.

A limitation of Cristian’s algorithm is that it has a single point of failure: the server.

Network Time Protocol (NTP)

NTP solves the single point of failure problem through a hierarchical architecture:

Stratum 0: Reference sources (GPS, atomic clocks)
Stratum 1: Servers synchronizing directly from stratum 0
Stratum 2: Servers synchronizing directly from stratum 1 servers
Higher strata: Synchronize from lower strata (maximum 15 levels)

Fault tolerance through multiple sources: NTP encourages systems to query multiple time servers and use statistical techniques to identify and reject outliers. NTP combines the remaining time offset estimates using a weighted average, with more weight given to more reliable servers. NTP tracks each server’s jitter (delay variation) and dispersion (accumulated timing uncertainty), favoring more reliable sources.
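The idea of rejecting outliers and weighting the rest can be illustrated with a deliberately simplified sketch. This is not NTP's actual selection, clustering, or combining algorithm: the median-based rejection rule, the `max_dev_ms` parameter, the single `uncertainty_ms` field standing in for jitter and dispersion, and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import median
from typing import List

@dataclass
class Sample:
    server: str
    offset_ms: float        # measured offset to this server
    uncertainty_ms: float   # stand-in for jitter + dispersion; must be > 0

def combine(samples: List[Sample], max_dev_ms: float = 50.0) -> float:
    """Drop samples far from the median, then take a weighted average.

    Weights are 1 / uncertainty, so more reliable servers count more.
    """
    mid = median(s.offset_ms for s in samples)
    kept = [s for s in samples if abs(s.offset_ms - mid) <= max_dev_ms]
    weights = [1.0 / s.uncertainty_ms for s in kept]
    return sum(w * s.offset_ms for w, s in zip(weights, kept)) / sum(weights)

samples = [
    Sample("a", 10.0, 1.0),
    Sample("b", 12.0, 2.0),
    Sample("c", 11.0, 1.0),
    Sample("d", 500.0, 1.0),   # a faulty server far from the others
]
print(round(combine(samples), 2))  # 10.8
```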

The synchronization algorithm uses four timestamps:

T₁: Client sends request
T₂: Server receives request
T₃: Server sends response
T₄: Client receives response

Offset: \(\theta = \frac{(T_2 - T_1) + (T_3 - T_4)}{2}\)

The network delay is the round-trip time minus the estimate of the processing delay on the server:

Delay: \(\delta = (T_4 - T_1) - (T_3 - T_2)\)
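The two formulas translate directly into code. The sketch below uses integer milliseconds and a made-up exchange in which the server is 50 ms ahead of the client, each network leg takes 20 ms, and the server spends 5 ms processing. T₁ and T₄ are read from the client's clock, T₂ and T₃ from the server's. The function name is hypothetical.

```python
from typing import Tuple

def ntp_offset_delay(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """Compute (offset, delay) from the four NTP timestamps.

    t1: client sends request      (client clock)
    t2: server receives request   (server clock)
    t3: server sends response     (server clock)
    t4: client receives response  (client clock)
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# Assumed scenario: server 50 ms ahead, 20 ms each way, 5 ms processing.
t1 = 1000
t2 = t1 + 20 + 50      # arrival, expressed on the server's clock
t3 = t2 + 5
t4 = t1 + 20 + 5 + 20  # arrival, expressed on the client's clock
print(ntp_offset_delay(t1, t2, t3, t4))  # (50.0, 40)
```

The computed offset recovers the 50 ms difference exactly here because the two network legs are symmetric. With asymmetric delays, the offset error is half the asymmetry.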

Clock discipline gradually adjusts the system clock. For small offsets (< 128ms), it slews. For large offsets, it steps. The discipline learns and compensates for drift over time by adjusting the tick frequency of the system clock.

SNTP is a simplified subset of NTP suitable for clients that only consume time. It omits the sophisticated filtering and clock discipline of full NTP.

Precision Time Protocol (PTP)

PTP achieves sub-microsecond synchronization through hardware timestamping. Network interface cards with PTP support capture packet transmission and receipt timestamps at the physical layer, eliminating millisecond-level variability from software network stacks.

Architecture: A grandmaster clock provides authoritative time. Unlike NTP, where clients initiate requests, PTP is master-initiated: the grandmaster periodically multicasts sync messages.

The Best Master Clock Algorithm (BMCA) automatically selects the most suitable grandmaster based on priority, clock quality, accuracy, and stability.

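PTP uses a four-message exchange:

Sync message sent by the master at T₁ and received by the slave at T₂
Follow_Up from the master, containing T₁
Delay_Req sent by the slave at T₃ and received by the master at T₄
Delay_Resp from the master, containing T₄

The first two messages are separate because of a hardware limitation: many network interfaces cannot insert the precise transmit timestamp into the Sync message while it is being sent. The only purpose of the Follow_Up message is to send the timestamp of the Sync message (T₁).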

Offset: \(\frac{(T_2 - T_1) - (T_4 - T_3)}{2}\)
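A minimal sketch of the offset computation, using integer nanoseconds. Here T₁ and T₄ are master clock readings (Sync sent, Delay_Req received) and T₂ and T₃ are slave clock readings (Sync received, Delay_Req sent). The mean path delay formula is the standard companion to the offset formula, though it is not given in the text above. The function name and the scenario values are assumptions for illustration.

```python
from typing import Tuple

def ptp_offset_delay(t1: float, t2: float, t3: float, t4: float) -> Tuple[float, float]:
    """Return (slave offset from master, mean one-way path delay).

    t1: master sends Sync         (master clock)
    t2: slave receives Sync       (slave clock)
    t3: slave sends Delay_Req     (slave clock)
    t4: master receives Delay_Req (master clock)
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, delay

# Assumed scenario: slave clock 300 ns ahead, 1000 ns path delay each way.
t1 = 0
t2 = t1 + 1000 + 300   # arrival, expressed on the slave's clock
t3 = 2000              # slave clock; equals 1700 on the master's clock
t4 = 1700 + 1000       # arrival, expressed on the master's clock
print(ptp_offset_delay(t1, t2, t3, t4))  # (300.0, 1000.0)
```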

Cost: Unlike NTP, PTP requires specialized network cards and switches with hardware timestamping support.

When Physical Time Is Not Enough

Even perfectly synchronized physical clocks cannot order events that occur faster than clock resolution. At hundreds of thousands of events per second, many events share the same timestamp. More fundamentally, network delays obscure true ordering: an event timestamped earlier at one machine might arrive at another machine after local events with later timestamps.

For distributed databases, what matters is causality: whether one update could have seen another, not chronology. This leads to logical clocks.

What You Don’t Need to Study

Specific oscillator frequencies (e.g., 32,768 Hz for RTC) or piezoelectric physics
ppm values for specific oscillator types
The Intel TSC or ARM Generic Timer
Windows epoch date (1601); knowing the Unix epoch (1970) is sufficient
Any exact thresholds for slewing vs stepping
Details of the adjtimex system call
Specific accuracy numbers for different NTP configurations
Formulas for computing NTP or PTP offset and delay
PTP message format details