pk.org: CS 417/Lecture Notes

Clock Synchronization

Keeping Time in Distributed Systems

Paul Krzyzanowski – February 07, 2026

Goal: Keep machines’ clocks aligned to a common reference, typically UTC, within a known error bound, so timestamps and time-based algorithms behave consistently across a distributed system.

When Charles V retired in weariness from the greatest throne in the world to the solitude of the monastery at Yuste, he occupied his leisure for some weeks trying to regulate two clocks. It proved very difficult. One day, it is recorded, he turned to his assistant and said: “To think that I attempted to force the reason and conscience of thousands of men into one mould, and I cannot make two clocks agree!”
– Havelock Ellis, The Task of Social Hygiene, Chapter 9

Introduction

Computers need to know what time it is. Operating systems use time to schedule tasks, expire cached data, implement timeouts, and timestamp events. Applications use time for deadlines, rate limiting, session management, and audit trails.

Distributed systems rely on machines agreeing on the time. Logs are timestamped. Public key certificates and authentication tokens have expiration times. Cache entries have time-to-live values. Distributed coordination and processing rely on leases and deadlines.

These mechanisms assume that each machine’s concept of “now” is close enough to a shared reference value that time-based decisions remain correct.

The challenge is that every computer maintains its own clock, and these clocks inevitably drift apart. A distributed system, by definition, has no shared global clock. This creates the need for clock synchronization algorithms that can keep clocks reasonably aligned despite the inherent imperfections of physical timekeeping.

Coordinated Universal Time (UTC)

UTC (Coordinated Universal Time)1 is the primary time standard by which the world regulates clocks and time. It is essentially a successor to Greenwich Mean Time (GMT), though technically they are different (GMT is a time zone, UTC is a time standard).

UTC provides the reference against which all clock synchronization protocols ultimately measure accuracy. When we say a synchronization achieves “10 milliseconds accuracy,” we mean clocks are within 10 milliseconds of UTC.

UTC is based on International Atomic Time (TAI), which is measured by atomic clocks. However, UTC is not “pure” atomic time.

Earth’s rotation is gradually slowing down due to tidal friction. A solar day (noon to noon) is getting longer, which means a day is no longer exactly 86,400 seconds of atomic time. To keep UTC aligned with Earth’s rotation, leap seconds are occasionally inserted. When a leap second occurs, UTC time goes from 23:59:59 to 23:59:60 to 00:00:00 instead of the normal 23:59:59 to 00:00:00. Leap seconds are announced several months in advance by the International Earth Rotation and Reference Systems Service.

How Computers Measure Time

Understanding clock synchronization requires understanding how computers keep time in the first place. A computer has multiple time sources with different roles.

The Battery-Backed Real-Time Clock

Most computers contain a battery-powered real-time clock (RTC) chip. The battery ensures the RTC keeps running even when the computer is powered off, which is why your computer knows the approximate time when you boot it after unplugging it for a week.

The RTC contains a quartz crystal oscillator that vibrates at a stable frequency (typically 32,768 Hz, chosen because it is 2¹⁵, making it easy to divide down to 1 Hz). The RTC counts these oscillations and increments its time registers accordingly.

When the computer boots, the operating system reads the RTC time to initialize the system clock. After that, the OS usually relies on higher-resolution timers and counters for timekeeping while the system runs, and it may write corrected time back to the RTC periodically or during shutdown.

The RTC is designed for low power consumption to provide continuity across reboots, not accuracy. It often drifts enough to gain or lose seconds per day.

The System Clock and Interrupts

Once the operating system is running, it maintains a software clock called the system clock. It usually has much finer resolution than the battery-backed RTC, and the OS can periodically adjust it to stay close to real time.

The kernel measures elapsed time by reading a high-resolution hardware counter and converting counter ticks to seconds and nanoseconds using a known tick frequency.

On x86 systems, this counter is often the timestamp counter (TSC). On current machines the TSC typically increments at a constant rate (not necessarily one tick per CPU cycle), so the kernel can convert TSC deltas to time after calibrating the TSC rate against a reference timer.

On ARM systems, the analogous mechanism is the ARM Generic Timer counter (for example, CNTVCT_EL0 or CNTPCT_EL0), which increments at a fixed frequency reported by CNTFRQ_EL0.

When an application requests the current time, the operating system reads the counter and converts it to a time value. Most systems now represent time as the number of seconds (and fractions of seconds) since a reference point called the epoch.

Counting Time: The Epoch

Most systems represent time as elapsed time since a fixed reference point called the epoch.

In Unix and Unix-like systems (including Linux, macOS, and others), the epoch is January 1, 1970, 00:00:00 UTC. Windows systems use a different epoch: January 1, 1601, 00:00:00 UTC, storing time as the number of 100-nanosecond intervals since that date in a FILETIME structure. The different epoch dates matter when converting between systems, but both approaches share the same fundamental principle: tracking elapsed time from a fixed reference point.
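To make the epoch arithmetic concrete, here is a small sketch in Python. The 11,644,473,600-second constant is the exact gap between the 1601 and 1970 epochs; the function names are just illustrative:

```python
# Windows FILETIME counts 100-nanosecond intervals since 1601-01-01 UTC;
# Unix time counts seconds since 1970-01-01 UTC.
EPOCH_GAP_SECONDS = 11_644_473_600   # seconds from 1601-01-01 to 1970-01-01
INTERVALS_PER_SECOND = 10_000_000    # 100-ns intervals per second

def filetime_to_unix(filetime: int) -> float:
    """Convert a Windows FILETIME value to Unix seconds."""
    return filetime / INTERVALS_PER_SECOND - EPOCH_GAP_SECONDS

def unix_to_filetime(unix_seconds: float) -> int:
    """Convert Unix seconds to a Windows FILETIME value."""
    return round((unix_seconds + EPOCH_GAP_SECONDS) * INTERVALS_PER_SECOND)
```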

Representing time as elapsed seconds has several advantages over storing the current date and time:

No time zone confusion:
A timestamp is absolute. It represents the same moment in time regardless of where you are. When you need to display time to a user, you convert the timestamp to their local time zone. But internally, the system works with UTC, avoiding the complexity of time zones entirely.
No daylight saving time problems:
Twice a year, many regions adjust their clocks forward or backward by an hour. If you stored time as “the current hour of the day,” these transitions would create ambiguity. Did an event happen during the first occurrence of 1:30 AM or the second one when clocks fall back? Elapsed seconds from the epoch never have this problem.
Easy arithmetic:
If you want to know how much time elapsed between two events, simply subtract their timestamps. If you want to add two hours to a date, simply add 2 × 60 × 60 seconds. Time arithmetic becomes integer arithmetic.
Sortable:
Timestamps have a natural ordering. Newer events have larger timestamps. This makes indexing, sorting, and comparing times trivial.

This representation is particularly important in distributed systems. If different machines recorded time as local date-time strings, you would need to parse and convert time zones to compare events. With timestamps, comparison is simple integer arithmetic.

Wall Time vs. Monotonic Time

Wall time is a term used for the time-of-day that an operating system presents to applications as a calendar clock. On Unix-like systems, it is typically represented as seconds (and fractions) since an epoch, then converted to human-readable forms using time zone and daylight-saving rules. Wall time is meant to track UTC closely, but it is still a local estimate maintained by the OS.

Wall time may jump when the system corrects it.

A separate clock, monotonic time, is used to measure elapsed intervals and should not go backward. Monotonic time is the right tool for timeouts and measuring durations.
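Most languages expose the two clocks separately. A minimal Python sketch, using the wall clock for a timestamp and the monotonic clock for a duration:

```python
import time

# Wall time: may be stepped or slewed by the OS; use it for timestamps.
stamp = time.time()

# Monotonic time: never goes backward; use it for durations and timeouts.
start = time.monotonic()
time.sleep(0.25)                     # stand-in for real work
elapsed = time.monotonic() - start   # correct even if the wall clock was stepped
print(f"event at {stamp:.3f}, work took {elapsed:.3f} s")
```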

Quartz Oscillator Imperfections

No physical clock is perfect.

A quartz crystal oscillator works by exploiting the piezoelectric effect: applying voltage causes the crystal to vibrate, and the vibration generates voltage. This creates a feedback loop that sustains oscillation at the crystal’s resonant frequency.

The actual frequency depends on manufacturing tolerances, temperature, atmospheric pressure, humidity, and aging. Even small frequency errors accumulate into noticeable time errors. The result is that every computer’s clock drifts.

The quartz oscillators used in computers are typically accurate to about 50 parts per million (ppm). This equates to a drift of up to roughly 4.3 seconds per day, or about 26 minutes per year:

\[50 \times 10^{-6} \times 86{,}400 \approx 4.32 \text{ seconds per day}\]

Higher-quality Temperature-Compensated Crystal Oscillators (TCXOs) reduce drift to 1-5 ppm through compensation circuitry. Oven-Controlled Crystal Oscillators (OCXOs) maintain constant temperature, achieving drift below 0.01 ppm.

Your PC doesn’t have those.

Without synchronization, two computers with 50 ppm drift in opposite directions will be skewed by almost nine seconds after one day.

This is why periodic synchronization is necessary. Even if you perfectly synchronize all clocks at noon, by midnight, they will have drifted apart.

Clock Synchronization Fundamentals

Drift and offset

All physical clocks are slightly imperfect, so two machines will not keep identical time even if they start out synchronized.

Two quantities capture most of what matters: drift (rate error) and offset (current error).

Drift (rate error)

Drift is how fast a clock runs compared to an ideal reference. It is a property of the clock’s rate, not its current reading.

If a clock runs fast by 50 parts per million (50 ppm), then after one real second, it advances by \(1 + 50 \times 10^{-6}\) seconds. Over long intervals, this rate error accumulates into a growing time error.

Drift is often specified as a bound, such as “within 100 ppm,” meaning the clock’s rate can be off by up to \(100 \times 10^{-6}\) of real time. Temperature changes, aging, and manufacturing variation all affect drift.

Offset (difference from a reference)

The offset is how far a clock’s current reading differs from a reference clock at the same real time.

If a reference clock reads \(T\) and a local clock reads \(C\) at that same instant, the local clock’s offset is:

\[\text{offset} = C - T\]

A positive offset means the local clock is ahead of the reference. A negative offset means it is behind. Synchronization protocols primarily estimate offset and then correct it, either by stepping the clock or by slewing it gradually.

How drift and offset relate

Offset is a snapshot of error “right now.” Drift is what makes that snapshot get worse over time.

Even if two machines have a zero offset immediately after synchronization, their clocks will drift at slightly different rates, so the offset will grow again. That is why synchronization is not a one-time action. It is a continuous discipline: estimate offset repeatedly, and adjust the clock so that drift does not accumulate into an unacceptable offset.

The Basic Clock Model

A physical clock can be modeled mathematically as:

\[C(t) = \alpha t + \beta\]

where \(t\) is the true (reference) time, \(C(t)\) is the time the clock reports, \(\alpha\) is the clock’s rate (the drift factor, ideally 1), and \(\beta\) is its offset at \(t = 0\).

For example, if \(\alpha = 1.0001\) (100 ppm fast) and \(\beta = 5\) seconds, the clock runs 100 ppm faster than real time and is currently 5 seconds ahead.

Clock synchronization aims to reduce the offset \(\beta\) and keep the rate \(\alpha\) close to 1 by estimating and compensating for frequency error. Perfect synchronization would have \(\alpha = 1\) and \(\beta = 0\).
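As a quick illustration, here is the model in Python with the example numbers above (\(\alpha = 1.0001\), \(\beta = 5\)), showing how rate error turns a 5-second offset into a much larger one over a day:

```python
def clock_reading(t: float, alpha: float = 1.0001, beta: float = 5.0) -> float:
    """C(t) = alpha * t + beta: a clock running 100 ppm fast, starting 5 s ahead."""
    return alpha * t + beta

for t in (0, 3_600, 86_400):          # now, one hour, one day (in true seconds)
    offset = clock_reading(t) - t     # current offset from the reference
    print(f"after {t:>6} s: offset = {offset:.3f} s")
# after      0 s: offset = 5.000 s
# after   3600 s: offset = 5.360 s
# after  86400 s: offset = 13.640 s
```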

Compensating for Drift and Offset

When a clock synchronization algorithm determines that a clock is offset by \(\beta\) and running at rate \(\alpha\), the system must adjust both. Simply setting the clock forward or backward by \(\beta\) would create a discontinuity that could break applications.

Instead, systems slew the clock: they gradually adjust its effective rate, making each clock tick slightly longer or shorter so the clock catches up or falls back smoothly while time continues to move forward monotonically.

For example, if the clock is 100 ms slow and running at the correct rate, the system might make each tick 0.1% longer for the next 1,000 ticks, allowing the clock to catch up gradually.

If the clock is also drifting (\(\alpha \neq 1\)), the system computes a permanent frequency adjustment. If the clock runs 50 ppm fast, the system permanently reduces the tick length to compensate, preventing the drift from accumulating. On Linux, the adjtimex system call exposes these adjustments; macOS offers similar control through adjtime and ntp_adjtime.

When the offset is very large (typically more than 128ms), NTP will step the clock instead: it jumps the time to the correct value instantly. Stepping can break applications that measure durations, which is why it is avoided when possible.
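A sketch of the step-versus-slew decision, assuming the common 128 ms step threshold and an arbitrary interval over which to amortize the slew (real implementations make both configurable):

```python
STEP_THRESHOLD = 0.128   # seconds; the typical NTP step threshold
SLEW_INTERVAL = 2_000.0  # seconds over which to spread a correction (assumed)

def plan_correction(offset: float):
    """Decide whether to step the clock or slew it gradually."""
    if abs(offset) > STEP_THRESHOLD:
        return ("step", offset)               # jump the clock by the full offset
    rate_adjustment = offset / SLEW_INTERVAL  # fractional change in tick rate
    return ("slew", rate_adjustment)

print(plan_correction(0.100))  # ('slew', 5e-05): run 50 ppm fast for a while
print(plan_correction(0.500))  # ('step', 0.5): too large, jump the clock
```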

This is why periodic resynchronization is essential. Even after computing a drift compensation, the drift rate continues to change with temperature and aging. Without regular synchronization measurements, the compensation becomes stale, and errors accumulate.

Systems synchronize periodically and adjust both offset and frequency over time. Polling intervals vary by implementation, configuration, and network conditions, and they are often adaptive: stable conditions allow longer intervals, while instability triggers more frequent sampling.

Accuracy, Precision, and Resolution

The terms accuracy, precision, and resolution have distinct meanings in the context of clocks:

Accuracy: How close a measurement is to the true value (typically UTC). If your clock shows 12:00:00.005 and true UTC is 12:00:00.000, your clock has 5ms of error.

Precision: How consistent measurements are. A clock that always shows the same offset from true time (e.g., consistently 5ms fast) is precise but not accurate. In clock synchronization, this shows up as low jitter.

Resolution: The smallest time increment a clock can represent. A clock with nanosecond resolution can distinguish events 1 nanosecond apart. Higher resolution does not imply high accuracy or high precision; you can have a nanosecond-resolution clock that is seconds away from true time.

When we say “NTP achieves 10ms accuracy,” we mean clocks are within 10ms of true UTC, not that they measure time in 10ms increments (that would be resolution).
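Python reports each clock’s resolution directly, which makes the distinction easy to see; a high-resolution reading says nothing about how close the clock is to UTC:

```python
import time

# Resolution is the smallest increment the clock can report. It is a
# property of the clock interface, not a statement about accuracy.
for name in ("time", "monotonic"):
    info = time.get_clock_info(name)
    print(f"{name}: resolution={info.resolution} s, adjustable={info.adjustable}")
```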

Clock Synchronization Algorithms

The naive approach to synchronizing time is to simply ask a server: use a remote procedure call to send a request and get a response.

Clock synchronization over a network faces an inherent problem: network transmission takes time, and that time is variable.

If you ask a time server “what time is it?” and it responds “3:00:00,” by the time you receive that response, it is no longer 3:00:00. If the response took 100 milliseconds to arrive, the actual time is closer to 3:00:00.100, but the round-trip time varies with network conditions. Sometimes it is 100 milliseconds, sometimes 150, sometimes 50.

Every synchronization algorithm must deal with this variable delay. The better the algorithm can measure or bound the delay, the more accurately it can synchronize clocks.

Cristian’s Algorithm and Bounded Error

Cristian’s algorithm synchronizes a client with a time server via a simple request-reply exchange and compensates for network delay by measuring round-trip time and attributing part of it to message transit times.

  1. The client sends a request at local time \(t_0\).

  2. The server replies with a single timestamp \(T_S\), taken as close as possible to the moment the reply is sent.

  3. The client receives the reply at local time \(t_1\).

If the network delay were symmetric, a reasonable estimate of the true time at receipt is:

\[T_{\text{new}} = T_S + \frac{t_1 - t_0}{2}\]

In real networks, symmetry is not guaranteed. A more useful result is an error bound when a best-case one-way latency is known.

Assume the minimum one-way delay between client and server is \(t_{\min}\), and assume the server’s timestamp \(T_S\) is taken at send time (so server processing time is negligible). When the client receives the reply at time \(t_1\), the true time must lie in the interval:

\[T_S + t_{\min} \le T_{\text{true}} \le T_S + (t_1 - t_0) - t_{\min}\]

The interval width is:

\[(t_1 - t_0) - 2t_{\min}\]

If the client sets its clock to the midpoint of this interval, the worst-case error is half the width:

\[\epsilon \le \frac{(t_1 - t_0) - 2t_{\min}}{2}\]

Low-delay results are the most useful ones because they are closest to the best-case path, which tightens the error bound and reduces bias from queueing and delay asymmetry. A computer may try multiple synchronization sequences to find one with the lowest round-trip time.
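A sketch of one exchange in Python, returning both the midpoint estimate and the worst-case error bound. Here request_server_time is an assumed RPC stub that returns the server’s timestamp \(T_S\):

```python
import time

def cristian_sync(request_server_time, t_min: float = 0.0):
    """One Cristian's-algorithm exchange: (time estimate, worst-case error)."""
    t0 = time.monotonic()
    T_s = request_server_time()   # assumed RPC stub returning server time T_S
    t1 = time.monotonic()
    rtt = t1 - t0
    estimate = T_s + rtt / 2.0                # midpoint of the uncertainty interval
    error_bound = (rtt - 2.0 * t_min) / 2.0   # half the interval width
    return estimate, error_bound

def best_of(n: int, request_server_time, t_min: float = 0.0):
    """Try several exchanges and keep the one with the tightest error bound."""
    return min((cristian_sync(request_server_time, t_min) for _ in range(n)),
               key=lambda pair: pair[1])
```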

Errors Are Additive

When machines synchronize in a chain (A synchronizes from B, which synchronized from C), errors accumulate.

Suppose B synchronizes from C with an error \(\varepsilon_B\), and A synchronizes from B with an error \(\varepsilon_A\). Then A’s total error relative to C is \(\varepsilon_A + \varepsilon_B\).

This is why clock synchronization hierarchies (like NTP) limit the depth of the chain. Each additional hop in the hierarchy adds error. NTP supports up to 15 levels (strata), but practical deployments rarely go beyond 5 or 6.

Network Time Protocol (NTP)

NTP, the Network Time Protocol, was created because widely deployed networks needed a way to keep machines aligned to UTC without special hardware, across variable-delay IP paths, and at Internet scale. The protocol was designed to be accurate despite variable network delays, robust against faulty or misbehaving time sources, and scalable to very large numbers of clients.

It is now the dominant protocol for clock synchronization over the Internet and LANs.

NTP also does more than exchange timestamps. It includes a clock discipline algorithm: a feedback loop that repeatedly estimates how far the local clock is off and whether it is running slightly fast or slow, then makes small, gradual adjustments so the clock stays close to the reference time without jumping around.

UDP and not TCP

Clock synchronization prefers datagrams over reliable streams because the reliability mechanisms of TCP work against getting good synchronization: a retransmitted segment arrives late, so its timestamps describe a moment that has already passed, and TCP’s buffering, acknowledgments, and congestion control add variable delays that distort round-trip measurements.

NTP does implement its own notion of retry and selection across samples, but it does so at the application level, where it can reject outliers rather than being forced to accept delayed retransmissions as “success.”

NTP Architecture

NTP organizes time servers into a hierarchy of strata:

Stratum 0: These are not networked devices but reference time sources like GPS receivers, atomic clocks, and radio time signal receivers. They provide authoritative time.

Stratum 1: Servers directly connected to stratum 0 devices. A GPS-connected server is stratum 1. These are the most accurate NTP servers. An NTP response identifies what the underlying time source is.

Stratum 2: Servers that synchronize from stratum 1 servers. They may poll multiple stratum 1 servers and use majority voting.

Stratum 3: Servers that synchronize from stratum 2 servers.

And so on, up to stratum 15. Stratum 16 indicates an unsynchronized device.

The stratification serves several purposes. It creates a tree-like distribution topology that scales well. Higher-stratum servers reduce the load on lower-stratum servers. Most importantly, it bounds the synchronization error by limiting the chain length (a tree is much shallower than a chain over the same number of machines).

Fault tolerance through multiple sources: Each NTP server typically synchronizes from multiple sources at the same or lower stratum. Instead of trusting a single time source, NTP queries several servers and uses statistical techniques to identify and reject outliers (faulty or malicious clocks).

For example, if a server queries four time sources receiving values of 12:00:00, 12:00:01, 12:00:02, and 12:57:32, it can identify 12:57:32 as an outlier and ignore it. NTP maintains statistics on each server’s reliability and favors servers with lower jitter (variation in network delay) and dispersion (accumulated timing uncertainty). After rejecting outliers, NTP computes a weighted average of the offset estimates from the remaining servers, giving more weight to more reliable sources. The final time adjustment comes from this averaged estimate, not from selecting a single “best” server. This ensures that if one source fails or provides bad time data, the server can detect and ignore it without losing synchronization.
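A much-simplified sketch of the idea: reject samples far from the median, then take a jitter-weighted average. Real NTP uses intersection and clustering algorithms with per-server statistics; the 10-second cutoff and the weighting here are assumptions for illustration:

```python
import statistics

def combine_offsets(samples):
    """samples: list of (offset_seconds, jitter_seconds), one per server."""
    median = statistics.median(offset for offset, _ in samples)
    kept = [(o, j) for o, j in samples if abs(o - median) < 10.0]  # drop outliers
    weights = [1.0 / max(j, 1e-6) for _, j in kept]                # favor low jitter
    return sum(o * w for (o, _), w in zip(kept, weights)) / sum(weights)

# Offsets matching the example above: 0 s, 1 s, 2 s, and a 3452 s outlier (12:57:32).
samples = [(0.0, 0.002), (1.0, 0.005), (2.0, 0.004), (3452.0, 0.001)]
print(combine_offsets(samples))   # ~0.74; the outlier is rejected despite low jitter
```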

NTP Synchronization Process

NTP uses a series of message exchanges to synchronize. Like Cristian’s algorithm, it measures round-trip time to estimate offset and delay:

  1. Client sends request timestamped \(T_1\) (client time)

  2. Server receives request, timestamps arrival as \(T_2\) (server time)

  3. Server sends response, timestamps departure as \(T_3\) (server time)

  4. Client receives response, timestamps arrival as \(T_4\) (client time)

The offset \(\theta\) (difference between client and server clocks) is:

\[\theta = \frac{(T_2 - T_1) + (T_3 - T_4)}{2}\]

The round-trip delay \(\delta\) is:

\[\delta = (T_4 - T_1) - (T_3 - T_2)\]

NTP collects multiple samples of \((\theta, \delta)\) pairs. It applies a filter algorithm to select the best samples, typically preferring those with low delay (less queueing) and low jitter (consistent delay).
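The arithmetic itself is direct. A sketch that computes \((\theta, \delta)\) for each sample and prefers the minimum-delay one, since it suffered the least queueing (the timestamps are made-up values):

```python
def ntp_offset_delay(T1, T2, T3, T4):
    """Offset theta and round-trip delay delta from one NTP exchange."""
    theta = ((T2 - T1) + (T3 - T4)) / 2.0
    delta = (T4 - T1) - (T3 - T2)
    return theta, delta

exchanges = [                      # (T1, T2, T3, T4) in seconds; sample values
    (10.000, 10.052, 10.053, 10.105),
    (20.000, 20.048, 20.049, 20.090),
]
samples = [ntp_offset_delay(*ts) for ts in exchanges]
best_theta, best_delta = min(samples, key=lambda s: s[1])   # lowest-delay sample
```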

NTP Clock Adjustment: The Discipline Algorithm

NTP estimates that the correct time is the client’s current time plus an offset, \(\theta\). However, NTP does not simply set the clock to that value. Abrupt time changes can break software that assumes time increases monotonically.

Instead, NTP uses a clock discipline algorithm that gradually steers the clock toward the correct time. For small offsets (often less than about 128 ms in common configurations), NTP slews the clock: it adjusts the clock’s frequency to make it run slightly faster or slower until it converges to the reference time. For larger offsets (often above about 128 ms), NTP steps the clock: it jumps the time instantly to the corrected value. Stepping can cause applications to break, so it is avoided when possible.

For very large offsets (on the order of 1,000 seconds or more), many implementations refuse to adjust automatically unless configured to do so, requiring manual intervention or an explicit “step at startup” mode.

The discipline algorithm also estimates the clock’s drift rate. Over time, NTP learns that a particular system’s clock runs, say, 35 ppm fast. It continuously compensates for this drift, keeping the clock accurate even if network synchronization becomes temporarily unavailable.

NTP Accuracy

NTP’s achievable accuracy depends heavily on network conditions and hardware:

Over the Internet: Typical accuracy is 10-100 milliseconds. The limiting factors are variable queueing delays, asymmetric routes, and jitter along the path.

On a LAN: Modern NTP implementations like chrony can achieve sub-millisecond accuracy, since delays are small, relatively symmetric, and far less variable.

With GPS reference and hardware timestamping: Accuracy can reach tens of microseconds, because timestamps bypass most of the software stack and the reference is directly attached.

Simple Network Time Protocol (SNTP)

SNTP is a simplified version of NTP intended for systems that do not require NTP’s full server-selection logic, filtering, or clock-discipline algorithms. An SNTP client typically queries a single time server and performs basic offset calculation without maintaining the peer statistics and filter state that full NTP uses to improve accuracy. This typically yields looser error bounds than a full NTP implementation.

SNTP is common in embedded and IoT devices and in lightweight clients that only need “reasonable” wall time. Many general-purpose operating systems ship a full NTP-capable client (or equivalent), but “SNTP” is still used as a label for simpler NTP-compatible clients and services.
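As an illustration of how little machinery a basic client needs, here is a minimal SNTP-style query in Python. It sends a mode-3 (client) packet and reads the server’s transmit timestamp; pool.ntp.org is just an example server, and a real client would add validation, retries, and sanity checks:

```python
import socket
import struct
import time

NTP_TO_UNIX = 2_208_988_800   # seconds between the NTP epoch (1900) and Unix epoch (1970)

def sntp_time(server: str = "pool.ntp.org", timeout: float = 2.0) -> float:
    """Query an NTP server once; return an estimate of the current Unix time."""
    request = b"\x1b" + 47 * b"\x00"   # LI=0, version=3, mode=3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t0 = time.time()
        sock.sendto(request, (server, 123))
        reply, _ = sock.recvfrom(512)
        t1 = time.time()
    # Transmit timestamp: 64-bit fixed point at bytes 40-47 (seconds, fraction).
    seconds, fraction = struct.unpack("!II", reply[40:48])
    server_time = seconds - NTP_TO_UNIX + fraction / 2**32
    return server_time + (t1 - t0) / 2.0   # add half the RTT, as in Cristian's algorithm

print(sntp_time())
```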

Why not put a GPS receiver in every machine?

GPS can provide a high-quality time reference, so it is natural to ask: why not simply put a GPS receiver in every machine and skip clock synchronization protocols altogether? In practice, every receiver needs an antenna with a view of the sky, adds cost, and is one more component that can fail, which makes per-machine GPS impractical at scale.

In practice, GPS is often used to create a small number of high-quality time sources, which then distribute time to other machines using NTP or PTP.

Precision Time Protocol (PTP)

While NTP provides millisecond accuracy, some applications require much tighter synchronization. IEEE 1588 Precision Time Protocol addresses this need.

The Key PTP Difference: Hardware Timestamping

NTP’s accuracy is limited by software timestamps. When an NTP packet arrives at a network interface, it must pass through the operating system’s network stack, which can take milliseconds and varies with system load. The timestamp is captured in software after these delays.

PTP-capable network interface cards (NICs) capture timestamps in hardware at the physical layer, the instant a packet enters or leaves the wire. This eliminates nearly all software delay and jitter. Instead of millisecond-level variability, timestamps are accurate to nanoseconds.

This requires special hardware support. Not all NICs support PTP. Those that do contain additional circuitry and a high-resolution timer synchronized to an oscillator.

PTP Architecture

Like NTP, PTP uses a master-slave hierarchy. The grandmaster clock is the authoritative time source in a PTP domain. Ordinary clocks synchronize to the grandmaster.

Unlike NTP, PTP synchronization is master-initiated. The grandmaster periodically multicasts sync messages to all slaves. Slaves do not request synchronization; it happens automatically at intervals (typically every second, but configurable).

Best Master Clock Algorithm (BMCA)

A PTP domain may contain multiple clocks capable of acting as a grandmaster. BMCA selects the active grandmaster automatically and provides failover if it disappears.

Clocks periodically send Announce messages describing their properties. Each device runs the same comparison and converges on the same “best” candidate using an ordered comparison. The first field that differs determines the winner:

  1. priority1 (administrator-configured preference)

  2. clockClass (traceability and holdover quality)

  3. clockAccuracy (advertised accuracy category)

  4. offsetScaledLogVariance (stability, roughly short-term variance)

  5. priority2 (secondary administrator preference)

  6. clockIdentity (unique tie-breaker)

If the current grandmaster stops sending Announce messages, the remaining clocks re-run the selection and converge on a new grandmaster without manual intervention.
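Because BMCA is an ordered, field-by-field comparison, it maps naturally onto lexicographic tuple comparison. A sketch in Python, where lower values win and fields are compared in the order listed above (the sample values are made up; clockClass 6 denotes a clock locked to a primary reference):

```python
from dataclasses import dataclass

@dataclass(frozen=True, order=True)
class AnnounceData:
    """Fields are compared in declaration order; the first difference decides."""
    priority1: int
    clock_class: int
    clock_accuracy: int
    offset_scaled_log_variance: int
    priority2: int
    clock_identity: str   # unique, so the comparison always breaks ties

candidates = [
    AnnounceData(128, 248, 0x21, 0xFFFF, 128, "00:1B:19:00:00:01"),
    AnnounceData(128, 6,   0x21, 0xFFFF, 128, "00:1B:19:00:00:02"),  # GPS-locked
]
grandmaster = min(candidates)         # every node computes the same winner
print(grandmaster.clock_identity)     # 00:1B:19:00:00:02
```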

The Four-Message Exchange

PTP uses four timestamped messages to compute the clock offset and network delay:

  1. Sync message: Master sends a sync message at precise time \(T_1\) (master time). The slave receives this message at time \(T_2\) (slave time).

  2. Follow_Up message: Master sends this message carrying \(T_1\) (this two-step process is used because hardware may not be able to embed the timestamp in the sync message itself).

  3. Delay_Req message: Slave sends a delay request at time \(T_3\) (slave time)

  4. Delay_Resp message: Master receives the request at time \(T_4\) (master time) and responds with \(T_4\)

The slave now has four timestamps: \(T_1\), \(T_2\) (when it received the sync), \(T_3\) (when it sent the delay request), and \(T_4\) (when master received the delay request).

Computing Offset and Delay

The master-to-slave delay is the time for the sync message to travel from master to slave:

\[\text{delay}_{\text{MS}} = T_2 - T_1\]

The slave-to-master delay is:

\[\text{delay}_{\text{SM}} = T_4 - T_3\]

Assuming symmetric paths, the one-way delay is:

\[\text{delay} = \frac{\text{delay}_{\text{MS}} + \text{delay}_{\text{SM}}}{2} = \frac{(T_2 - T_1) + (T_4 - T_3)}{2}\]

The offset (difference between master and slave clocks) is the master-to-slave delay minus what it should be:

\[\text{offset} = T_2 - T_1 - \text{delay}\] \[= T_2 - T_1 - \frac{(T_2 - T_1) + (T_4 - T_3)}{2}\] \[= \frac{T_2 - T_1}{2} - \frac{T_4 - T_3}{2}\]

Simplifying:

\[\text{offset} = \frac{(T_2 - T_1) - (T_4 - T_3)}{2}\]

The slave adjusts its clock by this offset. With hardware timestamping, the timestamps are accurate to tens of nanoseconds, so the offset calculation is highly precise.
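A sketch of the slave-side arithmetic with made-up nanosecond timestamps, where the slave is actually 500 ns ahead and the one-way delay is 1,000 ns:

```python
def ptp_offset_and_delay(T1, T2, T3, T4):
    """Slave-side PTP computation, assuming symmetric path delay."""
    delay  = ((T2 - T1) + (T4 - T3)) / 2.0   # estimated one-way delay
    offset = ((T2 - T1) - (T4 - T3)) / 2.0   # slave clock minus master clock
    return offset, delay

offset, delay = ptp_offset_and_delay(0, 1_500, 3_000, 3_500)
print(offset, delay)   # 500.0 1000.0 -> slave subtracts 500 ns from its clock
```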

Transparent and Boundary Clocks

In a network with multiple switches, variable switch residence time and queueing can add jitter to timing packets. PTP-aware switches can improve accuracy:

Transparent clocks measure how long a PTP event message spends inside the switch (its residence time) and update the message’s correction field so downstream devices can compensate for that added delay. A transparent clock is not a master for downstream devices; it forwards timing messages while accounting for its own contribution.

Boundary clocks act as PTP slaves on one port and masters on other ports. They synchronize to an upstream master, then serve as a master to downstream slaves. This segments the network, preventing jitter from accumulating across many switches.

PTP Use Cases

PTP is used in various applications that require sub-microsecond synchronization. For example:

The Cost of Precision

PTP’s precision comes at a cost:

Hardware requirements: PTP-capable NICs cost more than standard NICs. PTP-aware switches are significantly more expensive than regular switches.

Infrastructure: A complete PTP deployment requires grandmaster clocks (often GPS-disciplined), transparent or boundary clock switches, and PTP-enabled NICs on all hosts.

Complexity: Configuring PTP profiles, managing failover, and troubleshooting synchronization problems requires expertise.

For applications that can tolerate millisecond accuracy, NTP remains the practical choice. PTP is worth the investment only when microsecond or sub-microsecond accuracy is truly necessary.

Cloud Time Services

Cloud providers increasingly offer high-accuracy time services:

AWS Time Sync Service: Available within Amazon VPC, provides time accurate to microseconds using a combination of GPS and atomic clocks across AWS regions.

Azure Precision Time Protocol: Microsoft’s PTP implementation for Azure, targeting sub-millisecond accuracy.

Google’s Public NTP: Google operates public NTP servers (time.google.com) with leap second smearing and sub-millisecond accuracy for many users. Smearing is a specific form of slewing where the leap second is intentionally spread out over a longer duration (like 24 hours) to avoid a sudden jump.

These services lower the barrier to accurate time synchronization. Instead of deploying GPS clocks and PTP infrastructure, cloud-native applications can use provider-managed time services.

However, cross-cloud or hybrid deployments still face challenges. If your application spans AWS and Azure, you still need to account for potential clock skew between providers.

TrueTime: A Brief Introduction

Google’s Spanner database uses a system called TrueTime that represents a different approach to distributed time. Instead of providing a single time value, TrueTime provides an interval guaranteed to contain true UTC time.

For example, TrueTime might return the UTC interval [12:00:00.003, 12:00:00.010], indicating that true UTC lies somewhere within this 7-millisecond window. Google achieves this using GPS clocks and atomic clocks in each data center, with frequent synchronization between them, keeping the uncertainty interval below 7 milliseconds.

Spanner uses these intervals to provide external consistency (strict serializability) across globally distributed transactions. When a transaction commits, Spanner waits until the uncertainty interval has passed before acknowledging. This ensures any subsequently started transaction has a definitively later timestamp.
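The essential mechanism can be sketched in a few lines. TT.now() with earliest and latest bounds mirrors the TrueTime interface; the 7 ms bound and the polling loop are illustrative assumptions:

```python
import time
from dataclasses import dataclass

@dataclass
class TTInterval:
    earliest: float   # true UTC is guaranteed to be >= earliest
    latest: float     # ... and <= latest

def tt_now() -> TTInterval:
    """Stand-in for TrueTime's TT.now(), assuming a 7 ms uncertainty bound."""
    t = time.time()
    return TTInterval(t - 0.007, t + 0.007)

def commit_wait(commit_ts: float) -> None:
    """Block until commit_ts is definitely in the past for every observer."""
    while tt_now().earliest <= commit_ts:
        time.sleep(0.001)
```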

We will cover TrueTime and Spanner’s consistency model in detail when we discuss distributed databases. For now, it is important to know that TrueTime exists and represents the cutting edge of distributed time synchronization, though it requires specialized infrastructure not available to most systems.

Summary

Physical clock synchronization aims to keep clocks across distributed systems aligned with UTC time. This is necessary for applications that need wall-clock time for timeouts, cache expiration, logs, audit trails, and coordination with external systems.

Computers measure time using quartz oscillators that inevitably drift due to temperature, manufacturing variations, and aging. Drift rates of around 50 ppm are common, causing clocks to drift apart without periodic synchronization.

Cristian’s algorithm provides basic client-server synchronization by measuring round-trip time and compensating for network delay. Error bounds can be computed if minimum transmission time is known, and errors accumulate in synchronization chains.

NTP provides robust, scalable clock synchronization over the Internet using a hierarchical architecture. It achieves 10-100 milliseconds over the Internet, sub-millisecond on LANs with modern implementations. The clock discipline algorithm slews the clock for small offsets (gradual adjustment) and steps it for large offsets (instant jump).

PTP achieves sub-microsecond synchronization in LANs using hardware timestamping. Masters periodically initiate synchronization rather than waiting for client requests. The offset and delay computation uses four timestamps captured in hardware. PTP requires specialized NICs and switches, limiting deployment to applications that truly need microsecond precision.

Understanding physical clock synchronization provides the foundation for distributed systems that need wall-clock time. However, many consistency and ordering problems require logical clocks, which we will cover separately.


Next: Logical Clocks



  1. Coordinated Universal Time in English and Temps Universel Coordonné in French. The abbreviation UTC was chosen to favor neither language.