pk.org: CS 417/Lecture Notes

Communication and Networking in Distributed Systems

How machines and processes communicate

Paul Krzyzanowski – 2026-01-05

Goal: Create communication channels for processes on different machines to communicate

Communication as the Basis for Coordination

Processes running on different machines each have access to their local operating system mechanisms, but those mechanisms apply only within a single system. Shared memory regions, pipes, message queues, and kernel-managed synchronization primitives such as semaphores or mutexes cannot be used to communicate or synchronize with processes on other machines.

As a result, communication and coordination across machines must be implemented by sending messages over a network. This shift has deep consequences.

Network communication is slower, less predictable, and more prone to failure than local interprocess communication. Messages may be delayed, duplicated, reordered, or lost, and there is no implicit synchronization provided by the operating system across machines.

Distributed systems must therefore be designed around explicit message exchange rather than relying on local operating system abstractions.


Sharing a Network: Why Packet Switching Exists

If every pair of communicating computers had a dedicated physical connection between them, communication would be simple. There would be no contention, no interference, and no uncertainty. Unfortunately, this approach does not scale. It is wasteful, inflexible, and impractical for large numbers of communicating systems.

The core challenge in networking is sharing communication infrastructure among many participants while allowing them to communicate concurrently. This is known as the multiple access problem: how to coordinate multiple transmitters so they can send data without interfering with each other.

Early networks explored several approaches. Channel partitioning divides the network into fixed slots, either in time (time division multiplexing) or frequency (frequency division multiplexing). Taking-turns approaches grant permission to transmit through polling or token passing. These techniques provide predictability but are often inefficient and brittle in the presence of failures.

Modern networks overwhelmingly use packet switching. Data streams are divided into packets, which are transmitted independently and interleaved with packets from other senders. Each packet contains header data that identifies the source and destination addresses. This form of statistical multiplexing enables the efficient use of network resources and supports a large number of communicating systems.
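
To make the idea concrete, here is a minimal sketch in Python with invented field names and made-up addresses: a byte stream is split into independently deliverable packets, each carrying source and destination addresses in its header so it can be interleaved with traffic from other senders.

    from dataclasses import dataclass

    @dataclass
    class Packet:
        src: str        # source address
        dst: str        # destination address
        seq: int        # position of this payload within the original stream
        payload: bytes  # a chunk of the application's data

    def packetize(data: bytes, src: str, dst: str, mtu: int = 1200) -> list[Packet]:
        """Split a byte stream into independent packets of at most `mtu` payload bytes."""
        return [Packet(src, dst, offset, data[offset:offset + mtu])
                for offset in range(0, len(data), mtu)]

    # Packets from different senders can now be interleaved on a shared link.
    packets = packetize(b"x" * 5000, src="192.0.2.1", dst="198.51.100.7")
    print(len(packets), "packets")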

Packet switching trades predictability for flexibility and cost. There are no guarantees about delivery time, order, or even delivery itself. Distributed systems must be built with this reality in mind.


A Layered View of Networking

Networking is implemented as a layered stack of protocols, each responsible for a specific function. The most commonly referenced conceptual model is the OSI reference model [1], which defines seven layers. While the OSI model does not correspond exactly to any real network stack, it provides a useful vocabulary for reasoning about where functionality lives.

Although the OSI reference model defines seven layers, IP networking does not implement all of them as part of the operating system. In practice, the operating system provides support through the transport layer: local communication on a physical network, packet routing across networks using IP, and process-to-process communication using transport protocols such as TCP and UDP. Layers above the transport layer are not provided by the network stack. Application-level protocols, data representation, security mechanisms, and semantics are implemented entirely in application software.

The layers most relevant to distributed systems are the data link, network, transport, and application layers.

The data link layer handles communication on a single physical network. Technologies such as Ethernet and Wi-Fi operate at this layer. They define how packets are transmitted over a shared medium and how devices on the same local network identify each other. This layer forms the local area network.

The network layer is responsible for routing packets across multiple networks. This is where the Internet Protocol (IP) operates. IP allows data to be sent between machines that are not directly connected, potentially traversing many intermediate networks via routers. Routers are special-purpose computers that receive an IP packet on one interface and forward it on another interface toward its destination.

The transport layer provides process-to-process communication. Instead of sending packets to machines, applications can send data to other applications. TCP and UDP operate at this layer.

Higher layers handle application-specific protocols and data representation.

The key point is that each layer builds on the services of the one below it while hiding its details. Each layer is defined by a stable interface rather than a specific implementation, which allows it to be replaced or modified without affecting the layers above or below it.

Conceptually, each layer communicates with the corresponding layer on the remote system. For example, the transport layer on one host communicates with the transport layer on another, even though the data is actually carried by lower layers in between. This structure allows protocols to be designed and reasoned about independently at each layer.

For example, Ethernet can be replaced with other link-layer technologies without changing the Internet Protocol. At the network layer, IPv4 can be replaced with IPv6 without requiring changes to transport protocols such as TCP or UDP. This separation of concerns allows network technologies to evolve while preserving compatibility with existing software.


The Internet as a Packet-Switched Network

The Internet is a logical network that interconnects many physical networks. It makes very few assumptions about the underlying technologies. Ethernet, Wi-Fi, fiber optic links, and long-haul optical transport networks can all carry Internet traffic.

This design originated in early packet-switched networks such as ARPANET. The guiding principles were simplicity, robustness, and the absence of centralized control.

IP assumes unreliable communication. Note that unreliable does not mean that packets are usually lost. It means that delivery is not guaranteed. If reliable delivery is required, it must be implemented by software running on the endpoints.

Routers connect networks together and forward packets based on destination addresses. They do not maintain state about ongoing conversations. This allows the network to scale and adapt to failures but shifts complexity to the endpoints.
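
As a rough illustration of stateless forwarding, the sketch below chooses an outgoing interface by matching a packet's destination address against a table of network prefixes, with no memory of earlier packets; the prefixes and interface names are invented for the example.

    import ipaddress

    # A toy forwarding table: destination prefix -> outgoing interface (made-up entries).
    FORWARDING_TABLE = {
        ipaddress.ip_network("10.0.0.0/8"): "eth0",
        ipaddress.ip_network("192.168.1.0/24"): "eth1",
        ipaddress.ip_network("0.0.0.0/0"): "eth2",   # default route
    }

    def forward(dst: str) -> str:
        """Pick the interface with the longest matching prefix for this destination."""
        addr = ipaddress.ip_address(dst)
        matches = [net for net in FORWARDING_TABLE if addr in net]
        best = max(matches, key=lambda net: net.prefixlen)
        return FORWARDING_TABLE[best]

    print(forward("192.168.1.40"))  # eth1
    print(forward("203.0.113.5"))   # eth2 (falls through to the default route)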


Network Layer: IP

The Internet Protocol operates at the network layer. Its job is to move packets from a source machine to a destination machine, possibly across many intermediate networks.

IP provides unreliable, connectionless datagram delivery. Each packet is treated independently. There is no notion of a connection, no guarantee of delivery, and no guarantee of ordering.

Because IP is a logical network, it relies on an underlying data link layer to transmit packets on a local network. When an IP packet is sent on an Ethernet network, it is encapsulated inside an Ethernet frame. Mapping between IP addresses and link-layer addresses is handled by mechanisms such as the Address Resolution Protocol (ARP) for IPv4 or Neighbor Discovery Protocol (NDP) for IPv6.


The End-to-End Principle

The Internet was designed around a simple but powerful architectural idea known as the end-to-end principle.

At a high level, the end-to-end principle says that functionality should be implemented at the endpoints of a system whenever possible, rather than inside the network itself.

In the context of the Internet, this means that the network provides a basic packet delivery service, while higher-level properties are implemented in software running on the communicating hosts.

As a result, the Internet itself makes very few guarantees. It attempts to deliver packets, but it does not guarantee delivery, ordering, duplication avoidance, or timing. Those properties, when needed, are implemented by protocols and applications running at the edges of the network.

This design choice was deliberate. Implementing complex functionality inside the network would have required routers to maintain per-connection state, coordinate with each other, and handle various failure scenarios. That approach would have limited scalability and made the network fragile.

By keeping the network simple and pushing complexity to the endpoints, the Internet can scale to large numbers of machines and adapt to failures without centralized control.

Reliability, Ordering, and Security at the Edges

The consequences of the end-to-end principle are visible throughout the Internet stack.

Reliable delivery is not provided by IP. Instead, it is implemented by protocols such as TCP, which run on end hosts and retransmit lost data as needed.
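
To show what endpoint-based recovery looks like, here is a deliberately simplified stop-and-wait sketch built on UDP sockets: the sender numbers a message, waits for an acknowledgment, and retransmits on a timeout. This is not TCP's actual mechanism, and the destination address, port, and header format are made up.

    import socket

    def send_reliably(data: bytes, dest=("127.0.0.1", 9999), timeout=0.5, retries=5):
        """Stop-and-wait sketch: send one numbered datagram, await an ACK, retransmit on timeout."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(timeout)
        seq = 0
        message = seq.to_bytes(4, "big") + data        # prepend a sequence number
        try:
            for _ in range(retries):
                sock.sendto(message, dest)             # (re)transmit the datagram
                try:
                    ack, _ = sock.recvfrom(16)
                    if int.from_bytes(ack[:4], "big") == seq:
                        return True                    # receiver acknowledged this message
                except socket.timeout:
                    continue                           # lost or delayed; try again
            return False                               # give up after `retries` attempts
        finally:
            sock.close()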

Ordering is also handled at the endpoints. IP packets may arrive out of order, but TCP reassembles them into a consistent byte stream.

Security follows the same pattern. The network does not authenticate endpoints or protect data from modification. Instead, encryption, authentication, and integrity checks are implemented by protocols such as TLS, which run above the transport layer.
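
For example, a client can layer TLS on top of an ordinary TCP socket entirely in user space. A minimal sketch using Python's standard ssl module follows; the host name is a placeholder, and the request is illustrative only.

    import socket, ssl

    context = ssl.create_default_context()   # verifies server certificates against system CAs

    # Establish a plain TCP connection, then wrap it with TLS above the transport layer.
    with socket.create_connection(("www.example.com", 443)) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname="www.example.com") as tls_sock:
            tls_sock.sendall(b"GET / HTTP/1.1\r\nHost: www.example.com\r\nConnection: close\r\n\r\n")
            print(tls_sock.recv(200))         # first bytes of the encrypted, authenticated reply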

This approach allows applications to choose which guarantees they need. Some applications require reliable, ordered delivery. Others prefer lower latency and are willing to tolerate loss or reordering. The network does not force a single model on all traffic.

Fate Sharing

A related idea is fate sharing.

Fate sharing means that state associated with a communication should reside in the same place as the communicating endpoints. If an endpoint crashes, the state associated with that communication is lost as well. This is preferable to storing critical state in the network, where it could be lost due to router failures that are unrelated to the endpoints.

TCP connections, for example, maintain their state at the endpoints. If a router crashes, packets may be lost, but the connection state remains intact at the hosts. If an endpoint crashes, the connection state is lost, which is consistent with the fact that communication can no longer proceed anyway.

Fate sharing reinforces the end-to-end principle by ensuring that failures affect only the components that were already involved in the communication.

This was a design choice

Fate sharing was not an obvious or universal design choice. Earlier communication systems, such as the traditional telephone network and virtual circuit networks, maintained connection state inside the network. Routers or switches were aware of ongoing conversations and could reserve resources along a path.

This approach has advantages. By maintaining per-connection state, the network can provide stronger guarantees about performance, such as controlling the quality of service on a connection through resource reservation. However, it also makes the network more complex and less flexible. Adding new capabilities often requires modifying network devices themselves, rather than simply updating software at the endpoints.

The Internet took a different approach. By keeping the network largely stateless and placing connection state at the endpoints, it favored simplicity, robustness, and scalability over built-in guarantees. Fate sharing ensures that when state is lost, it is lost only when an endpoint has already failed, which is consistent with the fact that communication cannot continue anyway.

Relevance to Distributed Systems Design

The end-to-end principle explains why distributed systems must handle many concerns explicitly.

The Internet provides what is known as best-effort delivery. This means that the network attempts to deliver packets but does not guarantee that they will arrive, arrive only once, arrive in order, or arrive within any particular amount of time. Most packets are delivered successfully and promptly, but the network makes no promises when congestion, failures, or routing changes occur.

Best-effort delivery is not a flaw or an oversight. It is a deliberate design choice that allows the network to remain simple, scalable, and resilient. When packets are dropped or delayed, the network does not attempt to recover them. Instead, recovery, if needed, is handled by software running on the communicating hosts.

As a result, distributed systems must decide explicitly how to handle loss, duplication, reordering, and delay. Security follows the same pattern: because it is implemented at the endpoints, systems must authenticate peers and protect messages themselves. Because ordering is not guaranteed, systems must also reason carefully about time and causality.

These challenges are the direct result of a network architecture that favors scalability, robustness, and flexibility over strong built-in guarantees.

Understanding the end-to-end principle helps explain why distributed systems look the way they do, and why problems that seem straightforward on a single machine become challenging once communication happens over a network.

These architectural choices also have important performance implications, particularly in how networks balance delay and capacity.

Latency and Throughput

Two performance metrics appear repeatedly in distributed systems: latency and throughput.

Latency measures how long it takes for a single message to travel from sender to receiver.

Throughput measures the amount of data transferred per unit of time. These metrics are related but distinct, and improving one does not necessarily improve the other.
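
A quick back-of-the-envelope calculation shows why the two metrics are distinct. Assuming a 100 ms one-way delay and a 1 Gb/s link (made-up but plausible numbers), delivery time is roughly the propagation delay plus the transmission time, so small messages are dominated by latency while large transfers are dominated by throughput.

    LATENCY_S = 0.100                 # one-way propagation delay: 100 ms (assumed)
    BANDWIDTH_BPS = 1_000_000_000     # link throughput: 1 Gb/s (assumed)

    def delivery_time(message_bits: int) -> float:
        """Approximate one-way delivery time: propagation delay + transmission time."""
        return LATENCY_S + message_bits / BANDWIDTH_BPS

    print(delivery_time(1_000))           # ~0.100001 s: a small message is dominated by latency
    print(delivery_time(8_000_000_000))   # ~8.1 s: a 1 GB transfer is dominated by throughput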

Packet-switched networks are designed primarily to maximize throughput by sharing network capacity among many flows. This allows many systems to communicate concurrently, but it introduces variable delay. As a result, individual messages may experience unpredictable latency even when overall throughput is high.

Many protocol design choices reflect tradeoffs between latency and throughput. TCP’s reliability and ordering improve throughput for long data transfers, but can increase latency when packets are lost. UDP avoids these delays but shifts responsibility to the application. QUIC attempts to reduce latency without sacrificing reliability by changing how streams are managed.

Distributed systems must therefore be designed with an understanding of which metric matters more for a given workload.

Transport Layer: TCP and UDP

While IP moves packets between machines, distributed systems require communication between processes. This is the role of the transport layer.

The transport layer introduces port numbers, which allow the operating system to demultiplex incoming data and deliver it to the correct application.

There are two widely used transport protocols on top of IP: TCP and UDP.

Port Numbers: An Analogy

We can think of a machine as a post office and port numbers as post office boxes.

Mail is delivered to a post office based on its address, but individual people do not receive mail directly from the post office itself. Instead, mail is delivered to a specific post office box inside that building. A person may have one or more boxes, and each box has a number that uniquely identifies it within that post office.

Port numbers work the same way. An IP address identifies the destination machine, while a port number identifies a specific communication endpoint (socket) on that machine. A single process may use multiple ports for different connections. A single machine can run multiple services simultaneously, each listening on a different port. The operating system acts like the postmaster, ensuring that incoming data is delivered to the correct socket based on the destination port number.

Some services use well-known, pre-assigned port numbers so clients know where to find them. For example, HTTP typically uses TCP port 80, HTTPS uses TCP port 443, and secure IMAP for reading email commonly uses port 993. Client applications, by contrast, are usually assigned an unused, ephemeral port number by the operating system for the duration of a connection.
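
A small check using Python's socket module illustrates the asymmetry (the host name is a placeholder): the client targets a well-known server port, while the operating system picks an ephemeral local port on its behalf.

    import socket

    # Connect to a well-known service port; the OS assigns our own ephemeral port.
    with socket.create_connection(("www.example.com", 80)) as sock:
        local_ip, local_port = sock.getsockname()    # ephemeral port chosen by the OS
        remote_ip, remote_port = sock.getpeername()  # the server's well-known port (80)
        print(f"local {local_ip}:{local_port} -> remote {remote_ip}:{remote_port}")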

TCP

TCP provides a reliable, ordered byte stream. It ensures that data arrives in order, retransmits lost packets, and regulates transmission through flow and congestion control.

TCP gives applications the illusion of a continuous, reliable connection, even though the underlying network is unreliable. This illusion is created entirely by software at the endpoints, following the Internet’s end-to-end principle.

Because TCP presents data as a single ordered byte stream, it must deliver bytes to the application in order. If a packet carrying earlier data is lost, TCP cannot deliver any subsequent data to the application until the missing data is retransmitted and received. Even if later packets arrive successfully, they must wait.

This behavior is known as head-of-line blocking. It simplifies the programming model by preserving a strict order, but it can increase latency when packet loss occurs.
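
The simplified sketch below illustrates the effect: a receiver buffers out-of-order segments (indexed here by byte offset), but it can hand data to the application only up to the first gap, so everything behind a missing segment waits.

    def deliverable(buffer: dict[int, bytes], next_offset: int) -> bytes:
        """Return the contiguous bytes starting at next_offset; data after a gap stays buffered."""
        out = b""
        while next_offset in buffer:
            segment = buffer.pop(next_offset)
            out += segment
            next_offset += len(segment)
        return out

    # Segments at offsets 100 and 200 have arrived, but the segment at offset 0 was lost:
    # nothing can be delivered until the missing data is retransmitted and received.
    buffer = {100: b"B" * 100, 200: b"C" * 100}
    print(deliverable(buffer, next_offset=0))       # b'' : head-of-line blocking
    buffer[0] = b"A" * 100                          # retransmitted data finally arrives
    print(len(deliverable(buffer, next_offset=0)))  # 300 : everything now flows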

UDP

UDP provides connectionless datagram delivery. It preserves message boundaries but offers no guarantees about delivery or ordering.

UDP is lightweight and flexible, but all reliability, ordering, and congestion control must be handled by the application. Many distributed systems build custom protocols on top of UDP to regain control over timing and behavior.


Choosing Between TCP and UDP

TCP and UDP make different tradeoffs, and the choice between them has significant implications for system design.

TCP is the dominant transport protocol because it provides a simple and powerful abstraction: a reliable, ordered byte stream. Before data is exchanged, a connection is established, allowing the endpoints to negotiate parameters and initialize state. Lost data is retransmitted, data is delivered in order, and congestion control helps prevent the network from being overwhelmed.

This abstraction greatly simplifies application development. Many distributed systems prefer TCP not because it is fast, but because it hides much of the complexity of unreliable communication.

UDP, by contrast, provides minimal services. There is no connection setup, no reliability, and no ordering. Each packet is sent independently, and the network makes a best-effort attempt to deliver it.

This minimalism has advantages. UDP has lower overhead than TCP, both in terms of protocol processing and latency. There is no connection establishment delay, and applications can send data immediately without waiting for a handshake. Message boundaries are preserved, which can simplify some protocols.

As a result, UDP is commonly used in situations where low latency matters more than reliability, or where the application can tolerate loss. Streaming media is a well-known example, but many infrastructure services also rely on UDP. Network Time Protocol (NTP) and Domain Name System (DNS) queries use UDP because requests are short, loss can be handled with retries, and avoiding connection setup reduces overhead.

Importantly, choosing UDP does not mean abandoning reliability altogether. Many systems implement their own reliability, ordering, or congestion control on top of UDP when the semantics provided by TCP are not a good fit. QUIC, which we’ll touch upon next, is an example of this approach.

The choice between TCP and UDP is therefore not about performance alone. It is about which responsibilities are handled by the transport protocol and which are handled by the application.

QUIC and User-Space Transport

QUIC is a modern [2] transport protocol built on top of UDP and is widely used in today’s web infrastructure, most notably as the transport for HTTP/3. At first glance, building a reliable transport protocol on top of UDP may seem odd, since UDP itself provides no reliability, ordering, or congestion control. Why not just use TCP?

The key point is that QUIC implements these features in user space rather than relying on the operating system’s transport stack.

Conceptually, QUIC occupies the same role as TCP. It provides reliable delivery, congestion control, and flow control. However, instead of presenting a single ordered byte stream, as a TCP connection does, QUIC supports multiple independent streams within a single connection. Loss or delay affecting one stream does not block progress on others, avoiding head-of-line blocking across streams.

QUIC also integrates security directly into the transport protocol. Encryption and authentication are not layered on afterward but are part of the connection from the beginning. This reflects a broader trend in Internet protocol design toward treating security as a baseline requirement rather than an optional add-on.

Running QUIC in user space has important practical consequences. Updating TCP behavior typically requires changes to the operating system kernel, which can take years to deploy widely. QUIC, by contrast, can be updated as part of an application or library, allowing faster iteration and experimentation.

From a distributed systems perspective, QUIC does not change the fundamental assumptions of networking. The network still provides best-effort packet delivery, and failures, delay, and reordering are still possible. QUIC is best understood as a reorganization of responsibilities between the operating system and applications, not as a departure from the Internet’s end-to-end design.

Sockets: The Application Interface to the Network

Sockets are the operating system’s abstraction for network communication.

They provide a uniform interface for applications to send and receive data, regardless of the underlying network or protocol. From the application’s perspective, sockets are the only way to interact with the network.

For connection-oriented protocols such as TCP, servers create sockets, bind them to addresses, listen for incoming connections, and accept them. Clients create sockets and connect to servers. Once connected, data can be sent and received using standard read and write operations.

For connectionless protocols such as UDP, sockets send and receive individual datagrams. There is no connection setup or teardown, and each message explicitly identifies its destination.

Sockets hide many low-level details but do not eliminate the fundamental properties of network communication.

Socket Operations at a High Level

Sockets are accessed through a small set of system calls provided by the operating system. While the details vary by language and platform, the overall sequence is consistent and helps show the difference between using TCP and UDP.

For a TCP server, the typical sequence is:

  1. Create a socket.
  2. Bind the socket to a local address and port number.
  3. Listen, telling the operating system to queue incoming connection requests.
  4. Accept a connection, which returns a new socket dedicated to that client.
  5. Read and write data on that socket, and close it when finished.

For a TCP client, the sequence is simpler:

  1. Create a socket.
  2. Connect to the server’s address and port.
  3. Read and write data, and close the socket when finished.

TCP provides a reliable, ordered byte stream abstraction. The operating system delivers data as a continuous stream of bytes, with no inherent message boundaries. Applications must impose their own structure on the stream if they want to interpret messages.
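
A minimal sketch of these steps using Python's socket module follows; the port number and messages are arbitrary, and the server and client would run as separate processes.

    import socket

    def tcp_server(port=5000):
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # create a TCP socket
        srv.bind(("", port))                                      # bind to a local port
        srv.listen(5)                                             # accept incoming connections
        conn, addr = srv.accept()                                 # new socket for this client
        data = conn.recv(1024)                                    # read bytes from the stream
        conn.sendall(b"echo: " + data)                            # write bytes back
        conn.close()
        srv.close()

    def tcp_client(host="localhost", port=5000):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # create a TCP socket
        sock.connect((host, port))                                # connect to the server
        sock.sendall(b"hello")                                    # write to the byte stream
        print(sock.recv(1024))                                    # read the reply
        sock.close()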

Unlike TCP, UDP does not use connections. A typical UDP interaction looks like:

  1. Create a socket.
  2. Optionally bind it to a port (servers do this so that clients know where to send requests).
  3. Send and receive individual datagrams, naming the remote address on each send and learning the sender’s address on each receive.
  4. Close the socket when finished.

UDP provides a message-oriented abstraction. Each call to send data corresponds to a single datagram (packet), and message boundaries are preserved.
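
The corresponding UDP sketch, again with an arbitrary port, sends and receives individual datagrams and names the destination on every send.

    import socket

    def udp_server(port=5001):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)   # create a UDP socket
        sock.bind(("", port))                                     # so clients know where to send
        data, client_addr = sock.recvfrom(1024)                   # one datagram, plus its source
        sock.sendto(b"echo: " + data, client_addr)                # reply to that source
        sock.close()

    def udp_client(host="localhost", port=5001):
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.sendto(b"hello", (host, port))                       # destination named per message
        data, _ = sock.recvfrom(1024)
        print(data)
        sock.close()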

Note that TCP connections involve explicit setup and teardown operations that set or delete the communication state tracked by the operating system, while UDP communication consists of independent message exchanges with no persistent connection state.

Many higher-level programming languages, such as Java or Python, provide networking libraries that combine several of these steps into a single method call. For example, creating and listening on a server socket may appear as one operation. These libraries do not eliminate the underlying mechanisms; they invoke the same OS system calls internally. The abstraction is higher-level, but the communication model remains the same.
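
For instance, Python's socket.create_connection combines socket creation, name resolution, and connection setup into one call, but the same OS-level operations still happen underneath; the host name below is a placeholder.

    import socket

    # One call creates the socket, resolves the name, and connects; the underlying
    # socket() and connect() system calls still occur inside the library.
    with socket.create_connection(("www.example.com", 80)) as sock:
        sock.sendall(b"HEAD / HTTP/1.1\r\nHost: www.example.com\r\nConnection: close\r\n\r\n")
        print(sock.recv(200))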


Many Networks, One Focus

Many networking technologies exist, including Bluetooth, cellular networks, sensor meshes, and specialized industrial protocols. While these technologies are important, most distributed systems are built on top of IP networking.

In this course, we will therefore focus primarily on IP-based communication. The concepts introduced here generalize to other network technologies, but IP provides a common foundation.


Protocol Encapsulation

Each layer of the network stack treats the data from higher layers as payload.

A TCP segment is encapsulated inside an IP packet. An IP packet is encapsulated inside a link-layer frame (like an Ethernet or Wi-Fi frame). This wrapping process is known as protocol encapsulation.
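
Here is a simplified sketch of encapsulation using invented, fixed-size headers rather than real protocol formats: each layer prepends its own header and treats everything handed down from above as opaque payload.

    # Each layer prepends its own (made-up, simplified) header to the payload from above.
    def tcp_segment(payload: bytes, src_port: int, dst_port: int) -> bytes:
        return src_port.to_bytes(2, "big") + dst_port.to_bytes(2, "big") + payload

    def ip_packet(payload: bytes, src_ip: bytes, dst_ip: bytes) -> bytes:
        return src_ip + dst_ip + payload          # the TCP segment is just payload here

    def ethernet_frame(payload: bytes, src_mac: bytes, dst_mac: bytes) -> bytes:
        return src_mac + dst_mac + payload        # the IP packet is just payload here

    app_data = b"GET / HTTP/1.1\r\n\r\n"
    segment  = tcp_segment(app_data, 49152, 80)
    packet   = ip_packet(segment, bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7]))
    frame    = ethernet_frame(packet, bytes(6), bytes.fromhex("aabbccddeeff"))
    print(len(app_data), len(segment), len(packet), len(frame))   # each layer adds its header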

Encapsulation allows each layer to operate independently while composing into a complete system.


Summary

Distributed systems rely on packet-switched communication over unreliable networks.

Layered protocol stacks separate concerns, while sockets provide applications with a uniform interface to the network. TCP and UDP offer different tradeoffs between reliability, ordering, and control.

Understanding these networking fundamentals is essential for reasoning about performance, failure, and correctness in distributed systems.




  1. The OSI model is formally called the Open Systems Interconnection (OSI) Reference Model. It is a standard of ISO, the International Organization for Standardization, published as ISO/IEC 7498-1. 

  2. QUIC was created by Google in 2012 to improve web performance and standardized by the IETF in 2021. In contrast, TCP was published in 1974 and UDP in 1980.