Introduction: Why we need protocols
Up to this point, we have studied the building blocks of cryptography:
- Public key cryptography for exchanging secrets and verifying identities
- Digital signatures for proving authenticity
- Diffie-Hellman for agreeing on a shared key
- Symmetric encryption for fast, bulk protection of data
- Hashes and MACs for integrity
These tools are powerful, but by themselves they are not enough. We need a protocol: a set of ordered steps that two strangers on the Internet can follow to establish a secure channel. Each step in the protocol has a purpose: authenticate, agree on keys, prevent tampering, and ultimately enable secure communication.
Without a protocol, we would know how to encrypt data or sign a message, but we wouldn’t know when or how to use each operation.
The most widely used protocol today is Transport Layer Security (TLS). It secures HTTPS, email, VPNs, and countless other applications. TLS gives applications the illusion that they are talking over a normal TCP socket, but all traffic is encrypted and authenticated underneath.
With TLS, we see how to authenticate servers, agree on a shared session key, and switch to efficient symmetric encryption for bulk data. This design demonstrates why cryptography is not just math but also careful engineering of interactions.
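To make the "normal socket" illusion concrete, here is a minimal sketch using Python's standard `ssl` module. The helper names `make_client_context` and `fetch_head` are made up for this illustration, and the host is whatever the caller supplies; the point is only that the application reads and writes as usual once the socket is wrapped.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side context that verifies the server certificate."""
    ctx = ssl.create_default_context()            # loads the system trust store
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3  # refuse anything older than TLS 1.3
    return ctx


def fetch_head(host: str, port: int = 443) -> bytes:
    """Send a tiny HTTP request over TLS and return the first bytes of the reply."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=5) as raw_sock:
        # wrap_socket runs the handshake; afterwards it is used like a plain socket.
        with ctx.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls_sock.sendall(request.encode("ascii"))
            return tls_sock.recv(1024)
```

Calling `fetch_head` with a real host name needs network access; everything the rest of this section describes happens inside the `wrap_socket` call.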
A Short History
TLS is the successor to Secure Sockets Layer (SSL), a protocol developed by Netscape in the mid-1990s to secure web traffic. SSL 2.0 was the first widely deployed version, but it had serious flaws. SSL 3.0, released in 1996, corrected many of them and influenced the design of TLS.
The Internet Engineering Task Force (IETF) standardized TLS 1.0 in 1999 as an open, interoperable protocol. Since then, several versions have followed. TLS 1.3, finalized in 2018, is the version we use today. It simplified the handshake, removed outdated algorithms, and mandated forward secrecy by default.
Goals of TLS
TLS has four key goals:
- Authentication: The client knows it is talking to the real server.
- Confidentiality: Messages are encrypted so nobody else can read them.
- Integrity: Messages cannot be altered without detection.
- Forward secrecy: Even if a long-term private key is stolen later, old sessions remain safe.
The Components Brought Together
TLS 1.3 combines the primitives we studied:
- Diffie-Hellman: Each side creates an ephemeral key pair for the handshake. This ensures forward secrecy.
- Digital certificates: The server proves its identity with an X.509 certificate that binds its long-term key to its domain name.
- Public key cryptography: Used to validate certificates and signatures.
- Digital signatures: The server signs the handshake transcript to prove it possesses the private key.
- Hash functions: Used in transcript hashing and in HKDF, the key derivation function. (This will be explained after the walkthrough.)
- HMAC: Used to authenticate the handshake’s “Finished” messages.
- Symmetric cryptography (AES-GCM, ChaCha20-Poly1305): Used to protect the actual application data.
TLS 1.3 Handshake Walkthrough
TLS 1.3 is designed to be simpler, faster, and more secure than earlier versions. Here are the essential steps.
Step 1: ClientHello
The client begins:
Sends a list of supported algorithms (cipher suites).
Generates a fresh ephemeral Diffie-Hellman public key and sends it.
Sends a 32-byte random value.
Purpose: Start negotiation, provide uniqueness, and prevent replay. The random value is mixed into later key derivation.
Step 2: ServerHello
The server replies:
Chooses algorithms from the client’s list.
Sends its own ephemeral Diffie-Hellman public key.
Sends its own 32-byte random value.
At this point both sides can compute the shared Diffie-Hellman secret.
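The key-share exchange in Steps 1 and 2 can be sketched with classic finite-field Diffie-Hellman. This is an illustration under loud assumptions: the prime `P` and base `G` are toy values chosen for readability, not security, and real TLS 1.3 deployments typically negotiate an elliptic-curve group such as X25519 instead.

```python
import secrets

# Toy parameters for illustration only: a 127-bit Mersenne prime and a small base.
P = 2**127 - 1
G = 3


def make_key_share() -> tuple[int, int]:
    """Return (private, public) for one ephemeral Diffie-Hellman key share."""
    private = secrets.randbelow(P - 3) + 2   # random exponent in [2, P-2]
    public = pow(G, private, P)
    return private, public


def shared_secret(my_private: int, peer_public: int) -> int:
    """Combine our private value with the peer's public value."""
    return pow(peer_public, my_private, P)


# ClientHello carries client_pub; ServerHello carries server_pub.
client_priv, client_pub = make_key_share()
server_priv, server_pub = make_key_share()

# Each side computes the secret locally; it never crosses the network.
client_view = shared_secret(client_priv, server_pub)
server_view = shared_secret(server_priv, client_pub)
```

Both `client_view` and `server_view` hold the same integer, yet an eavesdropper saw only the two public values. Because the private values are generated fresh per handshake and then discarded, this is also where forward secrecy comes from.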
What is the handshake transcript?
Throughout the handshake, both client and server keep a running hash of every handshake message they send or receive. The ordered sequence of messages is the handshake transcript, and the running value is the transcript hash.
- The transcript is never sent directly; instead, its hash is used in signatures and MACs later in the handshake.
- It serves as a tamper-evident log. If an attacker tries to insert, remove, or change a message, the two sides’ transcript hashes will no longer match, and verification will fail.
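A running hash of this kind might look like the sketch below. The `Transcript` class is a hypothetical helper, SHA-256 is assumed as the negotiated hash, and the byte strings are placeholders rather than real encoded handshake messages.

```python
import hashlib


class Transcript:
    """Running SHA-256 hash over handshake messages, in order."""

    def __init__(self) -> None:
        self._hash = hashlib.sha256()

    def add(self, message: bytes) -> None:
        self._hash.update(message)

    def digest(self) -> bytes:
        # copy() lets us read the current value and keep hashing afterwards.
        return self._hash.copy().digest()


# Placeholder byte strings stand in for the real encoded handshake messages.
client = Transcript()
server = Transcript()
for msg in (b"ClientHello...", b"ServerHello..."):
    client.add(msg)
    server.add(msg)
same_view = client.digest() == server.digest()

# An attacker alters a message on its way to the server.
tampered = Transcript()
tampered.add(b"ClientHello...")
tampered.add(b"ServerHello!!!")
detected = tampered.digest() != client.digest()
```

Here `same_view` and `detected` are both true: identical message sequences give identical digests, and a single changed byte gives a different one.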
Step 3: Certificate
The server sends its X.509 certificate:
Contains the server’s identity (domain name).
Contains the server’s long-term public key.
Is digitally signed by a Certificate Authority (CA).
The client must decide if it can trust this certificate. It does so by building a chain of trust:
Starting from the server’s certificate, it follows the issuer field to find the certificate of the issuer (an intermediate CA). It repeats this until it reaches a root CA certificate that is already in the client’s trust store.
At each link, the client verifies that the issuer’s digital signature on the certificate is valid using the issuer’s public key.
It also checks the certificate’s validity period, that it is authorized for TLS server authentication, and that the server name matches the certificate’s Subject Alternative Name (SAN).
The root CA is trusted not because its signature is verified but because its certificate is preinstalled on the client.
If the entire chain verifies, the client accepts the server’s certificate as authentic.
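The chain walk can be sketched as follows. Everything here is a toy model: `Cert` is a made-up stand-in for an X.509 certificate, the signature scheme is textbook RSA over small Mersenne primes with no padding (insecure, used only so the example runs on the standard library), validity dates are plain integers, and the check that the certificate is authorized for TLS server authentication is omitted.

```python
import hashlib
from dataclasses import dataclass


def toy_keypair(p: int, q: int, e: int = 65537):
    """Textbook RSA key pair from two primes (toy sizes, no padding)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (n, e), (n, d)                      # (public, private)


def _digest(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(private, data: bytes) -> int:
    n, d = private
    return pow(_digest(data, n), d, n)


def verify(public, data: bytes, signature: int) -> bool:
    n, e = public
    return pow(signature, e, n) == _digest(data, n)


@dataclass
class Cert:
    subject: str
    issuer: str
    public_key: tuple
    not_before: int
    not_after: int
    san: tuple = ()
    signature: int = 0

    def tbs(self) -> bytes:
        """Bytes covered by the issuer's signature (everything but the signature)."""
        fields = (self.subject, self.issuer, self.public_key,
                  self.not_before, self.not_after, self.san)
        return repr(fields).encode()


def issue(cert: Cert, issuer_private) -> Cert:
    cert.signature = sign(issuer_private, cert.tbs())
    return cert


def validate_chain(chain, trust_store, hostname, now) -> bool:
    """chain = [leaf, intermediate, ...]; trust_store maps root name -> Cert."""
    if hostname not in chain[0].san:
        return False                           # server name must match the SAN
    for i, cert in enumerate(chain):
        if not (cert.not_before <= now <= cert.not_after):
            return False                       # outside the validity period
        root = trust_store.get(cert.issuer)
        if root is not None:
            # Reached a preinstalled root: verify this last link and stop.
            return verify(root.public_key, cert.tbs(), cert.signature)
        if i + 1 >= len(chain) or chain[i + 1].subject != cert.issuer:
            return False                       # issuer certificate is missing
        if not verify(chain[i + 1].public_key, cert.tbs(), cert.signature):
            return False                       # issuer's signature is invalid
    return False


root_pub, root_priv = toy_keypair(2**61 - 1, 2**89 - 1)
mid_pub, mid_priv = toy_keypair(2**107 - 1, 2**127 - 1)
leaf_pub, _ = toy_keypair(2**521 - 1, 2**607 - 1)

root = issue(Cert("Toy Root CA", "Toy Root CA", root_pub, 0, 100), root_priv)
mid = issue(Cert("Toy Intermediate CA", "Toy Root CA", mid_pub, 0, 100), root_priv)
leaf = issue(Cert("example.test", "Toy Intermediate CA", leaf_pub, 0, 100,
                  san=("example.test",)), mid_priv)

trust_store = {root.subject: root}
ok = validate_chain([leaf, mid], trust_store, "example.test", now=50)
```

Note that the root's own signature is never checked: the walk stops as soon as an issuer name is found in `trust_store`, mirroring the point that roots are trusted because they are preinstalled.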
Step 4: CertificateVerify
The server now proves it actually owns the private key tied to the certificate:
Signs the handshake transcript hash with its private key.
The client verifies this with the server’s public key from the certificate.
Purpose: Prevent an attacker from replaying or splicing messages; bind the handshake to the server’s identity.
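As a rough sketch of this step: TLS 1.3 does not sign the bare transcript hash but a short structure built around it (64 space bytes, a context label, a zero byte, then the hash). The signing itself is shown below with toy textbook RSA as a stand-in, which is an assumption made only to keep the example on the standard library; real servers use algorithms such as ECDSA, EdDSA, or RSA-PSS.

```python
import hashlib

CONTEXT = b"TLS 1.3, server CertificateVerify"


def signed_content(transcript_hash: bytes) -> bytes:
    """Bytes the server signs: padding, context label, separator, transcript hash."""
    return b"\x20" * 64 + CONTEXT + b"\x00" + transcript_hash


# --- toy textbook RSA, a stand-in for a real signature algorithm ---
N = (2**107 - 1) * (2**127 - 1)
E = 65537                                      # public exponent (in the certificate)
D = pow(E, -1, (2**107 - 2) * (2**127 - 2))    # server's private exponent


def _digest(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def server_sign(transcript_hash: bytes) -> int:
    return pow(_digest(signed_content(transcript_hash)), D, N)


def client_verify(transcript_hash: bytes, signature: int) -> bool:
    return pow(signature, E, N) == _digest(signed_content(transcript_hash))


# Placeholder transcripts; real ones hash the encoded handshake messages.
server_view = hashlib.sha256(b"ClientHello|ServerHello|Certificate").digest()
signature = server_sign(server_view)

client_view = server_view                      # client saw the same messages
accepted = client_verify(client_view, signature)

spliced_view = hashlib.sha256(b"ClientHello|OtherServerHello|Certificate").digest()
rejected = not client_verify(spliced_view, signature)
```

The signature verifies only against the exact transcript the server saw, so a signature lifted from one handshake is useless in another.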
Step 5: Finished (Server)
In TLS 1.3 the server sends its Finished message first, right after CertificateVerify:
Computes HMAC(finished_key, transcript_hash), where finished_key comes from HKDF, explained below.
This proves the server derived the same cryptographic state.
Step 6: Finished (Client)
The client does the same:
Verifies the server’s Finished, then computes its own HMAC over the transcript with its own finished_key and sends it.
If both checks succeed, each side knows the other saw the same handshake and derived the same keys.
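The Finished computation and its check can be sketched in a few lines. The key and transcript below are stand-ins: in a real handshake each direction's `finished_key` comes out of HKDF and the transcript hash covers the actual messages. SHA-256 is assumed as the hash.

```python
import hashlib
import hmac
import secrets


def finished_mac(finished_key: bytes, transcript_hash: bytes) -> bytes:
    """Value carried in a Finished message: HMAC over the transcript hash."""
    return hmac.new(finished_key, transcript_hash, hashlib.sha256).digest()


def check_finished(finished_key: bytes, transcript_hash: bytes, received: bytes) -> bool:
    expected = finished_mac(finished_key, transcript_hash)
    return hmac.compare_digest(expected, received)   # constant-time comparison


# Stand-ins: a random key and a placeholder transcript.
sender_finished_key = secrets.token_bytes(32)
transcript_hash = hashlib.sha256(b"all handshake messages so far").digest()

sent = finished_mac(sender_finished_key, transcript_hash)
ok = check_finished(sender_finished_key, transcript_hash, sent)

# A receiver whose transcript differs computes a different MAC and rejects.
other_transcript = hashlib.sha256(b"a different handshake").digest()
mismatch = not check_finished(sender_finished_key, other_transcript, sent)
```

Using `hmac.compare_digest` rather than `==` is a deliberate choice: it avoids leaking, through timing, how many leading bytes of a forged MAC were correct.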
Step 7: Application Data
At this point:
Both sides derive two distinct session keys with HKDF:
One for client-to-server
One for server-to-client
These keys are used with AEAD ciphers (AES-GCM, ChaCha20-Poly1305).
Nonce (Initialization Vector)
AEAD encryption requires, in addition to the key, a nonce: a “number used once.”
- In AES-GCM, this plays the same role as an initialization vector (IV): it ensures that even if the same plaintext is encrypted twice with the same key, the ciphertexts will differ.
- TLS 1.3 guarantees uniqueness by constructing the nonce from a per-record counter (the record sequence number) XORed with a fixed per-direction IV derived during the handshake.
- The same key+nonce pair must never be reused; uniqueness is critical for security.
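The per-record nonce construction can be sketched like this: the 64-bit sequence number is left-padded with zeros to the IV length and XORed with the per-direction IV. The function name `record_nonce` and the IV value are illustrative assumptions; a real IV is derived during the handshake.

```python
def record_nonce(write_iv: bytes, sequence_number: int) -> bytes:
    """XOR the left-padded 64-bit sequence number into the per-direction IV."""
    if not 0 <= sequence_number < 2**64:
        raise ValueError("sequence number must fit in 64 bits")
    padded = sequence_number.to_bytes(len(write_iv), "big")
    return bytes(a ^ b for a, b in zip(write_iv, padded))


client_write_iv = bytes.fromhex("000102030405060708090a0b")   # example value only

# Records 0, 1, 2 sent in one direction each get a different nonce.
nonces = [record_nonce(client_write_iv, seq) for seq in range(3)]
```

Because the counter increases by one per record and never repeats under a given key, no two records in the same direction share a nonce, and nothing extra has to be sent on the wire.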
Authentication tag and AEAD
Each encrypted record also carries a 16-byte authentication tag.
This tag works just like a Message Authentication Code (MAC):
- It is computed from the ciphertext and a secret key, and
- It allows the receiver to verify that the record came from someone who knows the key and that it was not modified in transit.
The difference is that in AEAD ciphers such as AES-GCM or ChaCha20-Poly1305, the tag is generated as part of the encryption process rather than by running a separate MAC algorithm. This gives us encryption and integrity in one step.
HKDF: HMAC-based Key Derivation Function
TLS 1.3 relies on HKDF (HMAC-based Key Derivation Function) to generate the many keys it needs from a single shared secret.
You can think of HKDF as a cryptographically secure pseudorandom number generator (CSPRNG) that is designed specifically for keys.
- It takes an initial secret (such as the Diffie-Hellman output) and mixes it with handshake context, such as the transcript hash, which covers the client and server randoms.
- It then uses HMAC to “stretch” that input into as many independent-looking outputs as we need: handshake keys, client traffic keys, server traffic keys, and more.
- Each output is labeled, so even though they all come from the same starting point, the keys are distinct and cannot be confused with one another.
Because HKDF is deterministic, both sides that start with the same inputs will always produce the same set of keys, without ever sending the keys across the network.
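A minimal sketch of HKDF's two stages (extract, then expand), built on HMAC-SHA-256 from the standard library. The `derive_keys` helper and its label strings are simplifications made up for this example; TLS 1.3 wraps the expand step in its own label encoding and runs a multi-stage key schedule that is not reproduced here.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size   # 32 bytes


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """Condense input keying material into a fixed-length pseudorandom key."""
    if not salt:
        salt = bytes(HASH_LEN)             # default salt: a string of zero bytes
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Stretch the pseudorandom key into `length` bytes bound to `info`."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long")
    output, block, counter = b"", b"", 1
    while len(output) < length:
        # Each block chains the previous block, the label, and a counter byte.
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def derive_keys(shared_secret: bytes, context: bytes) -> dict:
    """Simplified labels; not the actual TLS 1.3 key schedule."""
    prk = hkdf_extract(b"", shared_secret)
    labels = (b"client traffic key", b"server traffic key", b"finished key")
    return {label: hkdf_expand(prk, label + context, 32) for label in labels}


# Stand-in inputs: a fake Diffie-Hellman output and a placeholder transcript hash.
dh_output = bytes(range(32))
context = hashlib.sha256(b"ClientHello|ServerHello").digest()

client_keys = derive_keys(dh_output, context)   # computed by the client
server_keys = derive_keys(dh_output, context)   # computed independently by the server
```

The two dictionaries are identical, while the three keys inside each one differ from one another because each was expanded under a different label.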