pk.org: CS 419/Lecture Notes

Virtual Private Networks

Secure Communications

Paul Krzyzanowski – November 13, 2024

Suppose we want to connect two local area networks in geographically-separated areas together. For instance, we might have a company with locations in New York and in San Francisco. One way of doing this is to get a dedicated private network link between the two points. Many phone companies and network providers offer a private line service but it can be extremely expensive and is not feasible in many circumstances, such as if one of your endpoints is in the Amazon cloud rather than at your physical location.

Instead, we can use the public Internet to communicate between the two locations. Our two subnets will often have private IP addresses (such as 192.168.x.x), which are not routable over the public Internet. To overcome this, we can use a technique called tunneling. Tunneling is the process of encapsulating an IP datagram within another IP datagram. An IP datagram in one subnet (a local area network in one of our locations) that is destined to an address on the remote subnet will be directed to a gateway router. There, it will be treated as payload (data) and packaged within an IP datagram whose destination is the IP address of the gateway router at our other location. This datagram is now routed over the public Internet. The source and destination addresses of this outer datagram are the gateway routers at both sides.
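The encapsulation step can be sketched in a few lines of Python. This is a minimal illustration rather than a working tunnel: the header checksum is left at zero, the inner payload is a placeholder, and the function names and the documentation-range gateway addresses (203.0.113.1, 198.51.100.1) are assumptions made for the example.

```python
import ipaddress
import struct

def ipv4_header(src: str, dst: str, payload_len: int, proto: int) -> bytes:
    """Build a minimal 20-byte IPv4 header (checksum left at zero for brevity)."""
    version_ihl = (4 << 4) | 5            # IPv4, header length of 5 32-bit words
    total_len = 20 + payload_len
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_len,        # version/IHL, type of service, total length
        0, 0,                             # identification, flags/fragment offset
        64, proto, 0,                     # TTL, protocol, checksum (not computed here)
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )

def encapsulate(inner: bytes, gw_src: str, gw_dst: str) -> bytes:
    """Treat a whole datagram as payload of a new datagram between gateways."""
    # Protocol number 4 identifies IP-in-IP encapsulation.
    return ipv4_header(gw_src, gw_dst, len(inner), proto=4) + inner

def decapsulate(outer: bytes) -> bytes:
    """Strip the outer header at the remote gateway and recover the inner datagram."""
    header_len = (outer[0] & 0x0F) * 4
    return outer[header_len:]

# A datagram between two private addresses (placeholder 5-byte payload)...
inner = ipv4_header("192.168.1.10", "192.168.2.20", 5, proto=17) + b"hello"
# ...is wrapped in a datagram addressed from one gateway to the other.
outer = encapsulate(inner, "203.0.113.1", "198.51.100.1")
```

Routers on the public Internet see only the outer header with the gateway addresses; the private addresses travel as opaque payload until the remote gateway calls something like `decapsulate`.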

IP networking relies on store-and-forward routing. Network data passes through routers, which are often unknown and may be untrustworthy. We have seen that routes may be altered to pass data through malicious hosts or directed to malicious hosts that accept packets destined for the legitimate host. Even with TCP connections, data can be modified or redirected and sessions can be hijacked. We also saw that there is no source authentication on IP packets: a host can place any address it would like as the source. What we would like is the ability to communicate securely, with the assurance that our traffic cannot be modified and that we are truly communicating with the correct endpoints.

Virtual private networks (VPNs) take the concept of tunneling and safeguard the encapsulated data by adding a MAC (message authentication code) so that we can detect if the data is modified and encryption so that others cannot read the data. This way, VPNs allow separate local area networks to communicate securely over the public Internet.

IPsec is a popular VPN protocol that is really a set of two protocols.

  1. The IPsec Authentication Header (AH) is an IPsec protocol that does not encrypt data but simply affixes a message authentication code to each datagram. It ensures the integrity of each datagram.

  2. The Encapsulating Security Payload (ESP) is an IPsec protocol that provides integrity checks and also encrypts the payload, ensuring secrecy.

IPsec can operate in tunnel mode or transport mode. In both cases, IPsec communicates at the same layer as the Internet Protocol. That is, it is not used by applications to communicate with one another but rather by routers or operating systems to direct an entire stream of traffic.

Tunnel mode VPNs provide network-to-network or host-to-network communication. The communication takes place between either two VPN-aware gateway routers or from a host to a VPN-aware router. The entire datagram is treated like payload and encapsulated within a datagram that is sent over the Internet to the remote gateway. That gateway receives this VPN datagram, extracts the payload, and routes it on the internal network where it makes its way to the target system.

Transport mode VPNs provide communication between two hosts. In this case, the IP header is not modified but data is protected. Note that, unlike transport layer security (TLS), which we examine later, setting up a transport mode VPN will protect all data streams between the two hosts. Applications are unaware that a VPN is in place.

Authentication Header (AH)

The Authentication Header (AH) protocol guarantees the integrity and authenticity of IP packets. AH adds an extra chunk of data (the authentication header) with a MAC to the IP datagram. Anyone with knowledge of the key can create the MAC or verify it. This ensures message integrity since an attacker will not be able to modify message contents and have the MAC remain valid. Attackers will also not be able to forge messages because they will not know the key needed to create a valid MAC. Every AH also has a sequence number that is incremented for each datagram that is transmitted, ensuring that messages are not inserted, deleted, or replayed.

Hence, IPsec AH protects messages from tampering, forged addresses, and replay attacks.
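The combination of a keyed MAC and a sequence number can be sketched as follows. This is a simplified model and not the AH wire format: the packet layout, the `ah_protect` function, and the `AhReceiver` class are hypothetical names for this example, and the receiver accepts only strictly increasing sequence numbers, whereas real IPsec uses a sliding window to tolerate reordering.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # HMAC-SHA256 output size in bytes

def ah_protect(key: bytes, seq: int, datagram: bytes) -> bytes:
    """Prefix a datagram with a sequence number and a MAC over both."""
    header = struct.pack("!I", seq)
    tag = hmac.new(key, header + datagram, hashlib.sha256).digest()
    return header + tag + datagram

class AhReceiver:
    """Verifies the MAC and rejects replayed or stale sequence numbers."""

    def __init__(self, key: bytes):
        self.key = key
        self.highest_seq = 0

    def accept(self, packet: bytes):
        header = packet[:4]
        tag = packet[4:4 + TAG_LEN]
        datagram = packet[4 + TAG_LEN:]
        expected = hmac.new(self.key, header + datagram, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return None                      # modified in transit or forged
        seq = struct.unpack("!I", header)[0]
        if seq <= self.highest_seq:
            return None                      # replayed or out-of-date packet
        self.highest_seq = seq
        return datagram
```

Because the sequence number is covered by the MAC, an attacker cannot renumber a captured packet to slip it past the replay check.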

Encapsulating Security Payload (ESP)

The Encapsulating Security Payload (ESP) provides the same integrity assurance but also adds encryption to the payload to ensure confidentiality. Data is encrypted with a symmetric cipher (usually AES).

IPsec cryptographic algorithms

Authentication

An IPsec session begins with authenticating the endpoints. IPsec supports the use of X.509 digital certificates or the use of pre-shared keys. Digital certificates contain the site's public key and allow us to validate the identity of the certificate if we trust the issuer (the certification authority, or CA). We authenticate by proving that we can take a nonce that the other side encrypted with our public key and decrypt it using our private key. A pre-shared key means that both sides configured a static shared secret key ahead of time. We prove that we have the key in a similar manner: one side creates a nonce and asks the other side to encrypt it and send the results. Then the other side does the same thing.
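The pre-shared key exchange is a challenge-response protocol. The sketch below illustrates the idea with one substitution that should be stated plainly: instead of encrypting the nonce, the responder computes an HMAC over it, which proves possession of the key in the same way using only the Python standard library. The function names are made up for this example and do not reflect how IPsec's key exchange messages are actually formatted.

```python
import hashlib
import hmac
import secrets

def make_challenge() -> bytes:
    """The verifier picks a fresh random nonce for every authentication attempt."""
    return secrets.token_bytes(16)

def respond(psk: bytes, nonce: bytes) -> bytes:
    """The prover shows it knows the pre-shared key without revealing it."""
    return hmac.new(psk, nonce, hashlib.sha256).digest()

def verify(psk: bytes, nonce: bytes, response: bytes) -> bool:
    """The verifier recomputes the response with its own copy of the key."""
    return hmac.compare_digest(respond(psk, nonce), response)

# One direction of the exchange; the other side then repeats it in reverse.
psk = b"configured-on-both-gateways"     # hypothetical shared secret
nonce = make_challenge()
ok = verify(psk, nonce, respond(psk, nonce))
```

Since the nonce is new each time, a response captured from an earlier session is useless to an eavesdropper.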

Key exchange

HMAC message authentication codes and encryption algorithms both require the use of secret keys. IPsec uses Diffie-Hellman to create random shared session keys. Diffie-Hellman makes it quick to generate a public-private key pair that is needed to derive a common key, so there is no dependence on long-term keys, assuring forward secrecy.
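A toy version of the Diffie-Hellman computation shows why generating a fresh key pair per session is cheap. The parameters here are assumptions chosen only to keep the example short: the modulus is the 127-bit Mersenne prime and the base is 3, which is far too small for real use. Actual deployments use standardized groups of 2048 bits or more, or elliptic curves.

```python
import secrets

P = 2**127 - 1   # toy prime modulus (illustration only, much too small in practice)
G = 3            # toy base

def keypair():
    """Generate an ephemeral private value and the matching public value."""
    private = secrets.randbelow(P - 3) + 2    # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public

def shared_secret(private: int, peer_public: int) -> int:
    """Combine our private value with the peer's public value."""
    return pow(peer_public, private, P)

# Each side generates a fresh pair for the session and sends only the public part.
a_private, a_public = keypair()
b_private, b_public = keypair()

# Both sides arrive at the same common key without ever transmitting it.
key_a = shared_secret(a_private, b_public)
key_b = shared_secret(b_private, a_public)
```

Discarding `a_private` and `b_private` once the session ends is what gives forward secrecy: there is no stored long-term secret from which the common key could later be recomputed.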

Confidentiality

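In IPsec ESP, the payload is encrypted using either AES-CBC or 3DES-CBC. CBC is cipher-block chaining, which has the property that each block of ciphertext depends on all the blocks that precede it in the same datagram, so identical plaintext blocks do not produce identical ciphertext and blocks cannot simply be substituted from old messages. Each datagram carries its own initialization vector, so datagrams can still be decrypted independently if some are lost or arrive out of order.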

Integrity

IPsec uses HMAC, a form of a message authentication code that uses a cryptographic hash function and a shared secret key. It supports either SHA-1 or SHA-2 hash functions.

The IPsec Authentication Header protocol is rarely used since the overhead of encrypting data these days is quite low and ESP provides encryption in addition to authentication and integrity.

Transport Layer Security (TLS)

Virtual Private Networks were designed to operate at the network layer. They were designed to connect networks together. Even with transport mode connectivity, they tunnel all IP traffic between two systems and do not differentiate one data stream from another. They do not solve the problem of an application being able to communicate with another application over a network via an authenticated, tamper-proof, and encrypted channel.

Secure Sockets Layer (SSL) was created as a layer of software above TCP that provides authentication, integrity, and encrypted communication while preserving the abstraction of a sockets interface to applications. An application sets up an SSL session to a service. After that, it simply sends and receives data over a socket just like it would with the normal sockets-based API that operating systems provide. The programmer does not have to think about network security.

As SSL evolved, it morphed into a new version called TLS, Transport Layer Security. While SSL is commonly used in conversation and names of APIs, all current implementations are TLS.

Any TCP-based application that may not have addressed network security can be security-enhanced by simply using TLS. For example, the standard email protocols, SMTP, POP, and IMAP, all have TLS-secured interfaces. Web browsers use HTTP, the Hypertext Transfer Protocol, and also support HTTPS, which is the exact same protocol but uses a TLS connection.
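Python's standard `ssl` module illustrates how little an application has to change to use TLS. The sketch below assumes a reachable HTTPS server; the function names `make_client_context` and `fetch_head` are made up for this example, and restricting the context to TLS 1.3 is a choice made here, not a requirement of the module.

```python
import socket
import ssl

def make_client_context() -> ssl.SSLContext:
    """Create a client context that verifies the server's certificate and host name."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3   # refuse older protocol versions
    return ctx

def fetch_head(host: str, port: int = 443) -> bytes:
    """Send an HTTP HEAD request over TLS and return the first chunk of the reply."""
    ctx = make_client_context()
    with socket.create_connection((host, port), timeout=10) as raw_sock:
        # After wrapping, the socket is used like any other TCP socket.
        with ctx.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            tls_sock.sendall(request.encode("ascii"))
            return tls_sock.recv(4096)
```

A call such as `fetch_head("example.com")` would perform the handshake, certificate validation, and encryption inside `wrap_socket`; the HTTP request itself is the same text that would be sent over an unprotected connection.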

TLS has been designed to provide:

Data confidentiality
Symmetric cryptography is used to encrypt data.
Key exchange
During the authentication sequence, TLS performs a Diffie-Hellman key exchange so that both sides can obtain random shared session keys. From the common key, TLS uses a pseudorandom generator to create any number of keys, creating separate keys for each direction of communication and separate keys for data integrity in each direction (MAC keys). Since version 1.3, TLS derives a new key for every message sent.
Data integrity
Ensure that we can detect if data in transit has been modified or if new data has been injected. TLS includes an HMAC function based on the SHA-256 hash for each message.
Authentication
TLS authenticates the endpoints prior to sending data. Authentication can be unidirectional (the client may just authenticate the server) or bidirectional (each side authenticates the other). TLS uses public key cryptography and X.509 digital certificates as a trusted binding between a user's public key and their identity.
Interoperability & evolution
TLS was designed to support different key exchange, encryption, integrity, & authentication protocols. The start of each session enables the protocol to negotiate what protocols to use for the session.

TLS 1.3

As of this writing, the current version of the TLS protocol is TLS 1.3. TLS 1.1 and older versions have not been supported by major browsers since 2020.

TLS 1.3 is significant because it simplified the TLS protocol to make it more efficient and to remove the ability to choose algorithms that may be cryptographically weaker than other options.

A few key improvements in TLS 1.3 are:

Removed support for older ciphers & hashes
TLS 1.3 reduced the set of acceptable algorithms as well as choices for parameters that drive encryption algorithms (such as the choice of the modulus for Diffie-Hellman key exchange). The motivation for this was to remove weaker algorithms so that attackers would not have the opportunity to perform downgrade attacks. A downgrade attack is one where the client and server will renegotiate the protocol to use a different and weaker algorithm for encryption or message authentication or even disable it.
Require the use of Diffie-Hellman for key exchange
Older versions of TLS (and SSL) allowed using RSA public key cryptography to transmit a session key (e.g., the client encrypts a random session key with the server's public key). TLS 1.3 no longer supports RSA for key exchange since RSA keys were invariably long-term keys. Generating strong RSA keys is computationally costly and systems that use them would reuse the same key for each session. A past attack (Heartbleed) enabled an attacker to grab memory contents from a server that included the server's private key. Diffie-Hellman allows keys to be generated efficiently so that new key pairs (used to produce a common key) can be created spontaneously. This assures Perfect Forward Secrecy: compromising a long-term key or the key of one session will yield no information for decrypting other sessions.
Reduce handshake complexity
Earlier versions of TLS (and SSL) involved several back-and-forth messages to agree on a suite of ciphers for encryption, key exchange, and message authentication codes, for sending certificates, sending nonces (random data), and authenticating. TLS 1.3 optimized this initialization so that the most common usage of the protocol will involve only one set of back-and-forth messages. This dramatically reduces the delay between connecting to a server and commencing secure communication.
TLS 1.3 also authenticates all data starting from the first response from the server, which reduces opportunities for attackers to inject unauthenticated data that can alter how data is treated throughout the session.
0-RTT: zero round-trip time for connection restart
TLS 1.3 added support for near-instantaneous connection resumption in the event that the client needs to re-establish a connection to the server. After the initial setup handshake phase, both sides generate a Resumption Master Key. If the TCP connection between the client and server terminates, the client can re-establish it and send the server a session ticket (to identify itself) along with the first set of data encrypted with this Resumption Master Key. If the server cached the session ID and the resumption key (this is optional for servers), it can start processing client data with the very first message it received on this new connection.

TLS sub-protocols

TLS operates in two phases:

(1) Handshake Protocol: Authentication and key exchange
Authentication uses public key cryptography with X.509 certificates to authenticate a system. The server presents the client with an X.509 digital certificate. The server can, optionally, also ask the client for a certificate. Either RSA or Elliptic Curve (the Elliptic Curve Digital Signature Algorithm) public keys are supported for this phase. TLS validates the signature of the certificate. An endpoint authenticates by signing a hash of a set of messages with their private key.
Key exchange used to support several options. With TLS 1.3, only Ephemeral Diffie-Hellman key exchange is supported since it supports the efficient generation of shared keys and there is no long-term key storage, providing forward secrecy.
(2) Record Protocol: Communication
The record protocol is used for sending data back and forth between the two systems. Each message is encrypted and contains a message authentication code. Data encryption uses symmetric cryptography and supports a variety of algorithms, including 128- and 256-bit AES-GCM and ChaCha20-Poly1305. AES is the Advanced Encryption Standard. GCM is Galois/Counter Mode, an alternative to cipher-block chaining (CBC) that encrypts an incrementing counter and exclusive-ors it with a block of plaintext. ChaCha20 is an encryption algorithm that is generally more efficient than AES on low-end processors.
Data integrity is provided by a hashed message authentication code (HMAC) that is attached to each block of data. Previous versions of TLS allowed the client and server to negotiate which MAC to use but TLS 1.3 requires the use of a specific MAC based on the chosen cipher to ensure the cryptographically strongest choice. It supports either HMAC-SHA256 or HMAC-SHA384 or, for ChaCha20 encryption, the Poly1305 message authentication code.

TLS 1.3 Handshake

What follows is a high-level overview of the TLS 1.3 handshake. I'm leaving out descriptions of handshake secrets, traffic secrets, handshake keys, and initialization vectors so that the details will not obscure the basic mechanisms. Please consult additional references for actual details if you're interested in learning about this.

Client Hello

The client connects to the server via TCP. It generates a public-private pair of Diffie-Hellman keys and sends the server a block of information that includes:

  1. TLS version number
  2. Client random data
  3. Diffie-Hellman public key

There's additional data that includes the list of ciphers, versions, and signature algorithms it supports.

Server Hello

When the server receives the "client hello" message, the server generates its own public-private Diffie-Hellman key pair and responds back with a "server hello" block of information that includes:

  1. TLS version (the lesser of the maximum version it can support and the client can support; confirmation that it can use version 1.3)

  2. Server random

  3. Selected cipher suite (the set of algorithms it agrees to use)

  4. Server's Diffie-Hellman public key

  5. Server certificate (the X.509 certificate containing the server's public key)

  6. Certificate verification

With the Diffie-Hellman public key it received from the client and the private key that it generated, the server has all the information it needs to compute a common key. The client will be able to compute the same value when it receives the "server hello" message.

The server authenticates itself to the client with a digital certificate. The entire "server hello" message is signed to prevent tampering and the client has all the information it needs to validate the signature.
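The server's side of the negotiation can be sketched as below, following the description above (pick the lesser of the two maximum versions, then the first mutually supported cipher suite). The `ClientHello` class, its field names, and the server's preference order are assumptions for this example; the three cipher suite names are ones defined for TLS 1.3. The real protocol carries these values in binary extension fields rather than Python objects.

```python
from dataclasses import dataclass
import secrets

@dataclass
class ClientHello:
    max_version: tuple          # e.g. (1, 3)
    random: bytes               # client random data
    dh_public: int              # client's Diffie-Hellman public key
    cipher_suites: list         # suites the client is willing to use

SERVER_MAX_VERSION = (1, 3)
SERVER_PREFERENCE = [           # hypothetical server preference order
    "TLS_AES_256_GCM_SHA384",
    "TLS_AES_128_GCM_SHA256",
    "TLS_CHACHA20_POLY1305_SHA256",
]

def negotiate(hello: ClientHello):
    """Return the (version, cipher suite) pair the server will announce."""
    version = min(hello.max_version, SERVER_MAX_VERSION)
    for suite in SERVER_PREFERENCE:
        if suite in hello.cipher_suites:
            return version, suite
    raise ValueError("no cipher suite in common")

hello = ClientHello(
    max_version=(1, 3),
    random=secrets.token_bytes(32),
    dh_public=0,                # placeholder value for the sketch
    cipher_suites=["TLS_CHACHA20_POLY1305_SHA256", "TLS_AES_128_GCM_SHA256"],
)
version, suite = negotiate(hello)
```

Because the set of permitted suites is small and all of them are considered strong, an attacker who tampers with the lists has little room to force a weak choice.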

Key Derivation

After the handshake, both sides can compute the Diffie-Hellman common key.

The initial key is derived from the common key and the SHA-384 hash of the client_hello and the server_hello messages. When the client receives the message from the server, it will be able to compute the same key. This is used to derive all the keys that will be needed for the session.

TLS 1.3 derives all the keys it needs from this initial secret key by using HKDF, the HMAC-based Extract-and-Expand Key Derivation Function. This is an IETF standard (see RFC 5869) for deriving any number of keys starting from one initial secret. Conceptually, it is similar to the technique used to derive one-time passwords with the S/key algorithm, where each key was a one-way function of the previous key.

With HKDF, we first create an initial fixed-length pseudorandom key, K, from the initial secret:

K = hash(non_secret_salt, initial_secret)

Then, each successive key, K_n, is generated as:

K_n = HMAC(K, K_{n-1}, n)
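The two steps can be written out with Python's standard `hmac` module. This sketch follows the structure of RFC 5869, which differs slightly from the simplified formulas above: the extract step is itself an HMAC keyed with the salt, and each expand step also mixes in an optional context string (`info`). The function names are chosen for this example; TLS 1.3 wraps these steps in its own labeling scheme, which is not shown.

```python
import hashlib
import hmac

def hkdf_extract(salt: bytes, initial_secret: bytes, hash_name: str = "sha256") -> bytes:
    """Condense the initial secret into a fixed-length pseudorandom key K."""
    if not salt:
        salt = bytes(hashlib.new(hash_name).digest_size)   # default: all-zero salt
    return hmac.new(salt, initial_secret, hash_name).digest()

def hkdf_expand(k: bytes, info: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """Generate `length` bytes of key material by chaining HMAC outputs."""
    hash_len = hashlib.new(hash_name).digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long")
    output = b""
    block = b""          # K_0 is the empty string
    n = 1
    while len(output) < length:
        # K_n = HMAC(K, K_{n-1} || info || n)
        block = hmac.new(k, block + info + bytes([n]), hash_name).digest()
        output += block
        n += 1
    return output[:length]

# Derive two independent 32-byte keys from one secret (labels are hypothetical).
k = hkdf_extract(b"non-secret salt", b"initial secret")
client_key = hkdf_expand(k, b"client write key", 32)
server_key = hkdf_expand(k, b"server write key", 32)
```

Using a different `info` label for each purpose yields unrelated keys from the same starting secret, which is how one Diffie-Hellman result can supply separate keys for each direction of communication.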

TLS 1.3 Communication

In previous versions of TLS, the client and server could mix and match any of a variety of encryption and MAC algorithms. In TLS 1.3, the choices are restricted. Data is sent using a mechanism called AEAD, which stands for Authenticated Encryption with Associated Data. AEAD uses a different key for each message that is sent.

We start with:

HMAC(ciphertext, AD, length_ciphertext, length_AD)

The encrypted message and associated HMAC are sent to the other side.
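The idea of authenticating the ciphertext together with unencrypted associated data can be sketched as an encrypt-then-MAC construction. Several things here are assumptions for illustration only: the keystream built from SHA-256 is a stand-in for a real cipher and should not be used to protect data, the nonce is added to the MAC input beyond what the formula above lists, and the names `seal` and `unseal` are made up. The AEAD algorithms TLS 1.3 actually uses (AES-GCM and ChaCha20-Poly1305) combine both steps in a single primitive.

```python
import hashlib
import hmac
import struct

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream (stand-in for a real cipher; not secure)."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + nonce + struct.pack("!Q", counter)).digest()
        counter += 1
    return stream[:length]

def _tag(mac_key: bytes, nonce: bytes, ciphertext: bytes, ad: bytes) -> bytes:
    # MAC over the ciphertext, the associated data, and both lengths.
    lengths = struct.pack("!QQ", len(ciphertext), len(ad))
    return hmac.new(mac_key, nonce + ciphertext + ad + lengths, hashlib.sha256).digest()

def seal(enc_key: bytes, mac_key: bytes, nonce: bytes, plaintext: bytes, ad: bytes):
    """Encrypt the plaintext and authenticate it together with the associated data."""
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    return ciphertext, _tag(mac_key, nonce, ciphertext, ad)

def unseal(enc_key: bytes, mac_key: bytes, nonce: bytes,
           ciphertext: bytes, ad: bytes, tag: bytes) -> bytes:
    """Check the tag first; decrypt only if nothing was altered."""
    if not hmac.compare_digest(tag, _tag(mac_key, nonce, ciphertext, ad)):
        raise ValueError("authentication failed")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))
```

Including the lengths in the MAC input prevents an attacker from shifting bytes between the ciphertext and the associated data without invalidating the tag.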

Unidirectional vs. mutual authentication

To implement authentication, the server sends the client its X.509 digital certificate so the client can authenticate the server by having the server prove it knows the private key. TLS also supports mutual authentication: the client will send its X.509 certificate to the server so the server can authenticate the client.

One notable aspect of TLS sessions is that, in most cases, only the server will present a certificate. Hence, the server will not authenticate or know the identity of the client.
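On the server side, the difference between the two modes comes down to whether the server demands a certificate from the client. The sketch below uses Python's standard `ssl` module; the function name is made up for this example, and a real server would additionally need to load its own certificate and private key (for instance with `load_cert_chain`) and, for mutual authentication, the CA certificates it trusts (with `load_verify_locations`). Those calls are omitted because they require files.

```python
import ssl

def make_server_context(require_client_cert: bool) -> ssl.SSLContext:
    """Build a server-side TLS context for unidirectional or mutual authentication."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    if require_client_cert:
        # Mutual authentication: the handshake fails unless the client
        # presents a certificate that the server can validate.
        ctx.verify_mode = ssl.CERT_REQUIRED
    else:
        # Common case: only the server presents a certificate.
        ctx.verify_mode = ssl.CERT_NONE
    return ctx

public_site = make_server_context(require_client_cert=False)
internal_api = make_server_context(require_client_cert=True)
```

With `CERT_NONE`, the server learns nothing about the client's identity from the handshake, which is why a separate login step is usually carried out over the established channel.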

Client-side certificates have been problematic. Generating keys and obtaining trustworthy certificates is not an easy process for users. A user would have to install the certificate and the corresponding private key on every system she uses. This would not be practical for shared systems. Moreover, if a client did have a certificate, any server can request it during TLS connection setup, thus obtaining the identity of the client. This could be desirable for legitimate banking transactions but not for sites where a user would like to remain anonymous.

We generally rely on other authentication mechanisms, such as the password authentication protocol, but carry them out over TLS's secure communication channel.

References

A few relatively easy-to-digest references for the TLS 1.3 protocol: