Firewalls & VPNs

Protecting the network

Paul Krzyzanowski

April 13, 2022

Firewalls

A firewall protects the junction between different network segments, most typically between an untrusted network (e.g., external Internet) and a trusted network (e.g., internal network). Two approaches to firewalling are packet filtering and proxies.

A router has the task of determining how to route a packet. That is, a router is connected to two or more networks, each connected to a different port on the router. An IP packet is received on one port and the router needs to determine which port to send it to. A packet filter, or screening router, determines not only the route for a packet but also whether the packet should be routed or rejected based on the contents of its IP header, its TCP/UDP header, and the interface (port) on which the packet arrived. This function is usually implemented inside a border router, also known as the gateway router, which manages the flow of traffic between the ISP and the internal network.

The basic principle of firewalls is to never allow a direct inbound connection from an originating host on the Internet to an internal host; all traffic must flow through a firewall and be inspected.

The packet filter evaluates a set of rules to determine whether to drop or accept a packet. This set of rules forms an access control list, often called a chain. Strong security follows a default deny model, where packets are dropped unless some rule in the chain specifically permits them.
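As an illustration, here is a minimal Python sketch of default-deny chain evaluation. The rule fields, addresses, and chain contents are made up for the example; real packet filters match on many more fields.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Rule:
        action: str                      # "accept" or "drop"
        iface: Optional[str] = None      # None matches any value
        proto: Optional[str] = None
        dst_ip: Optional[str] = None
        dst_port: Optional[int] = None

        def matches(self, pkt: dict) -> bool:
            checks = [("iface", self.iface), ("proto", self.proto),
                      ("dst_ip", self.dst_ip), ("dst_port", self.dst_port)]
            return all(want is None or pkt.get(field) == want
                       for field, want in checks)

    # A chain that admits only mail and web traffic to two DMZ hosts.
    CHAIN = [
        Rule("accept", iface="ext", proto="tcp", dst_ip="203.0.113.10", dst_port=25),
        Rule("accept", iface="ext", proto="tcp", dst_ip="203.0.113.20", dst_port=443),
    ]

    def filter_packet(pkt: dict) -> str:
        for rule in CHAIN:               # first matching rule wins
            if rule.matches(pkt):
                return rule.action
        return "drop"                    # default deny: nothing matched

    print(filter_packet({"iface": "ext", "proto": "tcp",
                         "dst_ip": "203.0.113.20", "dst_port": 443}))  # accept
    print(filter_packet({"iface": "ext", "proto": "tcp",
                         "dst_ip": "203.0.113.99", "dst_port": 22}))   # drop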

First-generation packet filters implemented stateless inspection: a packet is examined on its own, with no context from previously seen packets.

Second-generation packet filters

Second-generation packet filters track TCP connections and other information from previously-seen packets. These stateful packet inspection (SPI) firewalls enable the router to track sessions and accept or reject packets based on them (a connection-tracking sketch follows the list below). For instance:

  • They can block TCP data traffic if a connection setup did not take place to avoid sequence number prediction attacks.

  • They can track that a connection has been established by a client to a remote server and allow return traffic to that client (which is essential for any interaction by someone inside the network who is accessing external services).

  • They can track connectionless UDP and ICMP messages and allow responses to be sent back to clients in the internal network. DNS queries and pings (ICMP echo request/reply messages) are examples of services that need this.

  • They can also understand the relationship between packets. For example, when a client establishes an FTP (file transfer protocol) connection to a server on port 21 from some random source port, the server establishes a connection back to the client on a different port when it needs to send data. Some media servers do this as well: data might be requested via a TCP connection but a media stream might be returned on a different port via UDP packets.
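Here is a minimal sketch of that connection tracking, keeping only a table of 5-tuples; real SPI firewalls also track TCP state, timeouts, and related connections.

    # Remember outbound connections so only matching return traffic is accepted.
    conntrack = set()   # entries: (proto, internal_ip, internal_port, remote_ip, remote_port)

    def outbound(pkt):
        # Record the 5-tuple when an internal client initiates a connection.
        conntrack.add((pkt["proto"], pkt["src_ip"], pkt["src_port"],
                       pkt["dst_ip"], pkt["dst_port"]))
        return "accept"

    def inbound(pkt):
        # Accept only replies to a tracked connection: the reversed
        # 5-tuple must already be in the table.
        key = (pkt["proto"], pkt["dst_ip"], pkt["dst_port"],
               pkt["src_ip"], pkt["src_port"])
        return "accept" if key in conntrack else "drop"

    outbound({"proto": "udp", "src_ip": "192.168.1.5", "src_port": 5353,
              "dst_ip": "8.8.8.8", "dst_port": 53})              # DNS query goes out
    print(inbound({"proto": "udp", "src_ip": "8.8.8.8", "src_port": 53,
                   "dst_ip": "192.168.1.5", "dst_port": 5353}))  # reply: accept
    print(inbound({"proto": "udp", "src_ip": "8.8.8.8", "src_port": 53,
                   "dst_ip": "192.168.1.9", "dst_port": 5353}))  # unsolicited: drop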

Third-generation packet filters

Packet filters traditionally do not look above the transport layer (UDP and TCP protocols and port numbers). Third-generation packet filters incorporate deep packet inspection (DPI), which allows a firewall to examine not only headers but also application data, and to make decisions based on its contents.

Deep packet inspection can validate the protocol of an application as well as check for malicious content such as malformed URLs or other security attacks. DPI is generally considered to be part of Intrusion Prevention Systems. Examples are detecting application-layer protocols such as HTTP and then applying application-specific filters, such as checking for suspicious URLs or disallowing the download of certain ActiveX or Java applets.

Deep Packet Inspection (DPI) firewalls evolved into Deep Content Inspection (DCI) firewalls. These use the same concept but are capable of buffering large chunks of data from multiple packets so that an entire object can be reassembled and acted upon, such as unpacking base64-encoded content from web and email messages and performing signature analysis for malware.
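A sketch of the DCI idea in Python: buffer an object that spans several packets, decode it, and only then scan it. The signature pattern is a made-up stand-in for a real malware signature.

    import base64

    SIGNATURES = [b"EVIL_PAYLOAD"]          # hypothetical malware byte patterns

    def inspect_object(packets: list) -> str:
        blob = b"".join(packets)            # reassemble across packets (DCI, not per-packet DPI)
        try:
            decoded = base64.b64decode(blob, validate=True)
        except Exception:
            return "drop"                   # malformed encoding is itself suspicious
        if any(sig in decoded for sig in SIGNATURES):
            return "drop"
        return "accept"

    attachment = base64.b64encode(b"hello EVIL_PAYLOAD world")
    chunks = [attachment[:10], attachment[10:20], attachment[20:]]  # three "packets"
    print(inspect_object(chunks))           # drop: signature found only after reassembly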

Application proxies

An application proxy is software that presents the same protocol to the outside network as the application for which it is a proxy. For example, a mail server proxy will listen on port 25 and understand SMTP, the Simple Mail Transfer Protocol. The primary job of the proxy is to validate the application protocol and thus guard against protocol attacks (extra commands, bad arguments) that may exploit bugs in the service. Valid requests are then regenerated by the proxy to the real application that is running on another server and is not accessible from the outside network.

Application proxies are usually installed on dual-homed hosts. This is a term for a system that has two “homes,” or network interfaces: one for the external network and another for the internal network. Traffic never passes directly between the two networks; the proxy is the only process that can communicate with the internal network. Unlike DPI, a proxy may modify the data stream, such as stripping headers, modifying machine names, or even restructuring the commands in the protocol used to communicate with the actual servers (that is, it does not have to relay everything that it receives).
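The following sketch shows the validation step such a proxy might apply to an SMTP-like command stream. The allowed-verb list and 512-byte line limit follow SMTP conventions (RFC 5321), but the relay callback and responses are illustrative stand-ins for a real proxy.

    ALLOWED_COMMANDS = {"HELO", "EHLO", "MAIL", "RCPT", "DATA", "RSET", "NOOP", "QUIT"}
    MAX_LINE = 512    # SMTP limits command lines to 512 bytes including CRLF

    def validate_command(line: bytes) -> bool:
        if len(line) > MAX_LINE:          # overlong lines may target buffer overflows
            return False
        verb = line.split(b" ", 1)[0].strip().upper().decode("ascii", "replace")
        return verb in ALLOWED_COMMANDS

    def proxy_line(line: bytes, relay) -> bytes:
        if not validate_command(line):
            return b"500 command rejected by proxy\r\n"
        return relay(line)                # regenerate the request to the internal server

    # Example with a stub relay standing in for the real internal mail server:
    print(proxy_line(b"HELO example.com\r\n", relay=lambda l: b"250 ok\r\n"))
    print(proxy_line(b"DEBUG give-me-a-shell\r\n", relay=lambda l: b""))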

DMZs

Each network connected to a router can be thought of as a security zone. There’s a certain level of trust that is usually associated with the computers and their users. The two basic zones are the “internal zone” and “external zone,” where the internal zone represents the computers within an organization and the external zone represents all the systems on the Internet. A packet filter on a gateway router will screen traffic flowing between the Internet (external zone) and the internal systems.

The danger in this design is that you must ensure that firewall rules are carefully configured so that incoming requests only hit the machines that provide externally-facing services since all systems will share the same range of addresses. Moreover, if attackers manage to penetrate one of the systems, they have full access to the internal local area network and can start attacking those systems.

A common firewall environment is a screened subnet architecture, with a separate subnet for systems that run externally-accessible services (such as web servers and mail servers) and a different network for internal systems that do not offer services and should not be accessed from the outside.

The subnet that contains externally-accessible services is called the DMZ (demilitarized zone). The DMZ contains all the hosts that may be offering services to the external network (usually the Internet). Machines on the internal network are not accessible from the Internet. All machines within an organization will be either in the DMZ or in the internal network.

Both subnets are protected by a screening router, which will make decisions based on the interface on which a packet arrives as well as its header values.

The router access control policy will control packet flow from each of the interfaces:

From the outside network (Internet)

  • No packets from the outside network are permitted into the inside network (unless they are responses to internal requests, such as DNS responses or TCP return traffic).
  • Packets are permitted only to the IP addresses, protocols, and ports in the DMZ that offer valid services.
  • The firewall also rejects any packets that are masqueraded to appear to come from the internal network.

From the DMZ subnet

  • Packets may only come from designated machines in the DMZ that need to access services in the internal network. Any packets not targeting the appropriate services in the internal network are rejected. This reduces the attack surface: should intruders gain access to a machine in the DMZ, they will be severely limited in their ability to reach and attack systems on the internal subnet.
  • Outbound traffic to the Internet can be permitted, but administrators may want to restrict it to limit the ability of attackers to connect out from a compromised DMZ machine to download tools or malware.

From the internal network

The internal network is considered to be a higher-security network than the Internet, so we are generally not concerned about traffic flowing from the internal network to the Internet, although administrators may block certain external services (for example, to block employees from accessing certain websites or downloading torrent files).

Access to the DMZ from the internal network is generally not restricted. Employees may have access to the same set of DMZ services plus some additional ones, such as login services.
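The policy above might be expressed as a rule table for a screening router with interfaces ext (Internet), dmz, and int (internal network). The addresses and services here are illustrative:

    POLICY = [
        # (in_iface, src,             dst,            service,   action)
        ("ext", "any",            "203.0.113.10", "tcp/25",   "accept"),  # SMTP to DMZ mail host
        ("ext", "any",            "203.0.113.20", "tcp/443",  "accept"),  # HTTPS to DMZ web host
        ("ext", "192.168.0.0/16", "any",          "any",      "drop"),    # spoofed internal source
        ("ext", "any",            "internal-net", "any",      "drop"),    # nothing inbound to internal hosts
        ("dmz", "203.0.113.10",   "192.168.5.9",  "tcp/5432", "accept"),  # mail host to internal DB only
        ("dmz", "any",            "internal-net", "any",      "drop"),    # everything else from the DMZ
        ("int", "internal-net",   "any",          "any",      "accept"),  # outbound traffic allowed
    ]
    # A first-match evaluator with a default-deny fallback (like the earlier
    # chain sketch) would walk this table for each arriving packet.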

The separation of an organization into an internal zone and a DMZ can be taken further through micro-segmentation, where subnets (e.g., local area networks or VLANs, virtual local area networks) are created for various parts of the organization (e.g., developers, Human Resources, legal, marketing), traffic between the segments flows through firewalls, and access to systems and services is restricted. The DMZ itself can also be segmented into separate subnets to restrict the ability of one system to access another in the DMZ even if it is compromised.

Host-based firewalls

Firewalls intercept all packets entering or leaving a local area network. A host-based firewall, on the other hand, runs on a user’s computer. Unlike network-based firewalls, a host-based firewall can associate network traffic with individual applications. Its goal is to prevent malware from accessing the network. Only approved applications will be allowed to send or receive network data. Host-based firewalls are particularly useful in light of deperimeterization: the boundaries of external and internal networks have become fuzzy as people connect their mobile devices to different networks and import data on flash drives. A concern with host-based firewalls is that if malware manages to get elevated privileges, it may be able to shut off the firewall or change its rules.

Intrusion detection/prevention systems

An enhancement to screening routers is the use of intrusion detection systems (IDS). Intrusion detection systems are often part of DPI firewalls and try to identify malicious behavior. There are three forms of IDS:

  1. A protocol-based IDS validates specific network protocols for conformance. For example, it can implement a state machine to ensure that messages are sent in the proper sequence, that only valid commands are sent, and that replies match requests.

  2. A signature-based IDS is similar to a PC-based virus checker. It scans the bits of application data in incoming packets to try to discern if there is evidence of “bad data”, which may include malformed URLs, extra-long strings that may trigger buffer overflows, or bit patterns that match known viruses.

  3. An anomaly-based IDS looks for statistical aberrations in network activity. Instead of having predefined patterns, normal behavior is first measured and used as a baseline. An unexpected use of certain protocols, ports, or even amount of data sent to a specific service may trigger a warning.

Anomaly-based detection implies that we know normal behavior and flag any unusual activity as bad. This is difficult since it is hard to characterize what normal behavior is, particularly since normal behavior can change over time and may exhibit random network accesses (e.g., people web surfing to different places). Too many false positives will annoy administrators and lead them to disregard alarms.
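As a toy illustration, an anomaly detector might learn a baseline of traffic volume and flag large deviations; the numbers and the three-standard-deviation threshold below are arbitrary choices for the example.

    from statistics import mean, stdev

    baseline = [52_000, 48_000, 55_000, 50_000, 47_000, 53_000]   # training window (bytes/minute)
    mu, sigma = mean(baseline), stdev(baseline)

    def is_anomalous(bytes_per_minute: float, threshold: float = 3.0) -> bool:
        # Flag any measurement more than `threshold` standard deviations from the mean.
        return abs(bytes_per_minute - mu) > threshold * sigma

    print(is_anomalous(51_000))    # False: within normal variation
    print(is_anomalous(400_000))   # True: possible exfiltration or abuse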

A signature-based system employs misuse-based detection. It knows bad behavior: the rules that define invalid packets or invalid application layer data (e.g., ssh root login attempts). Anything else is considered good.

Intrusion Detection Systems (IDS) monitor traffic entering and leaving the network and report any discovered problems. Intrusion Prevention Systems (IPS) serve the same function but are positioned to sit between two networks like a firewall and can actively block traffic that is considered to be a threat or policy violation.

Firewall (screening router)
First-generation packet filter that filters packets between networks. Blocks or accepts traffic based on IP addresses, ports, and protocols.

Stateful inspection firewall
Second-generation packet filter. Like a screening router but also takes into account TCP connection state and information from previous connections (e.g., related ports, as with FTP).

Deep packet inspection firewall
Third-generation packet filter. Examines application-layer protocols.

Application proxy
Gateway between two networks for a specific application. Prevents direct connections to the application from outside the network. Responsible for validating the protocol.

IDS/IPS
Can usually do what a stateful inspection firewall does, plus examine application-layer data for protocol attacks or malicious content.

Host-based firewall
Typically a packet filter with per-application awareness. Sometimes includes anti-virus software for application-layer signature checking.

Host-based IPS
Typically allows real-time blocking of remote hosts performing suspicious operations (e.g., port scanning or ssh login attempts).

Virtual Private Networks (VPNs)

Suppose we want to connect two local area networks in geographically separated areas. For instance, we might have a company with locations in New York and in San Francisco. One way of doing this is to get a dedicated private network link between the two points. Many phone companies and network providers offer a private line service but it can be extremely expensive and is not feasible in many circumstances, such as if one of your endpoints is in the Amazon cloud rather than at your physical location.

Instead, we can use the public Internet to communicate between the two locations. Our two subnets will often have private IP addresses (such as 192.168.x.x), which are not routable over the public Internet. To overcome this, we can use a technique called tunneling. Tunneling is the process of encapsulating an IP datagram within another IP datagram. An IP datagram in one subnet (a local area network in one of our locations) that is destined to an address on the remote subnet will be directed to a gateway router. There, it will be treated as payload (data) and packaged within an IP datagram whose destination is the IP address of the gateway router at our other location. This datagram is now routed over the public Internet. The source and destination addresses of this outer datagram are the gateway routers at both sides.
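Here is a sketch of the encapsulation itself, building minimal IPv4 headers by hand (options and checksums omitted). IP protocol number 4 denotes IP-in-IP; the addresses are illustrative.

    import socket, struct

    def ip_header(src: str, dst: str, payload_len: int, proto: int) -> bytes:
        version_ihl = (4 << 4) | 5                    # IPv4, 5 words (20 bytes), no options
        total_len = 20 + payload_len
        return struct.pack("!BBHHHBBH4s4s",
                           version_ihl, 0, total_len,
                           0, 0,                      # identification, flags/fragment offset
                           64, proto, 0,              # TTL, protocol, checksum (left 0 here)
                           socket.inet_aton(src), socket.inet_aton(dst))

    # Inner datagram: private address to private address (not publicly routable).
    inner = ip_header("192.168.1.5", "192.168.2.7", payload_len=0, proto=6)   # proto 6 = TCP

    # Outer datagram: gateway to gateway over the public Internet, carrying the
    # entire inner datagram as its payload.
    outer = ip_header("203.0.113.1", "198.51.100.1",
                      payload_len=len(inner), proto=4) + inner                # proto 4 = IP-in-IP

    print(len(outer))   # 40 bytes: outer 20-byte header + encapsulated inner header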

IP networking relies on store-and-forward routing. Network data passes through routers, which are often unknown and may be untrustworthy. We have seen that routes may be altered to pass data through malicious hosts or directed to malicious hosts that accept packets destined for the legitimate host. Even with TCP connections, data can be modified or redirected and sessions can be hijacked. We also saw that there is no source authentication on IP packets: a host can place any address it would like as the source. What we would like is the ability to communicate securely, with the assurance that our traffic cannot be modified and that we are truly communicating with the correct endpoints.

Virtual private networks (VPNs) take the concept of tunneling and safeguard the encapsulated data by adding a MAC (message authentication code) so that we can detect if the data is modified and encryption so that others cannot read the data. This way, VPNs allow separate local area networks to communicate securely over the public Internet.

IPsec is a popular VPN protocol that is really a set of two protocols:

  1. The IPsec Authentication Header (AH) is an IPsec protocol that does not encrypt data but simply affixes a message authentication code to each datagram. It ensures the integrity of each datagram.

  2. The Encapsulating Security Payload (ESP), which provides integrity checks and also encrypts the payload, ensuring secrecy.

IPsec can operate in tunnel mode or transport mode. In both cases, IPsec communicates at the same layer as the Internet Protocol. That is, it is not used by applications to communicate with one another but rather by routers or operating systems to direct an entire stream of traffic.

Tunnel mode VPNs provide network-to-network or host-to-network communication. The communication takes place between either two VPN-aware gateway routers or from a host to a VPN-aware router. The entire datagram is treated like payload and encapsulated within a datagram that is sent over the Internet to the remote gateway. That gateway receives this VPN datagram, extracts the payload, and routes it on the internal network where it makes its way to the target system.

Transport mode VPNs provide communication between two hosts. In this case, the IP header is not modified but data is protected. Note that, unlike transport layer security (TLS), which we examine later, setting up a transport mode VPN will protect all data streams between the two hosts. Applications are unaware that a VPN is in place.

Authentication Header (AH)

The Authentication Header (AH) protocol guarantees the integrity and authenticity of IP packets. AH adds an extra chunk of data (the authentication header) with a MAC to the IP datagram. Anyone with knowledge of the key can create the MAC or verify it. This ensures message integrity since an attacker will not be able to modify message contents and have the HMAC remain valid. Attackers will also not be able to forge messages because they will not know the key needed to create a valid MAC. Every AH also has a sequence number that is incremented for each datagram that is transmitted, ensuring that messages are not inserted, deleted, or replayed.

Hence, IPsec AH protects messages from tampering, forged addresses, and replay attacks.
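A conceptual sketch of what AH contributes, a keyed MAC plus a monotonically increasing sequence number, shown with HMAC-SHA256. This is not the actual AH wire format, and key establishment is omitted.

    import hmac, hashlib

    key = b"shared-secret-from-key-exchange"   # established out of band in this sketch
    send_seq = 0

    def protect(datagram: bytes):
        global send_seq
        send_seq += 1                          # per-datagram sequence number
        tag = hmac.new(key, send_seq.to_bytes(4, "big") + datagram,
                       hashlib.sha256).digest()
        return send_seq, datagram, tag

    def verify(seq: int, datagram: bytes, tag: bytes, highest_seen: int) -> bool:
        if seq <= highest_seen:                # replayed (or too-old) sequence number
            return False
        expected = hmac.new(key, seq.to_bytes(4, "big") + datagram,
                            hashlib.sha256).digest()
        return hmac.compare_digest(expected, tag)   # constant-time comparison

    seq, data, tag = protect(b"some IP payload")
    print(verify(seq, data, tag, highest_seen=0))     # True
    print(verify(seq, data, tag, highest_seen=seq))   # False: replay detected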

Encapsulating Security Payload (ESP)

The Encapsulating Security Payload (ESP) provides the same integrity assurance but also adds encryption to the payload to ensure confidentiality. Data is encrypted with a symmetric cipher (usually AES).
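A sketch of this encrypt-then-MAC structure, using AES-CBC and HMAC via the third-party cryptography package. Real ESP adds an SPI, sequence numbers, and specific header formats that are omitted here.

    import os, hmac, hashlib
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
    from cryptography.hazmat.primitives import padding

    enc_key, mac_key = os.urandom(32), os.urandom(32)   # separate keys for each job

    def esp_protect(payload: bytes) -> bytes:
        iv = os.urandom(16)                              # fresh IV per datagram
        padder = padding.PKCS7(128).padder()             # pad to the AES block size
        padded = padder.update(payload) + padder.finalize()
        encryptor = Cipher(algorithms.AES(enc_key), modes.CBC(iv)).encryptor()
        ciphertext = iv + encryptor.update(padded) + encryptor.finalize()
        tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
        return ciphertext + tag                          # integrity covers the ciphertext

    protected = esp_protect(b"confidential application data")
    print(len(protected))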

IPsec cryptographic algorithms

Authentication

An IPsec session begins with authenticating the endpoints. IPsec supports the use of X.509 digital certificates or pre-shared keys. Digital certificates contain the site’s public key and allow us to validate the identity of the certificate’s owner if we trust the issuer (the certification authority, or CA). We authenticate by proving that we can take a nonce that the other side encrypted with our public key and decrypt it using our private key. A pre-shared key means that both sides configured a static shared secret key ahead of time. We prove that we have the key in a similar manner: one side creates a nonce and asks the other side to encrypt it and send the result. Then the other side does the same thing.
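The following sketch shows mutual challenge-response with a pre-shared key, substituting an HMAC for encryption; the structure is the same in that each side proves it holds the key without revealing it. The key and nonce sizes are illustrative.

    import os, hmac, hashlib

    psk = b"pre-configured shared secret"

    def respond(challenge: bytes) -> bytes:
        # Prove possession of the pre-shared key by keying a MAC over the nonce.
        return hmac.new(psk, challenge, hashlib.sha256).digest()

    # Side A challenges side B with a fresh nonce...
    nonce_a = os.urandom(16)
    proof_b = respond(nonce_a)                               # computed by B
    assert hmac.compare_digest(proof_b, respond(nonce_a))    # A verifies

    # ...then B challenges A with its own fresh nonce, completing mutual proof.
    nonce_b = os.urandom(16)
    proof_a = respond(nonce_b)                               # computed by A
    assert hmac.compare_digest(proof_a, respond(nonce_b))    # B verifies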

Key exchange

HMAC message authentication codes and encryption algorithms both require the use of secret keys. IPsec uses Diffie-Hellman to create random shared session keys. Diffie-Hellman makes it quick to generate the public-private key pair that is needed to derive a common key, so there is no dependence on long-term keys, assuring forward secrecy.
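A toy Diffie-Hellman exchange with deliberately small numbers (real deployments use moduli of 2048 bits or more, or elliptic curves). Neither private exponent ever crosses the wire.

    import secrets

    p, g = 0xFFFFFFFB, 5                    # toy modulus and generator (far too small for real use)

    a = secrets.randbelow(p - 2) + 1        # Alice's ephemeral private key
    b = secrets.randbelow(p - 2) + 1        # Bob's ephemeral private key
    A, B = pow(g, a, p), pow(g, b, p)       # public values, exchanged openly

    shared_alice = pow(B, a, p)             # Alice combines Bob's public value with her secret
    shared_bob = pow(A, b, p)               # Bob does the same with Alice's public value
    assert shared_alice == shared_bob       # both sides derive the same session secret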

Confidentiality

In IPsec ESP, the payload is encrypted using either AES-CBC or 3DES-CBC. CBC is cipher block chaining, which makes each block of ciphertext dependent on all of the previous blocks in the message, ensuring that blocks cannot be substituted from old messages without detection.

Integrity

IPsec uses HMAC, a form of a message authentication code that uses a cryptographic hash function and a shared secret key. It supports either SHA-1 or SHA-2 hash functions.

IPsec Authentication Header mode is rarely used since the overhead of encrypting data these days is quite low and ESP provides encryption in addition to authentication and integrity.

Transport Layer Security (TLS)

Virtual Private Networks were designed to operate at the network layer. They were designed to connect networks together. Even with transport mode connectivity, they tunnel all IP traffic between two systems and do not differentiate one data stream from another. They do not solve the problem of an application being able to communicate with another application over a network via an authenticated, tamper-proof, and encrypted channel.

Secure Sockets Layer (SSL) was created as a layer of software above TCP that provides authentication, integrity, and encrypted communication while preserving the abstraction of a sockets interface to applications. An application sets up an SSL session to a service. After that, it simply sends and receives data over a socket just like it would with the normal sockets-based API that operating systems provide. The programmer does not have to think about network security.

As SSL evolved, it morphed into a new version called TLS, Transport Layer Security. While SSL is commonly used in conversation and names of APIs, all current implementations are TLS.

Any TCP-based application that may not have addressed network security can be security-enhanced by simply using TLS. For example, the standard email protocols, SMTP, POP, and IMAP, all have TLS-secured interfaces. Web browsers use HTTP, the Hypertext Transfer Protocol, and also support HTTPS, which is the exact same protocol but uses a TLS connection.
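For example, a minimal TLS client using Python’s standard ssl module looks just like socket code; example.com stands in for any TLS-enabled server.

    import socket, ssl

    context = ssl.create_default_context()          # validates certificates by default
    with socket.create_connection(("example.com", 443)) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname="example.com") as tls_sock:
            print(tls_sock.version())               # e.g., "TLSv1.3"
            # From here on, the application just sends and receives on a socket.
            tls_sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
            print(tls_sock.recv(200))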

TLS has been designed to provide:

Data confidentiality
Symmetric cryptography is used to encrypt data.
Key exchange
During the authentication sequence, TLS performs a key exchange so that both sides can obtain random shared session keys. TLS creates separate keys for each direction of communication (encryption keys for client-to-server and server-to-client data streams) and separate keys for data integrity (MAC keys for client-to-server and server-to-client streams).
Data integrity
Ensures that we can detect if data in transit has been modified or if new data has been injected. TLS includes a MAC with transmitted data.
Authentication
TLS provides mechanisms to authenticate the endpoints prior to sending data. Authentication is optional and can be unidirectional (the client may just authenticate the server) or bidirectional (each side authenticates the other). TLS uses public key cryptography and X.509 digital certificates as a trusted binding between a user’s public key and their identity.
Interoperability & evolution
TLS was designed to support different key exchange, encryption, integrity, and authentication algorithms. At the start of each session, the protocol negotiates which of these to use for the session.

TLS 1.3

As of this writing, the current version of the TLS protocol is TLS 1.3. As of 2020, TLS 1.1 and older versions have been deprecated.

TLS 1.3 is significant because it simplified the protocol to make it more efficient and removed the ability to choose algorithms that may be cryptographically weaker than other options.

A few key improvements in TLS 1.3 are:

Removed support for older ciphers & hashes
TLS 1.3 reduced the set of acceptable algorithms as well as choices for parameters that drive encryption algorithms (such as the choice of the modulus for Diffie-Hellman key exchange). The motivation for this was to remove weaker algorithms so that attackers would not have the opportunity to perform downgrade attacks. A downgrade attack is one where an attacker tricks the client and server into negotiating a weaker algorithm for encryption or message authentication, or even into disabling it.
Require the use of Diffie-Hellman for key exchange
Older versions of TLS (and SSL) allowed the use of RSA public key cryptography to transmit a session key (e.g., the client encrypts a random session key with the server’s public key). TLS 1.3 no longer supports RSA key exchange since RSA keys were invariably long-term keys. Generating strong RSA keys is computationally costly, so systems that used them would reuse the same key for each session. A past attack (Heartbleed) enabled an attacker to grab memory contents from a server, which included the server’s private key. Diffie-Hellman allows keys to be generated efficiently, so new key pairs (that will be used to produce a common key) can be created for each session. This assures perfect forward secrecy: compromise of the keys for one session yields no information for decrypting any other session.
Reduce handshake complexity
Earlier versions of TLS (and SSL) involved several back-and-forth messages to agree on a suite of ciphers for encryption, key exchange, and message authentication codes, to send certificates, to send nonces (random data), and to authenticate. TLS 1.3 optimized this initialization so that the most common usage of the protocol involves only a single round trip of messages. This dramatically reduces the delay between connecting to a server and commencing secure communication.
TLS 1.3 also authenticates all data starting from the first response from the server, which reduces opportunities for attackers to inject unauthenticated data that can alter the way data is treated throughout the session.
0-RTT: zero round-trip time
TLS 1.3 added support for near-instantaneous connection resumption in the event that the client needs to re-establish a connection to the server. After the initial setup handshake phase, both sides generate a Resumption Master Key. If the TCP connection between the client and server terminates, the client can re-establish it and send the server a session ticket (to identify itself) along with the first set of data encrypted with this Resumption Master Key. If the server cached the session ID and the resumption key (this is optional for servers), it can start processing client data with the very first message it received on this new connection.

TLS sub-protocols

TLS operates in two phases:

(1) Handshake: Authentication and key exchange
Authentication uses public key cryptography with X.509 certificates to authenticate a system. The server presents the client with an X.509 digital certificate. The server can, optionally, also ask the client for a certificate. Either RSA or Elliptic Curve (the Elliptic Curve Digital Signature Algorithm) public keys are supported for this phase. TLS validates the signature of the certificate. An endpoint authenticates by signing a hash of a set of handshake messages with its private key.
Key exchange used to offer several options. With TLS 1.3, only ephemeral Diffie-Hellman key exchange is supported, since it allows the efficient generation of shared keys and requires no long-term key storage, providing forward secrecy.
(2) Record protocol: Communication
The record protocol is used for sending data back and forth between the two systems. Each message is encrypted and contains a message authentication code. Data encryption uses symmetric cryptography and supports a variety of algorithms, including 128- and 256-bit AES-GCM and ChaCha20-Poly1305. AES is the Advanced Encryption Standard. GCM is Galois/Counter Mode, an alternative to cipher-block chaining (CBC) that encrypts an incrementing counter and exclusive-ors it with a block of plaintext. ChaCha20 is an encryption algorithm that is generally more efficient than AES on low-end processors.
Data integrity is provided by a hashed message authentication code (HMAC) that is attached to each block of data. Previous versions of TLS allowed the client and server to negotiate which MAC to use but TLS 1.3 requires the use of a specific MAC based on the chosen cipher to ensure the cryptographically strongest choice. It supports either HMAC-SHA256 or HMAC-SHA384 or, for ChaCha20 encryption, the Poly1305 message authentication code.

TLS 1.3 Handshake

What follows is a high-level overview of the TLS 1.3 handshake. I’m leaving out descriptions of handshake secrets, traffic secrets, handshake keys, and initialization vectors so that the details will not obscure the basic mechanisms. Please consult additional references for actual details.

Client Hello

The client connects to the server via TCP. It generates a public-private pair of Diffie-Hellman keys and sends the server a block of information that includes:

  1. TLS version number
  2. Client random data
  3. Diffie-Hellman public key

There’s additional data that includes the list of ciphers, versions, and signature algorithms it supports.

Server Hello

When the server receives the “client hello” message, the server generates its own public-private Diffie-Hellman key pair and responds back with a “server hello” block of information that includes:

  1. TLS version (the highest version that both the server and the client support; here, confirmation that it can use version 1.3)
  2. Server random
  3. Selected cipher suite (the set of algorithms it agrees to use)
  4. Server’s Diffie-Hellman public key
  5. Server certificate (the X.509 certificate containing the server’s public key)
  6. Certificate verification

With the Diffie-Hellman public key it received from the client and the private key that it generated, the server has all the information it needs to compute a common key. The client will be able to compute the same value when it receives the “server hello” message.

The server authenticates itself to the client with a digital certificate. The entire “server hello” message is signed to prevent tampering, and the client has all the information it needs to validate the signature.

Key Derivation

After these hello messages, both sides can compute the Diffie-Hellman common key. The initial key is derived from the common key and the SHA-384 hash of the client_hello and server_hello messages. When the client receives the message from the server, it will be able to compute the same key. This is used to derive all the keys that will be needed for the session.

TLS 1.3 derives all the keys it needs from this initial secret key by using HKDF, the HMAC-based Extract-and-Expand Key Derivation Function. This is an IETF standard (see RFC 5869) for deriving any number of keys starting from one initial secret. Conceptually, it is similar to the technique used to derive one-time passwords with the S/key algorithm, where each key was a one-way function of the previous key.

With HKDF, we first create an initial fixed-length pseudorandom key, K, from the initial secret:

K = HMAC(non_secret_salt, initial_secret)

Then, each successive key, Kₙ, is generated as:

Kₙ = HMAC(K, Kₙ₋₁, n)
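Here is a sketch of HKDF as specified in RFC 5869, using HMAC-SHA256; the salt, input secret, and info strings below are arbitrary examples.

    import hmac, hashlib

    def hkdf(salt: bytes, input_secret: bytes, info: bytes, length: int) -> bytes:
        # Extract: one fixed-length pseudorandom key from the initial secret.
        prk = hmac.new(salt, input_secret, hashlib.sha256).digest()
        # Expand: each block is an HMAC of the previous block, a context
        # string, and a counter; blocks are concatenated until `length` bytes exist.
        okm, block = b"", b""
        for counter in range(1, -(-length // 32) + 1):      # ceil(length / hash size)
            block = hmac.new(prk, block + info + bytes([counter]),
                             hashlib.sha256).digest()
            okm += block
        return okm[:length]

    key_material = hkdf(b"nonsecret salt", b"DH shared secret",
                        b"tls13 derived", 64)
    client_key, server_key = key_material[:32], key_material[32:]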

TLS 1.3 Communication

In previous versions of TLS, the client and server could mix and match any of a variety of encryption and MAC algorithms. In TLS 1.3, the choices are restricted. Data is sent using a mechanism called AEAD, which stands for Authenticated Encryption with Associated Data. AEAD uses a unique nonce for each record that is sent.

We start with:

  • Message (data to be sent)
  • Initial secret key
  • Initialization value (IV), used for counter-mode encryption
  • Optional additional non-secret data (AD)

The key and IV are then used to encrypt the message using the chosen cipher (either AES or ChaCha20). A secondary key is used to create the message authentication code over:

HMAC(ciphertext, AD, length(ciphertext), length(AD))

The encrypted message and associated HMAC are sent to the other side.
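A sketch of AEAD using AES-GCM from the third-party cryptography package. Note that AES-GCM folds the authentication tag into the ciphertext rather than attaching a separate HMAC; the nonce must be unique per record, and the additional data is authenticated but not encrypted.

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=256)
    aead = AESGCM(key)

    nonce = os.urandom(12)                       # 96-bit nonce, unique per record
    ad = b"record header"                        # sent in the clear, but authenticated
    record = aead.encrypt(nonce, b"application data", ad)

    # Decryption verifies the authentication tag; any change to the ciphertext
    # or the AD raises cryptography.exceptions.InvalidTag.
    plaintext = aead.decrypt(nonce, record, ad)
    assert plaintext == b"application data"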

Unidirectional vs. mutual authentication

To implement authentication, the server sends the client its X.509 digital certificate so the client can authenticate the server by having the server prove it knows the private key. TLS also supports mutual authentication: the client will send its X.509 certificate to the server so the server can authenticate the client.

One notable aspect of TLS sessions is that, in most cases, only the server will present a certificate. Hence, the server will not authenticate or know the identity of the client. Client-side certificates have been problematic. Generating keys and obtaining trustworthy certificates is not an easy process for users. A user would have to install the certificate and the corresponding private key on every system she uses. This would not be practical for shared systems. Moreover, if a client did have a certificate, any server can request it during TLS connection setup, thus obtaining the identity of the client. This could be desirable for legitimate banking transactions but not for sites where a user would like to remain anonymous. We generally rely on other authentication mechanisms, such as the password authentication protocol, but carry them out over TLS’s secure communication channel.

