Introduction
Modern systems depend on software that manipulates memory directly. Operating systems, browsers, network servers, and embedded firmware are mostly written in C or C++. These languages offer direct control of memory: a programmer can allocate space, cast pointers, and perform arithmetic on addresses. That flexibility allows efficient system code but also enables subtle and devastating bugs.
Memory vulnerabilities have been responsible for most serious security flaws in system software for decades. They allow attackers to crash a program, read sensitive information, or take control of a system. A memory-safety bug becomes a vulnerability when an attacker can deliberately trigger it and influence what data is read or written.
The most common categories include:

- Buffer overflows: writing beyond the end of an allocated array.
- Integer overflows and underflows: arithmetic that silently wraps around.
- Off-by-one errors: a single incorrect boundary condition that allows one extra byte to be accessed.
- Format-string vulnerabilities: passing untrusted data as a `printf` format.
We will explore how these mistakes occur, how attackers exploit them, and what mechanisms modern operating systems use to defend against them. We will focus strictly on memory issues; command injection and malware will come later.
Memory Layout and Stack Behavior
Every running program occupies a portion of memory that the operating system maps into its virtual address space. This layout is mostly consistent across UNIX, Linux, and Windows systems, although the exact addresses differ. Understanding it is essential because most memory vulnerabilities arise from how programs use these regions.
A process typically contains the following areas:
| Segment | Purpose |
|---|---|
| Text (code) | Compiled machine instructions; the program itself. This region is usually marked read-only and executable to prevent modification. |
| Data | Global and static variables that have been initialized by the program. |
| BSS (Block Started by Symbol) | Global and static variables that are zero-initialized. |
| Heap | Dynamically allocated memory created by `malloc`, `calloc`, or `new`. The heap grows upward toward higher addresses. |
| Stack | Temporary storage used for each function call, including parameters, return addresses, and local variables. The stack grows downward toward lower addresses. |
How the Stack Works
When a function is called, the compiler sets up a stack frame, which is a small section of the stack reserved for that function’s execution. Each frame generally contains:
- The parameters passed to the function.
- The return address, automatically pushed by the CPU when the call occurs.
- Saved registers, including the caller’s base pointer (the link to the previous frame).
- The function’s local variables.
When the function returns, the CPU pops the return address from the stack and jumps back to it. The stack pointer moves back to its previous position, effectively freeing the space used by the frame.
Note: The compiler may also save a base or frame pointer to make it easier to reference local variables, but that detail is not important for our discussion (and most newer compilers omit it, especially with optimization enabled). Compilers also often pass the first couple of parameters in registers, with the rest going on the stack. This is also an implementation detail that isn't important for our discussion. For crafting exploits, however, it is important to know what code a specific compiler generates.
This design makes function calls fast and self-contained. Each new call adds a new frame on top of the previous one; returning from the call removes it. Recursion or deep call chains simply create multiple frames stacked in memory.
However, this structure also makes the stack a prime target. If a program writes past the end of a local buffer, the excess data overwrites whatever happens to be next in memory, possibly the return address.
Example: How Local Variables Are Stored
Consider this small program:
```c
#include <stdio.h>
#include <string.h>

void greet() {
    char name[16];
    printf("Enter your name: ");
    gets(name);
    printf("Hello, %s!\n", name);
}

int main() {
    greet();
    return 0;
}
```
The array `name` is a local variable stored on the stack. When `greet()` is called, the compiler allocates space for `name` along with the saved registers and the return address to `main()`. The layout of the stack frame looks roughly like this:
| Address | Contents |
|---|---|
| High → | Return address (where `main()` will resume) |
| | Local variables `name[16]` |
| Low → | Unused stack space |
When `gets()` reads characters from standard input, it keeps copying them into `name` until a newline or end-of-file is encountered. It performs no bounds checking. If the user types more than 15 characters plus a null terminator, the function keeps writing beyond the end of the array. The excess data overwrites the memory that stores the saved frame pointer and, potentially, the return address.
What Happens When the Return Address Is Overwritten
When a function finishes, the CPU executes a `ret` instruction, which pops the return address from the top of the stack and jumps to that address. If the return address has been altered by attacker-controlled data, the CPU will attempt to continue execution at the modified address.
Two important clarifications:
- The CPU begins executing code that lives inside the overflowing buffer only if the overwritten return address points to an address within that buffer. If the return address points somewhere else, the CPU will jump to that other location instead; that location might be a valid function in the program or a library, or it might be unmapped memory and cause a crash.
- Because the new return address can point to any readable executable region, attackers have multiple options. They can place executable instructions in the buffer and set the return address to the buffer location, which is the classic code-injection technique. They can instead set the return address to an existing function such as `system()` to perform a return-to-libc attack. If the stack is non-executable, they can chain short instruction sequences already present in memory, using return-oriented programming (ROP) to achieve the same effect without injecting new code.
In short, overwriting the return address hands control of the next instruction pointer to the attacker. Whether the CPU executes injected bytes or some existing code depends on where the corrupted return address points.
Why `gets()` Is Dangerous
The `gets()` function was once common in textbooks because of its simplicity; it reads a line of input into a buffer. Unfortunately, it cannot be made safe because it does not know the size of that buffer. As a result, any input longer than the allocated space leads to a buffer overflow.

For this reason, `gets()` was removed from the C standard library starting with C11. Safer functions such as `fgets()` or higher-level input routines should always be used. `fgets(buffer, sizeof(buffer), stdin)` ensures that no more than the specified number of bytes are read.
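As a sketch, the `greet()` function above can be rewritten safely with `fgets()`; the `strcspn()` call that strips the trailing newline is one common idiom, not the only option:

```c
#include <stdio.h>
#include <string.h>

void greet(void) {
    char name[16];
    printf("Enter your name: ");
    if (fgets(name, sizeof(name), stdin) == NULL)
        return;                        /* end-of-file or read error */
    name[strcspn(name, "\n")] = '\0';  /* strip the newline, if present */
    printf("Hello, %s!\n", name);
}
```

Input longer than 15 characters is simply truncated instead of overflowing the frame.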
A Simplified Stack Frame Diagram
```
High addresses
┌──────────────────────────────────┐
│ Function parameters              │
│ Return address                   │
│ Local variables (e.g., name[16]) │
└──────────────────────────────────┘
Low addresses
```
Each time a function is called, a new frame like this appears on the stack. Returning from the function pops the frame, restoring the previous base pointer and returning to the caller.
Why This Still Matters
It might seem that only careless code like `gets()` could be vulnerable, but similar problems occur in more subtle ways. Any time a program copies or manipulates data in memory without verifying sizes, it risks overwriting control information. Variable-length arrays, recursive copies, or unsafe string operations can all lead to stack corruption.
Understanding what resides on the stack, such as local variables, saved registers, and the return address, makes it clear why even a small overflow can have catastrophic effects.
Buffer Overflows (Stack and Heap)
Buffer overflows are among the oldest and most instructive memory vulnerabilities. They occur when a program writes more data into an allocated buffer than the buffer can hold. The consequences depend on what lies adjacent to the buffer in memory.
On the stack, an overflow can overwrite local variables, saved registers, or the return address, which can lead to control-flow hijacking. On the heap, an overflow can corrupt memory allocator metadata or adjacent objects, which can produce arbitrary writes, type confusion, or later control-flow hijacks.
This section covers the mechanics of stack overflows, then heap corruption, and finishes with a discussion of how modern allocators make exploitation harder.
Stack Overflows: mechanics and sequence
Consider a function with a fixed-size local buffer:
```c
void f() {
    char buf[64];
    read_input(buf); // assume read_input writes user-supplied data into buf
}
```
When `f()` runs, the compiler allocates a stack frame containing `buf`, saved registers, and the return address. If `read_input` writes more than 64 bytes into `buf`, the extra bytes overwrite nearby stack data.
Overwrites of mapped stack memory do not fault; they silently corrupt the values stored there. The CPU only raises an exception if the write touches unmapped memory.
The most interesting targets to attackers are saved control data: the saved instruction pointer (return address) and saved frame pointer. If either is overwritten with an attacker-controlled value, the attacker can influence where execution continues when the function returns.
A simplified stack layout (high → low addresses) looks like this:

```
High addresses
┌────────────────────────────┐
│ function parameters        │
│ return address             │
│ saved registers            │
│ local buffer buf[64]       │
└────────────────────────────┘
Low addresses
```
If user input overwrites the return address, the next `ret` uses the corrupted value and transfers control to that address. Whether the program runs attacker code depends on where that value points. If it points into the buffer and that buffer contains executable instructions, the CPU will execute them. If it points to a library function, the attacker may invoke that function with controlled arguments.
Example scenario: overwrite a flag variable vs. return address
Consider this example with two adjacent local variables:
```c
void vulnerable() {
    char buf[16];
    int authorized = 0;
    gets(buf);
    if (authorized) grant_access();
}
```
If the attacker supplies more than 16 bytes, the extra bytes can overwrite `authorized` (assuming the compiler places it just above `buf` in the frame). A 24-byte input can set `authorized` to a nonzero value and cause the program to call `grant_access()` without overwriting the return address. This example distinguishes logical corruption (changing program decisions) from control-flow hijacking (changing the return address).

If the overflow continues beyond `authorized` to the saved frame data, it becomes possible to overwrite the return address and redirect execution.
Precision and practical constraints
To run injected code an attacker must place executable bytes in memory and point the instruction pointer to those bytes. Two common constraints complicate this:
- Many copy routines treat input as C strings; null bytes terminate the copy and cannot appear in the payload unless the copying routine accepts binary data.
- Modern environments may randomize addresses or mark writable pages non-executable, reducing the chance that an injected payload will be executed.
Exploit developers address these constraints with a small set of engineering techniques; we will discuss some of those techniques and exploit patterns later in the exploitation section.
Heap Overflows: metadata, consequences, and exploitation primitives
Heap vulnerabilities are less obvious than stack overflows because the heap is managed by the allocator; the effect of overflowing a heap buffer depends on allocator internals. Early allocators stored chunk metadata near the payload and linked free chunks with pointers. Overwriting those metadata fields enabled attackers to write arbitrary pointers or trick the allocator into returning a pointer to attacker-controlled memory.
Typical heap chunk layout (conceptual)
A classic chunk of data from a memory allocator might look like:

```
[ prev_size | size | fd | bk ]   <- metadata for free chunks
[ payload bytes ...          ]   <- user data
```
- `prev_size` and `size` are used by the allocator to find adjacent chunks and reconcile free space.
- `fd` and `bk` are forward and backward pointers used when the chunk is in a free list.
An overflow from one payload into the metadata of an adjacent chunk can corrupt `fd` and `bk`. When the allocator later consolidates or unlinks that chunk from a free list, it may follow the corrupted pointers and write attacker-controlled data to arbitrary addresses. Exploits that abuse these behaviors used techniques called "unlink" or "fastbin" attacks in various allocators.
Consequences of heap corruption
Heap corruption can produce several attacker primitives:
- Information leaks: reading memory that contains pointers or addresses gives an attacker knowledge about memory layout and the locations of libraries or heap regions. That knowledge undermines ASLR.
- Arbitrary write: by corrupting allocator metadata, attackers cause the allocator to write pointers or data to attacker-specified addresses. Arbitrary write is one of the most powerful primitives; it can overwrite function pointers, vtable pointers, or saved return addresses in other frames.
- Type confusion: if objects are allocated from shared pools and metadata is corrupted, code may treat attacker-controlled bytes as an object of a different type, leading to method dispatch to attacker-controlled addresses.
How modern allocators defend
Modern allocators include many internal checks to detect corrupted metadata and randomize allocation patterns. These mechanisms limit the reliability of heap-based exploits but do not change the fundamental vulnerability. We will examine allocator hardening and other mitigation strategies in the defenses section.
Example: how an allocator integrity check stops an attack
Suppose an attacker corrupts the forward pointer `fd` of a free chunk to point to an object `X` they wish to overwrite. A naive allocator would unlink the corrupted chunk and write pointers into `X`. A hardened allocator performs a validation:
- Check that `fd` is aligned and within the heap.
- Check that `fd->bk` points back to the candidate chunk.
- If the checks fail, abort or put the chunk in a quarantine.
These checks turn a silent corruption into an immediate allocator-detected error, preventing the arbitrary write.
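As a minimal sketch of that safe-unlinking check, using the conceptual field names from the chunk diagram above rather than any particular allocator's internals (glibc reports a similar failure as "corrupted double-linked list"):

```c
#include <stddef.h>
#include <stdlib.h>

/* Conceptual free-chunk header, following the diagram above. */
struct chunk {
    size_t prev_size;
    size_t size;
    struct chunk *fd;   /* next chunk in the free list */
    struct chunk *bk;   /* previous chunk in the free list */
};

/* Remove c from its doubly linked free list, refusing corrupted links. */
static void unlink_chunk(struct chunk *c) {
    if (c->fd->bk != c || c->bk->fd != c)
        abort();            /* neighbors do not point back: fail loudly */
    c->fd->bk = c->bk;      /* splice c out of the list */
    c->bk->fd = c->fd;
}
```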
Integer Overflows and Off-by-One Errors
Integer and boundary errors are among the most subtle and dangerous sources of memory corruption. They appear in arithmetic and indexing, not in obviously unsafe functions. These errors become security problems when a miscalculated size or index causes a buffer overflow, an out-of-bounds read, or an undersized allocation.
Integer representation and ranges
All integers in C have a fixed width determined by their type. Operations that exceed that width do not raise exceptions; they silently wrap or behave in undefined ways. The maximum value a type can hold depends on the number of bits and whether it is signed or unsigned.
| Type | Size (bits) | Minimum | Maximum |
|---|---|---|---|
| `int8_t` | 8 | -128 | +127 |
| `uint8_t` | 8 | 0 | 255 |
| `int16_t` | 16 | -32,768 | +32,767 |
| `uint16_t` | 16 | 0 | 65,535 |
| `int32_t` | 32 | -2,147,483,648 | +2,147,483,647 |
| `uint32_t` | 32 | 0 | 4,294,967,295 |
| `int64_t` | 64 | -9,223,372,036,854,775,808 | +9,223,372,036,854,775,807 |
| `uint64_t` | 64 | 0 | 18,446,744,073,709,551,615 |
Unsigned arithmetic is performed modulo 2^N, where N is the number of bits: when an operation exceeds the maximum value, the result wraps around (incrementing the maximum yields zero). Signed overflow is undefined behavior under the C standard, but most systems still wrap internally. Programs that depend on that behavior are nonportable and unreliable.
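A two-line illustration of unsigned wraparound:

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t a = 255;              /* maximum value for uint8_t */
    a = a + 1;                    /* 256 wraps modulo 2^8 back to 0 */
    printf("%u\n", (unsigned)a);  /* prints 0 */
    return 0;
}
```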
Allocation and arithmetic overflow
Integer overflow commonly appears in allocation or length calculations. Consider a network service that reads a message containing a count and allocates space for that many records:
```c
size_t nitems = get_count_from_network();
char *buf = malloc(nitems * sizeof(struct record));
read(fd, buf, nitems * sizeof(struct record));
```
If `nitems * sizeof(struct record)` exceeds the maximum value that fits in `size_t`, the multiplication wraps to a small value. The program allocates a small buffer but then reads a large amount of data, leading to a heap overflow.
This is not rare. Arithmetic overflows have caused critical vulnerabilities in image decoders, video parsers, and file systems that trusted size fields from untrusted inputs.
Preventing arithmetic overflow:

- Check for overflow before allocation. Verify that `nitems <= SIZE_MAX / sizeof(struct record)`.
- Use compiler built-ins that detect overflow, such as `__builtin_mul_overflow` or `__builtin_add_overflow`.
- Keep sizes in `size_t`, which is the platform’s natural type for object sizes.
Here is an example of safer arithmetic:
```c
size_t total;
if (__builtin_mul_overflow(nitems, sizeof(struct record), &total)) {
    /* handle error */
}
char *buf = malloc(total);
if (!buf) abort();
read(fd, buf, total);
```
Signed versus unsigned mismatches
Mixing signed and unsigned integers in comparisons or assignments can silently change results. The compiler promotes signed values to unsigned when they appear together in an expression. This changes comparison outcomes and allows negative values to appear as large positives.
Here is an example that looks harmless but is unsafe:
```c
int len = get_length(); /* may be negative on error */
if (len < buffer_size) {
    memcpy(dst, src, len);
}
```
If `len` is negative, one of two bad things happens. If `buffer_size` is unsigned, `len` is converted to a huge unsigned value for the comparison, silently changing the program's logic. If the comparison is signed and passes, `memcpy` converts the negative length to its unsigned parameter type `size_t` and copies an enormous amount of data. The fix is simple: always check for negative values before using them as sizes or indexes.
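A corrected version of the check, as a sketch (with `get_length`, `dst`, `src`, and `buffer_size` as in the fragment above):

```c
int len = get_length();             /* may be negative on error */
if (len < 0) {
    /* handle the error; a negative value must never reach memcpy */
} else if ((size_t)len < buffer_size) {
    memcpy(dst, src, (size_t)len);  /* cast is safe now that len >= 0 */
}
```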
Off-by-one errors (the fencepost problem)
An off-by-one error occurs when a loop or index runs one step too far or too short. The name comes from the fencepost problem: a fence with N sections needs N+1 posts. The mistake is a single boundary error, but the effect can be as serious as a full overflow.
Here is an example of a loop that writes one byte too many:
```c
char buf[16];
for (int i = 0; i <= 16; i++) {
    buf[i] = src[i];
}
```
The loop should stop at `i < 16`. The incorrect condition `i <= 16` writes 17 bytes into a 16-byte buffer. The extra byte may overwrite a saved variable or a pointer, depending on layout. The bug might go unnoticed until the overwritten value happens to matter.
String handling functions are a common source of off-by-one errors. Copying exactly N bytes from an N-byte source string leaves no room for the null terminator, resulting in an unterminated string or a one-byte overflow.
This is a safer approach:

```c
snprintf(buf, sizeof(buf), "%s", src);
```

This ensures the result fits and that a null terminator is always written.
Truncation and conversion
Truncation happens when converting between integer types of different widths. Large integers stored in smaller variables silently lose higher-order bits.
A truncation error can look like this:
```c
uint64_t big = get_value();
uint32_t small = (uint32_t) big;
char *buf = malloc(small);
```
If `big` exceeds `UINT32_MAX`, the value in `small` wraps to a smaller number. The buffer allocation is far smaller than intended, and subsequent code that assumes the larger size causes an overflow. Always check ranges before converting:

```c
if (big > UINT32_MAX) error("value too large");
uint32_t small = (uint32_t) big;
```
Detecting and preventing integer bugs
Modern compiler sanitizers help expose integer problems before deployment.
- Undefined Behavior Sanitizer (`-fsanitize=undefined`) and Integer Sanitizer (`-fsanitize=integer`) detect arithmetic overflows during testing.
- AddressSanitizer (`-fsanitize=address`) detects out-of-bounds accesses that result from incorrect arithmetic; a small example follows this list.
- Testing with boundary inputs, such as zero, one, the maximum expected value, and values slightly beyond, reveals most off-by-one and overflow cases.
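As a quick illustration, the one-byte overflow below runs silently in a normal build but aborts with a stack-buffer-overflow report when compiled with, for example, `cc -g -fsanitize=address bug.c` (the exact report text varies by compiler version):

```c
#include <string.h>

int main(void) {
    char buf[16];
    memset(buf, 0, sizeof(buf) + 1);   /* off-by-one: writes 17 bytes into a 16-byte buffer */
    return 0;
}
```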
Good engineering practice is to validate all externally supplied sizes, perform overflow-safe arithmetic, and handle error paths explicitly rather than assuming that all computations fit in available types.
Integer and off-by-one errors are silent but dangerous. A single miscalculation in arithmetic or indexing can lead to a buffer overflow or a logic error. Because C performs arithmetic without range checking, the programmer must add those checks explicitly. Treat every integer that influences memory allocation or copy length as untrusted input, and test with boundary conditions to detect these problems before deployment.
Format-String Vulnerabilities
Functions such as `printf`, `fprintf`, and `sprintf` provide flexible ways to format output. They interpret a format string that contains directives beginning with the `%` character and then read corresponding arguments from the stack. A mismatch between the format string and the number or type of arguments can cause undefined behavior. When user input is mistakenly used as the format string itself, the problem becomes a serious vulnerability.
How formatted output works
A call to `printf` expects a constant format string followed by arguments:

```c
printf("%s is %d years old\n", name, age);
```

The format string tells the function how many arguments to read and how to interpret them. `%s` consumes a pointer to a string, and `%d` consumes an integer. The function steps through the format string and pulls values from the stack one by one.
Some common format parameters in printf are:
| Parameter | Purpose | Stack Usage |
|---|---|---|
| `%d`, `%u` | Print signed and unsigned decimal integers | Reads 4 bytes (value) |
| `%x` | Print hexadecimal integer | Reads 4 bytes (value) |
| `%s` | Print string | Reads 4 bytes (pointer) |
If the format string is constant, the compiler can verify that the number and types of arguments match. Problems occur when the format string is provided by the user.
The root cause of the vulnerability
A function call like the following appears harmless but is unsafe:

```c
printf(user_input);
```

If `user_input` contains any format specifiers such as `%x`, `printf` will treat them as instructions to read additional values from the stack. The attacker can control how many values are read and what is printed. This allows two main classes of attack: information leaks and arbitrary writes.
Leaking stack data
If an attacker provides input such as

```
%x %x %x %x
```

the function will print raw data from the stack. Each `%x` directive reads four bytes and prints them as a hexadecimal value. By adjusting the number of format specifiers, an attacker can walk through memory on the stack and leak sensitive information such as return addresses, function pointers, or passwords that happen to be stored nearby.
In addition to the possibility of leaking confidential information, information disclosure is dangerous because it can reveal the exact location of executable code or library functions, even if those locations are randomized (we will cover Address Space Layout Randomization a bit later). Once an attacker knows where code resides, they can use that knowledge to build precise exploits.
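A minimal sketch of such a vulnerable program: typing `%x %x %x %x` as input prints words from the stack instead of the literal text.

```c
#include <stdio.h>

int main(void) {
    char buf[128];
    if (fgets(buf, sizeof(buf), stdin) == NULL)
        return 1;
    printf(buf);   /* vulnerable: user input becomes the format string */
    return 0;
}
```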
Writing to memory with %n
In addition to the more commonly used format directives, `printf` also supports a `%n` directive:

| Parameter | Purpose | Stack Usage |
|---|---|---|
| `%n` | Write byte count | Reads 4 bytes (pointer) |
| `%hn`, `%lln` | Write byte count as 16 bits (`%hn`) or 64 bits (`%lln`) | Reads 4 bytes (pointer) |
The `%n` parameter is unusual: instead of producing output, it writes the number of bytes printed so far to an address supplied on the stack. This feature, combined with attacker control of the format string, enables arbitrary memory writes. It's an odd formatting directive, and most programmers are not even aware of its existence.
Note: The origins of, and motivation for, `%n` are a bit obscure (at least to me, looking through old manuals and on-line content). It does not appear in any of the Bell Labs versions of Unix and seems to have entered the library via the BSD variant of Unix (Berkeley Software Distribution). By the time ANSI C standardized the printf family in 1989, `%n` was included in the specification. The directive serves a legitimate purpose in certain formatting scenarios where the programmer needs to know how many characters were output, but its utility is limited and its security implications are severe.

The security community's response to `%n` has varied by platform. OpenBSD recognized the security risk and modified its C library implementation: as of OpenBSD 5.5 (2014), any use of `%n` causes the program to log a warning message and terminate immediately. This aggressive stance reflects OpenBSD's philosophy of removing dangerous features rather than attempting to use them safely. Linux and macOS continue to support `%n` without restriction as of this writing, prioritizing compatibility with the ANSI C standard. The Microsoft C runtime library (used on Windows) disables `%n` by default, likely due to security concerns.
A lesson for attackers and defenders: Obscure features are valuable to attackers precisely because they are obscure. Developers may not know about them and therefore cannot guard against their misuse. Code reviewers may overlook them during audits. Automated security tools may not check for their presence. Maintenance efforts that aim to remove "dangerous" functions like `gets()` or `strcpy()` may miss format string vulnerabilities entirely because they look for specific function names rather than examining how arguments are passed. The existence of `%n`, an unusual directive with a write side effect, hidden among dozens of other format specifiers, shows why a thorough understanding of library interfaces and language features is essential for both exploitation and defense.
If an attacker can control the format string, they can use `%n` to perform arbitrary memory writes. Here's a simple example:

```c
int count;
printf("hello%n", &count);
```

After this call, `count` holds the value `5` because five characters were printed. The vulnerability arises when no explicit pointer argument is passed. The `%n` specifier still expects an address on the stack, and `printf` will use whatever happens to be in the next position on the stack as a pointer. By carefully crafting input, an attacker can control both the value that is written and the address it is written to.
The three-step exploitation process
Exploiting format string vulnerabilities with `%n` requires three coordinated steps: traversing the stack to reach the attacker-controlled buffer, controlling the value to be written, and triggering the write to the target address. Each step is essential for successful exploitation.
Step 1: Traversing the stack to the format string
When `printf` processes a format string, it maintains an internal pointer to the current position on the stack where it expects to find the next argument. Each format specifier advances this pointer. The attacker's goal is to move this pointer from its initial position (just after the format string pointer itself) to a location where the attacker controls the data, typically the format string buffer itself.

Why this matters: The format string is usually stored on the stack. If we can make the format function's internal stack pointer point into our own format string, we can supply addresses that `%n` will use as write targets.

The technique: Use format specifiers that consume stack values without writing anything useful. Each specifier like `%x` or `%u` advances the stack pointer by 4 bytes (on 32-bit systems) or 8 bytes (on 64-bit systems).
Example:

For simplicity, we'll use a 32-bit system example where pointers are 4 bytes. Suppose our format string starts at stack offset 32 (8 stack positions away from where `printf` begins reading arguments). We need to "pop" 8 values off the stack to reach our buffer:

```c
printf("\x10\x01\x48\x08%x%x%x%x%x%x%x%x%n");
```
Here's what happens step by step:

1. `printf` reads the format string pointer from the stack.
2. It encounters the 4-byte sequence `\x10\x01\x48\x08` (which represents the address 0x08480110 in little-endian format), but it treats these bytes as ordinary characters and prints them.
3. Each `%x` directive consumes 4 bytes from the stack and prints them as hexadecimal:
   - First `%x`: reads from stack position 1, advances the pointer.
   - Second `%x`: reads from stack position 2, advances the pointer.
   - ... continues through 8 positions ...
   - Eighth `%x`: reads from stack position 8, advances the pointer.
4. Now the internal stack pointer points to stack position 9, which is where our format string begins and where our address bytes `\x10\x01\x48\x08` are stored.
5. When `%n` executes, it reads those 4 bytes as a pointer (0x08480110) and writes to that address.
Here's a visual representation of what's on the stack:

```
Stack layout:
Low addresses
├─ Position 0: [format string pointer]  ← printf starts here
├─ Position 1: [junk data]              ← 1st %x reads this
├─ Position 2: [junk data]              ← 2nd %x reads this
├─ Position 3: [junk data]              ← 3rd %x reads this
├─ Position 4: [junk data]              ← 4th %x reads this
├─ Position 5: [junk data]              ← 5th %x reads this
├─ Position 6: [junk data]              ← 6th %x reads this
├─ Position 7: [junk data]              ← 7th %x reads this
├─ Position 8: [junk data]              ← 8th %x reads this
├─ Position 9: [\x10\x01\x48\x08...]    ← %n reads address from here
│               └─ our format string starts here
High addresses
```
Finding the correct distance: To determine how many stack positions to traverse, attackers use a test string like `AAAA%x%x%x%x%x%x%x%x%x`. They increase the number of `%x` specifiers until they see `41414141` (the hex representation of "AAAA") in the output. This confirms the stack pointer now points into their controlled buffer.
Optimization note: The number of characters printed by each format specifier varies:

- `%x`: prints a variable-length value (1-8 hex digits depending on the value, without leading zeros).
- `%08x`: prints exactly 8 hex digits (padded with leading zeros).
- `%u`: prints a decimal number (variable length, 1-10 digits for 32-bit values).
- `%.u` or `%.f`: prints minimal output while still consuming stack slots.
- Direct parameter access (e.g., `%8$x`): on some systems, this allows jumping directly to the 8th parameter without consuming the earlier ones. This is a POSIX extension, not a standard C library feature.
Step 2: Controlling the value written by %n
The `%n` directive writes the number of bytes that `printf` has printed so far in the current call. By default, this count is small, just the bytes already processed. To write arbitrary values (such as addresses, which are typically large numbers like `0xbfffd33c`), attackers must artificially increase this byte count.

Why this matters: To gain control of execution, attackers usually want to write an address into a saved return address or function pointer. These addresses are typically large numbers (e.g., `0x08048000` to `0xbfffffff` on 32-bit Linux systems), not the small counts that normal format strings would produce.

The technique: Use width specifiers in format parameters to add padding that increases the printed byte count without requiring actual data. The format `%Nu` (where N is a number) tells `printf` to print a decimal integer padded to N characters.
Simple example:

To write the value 60 to an address:

```c
printf("AAAA%08x%08x%40u%n", ...);
```
Byte count breakdown:

- `AAAA`: 4 bytes (literal characters)
- `%08x`: 8 bytes (first hex value, padded)
- `%08x`: 8 bytes (second hex value, padded)
- `%40u`: 40 bytes (decimal value, padded)
- Total when `%n` executes: 4 + 8 + 8 + 40 = 60 bytes
By using format specifiers with explicit field widths (like `%08x` or `%40u`), attackers achieve exact control over the byte count.
Key insight about byte counts: The format function counts all characters printed, including:

- Literal characters in the format string.
- Output from `%x`, `%d`, `%s`, etc.
- Padding added by width specifiers.
The byte-at-a-time technique: Writing a full 32-bit address in one operation is problematic because the byte count might need to be over 3 billion for high addresses (e.g., `0xbfffd33c` = 3,221,214,012 in decimal). Printing billions of characters would hang the program and is impractical.

The solution is to write one byte at a time to four consecutive memory addresses. This leverages the fact that `%n` writes an integer (4 bytes), but we only care about controlling the least significant byte of what gets written.
Example scenario:

- Target location (where to write): `0x08049abc` (a saved return address on the stack)
- Desired value (what to write): `0xbfffd33c` (the address of our shellcode or of some function we'd like the program to return to)

Break the desired value into bytes: `0xbf` `0xff` `0xd3` `0x3c` (in decimal: 191, 255, 211, 60). Because x86 is little-endian, the least significant byte (`0x3c`) belongs at the lowest address, so it is written first.
The key insight: We embed four different target addresses in our format string, each one byte apart:

- `"\xbc\x9a\x04\x08"`: 0x08049abc -- write position for byte 0
- `"\xbd\x9a\x04\x08"`: 0x08049abd -- write position for byte 1
- `"\xbe\x9a\x04\x08"`: 0x08049abe -- write position for byte 2
- `"\xbf\x9a\x04\x08"`: 0x08049abf -- write position for byte 3
Then we use four `%n` directives, each preceded by padding to control the byte count:

```c
printf("\xbc\x9a\x04\x08\xbd\x9a\x04\x08\xbe\x9a\x04\x08\xbf\x9a\x04\x08"
       "%08x%08x%08x%08x"  // stack popping: 16 address bytes + 32 bytes = 48 so far
       "%12u%n"            // write byte 0: total = 60 bytes
       "%151u%n"           // write byte 1: total = 211 bytes
       "%44u%n"            // write byte 2: total = 255 bytes
       "%192u%n");         // write byte 3: total = 447 ≡ 191 (mod 256)
```

(In a real exploit, each `%u` also consumes one stack word, so the embedded addresses and padding words must be interleaved carefully; the offsets and pop counts here are illustrative.)
Here's what happens:

1. First `%n`: Reads address `0x08049abc` from the format string and writes [3c 00 00 00] there (count = 60, LSB is 0x3c).
2. Second `%n`: Reads address `0x08049abd` from the format string and writes [d3 00 00 00] there (count = 211, LSB is 0xd3). This overlaps with the first write, so memory at `0x08049abc` now contains: [3c d3 00 00].
3. Third `%n`: Reads address `0x08049abe` from the format string and writes [ff 00 00 00] there (count = 255, LSB is 0xff). Memory at `0x08049abc` now contains: [3c d3 ff 00].
4. Fourth `%n`: Reads address `0x08049abf` from the format string and writes [bf 01 00 00] there (count = 447 = 0x1bf, LSB is 0xbf; the 01 spills into the byte just past our target). Final memory at `0x08049abc`: [3c d3 ff bf] = `0xbfffd33c`.

Each `%n` writes 4 bytes starting at its target address, but because each target address is only 1 byte apart, the writes overlap. We control only the least significant byte of each write through careful byte count manipulation, and these overlapping writes construct our desired 4-byte value.

Note on the byte counter: The example above demonstrates an important technique: notice that byte 3 (191) is smaller than byte 2 (255), yet we successfully wrote it. Since we can't decrease the printf byte counter, we used modulo-256 arithmetic: we added 192 bytes of padding to go from 255 to 447, and because only the least significant byte of each write lands in the byte we are constructing, 447 mod 256 = 191 is what ends up in the target byte. This wraparound technique allows us to write any sequence of byte values, regardless of order.
Step 3: Putting it all together
The complete exploitation combines stack traversal (Step 1) with controlled byte writes (Step 2). The exploit string structure is:

```
[addr+0][addr+1][addr+2][addr+3] [stack-pop sequence] [write sequence]
```
Where:

- `[addr+0]` through `[addr+3]`: Four consecutive target addresses (one byte apart)
- `[stack-pop sequence]`: Format specifiers to reach the addresses in our buffer
- `[write sequence]`: Four `%n` directives with calculated padding to write each byte
The example from Step 2 already demonstrates this complete process: we embedded four addresses, popped the stack to reach them, and performed four overlapping writes to construct our desired value byte by byte. This three-step approach—traversing to controlled data, calculating precise byte counts, and triggering writes—is the foundation of format string exploitation.
Alternative technique - short writes: Some exploits use `%hn` instead of `%n`, which writes a short (2 bytes) instead of an int (4 bytes). This reduces the number of write operations from 4 to 2 but requires writing larger byte counts (up to 65,535 instead of 255). This is useful when you want to avoid overwriting adjacent data or when the target architecture has strict alignment requirements, but not all C library implementations support it reliably.
Preventing format-string vulnerabilities
The simplest defense is never to use untrusted data as a format string. Always supply a constant format string and treat user data as ordinary arguments.
This is a safe pattern:

```c
printf("%s", user_input);
```

This is an unsafe pattern:

```c
printf(user_input);
```
Compilers can detect many unsafe calls when warnings are enabled. Use the following options to improve detection:
- `-Wformat-security` warns when a function like `printf` uses a nonliteral format string.
- `-D_FORTIFY_SOURCE=2` adds runtime checks to detect mismatched format arguments in some libraries.
It is also good practice to use functions that include explicit buffer size limits, such as `snprintf` and `vsnprintf`. These limit the output length and reduce the risk of overflow when writing formatted data.
A format-string vulnerability occurs when user input is treated as a format specification rather than as data. Attackers can read or write arbitrary memory through `%x`, `%s`, and `%n` directives, often without overflowing any buffer. The defense is simple but essential: use fixed format strings and validate all output operations that accept external input. A single misplaced `printf` call can compromise an entire program.
Exploitation Concepts
This section explains the attacker’s choices once memory corruption allows modification of control data. The goal is to give a clear conceptual model: what an attacker can try to do, what constraints they face, and why different techniques exist. The exposition is descriptive and non-operational. It focuses on principles, not on step-by-step procedures.
The attacker’s decision tree
When an attacker can overwrite control data such as a return address or a function pointer, they face a simple decision tree:
1. Can they place executable code at a reachable location? If yes, they can attempt to jump there and run that code.
2. If injected code cannot run, can they reuse existing code to achieve their goals? If yes, they will try code reuse techniques.
3. If neither option is reliable, can they instead achieve their goal via data corruption, privilege escalation, or information disclosure?
Every exploitation strategy is an engineering response to the environment defined by the program, the operating system, and the hardware. The remainder of this section describes the main categories of strategies and the constraints that shape them.
Shellcode: purpose, constraints, and staging
Shellcode is the historical term for the small sequence of machine instructions an attacker places into writable memory with the intention of executing it. The archetypal shellcode spawns a command shell; the term now applies more broadly to any small fragment that serves as an initial payload.
Three constraints commonly shape shellcode:

- Position assumptions. If the attacker can predict addresses reliably, the payload can contain absolute references. When addresses are randomized, position independence is required.
- Forbidden bytes. When the copy primitive treats input as a C string, certain bytes terminate the copy. Shellcode often avoids those bytes to survive the copy step.
- Size. The vulnerable slot is usually limited in length, which forces minimalism.
Shellcode often appears in two stages. The first stage is deliberately small and performs limited, architecture-specific setup: preparing registers and stack state, requesting additional bytes, or invoking a minimal runtime action. The second stage is larger and implements the main functionality. The first stage fits the constrained slot; the second stage is delivered or constructed after the first has run.
This description explains why staged payloads exist without providing operational detail. Staging separates placement from functionality: a small first stage gains the conditions a larger payload needs.
Landing zones and NOP sleds
If an attacker injects code, they must set the instruction pointer to an address that reaches useful instructions. Exact addressing is fragile. A landing zone, commonly implemented as a NOP sled, reduces the need for precise targeting.
A landing zone is a contiguous region of harmless instructions (e.g., NOP, no operation instructions) followed by the payload. If the instruction pointer lands anywhere inside that region, execution advances through harmless instructions until it reaches the payload. A landing zone increases the range of addresses that succeed, which relaxes the precision requirement for the corrupted pointer.
A landing zone is useful whether or not the payload is position independent. Position independence determines whether the code can run from multiple addresses; the landing zone determines how precise the jump must be. Even when the payload is position independent, a landing zone improves alignment tolerance and increases the chance that a partial or imprecise overwrite still lands in executable code that flows into the payload.
Landing zones lose value when the system prevents execution from writable memory. In that environment, an attacker cannot execute injected bytes at all and must instead look for other ways to reuse existing code.
Return-to-libc: invoking existing routines
Return-to-libc is the simplest form of code reuse. Instead of jumping to injected instructions, the attacker sets the return address to an existing function in a library, typically a standard C library function such as `system()`. By arranging the stack appropriately, the attacker can cause that function to execute with attacker-chosen arguments.
Conceptually, return-to-libc shows two points:

- It bypasses non-executable-memory protections because it reuses code already marked executable.
- It is constrained by the routines that exist in the process and by the attacker’s ability to place arguments so that a chosen routine performs useful work. Some standard library functions expose considerable power; for example, wrappers around the `execve` system call or the `system()` helper let a caller run an arbitrary program when provided a suitable argument string. That capability is why directing control to an appropriate library routine can be equivalent to executing code, and it explains why simple return-to-libc attacks historically targeted functions that invoke a shell or execute commands.
Return-to-libc highlights a general idea: reuse convenient, trusted code to achieve untrusted goals. It is simple to state and to teach, and it motivates why defenses aim to make both address discovery and argument setup difficult.
Return-Oriented Programming (ROP): composing computation from gadgets
Return-Oriented Programming generalizes the reuse of existing code into a more powerful technique. The attacker locates short instruction sequences in executable memory that end in a return (`ret`) instruction. Each fragment, called a gadget, performs a small, useful operation, such as moving data between registers or performing an arithmetic step. By chaining gadgets through a sequence of return addresses on the stack, the attacker composes arbitrary computation.
ROP works because returns transfer control to addresses the attacker places on the stack. Gadgets are assembled like Lego blocks: one gadget prepares register state, the next executes an operation, and so on. With a rich enough set of gadgets, an attacker can implement complex behavior without injecting new code.
The success of ROP depends on:

- The availability of gadgets in executable memory. Compilers and linkers that change instruction alignment or remove unused code can reduce gadget density.
- Being able to locate those gadgets. Address randomization and information leaks therefore have a direct impact on ROP feasibility.
ROP is powerful but complicated to construct; defenders aim to remove reliable primitives and to increase uncertainty to make ROP chains brittle. However, tools have been created to make life easier for attackers. For example, ropc is a Turing-complete ROP compiler.
These exploitation strategies -- shellcode injection, landing zones, return-to-libc, and ROP -- illustrate how attackers move from simple to sophisticated methods as protections increase. The next section examines those defensive mechanisms and how they restrict or detect these attacks.
Defenses and Hardware Mechanisms
The techniques described earlier exploited the same weakness: the program trusted memory contents that could be corrupted. Defensive mechanisms try to restore that trust by controlling which memory can be executed, which addresses can be targeted, and which control transfers are valid. Each mechanism solves a particular problem that earlier systems left exposed. Modern systems combine them to make reliable exploitation much harder.
Non-executable memory (NX, DEP, W^X)
The earliest exploits depended on placing machine instructions into a buffer and diverting execution into that buffer. The processor did not distinguish between data and code. Non-executable memory changes that. The operating system marks data regions such as the stack and heap as non-executable. The CPU refuses to fetch instructions from those pages.
This simple change blocks the classic shellcode attack. A buffer overflow can still overwrite data, but the processor will not execute that data as code. Different operating systems use different names for the same policy: NX (no execute), DEP (Data Execution Prevention), or W^X (writable XOR executable). Memory may be writable or executable, but not both.
NX does not fix memory corruption; it merely removes one outcome. Attackers responded by finding ways to execute existing code instead of injecting new code. This shift gave rise to return-to-libc and later return-oriented programming.
Address-space layout randomization (ASLR)
Return-to-libc made NX alone insufficient. If the attacker knew where a library function such as `system()` lived, they could redirect control there. The next step in defense was to randomize memory locations so that addresses were unpredictable from one run to the next.
Address-space layout randomization (ASLR) shuffles the base addresses of the program, its shared libraries, the stack, and the heap. Each process gets a slightly different layout. With ASLR in place, a hardcoded address rarely points to the same code twice.
ASLR solved the predictability problem that made return-to-libc reliable. To succeed, an attacker must now first learn or guess the randomized addresses. Information leaks that reveal any valid pointer can undermine ASLR by giving the attacker a reference point. The effectiveness of ASLR therefore depends on how much randomness the system provides and whether any information leaks exist.
ASLR raised the cost of exploitation but did not eliminate it. It simply turned a deterministic attack into a probabilistic one. Attackers responded with information leaks and partial overwrites to reestablish predictability.
Stack canaries
The stack stores both local variables and the function's return address. Classic stack overflows worked because nothing protected the boundary between them. A stack canary restores that missing protection. The compiler inserts a small random value (a "canary") between local buffers and saved control data. Before returning from the function, the program checks whether the canary value changed. If it did, the program terminates immediately.
The idea comes from an old mining analogy: a canary warned miners of invisible danger before it harmed them. Stack canaries warn of corrupted control data before a malicious return occurs.
Stack canaries stop simple overwrites that extend from a buffer into the saved return address. They do not protect against overwriting variables elsewhere in memory or against vulnerabilities such as use-after-free. The effectiveness of the check also depends on secrecy: if the attacker can read the canary, they can include the correct value when overwriting the stack.
Canaries solved the missing boundary between local data and control data. They were one of the first compiler-level defenses that detected corruption rather than preventing it.
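Conceptually, the compiler-inserted logic looks roughly like the sketch below. The symbol names follow glibc's stack protector (`__stack_chk_guard`, `__stack_chk_fail`); real compilers place the canary and perform the check in generated code, not in C.

```c
extern unsigned long __stack_chk_guard;   /* per-process random canary value */
extern void __stack_chk_fail(void);       /* aborts: "stack smashing detected" */

void f(void) {
    unsigned long canary = __stack_chk_guard;  /* placed between buf and saved data */
    char buf[64];
    /* ... function body that writes into buf ... */
    (void)buf;
    if (canary != __stack_chk_guard)           /* changed? an overflow reached it */
        __stack_chk_fail();
}
```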
Safer C libraries and format hardening
Many vulnerabilities come from functions that assume developers provide correct arguments and buffer sizes. The C standard library was designed for performance, not safety. Functions such as `gets`, `strcpy`, and `sprintf` have no built-in limit on how much data they copy or print. Safer library variants and compiler hardening options address this.

Modern compilers warn when code uses unsafe functions or passes nonliteral format strings to output functions such as `printf`. Hardened libraries replace vulnerable routines with safer versions like `fgets`, `strncpy`, and `snprintf`. Additional checks, such as `FORTIFY_SOURCE`, verify at runtime that the destination buffer is large enough for the operation.
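For instance, in a fortified glibc build (a sketch; exact behavior depends on the compiler, optimization level, and C library), the `memcpy` below is routed through the checking variant `__memcpy_chk`, which aborts at run time if the copy would exceed the destination:

```c
#include <string.h>

/* Compile with: cc -O2 -D_FORTIFY_SOURCE=2 ... */
void copy_field(const char *src, size_t n) {
    char buf[16];
    memcpy(buf, src, n);   /* checked against sizeof(buf) in fortified builds */
    (void)buf;
}
```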
These measures solve the unchecked-boundary problem in standard library calls. They prevent many mistakes before they can reach production code.
Linker and loader hardening
Some exploits target not the program’s data but the structures used by the dynamic linker. Early systems stored relocation and function binding data in writable memory. Overwriting those tables could redirect function calls or initialization routines to attacker-controlled addresses.
Linker and loader hardening changed this model. Modern systems mark relocation sections as read-only after dynamic linking, a feature known as RELRO. Immediate binding of symbols prevents runtime resolution that an attacker might hijack later. The result is that dynamic-link data is no longer a practical control target.
This defense solved the integrity problem in the linkage process. It eliminated an entire class of overwrites against runtime relocation data.
Allocator hardening
Heap corruption exploits take advantage of the memory allocator’s internal metadata. Early allocators stored management structures inside user-accessible memory. Overwriting those structures could create arbitrary writes or pointer leaks. Modern allocators add multiple defenses to solve this problem.
- Integrity checks and safe unlinking. The allocator verifies that free-list pointers are consistent before using them. This stops most attacks that forged links between memory chunks.
- Heap canaries (cookies). Many allocators insert small random guard values before or after each heap block to detect overflows and underflows within individual blocks. When the block is freed, the allocator checks that the canary is unchanged; if an overflow or underflow modified it, the program aborts before the corruption can spread to allocator metadata.
- Pointer mangling (safe linking). The allocator encodes free-list pointers with a secret or with address bits so that attackers cannot guess valid values.
- Quarantine and delayed reuse. Recently freed chunks are not immediately reused. This limits predictable reallocation that would make use-after-free attacks deterministic.
- Randomized placement and per-thread caches. Allocation decisions vary over time and by thread, reducing predictability of heap layout.
- Out-of-line metadata. Some allocators store management data outside user memory entirely, so a buffer overflow cannot reach allocator structures directly.
Allocator hardening solved the implicit-trust problem in heap metadata. It transformed heap corruption from an immediate exploit into a difficult reliability problem.
Developer instrumentation and fuzzing
Preventing memory vulnerabilities at runtime is important, but the best outcome is to remove them before software is released. Modern compilers and testing tools make this possible through runtime instrumentation and automated input generation.
Sanitizers
Sanitizers are compiler-based runtime checkers that detect memory and arithmetic errors as they happen. They insert lightweight checks into the compiled program.
- AddressSanitizer (ASan) detects out-of-bounds and use-after-free errors.
- Undefined Behavior Sanitizer (UBSan) detects invalid operations such as signed integer overflow or bad type casts.
- LeakSanitizer reports unfreed memory and pointer leaks.
These tools turn silent corruption into clear, reproducible crashes. They add overhead, so they are used during development and testing, not production.
Fuzzing
Even with instrumentation, testing needs diverse inputs. Fuzzing generates large numbers of semi-random inputs, monitors execution for crashes or sanitizer failures, and identifies unusual behaviors. Fuzzing’s value is in discovering edge cases that human testers would never try.
Modern fuzzers are coverage-guided, using compiler instrumentation to measure which paths each input exercises and to mutate inputs that expand coverage. Combined with sanitizers, fuzzing is one of the most effective methods for discovering memory-safety bugs before deployment.
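As a sketch, a coverage-guided fuzz target in the libFuzzer style looks like this; `parse_message` is a hypothetical function under test, and the harness would be built with something like `clang -g -fsanitize=fuzzer,address harness.c`:

```c
#include <stddef.h>
#include <stdint.h>

void parse_message(const uint8_t *data, size_t size);   /* hypothetical code under test */

/* The fuzzer calls this entry point repeatedly with mutated inputs,
   keeping any input that reaches new code paths. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse_message(data, size);
    return 0;
}
```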
Sanitizers and fuzzing together address the visibility problem in software security. They do not stop attacks at runtime but find and eliminate vulnerabilities long before an attacker can exploit them.
Hardware support for control-flow and pointer integrity
Software defenses raise the bar, but attackers can still manipulate return addresses and function pointers if they find a suitable memory corruption path. Processor vendors added hardware mechanisms to enforce control-flow and pointer integrity directly.
Intel Control-flow Enforcement Technology (CET)
CET addresses two fundamental weaknesses: return address integrity and indirect branch targeting. It adds a shadow stack, a protected region that stores a copy of each return address. On every function return, the processor compares the normal stack’s return address to the one in the shadow stack. If they differ, execution halts. This directly prevents the attacker from changing the return address in memory.
CET also adds indirect branch tracking. The compiler marks valid branch targets with a special instruction. If the processor encounters an indirect branch to an address without that marker, it faults. This stops jumps into the middle of existing code sequences, which is how return-oriented programming chains are built.
CET solved the integrity problem of return addresses and indirect control transfers. It made tampering visible to hardware rather than relying on software checks.
ARM Pointer Authentication (PAC)
ARM’s pointer authentication protects pointers from tampering by adding a short, keyed integrity check to each pointer value. When the processor creates or modifies a pointer, it computes a Pointer Authentication Code (PAC) over the pointer value and a context value such as the stack pointer. The computation uses a small hardware-supported cryptographic function derived from the process’s secret key, which is stored in special registers that software cannot read or modify directly.
The PAC is a compact authentication tag, not a full cryptographic hash. It functions more like a lightweight message authentication code (MAC) than a general-purpose encryption primitive. The goal is integrity, not secrecy or long-term collision resistance. The PAC typically occupies the unused upper bits of the pointer on 64-bit systems and can be verified quickly by the processor when the pointer is used. If verification fails, the pointer is treated as invalid, which usually triggers an exception.
Pointer authentication solves the trust problem in pointers. It allows the processor to detect when a pointer has been modified outside expected control flow. This makes attacks that overwrite return addresses or function pointers unreliable, since a forged pointer will usually fail authentication.
ARM Memory Tagging Extension (MTE)
Memory Tagging Extension focuses on detecting spatial and temporal memory errors. The processor associates a small tag with each memory allocation and with each pointer that references it. On load or store, the processor checks that the tags match. If they do not, the access is invalid.
MTE solves the silent memory-safety problem. It detects use-after-free and out-of-bounds errors at runtime. Instead of allowing corrupted reads or writes to silently change data, MTE raises an exception or reports the error. It turns a previously invisible vulnerability into a detected fault.
Apple Memory Integrity Enforcement (MIE)
Apple’s Memory Integrity Enforcement extends these ideas into a complete system design. MIE combines hardware tagging with a secure allocator and runtime validation. The goal is to maintain continuous integrity of user-space memory. Allocations receive tags; tagged pointers and runtime checks ensure that invalid accesses are detected consistently across the system.
MIE solves the integration problem: how to make hardware features effective across the entire software stack. When combined with tagging and authenticated pointers, MIE makes exploitation of memory errors much more difficult.
How these defenses fit together
Each defense exists at a different layer of the system. Together they form a hierarchy of protection: hardware enforces basic rules about what memory and control data can do, the operating system applies those rules to processes, the compiler inserts runtime checks, and developer tools detect vulnerabilities before deployment.
Hardware mechanisms
Hardware defines what the processor will and will not execute or accept as valid control flow.
- NX (DEP, W^X) prevents instruction fetches from writable memory pages.
- CET protects return addresses and indirect branches with a shadow stack and branch markers.
- PAC authenticates pointer values with lightweight cryptographic codes.
- MTE and MIE detect out-of-bounds and use-after-free errors by tagging memory and pointers.
These features move enforcement into silicon and make many classes of corruption immediately detectable.
Operating system and runtime support
The OS builds on hardware capabilities and controls process layout and memory permissions.
- ASLR randomizes base addresses of code, data, and stack regions.
- RELRO and loader hardening mark relocation tables as read-only after linking.
- Allocator hardening enforces integrity checks, pointer mangling, and delayed reuse of freed memory.
- Non-executable stacks and heaps are implemented through OS memory-mapping policies.
The OS layer defines the execution environment: where memory resides, what permissions it has, and how the process manages it.
Compiler and language-level defenses
The compiler knows how code uses memory and control data. It inserts runtime checks that detect corruption and enforces safer function and linking behavior.
- Stack canaries detect overwrites of return addresses within stack frames.
- Safer library calls (`snprintf`, `fgets`, fortified functions) replace unsafe standard functions.
- Format-string checks warn when nonliteral format strings reach output functions.
- Linker and symbol hardening restrict how functions are bound at load time.
These mechanisms close many of the direct memory-safety holes visible at the source-code level.
Developer and testing tools
Defenses do little good if software ships with undetected vulnerabilities. Developer tools find and remove bugs before deployment.
- Sanitizers (ASan, UBSan, LeakSanitizer) insert runtime checks to detect invalid memory or arithmetic operations during testing.
- Fuzzing exercises programs with massive numbers of random or mutated inputs to expose crashes and sanitizer failures.
These tools address the visibility problem: they expose memory errors early, turning them into fixable bugs instead of exploitable vulnerabilities.
Putting it together
Each layer reinforces the others:
- Hardware prevents direct corruption from executing.
- The OS controls memory layout and permissions.
- The compiler guards against local corruption of control data.
- Developer tools detect unsafe behaviors before release.
When combined, these layers make exploitation far more difficult. An attacker must now bypass randomization, non-executable memory, integrity checks, pointer authentication, and memory tagging, all at once. The defense-in-depth model is what allows modern systems to remain reliable despite occasional programming mistakes.
Limitations and practical tradeoffs
No defense eliminates all risk. Each one has cost and scope.
- NX and ASLR rely on hardware and OS support and can interfere with specialized software such as JIT compilers.
- Stack canaries detect corruption but do not prevent it, and can be bypassed if their value is known.
- Safer libraries depend on developers choosing the right functions.
- Allocator hardening and tagging increase memory and CPU overhead.
- Sanitizers and fuzzing slow execution and are used for testing, not deployment.
- Hardware mechanisms require compiler and OS cooperation and may not protect legacy binaries.
Even with all these measures, logic errors and information leaks can still enable exploitation. The value of these defenses lies in how they interact: each layer removes an easy path and forces the attacker toward rarer, less reliable conditions.
Conclusion
Memory safety is never absolute. Early systems trusted the program’s memory layout and paid the price. Non-executable memory, address randomization, and stack canaries reintroduced structure and boundaries. Hardened allocators and safer libraries reduced the attacker’s control over metadata. Development tools like sanitizers and fuzzing detect and eliminate many vulnerabilities before deployment. Hardware features such as CET, PAC, and MTE move enforcement into silicon, making corruption more detectable and less exploitable.
Each mechanism solved a different piece of the same problem: trusting that memory holds what the program expects. Modern systems still depend on that trust, but the layers of defense make it far more costly to violate.
Key Takeaways
Memory defenses evolved in response to specific exploit techniques. Each one closed a gap that attackers had used successfully.
- NX blocks execution of injected code; ASLR breaks address predictability.
- Stack canaries detect overwrites of return addresses before control is lost.
- Hardened allocators protect heap metadata and make corruption less reliable.
- Safer libraries and compiler checks prevent many simple overflow bugs.
- Sanitizers and fuzzing find vulnerabilities early, before deployment.
- Hardware mechanisms such as CET, PAC, MTE, and MIE enforce memory and control-flow integrity directly in the processor.
- Defense-in-depth is the guiding principle: each layer reduces what the attacker can do and forces increasingly complex, unreliable exploits.