Stack frames
A really quick explanation of stack frames and frame pointers
Paul Krzyzanowski
March 4, 2024
Understanding Frame Pointers
Each function has local memory associated with it to hold incoming parameters, local variables, and (in some cases) temporary variables. This region of memory is called a stack frame and is allocated on the process's stack. A frame pointer (the ebp register on 32-bit Intel x86 architectures, rbp on 64-bit architectures) contains the base address of the function's frame. The code to access local variables within a function is generated in terms of offsets from the frame pointer. The stack pointer (the esp register on 32-bit Intel x86 architectures, rsp on 64-bit architectures) may change during the execution of a function as values are pushed onto or popped off the stack, for example when parameters are pushed in preparation for calling another function. The frame pointer doesn't change throughout the function.
Here’s what happens in a function call (there might be slight differences among languages/architectures):
Push the current value of the frame pointer (ebp/rbp). This saves it so we can restore it later.
Move the current stack pointer to the frame pointer. This defines the start of the frame.
Subtract the space needed for the function’s data from the stack pointer. Remember that stacks grow from high memory to low memory. This puts the stack pointer past the space that will be used by the function so that anything pushed onto the stack now will not overwrite useful values.
Now execute the code for the function. References to local variables will be negative offsets from the frame pointer (e.g., “movl $123, -8(%rbp)”).
On exit from the function, copy the value from the frame pointer to the stack pointer (this clears up the space allocated to the stack frame for the function) and pop the old frame pointer. This is accomplished by the “leave” instruction.
Return from the procedure via a “ret” instruction. This pops the return address from the stack and transfers execution to that address.
Basic example
Let’s consider the following set of functions in a file called try.c:
void
bar(int a, int b)
{
    int x, y;
    x = 555;
    y = a+b;
}
void
foo(void) {
    bar(111,222);
}
We’ll compile it via:
gcc -S -m32 try.c
The -S option tells the compiler to create an assembly file. The -m32 option tells the compiler to generate code for a 32-bit architecture. In this example, it keeps the numbers smaller and we don’t have to worry about specifying -mno-red-zone (see more details, below).
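In the code below, gcc places the arguments with the mov instruction (movl) instead of push. The x86 instruction set does have a push instruction that takes a constant, but here gcc reserves the space for the outgoing arguments with a single adjustment of the stack pointer and then stores each argument at its offset from the stack pointer. The result is the same as pushing the values. Other gcc versions or options may emit push instructions instead.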
The generated code is (removing lines that contain directives to the linker):
bar:
pushl %ebp
movl %esp, %ebp
subl $16, %esp
movl $555, -4(%ebp)
movl 12(%ebp), %eax
movl 8(%ebp), %edx
addl %edx, %eax
movl %eax, -8(%ebp)
leave
ret
foo:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
movl $222, 4(%esp)
movl $111, (%esp)
call bar
leave
ret
We can annotate the code and trace it by starting at foo():
bar: # --------- start of the function bar()
pushl %ebp # save the incoming frame pointer
movl %esp, %ebp # set the frame pointer to the current top of stack
subl $16, %esp # increase the stack by 16 bytes (stacks grow down)
movl $555, -4(%ebp) # x=555; x is located at [ebp-4]
movl 12(%ebp), %eax # 12(%ebp) is [ebp+12], which is the second parameter
movl 8(%ebp), %edx # 8(%ebp) is [ebp+8], which is the first parameter
addl %edx, %eax # add them
movl %eax, -8(%ebp) # store the result in y
leave # copy ebp to esp and pop the saved frame pointer
ret # pop the return address and branch back to foo
foo: # --------- start of the function foo()
pushl %ebp # save the current frame pointer
movl %esp, %ebp # set the frame pointer to the current top of the stack
subl $8, %esp # increase the stack by 8 bytes (stacks grow down)
movl $222, 4(%esp) # this is effectively pushing 222 on the stack
movl $111, (%esp) # this is effectively pushing 111 on the stack
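call bar # call = push the instruction pointer on the stack and branch to bar
leave # copy ebp to esp and pop the saved frame pointer
ret # pop the return address and branch back to the caller of foo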
Let’s see what happens. In foo(), we need to prepare the stack for two parameters that will be sent to bar(). The compiler could push them directly with
push $222
push $111
but here it instead generates code to subtract 8 from the stack pointer, making the stack grow by eight bytes (enough to hold two 32-bit values). It then uses stack offset addressing to place the values 111 and 222 on the stack (see figure 1).
Then foo calls bar. This pushes the return address onto the stack so it looks like this when execution starts at bar (figure 2):
On entry to bar(), we save the previous value of ebp and set the frame pointer to the top of the stack (the current position of the stack pointer). Then we grow the stack by subtracting 16 from the stack pointer. Stacks on Intel architectures grow from high memory to low memory, so the top of the stack (the latest contents) is in low memory. The stack now looks like the one shown in figure 3. We have a stack frame for the function bar that holds local data for this instance of the function. Negative offsets from the frame pointer %ebp (toward the top of the stack, into lower memory) refer to local data in bar. Positive offsets from %ebp allow us to read incoming parameters.
Now we’re ready to execute the trivial logic of the function. We set local variable x to 555. This variable occupies the four bytes just below the saved ebp, at [ebp-4]. The next statement adds the two parameters and stores the result into the local int y. The code for this reads the value of b (which is [ebp+12]) into register %eax. The value of a (which is [ebp+8]) is read into register %edx. The two values are added and the result is stored in y, which is [ebp-8]. Figure 4 shows the position of the parameters and local variables.
When we’re done, we execute “leave”, which sets the stack pointer to the value of the frame pointer (%ebp) and pops the saved value of the frame pointer (the one the function foo was using). Now the stack pointer is pointing to the return address within foo that was saved when the call instruction was executed, and our frame is effectively deallocated. The ret instruction pops that return address and transfers control back to foo, right after the call bar instruction.
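You might be wondering why the stack was adjusted by 16 bytes instead of the eight needed to hold x and y. The most likely reason is alignment: by default, gcc tries to keep the stack aligned on 16-byte boundaries (this is controlled by the -mpreferred-stack-boundary option), so it reserves local space in multiples of 16 bytes. If you allocate two more local ints, the frame remains the same size. If you allocate another int after that, the compiler grows the frame to 32 bytes.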
More details about how frames are used
On 64-bit x86 systems, gcc (and other compilers) passes the first few (6) integer or pointer parameters in registers rather than on the stack. In unoptimized code, the function then copies them into areas inside its own frame.
As an optimization, the System V ABI for x86-64 (used by Linux and most other Unix-like systems) lets a function use up to 128 bytes of stack space below the stack pointer without adjusting the stack pointer. Signal and interrupt handlers are guaranteed not to modify this region. You can search for “red zone” to read about this if you’re interested. The gcc compiler can be told not to use it via the -mno-red-zone option.
Since the compiler can keep track of what’s going on with the stack at any point in time, the frame pointer isn’t strictly necessary. You can compile code to use the stack pointer exclusively with the -fomit-frame-pointer option to gcc.
Exploiting buffer overflow
By exploiting a buffer overflow, you can write arbitrary data onto the stack. This means that you can change the return address of a function and also change the data past that return address: the parameters and local variables of the calling functions. In a basic code injection attack, you change the return address to the address of the buffer that you overwrote with code of your choosing. You have now injected code into the program. In a simple return-oriented programming attack (a return-to-libc attack), you change the return address to the address of a library function such as system() and insert data on the stack to give system() the parameters you want (e.g., a command to execute). Note that the code illustrated above is not vulnerable to buffer overflow since we’re using scalars (just ints) instead of arrays.