Assignment 4

Due Monday March 13, 2017 6:30pm via sakai

Introduction

This assignment helps you develop a detailed understanding of the calling stack organization on an IA-32 processor. It involves applying a series of buffer overflow attacks on an executable file called bufbomb that you will be provided.

Note: In this assignment, you will gain firsthand experience with one of the methods commonly used to exploit security weaknesses in operating systems and network servers. Our purpose is to help you learn about the runtime operation of programs and to understand the nature of this form of security weakness so that you can avoid it when you write system code. We do not condone the use of these or any other form of attack to gain unauthorized access to any system resources. There are criminal statutes governing such activities.

Time

There isn’t much coding in this assignment: just a handful of lines of assembly code. If you know exactly what you are doing, you can complete this assignment in under an hour. But … don’t count on that!

You will need to have a rudimentary proficiency with the gdb debugger. If you haven’t used it before, allot several hours of time to read up on it and try it out. If your program has any mistakes, you’ll get an error and the program will exit. Finding the error can be frustrating as you do not have the ability to insert print statements. You’ll need to set breakpoints, check registers, and disassemble code in your buffer.

Start early. You might be lucky and finish super early. That’s a much better feeling than having nothing working the night before the assignment is due.

Logistics

This is an individual assignment. You will work alone in solving the problems for this assignment. The handin will be a file named exploits.txt containing all of the Completion Notification messages that will be printed by the command when you successfully complete each part of the assignment.

Environment

This assignment is to be done on a 32-bit Intel Linux environment that has ASLR disabled. You can use a virtual machine on the iLab systems or run one on your Mac/PC/Linux system using VirtualBox. You can also connect to the iLab VMs if you’re connected to the Rutgers VPN.

Rutgers VMs

Rutgers has set up a few systems that you can use for this. You can ssh to them from the iLab systems. For example:

ssh net000@classvm101.cs.rutgers.edu

You should have received email from me containing your hostname and initial password at your scarletmail.rutgers.edu account. Your login name is your netid. I recommend that you change your password (via the passwd command) immediately.

The VMs do not have shared folders with their iLab hosts. You will have to use the scp command to copy files back and forth.

Be sure to save all your work on your regular systems as the VMs are not guaranteed to be backed up. In particular, save all your exploit strings, assembler code, and completion confirmation messages.

Protect your directory!

The VMs are shared systems. Be sure to protect your work from prying eyes both on the VM and on the iLab machines. Set the permissions on the directory you’re working in to make files accessible only to you:

chmod 700 .

Your own system: running a VM on VirtualBox

You may find it more convenient to use your own PC and run a Linux VM. To do this:

  1. Download the Ubuntu VM image, ubuntu.vdi.

  2. Download VirtualBox

  3. Set up the virtual machine
    • Create a new virtual machine
    • Give it a name (e.g., “ubuntu–419”)
    • The Type is Linux
    • The Version is Ubuntu (32-bit)
    • When you get to the Hard Disk prompt, select “Use an existing virtual hard disk file”
    • Navigate to wherever you placed your ubuntu.vdi file
    • Click Create
  4. Now you’ll see your OS (e.g., ubuntu–419) appear on the left panel in a Powered Off state.

  5. Click the Start button to boot it.

  6. Log in as root. The password is root.

  7. Install VirtualBox’s guest-utils package so you can share files. Inside the VM, run apt-get install virtualbox-guest-utils

    You need to do this for macOS. For other systems, you may be able to go to the Devices menu and select “Insert Guest Additions CD image” instead; I haven’t tried that.

  8. Reboot the system by running the reboot command.

  9. Navigate to Devices > Shared Folders > Shared Folders Settings… in the VirtualBox UI.
    • Click on the + icon on the right and choose:
      • Folder Path: whatever folder you want to make shareable
      • Folder Name: some name for this. For example, shared
      • Select Auto-mount
    • Click OK
  10. Reboot. Run the mount command to see mount points. The entry for your shared folder will probably be the last one and shows where the directory shared with your host system is mounted. You can now edit files on your host system and they will appear on your VM.

  11. This is important. Turn off ASLR via the command

    echo 0 > /proc/sys/kernel/randomize_va_space

    You will need to do this whenever you reboot, so you may want to add this command to your .bash_profile script. Use your favorite editor to edit (or create) ~/.bash_profile, add the above command, and save the file.

  12. If you want to create an alias to the shared folder, you may do so with the following command after logging in:

    ln -s path_to_folder name_for_alias

    That will create a link named name_for_alias in your current directory that points to path_to_folder.

Install the package

Download bufbomb-proj.tar and copy it to your virtual machine. Extract the contents with

tar xvf bufbomb-proj.tar

You will find three programs:

makecookie: Generates a cookie based on your NetID.

bufbomb: The code you will attack.

sendstring: A utility to help convert between string and binary formats.

Reading

This assignment is about creating several buffer overflow attacks. To do this, you will need to have an intimate understanding of how the stack works on an Intel/Linux/gcc system. You can find an easy-to-follow writeup here:

The first part focuses on how the stack changes as a function is called and the second part focuses on the return. One notable difference in your environment is that gcc will not write a function’s return value onto the stack. Instead, it will simply place it in the %eax register. I strongly recommend that you read both of these documents carefully before starting the assignment.

Two other useful references are:

If you look through other sources, such as those last two, you will see two forms of syntax. gcc and gdb use what is called AT&T syntax. The tell-tale sign is % before register names. With this notation, the destination is on the right:

sub $8, %esp

subtracts 8 from the stack pointer (esp register). The other notation is Intel syntax, where the destination is on the left:

sub esp, 8

also subtracts 8 from the stack pointer. Note the lack of the % and $ prefixes. Similarly

mov ebp, esp

is Intel syntax for moving the contents of the esp register to the ebp register. With AT&T syntax, you would write mov %esp, %ebp.

We will use AT&T syntax for the remainder of this document.

Create a cookie

You should create a name for yourself based on your NetID. You must follow this scheme for generating your handin submissions. Our grading program will only give credit to those people whose IDs can be extracted from handin submissions.

A cookie is a string of eight hexadecimal digits that is (with high probability) unique to yourself. You can generate your cookie with the makecookie program by giving your NetID as the argument. For example:

$ ./makecookie pxk123
0x4548b56a

Use your own NetID in place of pxk123, of course. In four of your five buffer attacks, your objective will be to make your cookie show up in places where it ordinarily would not.

The Bufbomb Program

The bufbomb program reads a string from standard input with a function getbuf that has the following C code:

1 int getbuf()
2 {
3   char buf[12];
4   Gets(buf);
5   return 1;
6 }

The function Gets is similar to the standard library function gets – it reads a string from standard input (terminated by '\n' or end-of-file) and stores it (along with a null terminator) at the specified destination. In this code, the destination is an array buf that has sufficient space for 12 characters.

Neither Gets nor gets has any way to determine whether there is enough space at the destination to store the entire string. Instead, they simply copy the entire string, possibly overrunning the bounds of the storage allocated at the destination.
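
For contrast, here is a minimal sketch of a bounded read, the kind of thing you would write to avoid this weakness in your own code. The helper name read_line_bounded is hypothetical and is not part of bufbomb; it relies only on the standard fgets, which stores at most size - 1 characters plus the null terminator.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of bufbomb: read one line from `in`
 * into `dst`, never writing more than `size` bytes (including the
 * terminating null). Returns the number of characters stored, or -1
 * on end-of-file or error before anything was read. */
int read_line_bounded(FILE *in, char *dst, size_t size)
{
    if (size == 0 || fgets(dst, (int)size, in) == NULL)
        return -1;

    size_t len = strlen(dst);
    if (len > 0 && dst[len - 1] == '\n') {
        dst[--len] = '\0';      /* strip the newline, as gets does */
    } else {
        /* The line was longer than the buffer (or ended at EOF):
         * discard the rest so the next read starts on a fresh line. */
        int c;
        while ((c = fgetc(in)) != EOF && c != '\n')
            ;
    }
    return (int)len;
}
```

With a 12-byte buffer, the input "This string is too long" would be truncated to its first 11 characters instead of overrunning the array.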

If the string typed by the user to getbuf is no more than 11 characters long, it is clear that getbuf will return 1, as shown by the following execution example:

$ ./bufbomb
Type string: howdy doody
Dud: getbuf returned 0x1

Typically an error occurs if you type a longer string, since it overflows the buffer and clobbers the return address stored in the stack frame for getbuf.

$ ./bufbomb
Type string: This string is too long
Ouch!: You caused a segmentation fault!

As the error message indicates, overrunning the buffer typically causes the program state to be corrupted, leading to a memory access error when the function tries to return. Your task is to be more clever with the strings you feed bufbomb so that it does more interesting things. These are called exploit strings.

Bufbomb takes several different command line arguments:

-t NetID
Operate the bomb for the indicated NetID. You should always provide this argument for several reasons:
  • It is required to log your successful attacks.
  • Bufbomb determines the cookie you will be using based on your NetID name, just as the program makecookie does.
  • We have built features into bufbomb so that some of the key stack addresses you will need to use depend on your cookie.
-h
Print list of possible command line arguments
-n
Operate in “Nitro” mode, for Level 4 below.

Your exploit strings will typically contain byte values that do not correspond to the ASCII values for printing characters. The program sendstring can help you generate these raw strings. It takes as input a hex-formatted string. In this format, each byte value is represented by two hex digits. For example, the string “012345” could be entered in hex format as “30 31 32 33 34 35.” (Note that the ASCII code for decimal digit x is 0x3x.) Non-hex digit characters are ignored, including the blanks in the example shown.
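
To make the hex format concrete, the following is a rough sketch of the kind of conversion described above. It is not sendstring’s actual source, which is not shown in this writeup; the function names hex_to_raw and hex_digit_value are hypothetical, and the handling of a trailing unpaired digit is an assumption.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit character, or -1 if `c` is not a hex digit. */
static int hex_digit_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical stand-in for the conversion step (not sendstring's
 * actual code): turn hex-formatted text into raw bytes. Non-hex
 * characters such as blanks are skipped. Returns the number of bytes
 * written to `out`, which holds at most `cap` bytes. A trailing
 * unpaired digit is ignored (an assumption of this sketch). */
size_t hex_to_raw(const char *hex, unsigned char *out, size_t cap)
{
    size_t n = 0;
    int high = -1;              /* pending high nibble, -1 if none */

    for (; *hex != '\0' && n < cap; hex++) {
        int v = hex_digit_value((unsigned char)*hex);
        if (v < 0)
            continue;           /* ignore anything that is not hex */
        if (high < 0) {
            high = v;
        } else {
            out[n++] = (unsigned char)((high << 4) | v);
            high = -1;
        }
    }
    return n;
}
```

Calling hex_to_raw on "30 31 32 33 34 35" yields the six bytes of the string “012345”.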

If you generate a hex-formatted exploit string in the file exploit.txt, you can apply the raw string to bufbomb in several different ways:

  1. You can set up a series of pipes to pass the string through sendstring.

    $ cat exploit.txt | ./sendstring | ./bufbomb -t pxk123

  2. You can store the raw string in a file and use I/O redirection to supply it to bufbomb:

    $ ./sendstring < exploit.txt > exploit.raw

    $ ./bufbomb -t pxk123 < exploit.raw

    This approach can also be used when running bufbomb from within gdb:

    $ gdb bufbomb
    (gdb) run -t pxk123 < exploit-0.raw

One important point: your exploit string must not contain byte value 0x0A at any intermediate position, since this is the ASCII code for newline ('\n'). When Gets encounters this byte, it will assume you intended to terminate the string. Sendstring will warn you if it encounters this byte value.
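
If you assemble raw bytes yourself, a check like the following can catch the problem early, for instance when an address or cookie happens to contain that byte. This is only an illustrative sketch; find_newline_byte is a hypothetical name and is unrelated to how sendstring performs its own check.

```c
#include <stddef.h>

/* Hypothetical checker: return the index of the first 0x0A byte in
 * raw[0..len-1], or -1 if there is none. */
long find_newline_byte(const unsigned char *raw, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (raw[i] == 0x0A)
            return (long)i;
    }
    return -1;
}
```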

Once you have a successful exploit, you will see a message such as:

Team: pxk123
Cookie: 0x4548b56a
Type string:Smoke!: You called smoke()

--- Completion notification ---
submission from root
BUFBOMB-15216-KEY:pxk123:0:4548b56a:01 02 03 04 04 06 C5 04 31 41 82 8D :0
Team: pxk123
Cookie: 0x4548b56a
Type string:Smoke!: You called smoke()

This indicates that you have successfully completed the level. Run the command again and save the output:

./bufbomb -t pxk123 <exploit-0.raw >out0.txt

You will need this for your assignment submission.

There is no record of your failed attempts and, of course, no penalty for them; I don’t expect your first attempts to work. You simply need to save the completion notification of each of your successful exploits.

Level 0: Candle (10 points)

The function getbuf is called within bufbomb by a function test that has the following C code:

1  void test()
2  {
3      int val;
4      volatile int local = 0xdeadbeef;
5      entry_check(3); /* Make sure entered this function properly */
6      val = getbuf();
7      /* Check for corrupted stack */
8      if (local != 0xdeadbeef) {
9          printf("Sabotaged!: the stack has been corrupted\n");
10     }
11     else if (val == cookie) {
12         printf("Boom!: getbuf returned 0x%x\n", val);
13         validate(3);
14     }
15     else {
16         printf("Dud: getbuf returned 0x%x\n", val);
17     }
18 }

When getbuf executes its return statement (line 5 of getbuf), the program ordinarily resumes execution within function test (at line 8 of this function). Within the file bufbomb, there is a function smoke having the following C code:

void smoke()
{
    entry_check(0); /* Make sure entered this function properly */
    printf("Smoke!: You called smoke()\n");
    validate(0);
    exit(0);
}

Your task is to get bufbomb to execute the code for smoke when getbuf executes its return statement, rather than returning to test. You can do this by supplying an exploit string that overwrites the stored return pointer in the stack frame for getbuf with the address of the first instruction in smoke. Note that your exploit string may also corrupt other parts of the stack state, but this will not cause a problem, since smoke causes the program to exit directly.

Some advice

  • All the information you need to devise your exploit string for this level can be determined by examining a disassembled version of bufbomb.

  • Be careful about byte ordering. Recall that Intel architectures use little endian byte ordering: least significant bytes are in lower memory.

  • You might want to use gdb to step the program through the last few instructions of getbuf to make sure it is doing the right thing.

  • The placement of buf within the stack frame for getbuf depends on which version of gcc was used to compile bufbomb. You will need to pad the beginning of your exploit string with the proper number of bytes to overwrite the return pointer. The values of these bytes can be arbitrary.
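
As a small illustration of the byte-ordering point above, this sketch stores a 32-bit value in the order in which an IA-32 machine keeps it in memory. The function name put_le32 is hypothetical. Using the sample cookie 0x4548b56a from earlier, the bytes come out as 6a b5 48 45, which is the order in which they would appear in a hex-formatted string.

```c
#include <stdint.h>

/* Store the 32-bit `value` into out[0..3] in little-endian order:
 * least significant byte first, i.e. at the lowest address. */
void put_le32(uint32_t value, unsigned char out[4])
{
    out[0] = (unsigned char)(value & 0xff);
    out[1] = (unsigned char)((value >> 8) & 0xff);
    out[2] = (unsigned char)((value >> 16) & 0xff);
    out[3] = (unsigned char)((value >> 24) & 0xff);
}
```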

Level 1: Sparkler (20 points)

Within the program bufbomb there is also a function fizz which has the following C code:

void fizz(int val)
{
    entry_check(1); /* Make sure entered this function properly */
    if (val == cookie) {
        printf("Fizz!: You called fizz(0x%x)\n", val);
        validate(1);
    } else
        printf("Misfire: You called fizz(0x%x)\n", val);
    exit(0);
}

Similar to Level 0, your task is to get bufbomb to execute the code for fizz rather than returning to test. In this case, however, you must make it appear to fizz as if you have passed your cookie as its argument. You can do this by encoding your cookie in the appropriate place within your exploit string.

Some advice:

  • Note that the program won’t really call fizz – it will simply execute its code. This has important implications for where on the stack you want to place your cookie.

Level 2: Firecracker (30 points)

A much more sophisticated form of buffer attack involves supplying a string that encodes actual machine instructions. The exploit string then overwrites the return pointer with the starting address of these instructions. When the calling function (in this case getbuf) executes its ret instruction, the program will start executing the instructions on the stack rather than returning. With this form of attack, you can get the program to do almost anything. The code you place on the stack is called the exploit code. This style of attack is tricky, though, because you must get machine code onto the stack and set the return pointer to the start of this code. Within the file bufbomb there is a function bang which has the following C code:

int global_value = 0;
void bang(int val)
{
    entry_check(2); /* Make sure entered this function properly */
    if (global_value == cookie) {
        printf("Bang!: You set global_value to 0x%x\n", global_value);
        validate(2);
    } else
        printf("Misfire: global_value = 0x%x\n", global_value);
    exit(0);
}

Similar to Levels 0 and 1, your task is to get bufbomb to execute the code for bang rather than returning to test. Before this, however, you must set global variable global_value to your NetID’s cookie. Your exploit code should set global_value, push the address of bang on the stack, and then execute a ret instruction to cause a jump to the code for bang.

Some advice

  • You can use gdb to get the information you need to construct your exploit string. Set a breakpoint within getbuf and run to this breakpoint. Determine parameters such as the address of global_value and the location of the buffer.

  • Determining the byte encoding of instruction sequences by hand is tedious and prone to errors. You can let tools do all of the work by writing an assembly code file containing the instructions and data you want to put on the stack. Assemble this file with gcc and disassemble it with objdump. You should be able to get the exact byte sequence that you will type at the prompt. A brief example of how to do this is included at the end of this writeup.

  • Keep in mind that your exploit string depends on your machine, your compiler, and even your NetID’s cookie. Be sure to include the proper ID with the -t option on the command line to bufbomb.

  • Our solution requires 16 bytes of exploit code. Fortunately, there is sufficient space on the stack, because we can overwrite the stored value of %ebp. This stack corruption will not cause any problems, since bang causes the program to exit directly.

  • You need to find the start of your buffer. Don’t try to print buf via gdb; that’s not the buf you’re looking for. Instead, set a breakpoint in getbuf just before the call to Gets and look at the parameter that was just placed on the top of the stack.

  • Watch your use of address modes when writing assembly code. Note that movl $0x4, %eax moves the value 0x00000004 into register %eax; whereas movl 0x4, %eax moves the value at memory location 0x00000004 into %eax. Since that memory location is usually undefined, the second instruction will cause a segfault! Use the $ prefix for referencing immediate values rather than contents of addresses.

  • Do not attempt to use either a jmp or a call instruction to jump to the code for bang. These instructions use PC-relative addressing, which is very tricky to set up correctly. Instead, push an address on the stack and use the ret instruction to make the jump.

Level 3: Dynamite (40 points)

Our preceding attacks have all caused the program to jump to the code for some other function, which then causes the program to exit. As a result, it was acceptable to use exploit strings that corrupt the stack, overwriting the saved value of register %ebp and the return pointer.

The most sophisticated form of buffer overflow attack causes the program to execute some exploit code that patches up the stack and makes the program return to the original calling function (test in this case). The calling function is oblivious to the attack. This style of attack is tricky, though, since you must: 1) get machine code onto the stack, 2) set the return pointer to the start of this code, and 3) undo the corruptions made to the stack state.

Your job for this level is to supply an exploit string that will cause getbuf to return your cookie back to test, rather than the value 1. You can see in the code for test that this will cause the program to go “Boom!.” Your exploit code should set your cookie as the return value, restore any corrupted state, push the correct return location on the stack, and execute a ret instruction to really return to test.

Some advice

  • In order to overwrite the return pointer, you must also overwrite the saved value of %ebp. However, it is important that this value is correctly restored before you return to test. You can do this by either 1) making sure that your exploit string contains the correct value of the saved %ebp in the correct position, so that it never gets corrupted, or 2) restore the correct value as part of your exploit code. You’ll see that the code for test has some explicit tests to check for a corrupted stack.

  • You can use gdb to get the information you need to construct your exploit string. Set a breakpoint within getbuf and run to this breakpoint. Determine parameters such as the saved return address and the saved value of %ebp.

  • Again, let tools such as gcc and objdump do all of the work of generating a byte encoding of the instructions.

  • Keep in mind that your exploit string depends on your machine, your compiler, and even your NetID’s cookie. Be sure to include the proper ID with the -t option on the command line to bufbomb.

Once you complete this level, pause to reflect on what you have accomplished. You caused a program to execute machine code of your own design. You have done so in a sufficiently stealthy way that the program did not realize that anything was amiss.

Level 4: Nitroglycerin (extra credit: 20 points)

From one run to another, especially by different users, the exact stack positions used by a given procedure will vary. One reason for this variation is that the values of all environment variables are placed near the base of the stack when a program starts executing. Environment variables are stored as strings, requiring different amounts of storage depending on their values. Thus, the stack space allocated for a given user depends on the settings of his or her environment variables. Stack positions also differ when running a program under gdb, since gdb uses stack space for some of its own state.
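
You can observe this variation yourself with a tiny probe. The sketch below is unrelated to bufbomb’s own code, and print_stack_position is a hypothetical name; it simply prints the address of a local variable, which lives in the function’s stack frame.

```c
#include <stdio.h>

/* Hypothetical probe, unrelated to bufbomb's own code: print the
 * address of a local variable, which lies in this function's stack
 * frame. Returns the number of characters printed. */
int print_stack_position(FILE *out)
{
    volatile int marker = 0;    /* volatile keeps it in memory */
    return fprintf(out, "local variable at %p\n", (void *)&marker);
}
```

If you call this from a main function of your own and run the resulting program with differently sized environment variables (with ASLR disabled), the printed address may shift from one run to the next.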

In the code that calls getbuf, we have incorporated features that stabilize the stack, so that the position of getbuf’s stack frame will be consistent between runs. This made it possible for you to write an exploit string knowing the exact starting address of buf and the exact saved value of %ebp. If you tried to use such an exploit on a normal program, you would find that it works sometimes, but it causes segmentation faults at other times. Hence the name “dynamite” – an explosive developed by Alfred Nobel that contains stabilizing elements to make it less prone to unexpected explosions.

For this level, we have gone the opposite direction, making the stack positions even less stable than they normally are. Hence the name “nitroglycerin” – an explosive that is notoriously unstable.

When you run bufbomb with the command line flag “-n,” it will run in “Nitro” mode. Rather than calling the function getbuf, the program calls a slightly different function getbufn:

int getbufn()
{
    char buf[512];
    Gets(buf);
    return 1;
}

This function is similar to getbuf, except that it has a buffer of 512 characters. You will need this additional space to create a reliable exploit. The code that calls getbufn first allocates a random amount of storage on the stack (using library function alloca) that ranges between 0 and 127 bytes. Thus, if you were to sample the value of %ebp during two successive executions of getbufn, you would find they differ by as much as ±127.

In addition, when run in Nitro mode, bufbomb requires you to supply your string 5 times, and it will execute getbufn 5 times, each with a different stack offset. Your exploit string must make it return your cookie each of these times.

Your task is identical to the task for the Dynamite level. Once again, your job for this level is to supply an exploit string that will cause getbufn to return your cookie back to testn, rather than the value 1. The code for testn is identical to test except that it calls getbufn instead of getbuf. You can see in the code for test that this will cause the program to go “KABOOM!.” Your exploit code should set your cookie as the return value, restore any corrupted state, push the correct return location on the stack, and execute a ret instruction to really return to testn.

Some advice

  • You can use the program sendstring to send multiple copies of your exploit string. If you have a single copy in the file exploit.txt, then you can use the following command:

    $ cat exploit.txt | ./sendstring -n 5 | ./bufbomb -n -t pxk123

  • You must use the same string for all 5 executions of getbufn. Otherwise it will fail the testing code used by our grading script.

  • The trick is to make use of the nop instruction. It is encoded with a single byte (code 0x90). You can place a long sequence of these at the beginning of your exploit code so that your code will work correctly if the initial jump lands anywhere within the sequence. You might recall that this is called a landing zone or no-op sled.

  • You will need to restore the saved value of %ebp in a way that is insensitive to variations in stack positions.
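
The landing-zone idea in the advice above can be sketched generically as follows. This is only an illustration of the layout (nop bytes first, code last); fill_with_sled is a hypothetical helper, and the buffer size and code bytes are placeholders you would have to work out yourself.

```c
#include <stddef.h>
#include <string.h>

#define NOP_OPCODE 0x90         /* single-byte IA-32 nop */

/* Hypothetical helper: fill buf[0..total-1] so that a run of nop
 * bytes is followed by the `code_len` bytes of `code`. Returns the
 * sled length, or (size_t)-1 if the code does not fit. */
size_t fill_with_sled(unsigned char *buf, size_t total,
                      const unsigned char *code, size_t code_len)
{
    if (code_len > total)
        return (size_t)-1;

    size_t sled_len = total - code_len;
    memset(buf, NOP_OPCODE, sled_len);          /* landing zone */
    memcpy(buf + sled_len, code, code_len);     /* code runs after it */
    return sled_len;
}
```

A jump that lands anywhere in the first sled_len bytes slides through the nops and reaches the code at the end.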

Generating byte codes

Using gcc as an assembler and objdump as a disassembler makes it convenient to generate the byte codes for instruction sequences. For example, suppose we write a file example.s containing the following assembly code:

# Example of hand-generated assembly code
pushl $0x89abcdef   # Push value onto stack
addl $17,%eax       # Add 17 to %eax
.align 4        # Following will be aligned on multiple of 4
.long   0xfedcba98  # A 4-byte constant
.long   0x00000000  # Padding

The code can contain a mixture of instructions and data. Anything to the right of a ‘#’ character is a comment. We have added an extra word of all 0s to work around a shortcoming in objdump to be described shortly.

We can now assemble and disassemble this file:

$ gcc -c example.s
$ objdump -d example.o > example.d

The generated file example.d contains the following lines

0: 68 ef cd ab 89     push   $0x89abcdef
5: 83 c0 11           add    $0x11,%eax
8: 98                 cwtl
9: ba dc fe 00 00     mov    $0xfedc,%edx

Each line shows a single instruction. The number on the left indicates the starting address (starting with 0), while the hex digits after the ‘:’ character indicate the byte codes for the instruction. Thus, we can see that the instruction pushl $0x89abcdef has the hex-formatted byte code 68 ef cd ab 89.

Starting at address 8, the disassembler gets confused. It tries to interpret the bytes in the file example.o as instructions, but these bytes actually correspond to data. Note, however, that if we read off the 4 bytes starting at address 8 we get: 98 ba dc fe. This is a byte-reversed version of the data word 0xfedcba98. This byte reversal represents the proper way to supply the bytes as a string since a little endian machine lists the least significant byte first. Note also that it only generated two of the four bytes at the end with value 00. Had we not added this padding, objdump would have become even more confused and would not have emitted all of the bytes we want.

Finally, we can read off the byte sequence for our code (omitting the final 0’s) as:

68 ef cd ab 89 83 c0 11 98 ba dc fe

Things you need to know about using gdb

You can find many, many tutorials on using gdb on the web. This is not one of them. gdb has tons of commands and takes quite a bit of experience to master. Even if you barely know it, you will find a few commands incredibly useful in trying to figure out what’s going on with your program.

You invoke gdb by giving it your program:

gdb bufbomb

and then run it with the run command and the arguments that you would normally give to the command:

(gdb) run -t pxk123 <exploit-0.raw

Prior to that, however, you probably want to set breakpoints and look at some values.

(gdb) disas getbuf
Will show the disassembled code for the function getbuf.
(gdb) break getbuf
Will set a breakpoint at the entry to getbuf. When you run the program (run command), it will stop at the breakpoint.
(gdb) c
Continues running after a breakpoint
(gdb) print check_level
Will print the value of the global variable check_level
(gdb) print &check_level
Will print the address of the global variable check_level
(gdb) break *0x08048ad4
Will set a breakpoint at the address 0x08048ad4
(gdb) x/4i 0xbfe01234
Will show (examine) the next four instructions (i) starting at 0xbfe01234
(gdb) x/4x 0xbfe01234
Will show (examine) the next four 32-bit integers in hex (x) format starting at 0xbfe01234
(gdb) x/4bx 0xbfe01234
Will show (examine) the next four bytes in hex (x) format starting at 0xbfe01234
(gdb) print $sp
Will show the value of the stack pointer
(gdb) x $sp
Will show the value of the stack pointer and what it’s pointing to.
(gdb) print $ebp
Will show the value of the frame pointer
(gdb) info frame
Will show information about the current stack frame

Submission

You will create a single text file named exploits.txt containing the exact text from each of your completion notifications: levels 0–4 (or however many you completed) and submit that as an attachment via sakai.

Be sure to observe the posted deadline. Late submissions will not be accepted.