Executing Arbitrary Code

Here we'll expand a bit on EIP redirection and use it to execute code that we write. The goal for this tutorial is a modest one: print the letter 'A' to the screen. I'm assuming some basic familiarity with OllyDbg and compiling C. If the steps we take here don't make sense, check the previous tutorial first.

The vulnerable program source code
The vulnerable program
The attack program source code
The attack program

Overview

When you've determined that a character buffer can be overflowed, it's time to take advantage of it. First, write the code you'd like executed into the beginning of the buffer in bytecode form. Then write garbage (like 'A's) until you reach the function's return address on the stack, and write the fake return address at that location. The rest of the function's code runs until execution reaches the RET instruction. Then our fake return address is popped off the stack and the program jumps to it.

This time we're not going to plug an address in the program's code into the function's return address. That only allows us to redirect the flow of the program. Our goal here is to execute any code we want, so we'll plug in an address where we have put our custom bytecode. Remember that we began writing the buffer with this code. That means that we should plug in the buffer's address for the function's return address. Now when the function ends, it will return into the beginning of the overflowed buffer and execute the bytecode that it finds there.

The Target

To the right you'll find the source code for the program we are targeting. As you can see, the vulnerable buffer is declared as 20 bytes long. Before the user is asked for input, the program uses the printChar function to print out a question mark followed by a space. The printChar function merely takes a byte as an argument and passes it to the putchar function from stdio.h. The result is that the character is written to the screen. Our custom foreign bytecode will call this function to achieve its goal.

Building the Attack String

Now it's time to craft our buffer. Start by finding out how many bytes are between the start of the buffer and the function's return address. Use OllyDbg to step through the code after entering a recognizable input like "AAAAAAAA". Once you've reached main()'s RET instruction, check out the stack and find the userInput buffer address. Subtract it from the location of main()'s return address. This tells us how much padding to use before tacking on the fake return address. In this case, the buffer is at 0x0027FF20 and the return address is at 0x0027FF4C. The difference is 0x2C, so our attack string will need 44 bytes of filler before we add the fake return address. Next we need to know what fake address to tack on the end of it. We already have it from playing with Olly a moment ago: it is the buffer's own address, 0x0027FF20. Remember that we need to add the bytes to the string in reverse order: \x20\xFF\x27\x00. Also recall that because gets will automatically toss a null byte at the end of our input, we can get rid of that last \x00. So for our malicious input, so far we have 44 'A's followed by \x20\xFF\x27.

Writing the Bytecode

At this point, if we feed the attack string so far into the vulnerable program, it will overflow the buffer and overwrite main()'s return address. Then when main returns, it will jump into our overflowed buffer located at 0x0027FF20. Right now, all we have there is a bunch of A's (0x41). This happens to be the bytecode representation of "INC ECX" (which adds one to the value in the ECX register). It's no fun just incrementing ECX a bunch of times, so we'll replace those first few 0x41s with some bytecode that will accomplish our goal.

Writing assembly that will be turned into bytecode for a buffer overflow comes with some additional challenges. First and foremost, the final bytecode must not contain any \x00 bytes. A null byte marks the end of a C string, so one in the middle of our attack string would cut the input short before all of it lands in the buffer. When our code needs a value that contains a \x00, we have to build it with some trickery instead of writing it out directly. Keep this limitation in mind as you look over the code (especially in regard to the \x00 byte in printChar's address).

Use OllyDbg to find the address in memory where the printChar function begins. You'll find it above main() at 0x00401290. If we can manage to call this function after pushing a byte onto the stack, that byte will be printed to the screen. Pushing the byte is easy: the PUSH 0x41 instruction translates to \x6A\x41. This pushes an 'A' onto the stack for use by the printChar function when we call it. We can't just say CALL 0x00401290 next because that would make our bytecode contain a null byte. Instead we'll move the value 0x40129001 into the register EBX. There isn't anything special about EBX here, it's just a convenient place to store the mangled address of printChar while we play with it. Notice that this move instruction doesn't have any \x00s in it. Next we'll use SHR EBX, 0x08 to shift the value in EBX 8 bits to the right. The trailing 0x01 falls off the low end and a zero byte is shifted in at the top, which makes the new value 0x00401290. If the reason this happened doesn't make sense, read up on hexadecimal and bitwise operations. Now the correct address for the printChar() function is in EBX. We had to go the long way around because we aren't allowed to use null bytes. Lastly, we just need to CALL EBX to jump to printChar. Don't forget to do the PUSH 0x41 or it will just print out whatever byte it finds on the top of the stack instead of our 'A'.

Making it Work

We can use the assemble feature of Olly to turn our assembly code into the bytecode we need. Just pick any relatively free area of the code and right-click, then click Assemble. Begin typing the assembly code in one instruction at a time, hitting enter between them. Olly will add each instruction to the portion of code you selected and display its bytecode to the left. Next select the code you wrote and right-click, then Binary->Binary Copy. This will copy the bytecode into the clipboard. Paste it somewhere and insert \x where the spaces are. Be sure there are no null bytes anywhere and you have usable bytecode for attacking the program.

So now we have the bytecode that will be executed to accomplish our goal (write an 'A' to the screen) and the padded string with our fake return address tacked on the end. Combine the two by writing a small C program similar to the one shown on the left. Take note that the bytecode replaces the first 12 bytes of the padding (don't just add the two together). Finally we compile this code into input3.exe and fire up the command prompt. Typing input3.exe | vuln3.exe will feed our attack string into the app. The buffer will be overflowed, thus changing main()'s return address to point to the beginning of our buffer. The bytecode is waiting there and once executed, it calculates the address of the printChar function without using null bytes. It then pushes the hexadecimal representation of 'A' onto the stack and calls printChar(), which faithfully prints the 'A' and then crashes. Congratulations, you made your own code execute inside a vulnerable program; go relax and have a beer.