This year it was challenge #10 that I was most intrigued by. I like to approach most challenges with an open mind, and occasionally it turns into something fun and interesting.
Since I haven't blogged in a while, this felt like a good opportunity to share some of my tips. I personally love reading about the techniques and technologies people use when solving challenges, but a lot of the time writeups get too caught up in the challenge specifics and omit the educational part.
The solution is rarely the interesting part. This is not a writeup; I offer no solution.
Challenge Background
For those that don't know, challenge #10 was an obfuscated Windows kernel driver. How cool is that? I don't think I've seen many kernel-based reversing challenges, let alone one for Windows.
With roughly 2.5 MB of logic-dense assembly in this driver, this isn't a challenge I would advise anyone to tackle statically. There's an overwhelming amount of code and obfuscation throughout the driver as sampled both above and below.
What also amazes me is that it's 2015 and kernel debugging is still painful on basically every platform. It's annoying to get set up, the debugging is tedious, and relevant tools are few and far between. That said, you can absolutely use WinDBG and a virtual machine to investigate this challenge.
But have you ever looked at a Windows kernel driver and thought, 'Huh, I think I'll turn that into a userland ELF'?
No, probably not, but here's how I did it.
1. Patching
Only two small NOP patches are needed to get challenge-xp.sys ready for its new life as an ELF.
You must also truncate the first 276 bytes (this may vary depending on the object header your system generates) from the driver blob for some alignment reasons which I'll touch on later.
Here's a script that does the patching as described above:
#!/usr/bin/python
import shutil

# create and open the file we want to patch (binary mode, read/write)
shutil.copy2("challenge-xp.sys", "challenge-xp.patched")
f = open("challenge-xp.patched", "r+b")

# any patches we want to apply to the binary (file offset -> NOP bytes)
patches = {
    0x0028C16B: b"\x90" * 5,
    0x0028C8F0: b"\x90" * 6,
}

# apply patches
for offset in patches.keys():
    f.seek(offset)
    f.write(patches[offset])

# trim 276 bytes from the front
f.seek(276)
data = f.read()

# save the file
f.seek(0)
f.truncate()
f.write(data)
f.close()
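A quick sanity check can catch a wrong offset before moving on. The following is a minimal sketch and not part of the original materials: `check_patched` is a hypothetical helper that compares the original bytes against the patched, trimmed bytes, assuming the same NOP patches and 276-byte trim described above.

```python
NOP = 0x90

def check_patched(original, patched, patches, trim=276):
    """Return True if `patched` equals `original` with the NOP patches applied
    and the first `trim` bytes removed. `patches` maps file offset -> length."""
    if len(patched) != len(original) - trim:
        return False
    expected = bytearray(original)
    for offset, length in patches.items():
        expected[offset:offset + length] = bytes([NOP]) * length
    return bytes(expected[trim:]) == bytes(patched)

if __name__ == "__main__":
    # Synthetic stand-in data; swap in the real files to check an actual run.
    original = bytes(range(256)) * 4
    patches = {300: 5, 700: 6}
    patched = bytearray(original)
    for off, n in patches.items():
        patched[off:off + n] = b"\x90" * n
    print(check_patched(original, bytes(patched[276:]), patches))
```

For the real files, read challenge-xp.sys and challenge-xp.patched in binary mode and pass `{0x0028C16B: 5, 0x0028C8F0: 6}` as the patch table.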
2. Binary Blob to Object File
Object files are simply ELFs that wrap up blocks of compiled code or data and are used to stitch together your final ELF executable during the linking process. We would like to turn our modified Windows driver into an object file so that we can link it against our own code and simply call into it.
Thankfully there's a nice little Linux util called objcopy that is a Swiss Army knife for manipulating object files.
"objcopy - copy and translate object files"
-Mr. Man Pages
It has a weird name (IMO), but with objcopy you can add, edit, tweak, or remove just about anything in an object file. This includes segments, sections, symbols, flags, offsets, etc.
Converting a binary blob to an object is super easy:
$ objcopy -I binary -O <target format> -B <target arch> <binary blob> <object file>
In our case:
$ objcopy -I binary -O elf32-i386 -B i386 challenge-xp.patched challenge-xp.o
Now you can actually run readelf on our new object file. But there are still one or two things we want to fix up.
We are going to rename the .data section to .chall to try and maintain some sanity. We also want to mark the .chall section as RWX so the driver blob can execute its code as well as read/write to its data section as necessary once we start calling into it.
$ objcopy --rename-section .data=.chall challenge-xp.o
$ objcopy --set-section-flags .chall=code,alloc,data challenge-xp.o
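For repeat runs, the three objcopy invocations above can be collected in one place. This is a minimal sketch: `objcopy_steps` and `build_object` are hypothetical helpers that only reproduce the command lines already shown, and actually running them assumes objcopy is on your PATH.

```python
import subprocess

def objcopy_steps(blob, obj, section=".chall"):
    """Build the objcopy command lines from this section as argv lists."""
    return [
        ["objcopy", "-I", "binary", "-O", "elf32-i386", "-B", "i386", blob, obj],
        ["objcopy", "--rename-section", ".data=" + section, obj],
        ["objcopy", "--set-section-flags", section + "=code,alloc,data", obj],
    ]

def build_object(blob, obj):
    """Run each step in order, stopping on the first failure."""
    for argv in objcopy_steps(blob, obj):
        subprocess.run(argv, check=True)

if __name__ == "__main__":
    # Print the commands rather than running them.
    for argv in objcopy_steps("challenge-xp.patched", "challenge-xp.o"):
        print(" ".join(argv))
```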
That's it, we're ready to link this thing.
3. Linking & Compiling
/*
    compiled with:
    gcc -o stub -m32 -Wl,-T,elf_i386.x challenge-xp.o stub.c
*/
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>

int __attribute__((stdcall)) chall_ioctl(int, unsigned int **);

int main(int argc, char * argv[])
{
    unsigned int * irp[25] = {0};
    unsigned int req[4] = {0};

    if(argc < 2)
    {
        printf("Usage: %s <IOCTL Code(s)> ...\n", argv[0]);
        return 1;
    }

    /* setup these 'structures' like the ioctl handler expects */
    irp[24] = (unsigned int *)&req;
    req[0] = 0xE;

    /* call the requested ioctls */
    int i = 0;
    for(i = 1; i < argc; i++)
    {
        req[3] = strtoul(argv[i], NULL, 16);
        chall_ioctl(0, irp);
    }

    return 0;
}
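The magic indices in the stub are easier to read as byte offsets. The sketch below is one reading of them, assuming the standard 32-bit Windows layouts rather than anything the driver itself confirms: `irp[24]` sits at offset 0x60, where a 32-bit IRP keeps its current stack location pointer, and `req` stands in for an IO_STACK_LOCATION whose first byte is the major function (0xE being IRP_MJ_DEVICE_CONTROL) and whose dword at offset 0xC is the IOCTL code. `byte_offset` and `build_req` are hypothetical helper names.

```python
import struct

PTR_SIZE = 4  # the stub is built with -m32, so pointers and ints are 4 bytes

def byte_offset(index, elem_size=PTR_SIZE):
    """Byte offset of array element `index` for fixed-size elements."""
    return index * elem_size

def build_req(ioctl_code, major=0x0E):
    """Pack the four dwords of `req` the way the stub fills them in."""
    return struct.pack("<4I", major, 0, 0, ioctl_code)

if __name__ == "__main__":
    print(hex(byte_offset(24)))        # offset written through irp[24]
    print(hex(byte_offset(3)))         # offset written through req[3]
    print(build_req(0x22E004).hex())   # 0x22E004 is a made-up IOCTL code
```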
Because of some weird issues with gcc+ld, I had to use a custom linker script to link our challenge-xp.o properly against the code above.
You can view the linker script we are going to copy from and modify like so:
$ ld -m elf_i386 --verbose
The default elf_i386.x linker script is pretty big, so I am not going to inline it in this blog post. You can find both the original linker script and the diff of the tweaked one below.
These are the two main lines that I changed and added.
...
PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x00010000)); . = SEGMENT_START("text-segment", 0x00010000) + SIZEOF_HEADERS;
.chall : { *(.chall) }
...
I specifically changed the segment start of the executable to the imagebase (0x10000) of the original kernel driver. We also want .chall to be placed in front of all other sections so that its base ends up being the segment start we defined.
These changes ensure the code is loaded at the same address as it is on Windows (specifically 0x10490). The alignment trimming we did earlier was to account for the ELF header that objcopy tacked onto the front of this blob.
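The address arithmetic behind the trim can be sketched as follows. This assumes the ELF headers in front of .chall occupy exactly the 276 bytes that were trimmed, and that .chall is placed directly after them; `loaded_address` is a hypothetical helper, not part of the original materials.

```python
IMAGE_BASE = 0x10000   # segment start set in the linker script
HEADER_SIZE = 276      # assumed SIZEOF_HEADERS; matches the trimmed byte count
TRIM = 276             # bytes removed from the front of the driver blob

def loaded_address(file_offset):
    """Address at which a byte of the original driver file ends up in the ELF."""
    if file_offset < TRIM:
        raise ValueError("byte was trimmed from the blob")
    blob_offset = file_offset - TRIM
    return IMAGE_BASE + HEADER_SIZE + blob_offset

if __name__ == "__main__":
    for off in (0x490, 0x28C16B, 0x28C8F0):
        print(hex(off), "->", hex(loaded_address(off)))
```

Under these assumptions the header and the trim cancel out, so file offset 0x490 maps to 0x10490 and the two patch offsets land at 0x29C16B and 0x29C8F0. If your toolchain produces a different header size, the trim count has to change with it.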
I also lazily placed a symbol at the bottom of the linker script that defines the address we expect the IOCTL handler function to be at for runtime. You're free to place the name & address of any functions/labels/etc you would like to use in your code right down beside it.
...
chall_ioctl = 0x0029C1A0;
...
At this point we can compile the stub code with the driver blob ELF object we made.
$ gcc -o stub -m32 -Wl,-T,elf_i386.x challenge-xp.o stub.c
Great, we made it. We've turned this thing into a *real* ELF executable.
4. Usage
It is now significantly easier (and faster) to debug, analyze, and interact with this challenge.
I personally used QIRA to better visualize and explore the program flow through the giant decision tree funcs. It was also very easy to observe the effects the different IOCTLs had on the encrypted buffers.
I was more interested in converting this to an ELF vs an EXE/DLL for the sole purpose of using QIRA. Linux also tends to have less friction when trying to use advanced userland instrumentation frameworks and related tools.
For fun, I have provided stub_dump.c and a simple little script called dump.py. The Python script simply wraps stub_dump and feeds it every supported IOCTL individually, dumping the resulting memory region that we were interested in for this challenge (0x029D840 - 0x029D8B8).
For those more familiar with the challenge - I used a few more of these hacked up ELFs with a bit of Python to programmatically interact with the challenge. I was able to quickly map each IOCTL's influence on the encrypted buffers region, and enumerate every combination of IOCTLs that yielded a unique buffer for each of the 'tree' funcs before being passed into the XTEA decrypt functions.
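The enumeration idea can be sketched roughly like this. It is not the script from the materials: `unique_buffers` is a hypothetical helper, `run` is a caller-supplied stand-in for invoking one of the stub ELFs, and treating "combinations" as ordered sequences with repetition is an assumption.

```python
import hashlib
import itertools

def unique_buffers(ioctls, run, max_len=2):
    """Map buffer digest -> first IOCTL sequence that produced it.

    `run` takes a tuple of IOCTL codes and returns the resulting buffer as
    bytes (for example by calling a stub ELF and reading back its dump).
    """
    seen = {}
    for length in range(1, max_len + 1):
        for seq in itertools.product(ioctls, repeat=length):
            digest = hashlib.sha256(run(seq)).hexdigest()
            seen.setdefault(digest, seq)
    return seen

if __name__ == "__main__":
    # Toy stand-in for the driver: XOR the codes together into one byte.
    def fake_run(seq):
        value = 0
        for code in seq:
            value ^= code & 0xFF
        return bytes([value])

    for digest, seq in unique_buffers([0x01, 0x02, 0x03], fake_run).items():
        print([hex(c) for c in seq], digest[:8])
```

Hashing the buffer keeps the bookkeeping small when many sequences collapse to the same state; the sequence length has to stay short, since the search grows exponentially with it.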
Unfortunately the real solution to this challenge was a lot less interesting.
I finished 20th.
Conclusion
I used this same methodology over a year ago while researching the Nintendo 3DS firmware. I statically linked decrypted firmware blobs against ELFs on an ARM device to enable easier analysis and debugging.
None of this is all that complex or novel, but it serves as a reminder that taking a challenge at face value may be naïve. Creativity can't be dismissed in favor of pure technical merit.
Follow me on twitter @gaasedelen
Code & Materials: flare10.zip