Botnets and Butterflies - Unpacking Mariposa
Lately I've spent most of my time trying to finish writing my thesis, which looks at taking a visual approach to program comprehension. As a component of the evaluation I decided it would be interesting to look at an unpacked malware sample, and I ended up going with a sample of Mariposa. The MD5 for the sample is 3e3f7d8873985de888ce320092ed99c5 and I've uploaded it to malware.lu. For this post I thought it'd be fun to go through the process of unpacking the malware since it includes a couple of anti-debugging techniques.
WARNING: If you decide to follow along, do so at your OWN risk. This is live malware, be smart about what you do.
Step 1: Use Protection
Before doing anything else, set up an ISOLATED malware analysis environment. I've done this using VMs on a dedicated machine. VMware is quite popular, but you can also use VirtualBox or pretty much any other current solution (I used KVM on a 'nix box).
Step 2: Setting Up
For this post we'll be using three tools: OllyDbg, ImpREC, and Lord PE. If you're not familiar with these I highly recommend checking them out. Also, I did this using Windows XP SP3. I know Mariposa doesn't run on SP2 or lower, and I haven't tested on Win7.
Step 3: Get the Party Started
Now that we're ready to start looking at the malware, first open it up in Olly. Upon opening the executable, Olly will do some analysis and eventually you'll be presented with the CPU Window open to the entry point.
At this point we can begin working our way through the code.
The first batch of code (up until address 0x41D4AF) isn't very exciting. In fact, the authors throw in a bunch of SSE/FPU code to intentionally try and catch us off guard. You'll then see a jump (through a JMP EAX) to the next block of code.
The next chunk of code starts out by placing the value 0x41D000 in ECX and then XORs every byte up until 0x41D4C0 with the value 0xCA1A51E5. This is the first decryption loop, and it unpacks the next stage of the unpacker. After this we jump to 0x41D047, which at first glance looks like a war zone!
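To make the decryption loop a bit more concrete, here is a minimal Python sketch of the same idea. The key is the one quoted above; everything else is an assumption for illustration. In particular, since the key is 32 bits wide, the sketch assumes the loop works on little-endian 32-bit words, and the function name xor_dwords is hypothetical rather than anything found in the sample.

```python
import struct

KEY = 0xCA1A51E5  # XOR key quoted in the post


def xor_dwords(data: bytes, key: int = KEY) -> bytes:
    """XOR each little-endian 32-bit word of data with key.

    Assumption: the packer processes the region one dword at a time.
    """
    if len(data) % 4:
        raise ValueError("length must be a multiple of 4")
    count = len(data) // 4
    words = struct.unpack(f"<{count}I", data)
    return struct.pack(f"<{count}I", *(w ^ key for w in words))


# XOR is its own inverse, so the same routine encrypts and decrypts.
sample = b"\x90\x90\xeb\xfe" * 2
assert xor_dwords(xor_dwords(sample)) == sample
```

The region from 0x41D000 to 0x41D4C0 is 0x4C0 bytes long, which divides evenly by four, so it is at least consistent with a dword-sized loop.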
Step 4: Anti-Everything
This next region of code is where a lot of the fun happens. When you first jump to it, Olly doesn't display code; just the bytes. Have no fear though, you can beat it! Start by first telling Olly to analyze the region of code (Ctrl+A), then scroll down a bit and you should see something similar to the following screenshot.
Next, highlight the code in the range 0x41D047 to 0x41D076 and then tell Olly to forget about the analysis it just did (Ctrl+BkSp).
Once you've done this you'll see some code appear. The first jump skips over a bunch of invalid instructions (opcode 0xFF) that are intended to try and throw off the debugger. Technically they did, because we had to help Olly figure out what it was looking at.
Now follow the JMP to 0x41D057. The next few lines push an address on the stack, clear EAX, push FS:[0], then store the stack pointer at FS:[0]. What does all this do? Try to trick your debugger of course! The address pushed on the stack is used as an exception handler (read about Structured Exception Handling) which in normal execution would be triggered by trying to evaluate an invalid opcode. However, when we run under a debugger, the debugger catches the exception first and we're stuck. We can avoid this by patching the binary so that the byte at 0x41D069 is a NOP (opcode 0x90).
The next issue we encounter is more anti-disassembly trickery. Looking at the hex values for the instruction listed at 0x41D06A we see they are 0xFF2B and the disassembled command is JMP FAR DS:[EBX]. But the address DS:[EBX] is not code at this point, so that's clearly wrong. The trick here is to help Olly by setting the 0xFF byte at 0x41D06A to another NOP. Once you do this, you'll see that Mariposa tries the invalid opcode trap on us again.
The next issue we encounter is the three instructions beginning at 0x41D078. The opcode 0xFF we know is invalid, the STC (set carry) is fine, and the ADD ESP, 7C8 is valid but will result in an access violation if executed. The easiest solution here is to once again set everything to NOPs.
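All of the patches so far boil down to overwriting bytes with 0x90. If you'd rather script that than click through Olly, the idea can be sketched in Python as below. This is only an illustration: the buffer is a dummy stand-in for a copy of the region starting at 0x41D000, and the helper name nop_out and the SECTION_BASE constant are made up for this example.

```python
NOP = 0x90
SECTION_BASE = 0x41D000  # assumption: virtual address of the first byte in the buffer


def nop_out(buf: bytearray, start_va: int, end_va: int,
            base: int = SECTION_BASE) -> None:
    """Overwrite the inclusive address range [start_va, end_va] with NOPs."""
    lo, hi = start_va - base, end_va - base
    if not (0 <= lo <= hi < len(buf)):
        raise ValueError("range falls outside the buffer")
    buf[lo:hi + 1] = bytes([NOP]) * (hi - lo + 1)


# Dummy buffer standing in for the 0x4C0 bytes at 0x41D000.
region = bytearray(b"\xff" * 0x4C0)
nop_out(region, 0x41D069, 0x41D069)  # single-byte patch from the SEH trick
nop_out(region, 0x41D06A, 0x41D06A)  # the stray 0xFF byte
```

The slice assignment replaces exactly as many bytes as it removes, so the buffer never changes size and nothing after the patch shifts.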
After this the malware loads the Kernel32.dll library and then requests the address of the VirtualProtect() function. With the address of VirtualProtect() in EAX, the malware then sets 0x1D000 bytes with base at 0x400000 to be PAGE_EXECUTE_READWRITE.
Following this, a call is made to find the address of OutputDebugStringA(). This function is used to display a string in the debugger and will only return non-zero if successful. Since we see that an address is pushed on the stack and then a RETN follows immediately after (and the address is just the instruction), we can actually just set the region 0x41D0F3 to 0x41D101 to NOPs. For the next part, once again use the analyze/unanalyze trick to see the code and jump down to 0x41D113.
Now we're at the last anti-debugging trick! This last trick tries to catch us with the trap flag (in EFLAGS) set. The easiest way around it is to execute the code until 0x41D121, so set a breakpoint there and hit F9 to RUN the code. Notice that here, if the trap flag was set we'd be adding 0x100 to the address 0x41D128, which would be invalid. Since we ran the code, a zero is added and we're all good.
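The arithmetic behind the trap flag trick is small enough to model in a few lines of Python. The trap flag is bit 8 of EFLAGS, which is where the 0x100 comes from. The sketch below only models the address calculation described above; how the packer actually reads EFLAGS isn't shown in this post, and the function name is hypothetical.

```python
TRAP_FLAG = 0x100       # bit 8 of EFLAGS
BASE_TARGET = 0x41D128  # address quoted in the post


def computed_target(eflags: int) -> int:
    """Return the address the code ends up using for a given EFLAGS value."""
    return BASE_TARGET + (eflags & TRAP_FLAG)


# Running freely (trap flag clear): the address is untouched.
assert computed_target(0x246) == 0x41D128
# Single-stepping (trap flag set): the address is off by 0x100.
assert computed_target(0x346) == 0x41D228
```

This is why running to a breakpoint works where single-stepping doesn't: the debugger only sets the trap flag while stepping.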
Step 5: Unpack Everything
The next batch of code gets right to work on unpacking everything for us. All we need to do is run through and let it work its magic. Make a note of the regions that are unpacked because one of those is the actual IAT. Finally, there is a giant loop starting at 0x41D238 that reconstructs the IAT for us. Set a breakpoint at 0x41D2DB and run to there.
Step 6: Finding The OEP
The last thing we need to do in Olly is identify the original entry point (OEP). We need this for when we reconstruct the executable. Generally speaking, the OEP can be identified by looking for either a JMP to an address in a register or a RETN to an address pushed on the stack. In our case we see the value 0x4100A2 is placed in EDI and then jumped to. This is the OEP.
Step 7: Dumping the Process
Next we need to dump the process so that we can reconstruct the executable. Leave Olly running and open up Lord PE. In Lord PE select the process from the top pane, then right click and choose "dump full".
This will let you save a copy of the process to file for later use.
Step 8: Rebuilding The IAT
The last step before we can try running our unpacked malware is to rebuild the IAT. The IAT is used at runtime to resolve addresses to functions in shared libraries. We can do this using the ImpREC (Import REConstructor) tool. Open up ImpREC and select the running process (same as the last step) from the drop down.
Here is where we need the OEP and the address of the IAT. As mentioned earlier, the OEP is at 0x4100A2, which with the image based at 0x400000 is a relative address (RVA) of 0x100A2. Also, you should have kept track of the decryption loops, and you'll notice the IAT is located at RVA 0x16000 with size 0xEC. Fill this in, then hit "Get Imports". Now just select "Fix Dump" and point it to the dumped copy from Lord PE. If all went well, at this point you should have successfully unpacked the malware.
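If the RVA conversion feels like magic, it's just a subtraction of the image base. Here is a quick Python sketch using the OEP found in Step 6 and the 0x400000 base seen in the VirtualProtect() call; the helper names are made up for illustration.

```python
IMAGE_BASE = 0x400000  # base address seen in the VirtualProtect() call


def va_to_rva(va: int, image_base: int = IMAGE_BASE) -> int:
    """Convert a virtual address to an offset relative to the image base."""
    if va < image_base:
        raise ValueError("address is below the image base")
    return va - image_base


def rva_to_va(rva: int, image_base: int = IMAGE_BASE) -> int:
    """Convert a relative address back to a virtual address."""
    return image_base + rva


oep_rva = va_to_rva(0x4100A2)   # OEP from Step 6
iat_start = rva_to_va(0x16000)  # IAT RVA entered into ImpREC
iat_end = iat_start + 0xEC      # IAT size entered into ImpREC
print(hex(oep_rva), hex(iat_start), hex(iat_end))
```

ImpREC wants RVAs, while Olly shows virtual addresses, so converting in both directions is handy when cross-checking the two tools.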
Step 9: Testing The Malware
The easiest way to see if it worked is to run it. One of the things that this malware does is try to connect to a C&C server (it's a botnet after all!). So, if you have Wireshark available, open it up then run your malware. You should see something like the following in Wireshark if you were successful.
There you have it folks, a crash course on unpacking malware! For me one of the most fun parts of unpacking is solving the anti-debugging mechanisms. If you'd like more info on the various mechanisms out there, here are a few resources:
Anti-Unpacker Tricks by Peter Ferrie
The Art of Unpacking by Mark Yason, presented at BlackHat USA 2007
Windows Anti-Debug Reference by Nicolas Falliere
Until next time, stay protected; use a VM :)











