
November 30, 2005

Reading assembly code

Even unobfuscated code is difficult to understand. Look at this function. Can you tell its purpose?

The answer is very simple. Here it is:

unsigned char div61(unsigned char x) { return x / 61; }

It is just a division by 61!

At first sight it looks overly complicated. Why not use the div instruction? The reason is that this longer(!) code is faster. See some instruction timings here.

Even once we understand this, the code still looks like magic. There is nothing like division there, only all kinds of bit shuffling to the left and to the right. If you are really curious how this code works, consider reading this article by Torbjorn Granlund and Peter L. Montgomery:

Division by Invariant Integers using Multiplication

It would be nice to have code sequences like this automatically deciphered. Once we know the answer, the mystery disappears and only the tedious work of finding the divisor remains. A decent program analysis tool or decompiler definitely could (and should) do it.
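To make the multiply-and-shift idea concrete, here is a minimal sketch in C. It is not the instruction sequence from the listing discussed above; the function name div61_mul and the constants (multiplier 1075 = ceil(2^16 / 61), shift 16) are chosen for this illustration only. They are exact for every 8-bit input because the rounding error of the multiplier, 1075 * 61 - 65536 = 39, is far too small to push any x <= 255 over the next multiple of 61.

```c
#include <stdint.h>

/* Divide an 8-bit value by 61 using a multiplication and a shift.
 * 1075 is ceil(65536 / 61); both constants are assumptions of this
 * sketch, not values recovered from the original disassembly. */
uint8_t div61_mul(uint8_t x)
{
    /* 255 * 1075 = 274125, so the product fits comfortably in 32 bits */
    uint32_t product = (uint32_t)x * 1075u;

    /* keeping the bits above bit 15 is the same as dividing by 65536 */
    return (uint8_t)(product >> 16);
}
```

Comparing div61_mul(x) with x / 61 for all 256 possible inputs is a cheap way to convince yourself that the two agree.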

November 27, 2005

The highlighter

Today I'll present you a pretty small yet useful plugin.

If you have tried to trace obfuscated code in the debugger, you already know that it is quite difficult to follow. The code modifies itself, performs complex computations, and repeats itself, so that after a while you are lost and do not even remember whether the current instruction is something you saw before or a completely new thing. You could rename locations and add comments to make the code more recognizable, but this is a thankless and tedious task which distracts you from the main goal of following the logic of the application. Imagine finding a name for the 30th loop of the 23rd meaningless function!

It is much better in these cases to relax and let the application execute without trying to understand it. Quite often all this obfuscated code ends up doing something trivial. If you let the code execute to the end of a function or a logic chunk, the result becomes apparent by itself and you can move on by giving the function a nice name. The function is still obfuscated but you do not care at all since you know its purpose and the outcome.

If you decide to let the code do its job without trying to understand how it is done, your task is much simpler. You just need to follow the execution flow till its end. No need to care about the register values, the meaning of loops or if-then-else or other constructs. Very simple trick: single step the function until we return from the function or jump out of it.

This simple trick is easier to state than to do, since obfuscated code will not have precise function boundaries. Moreover, there might be many useless jumps or repeated code with the sole purpose of confusing you.

The highlighter plugin solves this very problem: it makes apparent the code which has been single stepped in the debugger. Here is how the disassembly listing looks with the plugin:

The little blue boxes denote instructions which have already been executed.

Since the plugin is very simple and light, it has no configuration parameters - just copy it to the plugins subdirectory and it is ready to use. As usual, it comes with the source code: highlighter.zip.

Happy code exploration!

November 22, 2005

How to unpack XCP.DAT?

I updated my EFD utility to handle the packed XCP.DAT file. To extract files from the archive, use:

efd -x xcp.dat

in a clean directory. It will create files like xcp1.dat, xcp2.dat, etc. Unfortunately the file names are not present in the archive, which is why the names are so meaningless.

Here is the utility: efd.zip

November 20, 2005

Sony DRM

Last week several LGPL violations were found in Sony's DRM implementation.
Here is a proof of one violation. Here is a dedicated page with many other findings.

By the way, the license breach could be found using the simplest tools on earth: any hex editor or the strings tool from unix would be enough to find the copyright strings. In MS Windows, Start, Search for Files or Folders would be sufficient as well. Just think about it and look.
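As a rough illustration of how little is needed for such a search, here is a minimal sketch in C that looks for a piece of text inside a binary buffer. The function name find_text is hypothetical, and the sketch only covers plain ASCII text stored byte for byte; it is not a replacement for the strings tool.

```c
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of the text `needle`
 * inside a binary buffer, or -1 if it is absent. The buffer length is
 * passed explicitly, so zero bytes inside the buffer are not a problem. */
long find_text(const unsigned char *buf, size_t len, const char *needle)
{
    size_t n = strlen(needle);

    if (n == 0 || n > len)
        return -1;

    /* naive scan: compare the needle at every possible position */
    for (size_t i = 0; i + n <= len; i++) {
        if (memcmp(buf + i, needle, n) == 0)
            return (long)i;
    }
    return -1;
}
```

Reading a file into memory and calling find_text(data, size, "Copyright") is all it takes to spot an embedded copyright notice of this kind.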

In theory the license breach is easy to fix: just add the required copyright notice to the initial dialog box and there is no license violation anymore.

What is not easy to fix is public opinion. Many will think: Sony's rootkit is a bad thing and (therefore) DRM in general is a bad thing too. In fact what we need is a good DRM implementation (since the option of having no DRM is not available): one without rootkits or a 'security by obscurity' approach, and one which does not punish legal buyers.

The ultimate stealth method

The last described method does not work if the application uses an "unsupported" antidebugging trick. For example, if the application directly checks the PEB field instead of calling the IsDebuggerPresent function, the method will fail. Or the application could use something else, something from the future...

I will show you the "ultimate stealth method" which will work against the future antidebugging tricks too. We will unpack an application and create a database with clean unpacked code. This time we will use as a sample packer, say, telock but the packer does not really matter. Here is our original sample file and here is the packed file.

The application tries to detect the debugger and the debugger tries to hide itself. Now we will play the game differently: instead of hiding the debugger, we will completely remove it! The application refuses to run with a debugger? Fine, let it run without one! After all, we are interested in the unpacked code, not in the unpacker code.

So we need to suspend the application once it has unpacked itself. If we let the application run without any modifications, it will not suspend. We will modify it to suspend itself.

The idea: we will patch a Windows API function to suspend our application. As soon as the unpacked code calls it, the application will be suspended. For our sample program we know that the CreateWindowExA function is called. It looks like this:

We patch it to call the SuspendThread function instead of creating a window:

The next step is to let the application run without the debugger. We use Debugger, Detach from process. The unpacker will do its job and the application will suspend itself before creating any windows, so nothing will happen on the screen. But if we check the task list, we will see our application (Debugger, Attach to process):

The application has been unpacked and is ready to be analyzed! Open the program segmentation, find the application name (sample_telocked.exe in our case) and you will see clean code.

In short, the ultimate stealth method consists of the following sequence:

Patch system dlls - Detach - Attach

What should we do if we do not know which Windows API functions are used by the application? We could try to patch as many functions as we want. We should not patch functions used by the unpacker, since that would prevent the unpacker from doing its job. If we are really out of ideas, even ExitProcess or a similar function can be patched. The application will be suspended at exit time but we will have an opportunity to see the original import table, which will give us the list of API functions to consider.

We could also patch API functions in other ways. Instead of calling SuspendThread we could write 0xCC, or invalid opcodes, or even zeroes. The application would crash and we could attach to it with IDA (turn on Debugger, Debugger options, Set as just-in-time debugger before using this method).

November 13, 2005

Stealth plugin

Last time I showed you a simple trick with conditional breakpoints. Today I will present a plugin which automates these breakpoints - to the extent that protected malware like the Zotob worm can be unpacked. Since it is dangerous to experiment with live malware, we will use a sample program in our demonstration. Zotob is packed with a variant of Yoda's Protector (beware of popup windows if you click on the link!). We will take this sample program and protect it with the protector. Then we will try to unpack it.

IDA complains a lot about the packed executable but manages to load it. It finds the entry point for this packer, but in general when you handle malware, turning on the manual load option and turning off the make imports section option is a good idea. The manual load will give you a chance to load all sections of the input file into the database (malware may hide its code anywhere in the file). Not creating the imports section will display the import directory contents fully in the original form - again, who said that malware cannot hide itself in the import directory?

The first thing we encounter when trying to follow the unpacker in the debugger is that there are too many exceptions. You have to be very careful with fake calls and exceptions. One faux pas and we find ourselves in the middle of nowhere, with the program running wild, crashing, or even closing the debugger. It is deliberate - packer authors love to complicate things and render the analysis almost impossible. The key word is almost. If a program has to run without requiring special keys or additional data then it can be made to run under a virtual environment and dissected fully. Today we will not virtualize the whole environment but only a very small part of it - we will render some Windows API calls useless to the unpacker.

Ok, back to the program. The unpacker uses SEH (structured exception handling) in the unpacking process and IDA keeps reporting each exception. There might be hundreds or even thousands of them. By default IDA comes configured like most other debuggers: it suspends the program as soon as there is an exception. This behaviour is good for 'normal' debugging but does not help when tracing malware.

Let's change the exception handling so that IDA does not stop at each exception. You can do it from the user interface (Debugger, Debugger options, Edit exceptions) or by editing the cfg/exceptions.cfg file. The second method is better because the settings will be used for all future databases while the first method will change the settings only for the current database. We will tell IDA that all exceptions must be handled by the application. Here is a line from the new configuration file:

0xC0000005   nostop app EXCEPTION_ACCESS_VIOLATION         The instruction at 0x%a referenced memory at 0x%a. The memory could not be %s
This line means that the execution will not stop (nostop) and the application will handle the exception (app). If you have some experience with IDA, you might have noticed that sometimes IDA still stops at an exception despite such a setting. IDA will stop if this is a 'second chance' exception: an unhandled second-chance exception will terminate the application, so IDA gives you a chance to do something about it.

If you replace your configuration file with this file then you will be able to load it into the database using the Load button in the Debugger, Debugger options dialog box.

With the new configuration file it is much easier to single step the program. You can even set a breakpoint at 4766A4 to see the next trick - self modifying code at 4766A8 (just several instructions below). Once you execute 'stosb' at 4766A5, you can press F8 at 4766A6 and the code will appear on the screen. I will not describe each and every trick in the unpacker in detail; there are other sites doing that job very well. Instead, let's put a hardware breakpoint at 476854 and rerun the program. You will see that the unpacker merrily and diligently does its job and cannot detect the debugger using SEH tricks.

Why did we use a hardware breakpoint and not a software one? The reason is that the application can detect a software breakpoint easily. For example, at 4775CE there is a checksum calculating function and it uses the opcode bytes to calculate it. If you use software breakpoints, the checksum will be incorrect and the packer will crash somewhere later. In general it is a good idea to use hardware breakpoints but unfortunately IDA does not have the option to use them automatically. I personally use hardware breakpoints to rerun the malware from the start to a certain address. Mistakes are inevitable but hardware breakpoints let me repeat the whole debugging session up to the last known address. The good side is that I can continue the debugging session even several days later, after rebooting my computer, etc.
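The reason a checksum gives a software breakpoint away can be sketched in a few lines of C. A software breakpoint replaces the first byte of an instruction with the int3 opcode (0xCC), so any sum over the code bytes changes. The additive checksum and the names code_checksum and set_soft_breakpoint are assumptions of this sketch; the actual algorithm used by the packer at 4775CE is not described here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical checksum: a plain sum of all code bytes. */
uint32_t code_checksum(const uint8_t *code, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += code[i];
    return sum;
}

/* Plant a software breakpoint the way a debugger does: overwrite one
 * byte with int3 (0xCC) and return the original byte for restoring. */
uint8_t set_soft_breakpoint(uint8_t *code, size_t offset)
{
    uint8_t saved = code[offset];
    code[offset] = 0xCC;
    return saved;
}
```

A hardware breakpoint lives in the debug registers and leaves the code bytes untouched, so a check of this kind does not notice it.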

If you have put a hardware breakpoint at 476854 you will see the following code:

We are to perform an indirect call. Since we are in the debugger we can easily find out the function to be called but the listing looks ugly. Let's fix it using an IDA Pro command. The unpacker uses many references based on the EBP register. Apparently the EBP register does not change. We will select the whole screen and use the 'user-defined offset' command:

The offset base is EBP and it is a plain number (it does not hold an address). Since we have selected a region, there is one more additional dialog box:

We ask to convert everything in the 400000..500000 range to offsets. The result is much better than the original:

We see that the unpacker uses the LoadLibrary function to access Windows API functions. It will retrieve the addresses of many functions and create its import table. If you let the program run up to 476E77 (you may use a hardware breakpoint for that), you will see the import table at 476451:

(by default the table is not visible; you'll need to position the cursor at its beginning, create a dword and then an array of dwords). The table is not good enough because it contains references to names but its entries are not named. Entries in the import table will be used one by one and without names the listing will not be readable. The following short script, entered in the script dialog box (F2 is the hotkey), corrects the table:

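auto ea, name;
// walk the table of dwords from the cursor position to its end
for ( ea=here; ea < 0x476545; ea=ea+4 )
{
  // the entry points to an API function; get that function's name
  name = Name(Dword(ea));
  // drop everything up to and including the first underscore (the dll name)
  name = substr(name, strstr(name, "_")+1, -1);
  // give the table entry the bare function name
  MakeName(ea, name);
}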
Here is the result:

Looking at the table we can see many nasty functions. The famous IsDebuggerPresent is there, and also functions like SuspendThread, TerminateProcess, BlockInput do not look innocent.
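The renaming step performed by the script above boils down to dropping the dll name and the underscore from names of the form dllname_FunctionName. A minimal C sketch of that string operation follows; the function name strip_dll_prefix is hypothetical and the code is only an illustration, not part of IDA or of the script.

```c
#include <string.h>

/* Return the part of an imported name that follows the first
 * underscore, e.g. "kernel32_LoadLibraryA" -> "LoadLibraryA".
 * Names without an underscore are returned unchanged. */
const char *strip_dll_prefix(const char *name)
{
    const char *underscore = strchr(name, '_');
    return underscore ? underscore + 1 : name;
}
```

Cutting at the first underscore rather than the last keeps function names that themselves contain underscores intact.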

Here is the idea: we will create conditional breakpoints at all these dangerous functions with the following condition:

(EIP=address_of_ret_instruction) && (EAX=return_value)
For example, in the BlockInput function

The breakpoint condition will be

(EIP=0x7D9A059B) && (EAX=0x1)
This breakpoint will skip the function execution and provide the predefined answer. It will not suspend the execution. The unpacker will have no chance of blocking the user input, detecting the debugger and terminating it. Even if it tries, it will fail.
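Since every such breakpoint uses the same condition template, the condition text can be generated mechanically from the address of the ret instruction and the desired return value. The C sketch below only builds the string shown above; the function name format_stealth_condition is hypothetical and it is not how the plugin itself is implemented.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Write a condition of the form "(EIP=0x...) && (EAX=0x...)" into
 * `out`. Returns the number of characters snprintf would have written,
 * so a result >= size means the buffer was too small. */
int format_stealth_condition(char *out, size_t size,
                             uint32_t ret_addr, uint32_t ret_value)
{
    return snprintf(out, size,
                    "(EIP=0x%" PRIX32 ") && (EAX=0x%" PRIX32 ")",
                    ret_addr, ret_value);
}
```

Calling it with 0x7D9A059B and 0x1 reproduces the BlockInput condition given above.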

Since it is tedious to manually set these breakpoints each time you run an application, I made a plugin. It is quite simple and comes with the source code. With this plugin, running the application without being detected is simple: activate the plugin and run the debugger. The application will unpack itself and run without suspecting anything:

I anticipate the next question: "how to detect the moment when the unpacker finishes its work and switches to the original application code". Unfortunately there is no simple answer to this question (even if there were one, the next packer author would make it obsolete at once). It is a nice question with many possible answers. Maybe we will consider it in the future.

Files:
Sample 'hello world' program with the source code
Plugin source code, binary code for IDA 4.9 and new exceptions.cfg

November 04, 2005

Simple trick to hide IDA debugger

Quite often IDA users ask for a plugin or feature to hide the debugger
from the application. In fact there are many anti-debugging tricks and
each of them requires an appropriate reaction from the debugger. Let's
start with something simple: we will make the IsDebuggerPresent
function call always return zero.

When the debugger is active, we will go to the disassembly of the
IsDebuggerPresent function. We will use the 'goto to the specified
address' command for that. Unfortunately, the current version of IDA
does not display imported names in the name list and we will need to
type in the function name in the input field manually:

Please note how we form the address: the
dll name followed by an underscore followed by the function name. We
put a breakpoint at the end of the function so we will have a chance
to intercept the execution and modify the result:

Since we don't want to suspend the program and modify the result
manually each time IsDebuggerPresent is called, we will automate it.

We will use breakpoint conditions. The breakpoint condition field
can be used to determine whether a breakpoint should be triggered or
not. The condition is an IDC expression. If the expression evaluates
to zero, the breakpoint will not fire. Since IDA evaluates the
expression in order to determine its value, we can use it for its side
effects, like modifying register values, memory, or anything else you
can think of. We modify the breakpoint attributes in the following way
(right click, Edit breakpoint):

We specified the condition as "EAX=0". It is not a comparison, it is an
assignment. When IDA evaluates it, EAX will become zero as a side
effect, exactly what we want it to be. We also have to clear the
'break' attribute since we don't want to suspend the application.
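The difference between the assignment and a comparison can be modelled in plain C, where an assignment is also an expression with a value. This is only an analogy under stated assumptions: the Regs structure and both functions are hypothetical stand-ins for the debugger state and for IDC's evaluation of the condition, not IDA code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the debugged program's registers. */
typedef struct {
    uint32_t eax;
} Regs;

/* Models the condition "EAX=0" as an assignment: the register is
 * overwritten and the expression's value is the assigned value, zero,
 * so the condition counts as false. */
int condition_assign(Regs *r)
{
    return (r->eax = 0) != 0;
}

/* Models what a comparison would do instead: the register is only
 * inspected, never changed. */
int condition_compare(const Regs *r)
{
    return r->eax == 0;
}
```

With eax initially nonzero, condition_compare leaves the register alone, while condition_assign clears it and still reports a false condition.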

With a breakpoint defined like this, our debugger is immune to
the IsDebuggerPresent call. It may sound too simple and you may ask
"what about not-so-childish anti-debugging tricks?" Hold on, we will
develop this topic further.
