<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Hex blog</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/" />
    <link rel="self" type="application/atom+xml" href="http://hexblog.com/atom.xml" />
   <id>tag:hexblog.com,2008://1</id>
    <updated>2008-04-09T22:46:26Z</updated>
    <subtitle>About IDA Pro, decompilation, programming, binary program analysis, information security. By Ilfak Guilfanov.</subtitle>
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type 3.2</generator>
 
<entry>
    <title>Some functions are neater than the decompiler thinks</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/04/some_functions_are_too_neat.html" />
    <id>tag:hexblog.com,2008://1.75</id>
    <published>2008-04-09T21:22:07Z</published>
    <updated>2008-04-09T22:46:26Z</updated>
    
    <summary>The decompiler makes some assumptions about the input code. Like that call instructions usually return, the memory model is flat, the function frame is set properly, etc. When these assumptions are correct, the output is good. When they are wrong,...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="Decompilation" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[<p>The decompiler makes some assumptions about the input code: for example, that call instructions usually return, that the memory model is flat, and that the function frame is set up properly. When these assumptions are correct, the output is good. When they are wrong, well, the output does not correspond to the input. Take, for example, the following snippet:</p>

<p><img style="border:1px solid" src="http://www.hexblog.com/decompilation/pix/spoils_callerasm.gif" /></p>

<p>The decompiler produces the following pseudocode:</p>

<p><img style="border:1px solid" src="http://www.hexblog.com/decompilation/pix/spoils_callerc1.gif" /></p>

<p>Apparently, the <b>v3</b> variable  (it corresponds to <b>edx</b>)  is not initialized at all. Why?</p>]]>
        <![CDATA[<p>This happens because called functions usually spoil some registers. The <a href="http://msdn2.microsoft.com/en-us/library/984x0h58.aspx">calling conventions</a> on x86 stipulate that only the <b>esi</b>, <b>edi</b>, <b>ebx</b>, and <b>ebp</b> registers are saved across calls. In other words, other registers may change their values (or be <i>spoiled</i>) by a function call. Since the decompiler assumes that functions obey the regular calling conventions, it separates <b>edx</b> before the call and after the call into two variables. The first variable gets optimized away and is replaced by <b>a1</b>. The second variable (v3) becomes uninitialized.</p>

<p>In fact, there are three possible cases. The <b>edx</b> register could be:<ol><li>unmodified<br />
<li>used to return a value<br />
<li>spoiled<br />
</ol></p>

<p>by the called function. The decompiler chose the default case (#3). Let's check if it was right. Here's the disassembly of <b>sub_2A795</b>:</p>

<p><img style="border:1px solid" src="http://www.hexblog.com/decompilation/pix/spoils_calleeasm.gif" /></p>

<p>As we see, the <b>edx</b> register is not referenced at all, so this is case #1. If the decompiler could find this out by itself, without our help, our life would be much easier (maybe it will do so in the future!). Meanwhile, we have to add the required information ourselves. We do it using the <b>Edit, Functions, Set function type</b> command in IDA. The callee does not spoil any registers:</p>

<p><img style="border:1px solid" src="http://www.hexblog.com/decompilation/pix/spoils_typebox.gif" /></p>

<p>The decompiler produces different pseudocode:</p>

<p><img style="border:1px solid" src="http://www.hexblog.com/decompilation/pix/spoils_callerc2.gif" /></p>

<p>Since it knows that <b>edx</b> is not modified by the call, it creates just one variable for both edx instances (before and after the call).</p>

<p>Were the called function returning its value in <b>edx</b> (case #2), we would set its type like this:</p>

<p><img style="border:1px solid" src="http://www.hexblog.com/decompilation/pix/spoils_typebox2.gif" /></p>

<p><small>(this prototype means: function with one argument on the stack, the argument will be popped by the callee; the result is returned in <b>edx</b>)</small><br />
The decompiler would create two separate variables for <b>edx</b>, as in case #3. The first one would be optimized away, but the second one would be initialized with the returned value:</p>

<p><img style="border:1px solid" src="http://www.hexblog.com/decompilation/pix/spoils_callerc3.gif" /></p>
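<p>The three cases discussed above can be summarized in a small lookup. This is only a sketch of the reasoning in this post, not of the decompiler's internals: the <b>EdxEffect</b> and <b>EdxAfterCall</b> names and the <b>classify()</b> function are hypothetical and exist only for illustration.</p>

```cpp
#include <cassert>

// Hypothetical names: what the callee does to edx, as assumed by the decompiler.
enum class EdxEffect { Unmodified, ReturnsValue, Spoiled };

// What the pseudocode looks like for the value of edx read after the call.
struct EdxAfterCall
{
  bool same_variable_as_before; // one variable spans the call
  bool initialized;             // the value read after the call is defined
};

EdxAfterCall classify(EdxEffect effect)
{
  switch ( effect )
  {
    case EdxEffect::Unmodified:   // case #1: a single variable, still valid
      return {true, true};
    case EdxEffect::ReturnsValue: // case #2: a new variable set by the call
      return {false, true};
    case EdxEffect::Spoiled:      // case #3: a new variable with no definition
      break;
  }
  return {false, false};
}
```

<p>Only the default case (#3) leaves the second variable uninitialized, which is why a correct function type removes the problem.</p>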

<p>As you see, the type information plays a very important role in decompilation. In order to get correct output, correct input (or assumptions) must be given. Otherwise the decompiler works in "garbage in - garbage out" mode. </p>

<p>Always pay attention to the types; it is a good thing to do.<br />
</p>]]>
    </content>
</entry>
<entry>
    <title>Symbian debugger</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/04/symbian_debugger.html" />
    <id>tag:hexblog.com,2008://1.74</id>
    <published>2008-04-08T18:02:06Z</published>
    <updated>2008-04-08T18:10:14Z</updated>
    
    <summary>It works! There are lots of limitations but it is alive, handles breakpoints, exceptions, and even some limited tracing is available. It is possible to launch processes and attach to them. Here is just one screenshot: Expect many limitations in...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="IDA Pro" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[<p>It works! There are lots of limitations but it is alive, handles breakpoints, exceptions, and even some limited tracing is available. It is possible to launch processes and attach to them. Here is just one screenshot:</p>

<p><img style="border:1px" src="http://www.hexblog.com/ida_pro/pix/epoc_debugger.gif" /></p>

<p>Expect many limitations in the first version (no hardware bpts, limited multithread support, etc). One of the most annoying shortcomings is that the memory layout is not determined automatically - we had to introduce a 'manual memory regions' window to overcome this.</p>

<p>Since it is a new beast and many aspects need polishing, beta testers are welcome!<br />
</p>]]>
        
    </content>
</entry>
<entry>
    <title>Symbian AppTRK</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/03/symbian_apptrk.html" />
    <id>tag:hexblog.com,2008://1.73</id>
    <published>2008-03-29T01:06:06Z</published>
    <updated>2008-03-29T02:23:26Z</updated>
    
    <summary> Things are quite easy with the Symbian TRK! Today I decided to write a small program to interact with it and everything worked extremely smoothly. My driver program can download a SIS file to the phone, automatically install and...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="IDA Pro" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[<p><img align="left" src="http://www.hexblog.com/ida_pro/pix/symbian_logo.jpg" /> Things are quite easy with the Symbian TRK! Today I decided to write a small program to interact with it and everything worked extremely smoothly. My driver program can download a SIS file to the phone, automatically install and run it. It reacts to debugging events and gracefully closes the connection when the application terminates. Below are just a few pictures for the curious. </p>]]>
        <![CDATA[<p>Here's the applications folder of the phone:</p>

<p><img src="http://www.hexblog.com/ida_pro/pix/nokia_browser.jpg" /></p>

<p>The <a href="http://tools.ext.nokia.com/agents/index.htm">TRK</a> comes with the Pro version of the Carbide development environment. HelloWorld is just a sample application (maybe I <a href="http://hexblog.com/2008/03/symbian_woes.html">spent </a>more time on it than on the driver). The TRK kernel can connect to the main computer over USB or Bluetooth. Since my computer does not have a Bluetooth connection, I use a USB cable. The port number and baud rate seem to be irrelevant but they are displayed anyway:</p>

<p><img src="http://www.hexblog.com/ida_pro/pix/nokia_apptrk.jpg" /></p>

<p>On the main computer the connection is visible as a serial (COM) port. Connecting to the phone and sending bytes back and forth is quite easy: just open the serial port with <a href="http://msdn2.microsoft.com/en-us/library/aa363858.aspx">CreateFile</a> and use regular read/write system functions. Currently the driver is just a text-mode program and prints the communication packets on the screen:</p>

<p><img src="http://www.hexblog.com/ida_pro/pix/chktrk.gif" /></p>

<p>Finally, here's the helloworld application. It has been installed and run by the driver program:</p>

<p><img src="http://www.hexblog.com/ida_pro/pix/nokia_hello.jpg" /></p>

<p>There is still a lot to do, but the foundation already exists. All this stuff is quite stable (IMHO much more stable than WinCE, probably because of better memory protection). </p>

<p>We will have to modify the debugger in IDA to be able to work with TRK. IDA expects the application memory and registers to be available at all times but Symbian TRK is unresponsive while the application is running. Many other debugger servers behave the same way, so it is a good idea to support this mode.</p>

<p>If things go as well as today, we will have a Symbian debugger pretty soon!</p>

<p><br />
</p>]]>
    </content>
</entry>
<entry>
    <title>Hello Symbian!</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/03/symbian_woes.html" />
    <id>tag:hexblog.com,2008://1.72</id>
    <published>2008-03-26T10:30:21Z</published>
    <updated>2008-03-26T11:33:57Z</updated>
    
    <summary>Yesterday I created my first Symbian program :) Sure enough, it was a &quot;hello world&quot; and to tell the truth I did not write it myself. But it still took me 3 (three) hours to get it running on Nokia...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="IDA Pro" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[<p>Yesterday I created my first Symbian program :) Sure enough, it was a "hello world" and to tell the truth I did not write it myself. But it still took me 3 (three) hours to get it running on a Nokia E51. The good side is that I learned a lot about possible failures with Symbian applications (there are <a href="http://wiki.forum.nokia.com/index.php/S60_SW_installer_troubleshooting">quite a few</a> of them, some of them with cryptic error messages like "install failed"). <br />
</p>]]>
        <![CDATA[<p>The main reason why it took so much time is that I used a sample file from Examples/Basics/HelloWorld in the SDK. I have no idea why this file is included in the SDK, because it is incomplete and even manually adding a .pkg file does not help. My manual pkg file had all types of problems (wrong vendor id, secureid, uid, install directory, etc). I tested all combinations trying to make the application install (this <a href="http://www.whythefuckwontmysisfileinstall.com/index.html">site</a> was very helpful). Finally I installed it on the device. "I did it!" I congratulated myself - and immediately noticed that the installed application was nowhere to be found. The installer claims that it is on the device, I can see the \sys\bin\hellworld.exe file on the disk, but there is no icon to click on and no other means to launch it. That was disappointing, to say the least.</p>

<p>If you think about it, this is an expected outcome. The sample application consisted of a single cpp file: no resources, no icons, nothing. I guess Symbian does not display an icon for an application if it was not linked into the sis file (a sound approach, if you ask me).</p>

<p>My problems ended when I located another helloworld in S60Ex\helloworldbasic. With all the skills I learned from the other helloworld, it took me only a few moments to build, download, and run it. Don't ask me why there are 2 different helloworlds but I'm glad that I went through this. Here are some good side effects of this failed endeavor:<ul><br />
<li>  the EFD utility displays detailed information about the latest (S60 9.1 3rd edition) SIS files<br />
<img style="border:1px solid;" src="http://www.hexblog.com/ida_pro/pix/efd_symbian9x.gif" /><br />
<li> IDA can disassemble them<br />
<img style="border:1px solid;" src="http://www.hexblog.com/ida_pro/pix/ida_symbian9x.gif" /><br />
</ul></p>

<p>I also found that Symbian supports on-device debugging on new devices. The Target Resident Kernel (TRK) from Metrowerks is used as a debugger server. The TRK seems to be documented. The obvious idea is to connect it to IDA and debug Symbian applications (and maybe even system software!) I'm not sure that this will work but it is worth trying.</p>

<p><img style="border:1px solid;" src="http://www.hexblog.com/ida_pro/pix/trk_symbian9x.gif" /></p>]]>
    </content>
</entry>
<entry>
    <title>New Hex-Rays Demo</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/03/new_hexrays_demo.html" />
    <id>tag:hexblog.com,2008://1.71</id>
    <published>2008-03-12T17:36:55Z</published>
    <updated>2008-04-02T16:00:28Z</updated>
    
    <summary>This has been online for a while now, I just had no time to announce it properly: a new thorough demo of the decompiler by ccso.com, our US distributor: This demo is not just a teaser like the previous one....</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="Decompilation" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[This has been online for a while now, I just had no time to announce it properly: a new thorough demo of the decompiler by <a href="http://ccso.com">ccso.com</a>, our US distributor:
<p>
<center>
 <a href="http://www.ccso.com/files/hexraysdemo.swf">
 <img src="http://www.hex-rays.com/images/ccso_video_icon.jpg" /></a>
 </center>
</p>
<p>
This demo is not just a teaser like the previous one. It is much deeper and shows many decompiler aspects in detail:  it starts with the plugin configuration, shows a couple of simple decompilation cases, and then moves on to more complex functions. If you wondered how to improve the resulting pseudocode and handle typical cases, this video is for you!
</p>
]]>
        
    </content>
</entry>
<entry>
    <title>Pythonic way</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/03/pythonic_way.html" />
    <id>tag:hexblog.com,2008://1.70</id>
    <published>2008-03-06T23:22:10Z</published>
    <updated>2008-03-06T23:27:42Z</updated>
    
    <summary>A brilliant blog post by Ero Carrera: IDAPython in action: http://blog.dkbza.org/2008/03/digging-up-system-call-ordinals.html Just note how concise and powerful the script is!...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="IDA Pro" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[<p>A brilliant blog post by Ero Carrera: <a href="http://code.google.com/p/idapython/">IDAPython </a> in action:</p>

<p><a href="http://blog.dkbza.org/2008/03/digging-up-system-call-ordinals.html">http://blog.dkbza.org/2008/03/digging-up-system-call-ordinals.html</a></p>

<p>Just note how concise and powerful the script is!<br />
</p>]]>
        
    </content>
</entry>
<entry>
    <title>Tricky jump tables</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/03/tricky_jump_tables.html" />
    <id>tag:hexblog.com,2008://1.69</id>
    <published>2008-03-04T15:25:57Z</published>
    <updated>2008-03-04T15:39:07Z</updated>
    
    <summary>Just a quick post to announce that we have published a small plugin to specify jump table information. When IDA misses them, the flow charts are virtually useless - they fall apart into several loosely connected components and the logic...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="IDA Pro" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[<p>Just a quick post to announce that we have published a small plugin to specify jump table information. When IDA misses them, the flow charts are virtually useless - they fall apart into several loosely connected components and the logic is completely hidden. This plugin is especially useful for rarely used processors with unusual switch idioms.</p>

<p>The plugin and its source code can be found on our forum.</p>]]>
        
    </content>
</entry>
<entry>
    <title>Easy structure types</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/02/easy_structure_types_1.html" />
    <id>tag:hexblog.com,2008://1.68</id>
    <published>2008-02-18T12:45:36Z</published>
    <updated>2008-02-18T13:23:51Z</updated>
    
    <summary>I&apos;m happy to tell you that a new build of the decompiler is ready! It introduces new easily accessible commands to manipulate structure pointers. First, a variable can be converted into a structure pointer with one click. Also, new the...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="Decompilation" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[<p>I'm happy to tell you that a new build of the decompiler is ready! It introduces new easily accessible commands to manipulate structure pointers. First, a variable can be converted into a structure pointer with one click. Also, new structure types can be built on the fly by the decompiler. As usual, any type or name can be modified at any time. All this makes using the decompiler really agreeable. Please watch a short demo:</p>]]>
        <![CDATA[<div id="flashcontent">
   This text is replaced by the Flash content.
</div>
<script type="text/javascript" src="/decompilation/video/swfobject.js"></script>
<script type="text/javascript">
   var so = new SWFObject("/decompilation/video/easy_structs.swf", "Easy structs", "500", "571", "5", "#FFFFFF");
   so.write("flashcontent");
</script>

<p>Another nice improvement is how the decompiler handles zero offset fields in structure types. Below are two screenshots. Before:</p>

<p><img style="border:1px solid;"  src="/decompilation/pix/stroff0_before.gif" /></p>

<p>After:</p>

<p><img style="border:1px solid;"  src="/decompilation/pix/stroff0_after.gif" /></p>

<p>Please note that all casts are gone. They have been replaced by direct references to the first field of the structure type. Needless to say, the second text is much clearer than the first one.</p>

<p>As usual, all discovered and reported bugs have been fixed. If you have an old version, please update (especially before sending us bug reports). </p>

<p>Thank you!<br />
</p>]]>
    </content>
</entry>
<entry>
    <title>MRXDAV.SYS and Hex-Rays Decompiler</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/02/mrxdavsys_and_hexrays.html" />
    <id>tag:hexblog.com,2008://1.67</id>
    <published>2008-02-13T01:18:29Z</published>
    <updated>2008-02-13T02:32:16Z</updated>
    
    <summary>I wanted to present you a new plugin today. It was about switch idioms (jump tables). I spent a few hours trying to find a problematic x86 sample file but could not locate anything impressive. All jump tables were nicely...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="Security" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[<p>I wanted to present you a new plugin today. It was about switch idioms (jump tables). I spent a few hours trying to find a problematic x86 sample file but could not locate anything impressive. All jump tables were nicely recognized. This certainly does not mean that IDA handles them perfectly, but rather that my search methods must be improved.</p>

<p>Anyway, things were going nowhere and I decided to make a micro-break. It really helps to unblock the thought process  (sometimes my entire working day consists of innumerable micro-breaks :)</p>]]>
        <![CDATA[<p>I remembered that it is time to install security updates so I quickly downloaded them and looked into the details. <a href="http://www.microsoft.com/technet/security/bulletin/ms08-007.mspx">This</a> one looked interesting. </p>

<p>Sometimes I use the decompiler to find out the differences between programs: decompile two files and use a text comparison on the pseudocode. This simple approach works OK for short files, and it showed me the modified functions. There were two of them; the most interesting one is named <strong>MRxDAVPrecompleteUserModeQueryDirectoryRequest</strong>() (what a name!)</p>

<p>The MS08-007 vulnerability is a classic buffer overflow: two unicode strings get concatenated into a buffer of a fixed size. The application checks the size before copying, so in theory there shouldn't be any problems. I think that by looking at the following two screenshots you can tell why it didn't work quite well. Before:</p>

<p><img style="border:1px solid;" src="/security/pix/mrxdav_old.gif" /></p>

<p>The old version checked that the sum of the input string lengths is acceptable. Unfortunately, the case of integer overflow is not handled.</p>

<p>After:</p>

<p><img style="border:1px solid;" src="/security/pix/mrxdav_new.gif" /></p>

<p>The new version checks three things: the length of each input string and their sum must all be acceptable. </p>

<p>A single highlight of <strong>0x208</strong> is enough to notice the difference. In the old version, only <strong>tot2</strong> is checked against the limit.</p>
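<p>To make the integer overflow concrete, here is a minimal sketch of the two styles of check. It is not the driver's code: the 16-bit length type (as used by the <b>Length</b> field of <b>UNICODE_STRING</b>), the function names, and the use of 0x208 as the limit are assumptions made for illustration only.</p>

```cpp
#include <cassert>
#include <cstdint>

// Assumed limit, borrowed from the 0x208 constant highlighted in the screenshots.
constexpr std::uint16_t kLimit = 0x208;

// Old-style check (hypothetical): only the sum is validated, and the
// 16-bit sum silently wraps around when the lengths are large.
bool old_check(std::uint16_t len1, std::uint16_t len2)
{
  std::uint16_t total = static_cast<std::uint16_t>(len1 + len2);
  return total <= kLimit;
}

// New-style check (hypothetical): each length and the sum are validated.
// The sum is computed in 32 bits, so it cannot wrap.
bool new_check(std::uint16_t len1, std::uint16_t len2)
{
  if ( len1 > kLimit || len2 > kLimit )
    return false;
  return static_cast<std::uint32_t>(len1) + len2 <= kLimit;
}
```

<p>With lengths of 0xFFF0 and 0x20 the 16-bit sum wraps to 0x10, so the sum-only check accepts the input while the three-way check rejects it.</p>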

<p>Well, my micro-break turned into a blog post. Back to the plugin: I'll find a nice sample file for jump tables and post a short video here. Stay tuned.</p>]]>
    </content>
</entry>
<entry>
    <title>Debugger and process memory</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/02/debugger_and_process_memory.html" />
    <id>tag:hexblog.com,2008://1.66</id>
    <published>2008-02-03T16:02:35Z</published>
    <updated>2008-02-04T01:38:29Z</updated>
    
    <summary>Just a small note about the debugger plugins and events. Many users who try to develop a plugin for the debugger notice that IDA behaves slightly differently in the notification callbacks than anywhere else. For example, IDA might claim that...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="IDA Pro" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[Just a small note about the debugger plugins and events. Many users
who try to develop a plugin for the debugger notice that IDA
behaves slightly differently in the notification callbacks than anywhere else.
<p>
For example, IDA might claim that <b>EIP</b> points to an address without a segment,
or none of the exported names of a loaded DLL are available.
<p>
]]>
        <![CDATA[This happens because of the database synchronization. When you query IDA
about a segment or a name, the information from the database is returned.
The database is not (and can not be) always in sync
with the process memory. The debugged process might allocate or free memory chunks
thousands of times per second. The synchronization operation is very expensive:
it requires enumerating all memory regions and collecting information about their types, permissions, etc.
That's why IDA tries to perform it as rarely as possible.
<p>
We tried to save the end user from this problem: the database
gets synchronized as soon as the process is suspended. This is the
expected behavior, so far so good.
<p>
Alas, we can not provide plugin writers with the same virtual reality: there are
two distinct entities, the process memory and the program segmentation. The latter is stored in the database
and it can go out of sync with the former. As a plugin writer, you have the following
choices:
<ul>
<li>Bypass the database and talk directly to the debugger module using the
  <b>dbg->read_memory()</b> and <b>dbg->write_memory()</b> functions.<br>
  <span style="background:lightgreen">Advantages: no synchronization issues</span><br>
  <span style="background:gray">Disadvantages: only memory read/write functions are available, names, functions,
  and other high level abstractions are not; also, direct calls are not cached.</span>
  <p>
<li>Explicitly synchronize the database with the process memory.<br>
<span style="background:lightgreen">Advantages: all high level abstractions are available,
repetitive "read memory" requests
will be cached by the kernel (especially useful for remote debugging);
breakpoints in the process memory are hidden<br></span>
<span style="background:gray">Disadvantages: slow</span>
  <p>
<li>Do not synchronize.<br>
<span style="background:lightgreen">Advantages: fast<br></span>
<span style="background:gray">Disadvantages: occasionally a call to an IDA function might fail (e.g. <b>getseg()</b>
might return NULL for a valid address); nevertheless, this approach can be used if
you know in advance that the memory config won't change or the changes are irrelevant to your analysis</span>
  <p>
</ul>

Feel free to use the best method depending on your needs. When you need to force
a synchronization, use this:

<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">        invalidate_dbgmem_config();
        isEnabled(any_address);
</div>

Or maybe a better approach exists?.. ;)
]]>
    </content>
</entry>
<entry>
    <title>Jump tables</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/01/jump_tables.html" />
    <id>tag:hexblog.com,2008://1.65</id>
    <published>2008-01-31T10:21:45Z</published>
    <updated>2008-02-03T21:29:54Z</updated>
    
    <summary>It is an endless story: regardless of how many different jump table types IDA supports, there will be a new unhandled twist. Be it the instruction scheduler, which rearranged the instructions in an unexpected manner, or the compiler, which learned...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="IDA Pro" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[It is an endless story: regardless of how many different jump table types IDA supports, there will be a new unhandled twist. Be it the instruction scheduler, which rearranged the instructions in an unexpected manner, or the compiler, which learned a new optimization trick, it is the same for IDA: jump tables are missed and function boundaries are wrong. What's worse, the graph view, so loved by IDA users, displays a trimmed graph without jump tables, virtually useless for any analysis.
<p>
That's why we strive to add support for new jump tables to IDA, and since it can not be done for all of them, we focus on compiler generated jump tables for popular processors. Take ARM, for example. The ARM processor module has been improved a lot in v5.2, yet we still received a report with a bunch of new patterns. So expect even better support for ARM in the near future :)
<p>
If you are interested in improving the jump table handling for a rarely used processor, here is an explanation of how to do it.
]]>
        <![CDATA[<p>
First, you'll need to hook into the emulation step of the analysis. This way your plugin will be called by the kernel as soon as the instructions that handle the jump table are analyzed. You can do it in many ways; I present two of them here:
<ul>
  <li>hook to processor events and intercept the <strong>custom_emu</strong> event.

<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">        hook_to_notification_point(HT_IDP, my_handler, NULL);
        ...
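        static int my_handler(void *, int code, va_list va)
        {
          if ( code == processor_t::custom_emu )
          {
            // check the instruction (see 'cmd' structure)
            // and create switch_info_ex_t
          }
          // a non-void handler must return a value; 0 is assumed here to mean
          // "not handled" (check the SDK notes for custom_emu return codes)
          return 0;
        }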
</div>
   <li>you may also replace the <strong>ph.is_switch()</strong> pointer with your own function. This callback is used
   only for indirect jump instructions, so your handler will be called only for them.
</ul>

The task of recognizing an instruction sequence has not been formalized yet,
so you are basically on your own here. Check the processor module samples in the SDK,
or implement your own pattern matcher. In the SDK there is a file named
<i>jptcmn.cpp</i>, it contains an instruction sequence matcher used by a
few IDA modules. But I have to admit that it is quite difficult to use and does 
not handle everything. However, it still can overcome some instruction shuffling and
register substitutions. Please check the samples in the SDK to see how it is used.
<p>
After recognizing a jump table, the IDA kernel should be informed about it.
The structure that holds the jump table information is called <b>switch_info_ex_t</b>.
The following details are stored:
<ul>
<li><b>jumps</b>: The address of the jump table
<li>Element size for the jump table
<li><b>ncases</b>: Number of elements
<li><b>defjump</b>: The address where the execution continues if the control variable is out of range (<b>default</b>  jump address)
<li><b>startea</b>: The address of the beginning of the switch idiom
</ul>
and lots of flags and additional information about the table, like if the table
elements are signed or unsigned, if a separate value table is present, if the jump
table has a custom structure, etc. The <b>switch_info_ex_t</b> structure
is quite complex but it can be used to describe virtually any jump table.
<p>
In the simplest case, a plain 32-bit jump table with elements that contain
target addresses, the structure is filled like this:

<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">  switch_info_ex_t si;
  si.set_jtable_element_size(4);// 32-bit jump offsets
  si.ncases  = n;               // we specify the table size
  si.jumps   = jump_table_ea;   // the table address
  si.startea = cmd.ea;          // address of the first instruction
                                // related to the jump table
</div>
<p>
If we have more information about the jump table, we can add it to the structure.
If we know the <b>default</b> jump address:
<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">  si.flags |= SWI_DEFAULT;
  si.defjump = default_jump_ea;
</div>
<p>
If we know the input register used by the table jump:
<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">  si.set_expr(regnum, reg_dttyp);
</div>
<p>
If a delta is subtracted from the input value before using it:
<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">  si.lowcase = delta;
</div>
<p>
If the table entries are shifted to the left before being used:
<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">  si.set_shift(shift_amount);
</div>
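<p>
To show how the fields described so far fit together, here is a small self-contained model of a plain jump table lookup. It does not use the SDK: <b>DenseSwitchModel</b> and <b>resolve_target()</b> are hypothetical names, and the way the delta, the default address and the shift are combined reflects my reading of the descriptions above rather than the kernel's actual code.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the fields discussed above; NOT the SDK structure.
struct DenseSwitchModel
{
  std::vector<std::uint32_t> jumps; // jump table contents (ncases == jumps.size())
  std::uint32_t lowcase;            // delta subtracted from the input value
  std::uint32_t defjump;            // default target for out-of-range values
  unsigned shift;                   // left shift applied to each table entry
};

// Returns the address the modeled switch idiom would jump to for 'value'.
std::uint32_t resolve_target(const DenseSwitchModel &sw, std::uint32_t value)
{
  // Unsigned subtraction wraps for value < lowcase, so a single comparison
  // rejects inputs both below and above the table range.
  std::uint32_t index = value - sw.lowcase;
  if ( index >= sw.jumps.size() )
    return sw.defjump;
  return sw.jumps[index] << sw.shift;
}
```

<p>
For a table {0x10, 0x20, 0x30} with a delta of 5 and a shift of 2, the input 5 resolves to 0x40, and inputs outside 5..7 fall through to the default address.
</p>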
<p>
If there is a separate value table:
<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">  si.flags |= SWI_SPARSE;
  si.set_vtable_element_size(value_size);
</div>
<p>
If the value table elements are used as indexes into the jump table:
<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">  si.flags2 |= SWI2_INDIRECT;
</div>
<p>
Finally, if the jump table has a special structure, then you can set:
<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">  si.flags |= SWI_CUSTOM;
  si.custom = value_of_your_choice;
</div>
<p>
Naturally, the more information you put into <b>switch_info_ex_t</b>, the better
the analysis will be. The prepared structure must be stored into the database:
<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">  set_switch_info_ex(ea, &si);
  setFlbits(ea, FF_JUMP);
</div>
<p>
For regular jump tables, the kernel will handle the rest. For custom tables,
the kernel will generate the <b>create_switch_xrefs</b> and 
<b>calc_switch_cases</b> events. Your module must intercept them
and either create cross references or calculate the requested information.
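<p>
To make the fields above more concrete, here is a minimal standalone sketch of the lookup that such a description encodes. It is only a model for illustration: <b>JumpTableModel</b>, <b>resolve</b>, and <b>NO_TARGET</b> are hypothetical names, not part of the IDA SDK, and the kernel's real handling (element sizes, shifts, signedness, custom tables) is more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::uint32_t addr_t;

// Hypothetical sentinel meaning "no target known" (an assumption of this sketch).
const addr_t NO_TARGET = 0xFFFFFFFFu;

// Hypothetical model of a jump table description. NOT the SDK's switch_info_ex_t.
struct JumpTableModel {
    std::vector<addr_t> jumps;          // jump table: target addresses
    std::vector<std::uint32_t> values;  // optional separate value table
    std::uint32_t lowcase;              // delta subtracted from the input value
    addr_t defjump;                     // default target, NO_TARGET if unknown
    bool indirect;                      // value table holds indexes into 'jumps'
    JumpTableModel() : lowcase(0), defjump(NO_TARGET), indirect(false) {}
};

// Return the target selected by 'input', or the default target.
addr_t resolve(const JumpTableModel &t, std::uint32_t input)
{
    if (t.values.empty()) {
        // Dense table: the input minus the delta is the index.
        // Unsigned wrap-around turns inputs below 'lowcase' into huge indexes.
        std::uint32_t idx = input - t.lowcase;
        return idx < t.jumps.size() ? t.jumps[idx] : t.defjump;
    }
    if (t.indirect) {
        // Two-level table: the value table yields an index into the jump table.
        std::uint32_t idx = input - t.lowcase;
        if (idx < t.values.size() && t.values[idx] < t.jumps.size())
            return t.jumps[t.values[idx]];
        return t.defjump;
    }
    // Sparse table: find the input in the value table, take the parallel jump.
    for (std::size_t i = 0; i < t.values.size() && i < t.jumps.size(); ++i)
        if (t.values[i] == input)
            return t.jumps[i];
    return t.defjump;
}
```
<p>
In this model a dense table is indexed with the input minus the delta, a sparse table is searched through its value table, and an indirect table uses the value table as a second-level index. Any input that selects nothing falls through to the default target.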
<p>
One more trick: you may also intercept the <b>processor_t::is_insn_table_jump</b> event
to prevent the kernel from creating jump tables when it should not. The kernel
uses heuristic rules to create jump tables. If these rules produce wrong jump tables,
you can block their creation in this event.
<p>
This was a very short introduction to the jump tables in IDA Pro. Feel free to post your questions in the forum, I'll be glad to answer!
<p>
<small>3 feb: minor edits</small>]]>
    </content>
</entry>
<entry>
    <title>Better user interface for decompiler</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2008/01/better_user_interface_for_decompiler.html" />
    <id>tag:hexblog.com,2008://1.64</id>
    <published>2008-01-02T15:24:58Z</published>
    <updated>2008-01-02T16:35:51Z</updated>
    
    <summary>We are glad to release a new version of the Hex-Rays decompiler! Highlights of this build: improved usability, support for unusual calling conventions, better handling of obfuscated code. The most important improvement is the user interface. Now the decompiler is...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="Decompilation" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[We are glad to release a new version of the Hex-Rays decompiler!
Highlights of this build:
<ul>
<li>        improved usability </li>
<li>        support for unusual calling conventions </li>
<li>        better handling of obfuscated code </li>
</ul>
<p>
The most important improvement is the user interface. Now the decompiler is
at your fingertips at all times, the same way as the graph view.
Remember that you can toggle graph-text views in IDA with one keyboard hit?
For the decompiler you can use the Tab key: it toggles between
the disassembly and pseudocode views.
</p>

<p>
For those of you who prefer to see both the decompiler output and disassembler output
in the same window, we added the "<b>copy to disassembly</b>" command. It does just what
its name says: copies the pseudocode text to the disassembly window. You can
see both outputs simultaneously: mapping of low level assembly idioms to high
level constructs is made as transparent as possible.
</p>

<p>
With this build, you will be able to open <b>multiple pseudocode windows</b>.
This will be especially useful for long functions: just open a separate window
for each called function by Ctrl-double clicking on function names. The long
function will stay intact in its own window and you won't lose time by
reanalyzing it upon each return.
</p>

<p>
One more command to handle code complexity: the ability to hide parts of the code.
The new <b>hide/unhide</b> command allows you to collapse a multiline statement into
just one line. Collapsing unimportant sub-statements reveals
the global structure of the decompiled function.
</p>

<p>
We also added other things to make life easier: the command to jump to xrefs,
better status line information, support for the __spoiled keyword, and more
heuristic rules to the analyzer.
</p>

<p>
Here's a short video:
</p>

<center>
<a href="http://www.hex-rays.com/video/build20080102.html">
<img src="http://www.hex-rays.com/video/build20080102_icon.gif" />
</a>
</center>

<p>
The detailed list of changes can be accessed <a href="http://www.hex-rays.com/news1.shtml">here</a>.
</p>
]]>
        
    </content>
</entry>
<entry>
    <title>Decompiler output ctree</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2007/11/decompiler_output_ctree.html" />
    <id>tag:hexblog.com,2007://1.63</id>
    <published>2007-11-27T23:28:00Z</published>
    <updated>2007-11-28T03:26:14Z</updated>
    
    <summary>The upcoming version of the decompiler SDK adds some nice features. First, we created a reference manual. It is in doxygen format: cross references make it really easy to browse. Second, the SDK is compatible with both IDA v5.1 and...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="Decompilation" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[<p>The upcoming version of the decompiler SDK adds some nice features.<br />
First, we created a reference manual. It is in <a href="http://www.doxygen.org">doxygen</a> format: cross references make it really easy to browse. Second, the SDK is compatible with both IDA v5.1 and v5.2. Third, we added functions to retrieve and modify all user-defined attributes like variable names, types, and comments. Fourth, we added more sample plugins. And fifth, our <a href="http://www.hex-rays.com/forum">forum</a> is open. All your decompiler and SDK related questions can be asked there.</p>

<p>Since the "show, don't tell" rule applies to everyone, here's a short video demonstrating one of the new sample plugins (it displays the decompiler output as a graph):</p>

<center><a href="/decompilation/video/vd2.html"><img src="/decompilation/video/vd2_icon.gif" /></a></center>

<p>Hopefully the new version will be available this week, as soon as the regression tests are over.<br />
</p>]]>
        
    </content>
</entry>
<entry>
    <title>Hex-Rays SDK is ready!</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2007/10/hexrays_sdk_is_ready.html" />
    <id>tag:hexblog.com,2007://1.62</id>
    <published>2007-10-30T21:08:26Z</published>
    <updated>2007-10-30T21:25:20Z</updated>
    
    <summary> A binary analysis tool like a decompiler is incomplete without a programming interface. Sure, decompilers tremendously facilitate binary analysis. You can concentrate on the program logic expressed in a familiar way. Just add comments, rename variables and functions to...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="Decompilation" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[<p>
A binary analysis tool like a decompiler is incomplete without a programming interface.
Sure, decompilers tremendously facilitate binary analysis. You can concentrate
on the program logic expressed in a familiar way. Just add comments, rename variables
and functions to get <i>almost</i> the original source code, <i>almost</i> perfect. However, quite often there
is a small ugly detail and the output falls short of being satisfactory.</p>]]>
        <![CDATA[<p>
It can be because of an awkward expression
    </p>

<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">    (result = _putwc_lk(a3, (FILE *)result), result != -1)
</div>

    <p>
which could be represented more concisely:
    </p>

<div style="background:#EEFFEE;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:black">    ((result = _putwc_lk(a3, fp)) != -1)
</div>

    <p>
It can also be an inline function
    </p>

<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">      while ( v16 )
      {
        *(_BYTE *)v17++ = 0;
        --v16;
      }
</div>

    <p>
which could be collapsed:
    </p>

<div style="background:#EEFFEE;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:black">      memset(ptr, 0, count);
</div>

    <p>
It can be a <b>while</b>-loop
    </p>

<div style="background:#DDEEFF;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:grey">    v7 = 48;
    v4 = wcstok(&amp;Str, L&quot;.&quot;);
    if ( v4 )
    {
      do
      {
        v9 = (unsigned __int16)j___wtol(v4) &lt;&lt; v7;
        v6 |= v9;
        v5 |= *((_DWORD *)&amp;v9 + 1);
        v4 = wcstok(NULL, L&quot;.&quot;);
        v7 -= 16;
      }
      while ( v7 &gt;= 0 &amp;&amp; v4 );
    }
</div>

    <p>
which could be converted into a <b>for</b>-loop:
    </p>

<div style="background:#EEFFEE;border:1px solid;white-space:pre; font-family: andale, courier, monospace; color:black">    for ( shift=48, ptr=wcstok(&amp;Str, L&quot;.&quot;);
          shift &gt;= 0 &amp;&amp; ptr;
          ptr=wcstok(NULL, L&quot;.&quot;), shift-=16 )
    {
      v6 |= (ushort)wtol(ptr) &lt;&lt; shift;
      v5 |= codepage;
    }
</div>
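<p>
The loop conversion above can be checked on a small standalone sketch. It is a simplified stand-in, not the decompiled function itself: <b>split_dots</b>, <b>pack_do_while</b>, and <b>pack_for</b> are hypothetical names, the tokenizer replaces wcstok, and the result is packed into a single 64-bit value instead of two halves.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: split "1.2.3.4" on '.', skipping empty tokens.
static std::vector<std::string> split_dots(const std::string &s)
{
    std::vector<std::string> out;
    std::string::size_type start = 0;
    while (start <= s.size()) {
        std::string::size_type dot = s.find('.', start);
        if (dot == std::string::npos)
            dot = s.size();
        if (dot > start)
            out.push_back(s.substr(start, dot - start));
        start = dot + 1;
    }
    return out;
}

// Shape similar to the decompiler output: a guard followed by a do-while loop.
std::uint64_t pack_do_while(const std::string &version)
{
    std::vector<std::string> tok = split_dots(version);
    std::size_t i = 0;
    std::uint64_t packed = 0;
    int shift = 48;
    if (i < tok.size()) {
        do {
            packed |= (std::uint64_t)(std::uint16_t)std::atol(tok[i].c_str()) << shift;
            ++i;
            shift -= 16;
        } while (shift >= 0 && i < tok.size());
    }
    return packed;
}

// The same logic written as a for-loop.
std::uint64_t pack_for(const std::string &version)
{
    std::vector<std::string> tok = split_dots(version);
    std::uint64_t packed = 0;
    std::size_t i = 0;
    for (int shift = 48; shift >= 0 && i < tok.size(); ++i, shift -= 16)
        packed |= (std::uint64_t)(std::uint16_t)std::atol(tok[i].c_str()) << shift;
    return packed;
}
```

<p>
The two forms agree here only because the shift counter starts at a non-negative value, so the guard and the loop condition test the same thing on entry. That is exactly the kind of knowledge the user has to supply.
</p>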

<p>
All these transformations improve the readability but the decompiler cannot perform them
automatically: they change the meaning of the program. Only the user who knows
that these transformations can be safely applied should activate them.
</p><p>
We could add an extensive set of manual
transformation commands to the decompiler (we might do it one day), but there are really too many of them.
Besides, some transformations can be applied only in some particular circumstances proper to a particular
version of a compiler used with particular command line options.
In short, there is no way we can predict all possible transformations and implement them.
</p><p>
Hex-Rays SDK allows you to manipulate the decompilation result as you want.
You can play with the output data structure (called ctree), modify it, rename variables, and change their types.
Watch such a plugin in action:
</p><p>
<center>
<div style="border: 1px solid; width:450px">
<img src="http://www.hexblog.com/decompilation/pix/swapifs.gif">
</div>
</center>
</p><p>
This plugin introduces a new command to swap <b>if</b> branches. I personally prefer to have
the shorter <b>if</b> branch first: shorter means simpler.
Solving the simplest problems first is a good approach in programming: it frees
one's mind for complex problems and makes the unsolved part of the problem shorter (thus hopefully simpler ;)
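<p>
The idea behind swapping <b>if</b> branches can be sketched on a toy statement node. This is not the Hex-Rays ctree API: <b>ToyIf</b>, <b>swap_if_branches</b>, and <b>render</b> are hypothetical names used only to show that the condition must be negated when the branches trade places.
</p>

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical toy node. The real ctree items are far richer than this.
struct ToyIf {
    std::string cond;         // condition text
    bool negated;             // true if the condition is currently negated
    std::string then_branch;  // text of the 'then' statement
    std::string else_branch;  // text of the 'else' statement
};

// Swap the branches and flip the condition so that behavior is preserved.
void swap_if_branches(ToyIf &s)
{
    std::swap(s.then_branch, s.else_branch);
    s.negated = !s.negated;
}

// Print the statement in a pseudocode-like form.
std::string render(const ToyIf &s)
{
    std::string c = s.negated ? "!(" + s.cond + ")" : s.cond;
    return "if ( " + c + " ) " + s.then_branch + " else " + s.else_branch;
}
```

<p>
Applying the swap twice gives back the original statement, which is a handy sanity check for a transformation of this kind.
</p>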
<p>
Other things you can do with the current SDK:
</p>
<ul>
<li>Decompile any function
<li>Modify the pseudocode
<li>Change local variable names and types
<li>Introduce your own interactive commands
<li>Install callbacks to react to decompiler events
</ul>
<p>
The above functionality is enough to implement the <i>Inliner, Exporter, Transformer, and Vizier (partially)</i>
plugins mentioned <a href="http://hexblog.com/2007/06/trunk_branches_and_leaves.html">here</a>.
</p><p>
In the future we will add support for other plugin types. The decompiler will handle other target processors
and data flow analysis functions will be exported. This will allow you to write more
complex analysis and transformation rules.
</p><p>
What about writing your own vulnerability scanner based on Hex-Rays? ;)
<br>
It is quite difficult today but will be within reach very soon.
</p>
]]>
    </content>
</entry>
<entry>
    <title>IDA and Microcontrollers</title>
    <link rel="alternate" type="text/html" href="http://hexblog.com/2007/10/ida_and_microcontrollers.html" />
    <id>tag:hexblog.com,2007://1.61</id>
    <published>2007-10-15T13:52:30Z</published>
    <updated>2007-10-15T14:10:24Z</updated>
    
    <summary>If you ever used IDA to analyze embedded stuff, you would immediately notice its PC-centric nature. While any embedded SDK targets specific devices with real-world part numbers, IDA just provides you with a universal analysis framework. You are supposed to...</summary>
    <author>
        <name>Ilfak Guilfanov</name>
        
    </author>
            <category term="IDA Pro" />
    
    <content type="html" xml:lang="en" xml:base="http://hexblog.com/">
        <![CDATA[<p>If you ever used IDA to analyze embedded stuff, you would immediately notice its PC-centric nature. While any embedded SDK targets specific devices with real-world part numbers, IDA just provides you with a universal analysis framework. You are supposed to know how the device works, its idiosyncrasies, programming model, memory organization, and all other practical stuff. If there is an automatic way to determine the entry point or interrupt vectors, IDA will use it but in general you will have to find out the correct parameters yourself.</p>

<p>The following tutorial fills the gap for C166 (and explains many other things!):</p>

<p><a href="http://andywhittaker.com/ECU/DisassemblingaBoschME755/tabid/96/Default.aspx">http://andywhittaker.com/ECU/DisassemblingaBoschME755/tabid/96/Default.aspx</a></p>

<p>Thanks, Andy!<br />
</p>]]>
        
    </content>
</entry>

</feed> 

