
July 19, 2010

Implementing command completion for IDAPython

In this blog post we illustrate how to use the command line interpreter (CLI) interface from Python and how to write basic command completion for the Python CLI.

Understanding the CLI structure

The IDA Pro SDK allows programmers to implement command line interpreters with the use of the cli_t structure:

struct cli_t
{
  size_t size;          // Size of this structure
  int32 flags;          // Feature bits
  const char *sname;    // Short name (displayed on the button)
  const char *lname;    // Long name (displayed in the menu)
  const char *hint;     // Hint for the input line

  // callback: the user pressed Enter
  bool (idaapi *execute_line)(const char *line);

  // callback: the user pressed Tab (optional)
  bool (idaapi *complete_line)(
        qstring *completion,
        const char *prefix,
        int n,
        const char *line,
        int prefix_start);

  // callback: a keyboard key has been pressed (optional)
  bool (idaapi *keydown)(
        qstring *line,
        int *p_x,
        int *p_sellen,
        int *vk_key,
        int shift);
};

For example, the IDAPython plugin defines its CLI like this:

static const cli_t cli_python =
{
  sizeof(cli_t),
  0,
  "Python",
  "Python - IDAPython plugin",
  "Enter any Python expression",
  IDAPython_cli_execute_line,
  NULL, // No completion support
  NULL  // No keydown support
};

And registers/unregisters it with:

void install_command_interpreter(const cli_t *cp);
void remove_command_interpreter(const cli_t *cp);

The CLI in Python

The CLI functionality was added in IDAPython 1.4.1. Suppose that we want to reimplement the Python CLI in Python (rather than in C++). We would do it like this:

import traceback
import idaapi

class pycli_t(idaapi.cli_t):
    flags = 0
    sname = "PyPython"
    lname = "Python - PyCLI"
    hint  = "Enter any Python statement"

    def OnExecuteLine(self, line):
        """
        The user pressed Enter.
        @param line: typed line(s)
        @return Boolean: True-executed line, False-ask for more lines
        """
        try:
            exec(line, globals(), globals())
        except Exception as e:
            # Show the error message followed by the full traceback
            print(str(e) + "\n" + traceback.format_exc())
        return True
Summary:
  1. Subclass idaapi.cli_t
  2. Define the CLI short name, long name and hint
  3. Declare the OnExecuteLine handler and use Python's exec() to execute a line
To try this code, simply instantiate a CLI and register it:
pycli = pycli_t()
pycli.register()
And later to unregister the CLI:
pycli.unregister()
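The handler above always returns True, so multi-line statements are never requested. As a minimal sketch of how the "ask for more lines" return value could be decided, the standard codeop module can tell a complete statement from an incomplete one. The helper name execute_or_continue is hypothetical and is not part of IDAPython; this is one possible approach, not what the plugin does.

```python
import codeop
import traceback

def execute_or_continue(source, namespace):
    """Return True if `source` was handled, False if more lines are needed."""
    try:
        code = codeop.compile_command(source, "<cli>", "single")
    except (SyntaxError, OverflowError, ValueError):
        # The input can never become valid: report it and treat it as handled
        traceback.print_exc()
        return True
    if code is None:
        # Syntactically incomplete (e.g. an open "def" block): ask for more
        return False
    try:
        exec(code, namespace, namespace)
    except Exception:
        traceback.print_exc()
    return True

ns = {}
print(execute_or_continue("x = 6 * 7", ns))   # complete statement
print(execute_or_continue("def f():", ns))    # incomplete block
```

An OnExecuteLine() implementation could return the result of such a helper directly.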

Adding command completion

In order to add command completion, we need to implement the OnCompleteLine() callback:
class cli_t:
    def OnCompleteLine(self, prefix, n, line, prefix_start):
        """
        The user pressed Tab. Find a completion number N for prefix PREFIX

        This callback is optional.

        @param prefix: Line prefix at prefix_start (string)
        @param n: completion number (int)
        @param line: the current line (string)
        @param prefix_start: the index where PREFIX starts in LINE (int)

        @return: None or a String with the completion value
        """
        print("OnCompleteLine: pfx=%s n=%d line=%s pfx_st=%d" % (prefix, n, line, prefix_start))
        return None

This callback is invoked every time the user presses TAB in the CLI. The callback receives:
  • prefix: the prefix at the cursor position
  • n: the completion number. If this callback succeeded the first time (when n == 0), then IDA will call this callback again with n=1 (and so on) asking for the next possible completion value
  • line: the whole line
  • prefix_start: the index of the prefix in the line
For example, typing "print os.path.ex" and pressing TAB would call:
OnCompleteLine(prefix="ex", n=0, line="print os.path.ex", prefix_start=14)
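To make the role of n concrete, here is a small sketch that drives a completion callback the way the description above suggests: n starts at 0 and grows until the callback returns None. The driver collect_completions, the word list and the limit parameter are all hypothetical; in IDA the successive calls are made by the CLI itself, and exactly when it makes them is not covered here.

```python
def collect_completions(callback, prefix, line, prefix_start, limit=100):
    """Call `callback` with n = 0, 1, 2, ... until it returns None."""
    results = []
    for n in range(limit):
        value = callback(prefix, n, line, prefix_start)
        if value is None:
            break
        results.append(value)
    return results

# A toy callback that completes from a fixed word list
WORDS = ["exists", "expanduser", "expandvars", "join"]

def toy_complete(prefix, n, line, prefix_start):
    matches = [w for w in WORDS if w.startswith(prefix)]
    return matches[n] if n < len(matches) else None

line = "print os.path.ex"
# prefix_start is the index in `line` where the prefix begins
assert line[14:14 + len("ex")] == "ex"
found = collect_completions(toy_complete, "ex", line, 14)
print(found)  # ['exists', 'expanduser', 'expandvars']
```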
For demonstration purposes, here is a very simple algorithm to guess what could complete the "os.path.ex" expression:
  1. Parse the identifier: Since IDA passes the line, prefix and the prefix start index, we need to get the whole expression including the prefix (we need "os.path.ex"). This can be done by scanning backwards from prefix_start until we encounter a non identifier character:
    def parse_identifier(self, line, prefix, prefix_start):
        id_start = prefix_start
        # Walk backwards while the previous character can belong
        # to a dotted identifier (letters, digits, '.' or '_')
        while id_start > 0:
            ch = line[id_start - 1]
            if not (ch.isalnum() or ch == '.' or ch == '_'):
                break
            id_start -= 1
        return line[id_start:prefix_start + len(prefix)]
  2. Fetch the attributes with dir(): Given the full identifier name (example "os.path.ex"), we split the name by '.' and use a getattr() loop starting from the __main__ module up to the last split value:
    def get_completion(self, id, prefix):
        # Note: requires "import sys" at the top of the script
        try:
            parts = id.split('.')
            m = sys.modules['__main__']
            # Resolve every part except the last one (the prefix itself)
            for part in parts[:-1]:
                m = getattr(m, part)
        except Exception:
            return None
        completion = [x for x in dir(m) if x.startswith(prefix)]
        return completion if completion else None
  3. Implementing OnCompleteLine(): We parse the identifier and get a list of possible completion values only if n==0, otherwise we return the next possible value in the list we previously built:
    def OnCompleteLine(self, prefix, n, line, prefix_start):
        # new search?
        if n == 0:
            self.n = n
            id = self.parse_identifier(line, prefix, prefix_start)
            self.completion = self.get_completion(id, prefix)
        return None if (self.completion is None) or (n >= len(self.completion)) else self.completion[n]
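The three steps can be tried outside IDA with a small stand-alone class. This is only a sketch: the class name Completer and its root argument are assumptions made so that the lookup does not depend on what happens to live in __main__; the version described above always starts from the __main__ module.

```python
import os
import types

class Completer(object):
    def __init__(self, root):
        # `root` plays the role of the __main__ module (hypothetical parameter)
        self.root = root
        self.completion = None

    def parse_identifier(self, line, prefix, prefix_start):
        start = prefix_start
        # Scan backwards over identifier characters and dots
        while start > 0 and (line[start - 1].isalnum() or line[start - 1] in "._"):
            start -= 1
        return line[start:prefix_start + len(prefix)]

    def get_completion(self, ident, prefix):
        obj = self.root
        try:
            # Resolve "os.path" out of "os.path.ex"
            for part in ident.split(".")[:-1]:
                obj = getattr(obj, part)
        except AttributeError:
            return None
        matches = [name for name in dir(obj) if name.startswith(prefix)]
        return matches or None

    def OnCompleteLine(self, prefix, n, line, prefix_start):
        if n == 0:
            # New search: rebuild the list of candidates
            ident = self.parse_identifier(line, prefix, prefix_start)
            self.completion = self.get_completion(ident, prefix)
        if self.completion is None or n >= len(self.completion):
            return None
        return self.completion[n]

cli = Completer(types.SimpleNamespace(os=os))
line = "print os.path.ex"
print(cli.parse_identifier(line, "ex", 14))   # os.path.ex
print(cli.OnCompleteLine("ex", 0, line, 14))  # first candidate
print(cli.OnCompleteLine("ex", 1, line, 14))  # next candidate
```

Because dir() returns names in sorted order, the candidates are offered alphabetically.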
With this approach we could complete something like this:
f = open('somefile.txt', 'r')
f.re^TAB => f.read, f.readinto, f.readline or f.readlines
Or:
print idau^TAB.GetRe^TAB => print idautils.GetRegisterList
The sample script can be downloaded from here and a win32 build of the latest IDAPython plugin (with completion integrated in the plugin) can be downloaded from here.

July 08, 2010

Running scripts from the command line with idascript

In this blog post we are going to demonstrate how the '-S' and '-t' switches (introduced in IDA Pro 5.7) can be used to run IDC, Python or other supported scripts from the command line as if they were standalone scripts, and how to use the idascript utility.


Background

In order to run a script from the command line, IDA Pro needs to know which script to launch. We can specify the script and its arguments via the "-S" switch:
idag -Sfirst.idc mydatabase.idb
Or:
idag -S"first.idc arg1 arg2" mydatabase.idb
In case the script does not require a database (for example, it works with the debugger and attaches to existing processes), then IDA Pro will be satisfied with the "-t" (create a temporary database) switch:
idag -S"first.idc arg1 arg2" -t
Where first.idc:
#include <idc.idc>

static main()
{
  Message("Hello world from IDC!\n");
  return 0;
}
If we run IDA Pro with the following command:
idag -Sfirst.idc -t
We notice two things:
  1. Nothing is printed in the console window: This is because the message will show in the output window instead:


    (It is possible to save all the text in the output window by using the IDALOG environment variable.)

  2. IDA Pro remains open and does not close: To exit IDA Pro when the script finishes, use the Exit() IDC function.
In the following section, we will address those two problems with the idascript.idc and idascript.py helper scripts and the idascript utility.
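When IDA Pro is launched from another program, the command lines shown in this section can be assembled programmatically. The following is a minimal sketch using only the "-S" and "-t" switches described above; the function name build_ida_command and its parameters are hypothetical, and arguments that themselves contain spaces are not handled.

```python
def build_ida_command(script, script_args=(), database=None, ida="idag"):
    """Return the argument list for running `script` through IDA Pro."""
    # -S takes the script name and its arguments as one space-separated string
    switch = "-S" + " ".join([script] + list(script_args))
    # Without a database, -t asks IDA Pro to create a temporary one
    target = database if database is not None else "-t"
    return [ida, switch, target]

print(build_ida_command("first.idc", ["arg1", "arg2"], "mydatabase.idb"))
# ['idag', '-Sfirst.idc arg1 arg2', 'mydatabase.idb']
print(build_ida_command("first.idc"))
# ['idag', '-Sfirst.idc', '-t']
```

Passing the result as a list to a process-spawning function keeps the script and its arguments together in a single argument, which is what the quotes achieve on the command line.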

Running scripts from the command line

In order to print to the console window, we will not use the IDC Message() function. Instead, we will write to a file and display the contents of that file when IDA Pro exits.

Our second attempt with second.idc:
extern g_idcutil_logfile;

static LogInit()
{
  g_idcutil_logfile = fopen("idaout.txt", "w");
  if (g_idcutil_logfile == 0)
    return 0;
  return 1;
}

static LogWrite(str)
{
  if (g_idcutil_logfile != 0)
    return fprintf(g_idcutil_logfile, "%s", str);
  return -1;
}

static LogTerm()
{
  if (g_idcutil_logfile == 0)
    return;
  fclose(g_idcutil_logfile);
  g_idcutil_logfile = 0;
}

static main()
{
  LogInit();                            // Open log file
  LogWrite("Hello world from IDC!\n");  // Write to log file
  LogTerm();                            // Close log file
  Exit(0);                              // Exit IDA Pro
}
Now let us run IDA Pro:
idag -Ssecond.idc -t
and type afterwards:
type idaout.txt
to get the following output:
Hello world from IDC!
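The two manual steps (run IDA Pro, then display the file) are easy to combine in a wrapper. The sketch below only illustrates the idea; it is not the implementation of the idascript utility, and the name run_and_show and its parameters are hypothetical.

```python
import os
import subprocess
import sys

def run_and_show(command, logfile="idaout.txt"):
    """Run `command`, then print and return whatever it left in `logfile`."""
    if os.path.exists(logfile):
        os.remove(logfile)  # do not show stale output from an earlier run
    subprocess.call(command)  # blocks until the launched program exits
    if not os.path.exists(logfile):
        return ""
    with open(logfile) as f:
        text = f.read()
    sys.stdout.write(text)
    return text

# Hypothetical usage, assuming idag is on the PATH:
#   run_and_show(["idag", "-Ssecond.idc", "-t"])
```

Deleting the old log file first ensures that a script which fails before writing anything does not appear to succeed with the previous run's output.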

To simplify this whole process, we wrote a small win32 command line utility called idascript:

IDAScript 1.0 (c) Hex-Rays - A tool to run IDA Pro scripts from the command line

It can be used in two modes:

a) With a database:
   idascript database.idb script.(idc|py|...) [arg1 [arg2 [arg3 [...]]]]

b) With a temporary database:
   idascript script.(idc|py|...) [arg1 [arg2 [arg3 [...]]]]

Since we will be using LogInit(), LogTerm(), LogWrite() and other helper functions over and over, we moved those common functions to idascript.idc.

The script first.idc can now be rewritten like this:

#include <idc.idc>
#include "idascript.idc"

static main()
{
  InitUtils();                         // calls LogInit()
  Print(("Hello world from IDC!\n"));  // Macro that calls LogWrite()
  Quit(0);                             // calls LogTerm() followed by Exit()
}
As for IDAPython, we wrote a small class to redirect all output to idaout.txt:
import sys

class ToFileStdOut(object):
    def __init__(self):
        self.outfile = open("idaout.txt", "w")

    def write(self, text):
        self.outfile.write(text)

    def flush(self):
        self.outfile.flush()

    def isatty(self):
        return False

    def __del__(self):
        self.outfile.close()

sys.stdout = sys.stderr = ToFileStdOut()
Thus, hello.py can be written like this:
import idc
import idascript

print("Hello world from IDAPython\n")

# ARGV[0] is the script name; the arguments start at index 1
for i in range(1, len(idc.ARGV)):
    print("ARGV[%d]=%s" % (i, idc.ARGV[i]))

idc.Exit(0)

Sample scripts

Process list

The sample script listprocs.idc will enumerate all processes and display their ID and name:
#include <idc.idc>
#include <idascript.idc>

static main()
{
  InitUtils();
  LoadDebugger("win32", 0);
  auto q = GetProcessQty(), i;
  for (i=0;i<q;i++)
    Print(("[%08X] %s\n", GetProcessPid(i), GetProcessName(i)));
  Quit(0);
}

Kill process

The killproc.idc script illustrates how to find processes by name and terminate them one by one:
#include <idc.idc>
#include "idascript.idc"
#include "procutil.idc"

static main()
{
  InitUtils();

  // Load the debugger
  LoadDebugger("win32", 0);

  // Get parameters (ARGV[1] must hold the process name)
  if (ARGV.count < 2)
    QuitMsg(0, "Usage: killproc.idc ProcessName\n");

  auto procs = FindProcessByName(ARGV[1]), i;
  if (procs.count == 0)
    QuitMsg(-1, "No process(es) with name " + ARGV[1]);

  for (i=procs.count-1;i>=0;i--)
  {
    auto pid = procs[i];
    Print(("killing pid: %X\n", pid));
    KillProcess(pid);
  }
  Quit(0);
}
To test the script, let us suppose we have a few instances of notepad.exe we want to kill:
D:\idascript>idascript killproc.idc notepad.exe
killing pid: 878
killing pid: 14C8

D:\idascript>

Here we used the "ARGV" variable, which contains all the parameters passed to IDA Pro via the -S switch, together with the FindProcessByName() and KillProcess() utility functions (see procutil.idc).

The trick behind terminating a process is to attach and call StopDebugger(). The following is an excerpt from procutil.idc utility script:
static KillProcess(pid)
{
  if (!AttachToProcess(pid))
    return 0;

  StopDebugger(); // Terminate the current process

  // Normally, we should get a PROCESS_EXIT event
  GetDebuggerEvent(WFNE_SUSP, -1);
}

Process information

The procinfo.idc script will display thread count, register information and the command line arguments of the process in question:
#include "idascript.idc"
#include "procutil.idc"

static DumpProcessInfo()
{
  // Retrieve command line via Appcall
  Print(("Command line: %s\n", GetProcessCommandLine()));

  // Enum modules
  Print(("Module list:\n------------\n"));
  auto x;
  for (x = GetFirstModule();x!=BADADDR;x=GetNextModule(x))
    Print(("Module [%08X] [%08X] %s\n", x, GetModuleSize(x), GetModuleName(x)));

  Print(("\nThread list:\n------------\n"));
  for (x=GetThreadQty()-1;x>=0;x--)
  {
    auto tid = GetThreadId(x);
    Print(("Thread [%x]\n", tid));
    SelectThread(tid);
    Print(("  EIP=%08X ESP=%08X EBP=%08X\n", Eip, Esp, Ebp));
  }
}

static main()
{
  InitUtils();

  // Load the debugger
  LoadDebugger("win32", 0);

  // Get parameters
  if (ARGV.count < 2)
    QuitMsg(0, "Usage: procinfo.idc ProcessName\n");

  auto procs = FindProcessByName(ARGV[1]), i;
  for (i=procs.count-1;i>=0;i--)
  {
    auto pid = procs[i];
    if (!AttachToProcess(pid))
    {
      Print(("Could not attach to pid=%x\n", pid));
      continue;
    }
    DumpProcessInfo();
    DetachFromProcess();
  }
  Quit(0);
}
The function GetProcessCommandLine is implemented (using Appcall) like this:
static GetProcessCommandLine()
{
  // Get address of the GetCommandLine API
  auto e, GetCmdLn = LocByName("kernel32_GetCommandLineA");
  if (GetCmdLn == BADADDR)
    return 0;

  // Set its prototype for Appcall
  SetType(GetCmdLn, "char * __stdcall x();");
  try
  {
    // Retrieve the command line using Appcall
    return GetCmdLn();
  }
  catch (e)
  {
    return 0;
  }
}

Extracting function body

So far we did not really need a specific database to work with. In the following example (funcextract.idc) we will demonstrate how to extract the body of a function from a given database:
#include <idc.idc>
#include "idascript.idc"

static main()
{
  InitUtils();

  if (ARGV.count < 2)
    QuitMsg(0, "Usage: funcextract.idc FuncName OutFile");

  // Resolve name
  auto ea = LocByName(ARGV[1]);
  if (ea == BADADDR)
    QuitMsg(0, sprintf("Function '%s' not found!", ARGV[1]));

  // Get function start
  ea = GetFunctionAttr(ea, FUNCATTR_START);
  if (ea == BADADDR)
    QuitMsg(0, "Could not determine function start!\n");

  // size = end - start
  auto sz = GetFunctionAttr(ea, FUNCATTR_END) - ea;

  auto fp = fopen(ARGV[2], "wb");
  if (fp == 0)
    QuitMsg(-1, "Failed to create output file\n");

  savefile(fp, 0, ea, sz);
  fclose(fp);

  Print(("Successfully extracted %d byte(s) from '%s'", sz, ARGV[1]));
  Quit(0);
}

To test the script, we use the idascript utility and pass a database name:

D:\idascript>idascript ar.idb funcextract.idc start start.bin
Successfully extracted 89 byte(s) from 'start'

D:\idascript>

Other ideas

There are other ideas that can be implemented to create useful command line tools:
  • Process memory read/write: Check the rwproc.idc script that allows you to read from the process memory to a file or the other way round.
  • Associate .IDC with idascript.exe: This allows you to double-click on IDC scripts to run them from Windows Explorer.
  • Scriptable debugger: Write scripts to debug a certain process and extract needed information
  • ...

Installing the idascript utility

Please download idascript and the needed scripts from here and follow these steps:
  1. Copy idascript.exe to the installation directory of IDA Pro (say %IDA%)
  2. Add IDA Pro directory to the PATH environment variable
  3. Copy idascript.idc and procutil.idc to %IDA%\idc
  4. Copy idascript.py to %IDA%\python
  5. Optional: Associate *.idc files with idascript.exe
Comments and suggestions are welcome!

July 02, 2010

IDA Pro 5.7 highlights

We released IDA Pro 5.7 a few days ago. The complete whatsnew can be found here. In this blog post we will highlight some of the major changes and additions in this release.

Debuggers

Among the various changes and additions to the debugger kernel and modules, we:
  • added support for MMX/XMM registers:



  • added more actions to the modules window:



    • Load debug symbols: Load additional PDB symbols
    • Jump to module base: Jumps to the module base in the current view
    • Analyze module: Converts the module segments to non-debugger segments and analyzes the module. Handy when analyzing crashdump files

  • added Bochs 2.4.2 support.
    Bochs 2.4.2 introduced range read/write physical watchpoints. If a watchpoint was added from the Bochs command line interface, IDA Pro will suspend the execution when the watchpoint triggers.

Bochs Linux debugger plugin

If you found the Bochs debugger plugin useful in the past (e.g. for low-level programming, malware and code snippet emulation), you can now take advantage of the same functionality under Linux / MacOS.


(Debugger selection)


(Debugger configuration)


(Debugger running under Ubuntu 9 x86)

Please refer to the tutorial to learn more about how to configure and use the plugin.

WinDbg debugger

Apart from bug fixes and minor speed improvements, we added non-invasive debugging support. This ability to attach to processes that are already being debugged comes in handy when you want to create crashdumps or inspect handles and other kernel objects.

Make sure you enable this option from the Debugger/Debugger Options/Specific debugger options dialog:


If you are debugging 64-bit applications using idag64, the WinDbg plugin will offer to run the debugger server for you automatically:




When the debugger server is no longer needed, make sure to terminate it.

Scripting

Processor modules and Plugins

It is now possible to write scriptable loaders, processor modules and plugins. If you always wanted your scripts to automatically execute when a database is loaded and unload/deinitialize when the database is closed, then turn your script into a plugin script with just a few additional lines of code.

If we get enough requests about writing debugger modules using scripts, we may add this facility in the future.

IDAPython improvements

We refactored and improved the IDAPython plugin (now version 1.4.0) and the extlang_t interface, adding new facilities to call object methods, query properties and so on.
This has led to significant speed gains, as demonstrated by Ero Carrera's blog post.

We also documented all the manually wrapped functions and utility classes which were poorly documented with the example scripts.

Please refer to the documentation of the pseudo module pywraps for more information.

The graphical user interface

We did some last minute changes to the GUI and some of the features described before were changed:
  • The recent scripts window can be configured to be a dockable window or a modal dialog (check idagui.cfg / RECENT_SCRIPTS_MODAL)
  • There is no need to hold the Alt key in order to jump to identifiers; simply double-click on them
  • Output window is now searchable: use Alt-T to start the search and Ctrl-T to search for the next match

Kernel and processor modules

ARM module

We have added support for almost all ARMv7 instructions, including NEON (aka Advanced SIMD). NEON instructions can be found in the code made for Cortex-A8 processors, such as the one in iPhone 3GS and iPad.


Because ARM uses new, unified syntax for NEON and VFP (Vector Floating Point) instructions in ARMv7, we use the new syntax if NEON is enabled. Otherwise we still display old mnemonics for VFP instructions, as they're what most people are used to.
The only instructions still missing from ARMv7 are ThumbEE instructions which are supposed to be used for JIT compilation of bytecode-based languages. We have not yet encountered any real-life code using it.

You can choose which architecture version to use when disassembling ARM code. This can be done interactively in the "Processor-specific options" dialog:


via the command-line:
idag -parm:ARMv6T2 firmware.bin
or by editing IDA.CFG:
ARM_DEFAULT_ARCHITECTURE = "ARMv6";
For ARM Mach-O files or ELF files that include EABI attributes, the architecture version is set automatically from the flags in the file.



MIPS module

We have improved the register tracing and now almost all indirect code and data references are recognized. Here's one of the many samples:
Before:

After:


We have also added decoding of the MIPS16e instructions (jrc, jalrc, save, restore, etc.).

PC module

One small but important new feature is the improvement in the parsing of SEH (Structured Exception Handling) in Win32 files. It is especially useful when disassembling drivers which use SEH extensively.

Notice that the finally handler is not converted into a separate function as before (because of the call), but is correctly added to the main function.

Python processor modules

We added two new processor module scripts written entirely in Python. They can be used as a template when developing your own.
  • ebc.py: EFI Byte code processor module:



  • msp430.py: MSP430 is a simple 27-instruction 16-bit RISC processor from TI.


Closing words

We hope that the new features make your reversing job easier. Please feel free to send us comments, suggestions and feature requests.

Last but not least, we expect to start the beta testing of the new IDA Qt interface soon. If you are interested and have an active IDA Pro license, do not hesitate to contact us.