Tuesday, 1 December 2020

Ghidra and the Lost Gems (Fixing Misidentified Code)

Introduction

Disassembling non-x86/x64 programs in Ghidra (for architectures such as MIPS, Motorola, and PowerPC) can be more error-prone than disassembling x86/x64 programs. One reason is that Ghidra can misidentify code bytes as data bytes. To address this, Ghidra offers the experimental Aggressive Instruction Finder analysis; however, even with this analysis enabled, Ghidra still misses many code locations (bytes) and leaves them non-disassembled. This is especially annoying when using Ghidra's cross-references: many disassembled functions show no callers in their cross-reference lists because Ghidra was unable to locate those callers.


Figure1: These folks know what I am talking about 😀


Ghidra BruteforceDisassembly.py Script

To fix misidentified code bytes (locations), I wrote a Ghidra Python script that attempts to force the disassembly of misidentified (non-disassembled) code bytes.

Figure2: BruteforceDisassembly.py Ghidra script
    

The script follows a simple methodology. It first prompts the user to specify the code bytes of interest. Typically the user wants to recover missed, non-disassembled functions, so it makes sense to search for the code bytes of function prologues. For instance, a common function prologue on the x86 architecture is "push ebp" (0x55) followed by "mov ebp, esp" (0x8bec), so we could search for the bytes 558bec. Since Ghidra generally performs well on x86, I am going to showcase the Motorola architecture instead, where function prologues contain the bytes 4e56. But first I enable the Aggressive Instruction Finder analysis to let Ghidra try harder to find code bytes.
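To make the search step concrete, here is a minimal standalone sketch in plain Python 3 (run outside Ghidra). It is not the BruteforceDisassembly.py script itself; the function name `find_pattern` and the sample bytes are made up for illustration. It only shows how every offset of a prologue byte pattern can be located in a raw blob.

```python
def find_pattern(blob, hex_pattern):
    """Return every offset in blob where the bytes given as a hex string occur."""
    needle = bytes.fromhex(hex_pattern)
    offsets = []
    start = blob.find(needle)
    while start != -1:
        offsets.append(start)
        # Advance by one byte so overlapping matches are not skipped.
        start = blob.find(needle, start + 1)
    return offsets


# Hypothetical fragment: two 4e56 prologues, one "rts" (4e75), one x86 prologue.
sample = bytes.fromhex("00004e56fff8" "4e75" "558bec" "4e560000")

print(find_pattern(sample, "4e56"))    # offsets of the Motorola-style pattern
print(find_pattern(sample, "558bec"))  # offsets of the x86-style pattern
```

A raw byte match like this is only a candidate: the same two bytes can just as well occur inside data, so each hit still has to be confirmed by disassembly.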

Figure3: Enable Ghidra's Aggressive Instruction Finder analysis


Once started, the script prompts the user to enter the targeted code bytes.

Figure4: Enter the code bytes for functions prologues


Next, the script prompts the user to enter the name of the first instruction, which specifies the instruction we are interested in. When the script finds matching bytes, it needs to determine whether they are truly code bytes rather than data bytes, so we make sure that disassembling them yields the targeted instruction. In this case, the specified bytes 4e56, if disassembled correctly, should disassemble to the instruction link.
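To see why 4e56 maps to link, the following standalone sketch decodes that one instruction by hand. It is an illustration only: the real script relies on Ghidra's disassembler for this check, and the helper name `decode_link` is hypothetical. It assumes the 68k encoding of LINK.W An,#disp16 (opcode word 0x4E50 to 0x4E57, followed by a signed 16-bit displacement, big-endian).

```python
import struct


def decode_link(blob, offset):
    """Return (register, displacement) if a 68k LINK.W starts at offset, else None."""
    if offset < 0 or offset + 4 > len(blob):
        return None
    # 68k is big-endian: one opcode word, then a signed 16-bit displacement.
    opcode, displacement = struct.unpack_from(">Hh", blob, offset)
    # LINK.W An,#d16 is 0100 1110 0101 0rrr; the low three bits select An.
    if (opcode & 0xFFF8) != 0x4E50:
        return None
    return "a%d" % (opcode & 0x7), displacement


# 4e56 fff8 is "link a6, #-8"; 4e5e is "unlk a6" and must not match.
print(decode_link(bytes.fromhex("4e56fff8"), 0))
print(decode_link(bytes.fromhex("4e5e4e75"), 0))
```

Checking the mnemonic filters out byte patterns that decode to something else, but bytes that merely look like a valid link inside a data region would still pass, so the result is a heuristic rather than a proof.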

Figure5: Enter the targeted instruction (first instruction in the disassembled location)

After it ran, the script had successfully identified and fixed 115 code locations that were misidentified as data locations.

Figure6: The script fixed 115 misidentified code locations


The BruteforceDisassembly.py script and the program firmware.bin can be found at the GitHub repository




Sunday, 9 July 2017

Null dereference vulnerability in IDA Freeware

Abstract

This is a quick post on a Root Cause Analysis of a null dereference vulnerability in an old codebase of the PE COFF debugging symbols loader in IDA Freeware version 5.0.0.879. The IDA loader is unable to parse an interesting string, which results in a null dereference and a denial of service.

One can trigger the vulnerability with any file compiled with PE COFF debugging symbols, just by pasting a magic string into one of the external function names in the debugging symbols.

09/24/2015: This vulnerability was reported to Hex-Rays; a patch was never released.


Technical Details

A long time ago I attended a Windows exploitation training class. The instructor asked the students to load the targeted program in IDA Free or IDA Pro ($$$$); the targeted program was FastBack Server from IBM (a storage management server). I used IDA Free to disassemble the targeted program, but it never worked! IDA Free crashed every single time I tried to disassemble that program. So I decided to do a Root Cause Analysis (RCA) to figure out what went wrong.

When I run IDA with FastBackServer.exe I get this crash:

access violation triggered by the instruction at ida.wll+0xBCC74 :)
another screenshot:

Simple debugging, with a breakpoint set on this instruction, reveals that its main job is to iterate over a linked list of structures. These structures represent the debugging symbol files of programs that have previously been loaded into IDA; IDA relies on the Windows Registry to retrieve this information. Normally each structure holds a debugging symbol file (.pdb) path string, but the structure that causes the crash had no debugging symbol file path.

Whenever a program crashes on a certain input (in our case, FastBackServer.exe as input to IDA), the way to find out exactly which part of the input causes the crash is to reduce the input to the minimum that still crashes. Normally this requires an editor that lets you edit the input (e.g. if a video player crashes on a certain video file, you may want to use a video editor to shrink the crashing input by removing video frames). While reducing the crashing file, you can divide it into two halves and delete only one half, then check whether the crash is still there. If it is not, whatever causes the crash is in the deleted half. Repeat this until you reach the smallest piece of data that causes the crash.
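The halving idea can be sketched in a few lines of Python. This is a simplified illustration, not a tool I used at the time: `reduce_by_halving` and the `still_crashes` predicate are hypothetical names, and in a real setting the predicate would write the candidate to disk, launch the target program on it, and report whether it crashed.

```python
def reduce_by_halving(data, still_crashes):
    """Repeatedly drop the half of the input that is not needed for the crash."""
    while len(data) > 1:
        mid = len(data) // 2
        first, second = data[:mid], data[mid:]
        if still_crashes(second):
            data = second   # deleting the first half kept the crash
        elif still_crashes(first):
            data = first    # deleting the second half kept the crash
        else:
            break           # the trigger straddles the cut; stop here
    return data


# Stand-in predicate: pretend the target crashes whenever the input contains BOOM.
def fake_crash(candidate):
    return b"BOOM" in candidate


original = b"a" * 37 + b"BOOM" + b"b" * 100
reduced = reduce_by_halving(original, fake_crash)
print(len(original), len(reduced), reduced)
```

Plain deletion like this works for flat inputs. For structured formats, where cutting through a record makes the parser reject the whole input, the cuts have to follow record boundaries or the bytes have to be overwritten instead of removed.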

During IDA loading (and just before the crash), it was clear to me that IDA crashed while analyzing the debugging symbols of FastBackServer.exe, which is why I knew that something was wrong with the debugging symbols rather than anything else.

A good (kinda ugly) way to manipulate the debugging symbols of Windows programs is to use DUMPBIN.exe (Microsoft ships it with Visual Studio under the path {ROOT}\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\). “DUMPBIN displays information about Common Object File Format (COFF) binary files. You can use DUMPBIN to examine COFF object files, standard libraries of COFF objects, executable files, and dynamic-link libraries (DLLs)”, and this allows me to list and view the details of the debugging symbols.
At the beginning I was using a hex editor to delete debugging symbols (the crashing input) without using DUMPBIN, and I was losing the crash instantly. The reason is that non-clean cuts of the debugging symbols leave malformed data, which lets IDA simply ignore the whole debugging symbols table without parsing it, and so the crash is lost. This is why I had to make sure that each debugging symbol I deleted was deleted completely and not partially; deleting even one debugging symbol partially loses the crash. DUMPBIN with the option “/SYMBOLS” lists all the debugging symbols, showing for each symbol its index and size, so you know where it starts and ends:
Microsoft (R) COFF/PE Dumper Version 10.00.30319.01
Copyright (C) Microsoft Corporation. All rights reserved.

Dump of file ./patched/FastBackServer.exe

File Type: EXECUTABLE IMAGE

COFF SYMBOL TABLE
000 00000000 ABS notype Static | @comp.id
001 007ED1A0 SECT5 notype Static | $R000000
002 007ED4B8 SECT5 notype Static | $R000318
003 007ED4D8 SECT5 notype Static | $R000338
004 007ED588 SECT5 notype Static | $R0003E8
005 007ED9E8 SECT5 notype Static | $R000848
006 00000000 ABS notype Static | @comp.id
007 00226500 SECT1 notype Static | .text
Section length A0, #relocs 5, #linenums 12, checksum BC35A69D
009 00000000 ABS notype Static | @comp.id
.
.
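The clean cuts mentioned above follow the fixed record size of the COFF symbol table: in the PE/COFF format each record is 18 bytes (8-byte name field, 4-byte value, 2-byte section number, 2-byte type, 1-byte storage class, 1-byte auxiliary record count), and auxiliary records occupy the following slots. The sketch below walks such a table; the helper names and the synthetic table are made up for illustration, and long names (stored in the string table and referenced by offset from the name field) are not resolved here.

```python
import struct

SYMBOL_SIZE = 18              # size of one COFF symbol table record
RECORD_FORMAT = "<8sIhHBB"    # name, value, section, type, storage class, aux count


def parse_symbols(table):
    """Yield (index, name_field, value, section, aux_count) per primary record."""
    index = 0
    count = len(table) // SYMBOL_SIZE
    while index < count:
        name, value, section, _type, _storage, aux = struct.unpack_from(
            RECORD_FORMAT, table, index * SYMBOL_SIZE)
        yield index, name, value, section, aux
        index += 1 + aux      # skip the auxiliary records that follow


def make_record(name, value, section, aux=0):
    # Synthetic record for the demo: type 0 (notype), storage class 3 (Static).
    return struct.pack(RECORD_FORMAT, name, value, section, 0, 3, aux)


table = (make_record(b"@comp.id", 0, -1)              # section -1 is ABS
         + make_record(b".text", 0x226500, 1, aux=1)
         + b"\x00" * SYMBOL_SIZE                      # the auxiliary record
         + make_record(b"@comp.id", 0, -1))

for index, name, value, section, aux in parse_symbols(table):
    print("%03X %08X %s" % (index, value, name.rstrip(b"\x00").decode("ascii")))
```

Skipping the auxiliary slots is what makes the indices jump, the same way the DUMPBIN listing goes from 007 straight to 009 after the .text entry.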


So, guided by DUMPBIN, I used a hex editor to fill half of the debugging symbols with null bytes and checked whether I had lost the crash or not. This way I reduced the FastBackServer.exe file until only one debugging symbol string was left, and that was:
??0?$map@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V?$map@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@U?$less@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@V?$allocator@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@@2@U?$less@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@V?$allocator@V?$map@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@U?$less@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@V?$allocator@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@@std@@@2@@std@@QAE@ABU?$less@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@1@ABV?$allocator@V?$map@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@U?$less@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@V?$allocator@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@@std@@@1@@Z

This debugging symbol string was causing the crash. To verify this, I compiled a “hello, world!” program, edited its debugging symbols, and replaced one of its debugging symbol strings with the crashing debugging symbol I found in FastBackServer. And yes, IDA was crashing with my hello, world program too :D

If you’re interested in the reduced poc file: