Archive for the ‘Assembly’ Category

Breakpoints and other Debug Events

There are 3 major types of events you encounter when debugging:

  1. Breakpoints
  2. Memory violations (!!)
  3. Exceptions
buffer overflow, for example,  is probably the most well-known for of memory violation and is both a notorious headache for end-users of buggy software and a breath of fresh air for reverse engineers.
     There are 3 types of breakpoints of interest:
  1. Software Breakpoints
  2. Hardware Breakpoints
  3. Memory Breakpoints
Soft breakpoints are used specifically to halt the CPU and are the most common type of breakpoint. Soft breakpoints are sing-byte instructions that stop execution of the debugger process. In RStudio you can set a soft breakpoint simply by clicking on the left margin of a particular line or function (or by choosing “Rerun with Debugger” in the error traceback dialog). Soft breakpoints can be persistent if they remain after the CPU has executed its tasks or one-shot if they are cleared out. Soft breakpoints are associated with the interrupt 3 (INT3) event.

Using assembly language, to process a soft breakpoint the single byte instruction must be converted into an operation code (a.k.a. an opcode).

In x86 Assembly language an instruction looks like:
which tells the CPU to move the value stored in the EBX register to the EAX register. In native opcode, this would just be:
A common analogy is that x86 Assembly language is to individual opcodes what DNS is to IP addresses. Memorizing thousands of opcodes is impractical, whereas assembly language is much more manageable.

Hardware breakpoints, associated with the INT1 event, are useful when a small number of breakpoints are needed and the software can’t be modified. When registers are used in this way they’re known as debug registers.

Most CPU’s have 8 debug registers denoted as DR0 through DR7. Up to 4 of the 8 registers can be used for breakpoints and each can only set up to 4 bytes at a time. This means you can only use up to 4 hardware breakpoints at a time. DR0 to DR3 is usually reserved for the breakpoint addresses. DR4 and DR5 are reserved and DR6 is the status register (which determines the type of debugging event triggered by a breakpoint).
Unlike soft and hard breakpoints, memory breakpoints are not true breakpoints. Rather, a memory breakpoint refers to the permissions on a region or page of memory set by a debugger.
memory page is the smallest portion of memory that an OS deals with.
Examples of memory page permissions:
  1. Page execution
  2. Page read
  3. Page write
  4. Guard page
The guard page permission allows us to separate the heap and the stack and can ensure that a chunk of memory does not grow past a given boundary. It does so by  throw a one-time exception whenever accessed and then returning to its original status.

CPU Registers: an Overview

Register: a small amount of storage on the CPU; the fastest method for a CPU to access data

In the x86 instruction set, a CPU uses 8 general-purpose registers:
  1. EAX — a.k.a. the accumulator register, used for performing calculations as well as storing return values
  2. EDX — a.k.a. the data register, basically an extension of EAX
  3. ECX — a.k.a. the count register, used for looping, counts DOWNWARD not upward
  4. ESI  — a.k.a. the source index, used for reading, holds the location of the input data stream
  5. EDI  — a.k.a. the destination index, used for writing, points to the location where the result is stored
  6. EBP  — a.k.a. the base pointer, used for managing function calls and stack operations, points to the bottom of the stack unless freed up from this function by the compiler, in which case it would be an extra general purpose register
  7. ESP  — a.k.a. the stack pointer, used for managing function calls and stack operations, points to the very top of the stack
  8. EBX  — an extra register, not designed for anything specific

    Another register worth mentioning is:
  9. EIP — a.k.a. the instruction pointer, points to the current instruction that is being executed; as binary code is being executed by the CPU the EIP is updated to reflect the location where the execution is occurring

White & Black box Debuggers, Intelligent Debugging, and Dynamic Analysis

Debugging is a common task for data scientists, programmers, and security experts alike. In good ole RStudio we have a nice, simple built-in white-box debugger. For many analysis-oriented coders, the basic debugging functionality of an IDE like RStudio is all they know and it may be a surprise that debugging is a bigger, much sexier, topic. Below I define and describe key topics in debugging and dynamic analysis, as well as provide links to the most cutting edge free debuggers I use.

Dynamic Analysis: Runtime tracing of a process, usually performed using a debugger. Dynamic Analysis is critical for exploit development, fuzzer assistance, and malware inspection.

Debugger: a program that is used to test and troubleshoot other programs.Intelligent Debugger: a scriptable debugger that supports extended features such as call hooking, such as Immunity Debugger and PyDbg.

White Box Debugger: Debuggers built into IDEs and other dev platforms, which enable developers to trace through source code with a high degree of control, as to aide in the troubleshooting of functions and other code breakages.
Black Box Debugger: Used by bug hunters and reverse engineers, black box debuggers operate on compiled programs when the source code is not available and the only information is available in a disassembled format. There are two broad subclasses of black box debuggers, which are user mode (i.e. ring 3) and kernel mode (i.e. ring 0).
User mode black box debugger: a processor mode under which your applications run, usually with the least amount of privilege (e.g. double clicking PuTTY.exe launches a user-mode process).
Kernel mode black box debugger: a processor mode where the core of the OS runs using the highest amount of privilege (e.g. capturing packets with a network adapter that is in passive mode).
User-mode Debuggers Commonly used among Reverse Engineers
WinDbg by Microsoft
OllyDbg by Oleh Yuschuk, a F.O.S.S. debugger
GNU Debugger (gdb), a F.O.S.S. Linux debugger by the community

cdecl and stdcall Calling Conventions, Stack Clearing and the EAX Register

Two key calling conventions are:

1. cdecl
2. stdcall

In cdecl parameters are pushed from right to left and the caller of the function is responsible for clearing the arguments from the stack. Used by most C systems on the x86 architecture.
Example of a cdecl call in C:
 int python_rocks(reason_one, reason_two, reason_three); 
In x86 Assembly:

push reason_three
push reason_two
push reason_one
call python_rocks
add esp, 12

The last line above increments the stack pointer 12 bytes (there are 3 parameters to the function and each stack parameter is 4 bytes and thus 12 bytes) which essentially clears those parameters.
Example of a stdcall call in C:

int my_socks(color_one, color_two, color_three);

In x86 Assembly:

push color_three
push color_two
push color_one
call my_socks
The order of the parameters in stdcall is the same but the stack clearing is not done by the caller, but by the my_socks function.
For both stdcall and cdecl calling conventions it’s important to note that return values are stored in the EAX register.