GSoC week 3: Source-level debugging

June 14, 2017

In a previous post, I talked about how we couldn’t yet use r2’s gdbserver (the one I’m writing) for debugging. Well, now we can. Breakpoints, single-stepping and continue all work as of now. Things like tracepoints aren’t supported yet, but we’ll get there in time. And after my latest pull request, #7742, we can do source-level debugging with GDB. So how does this look? Here is gdb (the actual gdb) talking to r2’s gdbserver.

$ gdb

...

(gdb) set arch i386:x86_64:intel
The target architecture is assumed to be i386:x86_64:intel

(gdb) symbol-file /bin/radare2
Reading symbols from /bin/radare2...done.

(gdb) target remote localhost:8000
Remote debugging using localhost:8000
warning: No executable has been specified and target does not support determining executable
automatically. Try using the "file" command.
0x00007f0ff2ce8d80 in ?? ()

(gdb) break main
Breakpoint 1 at 0x55b8f0748835: file radare2.c, line 375.

(gdb) c
Continuing.

Breakpoint 1, main (argc=2, argv=0x7fffb975e1e8, envp=0x7fffb975e200) at radare2.c: 375
375         RThreadLock *lock = NULL;

(gdb) s
376         RThread *rabin_th = NULL;

(gdb)

There are a couple of points to keep in mind here. Notice that, before connecting to the target, we first need to specify the architecture of the target and the file from which to read debugging symbols.
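To make the ordering explicit, here is a minimal sketch that builds a gdb command line running the same three setup commands via gdb’s `-ex` option, so they execute in order at startup. The helper name `gdb_argv` is hypothetical (it is not part of r2 or gdb), and the path, architecture and port are simply the ones from the session above.

```python
import shlex


def gdb_argv(symbol_file, arch, host, port, gdb="gdb"):
    """Build a gdb command line that runs the setup steps in order.

    Hypothetical helper for illustration only: the architecture and the
    symbol file are set first, and the remote connection comes last.
    """
    setup = [
        f"set arch {arch}",
        f"symbol-file {symbol_file}",
        f"target remote {host}:{port}",
    ]
    argv = [gdb]
    for command in setup:
        # gdb executes each -ex command in the order given
        argv += ["-ex", command]
    return argv


if __name__ == "__main__":
    argv = gdb_argv("/bin/radare2", "i386:x86_64:intel", "localhost", 8000)
    # Print a copy-pasteable shell command instead of launching gdb
    print(shlex.join(argv))
```

Printing the command rather than launching gdb keeps the sketch free of side effects; the resulting line can be pasted into a shell once the gdbserver is listening.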

So what does this mean in practice? We can use GDB as the source-level debugger that it is, instead of being limited to low-level debugging. By source-level debugging, I mean that you can debug in terms of source code lines rather than the corresponding assembly (a single source line can correspond to quite a few assembly instructions), which is usually what you want when you’re hunting for issues in your own code. With layout split, it looks something like this:

   ┌───radare2.c─────────────────────────────────────────────────────────────────────────────
   │373 int main(int argc, char **argv, char **envp) {
   │374 #if USE_THREADS
B+ │375         RThreadLock *lock = NULL;
  >│376         RThread *rabin_th = NULL;
   │377 #endif
   ┌─────────────────────────────────────────────────────────────────────────────────────────
   │0x55c5947dc82e <main+25>         mov   %rdx,-0x638(%rbp)
B+ │0x55c5947dc835 <main+32>         movq  $0x0,-0x18(%rbp)
  >│0x55c5947dc83d <main+40>         movq  $0x0,-0x20(%rbp)
   │0x55c5947dc845 <main+48>         movq  $0x0,-0x30(%rbp)
   └─────────────────────────────────────────────────────────────────────────────────────────

remote Thread 3633 In: main                                          L376 PC: 0x55c5947dc83d 

(gdb) target remote localhost:8000
Remote debugging using localhost:8000
warning: No executable has been specified and target does not support
determining executable automatically. Try using the "file" command.
0x00007f0ff2ce8d80 in ?? ()

(gdb) break main
Breakpoint 1 at 0x55b8f0748835: file radare2.c, line 375.

(gdb) c
Continuing.

Breakpoint 1, main (argc=2, argv=0x7fffb975e1e8, envp=0x7fffb975e200) at radare2.c: 375

(gdb)
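The correspondence between the two panes above can be sketched as a lookup in a table of (address, line) pairs. This is only an illustration of the idea, not how gdb is implemented: the two entries are the ones visible in the capture (main+32 for line 375, main+40 for line 376), and the name `line_for_pc` is hypothetical.

```python
from bisect import bisect_right

# (address, source line) pairs sorted by address, taken from the capture.
# A real line table has many more entries and marks where a sequence ends;
# this partial table is an assumption made for the example.
LINE_TABLE = [
    (0x55C5947DC835, 375),
    (0x55C5947DC83D, 376),
]


def line_for_pc(pc, table=LINE_TABLE):
    """Return the source line whose entry is the closest one at or below pc."""
    addresses = [address for address, _ in table]
    index = bisect_right(addresses, pc) - 1
    if index < 0:
        # pc lies before the first entry this partial table knows about
        return None
    return table[index][1]


if __name__ == "__main__":
    print(line_for_pc(0x55C5947DC83D))
```

Because each entry covers every address up to the next entry, several instructions can map back to the same source line, which is why stepping one line can execute more than one instruction.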

Loading offsets into the r2 gdb client is a bit more involved, since it requires reading the memory map from the gdbserver (that’s how the r2 debugging API rebases symbols). I have two options. I can continue down that path and make it as seamless as the existing API, where reading the memory map also picks up things like library mappings, but that will take some time to implement in the GDB code. Or I could add an exception for gdb debugging and load just the base address of the binary with which to rebase symbols, which is untidy. I prefer option 1, and that is what I am working on now.
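As a rough sketch of what rebasing means here: given the memory map, find the lowest mapping that belongs to the binary and add that base to each symbol’s file-relative offset. This is not r2’s actual API; `MemoryMap`, `base_address` and `rebase` are hypothetical names, and the addresses in the example are made up.

```python
from collections import namedtuple

# Hypothetical record for one entry of the target's memory map
MemoryMap = namedtuple("MemoryMap", ["start", "end", "name"])


def base_address(maps, binary_name):
    """Return the lowest start address among the mappings of binary_name."""
    starts = [m.start for m in maps if m.name == binary_name]
    if not starts:
        raise ValueError(f"no mapping found for {binary_name}")
    return min(starts)


def rebase(symbols, base):
    """Turn file-relative symbol offsets into runtime addresses."""
    return {name: base + offset for name, offset in symbols.items()}


if __name__ == "__main__":
    # Made-up map: two mappings of the binary plus one library
    maps = [
        MemoryMap(0x10002000, 0x10003000, "/bin/app"),
        MemoryMap(0x10000000, 0x10002000, "/bin/app"),
        MemoryMap(0x7F000000, 0x7F100000, "/lib/libexample.so"),
    ]
    base = base_address(maps, "/bin/app")
    for name, address in rebase({"main": 0x815}, base).items():
        print(f"{name} = {address:#x}")
```

With the full memory map available, the same lookup works for every mapped library as well, which is what makes option 1 the tidier choice; option 2 would amount to calling `rebase` with a single hard-wired base.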