I guess in the last post I didn’t really elaborate on why we need to get memory offsets into the gdb client. So here’s the 20-second version. When a process is loaded for execution by the operating system, it needs to map the various regions in the program (code, data etc.) into memory. Due to a modern-day security feature called ASLR (Address Space Layout Randomization), operating systems randomize this layout. Now in order to properly analyze where things are, r2 needs to know the base address of the various sections of the executable, to properly rebase symbol information. And so, we need to get these offsets.
How does GDB do this? Well, I’m not really sure. How do we plan to do this? The radare2 debug API
has a function, r_debug_get_baddr(), which reads in the process memory map, finds the base of the
first program segment, and returns the base address with which to rebase symbols. Pretty neat, huh?
So what do I mean by reading the process’ memory map? Most operating systems have some way to keep
track of where stuff is mapped in the address space of a process. This includes the process’s code
and data, along with shared libraries. Some also map the kernel into the address space of the
process. But the point is, they also provide an interface for accessing this information. On a
Linux-based system for instance, process information is exposed under the /proc/<pid>/ directory,
with the memory map in /proc/<pid>/maps. A typical memory map on a Linux system looks something
like this -
$ cat /proc/4172/maps
00400000-00401000 r-xp 00000000 08:11 5373982 /home/user/test/a.out
00600000-00602000 rw-p 00000000 08:11 5373982 /home/user/test/a.out
7ffff7dda000-7ffff7dfd000 r-xp 00000000 08:05 264353 /usr/lib/ld-2.25.so
7ffff7ff8000-7ffff7ffa000 r--p 00000000 00:00 0 [vvar]
7ffff7ffa000-7ffff7ffc000 r-xp 00000000 00:00 0 [vdso]
7ffff7ffc000-7ffff7ffe000 rw-p 00022000 08:05 264353 /usr/lib/ld-2.25.so
7ffff7ffe000-7ffff7fff000 rw-p 00000000 00:00 0
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
The stuff on the left is memory address ranges. Then you have the permissions, the offset into the
mapped file, the device and the inode. And at the end is the name of what is mapped to this range.
So from here we know that we need to rebase the symbols by 0x00400000, which is the start of the
first a.out mapping (the executable one, holding the text section). And all’s well and good here.
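To make the "find the base address from the memory map" step concrete, here is a minimal sketch in Python. It is not r2's implementation (that one is C, behind r_debug_get_baddr()); the names Mapping, parse_maps and base_address are made up for this illustration, and picking the lowest mapping that belongs to the executable is an assumption about what "base" should mean here.

```python
from typing import List, NamedTuple, Optional


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    path: str  # empty string for anonymous mappings


def parse_maps(text: str) -> List[Mapping]:
    """Parse the contents of a Linux /proc/<pid>/maps file."""
    maps = []
    for line in text.splitlines():
        # Fields: range, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start, end = (int(x, 16) for x in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        maps.append(Mapping(start, end, fields[1], int(fields[2], 16), path))
    return maps


def base_address(maps: List[Mapping], exe_path: str) -> Optional[int]:
    """Lowest start address among the mappings backed by exe_path."""
    starts = [m.start for m in maps if m.path == exe_path]
    return min(starts) if starts else None


# A few lines taken from the map shown above.
SAMPLE = """\
00400000-00401000 r-xp 00000000 08:11 5373982 /home/user/test/a.out
00600000-00602000 rw-p 00000000 08:11 5373982 /home/user/test/a.out
7ffff7dda000-7ffff7dfd000 r-xp 00000000 08:05 264353 /usr/lib/ld-2.25.so
7ffff7ffe000-7ffff7fff000 rw-p 00000000 00:00 0
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
"""

if __name__ == "__main__":
    base = base_address(parse_maps(SAMPLE), "/home/user/test/a.out")
    print(hex(base))  # 0x400000
```

Matching on the path keeps the loader and the stack out of the picture, which is the whole point: only the mappings of the file we want symbols for matter.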
How do we do this when we’re debugging remotely, from a gdbserver? Well, as I said above, r2 rebases
symbols by reading the process memory map. So we do the same. We read /proc/<pid>/maps from the
gdbserver. Now note that this only works for a gdbserver running on a Linux system, since other
operating systems have a different way of storing process information. But we’re working on
it - there’s a plan to add a target interface for GDB debugging, which will sit between r2 and the
GDB protocol implementation. But anyway, the current implementation asks the gdbserver to read the
maps file and send it across, and parses it into r2’s internal memory map representation. And so,
we can now revel in r2’s output for the dm= command, for a process being debugged over gdbserver.
[0x7ffff7ddad80]> dm=
sys 4K * 0x0000000000400000 |#------| 0x0000000000401000 -r-x /home/user/test/a.out
sys 8K - 0x0000000000600000 |------#| 0x0000000000602000 -rw- /home/user/test/a.out
sys 140K - 0x00007ffff7dda000 |#------| 0x00007ffff7dfd000 -r-x /usr/lib/ld-2.25.so
sys 8K - 0x00007ffff7ff8000 |#------| 0x00007ffff7ffa000 -r-- [vvar]
sys 8K - 0x00007ffff7ffa000 |#------| 0x00007ffff7ffc000 -r-x [vdso]
sys 8K - 0x00007ffff7ffc000 |#------| 0x00007ffff7ffe000 -rw- /usr/lib/ld-2.25.so
sys 4K - 0x00007ffff7ffe000 |#------| 0x00007ffff7fff000 -rw- unk0
sys 132K - 0x00007ffffffde000 |-------| 0x00007ffffffff000 -rw- [stack]
sys 4K - 0xffffffffff600000 |-------| 0xffffffffff601000 -r-x [vsyscall]
[0x7ffff7ddad80]>
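For the curious, here is a rough idea of what "asking the gdbserver to read a file" can look like on the wire. The GDB remote protocol has host I/O packets (vFile:open, vFile:pread, vFile:close), and every packet is framed as $<payload>#<checksum>, where the checksum is the sum of the payload bytes modulo 256 in two hex digits. The sketch below only builds those request packets. It is an assumption that the client goes about it exactly this way, the helper names are invented for this example, and sending the packets and parsing the replies is left out.

```python
def rsp_frame(payload: str) -> bytes:
    """Wrap a payload in GDB remote protocol framing: $payload#xx."""
    body = payload.encode("ascii")
    checksum = sum(body) % 256  # modulo-256 sum of the payload bytes
    return b"$" + body + b"#" + f"{checksum:02x}".encode("ascii")


def vfile_open(path: str) -> bytes:
    """Request to open `path` read-only on the remote side.

    The filename is hex-encoded; flags and mode are hex numbers
    (flags 0 is read-only, mode is irrelevant when nothing is created).
    """
    return rsp_frame(f"vFile:open:{path.encode('ascii').hex()},0,0")


def vfile_pread(fd: int, count: int, offset: int) -> bytes:
    """Request up to `count` bytes from remote descriptor `fd` at `offset`."""
    return rsp_frame(f"vFile:pread:{fd:x},{count:x},{offset:x}")


def vfile_close(fd: int) -> bytes:
    """Request to close remote descriptor `fd`."""
    return rsp_frame(f"vFile:close:{fd:x}")


if __name__ == "__main__":
    # 4172 is the pid from the example map earlier in the post.
    # The descriptor 5 is a placeholder; the real one comes from the open reply.
    print(vfile_open("/proc/4172/maps"))
    print(vfile_pread(5, 0x1000, 0))
    print(vfile_close(5))
```

The requests here contain only plain ASCII and hex digits, so no escaping is needed. The data coming back from a pread is binary and does use the protocol's escaping, which a real client has to undo before handing the text to a maps parser.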
Now we get to the meat of the matter. How does this help us? Well, I’ve not yet implemented
automatic loading and rebasing of symbols, but what this does let us do is load the file manually
with the oa command, specifying the offset with which to rebase symbols. And then r2’s analysis
can work its magic.
$ r2 -d gdb://localhost:8000
= attach 6 6
-- Mind that the 'g' in radare is silent
[0x7ffff7ddad80]> dm
sys 4K 0x0000000000400000 - 0x0000000000401000 s -r-x /home/user/test/a.out /home/user/test/a.out
sys 8K 0x0000000000600000 - 0x0000000000602000 s -rw- /home/user/test/a.out /home/user/test/a.out
sys 140K 0x00007ffff7dda000 * 0x00007ffff7dfd000 s -r-x /usr/lib/ld-2.25.so /usr/lib/ld-2.25.so
sys 8K 0x00007ffff7ff8000 - 0x00007ffff7ffa000 s -r-- [vvar] [vvar]
sys 8K 0x00007ffff7ffa000 - 0x00007ffff7ffc000 s -r-x [vdso] [vdso]
sys 8K 0x00007ffff7ffc000 - 0x00007ffff7ffe000 s -rw- /usr/lib/ld-2.25.so /usr/lib/ld-2.25.so
sys 4K 0x00007ffff7ffe000 - 0x00007ffff7fff000 s -rw- unk0 unk0
sys 132K 0x00007ffffffde000 - 0x00007ffffffff000 s -rw- [stack] [stack]
sys 4K 0xffffffffff600000 - 0xffffffffff601000 s -r-x [vsyscall] [vsyscall]
[0x7ffff7ddad80]> oa 0x400000 /home/user/test/a.out
[0x7ffff7ddad80]> db sym.main
[0x7ffff7ddad80]> dc
Selecting and continuing: 0
= attach 0 0
got signal...
= attach 0 1
= attach 6 1
[0x004004f7]> aa
[x] Analyze all flags starting with sym. and entry0 (aa)
[0x004004f7]> afl
0x00400000 2 60 sym.imp.__libc_start_main
0x004003c8 3 23 sym._init
0x004003f0 1 6 sym.imp.puts
0x00400400 1 43 sym._start
0x00400430 4 50 -> 41 sym.deregister_tm_clones
0x00400470 3 53 sym.register_tm_clones
0x004004b0 3 28 sym.__do_global_dtors_aux
0x004004d0 4 38 -> 35 sym.frame_dummy
0x004004f6 1 1 sym.main
0x004004f7 1 31 fcn.rip
0x00400520 4 101 rbp
0x00400590 1 2 sym.__libc_csu_fini
0x00400594 1 9 sym._fini
0x00600ff0 2 17 fcn.00600ff0
[0x004004f7]> pdf sym.main
;-- rip:
┌ (fcn) fcn.rip 31
│ fcn.rip ();
│ ; var int local_10h @ rbp-0x10
│ ; var int local_4h @ rbp-0x4
│ 0x004004f7 4889e5 mov rbp, rsp
│ 0x004004fa 4883ec10 sub rsp, 0x10
│ 0x004004fe 897dfc mov dword [local_4h], edi
│ 0x00400501 488975f0 mov qword [local_10h], rsi
│ 0x00400505 bfa4054000 mov edi, 0x4005a4
│ 0x0040050a e8e1feffff call sym.imp.puts
│ 0x0040050f b800000000 mov eax, 0
│ 0x00400514 c9 leave
└ 0x00400515 c3 ret
[0x004004f7]>
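As a footnote, the rebasing itself is just address arithmetic: take the address a symbol has in the file, subtract the base the file was linked at, and add the base it was actually loaded at. A minimal sketch follows. The function name is made up, and the second example uses hypothetical numbers for a position-independent binary, not values from this session.

```python
def rebase(sym_vaddr: int, link_base: int, load_base: int) -> int:
    """Translate a symbol's address in the file to its address in the process."""
    return sym_vaddr - link_base + load_base


if __name__ == "__main__":
    # The session above: a.out is linked at 0x400000 and mapped at 0x400000,
    # so sym.main keeps its address.
    print(hex(rebase(0x4004F6, 0x400000, 0x400000)))  # 0x4004f6

    # Hypothetical PIE case: linked at 0, mapped at a randomized base.
    print(hex(rebase(0x6F6, 0x0, 0x555555554000)))  # 0x5555555546f6
```

For the binary in this post the two bases coincide, which is why the addresses in afl look exactly like the ones in the file.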