Reading process memory maps

June 21, 2017

I guess in the last post I didn’t really elaborate on why we need to get memory offsets into the gdb client. So here’s the 20-second version. When the operating system loads a process for execution, it maps the various regions of the program (code, data, etc.) into memory. Thanks to a modern security feature called ASLR (Address Space Layout Randomization), the operating system randomizes this layout. To work out where things actually are, r2 needs to know the base addresses at which the sections of the executable were loaded, so that it can rebase symbol information. And so, we need to get these offsets.

How does GDB do this? Well... I’m not really sure. How do we plan to do this? The radare2 debug API has a function, r_debug_get_baddr(), which reads in the process memory map, finds the base of the first program segment, and returns the base address with which to rebase symbols. Pretty neat, huh?

So what do I mean by reading the process’s memory map? Most operating systems have some way to keep track of where stuff is mapped in the address space of a process. This includes the process’s code and data, along with shared libraries. Some also map the kernel into the address space of the process. But the point is, they also provide an interface for accessing this information. On a Linux-based system, for instance, process information is exposed under the /proc/<pid>/ directory, with the memory map in /proc/<pid>/maps. A typical memory map on a Linux system looks something like this:

$ cat /proc/4172/maps
00400000-00401000 r-xp 00000000 08:11 5373982             /home/user/test/a.out
00600000-00602000 rw-p 00000000 08:11 5373982             /home/user/test/a.out
7ffff7dda000-7ffff7dfd000 r-xp 00000000 08:05 264353      /usr/lib/ld-2.25.so
7ffff7ff8000-7ffff7ffa000 r--p 00000000 00:00 0           [vvar]
7ffff7ffa000-7ffff7ffc000 r-xp 00000000 00:00 0           [vdso]
7ffff7ffc000-7ffff7ffe000 rw-p 00022000 08:05 264353      /usr/lib/ld-2.25.so
7ffff7ffe000-7ffff7fff000 rw-p 00000000 00:00 0
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0           [stack]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0   [vsyscall]

The stuff on the left is the memory address range of each mapping. Then you have the permissions, followed by the file offset, device and inode of whatever backs the mapping. And at the end is the name of what is mapped to this range. So from here we know that we need to rebase the symbols by 0x00400000, which is the start of the first a.out mapping (the executable one, holding the text section). And all’s well and good here.
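
To make the format concrete, here is a minimal Python sketch that parses maps-style text and picks the lowest start address mapped from a given file. This is only an illustration of the idea, not r2's actual r_debug_get_baddr() implementation; the names MapEntry, parse_maps and base_address are made up for this example, and the sample text is a shortened copy of the map above.

```python
from typing import List, NamedTuple, Optional


class MapEntry(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    path: str  # empty string for anonymous mappings


def parse_maps(text: str) -> List[MapEntry]:
    """Parse the text of a Linux /proc/<pid>/maps file."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # At most 5 splits, so a path containing spaces stays in one piece.
        fields = line.split(None, 5)
        addr_range, perms, offset, dev, inode = fields[:5]
        path = fields[5].strip() if len(fields) > 5 else ""
        start, end = (int(part, 16) for part in addr_range.split("-"))
        entries.append(
            MapEntry(start, end, perms, int(offset, 16), dev, int(inode), path)
        )
    return entries


def base_address(entries: List[MapEntry], path: str) -> Optional[int]:
    """Lowest start address among the mappings backed by `path`, if any."""
    starts = [entry.start for entry in entries if entry.path == path]
    return min(starts) if starts else None


SAMPLE = """\
00400000-00401000 r-xp 00000000 08:11 5373982             /home/user/test/a.out
00600000-00602000 rw-p 00000000 08:11 5373982             /home/user/test/a.out
7ffff7ffe000-7ffff7fff000 rw-p 00000000 00:00 0
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0           [stack]
"""

base = base_address(parse_maps(SAMPLE), "/home/user/test/a.out")
print(hex(base))  # prints 0x400000
```

On a live Linux system the same parser could be fed the contents of /proc/<pid>/maps instead of the SAMPLE string.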

How do we do this when we’re debugging remotely, from a gdbserver? Well, as I said above, r2 rebases symbols by reading the process memory map. So we do the same: we read /proc/<pid>/maps from the gdbserver. Note that this only works for a gdbserver running on a Linux system, since other operating systems have different ways of exposing process information. But we’re working on it - there’s a plan to add a target interface for GDB debugging, which will sit between r2 and the GDB protocol implementation. Anyway, the current implementation asks the gdbserver to read the maps file and send it across, and then parses it into r2’s internal memory map representation. And so, we can now revel in r2’s output for the dm= command, for a process being debugged over gdbserver.

[0x7ffff7ddad80]> dm=
sys     4K * 0x0000000000400000 |#------| 0x0000000000401000 -r-x /home/user/test/a.out
sys     8K - 0x0000000000600000 |------#| 0x0000000000602000 -rw- /home/user/test/a.out
sys   140K - 0x00007ffff7dda000 |#------| 0x00007ffff7dfd000 -r-x /usr/lib/ld-2.25.so
sys     8K - 0x00007ffff7ff8000 |#------| 0x00007ffff7ffa000 -r-- [vvar]
sys     8K - 0x00007ffff7ffa000 |#------| 0x00007ffff7ffc000 -r-x [vdso]
sys     8K - 0x00007ffff7ffc000 |#------| 0x00007ffff7ffe000 -rw- /usr/lib/ld-2.25.so
sys     4K - 0x00007ffff7ffe000 |#------| 0x00007ffff7fff000 -rw- unk0
sys   132K - 0x00007ffffffde000 |-------| 0x00007ffffffff000 -rw- [stack]
sys     4K - 0xffffffffff600000 |-------| 0xffffffffff601000 -r-x [vsyscall]
[0x7ffff7ddad80]>
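
For a rough idea of what "asking the gdbserver to read the maps file" can look like on the wire, here is a small Python sketch that builds GDB remote protocol packets. It assumes the file is fetched with the protocol's Host I/O (vFile) packets; this post does not say exactly which packets r2 sends, so treat that as an assumption. The helper names are made up for this example, and it only builds the requests: sending them and decoding the replies is left out.

```python
def frame_packet(payload: str) -> bytes:
    """Wrap a payload as $<payload>#<checksum>.

    The checksum is the sum of the payload bytes modulo 256, written as
    two lowercase hex digits. The payloads built below never contain the
    special characters '$', '#', '}' or '*', so no escaping is done here.
    """
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}".encode("ascii")


def vfile_open(path: str) -> bytes:
    """Request to open `path` read-only on the remote side.

    The path is hex-encoded; flags 0 means O_RDONLY and mode 0 is unused
    when not creating a file.
    """
    return frame_packet(f"vFile:open:{path.encode('ascii').hex()},0,0")


def vfile_pread(fd: int, count: int, offset: int) -> bytes:
    """Request up to `count` bytes from `fd` at `offset` (all in hex)."""
    return frame_packet(f"vFile:pread:{fd:x},{count:x},{offset:x}")


def vfile_close(fd: int) -> bytes:
    """Request to close a remote file descriptor."""
    return frame_packet(f"vFile:close:{fd:x}")


# Hypothetical pid and file descriptor, for illustration only.
print(vfile_open("/proc/4172/maps"))
print(vfile_pread(3, 0x1000, 0))
print(vfile_close(3))
```

A client would send the open request, keep issuing pread requests with a growing offset until the server reports zero bytes read, and then close the descriptor.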

Now we get to the meat of the matter. How does this help us? Well, I’ve not yet implemented automatic loading and rebasing of symbols, but what this does let us do is load the file manually with the oa command, specifying the offset with which to rebase symbols. And then r2’s analysis can work its magic.

$ r2 -d gdb://localhost:8000

= attach 6 6
 -- Mind that the 'g' in radare is silent

[0x7ffff7ddad80]> dm
sys   4K 0x0000000000400000 - 0x0000000000401000 s -r-x /home/user/test/a.out /home/user/test/a.out
sys   8K 0x0000000000600000 - 0x0000000000602000 s -rw- /home/user/test/a.out /home/user/test/a.out
sys 140K 0x00007ffff7dda000 * 0x00007ffff7dfd000 s -r-x /usr/lib/ld-2.25.so /usr/lib/ld-2.25.so
sys   8K 0x00007ffff7ff8000 - 0x00007ffff7ffa000 s -r-- [vvar] [vvar]
sys   8K 0x00007ffff7ffa000 - 0x00007ffff7ffc000 s -r-x [vdso] [vdso]
sys   8K 0x00007ffff7ffc000 - 0x00007ffff7ffe000 s -rw- /usr/lib/ld-2.25.so /usr/lib/ld-2.25.so
sys   4K 0x00007ffff7ffe000 - 0x00007ffff7fff000 s -rw- unk0 unk0
sys 132K 0x00007ffffffde000 - 0x00007ffffffff000 s -rw- [stack] [stack]
sys   4K 0xffffffffff600000 - 0xffffffffff601000 s -r-x [vsyscall] [vsyscall]

[0x7ffff7ddad80]> oa 0x400000 /home/user/test/a.out

[0x7ffff7ddad80]> db sym.main

[0x7ffff7ddad80]> dc
Selecting and continuing: 0
= attach 0 0
got signal...
= attach 0 1
= attach 6 1

[0x004004f7]> aa
[x] Analyze all flags starting with sym. and entry0 (aa)

[0x004004f7]> afl
0x00400000    2 60           sym.imp.__libc_start_main
0x004003c8    3 23           sym._init
0x004003f0    1 6            sym.imp.puts
0x00400400    1 43           sym._start
0x00400430    4 50   -> 41   sym.deregister_tm_clones
0x00400470    3 53           sym.register_tm_clones
0x004004b0    3 28           sym.__do_global_dtors_aux
0x004004d0    4 38   -> 35   sym.frame_dummy
0x004004f6    1 1            sym.main
0x004004f7    1 31           fcn.rip
0x00400520    4 101          rbp
0x00400590    1 2            sym.__libc_csu_fini
0x00400594    1 9            sym._fini
0x00600ff0    2 17           fcn.00600ff0

[0x004004f7]>  pdf sym.main
            ;-- rip:
┌ (fcn) fcn.rip 31
│   fcn.rip ();
│           ; var int local_10h @ rbp-0x10
│           ; var int local_4h @ rbp-0x4
│           0x004004f7      4889e5         mov rbp, rsp
│           0x004004fa      4883ec10       sub rsp, 0x10
│           0x004004fe      897dfc         mov dword [local_4h], edi
│           0x00400501      488975f0       mov qword [local_10h], rsi
│           0x00400505      bfa4054000     mov edi, 0x4005a4
│           0x0040050a      e8e1feffff     call sym.imp.puts
│           0x0040050f      b800000000     mov eax, 0
│           0x00400514      c9             leave
└           0x00400515      c3             ret

[0x004004f7]>
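
Conceptually, the rebasing done when loading the file at a given address is simple arithmetic: take a symbol's address as recorded in the file, subtract the base the file was linked at, and add the base the process actually has it mapped at. The Python sketch below illustrates this; the function name and the PIE numbers are hypothetical, and it is not how r2 implements it internally.

```python
def rebase(symbol_vaddr: int, link_base: int, load_base: int) -> int:
    """Translate an address from the file's view to the process's view."""
    return symbol_vaddr - link_base + load_base


# The a.out in this post is mapped at 0x400000, which (assuming it is a
# non-PIE binary linked at that same address) leaves symbols unchanged.
print(hex(rebase(0x4004F6, 0x400000, 0x400000)))  # prints 0x4004f6

# Hypothetical PIE binary: linked at 0, loaded at a randomized base.
print(hex(rebase(0x6F6, 0x0, 0x555555554000)))  # prints 0x5555555546f6
```

This is why the base address read from the memory map is the one number needed to line the symbols up with the running process.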