
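Triton and Tools Built on It

Installing and Using Triton

Triton is a binary analysis framework used mainly to analyze and inspect binaries, including executables and shared libraries.

Its main features include:

  1. Disassembly and analysis:
    • Triton can disassemble a binary into assembly code for security researchers to analyze
    • It supports multiple instruction set architectures, such as x86 and ARM
  2. Automated analysis:
    • Triton offers a rich API that lets developers write automated binary analysis scripts
    • Such scripts can be used to look for vulnerabilities, malicious code, and other security issues in binaries
  3. Plugin extensions:
    • Triton can be extended through plugins to meet the specific needs of different security researchers
    • Developers can write custom analysis plugins and integrate them into Triton
  4. Interactive analysis:
    • Triton provides an interactive command-line interface in which researchers can analyze and debug by hand
    • It supports debugging features such as breakpoints and single-stepping
  5. Cross-platform support:
    • Triton runs on Windows, Linux, macOS, and other operating systems, so the same analysis can be done on any of them

In Triton, execution is driven by our own script: Triton emulates exactly the instructions we feed it, and both taint analysis and symbolic execution are built on top of this emulation.

Triton requires the following dependencies: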

* libcapstone >= 4.0.x https://github.com/capstone-engine/capstone
* libboost (optional) >= 1.68
* libpython (optional) >= 3.6
* libz3 (optional) >= 4.6.0 https://github.com/Z3Prover/z3
* libbitwuzla (optional) >= 0.4.x https://github.com/bitwuzla/bitwuzla
* llvm (optional) >= 12
  • If the compiled /usr/lib/python3.8/site-packages/triton.so segfaults, the cause is most likely the libcapstone version

Download Triton and its dependencies with vcpkg:

git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh # ./bootstrap-vcpkg.bat for Windows
./vcpkg integrate install
./vcpkg install triton
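  • vcpkg is a command-line package manager for C++ developed by Microsoft
  • It aims to simplify obtaining and installing C++ libraries on various platforms, Windows in particular
  • vcpkg supports multiple compilers and build systems and integrates with Visual Studio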
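./vcpkg list             # list installed libraries
./vcpkg search <keyword> # search the available libraries
./vcpkg update           # list installed libraries that have a newer version available

Before installing Triton, z3 must be installed first (it is also available through vcpkg):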

python scripts/mk_make.py --prefix=/home/yhellow --python --pypkgdir=/home/yhellow/.local/lib/python3.8/site-packages/
cd build
make -j8
sudo make install

Then build and install Triton with the following commands:

mkdir build
cd build
cmake ..
make -j8
sudo make install
sudo mv /usr/lib/python3.8/site-packages/triton.so /home/yhellow/.local/lib/python3.8/site-packages/

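Emulation with Triton

Emulation is a technique in which software or hardware imitates the behavior of another system, so that programs can be run or tasks performed in a different environment.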
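
Before turning to Triton itself, the core of any emulator can be sketched as a fetch-decode-execute loop over a program counter. The three-instruction "ISA" below is hypothetical and exists only for illustration; it is not how Triton models x86. Like the Triton scripts in this article, the loop stops once the program counter becomes 0.

```python
# Toy fetch-decode-execute loop. The instruction set is hypothetical and
# only illustrates the control flow of an emulator; it is not Triton's API.
def emulate(program, start):
    """Run `program` (dict: address -> instruction tuple) from `start`.

    Execution stops when pc becomes 0, mirroring the `while pc:` loops
    used by the Triton scripts in this article.
    """
    regs = {"rax": 0, "rbx": 0}
    trace = []                      # addresses of executed instructions
    pc = start
    while pc:
        op, *args = program[pc]     # fetch + decode
        trace.append(pc)
        if op == "mov":             # mov reg, imm
            regs[args[0]] = args[1]
            pc += 1
        elif op == "add":           # add dst, src
            regs[args[0]] += regs[args[1]]
            pc += 1
        elif op == "jmp":           # jmp target (target 0 ends the run)
            pc = args[0]
        else:
            raise ValueError(f"unknown opcode {op!r} at {pc:#x}")
    return regs, trace


program = {
    0x10: ("mov", "rax", 2),
    0x11: ("mov", "rbx", 40),
    0x12: ("add", "rax", "rbx"),
    0x13: ("jmp", 0),
}
regs, trace = emulate(program, 0x10)
print(regs["rax"], [hex(a) for a in trace])
```

Because the script owns the loop, it can inspect or modify registers and memory between any two instructions, which is the property the later taint, hooking, and symbolic execution scripts rely on.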

A usage example:

from __future__ import print_function
from triton import *

code = [  # each entry is (instruction address, instruction bytes)
    (0x400000, b"\x48\x8b\x05\xb8\x13\x00\x00"),  # mov rax, QWORD PTR [rip+0x13b8]
    (0x400007, b"\x48\x8d\x34\xc3"),              # lea rsi, [rbx+rax*8]
    (0x40000b, b"\x67\x48\x8D\x74\xC3\x0A"),      # lea rsi, [ebx+eax*8+0xa]
    (0x400011, b"\x66\x0F\xD7\xD1"),              # pmovmskb edx, xmm1
    (0x400015, b"\x89\xd0"),                      # mov eax, edx
    (0x400017, b"\x80\xf4\x99"),                  # xor ah, 0x99
    (0x40001a, b"\xC5\xFD\x6F\xCA"),              # vmovdqa ymm1, ymm2
]

if __name__ == '__main__':
    ctx = TritonContext()
    ctx.setArchitecture(ARCH.X86_64)  # architecture of the emulated code

    for (addr, opcode) in code:
        inst = Instruction()    # create a new instruction object
        inst.setOpcode(opcode)  # set the instruction bytes
        inst.setAddress(addr)   # set the instruction address
        ctx.processing(inst)    # execute the instruction

        print(inst)  # print the disassembled instruction
        print(' ---------------')
        print(' Is memory read :', inst.isMemoryRead())
        print(' Is memory write:', inst.isMemoryWrite())
        print(' ---------------')
        for op in inst.getOperands():
            print(' Operand:', op)
            if op.getType() == OPERAND.MEM:
                print(' - segment :', op.getSegmentRegister())
                print(' - base :', op.getBaseRegister())
                print(' - index :', op.getIndexRegister())
                print(' - scale :', op.getScale())
                print(' - disp :', op.getDisplacement())
            print(' ---------------')
        print()
0x400000: mov rax, qword ptr [rip + 0x13b8]
---------------
Is memory read : True
Is memory write: False
---------------
Operand: rax:64 bv[63..0]
---------------
Operand: [@0x4013bf]:64 bv[63..0]
- segment : unknown:1 bv[0..0]
- base : rip:64 bv[63..0]
- index : unknown:1 bv[0..0]
- scale : 0x1:64 bv[63..0]
- disp : 0x13b8:64 bv[63..0]
---------------

0x400007: lea rsi, [rbx + rax*8]
---------------
Is memory read : False
Is memory write: False
---------------
Operand: rsi:64 bv[63..0]
---------------
Operand: [@0x0]:64 bv[63..0]
- segment : unknown:1 bv[0..0]
- base : rbx:64 bv[63..0]
- index : rax:64 bv[63..0]
- scale : 0x8:64 bv[63..0]
- disp : 0x0:64 bv[63..0]
---------------

0x40000b: lea rsi, [ebx + eax*8 + 0xa]
---------------
Is memory read : False
Is memory write: False
---------------
Operand: rsi:64 bv[63..0]
---------------
Operand: [@0xa]:64 bv[63..0]
- segment : unknown:1 bv[0..0]
- base : ebx:32 bv[31..0]
- index : eax:32 bv[31..0]
- scale : 0x8:32 bv[31..0]
- disp : 0xa:32 bv[31..0]
---------------

0x400011: pmovmskb edx, xmm1
---------------
Is memory read : False
Is memory write: False
---------------
Operand: edx:32 bv[31..0]
---------------
Operand: xmm1:128 bv[127..0]
---------------

0x400015: mov eax, edx
---------------
Is memory read : False
Is memory write: False
---------------
Operand: eax:32 bv[31..0]
---------------
Operand: edx:32 bv[31..0]
---------------

0x400017: xor ah, 0x99
---------------
Is memory read : False
Is memory write: False
---------------
Operand: ah:8 bv[15..8]
---------------
Operand: 0x99:8 bv[7..0]
---------------

0x40001a: vmovdqa ymm1, ymm2
---------------
Is memory read : False
Is memory write: False
---------------
Operand: ymm1:256 bv[255..0]
---------------
Operand: ymm2:256 bv[255..0]
---------------

The example above does not yet show what emulation is really good for. The next example emulates an entire binary:

gcc test.c -o test -static
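  • Link statically, because the emulation cannot resolve .got.plt (some stdio.h library functions are optimized by the compiler so that they can be called directly, without going through .got.plt)
  • Static linking also disables PIE, so no extra offset calculation is needed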
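#include<stdio.h>
/* read/write are declared in <unistd.h> and open in <fcntl.h>; GCC 14 and
   later reject the implicit declarations used here unless they are included */

int main(){
    char buf[0x10];

    read(0,buf,0x200);
    int fd = open(buf,0);
    read(fd ,buf, 0x10);
    write(1 ,buf, 0x10);
    return 0;
}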
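  • The addresses in .got.plt may be placeholders in the on-disk binary; in memory they are replaced with the real addresses (resolved through .rela.plt)
  • IDA resolves these addresses automatically, so what IDA shows is not the raw on-disk data

The emulation script: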

from triton import *

def taint_analysis2(start, end):
    ctx = TritonContext()
    with open('./test', 'rb') as f:
        bin1 = f.read()

    ctx.setArchitecture(ARCH.X86_64)
    ctx.setConcreteMemoryAreaValue(0x400000, bin1)  # load the binary at 0x400000 (the image base reported by IDA)
    RBP_ADDR = 0x60000000
    RSP_ADDR = RBP_ADDR - 0x20000000
    INPUT_ADDR = 0x10000000

    ctx.setConcreteRegisterValue(ctx.registers.rip, start)     # set rip
    ctx.setConcreteRegisterValue(ctx.registers.rsp, RSP_ADDR)  # set rsp
    ctx.setConcreteRegisterValue(ctx.registers.rbp, RBP_ADDR)  # set rbp

    input = b"yhellow\x00"
    ctx.setConcreteMemoryAreaValue(INPUT_ADDR, input)  # place the string at INPUT_ADDR
    pc = start

    while pc:  # stops when rip becomes 0 (`end` is not used by the loop)
        inst = Instruction()
        opcode = ctx.getConcreteMemoryAreaValue(pc, 16)  # read the instruction bytes
        inst.setOpcode(opcode)
        inst.setAddress(pc)
        ctx.processing(inst)
        print(str(inst))

        pc = ctx.getConcreteRegisterValue(ctx.registers.rip)  # read rip

if __name__ == '__main__':
    taint_analysis2(0x401CD5, 0x401D74)  # address range of main (taken from IDA)

The final result:

0x401cd5: endbr64
0x401cd9: push rbp
0x401cda: mov rbp, rsp
0x401cdd: sub rsp, 0x30
0x401ce1: mov rax, qword ptr fs:[0x28]
0x401cea: mov qword ptr [rbp - 8], rax
0x401cee: xor eax, eax
0x401cf0: lea rax, [rbp - 0x20]
0x401cf4: mov edx, 0x200
0x401cf9: mov rsi, rax
0x401cfc: mov edi, 0
0x401d01: mov eax, 0
0x401d06: call 0x4483f0 /* a call jumps into the callee's code */
0x4483f0: endbr64
0x4483f4: mov eax, dword ptr fs:[0x18]
0x4483fc: test eax, eax
0x4483fe: jne 0x448410
0x448400: syscall
0x448402: cmp rax, -0x1000
0x448408: ja 0x448460
0x44840a: ret /* return to the caller */
0x401d0b: lea rax, [rbp - 0x20]
0x401d0f: mov esi, 0
0x401d14: mov rdi, rax
0x401d17: mov eax, 0
0x401d1c: call 0x4482c0
0x4482c0: endbr64
0x4482c4: push r12
0x4482c6: mov r10d, esi
0x4482c9: mov r12d, esi
0x4482cc: push rbp
0x4482cd: mov rbp, rdi
0x4482d0: sub rsp, 0x68
0x4482d4: mov qword ptr [rsp + 0x40], rdx
0x4482d9: mov rax, qword ptr fs:[0x28]
0x4482e2: mov qword ptr [rsp + 0x28], rax
0x4482e7: xor eax, eax
0x4482e9: and r10d, 0x40
0x4482ed: jne 0x448348
0x4482ef: mov eax, esi
0x4482f1: and eax, 0x410000
0x4482f6: cmp eax, 0x410000
0x4482fb: je 0x448348
0x4482fd: mov eax, dword ptr fs:[0x18]
0x448305: test eax, eax
0x448307: jne 0x448370
0x448309: mov edx, r12d
0x44830c: mov rsi, rbp
0x44830f: mov edi, 0xffffff9c
0x448314: mov eax, 0x101
0x448319: syscall
0x44831b: cmp rax, -0x1000
0x448321: ja 0x4483b8
0x448327: mov rcx, qword ptr [rsp + 0x28]
0x44832c: xor rcx, qword ptr fs:[0x28]
0x448335: jne 0x4483e1
0x44833b: add rsp, 0x68
0x44833f: pop rbp
0x448340: pop r12
0x448342: ret
0x401d21: mov dword ptr [rbp - 0x24], eax
0x401d24: lea rcx, [rbp - 0x20]
0x401d28: mov eax, dword ptr [rbp - 0x24]
0x401d2b: mov edx, 0x10
0x401d30: mov rsi, rcx
0x401d33: mov edi, eax
0x401d35: mov eax, 0
0x401d3a: call 0x4483f0
0x4483f0: endbr64
0x4483f4: mov eax, dword ptr fs:[0x18]
0x4483fc: test eax, eax
0x4483fe: jne 0x448410
0x448400: syscall
0x448402: cmp rax, -0x1000
0x448408: ja 0x448460
0x44840a: ret
0x401d3f: lea rax, [rbp - 0x20]
0x401d43: mov edx, 0x10
0x401d48: mov rsi, rax
0x401d4b: mov edi, 1
0x401d50: mov eax, 0
0x401d55: call 0x448490
0x448490: endbr64
0x448494: mov eax, dword ptr fs:[0x18]
0x44849c: test eax, eax
0x44849e: jne 0x4484b0
0x4484a0: mov eax, 1
0x4484a5: syscall
0x4484a7: cmp rax, -0x1000
0x4484ad: ja 0x448500
0x4484af: ret
0x401d5a: mov eax, 0
0x401d5f: mov rdx, qword ptr [rbp - 8]
0x401d63: sub rdx, qword ptr fs:[0x28]
0x401d6c: je 0x401d73
0x401d73: leave
0x401d74: ret

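Taint Analysis with Triton

Taint analysis is a security analysis technique that tracks how data flows through a program, in particular data that may originate from untrusted input. It helps identify and prevent vulnerabilities such as cross-site scripting (XSS), SQL injection, and command injection.

The taint analysis script: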

from triton import *

def taint_analysis2(start, end):
    ctx = TritonContext()
    with open('./test', 'rb') as f:
        bin1 = f.read()

    ctx.setArchitecture(ARCH.X86_64)
    ctx.setConcreteMemoryAreaValue(0x400000, bin1)
    RBP_ADDR = 0x7ffffffde000
    RSP_ADDR = RBP_ADDR - 0x21000
    INPUT_ADDR = 0x4C2290

    ctx.setConcreteRegisterValue(ctx.registers.rip, start)
    ctx.setConcreteRegisterValue(ctx.registers.rsp, RSP_ADDR)
    ctx.setConcreteRegisterValue(ctx.registers.rbp, RBP_ADDR)

    input = b"./flag\x00"
    ctx.setConcreteMemoryAreaValue(INPUT_ADDR, input)

    pc = start
    nop_addrs = []  # addresses of the tainted instructions

    while pc:
        inst = Instruction()
        opcode = ctx.getConcreteMemoryAreaValue(pc, 16)

        inst.setOpcode(opcode)
        inst.setAddress(pc)
        ctx.processing(inst)

        print(str(inst))
        if pc == 0x401cf5:
            print("--------------")
            print("taint target: "+hex(ctx.getConcreteRegisterValue(ctx.registers.rdi)))  # read the register value
            print("--------------")
            ctx.taintRegister(ctx.registers.rdi)  # mark the address held in rdi as tainted

        if inst.isTainted():  # does this instruction touch tainted data?
            nop_addrs.append(hex(pc))
            print("--------------")
            if inst.isMemoryRead():
                for op in inst.getOperands():
                    if op.getType() == OPERAND.MEM:
                        print("read:0x{:08x}, size:{}".format(
                            op.getAddress(), op.getSize()))

            if inst.isMemoryWrite():
                for op in inst.getOperands():
                    if op.getType() == OPERAND.MEM:
                        print("write:0x{:08x}, size:{}".format(
                            op.getAddress(), op.getSize()))

        pc = ctx.getConcreteRegisterValue(ctx.registers.rip)

    print(nop_addrs)

if __name__ == '__main__':
    taint_analysis2(0x401CD5, 0x401D3F)

The final result:

0x401cd5: endbr64
0x401cd9: push rbp
0x401cda: mov rbp, rsp
0x401cdd: sub rsp, 0x10
0x401ce1: mov esi, 0
0x401ce6: lea rax, [rip + 0xc05a3]
0x401ced: mov rdi, rax
0x401cf0: mov eax, 0
0x401cf5: call 0x448280
--------------
taint target: 0x4c2290
--------------
0x448280: endbr64
0x448284: push r12
0x448286: mov r10d, esi
0x448289: mov r12d, esi
0x44828c: push rbp
0x44828d: mov rbp, rdi
--------------
0x448290: sub rsp, 0x68
0x448294: mov qword ptr [rsp + 0x40], rdx
0x448299: mov rax, qword ptr fs:[0x28]
0x4482a2: mov qword ptr [rsp + 0x28], rax
0x4482a7: xor eax, eax
0x4482a9: and r10d, 0x40
0x4482ad: jne 0x448308
0x4482af: mov eax, esi
0x4482b1: and eax, 0x410000
0x4482b6: cmp eax, 0x410000
0x4482bb: je 0x448308
0x4482bd: mov eax, dword ptr fs:[0x18]
0x4482c5: test eax, eax
0x4482c7: jne 0x448330
0x4482c9: mov edx, r12d
0x4482cc: mov rsi, rbp
--------------
0x4482cf: mov edi, 0xffffff9c
0x4482d4: mov eax, 0x101
0x4482d9: syscall
0x4482db: cmp rax, -0x1000
0x4482e1: ja 0x448378
0x4482e7: mov rcx, qword ptr [rsp + 0x28]
0x4482ec: xor rcx, qword ptr fs:[0x28]
0x4482f5: jne 0x4483a1
0x4482fb: add rsp, 0x68
0x4482ff: pop rbp
0x448300: pop r12
0x448302: ret
0x401cfa: mov dword ptr [rbp - 4], eax
0x401cfd: mov eax, dword ptr [rbp - 4]
0x401d00: mov edx, 0x10
0x401d05: lea rcx, [rip + 0xc0584]
0x401d0c: mov rsi, rcx
0x401d0f: mov edi, eax
0x401d11: mov eax, 0
0x401d16: call 0x4483b0
0x4483b0: endbr64
0x4483b4: mov eax, dword ptr fs:[0x18]
0x4483bc: test eax, eax
0x4483be: jne 0x4483d0
0x4483c0: syscall
0x4483c2: cmp rax, -0x1000
0x4483c8: ja 0x448420
0x4483ca: ret
0x401d1b: mov edx, 0x10
0x401d20: lea rax, [rip + 0xc0569]
0x401d27: mov rsi, rax
0x401d2a: mov edi, 1
0x401d2f: mov eax, 0
0x401d34: call 0x448450
0x448450: endbr64
0x448454: mov eax, dword ptr fs:[0x18]
0x44845c: test eax, eax
0x44845e: jne 0x448470
0x448460: mov eax, 1
0x448465: syscall
0x448467: cmp rax, -0x1000
0x44846d: ja 0x4484c0
0x44846f: ret
0x401d39: mov eax, 0
0x401d3e: leave
0x401d3f: ret
['0x44828d', '0x4482cc']
  • The state at these two addresses can be inspected in GDB:
	......
*RDI 0x4c2290 (buf) ◂— 0x0
......
*RBP 0x7fffffffdb40 —▸ 0x402d20 (__libc_csu_init) ◂— endbr64
*RSP 0x7fffffffdb18 —▸ 0x7fffffffdb40 —▸ 0x402d20 (__libc_csu_init) ◂— endbr64
*RIP 0x44828d (open64+13) ◂— mov rbp, rdi
───────────────────────────────────[ DISASM ]───────────────────────────────────
0x44828d <open64+13> mov rbp, rdi <buf>
	......
RSI 0x0
......
*RBP 0x4c2290 (buf) ◂— 0x0
*RSP 0x7fffffffdab0 ◂— 0x1b
*RIP 0x4482cc (open64+76) ◂— mov rsi, rbp
───────────────────────────────────[ DISASM ]───────────────────────────────────
0x4482cc <open64+76> mov rsi, rbp <buf>
  • Both tainted instructions reference the data at 0x4c2290, but that does not mean every instruction touching 0x4c2290 is tainted
  • Consider the following state:
	......
*RCX 0x4c2290 (buf) ◂— 0x0
......
RSI 0x4c2290 (buf) ◂— 0x0
......
RBP 0x7fffffffdb40 —▸ 0x402d20 (__libc_csu_init) ◂— endbr64
RSP 0x7fffffffdb30 ◂— 0x0
*RIP 0x401d0c (main+55) ◂— mov rsi, rcx
───────────────────────────────────[ DISASM ]───────────────────────────────────
0x401d0c <main+55> mov rsi, rcx
  • This instruction also references 0x4c2290, but the value comes from a different source:
    • In the tainted instructions, 0x4c2290 originates from 0x401ce6: lea rax, [rip + 0xc05a3]
    • Here, 0x4c2290 originates from 0x401d05: lea rcx, [rip + 0xc0584]
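
The observation above, that taint follows where a value came from rather than what the value is, can be reproduced with a very small register-level tracker. This is only an illustrative sketch: the TaintState class and its methods are hypothetical and far simpler than Triton's taint engine.

```python
# Minimal register-level taint tracker (illustrative only; not Triton's API).
class TaintState:
    def __init__(self):
        self.values = {}      # register -> concrete value
        self.tainted = set()  # registers currently marked as tainted

    def load_const(self, dst, value):
        """dst = constant: the result has no tainted source."""
        self.values[dst] = value
        self.tainted.discard(dst)

    def mov(self, dst, src):
        """dst = src: the taint mark travels with the data.

        Returns True if the instruction handled tainted data.
        """
        self.values[dst] = self.values[src]
        if src in self.tainted:
            self.tainted.add(dst)
            return True
        self.tainted.discard(dst)
        return False


BUF = 0x4C2290
state = TaintState()
state.load_const("rax", BUF)      # lea rax, [rip + 0xc05a3]
state.mov("rdi", "rax")           # mov rdi, rax
state.tainted.add("rdi")          # taint rdi at the call site
hit1 = state.mov("rbp", "rdi")    # mov rbp, rdi  -> tainted
hit2 = state.mov("rsi", "rbp")    # mov rsi, rbp  -> tainted
state.load_const("rcx", BUF)      # lea rcx, [rip + 0xc0584]
hit3 = state.mov("rsi", "rcx")    # mov rsi, rcx  -> same value, not tainted
print(hit1, hit2, hit3)
```

The last mov writes the same number into rsi as the tainted one did, yet it is not reported, because rcx was never derived from the tainted rdi.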

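Code Instrumentation with Triton

To work around the emulation's inability to resolve .got.plt, Triton's instrumentation facilities can be used to replace the functions referenced by .got.plt with custom functions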
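
The idea can be sketched without Triton: each hooked function gets a fake address that no real code occupies, and whenever the program counter lands on one of those addresses the emulator runs a Python handler and then simulates a ret. The state dictionary, the stack list, and all names below are hypothetical stand-ins for the real emulator state.

```python
# Sketch of GOT-style hooking on a toy machine state (plain dicts, no Triton).
BASE_GOT = 0x10000000   # fake address range that no real code occupies


def my_malloc(state):
    print("[+] malloc hooked, size =", hex(state["rdi"]))
    return 0x555555559000          # pretend heap address


def my_free(state):
    print("[+] free hooked")
    return None                    # leave rax untouched


# fake address -> Python handler
hooks = {BASE_GOT + 0: my_malloc, BASE_GOT + 1: my_free}


def handle_hook(state, stack):
    """If rip is a fake GOT address, run the handler and emulate `ret`."""
    handler = hooks.get(state["rip"])
    if handler is None:
        return False
    ret_value = handler(state)
    if ret_value is not None:
        state["rax"] = ret_value   # the return value goes to rax
    state["rip"] = stack.pop()     # the return address sits on top of the stack
    return True


state = {"rip": BASE_GOT + 0, "rax": 0, "rdi": 0x10}
stack = [0x11FF]                   # pushed earlier by the `call` instruction
hooked = handle_hook(state, stack)
print(hooked, hex(state["rip"]), hex(state["rax"]))
```

In the real script the fake addresses are written into the GOT entries of the loaded binary, so the PLT stub's indirect jump is what moves rip to the fake address.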

gcc test.c -o test
#include<stdio.h>
#include<stdlib.h>

int main(){
    char* buf = malloc(0x10);

    read(0,buf,0x200);
    printf(buf);

    int fd = open(buf,0);
    read(fd ,buf, 0x10);
    write(1 ,buf, 0x10);

    free(buf);
    free(buf);
    return 0;
}
  • The program contains several bugs: a heap overflow, a format string bug, and a double free

The instrumentation script:

from triton import *
import lief

BASE_GOT = 0x10000000
BASE_ARGV = 0x20000000
BASE_STACK = 0x7ffffffde000

heap_addr = 0x555555559000
heap_size = 0
is_free = False

def hookingHandler(ctx):
    pc = ctx.getConcreteRegisterValue(ctx.registers.rip)
    for rel in customRelocation:
        if rel[2] == pc:
            ret_value = rel[1](ctx)  # call the hook function and fetch its return value
            if ret_value is not None:
                ctx.setConcreteRegisterValue(ctx.registers.rax, ret_value)  # the return value goes to rax
            ret_addr = ctx.getConcreteMemoryValue(MemoryAccess(ctx.getConcreteRegisterValue(ctx.registers.rsp), CPUSIZE.QWORD))  # the return address is on top of the stack
            ctx.setConcreteRegisterValue(ctx.registers.rip, ret_addr)  # set rip
            ctx.setConcreteRegisterValue(ctx.registers.rsp, ctx.getConcreteRegisterValue(ctx.registers.rsp)+CPUSIZE.QWORD)  # adjust rsp
    return

def mymalloc(ctx):
    global heap_size
    print('[+] malloc hooked')
    heap_size = ctx.getConcreteRegisterValue(ctx.registers.rdi)
    return heap_addr

def myfree(ctx):
    global is_free
    print('[+] free hooked')
    if is_free:
        print('[-] free BUG')
    is_free = True
    return 0

def mylibc(ctx):
    print('[+] __libc_start_main hooked')
    return 0

def myopen(ctx):
    print("[+] open hooked")
    arg1 = ctx.getConcreteRegisterValue(ctx.registers.rdi)
    arg2 = ctx.getConcreteRegisterValue(ctx.registers.rsi)
    return 0

def myread(ctx):
    print("[+] read hooked")
    read_size = ctx.getConcreteRegisterValue(ctx.registers.rdx)
    if read_size > heap_size:
        print("[-] read BUG")
    return 0

def mywrite(ctx):
    print("[+] write hooked")
    write_size = ctx.getConcreteRegisterValue(ctx.registers.rdx)
    if write_size > heap_size:
        print("[-] write BUG")
    return 0

def myprintf(ctx):
    print("[+] printf hooked")
    return 0

customRelocation = [  # [symbol name, hook function, fake GOT address]
    ['__libc_start_main', mylibc, None],
    ['malloc', mymalloc, None],
    ['free', myfree, None],
    ['open', myopen, None],
    ['read', myread, None],
    ['write', mywrite, None],
    ['printf', myprintf, None],
]

def makeRelocation(ctx, binary):
    for pltIndex in range(len(customRelocation)):
        customRelocation[pltIndex][2] = BASE_GOT + pltIndex  # assign the custom GOT address

    relocations = [x for x in binary.pltgot_relocations]         # read the pltgot relocations
    relocations.extend([x for x in binary.dynamic_relocations])  # read the dynamic relocations

    for rel in relocations:
        symbolName = rel.symbol.name
        symbolRelo = rel.address
        for crel in customRelocation:
            if symbolName == crel[0]:
                print('Hooking %-10s:0x%x' %(symbolName,symbolRelo))
                ctx.setConcreteMemoryValue(MemoryAccess(symbolRelo, CPUSIZE.QWORD), crel[2])  # point the emulated GOT entry at the custom BASE_GOT-based address
                break
    return

def loadBinary(ctx, binary):
    phdrs = binary.segments  # read all segments
    for phdr in phdrs:
        size = phdr.physical_size
        vaddr = phdr.virtual_address
        print('[+] Loading 0x%06x - 0x%06x' %(vaddr, vaddr+size))
        ctx.setConcreteMemoryAreaValue(vaddr, list(phdr.content))
    return


if __name__ == '__main__':
    ctx = TritonContext(ARCH.X86_64)
    ctx.setMode(MODE.ALIGNED_MEMORY, True)
    ctx.setMode(MODE.CONSTANT_FOLDING, True)

    binary = lief.ELF.parse("./test")

    loadBinary(ctx, binary)
    makeRelocation(ctx, binary)
    ctx.setConcreteRegisterValue(ctx.registers.rbp, BASE_STACK)
    ctx.setConcreteRegisterValue(ctx.registers.rsp, BASE_STACK)

    pc = 0x11E9
    while pc:
        inst = Instruction()
        opcode = ctx.getConcreteMemoryAreaValue(pc, 16)

        inst.setOpcode(opcode)
        inst.setAddress(pc)
        ctx.processing(inst)
        print(str(inst))

        hookingHandler(ctx)  # dispatch hooks
        pc = ctx.getConcreteRegisterValue(ctx.registers.rip)

        if pc == 0x11ff:
            print("[+] rax: "+hex(ctx.getConcreteRegisterValue(ctx.registers.rax)))

The final result:

[+] Loading 0x000040 - 0x000318
[+] Loading 0x000318 - 0x000334
[+] Loading 0x000000 - 0x000710
[+] Loading 0x001000 - 0x001325
[+] Loading 0x002000 - 0x002150
[+] Loading 0x003d90 - 0x004010
[+] Loading 0x003da0 - 0x003f90
[+] Loading 0x000338 - 0x000358
[+] Loading 0x000358 - 0x00039c
[+] Loading 0x000338 - 0x000358
[+] Loading 0x002004 - 0x002048
[+] Loading 0x000000 - 0x000000
[+] Loading 0x003d90 - 0x004000
Hooking free
Hooking write
Hooking printf
Hooking read
Hooking malloc
Hooking open
Hooking __libc_start_main
0x11e9: endbr64
0x11ed: push rbp
0x11ee: mov rbp, rsp
0x11f1: sub rsp, 0x10
0x11f5: mov edi, 0x10
0x11fa: call 0x10e0
0x10e0: endbr64
0x10e4: bnd jmp qword ptr [rip + 0x2edd]
[+] malloc hooked
[+] rax: 0x555555559000
0x11ff: mov qword ptr [rbp - 8], rax
0x1203: mov rax, qword ptr [rbp - 8]
0x1207: mov edx, 0x200
0x120c: mov rsi, rax
0x120f: mov edi, 0
0x1214: mov eax, 0
0x1219: call 0x10d0
0x10d0: endbr64
0x10d4: bnd jmp qword ptr [rip + 0x2ee5]
[+] read hooked
[-] read BUG
0x121e: mov rax, qword ptr [rbp - 8]
0x1222: mov rdi, rax
0x1225: mov eax, 0
0x122a: call 0x10c0
0x10c0: endbr64
0x10c4: bnd jmp qword ptr [rip + 0x2eed]
[+] printf hooked
0x122f: mov rax, qword ptr [rbp - 8]
0x1233: mov esi, 0
0x1238: mov rdi, rax
0x123b: mov eax, 0
0x1240: call 0x10f0
0x10f0: endbr64
0x10f4: bnd jmp qword ptr [rip + 0x2ed5]
[+] open hooked
0x1245: mov dword ptr [rbp - 0xc], eax
0x1248: mov rcx, qword ptr [rbp - 8]
0x124c: mov eax, dword ptr [rbp - 0xc]
0x124f: mov edx, 0x10
0x1254: mov rsi, rcx
0x1257: mov edi, eax
0x1259: mov eax, 0
0x125e: call 0x10d0
0x10d0: endbr64
0x10d4: bnd jmp qword ptr [rip + 0x2ee5]
[+] read hooked
0x1263: mov rax, qword ptr [rbp - 8]
0x1267: mov edx, 0x10
0x126c: mov rsi, rax
0x126f: mov edi, 1
0x1274: mov eax, 0
0x1279: call 0x10b0
0x10b0: endbr64
0x10b4: bnd jmp qword ptr [rip + 0x2ef5]
[+] write hooked
0x127e: mov rax, qword ptr [rbp - 8]
0x1282: mov rdi, rax
0x1285: call 0x10a0
0x10a0: endbr64
0x10a4: bnd jmp qword ptr [rip + 0x2efd]
[+] free hooked
0x128a: mov rax, qword ptr [rbp - 8]
0x128e: mov rdi, rax
0x1291: call 0x10a0
0x10a0: endbr64
0x10a4: bnd jmp qword ptr [rip + 0x2efd]
[+] free hooked
[-] free BUG
0x1296: mov eax, 0
0x129b: leave
0x129c: ret

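Symbolic Execution with Triton

Symbolic execution is a program analysis technique that uses symbolic values in place of concrete ones to explore the possible execution paths of a program. It helps an analyst or an automated tool understand the program's behavior and uncover latent bugs or security vulnerabilities.

The symbolic execution script: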

from triton import *
import lief
import os

def AnalysisBinary(path):
    binary = lief.ELF.parse(path)  # parse the binary with lief

    sections = binary.sections
    for section in sections:  # walk all sections
        name = section.name
        size = section.size
        vaddr = section.virtual_address
        if name != "":
            print('[+] %-28s 0x%06x - 0x%06x' % (name+":", vaddr, vaddr+size))
            data = bytes(section.content[0:0x20])
            for i in range(len(data)):
                print(f"{data[i]:02x}", end=" ")
                if i % 0x10 == 0xf: print()
            if len(data) % 0x10 != 0: print()  # finish a partial hex line (also safe for empty sections)
            print("-------------------------------------------------")
            ctx.setConcreteMemoryAreaValue(vaddr, bytes(section.content))  # load the section

    symbols = binary.symbols  # read all symbols
    for symbol in symbols:
        if symbol.name == 'main':
            main_address = symbol.value
            print()
            print('[+] Address of main function: 0x{:x}'.format(main_address))
            print()
            break
    else:
        print("'main' function not found")

if __name__ == '__main__':
    ctx = TritonContext()
    ctx.setArchitecture(ARCH.X86_64)
    ast = ctx.getAstContext()

    AnalysisBinary(os.path.join(os.path.dirname(__file__), './test'))

    start = 0x401DE0  # an address shortly before the key check function (taken from IDA)
    RBP_ADDR = 0x7ffffffde000
    RSP_ADDR = RBP_ADDR - 0x21000

    ctx.setAstRepresentationMode(AST_REPRESENTATION.PYTHON)

    ctx.setConcreteRegisterValue(ctx.registers.rip, start)
    ctx.setConcreteRegisterValue(ctx.registers.rsp, RSP_ADDR)
    ctx.setConcreteRegisterValue(ctx.registers.rbp, RBP_ADDR)

    for i in range(5):
        input_addr = ctx.getConcreteRegisterValue(ctx.registers.rbp) - 0x20
        ctx.setConcreteMemoryValue(MemoryAccess(input_addr + i, CPUSIZE.BYTE), ord('a'))
        ctx.symbolizeMemory(MemoryAccess(input_addr + i, CPUSIZE.BYTE))  # turn the byte at this address into a symbolic variable

    pc = start

    while pc:
        inst = Instruction()
        opcode = ctx.getConcreteMemoryAreaValue(pc, 16)

        inst.setOpcode(opcode)
        inst.setAddress(pc)
        ctx.processing(inst)

        print(str(inst))
        if inst.getAddress() == 0x401DEC:
            rdata = ctx.getRegisterAst(ctx.registers.rax)  # AST of rax, usable in the constraints below

            sv1 = ast.variable(ctx.getSymbolicVariable(0))  # fetch the symbolic variables created above
            sv2 = ast.variable(ctx.getSymbolicVariable(1))
            sv3 = ast.variable(ctx.getSymbolicVariable(2))
            sv4 = ast.variable(ctx.getSymbolicVariable(3))
            sv5 = ast.variable(ctx.getSymbolicVariable(4))

            cstr = ast.land([rdata == 0xAD6D]  # build the constraints
                + [sv1 >= ord(b'A'), sv1 <= ord(b'z')]
                + [sv2 >= ord(b'A'), sv2 <= ord(b'z')]
                + [sv3 >= ord(b'A'), sv3 <= ord(b'z')]
                + [sv4 >= ord(b'A'), sv4 <= ord(b'z')]
                + [sv5 >= ord(b'A'), sv5 <= ord(b'z')]
            )

            model = ctx.getModel(cstr)  # solve the constraints
            answer = ""
            for k, v in sorted(model.items()):
                value = v.getValue()
                answer += chr(value)

            if len(answer)==5:
                print("answer: {}".format(answer))

            break

        pc = ctx.getConcreteRegisterValue(ctx.registers.rip)

The test program:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

char *serial = "\x31\x3e\x3d\x26\x31";

int check(char *ptr) {
    int i;
    int hash = 0xABCD;

    for (i = 0; ptr[i]; i++)
        hash += ptr[i] ^ serial[i % 5];

    return hash;
}

int main() {
    int ret;
    char buf[0x10];
    read(0, buf, 5); // e.g. TyrbP

    ret = check(buf);
    if (ret == 0xad6d)
        printf("Win\n");
    else
        printf("fail\n");

    return 0;
}

First, use IDA to locate the start address 0x401DE0:

.text:0000000000401DE0 48 8D 45 E0                   lea     rax, [rbp+var_20]
.text:0000000000401DE4 48 89 C7 mov rdi, rax
.text:0000000000401DE7 E8 39 FF FF FF call check
.text:0000000000401DE7
.text:0000000000401DEC 89 45 DC mov [rbp+var_24], eax
.text:0000000000401DEF 81 7D DC 6D AD 00 00 cmp [rbp+var_24], 0AD6Dh
.text:0000000000401DF6 75 11 jnz short loc_401E09
  • The program prints Win as soon as check returns 0xAD6D, and this is what the constraint is built on
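
To make the constraint concrete, the check function can be re-implemented in plain Python and the sample input from the C source verified against it. The search below is a small dynamic-programming pass over partial sums, used here as a stand-in for the SMT solver; it is not what Triton does internally, and find_input is a hypothetical helper name.

```python
SERIAL = bytes([0x31, 0x3E, 0x3D, 0x26, 0x31])
TARGET = 0xAD6D


def check(data: bytes) -> int:
    """Python port of the C check(): hash = 0xABCD + sum(byte ^ serial)."""
    h = 0xABCD
    for i, b in enumerate(data):
        if b == 0:                 # the C loop stops at the first NUL byte
            break
        h += b ^ SERIAL[i % 5]
    return h


def find_input(length=5, lo=ord("A"), hi=ord("z")):
    """Search one input of `length` bytes in [lo, hi] with check() == TARGET."""
    need = TARGET - 0xABCD
    reachable = {0: b""}           # partial sum -> one prefix that produces it
    for i in range(length):
        nxt = {}
        for total, prefix in reachable.items():
            for b in range(lo, hi + 1):
                s = total + (b ^ SERIAL[i % 5])
                if s <= need and s not in nxt:
                    nxt[s] = prefix + bytes([b])
        reachable = nxt
    return reachable.get(need)


print(hex(check(b"TyrbP")))        # the sample input from the C source
solution = find_input()
print(solution, hex(check(solution)))
```

Many inputs satisfy the constraint, so the answer printed by a solver does not have to match the sample input in the C source.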

Solving a CTF Challenge with Triton

The challenge is alexctf-2017-re2-cpp-is-awesome

The core encryption logic:

i = 0;
flag[0] = std::string::begin(string);
while ( 1 )
{
  v13 = std::string::end(string);
  if ( !sub_400D3D((__int64)flag, (__int64)&v13) )
    break;
  v8 = *(unsigned __int8 *)get_char((__int64)flag);
  if ( (_BYTE)v8 != key[index[i]] )
    fail((__int64)flag, (__int64)&v13, v8);
  ++i;
  sub_400D7A(flag);
}
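  • This is really just a simple transposition

This challenge calls for symbolic execution. First, use IDA to analyze the code fragment that decides the program's control flow: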

.text:0000000000400C4B 48 8D 45 A0                   lea     rax, [rbp+flag]
.text:0000000000400C4F 48 89 C7 mov rdi, rax
.text:0000000000400C52 E8 43 01 00 00 call get_char
.text:0000000000400C52
.text:0000000000400C57 0F B6 10 movzx edx, byte ptr [rax]
.text:0000000000400C5A 48 8B 0D 3F 14 20 00 mov rcx, cs:key
.text:0000000000400C61 8B 45 EC mov eax, [rbp+i]
.text:0000000000400C64 48 98 cdqe
.text:0000000000400C66 8B 04 85 C0 20 60 00 mov eax, index[rax*4]
.text:0000000000400C6D 48 98 cdqe
.text:0000000000400C6F 48 01 C8 add rax, rcx
.text:0000000000400C72 0F B6 00 movzx eax, byte ptr [rax]
.text:0000000000400C75 38 C2 cmp dl, al
.text:0000000000400C77 0F 95 C0 setnz al
.text:0000000000400C7A 84 C0 test al, al
.text:0000000000400C7C 74 05 jz short loc_400C83
.text:0000000000400C7C
.text:0000000000400C7E ; try {
.text:0000000000400C7E E8 D3 FE FF FF call fail
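  • The fragment runs from lea rax, [rbp+flag], which loads the address of flag, to test al, al, which decides whether the loop continues. The key constraint is zf == 1 (test al, al yields 0, which means cmp dl, al found the two bytes equal)

Based on this constraint, the solver script: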

from triton import *
import lief
import sys

def AnalysisBinary(path):
    binary = lief.ELF.parse(path)
    sections = binary.sections
    for section in sections:
        name = section.name
        size = section.size
        vaddr = section.virtual_address
        if name != "":
            print('[+] %-28s 0x%06x - 0x%06x' % (name+":", vaddr, vaddr+size))
            data = bytes(section.content[0:0x20])
            for i in range(len(data)):
                print(f"{data[i]:02x}", end=" ")
                if i % 0x10 == 0xf: print()
            if len(data) % 0x10 != 0: print()  # finish a partial hex line (also safe for empty sections)
            print("-------------------------------------------------")
            ctx.setConcreteMemoryAreaValue(vaddr, bytes(section.content))

if __name__ == '__main__':
    ctx = TritonContext(ARCH.X86_64)
    ctx.setMode(MODE.ALIGNED_MEMORY, True)
    ctx.setMode(MODE.ONLY_ON_SYMBOLIZED, True)

    AnalysisBinary("./test")

    start = 0x400c4b
    end = 0x400C7C
    input_addr = 0x002000  # address of the input string
    for i in range(31):
        ctx.setConcreteMemoryValue(MemoryAccess(input_addr + i, CPUSIZE.BYTE), 0x61)
        ctx.symbolizeMemory(MemoryAccess(input_addr + i, CPUSIZE.BYTE))  # make each input byte symbolic

    rbp = 0x7fffffffe460
    ctx.setConcreteRegisterValue(ctx.registers.rbp, rbp)
    ctx.setConcreteRegisterValue(ctx.registers.rip, start)
    ctx.setConcreteMemoryValue(MemoryAccess(rbp - 96, CPUSIZE.QWORD), input_addr)  # "rbp - 96" holds the address of the input string
    ctx.setConcreteMemoryValue(MemoryAccess(rbp - 20, CPUSIZE.DWORD), 0)  # "rbp - 20" holds the index i, starting at 0

    for count in range(31):
        pc = start

        while pc:  # emulate until the constrained branch is reached
            inst = Instruction()
            opcode = ctx.getConcreteMemoryAreaValue(pc, 16)
            inst.setOpcode(opcode)
            inst.setAddress(pc)
            ctx.processing(inst)
            pc = ctx.getConcreteRegisterValue(ctx.registers.rip)
            print(inst)
            if pc == end:
                print("------------------------------------------")
                break

        zf = ctx.getRegisterAst(ctx.registers.zf)
        ctx.pushPathConstraint(zf == 1)  # add the constraint

        ctx.setConcreteMemoryValue(MemoryAccess(rbp - 20, CPUSIZE.DWORD), count + 1)  # advance the index
        ctx.setConcreteMemoryValue(MemoryAccess(rbp - 96, CPUSIZE.QWORD), input_addr + count + 1)  # advance the input pointer (a QWORD, matching the initialization)

    mod = ctx.getModel(ctx.getPathPredicate())  # solve the accumulated constraints
    if not mod:
        print('[-] Failed')
        sys.exit(-1)

    flag = ""
    for k, v in sorted(mod.items()):
        ctx.setConcreteVariableValue(ctx.getSymbolicVariable(v.getId()), v.getValue())
        value = v.getValue()
        flag += chr(value)

    print("flag: {}".format(flag))
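  • The expected byte at each position depends neither on the input values nor on the order in which the bytes are checked, so a constraint can be added for each byte independently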
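
Because each position is checked against key[index[i]] on its own, the same result can be written down directly once key and index are known. The sketch below uses a made-up key and index table purely for illustration; the real challenge data is not reproduced here, and recover and passes are hypothetical helper names.

```python
# Hypothetical data, NOT the values from the real challenge binary.
key = "0123456789abcdef"
index = [15, 1, 10, 6]


def recover(key, index):
    """Each expected character is simply key[index[i]]."""
    return "".join(key[j] for j in index)


def passes(candidate, key, index):
    """Mirror of the per-character comparison in the decompiled loop."""
    return len(candidate) == len(index) and all(
        c == key[j] for c, j in zip(candidate, index)
    )


flag = recover(key, index)
print(flag, passes(flag, key, index))
```

Symbolic execution reaches the same answer without first extracting key and index from the binary, which is the convenience the Triton script buys.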

Installing and Using exrop

Install exrop:

git clone https://github.com/d4em0n/exrop
  • Finally, add export PYTHONPATH=/home/yhellow/Tools/exrop:$PYTHONPATH to .zshenv in the /home/yhellow directory

The first run of exrop may fail with the following error:

TypeError: Z3ToTriton::visit(): 'SymVar_0' AST node not supported yet

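After some digging, the error turns out to come from the Z3_OP_UNINTERPRETED constant, which has different values in the C++ bindings and the Python bindings.

As a result, Triton mistakes Z3_OP_UNINTERPRETED for Z3_OP_RECURSIVE. To work around this bug I chose to patch the Triton source: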

case Z3_OP_RECURSIVE: /* added: also accept Z3_OP_RECURSIVE */
case Z3_OP_UNINTERPRETED: {
  std::string name = function.name().str();

  node = this->astCtxt->getVariableNode(name);
  if (node == nullptr) {
    node = this->astCtxt->string(name);
  }

  break;
}

default:
  throw triton::exceptions::AstLifting("Z3ToTriton::visit(): '" + function.name().str() + "' AST node not supported yet");
  • Without the patch, Triton does not handle Z3_OP_RECURSIVE

The main purpose of exrop is to generate ROP chains automatically
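
Whatever tool builds it, a 64-bit ROP chain ends up as a sequence of 8-byte little-endian words placed on the stack: gadget addresses interleaved with the values those gadgets pop. The sketch below packs such a sequence by hand. The words are copied from the byte string printed by payload_str() in the sample output later in this section; they belong to one particular libc build with base address 0, the gadget comments are taken from the accompanying dump, and the remark about the "/bin/sh" address is an assumption.

```python
import struct

# Words taken from the sample exrop output in this article; they are
# offsets into one specific libc build and are NOT portable.
chain = [
    0x36174,    # pop rax ; ret
    0x3B,       # rax = 59, the execve syscall number
    0x23B6A,    # pop rdi ; ret
    0x1B45BD,   # rdi value (assumed to be the "/bin/sh" string in libc)
    0x2601F,    # pop rsi ; ret
    0x0,        # rsi = 0
    0x15FAE6,   # pop rdx ; pop rbx ; ret
    0x0,        # rdx = 0
    0x0,        # rbx filler
    0x630A9,    # syscall ; ret
]

# Each entry becomes one little-endian 64-bit stack slot.
payload = b"".join(struct.pack("<Q", word) for word in chain)
print(len(payload), payload[:8])
```

Against a live target every gadget address would additionally need the runtime libc base added, which is what set_base_addr is for in the exrop script.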

An exrop usage example:

from pwn import *
import time
from Exrop import Exrop

binname = "/lib/x86_64-linux-gnu/libc.so.6"
libc = ELF(binname, checksec=False)
binsh = next(libc.search(b"/bin/sh"))
open_libc = libc.symbols['open']
read_libc = libc.symbols['read']
write_libc = libc.symbols['write']
bss = libc.bss()

t = time.mktime(time.gmtime())
rop = Exrop(binname)
rop.find_gadgets(cache=True)

print("execve('/bin/sh', 0, 0)")
chain = rop.syscall(59, ("/bin/sh", 0, 0), bss)  # the string is written to the address given as the 3rd argument
chain.set_base_addr(0)
chain.dump()

print("execve('/bin/sh', 0, 0)")
chain = rop.syscall(59, (binsh, 0, 0))  # without the 3rd argument a string cannot be written, so pass an address instead
chain.set_base_addr(0)      # set the base address
print(chain.payload_str())  # convert the ROP chain to a bytes string

print("open('/etc/passwd', 0)")
chain = rop.func_call(open_libc, ("/etc/passwd", 0), bss)
chain.set_base_addr(0)
chain.dump()

print("read('rax', bss, 0x100)")
chain = rop.func_call(read_libc, ('rax', bss, 0x100))  # a register name can be used directly as an argument
chain.set_base_addr(0)
chain.dump()

print("write(1, bss, 0x100)")
chain = rop.func_call(write_libc, (1, bss, 0x100))
chain.set_base_addr(0)
chain.dump()

print("done in {}s".format(time.mktime(time.gmtime()) - t))
  • The result:
execve('/bin/sh', 0, 0)
$RSP+0x0000 : 0x0000000000036174 # pop rax ; ret
$RSP+0x0008 : 0x00000000001ed7a0
$RSP+0x0010 : 0x0000000000023b6a # pop rdi ; ret
$RSP+0x0018 : 0x0068732f6e69622f
$RSP+0x0020 : 0x000000000009a0cf # mov qword ptr [rax], rdi ; ret
$RSP+0x0028 : 0x0000000000036174 # pop rax ; ret
$RSP+0x0030 : 0x000000000000003b
$RSP+0x0038 : 0x0000000000023b6a # pop rdi ; ret
$RSP+0x0040 : 0x00000000001ed7a0
$RSP+0x0048 : 0x000000000002601f # pop rsi ; ret
$RSP+0x0050 : 0x0000000000000000
$RSP+0x0058 : 0x000000000015fae6 # pop rdx ; pop rbx ; ret
$RSP+0x0060 : 0x0000000000000000
$RSP+0x0068 : 0x0000000000000000
$RSP+0x0070 : 0x00000000000630a9 # syscall ; ret

execve('/bin/sh', 0, 0)
b'ta\x03\x00\x00\x00\x00\x00;\x00\x00\x00\x00\x00\x00\x00j;\x02\x00\x00\x00\x00\x00\xbdE\x1b\x00\x00\x00\x00\x00\x1f`\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe6\xfa\x15\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xa90\x06\x00\x00\x00\x00\x00'
open('/etc/passwd', 0)
$RSP+0x0000 : 0x0000000000036174 # pop rax ; ret
$RSP+0x0008 : 0x00000000001ed7a0
$RSP+0x0010 : 0x0000000000023b6a # pop rdi ; ret
$RSP+0x0018 : 0x7361702f6374652f
$RSP+0x0020 : 0x000000000009a0cf # mov qword ptr [rax], rdi ; ret
$RSP+0x0028 : 0x0000000000036174 # pop rax ; ret
$RSP+0x0030 : 0x00000000001ed7a8
$RSP+0x0038 : 0x0000000000023b6a # pop rdi ; ret
$RSP+0x0040 : 0x0000000000647773
$RSP+0x0048 : 0x000000000009a0cf # mov qword ptr [rax], rdi ; ret
$RSP+0x0050 : 0x0000000000023b6a # pop rdi ; ret
$RSP+0x0058 : 0x00000000001ed7a0
$RSP+0x0060 : 0x000000000002601f # pop rsi ; ret
$RSP+0x0068 : 0x0000000000000000
$RSP+0x0070 : 0x000000000010df00

read('rax', bss, 0x100)
$RSP+0x0000 : 0x000000000015fae6 # pop rdx ; pop rbx ; ret
$RSP+0x0008 : 0x0000000000000000
$RSP+0x0010 : 0x0000000000023b6a
$RSP+0x0018 : 0x0000000000044808 # mov r13, rax ; mov rdi, r12 ; call rbx: next -> (0x00023b6a) # pop rdi ; ret
$RSP+0x0020 : 0x0000000000036174 # pop rax ; ret
$RSP+0x0028 : 0x000000000002601f
$RSP+0x0030 : 0x0000000000045872 # mov rdi, r13 ; call rax: next -> (0x0002601f) # pop rsi ; ret
$RSP+0x0038 : 0x000000000002601f # pop rsi ; ret
$RSP+0x0040 : 0x00000000001ed7a0
$RSP+0x0048 : 0x000000000015fae6 # pop rdx ; pop rbx ; ret
$RSP+0x0050 : 0x0000000000000100
$RSP+0x0058 : 0x0000000000000000
$RSP+0x0060 : 0x000000000010e1e0

write(1, bss, 0x100)
$RSP+0x0000 : 0x0000000000023b6a # pop rdi ; ret
$RSP+0x0008 : 0x0000000000000001
$RSP+0x0010 : 0x000000000002601f # pop rsi ; ret
$RSP+0x0018 : 0x00000000001ed7a0
$RSP+0x0020 : 0x000000000015fae6 # pop rdx ; pop rbx ; ret
$RSP+0x0028 : 0x0000000000000100
$RSP+0x0030 : 0x0000000000000000
$RSP+0x0038 : 0x000000000010e280

done in 2.0s
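
The quadwords that follow `pop rdi ; ret` in the dumps above (0x0068732f6e69622f, 0x7361702f6374652f, 0x0000000000647773) appear to be the string arguments cut into 8-byte little-endian chunks, which the chain stores at bss with one `mov qword ptr [rax], rdi` per chunk. The sketch below reproduces that encoding in plain Python and also splits a `payload_str()` result back into quadwords. `string_to_qwords` and `payload_to_qwords` are hypothetical helper names used only for illustration; they are not part of exrop.

```python
import struct

def string_to_qwords(s: bytes) -> list:
    """Split a NUL-terminated string into 8-byte little-endian integers."""
    data = s + b"\x00"                   # C strings need a terminator
    data += b"\x00" * (-len(data) % 8)   # pad up to a multiple of 8 bytes
    return [struct.unpack_from("<Q", data, off)[0]
            for off in range(0, len(data), 8)]

def payload_to_qwords(payload: bytes) -> list:
    """Split a raw ROP payload into the 64-bit values that land on the stack."""
    if len(payload) % 8:
        raise ValueError("payload length must be a multiple of 8")
    return [struct.unpack_from("<Q", payload, off)[0]
            for off in range(0, len(payload), 8)]

for text in (b"/bin/sh", b"/etc/passwd"):
    print(text, [hex(q) for q in string_to_qwords(text)])

# First two quadwords of the second execve payload printed above.
head = b"ta\x03\x00\x00\x00\x00\x00;\x00\x00\x00\x00\x00\x00\x00"
print([hex(q) for q in payload_to_qwords(head)])
```

The two values decoded from `head` are 0x36174 (the `pop rax ; ret` gadget) and 0x3b (59, the execve syscall number), which matches the start of the first dump once the string-writing prologue is skipped.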

Without the cache, rop.find_gadgets takes a very long time to run (especially on libc.so.6). The following change speeds it up:

def parseRopGadget(filename, opt=""):
    from subprocess import Popen, PIPE, STDOUT
    import re

    cmd = ['ROPgadget', '--binary', filename, '--multibr', '--only',
           'pop|xchg|add|sub|xor|mov|ret|jmp|call|syscall|leave', '--dump']
    if opt:
        cmd.append(opt)
    process = Popen(cmd, stdout=PIPE, stderr=STDOUT)
    stdout, _ = process.communicate()
    ......
  • The parseRopGadget function in Exrop.py calls ROPgadget; dropping instruction types that are not needed from the --only filter speeds the run up considerably
cmd = ['ROPgadget', '--binary', filename, '--multibr', '--only',
'pop|mov|ret|call|syscall', '--dump']
  • Output after the change:
done in 28.0s
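
One way to experiment with this trade-off without editing the string by hand is to build the command from a list of mnemonics. The sketch below is only an illustration: `build_ropgadget_cmd` and the two tuple names are hypothetical, and it uses only the ROPgadget flags that already appear in the snippet above. It builds the argument list and does not run ROPgadget.

```python
# Mnemonic sets taken from the two --only filters shown above.
DEFAULT_MNEMONICS = ("pop", "xchg", "add", "sub", "xor", "mov",
                     "ret", "jmp", "call", "syscall", "leave")
FAST_MNEMONICS = ("pop", "mov", "ret", "call", "syscall")

def build_ropgadget_cmd(filename, mnemonics=DEFAULT_MNEMONICS, opt=""):
    """Return the ROPgadget argument list for the given mnemonic filter."""
    cmd = ["ROPgadget", "--binary", filename, "--multibr",
           "--only", "|".join(mnemonics), "--dump"]
    if opt:
        cmd.append(opt)
    return cmd

print(build_ropgadget_cmd("/lib/x86_64-linux-gnu/libc.so.6", FAST_MNEMONICS))
```

A narrower filter means fewer candidate gadgets to analyze, so chains that depend on the removed instruction types (for example xchg or add) may no longer be found; it is worth confirming that the chains you need are still generated.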

Installing and using qsynthesis

git clone https://github.com/quarkslab/qsynthesis.git
cd qsynthesis
pip3 install '.[all]'

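QSynthesis is a Python 3 API for I/O-based program synthesis of bitvector expressions, intended to facilitate code deobfuscation

  • The algorithm is a grey-box approach that combines black-box I/O-based synthesis with a white-box AST search to synthesize sub-expressions

The core synthesis is built on the Triton symbolic engine, and the whole framework is built on top of it. It provides the following features:

  • Synthesis of bitvector expressions
  • Checking the semantic equivalence of synthesized expressions via SMT
  • Synthesizing constants (when the expression encodes a constant)
  • Improving oracles (precomputed tables) over time through a learning mechanism
  • Reassembling synthesized expressions back into assembly
  • Serving oracles through a REST API to make synthesis easier to use
  • An IDA plugin that integrates the synthesis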
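
To make the black-box half of the idea concrete, here is a toy sketch of an I/O oracle in plain Python: each candidate expression is evaluated on a fixed set of random inputs, the resulting output vector is used as a lookup key, and an unknown expression is "synthesized" by computing its own output vector and looking it up. This is not QSynthesis's implementation; the candidate set, `output_vector`, and `lookup` are all assumptions made for illustration.

```python
import random

MASK = (1 << 64) - 1                      # 64-bit bitvector arithmetic
rng = random.Random(0)                    # fixed seed for reproducibility
INPUTS = [(rng.getrandbits(64), rng.getrandbits(64)) for _ in range(16)]

# A tiny "grammar" of candidate expressions over two variables.
CANDIDATES = {
    "x + y": lambda x, y: x + y,
    "x - y": lambda x, y: x - y,
    "x * y": lambda x, y: x * y,
    "x ^ y": lambda x, y: x ^ y,
    "x & y": lambda x, y: x & y,
    "x | y": lambda x, y: x | y,
}

def output_vector(func):
    """Evaluate func on every sample input, truncated to 64 bits."""
    return tuple(func(x, y) & MASK for x, y in INPUTS)

# The precomputed table: output vector -> simplest known expression.
ORACLE = {output_vector(f): name for name, f in CANDIDATES.items()}

def lookup(func):
    """Return a candidate with identical I/O behaviour, or None."""
    return ORACLE.get(output_vector(func))

# A mixed boolean-arithmetic encoding of x + y.
obfuscated = lambda x, y: (x ^ y) + 2 * (x & y)
print(lookup(obfuscated))                 # x + y
print(lookup(lambda x, y: x * x + y))     # None (not in the table)
```

Matching on a finite set of inputs can only suggest equivalence, which is why a separate SMT check of the candidate, as listed among the features above, is useful.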

Generate an oracle table with the bundled tool:

qsynthesis-table-manager generate -bs 64 --var-num 3 --input-num 16 --random-level 5 --ops AND,NEG,MUL,XOR,NOT --watchdog 80 --limit 5000000 my_oracle_table

An example of using qsynthesis:

import logging
import pkg_resources

from triton import ARCH

import qsynthesis
from qsynthesis import SimpleSymExec, TopDownSynthesizer, InputOutputOracleLevelDB

logging.basicConfig(level=logging.DEBUG)

qsynthesis.enable_logging()

RIP_ADDR = 0x40B160
RSP_ADDR = 0x800000

INSTRUCTIONS = [b'U', b'H\x89\xe5', b'H\x89}\xf8', b'H\x89u\xf0', b'H\x89U\xe8', b'H\x89M\xe0', b'L\x89E\xd8',
b'H\x8bE\xf0', b'H#E\xe0', b'H\x89\xc2', b'H\x8bE\xf0', b'H\x0bE\xe0', b'H\x0f\xaf\xd0', b'H\x8bE\xe0',
b'H\xf7\xd0', b'H#E\xf0', b'H\x89\xc1', b'H\x8bE\xf0', b'H\xf7\xd0', b'H#E\xe0', b'H\x0f\xaf\xc1',
b'H\x01\xc2', b'H\x8bE\xe0', b'H\x0f\xaf\xc0', b'H\x89\xd6', b'H!\xc6', b'H\x8bE\xf0', b'H#E\xe0',
b'H\x89\xc2', b'H\x8bE\xf0', b'H\x0bE\xe0', b'H\x0f\xaf\xd0', b'H\x8bE\xe0', b'H\xf7\xd0', b'H#E\xf0',
b'H\x89\xc1', b'H\x8bE\xf0', b'H\xf7\xd0', b'H#E\xe0', b'H\x0f\xaf\xc1', b'H\x01\xc2', b'H\x8bE\xe0',
b'H\x0f\xaf\xc0', b'H\t\xd0', b'H)\xc6', b'H\x89\xf0', b'H\x83\xe8\x01', b'H3E\xf0', b'H\x89\xc2',
b'H\x8bE\xf0', b'H#E\xe0', b'H\x89\xc1', b'H\x8bE\xf0', b'H\x0bE\xe0', b'H\x0f\xaf\xc8', b'H\x8bE\xe0',
b'H\xf7\xd0', b'H#E\xf0', b'H\x89\xc6', b'H\x8bE\xf0', b'H\xf7\xd0', b'H#E\xe0', b'H\x0f\xaf\xc6',
b'H\x01\xc1', b'H\x8bE\xe0', b'H\x0f\xaf\xc0', b'H1\xc8', b'H#E\xf0', b'H\x01\xc0', b'H)\xc2',
b'H\x89\xd0', b']', b'\xc3']

qsynthesis_version = pkg_resources.get_distribution("qsynthesis").version # read the installed qsynthesis version
print(f"The version of qsynthesis is: {qsynthesis_version}")

def test(oracle_file):
    # Perform symbolic execution of the instructions
    symexec = SimpleSymExec(ARCH.X86_64)  # initialize with the target architecture
    symexec.initialize_register('rip', RIP_ADDR)  # initialize the registers
    symexec.initialize_register('rsp', RSP_ADDR)
    for opcode in INSTRUCTIONS:
        symexec.execute(opcode)  # execute the given opcode
    rax = symexec.get_register_ast("rax")  # retrieve the rax AST after execution

    # Load lookup tables
    ltm = InputOutputOracleLevelDB.load(oracle_file)  # load the lookup-table database

    # Perform Synthesis of the expression
    synthesizer = TopDownSynthesizer(ltm)  # instantiate the synthesizer with the table
    synt_rax, simp = synthesizer.synthesize(rax)  # trigger synthesis of the rax expression

    # Print synthesis results
    print(f"simplified: {simp}")
    print(f"synthesized expression: {synt_rax.pp_str}")
    sz, nsz = rax.node_count, synt_rax.node_count
    print(f"size: {sz} -> {nsz}\nsize reduction:{((sz-nsz)*100)/sz:.2f}%")
    return symexec, rax, synt_rax

if __name__ == "__main__":
    sx, rax, srax = test("my_oracle_table/")
  • The output is as follows:
The version of qsynthesis is: 0.2.0
DEBUG:qsynthesis:try synthesis lookup: (((((((((((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi))))) & ((rcx * rcx))) - (((rcx * rcx)) | (((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi)))))))) - 0x1)) ^ rsi) - ((((((rcx * rcx)) ^ (((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi)))))) & rsi) + ((((rcx * rcx)) ^ (((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi)))))) & rsi))))) [2]
DEBUG:qsynthesis:try synthesis lookup: (((((((((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi))))) & ((rcx * rcx))) - (((rcx * rcx)) | (((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi)))))))) - 0x1)) ^ rsi) [2]
DEBUG:qsynthesis:[base] candidate expr accepted: current:47 candidate:10 (best:0) => ((((rcx * rsi)) ^ (~(rsi))) ^ ((rcx * rcx)))
DEBUG:qsynthesis:Replace: (((((((((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi))))) & ((rcx * rcx))) - (((rcx * rcx)) | (((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi)))))))) - 0x1)) ^ rsi) ===> ((((rcx * rsi)) ^ (~(rsi))) ^ ((rcx * rcx)))
DEBUG:qsynthesis:try synthesis lookup: ((((((rcx * rcx)) ^ (((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi)))))) & rsi) + ((((rcx * rcx)) ^ (((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi)))))) & rsi))) [2]
DEBUG:qsynthesis:try synthesis lookup: ((((rcx * rcx)) ^ (((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi)))))) & rsi) [2]
DEBUG:qsynthesis:[base] candidate expr accepted: current:23 candidate:11 (best:0) => ((((rsi * rcx)) & rsi) ^ (((rcx * rcx)) & rsi))
DEBUG:qsynthesis:Replace: ((((rcx * rcx)) ^ (((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi)))))) & rsi) ===> ((((rsi * rcx)) & rsi) ^ (((rcx * rcx)) & rsi))
DEBUG:qsynthesis:try synthesis lookup: ((((rcx * rcx)) ^ (((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi)))))) & rsi) [2]
DEBUG:qsynthesis:expression cache found !
DEBUG:qsynthesis:Replace: ((((rcx * rcx)) ^ (((((rsi & rcx) * (rsi | rcx))) + ((((~(rsi)) & rcx) * ((~(rcx)) & rsi)))))) & rsi) ===> ((((rsi * rcx)) & rsi) ^ (((rcx * rcx)) & rsi))
simplified: True
synthesized expression: ((((((rcx * rsi)) ^ (~(rsi))) ^ ((rcx * rcx))) - ((((((rsi * rcx)) & rsi) ^ (((rcx * rcx)) & rsi)) + ((((rsi * rcx)) & rsi) ^ (((rcx * rcx)) & rsi))))))
size: 95 -> 34
size reduction:64.21%
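
As a quick sanity check that does not need Triton or an SMT solver, the original expression from the first DEBUG line and the synthesized expression can be transcribed into plain Python and compared on random 64-bit inputs. This is only a sketch: the function names are hypothetical, the transcription is done by hand from the log above, and random testing gives evidence rather than a proof of equivalence.

```python
import random

MASK = (1 << 64) - 1  # registers are 64 bits wide

def obfuscated(rsi, rcx):
    """Hand transcription of the expression in the first DEBUG line."""
    a = ((rsi & rcx) * (rsi | rcx) + (~rsi & rcx) * (~rcx & rsi)) & MASK
    b = (rcx * rcx) & MASK
    left = (((a & b) - (b | a) - 1) ^ rsi) & MASK
    right = (((b ^ a) & rsi) + ((b ^ a) & rsi)) & MASK
    return (left - right) & MASK

def synthesized(rsi, rcx):
    """Hand transcription of the 'synthesized expression' line."""
    left = ((rcx * rsi) ^ ~rsi ^ (rcx * rcx)) & MASK
    term = (((rsi * rcx) & rsi) ^ ((rcx * rcx) & rsi)) & MASK
    return (left - (term + term)) & MASK

rng = random.Random(1234)
samples = [(0, 0), (MASK, MASK), (1, MASK)]
samples += [(rng.getrandbits(64), rng.getrandbits(64)) for _ in range(1000)]

mismatches = [(s, c) for s, c in samples if obfuscated(s, c) != synthesized(s, c)]
print(f"{len(samples)} samples checked, {len(mismatches)} mismatches")
```

Every intermediate value is reduced with `& MASK` because Python integers are unbounded, whereas the symbolic expressions operate on 64-bit bitvectors.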