Linux-Lab8-Networking

Networking

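Lab objectives:

  • Understand the Linux kernel networking architecture
  • Learn IP packet management skills using a packet filter or firewall
  • Become familiar with using sockets at the Linux kernel level

The growth of the Internet has led to an exponential increase in network applications, which in turn raises the speed and throughput demands on the operating system's networking subsystem

  • The networking subsystem is not an essential component of an operating system kernel (the Linux kernel can be compiled without networking support)
  • However, it is quite unlikely for a computing system (even an embedded device) to run an operating system without networking
  • Modern operating systems use the TCP/IP stack: the kernel implements the protocols up to the transport layer, while application-layer protocols (HTTP, FTP, SSH, etc.) are typically implemented in user space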

Networking in user space

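In user space, the abstraction of network communication is the socket

A socket abstracts a communication channel and is the interface for interacting with the kernel's TCP/IP stack (underneath, TCP/IP comes down to sending packets, and computers on a network interact by exchanging them; sockets simply make sending those packets easier)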
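
As a minimal user-space sketch of this abstraction, the snippet below asks the kernel for a socket and then reads the socket's type back through the same file descriptor. The helper name new_socket_type is made up for illustration; socket(), getsockopt() and SO_TYPE are the standard POSIX/Linux calls.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an IPv4 socket of the requested type
 * (SOCK_STREAM or SOCK_DGRAM) and return the type the kernel reports
 * for it through SO_TYPE, or -1 on error.
 */
int new_socket_type(int type)
{
	int reported = -1;
	socklen_t len = sizeof(reported);
	int fd = socket(AF_INET, type, 0);	/* the socket is just a file descriptor */

	if (fd < 0)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &reported, &len) < 0)
		reported = -1;
	else
		assert(len == sizeof(reported));	/* SO_TYPE is returned as an int */
	close(fd);
	return reported;
}
```

Calling new_socket_type(SOCK_DGRAM) should hand back SOCK_DGRAM: the application only ever holds an integer descriptor, while the kernel keeps the real state behind it.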

Networking in Linux kernel

The Linux kernel provides three basic structures for working with network packets:

struct socket

  • An abstraction very close to user space, i.e. the BSD sockets used to program network applications

struct socket {
	socket_state state;		/* current state of the socket */
	short type;			/* socket type */
	unsigned long flags;		/* flags */
	struct socket_wq *wq;		/* wait queue of processes waiting on the socket, plus the async notification list */
	struct file *file;		/* associated file */
	struct sock *sk;		/* associated sock */
	const struct proto_ops *ops;	/* protocol-specific set of operations */
};

  • The associated API is:

/* socket create/delete */
int sock_create(int family, int type, int protocol, struct socket **res); /* creates a struct socket; used when the socket() system call runs */
int sock_create_kern(struct net *net, int family, int type, int protocol, struct socket **res); /* creates a kernel socket */
int sock_create_lite(int family, int type, int protocol, struct socket **res); /* creates a kernel socket without parameter sanity checks */
void sock_release(struct socket *sock); /* closes the socket and releases the associated resources */

struct sock

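  • In Linux terminology this is the network-layer representation of a socket, also called an INET socket
  • The structure stores information about the state of a connection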
struct sock {
/*
* Now struct inet_timewait_sock also uses sock_common, so please just
* don't add nothing before this first member (__sk_common) --acme
*/
struct sock_common __sk_common;

......

socket_lock_t sk_lock;
atomic_t sk_drops;
int sk_rcvlowat;
struct sk_buff_head sk_error_queue;
struct sk_buff_head sk_receive_queue;
struct {
atomic_t rmem_alloc;
int len;
struct sk_buff *head;
struct sk_buff *tail;
} sk_backlog;
#define sk_rmem_alloc sk_backlog.rmem_alloc

int sk_forward_alloc;
#ifdef CONFIG_NET_RX_BUSY_POLL
unsigned int sk_ll_usec;
/* ===== mostly read cache line ===== */
unsigned int sk_napi_id;
#endif
int sk_rcvbuf;

struct sk_filter __rcu *sk_filter;
union {
struct socket_wq __rcu *sk_wq;
struct socket_wq *sk_wq_raw;
};
#ifdef CONFIG_XFRM
struct xfrm_policy __rcu *sk_policy[2];
#endif
struct dst_entry *sk_rx_dst;
struct dst_entry __rcu *sk_dst_cache;
atomic_t sk_omem_alloc;
int sk_sndbuf;

/* ===== cache line for TX ===== */
int sk_wmem_queued;
refcount_t sk_wmem_alloc;
unsigned long sk_tsq_flags;
union {
struct sk_buff *sk_send_head; /* list of sk_buffs queued for transmission */
struct rb_root tcp_rtx_queue;
};
struct sk_buff_head sk_write_queue;
__s32 sk_peek_off;
int sk_write_pending;
__u32 sk_dst_pending_confirm;
u32 sk_pacing_status; /* see enum sk_pacing */
long sk_sndtimeo;
struct timer_list sk_timer;
__u32 sk_priority;
__u32 sk_mark;
unsigned long sk_pacing_rate; /* bytes per second */
unsigned long sk_max_pacing_rate;
struct page_frag sk_frag;
netdev_features_t sk_route_caps;
netdev_features_t sk_route_nocaps;
netdev_features_t sk_route_forced_caps;
int sk_gso_type;
unsigned int sk_gso_max_size;
gfp_t sk_allocation;
__u32 sk_txhash;

/*
* Because of non atomicity rules, all
* changes are protected by socket lock.
*/
unsigned int __sk_flags_offset[0];

......

unsigned int sk_padding : 1,
sk_kern_sock : 1,
sk_no_check_tx : 1,
sk_no_check_rx : 1,
sk_userlocks : 4,
sk_protocol : 8, /* protocol used by the socket */
sk_type : 16; /* socket type (SOCK_STREAM, SOCK_DGRAM, etc.) */
#define SK_PROTOCOL_MAX U8_MAX
u16 sk_gso_max_segs;
u8 sk_pacing_shift;
unsigned long sk_lingertime;
struct proto *sk_prot_creator;
rwlock_t sk_callback_lock;
int sk_err,
sk_err_soft;
u32 sk_ack_backlog;
u32 sk_max_ack_backlog;
kuid_t sk_uid;
struct pid *sk_peer_pid;
const struct cred *sk_peer_cred;
long sk_rcvtimeo;
ktime_t sk_stamp;
#if BITS_PER_LONG==32
seqlock_t sk_stamp_seq;
#endif
u16 sk_tsflags;
u8 sk_shutdown;
u32 sk_tskey;
atomic_t sk_zckey;

u8 sk_clockid;
u8 sk_txtime_deadline_mode : 1,
sk_txtime_report_errors : 1,
sk_txtime_unused : 6;

struct socket *sk_socket; /* the BSD socket that holds this sock */
void *sk_user_data;
#ifdef CONFIG_SECURITY
void *sk_security;
#endif
struct sock_cgroup_data sk_cgrp_data;
struct mem_cgroup *sk_memcg;
void (*sk_state_change)(struct sock *sk);
void (*sk_data_ready)(struct sock *sk);
void (*sk_write_space)(struct sock *sk);
void (*sk_error_report)(struct sock *sk);
int (*sk_backlog_rcv)(struct sock *sk,
struct sk_buff *skb);
#ifdef CONFIG_SOCK_VALIDATE_XMIT
struct sk_buff* (*sk_validate_xmit_skb)(struct sock *sk,
struct net_device *dev,
struct sk_buff *skb);
#endif
void (*sk_destruct)(struct sock *sk);
struct sock_reuseport __rcu *sk_reuseport_cb;
struct rcu_head sk_rcu;
};
  • The related API is:

/* sending/receiving messages */
int sock_recvmsg(struct socket *sock, struct msghdr *msg, int flags); /* receives a message from a socket */
int sock_sendmsg(struct socket *sock, struct msghdr *msg); /* sends a message through a socket */
int kernel_recvmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec, size_t num, size_t size, int flags); /* receives a message from a socket into kernel-space buffers described by vec */
int kernel_sendmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec, size_t num, size_t size); /* sends a message through a socket from kernel-space buffers described by vec */

struct sk_buff

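  • Describes a network packet and its state
  • The structure is created when the kernel receives a packet, either from user space or from a network interface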
struct sk_buff {
union {
struct {
/* These two members must be first. */
struct sk_buff *next; /* links sk_buffs into a list */
struct sk_buff *prev;

union {
struct net_device *dev; /* network device that sends or receives the buffer */
unsigned long dev_scratch;
};
};
struct rb_node rbnode; /* used in netem, ip4 defrag, and tcp stack */
struct list_head list;
};

union {
struct sock *sk; /* sock associated with the buffer */
int ip_defrag_offset;
};

union {
ktime_t tstamp;
u64 skb_mstamp_ns; /* earliest departure time */
};
char cb[48] __aligned(8);
union {
struct {
unsigned long _skb_refdst;
void (*destructor)(struct sk_buff *skb);
};
struct list_head tcp_tsorted_anchor;
};

#ifdef CONFIG_XFRM
struct sec_path *sp;
#endif
#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
unsigned long _nfct;
#endif
#if IS_ENABLED(CONFIG_BRIDGE_NETFILTER)
struct nf_bridge_info *nf_bridge;
#endif
unsigned int len, data_len;
__u16 mac_len, hdr_len;
__u16 queue_mapping;

/* if you move cloned around you also must adapt those constants */
#ifdef __BIG_ENDIAN_BITFIELD
#define CLONED_MASK (1 << 7)
#else
#define CLONED_MASK 1
#endif
#define CLONED_OFFSET() offsetof(struct sk_buff, __cloned_offset)

__u8 __cloned_offset[0];
__u8 cloned:1,
nohdr:1,
fclone:2,
peeked:1,
head_frag:1,
xmit_more:1,
pfmemalloc:1;
/* private: */
__u32 headers_start[0];
/* public: */

/* if you move pkt_type around you also must adapt those constants */
#ifdef __BIG_ENDIAN_BITFIELD
#define PKT_TYPE_MAX (7 << 5)
#else
#define PKT_TYPE_MAX 7
#endif
#define PKT_TYPE_OFFSET() offsetof(struct sk_buff, __pkt_type_offset)

__u8 __pkt_type_offset[0];
__u8 pkt_type:3;
__u8 ignore_df:1;
__u8 nf_trace:1;
__u8 ip_summed:2;
__u8 ooo_okay:1;

__u8 l4_hash:1;
__u8 sw_hash:1;
__u8 wifi_acked_valid:1;
__u8 wifi_acked:1;
__u8 no_fcs:1;
/* Indicates the inner headers are valid in the skbuff. */
__u8 encapsulation:1;
__u8 encap_hdr_csum:1;
__u8 csum_valid:1;

__u8 csum_complete_sw:1;
__u8 csum_level:2;
__u8 csum_not_inet:1;
__u8 dst_pending_confirm:1;
#ifdef CONFIG_IPV6_NDISC_NODETYPE
__u8 ndisc_nodetype:2;
#endif
__u8 ipvs_property:1;

__u8 inner_protocol_type:1;
__u8 remcsum_offload:1;
#ifdef CONFIG_NET_SWITCHDEV
__u8 offload_fwd_mark:1;
__u8 offload_mr_fwd_mark:1;
#endif
#ifdef CONFIG_NET_CLS_ACT
__u8 tc_skip_classify:1;
__u8 tc_at_ingress:1;
__u8 tc_redirected:1;
__u8 tc_from_ingress:1;
#endif
#ifdef CONFIG_TLS_DEVICE
__u8 decrypted:1;
#endif

#ifdef CONFIG_NET_SCHED
__u16 tc_index; /* traffic control index */
#endif

union {
__wsum csum;
struct {
__u16 csum_start;
__u16 csum_offset;
};
};
__u32 priority;
int skb_iif;
__u32 hash;
__be16 vlan_proto;
__u16 vlan_tci;
#if defined(CONFIG_NET_RX_BUSY_POLL) || defined(CONFIG_XPS)
union {
unsigned int napi_id;
unsigned int sender_cpu;
};
#endif
#ifdef CONFIG_NETWORK_SECMARK
__u32 secmark;
#endif

union {
__u32 mark;
__u32 reserved_tailroom;
};

union {
__be16 inner_protocol;
__u8 inner_ipproto;
};

__u16 inner_transport_header;
__u16 inner_network_header;
__u16 inner_mac_header;

__be16 protocol;
__u16 transport_header;
__u16 network_header;
__u16 mac_header;

/* private: */
__u32 headers_end[0];
/* public: */

/* These elements must be at the end, see alloc_skb() for details. */
sk_buff_data_t tail;
sk_buff_data_t end;
unsigned char *head,
*data;
unsigned int truesize;
refcount_t users;
};

Conversions

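Different systems order the bytes within a word in different ways (endianness), including:

  • Big Endian
  • Little Endian

Because networks interconnect systems running on different platforms, the Internet has imposed a standard order for storing numeric data, called network byte order

  • Network byte order is big-endian
  • To convert between byte orders, we use the following macros:

u16 htons(u16 x) /* converts a 16-bit integer from host byte order to network byte order */
u32 htonl(u32 x) /* converts a 32-bit integer from host byte order to network byte order */
u16 ntohs(u16 x) /* converts a 16-bit integer from network byte order to host byte order */
u32 ntohl(u32 x) /* converts a 32-bit integer from network byte order to host byte order */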
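
To make the conversions concrete, here is a small user-space sketch using the libc functions of the same names from <arpa/inet.h> (the kernel macros behave the same way). The helper names first_wire_byte and port_round_trip are illustrative only.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Return the first byte in memory of a 16-bit value once it is in network order. */
uint8_t first_wire_byte(uint16_t host_value)
{
	uint16_t wire = htons(host_value);	/* host order -> network (big-endian) order */
	uint8_t bytes[2];

	assert(sizeof(bytes) == sizeof(wire));
	memcpy(bytes, &wire, sizeof(wire));
	return bytes[0];	/* big-endian: the most significant byte comes first */
}

/* Convert a port number to network order and back again. */
uint16_t port_round_trip(uint16_t port)
{
	return ntohs(htons(port));
}
```

Whatever the host's endianness, first_wire_byte(0x1234) yields 0x12, and a round trip through htons()/ntohs() returns the original value. This is why port numbers and addresses are always wrapped in these calls before being stored in a sockaddr_in.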

netfilter

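netfilter is the name of the kernel interface for capturing network packets in order to modify or analyze them (for filtering, NAT, etc.)

User space uses the netfilter interface through iptables

In the Linux kernel, packet capture with netfilter is done by attaching hooks:

  • Hooks can be placed, as needed, at different points along the path a network packet follows through the kernel
  • A diagram showing the route a packet follows and the possible hook locations can be found here

A hook is defined through the following structure: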

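typedef unsigned int nf_hookfn(void *priv,
			       struct sk_buff *skb,
			       const struct nf_hook_state *state);

struct nf_hook_ops {
	nf_hookfn *hook;		/* handler called when a network packet (delivered as a struct sk_buff) is captured */
	struct net_device *dev;		/* device (network interface) to capture on */
	void *priv;			/* private data passed to the handler */
	u_int8_t pf;			/* protocol family (PF_INET, etc.) */
	unsigned int hooknum;		/* hook number */
	int priority;			/* priority */
};

  • The hook function's signature includes a struct nf_hook_state, which describes the state of the hook invocation; its key fields are: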
struct nf_hook_state {
	unsigned int hook;		/* hook number */
	u_int8_t pf;			/* protocol family */
	struct net_device *in;		/* input interface */
	struct net_device *out;		/* output interface */
	struct sock *sk;		/* associated sock (INET socket) */
	struct net *net;		/* associated net (kernel network namespace) */
	int (*okfn)(struct net *, struct sock *, struct sk_buff *);
};

The related API is:

int nf_register_net_hook(struct net *net, const struct nf_hook_ops *ops); /* registers a hook */
void nf_unregister_net_hook(struct net *net, const struct nf_hook_ops *ops); /* unregisters a hook */

int nf_register_net_hooks(struct net *net, const struct nf_hook_ops *reg,
			  unsigned int n); /* calls nf_register_net_hook for each of the n entries; on failure, unregisters the ones already registered */
void nf_unregister_net_hooks(struct net *net, const struct nf_hook_ops *reg,
			     unsigned int n); /* calls nf_unregister_net_hook for each of the n entries */
int nf_register_net_hooks(struct net *net, const struct nf_hook_ops *reg,
			  unsigned int n)
{
	unsigned int i;
	int err = 0;

	for (i = 0; i < n; i++) {
		err = nf_register_net_hook(net, &reg[i]);
		if (err)
			goto err;
	}
	return err;

err:
	if (i > 0)
		nf_unregister_net_hooks(net, reg, i);
	return err;
}
EXPORT_SYMBOL(nf_register_net_hooks);

void nf_unregister_net_hooks(struct net *net, const struct nf_hook_ops *reg,
			     unsigned int hookcount)
{
	unsigned int i;

	for (i = 0; i < hookcount; i++)
		nf_unregister_net_hook(net, &reg[i]);
}
EXPORT_SYMBOL(nf_unregister_net_hooks);

A usage example:

#include <linux/module.h>
#include <linux/init.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
#include <linux/net.h>
#include <linux/in.h>
#include <linux/skbuff.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/udp.h>

static unsigned int my_nf_hookfn(void *priv,
				 struct sk_buff *skb,
				 const struct nf_hook_state *state)
{
	struct iphdr *iph = ip_hdr(skb);	/* IP header */

	/* iph->saddr - source IP address */
	/* iph->daddr - destination IP address */
	if (iph->protocol == IPPROTO_TCP) {
		struct tcphdr *tcph = tcp_hdr(skb);	/* TCP header */

		pr_info("TCP packet to port %u\n", ntohs(tcph->dest));
	} else if (iph->protocol == IPPROTO_UDP) {
		struct udphdr *udph = udp_hdr(skb);	/* UDP header */

		pr_info("UDP packet to port %u\n", ntohs(udph->dest));
	}

	return NF_ACCEPT;	/* let the packet continue on its path */
}

static struct nf_hook_ops my_nfho = {
	.hook     = my_nf_hookfn,
	.hooknum  = NF_INET_LOCAL_OUT,	/* locally generated outbound packets */
	.pf       = PF_INET,
	.priority = NF_IP_PRI_FIRST
};

static int __init my_hook_init(void)
{
	return nf_register_net_hook(&init_net, &my_nfho);
}

static void __exit my_hook_exit(void)
{
	nf_unregister_net_hook(&init_net, &my_nfho);
}

module_init(my_hook_init);
module_exit(my_hook_exit);
MODULE_LICENSE("GPL");

netcat

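When developing applications that include networking code, one of the most commonly used tools is netcat (also known as the "Swiss Army knife for TCP/IP"), which allows you to:

  • Initiate TCP connections
  • Wait for TCP connections
  • Send and receive UDP packets
  • Display traffic in hexdump format
  • Run a program after a connection is established (e.g. a shell)
  • Set special options in the packets sent

Exercises

To solve the exercises, you need to perform the following steps:

  • Prepare the skeletons from the templates
  • Build the modules
  • Copy the modules to the virtual machine
  • Start the VM and test the modules in the VM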
make clean
LABS=networking make skels
make build

1.netfilter:

  • Write a kernel module that prints the source address and port of TCP packets that initiate outbound connections
  • The destination address to filter on can be specified through the MY_IOCTL_FILTER_ADDRESS ioctl call

/*
 * SO2 - Networking Lab (#10)
 *
 * Exercise #1, #2: simple netfilter module
 *
 * Code skeleton.
 */

#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/uaccess.h>
#include <linux/fs.h>
#include <linux/cdev.h>
#include <linux/atomic.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
#include <linux/net.h>
#include <linux/in.h>
#include <linux/skbuff.h>
#include <linux/ip.h>
#include <linux/tcp.h>

#include "filter.h"	/* provides MY_MAJOR and MY_IOCTL_FILTER_ADDRESS */

MODULE_DESCRIPTION("Simple netfilter module");
MODULE_AUTHOR("SO2");
MODULE_LICENSE("GPL");

#define LOG_LEVEL	KERN_ALERT
#define MY_DEVICE	"filter"

static struct cdev my_cdev;
static atomic_t ioctl_set;		/* 1 once a filter address has been set */
static unsigned int ioctl_set_addr;	/* filter address, network byte order */

/* Decide whether a packet sent to dst_addr should be reported. */
static int test_daddr(unsigned int dst_addr)
{
	int ret = 0;

	/* TODO 2: if a filter address has been set, report only packets
	 * whose destination matches it; otherwise report every packet
	 */
	if (atomic_read(&ioctl_set) == 1)
		ret = (ioctl_set_addr == dst_addr);
	else
		ret = 1;
	return ret;
}

/* TODO 1: netfilter hook function */
static unsigned int my_nf_hookfn(void *priv,
				 struct sk_buff *skb,
				 const struct nf_hook_state *state)
{
	struct iphdr *iph = ip_hdr(skb);

	if (iph->protocol == IPPROTO_TCP && test_daddr(iph->daddr)) {
		struct tcphdr *tcph = tcp_hdr(skb);

		/* SYN set and ACK clear: first segment of a new connection */
		if (tcph->syn && !tcph->ack)
			printk(LOG_LEVEL "IP address is %pI4:%u\n",
			       &iph->saddr, ntohs(tcph->source));
	}

	return NF_ACCEPT;
}

static int my_open(struct inode *inode, struct file *file)
{
	return 0;
}

static int my_close(struct inode *inode, struct file *file)
{
	return 0;
}

static long my_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
	switch (cmd) {
	case MY_IOCTL_FILTER_ADDRESS:
		/* TODO 2: set filter address from arg */
		if (copy_from_user(&ioctl_set_addr, (void __user *)arg,
				   sizeof(ioctl_set_addr)))
			return -EFAULT;
		atomic_set(&ioctl_set, 1);
		break;
	default:
		return -ENOTTY;
	}

	return 0;
}

static const struct file_operations my_fops = {
	.owner          = THIS_MODULE,
	.open           = my_open,
	.release        = my_close,
	.unlocked_ioctl = my_ioctl
};

/* TODO 1: define netfilter hook operations structure */
static struct nf_hook_ops my_nfho = {
	.hook     = my_nf_hookfn,
	.hooknum  = NF_INET_LOCAL_OUT,
	.pf       = PF_INET,
	.priority = NF_IP_PRI_FIRST
};

static int __init my_hook_init(void)
{
	int err;

	/* register filter device */
	err = register_chrdev_region(MKDEV(MY_MAJOR, 0), 1, MY_DEVICE);
	if (err != 0)
		return err;

	atomic_set(&ioctl_set, 0);
	ioctl_set_addr = 0;

	/* init & add device */
	cdev_init(&my_cdev, &my_fops);
	err = cdev_add(&my_cdev, MKDEV(MY_MAJOR, 0), 1);
	if (err)
		goto out_region;

	/* TODO 1: register netfilter hook */
	err = nf_register_net_hook(&init_net, &my_nfho);
	if (err)
		goto out_cdev;
	return 0;

out_cdev:
	/* cleanup */
	cdev_del(&my_cdev);
out_region:
	unregister_chrdev_region(MKDEV(MY_MAJOR, 0), 1);

	return err;
}

static void __exit my_hook_exit(void)
{
	/* TODO 1: unregister hook */
	nf_unregister_net_hook(&init_net, &my_nfho);
	/* cleanup device */
	cdev_del(&my_cdev);
	unregister_chrdev_region(MKDEV(MY_MAJOR, 0), 1);
}

module_init(my_hook_init);
module_exit(my_hook_exit);
  • Results:

root@qemux86:~/skels/networking/1-2-netfilter/user# ./test-1.sh                 
filter: loading out-of-tree module taints kernel.
IP address is 127.0.0.1:45934
Should show up in filter.
Check dmesg output.
root@qemux86:~/skels/networking/1-2-netfilter/user# ./test-2.sh                 
IP address is 127.0.0.1:45936
Should show up in filter.
Should NOT show up in filter.
Check dmesg output.
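  • When the nc command runs, the hook function my_nf_hookfn is executed
  • After the MY_IOCTL_FILTER_ADDRESS command is issued through my_ioctl, address checking is enabled; for packets where ioctl_set_addr != iph->daddr, test_daddr() returns 0, so the hook still runs but prints nothing
  • PS: I carelessly initialized nf_hook_ops incorrectly at first, which caused persistent errors and took a long time to debug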
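
The "SYN set, ACK clear" test in the hook can be tried outside the kernel. The sketch below is a user-space analogue that works on the raw bytes of an IPv4 packet instead of an sk_buff; the names syn_source_port and probe, and the three constants, are invented for this illustration and are not part of the lab code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROTO_TCP 6	/* IPv4 protocol number for TCP */
#define FLAG_SYN  0x02	/* SYN bit in the TCP flags byte */
#define FLAG_ACK  0x10	/* ACK bit in the TCP flags byte */

/*
 * Return the TCP source port if pkt is an IPv4/TCP segment that starts
 * a connection (SYN set, ACK clear); return -1 for anything else.
 */
int syn_source_port(const uint8_t *pkt, size_t len)
{
	const uint8_t *tcp;
	size_t ihl;

	assert(pkt != NULL);
	if (len < 20 || (pkt[0] >> 4) != 4)
		return -1;			/* not a complete IPv4 header */
	ihl = (size_t)(pkt[0] & 0x0f) * 4;	/* IP header length in bytes */
	if (ihl < 20 || len < ihl + 20 || pkt[9] != PROTO_TCP)
		return -1;
	tcp = pkt + ihl;			/* TCP header follows the IP header */
	if ((tcp[13] & FLAG_SYN) && !(tcp[13] & FLAG_ACK))
		return (tcp[0] << 8) | tcp[1];	/* port is big-endian on the wire */
	return -1;
}

/* Build a minimal 40-byte IPv4+TCP packet and run the check on it. */
int probe(uint16_t sport, uint8_t tcp_flags)
{
	uint8_t pkt[40] = { 0 };

	pkt[0] = 0x45;				/* version 4, IHL 5 (20 bytes) */
	pkt[9] = PROTO_TCP;
	pkt[20] = (uint8_t)(sport >> 8);	/* source port, high byte first */
	pkt[21] = (uint8_t)(sport & 0xff);
	pkt[33] = tcp_flags;			/* flags sit at offset 13 of the TCP header */
	return syn_source_port(pkt, sizeof(pkt));
}
```

probe(45934, FLAG_SYN) reports the port, while the same packet with FLAG_SYN | FLAG_ACK (the server's reply in the handshake) is ignored. That is why the module logs only the side that initiates a connection.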

2.tcp-sock:

  • Write a kernel module that creates a TCP socket listening for connections on port 60000 on the loopback interface (in init_module)

/*
 * SO2 - Networking Lab (#10)
 *
 * Exercise #3, #4: simple kernel TCP socket
 *
 * Code skeleton.
 */

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/net.h>
#include <linux/in.h>
#include <linux/fs.h>
#include <net/sock.h>

MODULE_DESCRIPTION("Simple kernel TCP socket");
MODULE_AUTHOR("SO2");
MODULE_LICENSE("GPL");

#define LOG_LEVEL	KERN_ALERT
#define MY_TCP_PORT	60000
#define LISTEN_BACKLOG	5

#define ON	1
#define OFF	0
#define DEBUG	ON

#if DEBUG == ON
#define LOG(s)					\
	do {					\
		printk(KERN_DEBUG s "\n");	\
	} while (0)
#else
#define LOG(s)					\
	do {} while (0)
#endif

#define print_sock_address(addr)				\
	do {							\
		printk(LOG_LEVEL "connection established to "	\
		       "%pI4:%d\n",				\
		       &addr.sin_addr.s_addr,			\
		       ntohs(addr.sin_port));			\
	} while (0)

static struct socket *sock;	/* listening (server) socket */
static struct socket *new_sock;	/* communication socket */

static int __init my_tcp_sock_init(void)
{
	int err;
	/* address to bind on */
	struct sockaddr_in addr = {
		.sin_family = AF_INET,
		.sin_port   = htons(MY_TCP_PORT),
		.sin_addr   = { htonl(INADDR_LOOPBACK) }
	};
	int addrlen = sizeof(addr);
	/* address of peer */
	struct sockaddr_in raddr;

	/* TODO 1: create listening socket */
	err = sock_create_kern(&init_net, PF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
	if (err < 0) {
		printk(LOG_LEVEL "socket create wrong\n");
		goto out;
	}
	/* TODO 1: bind socket to loopback on port MY_TCP_PORT */
	err = sock->ops->bind(sock, (struct sockaddr *) &addr, addrlen);
	if (err < 0) {
		printk(LOG_LEVEL "bind wrong!\n");
		goto out_release;
	}
	/* TODO 1: start listening */
	err = sock->ops->listen(sock, LISTEN_BACKLOG);
	if (err < 0) {
		printk(LOG_LEVEL "listen wrong!\n");
		goto out_release;
	}
	/* TODO 2: create new socket for the accepted connection */
	err = sock_create_lite(PF_INET, SOCK_STREAM, IPPROTO_TCP, &new_sock);
	if (err < 0) {
		printk(LOG_LEVEL "create socket\n");
		goto out_release;	/* the listening socket must be released too */
	}
	new_sock->ops = sock->ops;
	/* TODO 2: accept a connection */
	err = sock->ops->accept(sock, new_sock, 0, true);
	if (err < 0) {
		printk(LOG_LEVEL "accept wrong\n");
		goto out_release_new_sock;
	}
	/* TODO 2: get the address of the peer and print it */
	err = new_sock->ops->getname(new_sock, (struct sockaddr *) &raddr, 1);
	if (err < 0) {
		printk(LOG_LEVEL "not find name\n");
		goto out_release_new_sock;
	}
	print_sock_address(raddr);
	return 0;

out_release_new_sock:
	/* TODO 2: cleanup socket for accepted connection */
	sock_release(new_sock);
out_release:
	/* TODO 1: cleanup listening socket */
	sock_release(sock);
out:
	return err;
}

static void __exit my_tcp_sock_exit(void)
{
	/* TODO 2: cleanup socket for accepted connection */
	sock_release(new_sock);
	/* TODO 1: cleanup listening socket */
	sock_release(sock);
}

module_init(my_tcp_sock_init);
module_exit(my_tcp_sock_exit);

  • PS: sock->ops->bind, sock->ops->listen, and sock->ops->accept here are the underlying implementations of the user-space functions of the same name
  • Results:

root@qemux86:~/skels/networking/3-4-tcp-sock# ./test-4.sh                       
+ sleep 1
+ insmod tcp_sock.ko
+ netstat -tuan
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 127.0.0.1:60000 0.0.0.0:* LISTEN
+ echo+ sleep 3
+ ../netcat -q 4 127.0.0.1 60000 Should connect.
-p 60001
accept wrong
connection established to 127.0.0.1:60001
+ rmmod tcp_sock
  • The module successfully listened on the target port
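
For comparison, the same bind/listen/accept/getname sequence can be walked through with the user-space calls that these kernel operations back. This is only a sketch: the function name loopback_accept_demo is invented, and it binds to port 0 so the kernel picks a free port instead of the lab's fixed port 60000.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Listen on loopback, connect to ourselves, accept the connection and
 * compare the peer address seen by the server with the client's local
 * address. Returns 1 on a match, 0 on a mismatch, -1 on error.
 */
int loopback_accept_demo(void)
{
	struct sockaddr_in addr, peer, local;
	socklen_t len;
	int srv = -1, cli = -1, conn = -1, ret = -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_port = htons(0);		/* let the kernel choose a free port */
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

	srv = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (srv < 0)
		goto out;
	if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;
	if (listen(srv, 5) < 0)
		goto out;

	len = sizeof(addr);			/* learn which port was assigned */
	if (getsockname(srv, (struct sockaddr *)&addr, &len) < 0)
		goto out;

	cli = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	if (cli < 0)
		goto out;
	if (connect(cli, (struct sockaddr *)&addr, sizeof(addr)) < 0)
		goto out;

	conn = accept(srv, NULL, NULL);		/* counterpart of sock->ops->accept */
	if (conn < 0)
		goto out;

	len = sizeof(peer);			/* counterpart of getname with peer = 1 */
	if (getpeername(conn, (struct sockaddr *)&peer, &len) < 0)
		goto out;
	assert(len == sizeof(peer));		/* an AF_INET peer fills a sockaddr_in */
	len = sizeof(local);
	if (getsockname(cli, (struct sockaddr *)&local, &len) < 0)
		goto out;

	ret = (peer.sin_port == local.sin_port &&
	       peer.sin_addr.s_addr == local.sin_addr.s_addr);
out:
	if (conn >= 0)
		close(conn);
	if (cli >= 0)
		close(cli);
	if (srv >= 0)
		close(srv);
	return ret;
}
```

The connect() call can complete before accept() is reached because the kernel finishes the handshake and parks the connection in the listen backlog. The kernel module relies on the same behavior when netcat connects to port 60000.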

3.udp-sock:

  • Write a kernel module that creates a UDP socket and sends the message defined by the MY_TEST_MESSAGE macro from that socket to port 60001 on the loopback address

/*
 * SO2 - Networking Lab (#10)
 *
 * Bonus: simple kernel UDP socket
 *
 * Code skeleton.
 */

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/string.h>
#include <linux/uio.h>
#include <linux/net.h>
#include <linux/in.h>
#include <net/sock.h>

MODULE_DESCRIPTION("Simple kernel UDP socket");
MODULE_AUTHOR("SO2");
MODULE_LICENSE("GPL");

#define LOG_LEVEL		KERN_ALERT
#define MY_UDP_LOCAL_PORT	60000
#define MY_UDP_REMOTE_PORT	60001
#define MY_TEST_MESSAGE		"kernelsocket\n"

#define ON	1
#define OFF	0
#define DEBUG	ON

#if DEBUG == ON
#define LOG(s)					\
	do {					\
		printk(KERN_DEBUG s "\n");	\
	} while (0)
#else
#define LOG(s)					\
	do {} while (0)
#endif

/* %pI4 replaces the NIPQUAD macros, which no longer exist in the kernel */
#define print_sock_address(addr)				\
	do {							\
		printk(LOG_LEVEL "connection established to "	\
		       "%pI4:%d\n",				\
		       &addr.sin_addr.s_addr,			\
		       ntohs(addr.sin_port));			\
	} while (0)

static struct socket *sock;	/* UDP server */

/* send datagram */
static int my_udp_msgsend(struct socket *s)
{
	/* address to send to */
	struct sockaddr_in raddr = {
		.sin_family = AF_INET,
		.sin_port   = htons(MY_UDP_REMOTE_PORT),
		.sin_addr   = { htonl(INADDR_LOOPBACK) }
	};
	int raddrlen = sizeof(raddr);
	/* message */
	struct msghdr msg;
	struct kvec iov;	/* kernel_sendmsg takes kernel-space vectors */
	char *buffer = MY_TEST_MESSAGE;
	int len = strlen(buffer) + 1;	/* includes the terminating NUL */

	/* TODO 1: build message */
	memset(&msg, 0, sizeof(msg));	/* no field is left uninitialized */
	iov.iov_base = buffer;
	iov.iov_len = len;
	msg.msg_flags = 0;
	msg.msg_name = &raddr;
	msg.msg_namelen = raddrlen;
	msg.msg_control = NULL;
	msg.msg_controllen = 0;

	/* TODO 1: send the message down the socket and return the
	 * error code.
	 */
	return kernel_sendmsg(s, &msg, &iov, 1, len);
}

static int __init my_udp_sock_init(void)
{
	int err;
	/* address to bind on */
	struct sockaddr_in addr = {
		.sin_family = AF_INET,
		.sin_port   = htons(MY_UDP_LOCAL_PORT),
		.sin_addr   = { htonl(INADDR_LOOPBACK) }
	};
	int addrlen = sizeof(addr);

	/* TODO 1: create UDP socket */
	err = sock_create_kern(&init_net, PF_INET, SOCK_DGRAM, IPPROTO_UDP, &sock);
	if (err < 0) {
		printk(LOG_LEVEL "can't create socket\n");
		goto out;
	}
	/* TODO 1: bind socket to loopback on port MY_UDP_LOCAL_PORT */
	err = sock->ops->bind(sock, (struct sockaddr *) &addr, addrlen);
	if (err < 0) {
		printk(LOG_LEVEL "can't bind socket\n");
		goto out_release;
	}
	/* send message */
	err = my_udp_msgsend(sock);
	if (err < 0) {
		printk(LOG_LEVEL "can't send message\n");
		goto out_release;
	}

	return 0;

out_release:
	/* TODO 1: release socket */
	sock_release(sock);
out:
	return err;
}

static void __exit my_udp_sock_exit(void)
{
	/* TODO 1: release socket */
	sock_release(sock);
}

module_init(my_udp_sock_init);
module_exit(my_udp_sock_exit);

  • This exercise follows a fairly standard routine: just fill in the msghdr and the I/O vector correctly
  • Results:

root@qemux86:~/skels/networking/5-udp-sock# ./test-5.sh                         
+ pid=241
+ sleep 1
+ ../netcat -l -u -p 60001
+ insmod udp_sock.ko
udp_sock: loading out-of-tree module taints kernel.
kernelsocket
+ rmmod udp_sock
+ kill 241
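  • This lab seems to focus mainly on using the protocol stack APIs
  • When I have time, I plan to dig into the internals of these APIs to understand how the protocol stack actually processes packets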
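
The "fill in msghdr and the I/O vector" routine from the UDP exercise looks almost identical in user space, which makes it easy to experiment with. The sketch below is an assumption-laden analogue, not the lab solution: udp_sendmsg_demo is an invented name, the receiver lives in the same process, and the destination port is chosen by the kernel rather than fixed at 60001.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send text as one UDP datagram over loopback with sendmsg() and read it
 * back. Returns the number of bytes received if they match, -1 otherwise.
 */
int udp_sendmsg_demo(const char *text)
{
	struct sockaddr_in raddr;
	socklen_t alen = sizeof(raddr);
	struct timeval tmo = { .tv_sec = 1, .tv_usec = 0 };
	struct msghdr msg;
	struct iovec iov;
	char reply[256];
	size_t len;
	ssize_t got;
	int rx, tx = -1, ret = -1;

	assert(text != NULL);
	len = strlen(text);
	if (len > sizeof(reply))
		return -1;

	/* receiver: bind to loopback on a kernel-chosen port */
	rx = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
	if (rx < 0)
		return -1;
	memset(&raddr, 0, sizeof(raddr));
	raddr.sin_family = AF_INET;
	raddr.sin_port = htons(0);
	raddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(rx, (struct sockaddr *)&raddr, sizeof(raddr)) < 0 ||
	    getsockname(rx, (struct sockaddr *)&raddr, &alen) < 0 ||
	    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tmo, sizeof(tmo)) < 0)
		goto out;

	tx = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
	if (tx < 0)
		goto out;

	/* build the message: one buffer, explicit destination address */
	iov.iov_base = (void *)text;
	iov.iov_len = len;
	memset(&msg, 0, sizeof(msg));
	msg.msg_name = &raddr;
	msg.msg_namelen = sizeof(raddr);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	if (sendmsg(tx, &msg, 0) != (ssize_t)len)
		goto out;
	got = recv(rx, reply, sizeof(reply), 0);	/* gives up after the 1 s timeout */
	if (got == (ssize_t)len && memcmp(reply, text, len) == 0)
		ret = (int)got;
out:
	if (tx >= 0)
		close(tx);
	close(rx);
	return ret;
}
```

Unlike the kernel module, this version sends strlen(text) bytes without the trailing NUL, so udp_sendmsg_demo("kernelsocket\n") transfers 13 bytes. The main difference on the kernel side is that kernel_sendmsg() takes the buffer list as a separate struct kvec argument instead of through msg_iov.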