The conntrack-tools package contains two programs:
conntrack provides a full featured command line utility to interact with the connection tracking system. The conntrack utility provides a replacement for the limited /proc/net/nf_conntrack interface. With conntrack, you can list, update and delete the existing flow entries; you can also listen to flow events.
conntrackd is the user-space connection tracking daemon. This daemon can be used to deploy fault-tolerant GNU/Linux firewalls but you can also use it to collect flow-based statistics of the firewall use.
--> nf_conntrack vs conntrack vs xt_conntrack vs ip_conntrack?
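The conntrack tool is meant to replace the /proc/net/nf_conntrack interface (the procfs view of the table), not the nf_conntrack kernel module itself.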
List nf_conntrack
# cat /proc/net/nf_conntrack
tcp 6 431982 ESTABLISHED src=192.168.2.100 dst=123.59.27.117 sport=34846 dport=993 packets=169 bytes=14322 src=123.59.27.117 dst=192.168.2.100 sport=993 dport=34846 packets=113 bytes=34787 [ASSURED] mark=0 use=1
tcp 6 431698 ESTABLISHED src=192.168.2.100 dst=123.59.27.117 sport=34849 dport=993 packets=244 bytes=18723 src=123.59.27.117 dst=192.168.2.100 sport=993 dport=34849 packets=203 bytes=144731 [ASSURED] mark=0 use=1
List Conntrack
# conntrack -L
tcp 6 431982 ESTABLISHED src=192.168.2.100 dst=123.59.27.117 sport=34846 dport=993 packets=169 bytes=14322 src=123.59.27.117 dst=192.168.2.100 sport=993 dport=34846 packets=113 bytes=34787 [ASSURED] mark=0 use=1
tcp 6 431698 ESTABLISHED src=192.168.2.100 dst=123.59.27.117 sport=34849 dport=993 packets=244 bytes=18723 src=123.59.27.117 dst=192.168.2.100 sport=993 dport=34849 packets=203 bytes=144731 [ASSURED] mark=0 use=1
conntrack v1.4.6 (conntrack-tools): 2 flow entries have been shown.
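As the documentation says, the /proc/net/nf_conntrack interface has been obsoleted by the conntrack tool, which uses the netlink(7) kernel API instead.
Most of the content of /proc/net/stat/nf_conntrack is available through: conntrack --stats (-S). conntrack --count (-C) prints only the number of tracked entries.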
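Since both listings print the same text format, a flow entry can be pulled apart with plain string handling. The following is a minimal sketch, assuming the line layout shown above (protocol name, protocol number, timeout, then key=value pairs); struct flow, field() and parse_flow() are hypothetical names introduced for illustration, and only the original-direction tuple (the first src/dst/sport/dport) is extracted.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record: original-direction tuple of one flow entry. */
struct flow {
    char proto[8];          /* e.g. "tcp" */
    unsigned int l4num;     /* e.g. 6 */
    unsigned long timeout;  /* seconds until the entry expires */
    char src[46];
    char dst[46];
    unsigned int sport;
    unsigned int dport;
};

/* Copy the value after the FIRST "key" (e.g. "src=") into out.
 * Returns 0 on success, -1 if the key is missing or the value is too long. */
static int field(const char *line, const char *key, char *out, size_t outlen)
{
    const char *p = strstr(line, key);
    size_t n;

    if (p == NULL)
        return -1;
    p += strlen(key);
    n = strcspn(p, " \t\n");
    if (n == 0 || n >= outlen)
        return -1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}

/* Parse one line in the format assumed above. Returns 0 on success. */
static int parse_flow(const char *line, struct flow *f)
{
    char num[16];

    if (sscanf(line, "%7s %u %lu", f->proto, &f->l4num, &f->timeout) != 3)
        return -1;
    if (field(line, "src=", f->src, sizeof(f->src)) < 0 ||
        field(line, "dst=", f->dst, sizeof(f->dst)) < 0)
        return -1;
    if (field(line, "sport=", num, sizeof(num)) < 0 ||
        sscanf(num, "%u", &f->sport) != 1)
        return -1;
    if (field(line, "dport=", num, sizeof(num)) < 0 ||
        sscanf(num, "%u", &f->dport) != 1)
        return -1;
    return 0;
}
```

Calling parse_flow() on the first entry listed above should fill in proto "tcp", sport 34846 and dport 993. Entries without ports (ICMP, for instance) are rejected by this sketch.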
Delete Conntrack Entries
# conntrack -D -p tcp --dport 993
tcp 6 431982 ESTABLISHED src=192.168.2.100 dst=123.59.27.117 sport=34846 dport=993 packets=169 bytes=14322 src=123.59.27.117 dst=192.168.2.100 sport=993 dport=34846 packets=113 bytes=34787 [ASSURED] mark=10 use=1
conntrack v1.4.6 (conntrack-tools): 1 flow entries have been deleted.
Listen Conntrack Events
# conntrack -E
[NEW] udp 17 30 src=192.168.2.100 dst=192.168.2.1 sport=57767 dport=53 [UNREPLIED] src=192.168.2.1 dst=192.168.2.100 sport=53 dport=57767
[UPDATE] udp 17 29 src=192.168.2.100 dst=192.168.2.1 sport=57767 dport=53 src=192.168.2.1 dst=192.168.2.100 sport=53 dport=57767
[NEW] tcp 6 120 SYN_SENT src=192.168.2.100 dst=66.102.9.104 sport=33379 dport=80 [UNREPLIED] src=66.102.9.104 dst=192.168.2.100 sport=80 dport=33379
[UPDATE] tcp 6 60 SYN_RECV src=192.168.2.100 dst=66.102.9.104 sport=33379 dport=80 src=66.102.9.104 dst=192.168.2.100 sport=80 dport=33379
[UPDATE] tcp 6 432000 ESTABLISHED src=192.168.2.100 dst=66.102.9.104 sport=33379 dport=80 src=66.102.9.104 dst=192.168.2.100 sport=80 dport=33379 [ASSURED]
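Each event line starts with a bracketed tag, so a consumer reading this output can dispatch on it. A minimal sketch, assuming the text format shown above; enum ct_event and event_type() are hypothetical names. The [DESTROY] tag does not appear in the capture above but is emitted by conntrack -E when an entry is removed.

```c
#include <string.h>

/* Hypothetical classification of a "conntrack -E" output line. */
enum ct_event { CT_EV_NEW, CT_EV_UPDATE, CT_EV_DESTROY, CT_EV_UNKNOWN };

/* Look at the leading "[TAG]" of an event line and return its type. */
static enum ct_event event_type(const char *line)
{
    line += strspn(line, " \t");            /* skip leading whitespace */

    if (strncmp(line, "[NEW]", 5) == 0)
        return CT_EV_NEW;
    if (strncmp(line, "[UPDATE]", 8) == 0)
        return CT_EV_UPDATE;
    if (strncmp(line, "[DESTROY]", 9) == 0)
        return CT_EV_DESTROY;
    return CT_EV_UNKNOWN;
}
```

For real event handling, subscribing over netlink (as conntrackd does) avoids parsing text at all; this sketch only covers the CLI output.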
The conntrackd daemon supports three modes:
High availability (HA): the ability of a system to operate continuously without failing for a designated period of time, so that it meets an agreed-upon operational performance level. For conntrackd, this is the mode used to deploy fault-tolerant firewalls.
Userspace connection tracking helpers, for layer 7 Application Layer Gateways (ALG) such as DHCPv6, MDNS, RPC, SLP and Oracle TNS, as an alternative to the in-kernel connection tracking helpers that are available in the Linux kernel.
Flow-based statistics collection, as an alternative to ulogd2, although ulogd2 allows for more flexible statistics collection.
Userspace connection tracking helpers: Connection tracking helpers allow you to filter multi-flow protocols that usually separate control and data traffic into different flows. These protocols usually violate network layering by including layer 3/4 details, e.g. IP addresses and TCP/UDP ports, in their application protocol (which resides in layer 7). This is problematic for gateways since they operate at packet level, i.e. layers 3/4, and therefore miss the information they need to filter these protocols appropriately.
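To make the layering violation concrete, here is a minimal sketch of the kind of parsing a helper has to do. FTP is used purely as an illustration (it is not one of the protocols listed above): in active mode the client sends "PORT h1,h2,h3,h4,p1,p2" on the control connection, announcing the IP address and port (p1*256 + p2) of the upcoming data connection. struct expect_addr and parse_ftp_port() are hypothetical names, not part of conntrackd.

```c
#include <stdio.h>

/* Hypothetical result: where the expected data connection will go. */
struct expect_addr {
    unsigned char ip[4];
    unsigned int port;
};

/* Parse an FTP "PORT h1,h2,h3,h4,p1,p2" command (RFC 959).
 * Returns 0 on success, -1 if the command is malformed. */
static int parse_ftp_port(const char *cmd, struct expect_addr *out)
{
    unsigned int v[6];
    int i;

    if (sscanf(cmd, "PORT %u,%u,%u,%u,%u,%u",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5]) != 6)
        return -1;
    for (i = 0; i < 6; i++)
        if (v[i] > 255)
            return -1;          /* every field is a single octet */

    for (i = 0; i < 4; i++)
        out->ip[i] = (unsigned char)v[i];
    out->port = v[4] * 256 + v[5];  /* layer 4 detail hidden in layer 7 */
    return 0;
}
```

A packet filter that only sees layers 3/4 never learns this address and port; a helper extracts them from the payload so that the related data flow can be expected and matched.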
netlink - communication between kernel and user space (AF_NETLINK)
#include <asm/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>
netlink_socket = socket(AF_NETLINK, socket_type, netlink_family);
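The socket() call above can be fleshed out as follows. This is a minimal sketch, assuming a Linux system; open_netlink() is a hypothetical helper name. It opens a raw netlink socket for the given family (for example NETLINK_NETFILTER or NETLINK_ROUTE), binds it with nl_pid = 0 so the kernel assigns the port ID, and reads that ID back with getsockname().

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Open and bind a netlink socket for netlink_family.
 * On success returns the file descriptor and stores the kernel-assigned
 * port ID in *portid; on failure returns -1. */
static int open_netlink(int netlink_family, unsigned int *portid)
{
    struct sockaddr_nl addr;
    socklen_t len = sizeof(addr);
    int fd;

    fd = socket(AF_NETLINK, SOCK_RAW, netlink_family);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;
    addr.nl_pid = 0;        /* 0: let the kernel pick a unique port ID */
    addr.nl_groups = 0;     /* no multicast groups, unicast only */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }

    *portid = addr.nl_pid;
    return fd;
}
```

The port ID plays the same role as the value returned by mnl_socket_get_portid() in the libmnl-based examples in these notes: it lets the receiver check that a reply is addressed to this socket.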
=> nfnetlink: The netlink interface implemented by netfilter. It exists to control netfilter operations (mostly iptables-related ones). NFQUEUE is also implemented on top of this netlink interface. Unlike netfilter, nfnetlink does no special work of its own; it is used for things such as packet logging, the queue just mentioned, or receiving conntrack events in a userspace daemon.
=> libnfnetlink:
libnfnetlink is the low-level library for netfilter related kernel/userspace communication. It provides a generic messaging infrastructure for in-kernel netfilter subsystems (such as nfnetlink_log, nfnetlink_queue, nfnetlink_conntrack) and their respective users and/or management tools in userspace.
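What such a messaging layer produces on the wire can be sketched by hand. The following assumes the Linux UAPI headers are installed; build_ct_dump_request() is a hypothetical helper, not a libnfnetlink function. It builds a request that asks the conntrack subsystem to dump its table: a struct nlmsghdr whose type encodes the subsystem in the high byte and the message in the low byte, followed by a struct nfgenmsg.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <linux/netlink.h>
#include <linux/netfilter/nfnetlink.h>
#include <linux/netfilter/nfnetlink_conntrack.h>

/* Build a conntrack "dump all entries" request in buf.
 * buf must be suitably aligned for struct nlmsghdr.
 * Returns the message header, or NULL if buf is too small. */
static struct nlmsghdr *build_ct_dump_request(void *buf, size_t buflen,
                                              uint8_t l3family, uint32_t seq)
{
    struct nlmsghdr *nlh = buf;
    struct nfgenmsg *nfg;
    size_t len = NLMSG_LENGTH(sizeof(struct nfgenmsg));

    if (buflen < NLMSG_ALIGN(len))
        return NULL;
    memset(buf, 0, NLMSG_ALIGN(len));

    nlh->nlmsg_len = len;
    /* nfnetlink type: subsystem ID in the high byte, message in the low */
    nlh->nlmsg_type = (NFNL_SUBSYS_CTNETLINK << 8) | IPCTNL_MSG_CT_GET;
    nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
    nlh->nlmsg_seq = seq;

    nfg = NLMSG_DATA(nlh);          /* payload starts after the header */
    nfg->nfgen_family = l3family;   /* e.g. AF_INET */
    nfg->version = NFNETLINK_V0;
    nfg->res_id = htons(0);

    return nlh;
}
```

Sending this buffer over a NETLINK_NETFILTER socket would normally be followed by a receive loop that parses the multipart reply; the libraries described here wrap both steps.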
libnftnl is a userspace library providing a low-level netlink programming interface (API) to the in-kernel nf_tables subsystem.
EX Usage) nft-chain-get
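#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <time.h>
#include <string.h>
#include <netinet/in.h>
#include <linux/netfilter.h>
#include <linux/netfilter/nf_tables.h>
#include <libmnl/libmnl.h>
#include <libnftnl/chain.h>

/* Called by mnl_cb_run() for every netlink message: parse it as a chain
 * and print it in the output format passed through data. */
static int table_cb(const struct nlmsghdr *nlh, void *data)
{
    struct nftnl_chain *t;
    char buf[4096];
    uint32_t *type = data;

    t = nftnl_chain_alloc();
    if (t == NULL) {
        perror("OOM");
        goto err;
    }

    if (nftnl_chain_nlmsg_parse(nlh, t) < 0) {
        perror("nftnl_chain_nlmsg_parse");
        goto err_free;
    }

    nftnl_chain_snprintf(buf, sizeof(buf), t, *type, 0);
    printf("%s\n", buf);

err_free:
    nftnl_chain_free(t);
err:
    return MNL_CB_OK;
}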
int main(int argc, char *argv[])
{
    struct mnl_socket *nl;
    char buf[MNL_SOCKET_BUFFER_SIZE];
    struct nlmsghdr *nlh;
    uint32_t portid, seq, type = NFTNL_OUTPUT_DEFAULT;
    struct nftnl_chain *t = NULL;
    int ret, family;

    seq = time(NULL);

    if (argc < 2 || argc > 5) {
        fprintf(stderr, "Usage: %s <family> [<table> <chain>]\n",
                argv[0]);
        exit(EXIT_FAILURE);
    }

    if (strcmp(argv[1], "ip") == 0)
        family = NFPROTO_IPV4;
    else if (strcmp(argv[1], "ip6") == 0)
        family = NFPROTO_IPV6;
    else if (strcmp(argv[1], "inet") == 0)
        family = NFPROTO_INET;
    else if (strcmp(argv[1], "bridge") == 0)
        family = NFPROTO_BRIDGE;
    else if (strcmp(argv[1], "arp") == 0)
        family = NFPROTO_ARP;
    else if (strcmp(argv[1], "unspec") == 0)
        family = NFPROTO_UNSPEC;
    else {
        fprintf(stderr, "Unknown family: ip, ip6, inet, bridge, arp, unspec\n");
        exit(EXIT_FAILURE);
    }

    if (argc >= 4) {
        /* Table and chain given: request that single chain. */
        t = nftnl_chain_alloc();
        if (t == NULL) {
            perror("OOM");
            exit(EXIT_FAILURE);
        }
        nlh = nftnl_chain_nlmsg_build_hdr(buf, NFT_MSG_GETCHAIN, family,
                                          NLM_F_ACK, seq);
        nftnl_chain_set_str(t, NFTNL_CHAIN_TABLE, argv[2]);
        nftnl_chain_set_str(t, NFTNL_CHAIN_NAME, argv[3]);
        nftnl_chain_nlmsg_build_payload(nlh, t);
        nftnl_chain_free(t);
    } else if (argc >= 2) {
        /* Only the family given: dump every chain of that family. */
        nlh = nftnl_chain_nlmsg_build_hdr(buf, NFT_MSG_GETCHAIN, family,
                                          NLM_F_DUMP, seq);
    }

    nl = mnl_socket_open(NETLINK_NETFILTER);
    if (nl == NULL) {
        perror("mnl_socket_open");
        exit(EXIT_FAILURE);
    }

    if (mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID) < 0) {
        perror("mnl_socket_bind");
        exit(EXIT_FAILURE);
    }
    portid = mnl_socket_get_portid(nl);

    if (mnl_socket_sendto(nl, nlh, nlh->nlmsg_len) < 0) {
        perror("mnl_socket_send");
        exit(EXIT_FAILURE);
    }

    /* Receive replies until the kernel signals the end or an error. */
    ret = mnl_socket_recvfrom(nl, buf, sizeof(buf));
    while (ret > 0) {
        ret = mnl_cb_run(buf, ret, seq, portid, table_cb, &type);
        if (ret <= 0)
            break;
        ret = mnl_socket_recvfrom(nl, buf, sizeof(buf));
    }
    if (ret == -1) {
        perror("error");
        exit(EXIT_FAILURE);
    }
    mnl_socket_close(nl);

    return EXIT_SUCCESS;
}
libnetfilter_conntrack is a userspace library providing a programming interface (API) to the in-kernel connection tracking state table.
However, what we want is to exploit parallelism to reduce the tracking overhead across many connections when a verdict is issued, so is listing/retrieving/inserting/modifying the rules themselves meaningful at all?
If it is, how do libnftnl and libnetfilter_conntrack communicate with the kernel?
And using them, can the tracking process itself be modified from userspace?
libnftnl ex2)
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <time.h>
#include <string.h>
#include <netinet/in.h>
#include <linux/netfilter.h>
#include <linux/netfilter/nf_tables.h>
#include <libmnl/libmnl.h>
#include <libnftnl/chain.h>

/* Build a chain object from the command line arguments. */
static struct nftnl_chain *chain_add_parse(int argc, char *argv[])
{
    struct nftnl_chain *t;
    int hooknum = 0;

    if (argc == 6) {
        /* This is a base chain, set the hook number */
        if (strcmp(argv[4], "NF_INET_LOCAL_IN") == 0)
            hooknum = NF_INET_LOCAL_IN;
        else if (strcmp(argv[4], "NF_INET_LOCAL_OUT") == 0)
            hooknum = NF_INET_LOCAL_OUT;
        else if (strcmp(argv[4], "NF_INET_PRE_ROUTING") == 0)
            hooknum = NF_INET_PRE_ROUTING;
        else if (strcmp(argv[4], "NF_INET_POST_ROUTING") == 0)
            hooknum = NF_INET_POST_ROUTING;
        else if (strcmp(argv[4], "NF_INET_FORWARD") == 0)
            hooknum = NF_INET_FORWARD;
        else {
            fprintf(stderr, "Unknown hook: %s\n", argv[4]);
            return NULL;
        }
    }

    t = nftnl_chain_alloc();
    if (t == NULL) {
        perror("OOM");
        return NULL;
    }
    nftnl_chain_set_str(t, NFTNL_CHAIN_TABLE, argv[2]);
    nftnl_chain_set_str(t, NFTNL_CHAIN_NAME, argv[3]);
    if (argc == 6) {
        nftnl_chain_set_u32(t, NFTNL_CHAIN_HOOKNUM, hooknum);
        nftnl_chain_set_u32(t, NFTNL_CHAIN_PRIO, atoi(argv[5]));
    }

    return t;
}
int main(int argc, char *argv[])
{
    struct mnl_socket *nl;
    char buf[MNL_SOCKET_BUFFER_SIZE];
    struct nlmsghdr *nlh;
    uint32_t portid, seq, chain_seq;
    int ret, family;
    struct nftnl_chain *t;
    struct mnl_nlmsg_batch *batch;

    if (argc != 4 && argc != 6) {
        fprintf(stderr, "Usage: %s <family> <table> <chain> "
                "[<hooknum> <prio>]\n",
                argv[0]);
        exit(EXIT_FAILURE);
    }

    if (strcmp(argv[1], "ip") == 0)
        family = NFPROTO_IPV4;
    else if (strcmp(argv[1], "ip6") == 0)
        family = NFPROTO_IPV6;
    else if (strcmp(argv[1], "inet") == 0)
        family = NFPROTO_INET;
    else if (strcmp(argv[1], "bridge") == 0)
        family = NFPROTO_BRIDGE;
    else if (strcmp(argv[1], "arp") == 0)
        family = NFPROTO_ARP;
    else {
        fprintf(stderr, "Unknown family: ip, ip6, inet, bridge, arp\n");
        exit(EXIT_FAILURE);
    }

    t = chain_add_parse(argc, argv);
    if (t == NULL)
        exit(EXIT_FAILURE);

    /* Batch layout: begin message, NEWCHAIN message, end message. */
    seq = time(NULL);
    batch = mnl_nlmsg_batch_start(buf, sizeof(buf));

    nftnl_batch_begin(mnl_nlmsg_batch_current(batch), seq++);
    mnl_nlmsg_batch_next(batch);

    chain_seq = seq;
    nlh = nftnl_chain_nlmsg_build_hdr(mnl_nlmsg_batch_current(batch),
                                      NFT_MSG_NEWCHAIN, family,
                                      NLM_F_CREATE|NLM_F_ACK, seq++);
    nftnl_chain_nlmsg_build_payload(nlh, t);
    nftnl_chain_free(t);
    mnl_nlmsg_batch_next(batch);

    nftnl_batch_end(mnl_nlmsg_batch_current(batch), seq++);
    mnl_nlmsg_batch_next(batch);

    nl = mnl_socket_open(NETLINK_NETFILTER);
    if (nl == NULL) {
        perror("mnl_socket_open");
        exit(EXIT_FAILURE);
    }

    if (mnl_socket_bind(nl, 0, MNL_SOCKET_AUTOPID) < 0) {
        perror("mnl_socket_bind");
        exit(EXIT_FAILURE);
    }
    portid = mnl_socket_get_portid(nl);

    /* Send the whole batch in one go. */
    if (mnl_socket_sendto(nl, mnl_nlmsg_batch_head(batch),
                          mnl_nlmsg_batch_size(batch)) < 0) {
        perror("mnl_socket_send");
        exit(EXIT_FAILURE);
    }

    mnl_nlmsg_batch_stop(batch);

    /* Wait for the ACK that matches the NEWCHAIN message. */
    ret = mnl_socket_recvfrom(nl, buf, sizeof(buf));
    while (ret > 0) {
        ret = mnl_cb_run(buf, ret, chain_seq, portid, NULL, NULL);
        if (ret <= 0)
            break;
        ret = mnl_socket_recvfrom(nl, buf, sizeof(buf));
    }
    if (ret == -1) {
        perror("error");
        exit(EXIT_FAILURE);
    }
    mnl_socket_close(nl);

    return EXIT_SUCCESS;
}
(batch)
|<-------------- MNL_SOCKET_BUFFER_SIZE ------------->|
|<-------------------- batch ------------------>|     |
|-----------|-----------|-----------|-----------|-----------|
|<- nlmsg ->|<- nlmsg ->|<- nlmsg ->|<- nlmsg ->|<- nlmsg ->|
|-----------|-----------|-----------|-----------|-----------|
                                          ^           ^
                                          |           |
                                      message N  message N+1
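The idea in the diagram can be sketched without libmnl. This is a simplified illustration, not libmnl's implementation: struct msg_batch, batch_start(), batch_current() and batch_next() are hypothetical names. Messages are written at the current position, and a message is committed to the batch only if it still fits below the limit; the buffer is assumed to be larger than the limit so that the overflowing message N+1 can be written safely and carried over to the next batch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a netlink message batch. */
struct msg_batch {
    char *buf;      /* start of the underlying buffer */
    size_t limit;   /* size of the batch area (smaller than the buffer) */
    size_t used;    /* bytes occupied by committed messages */
};

/* Start an empty batch over buf; limit is the size of the batch area. */
static void batch_start(struct msg_batch *b, char *buf, size_t limit)
{
    b->buf = buf;
    b->limit = limit;
    b->used = 0;
}

/* Where the next message should be written. */
static char *batch_current(const struct msg_batch *b)
{
    return b->buf + b->used;
}

/* Commit a message of len bytes that was written at batch_current().
 * Returns false if it does not fit: the batch is full, should be sent,
 * and the message stays outside the committed area. */
static bool batch_next(struct msg_batch *b, size_t len)
{
    if (b->used + len > b->limit)
        return false;
    b->used += len;
    return true;
}
```

With a 64-byte buffer, a 48-byte limit and 16-byte messages, three messages are committed and the fourth makes batch_next() return false, which corresponds to message N+1 in the diagram.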
In the end, communication with the kernel goes through libmnl, using calls such as mnl_socket_bind.
=> Would it be possible to change the processing method itself?