netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* GIT: [PATCH v3 0/3] netfilter trace infrastructure support
@ 2015-11-27 18:43 Florian Westphal
  2015-11-27 18:43 ` [PATCH v3 libnftnl 1/3] src: add " Florian Westphal
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Florian Westphal @ 2015-11-27 18:43 UTC (permalink / raw)
  To: netfilter-devel

3rd round.

See the individual patches for changelog vs. V 2.

Patrick: Nothing fancy from libnftnl side, just the 3 new attributes
(oiftype, nfproto, policy).

On kernel side I addressed all of the comments.

In case of NFPROTO_INET, the new NFTA_TRACE_NFPROTO will be 'NFPROTO_IPV4'
(or IPV6), the netlink header family is set to NFPROTO_INET.

I tried to reduce nft_do_chain bloat, +80 on my system.
We have one unconditional foo = false assignment, all nftrace calls
are avoided in 'no trace rules set' case via static key.

Let me know if you spot more problems.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v3 libnftnl 1/3] src: add trace infrastructure support
  2015-11-27 18:43 GIT: [PATCH v3 0/3] netfilter trace infrastructure support Florian Westphal
@ 2015-11-27 18:43 ` Florian Westphal
  2015-12-09 13:55   ` Pablo Neira Ayuso
  2015-11-27 18:43 ` [PATCH v2 nf-next 2/3] netfilter: nf_tables: extend tracing infrastructure Florian Westphal
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2015-11-27 18:43 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

parses trace monitor netlink messages from the kernel and builds
nftnl_trace struct that contains the dissected information.

Provides getters to access these attributes.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
Changes since v2:
 - add NFTNL_TRACE_NFPROTO, NFTNL_TRACE_POLICY, NFTNL_TRACE_OIFTYPE
 - use VERDICT without changing it
 - remove NFTNL_TRACE_QUEUE_ID

Changes since v1:
 - removed all printf support, not needed by Patricks nft patch
 - adjust for changed NFTA_TRACE* kernel attributes
 - VERDICT is now nested, so treat it accordingly
 - removed TRANSPORTPROTO, seems its not needed at this time
 - add NFTNL_TRACE_QUEUE_ID to obtain nfqueue number
 - NF_VERDICT is normalized in this case, i.e.
 if verdict from kernel is '42 << 16 | NF_QUEUE', make
 NF_QUEUE the verdict and put the queue number ->queue_id, accessible
 via NFTNL_TRACE_QUEUE_ID.

 include/libnftnl/Makefile.am        |   1 +
 include/libnftnl/trace.h            |  55 +++++
 include/linux/netfilter/nf_tables.h |  52 +++++
 src/Makefile.am                     |   1 +
 src/libnftnl.map                    |  15 ++
 src/trace.c                         | 438 ++++++++++++++++++++++++++++++++++++
 src/utils.c                         |  12 +
 7 files changed, 574 insertions(+)
 create mode 100644 include/libnftnl/trace.h
 create mode 100644 src/trace.c

diff --git a/include/libnftnl/Makefile.am b/include/libnftnl/Makefile.am
index a20aaee..84f01b6 100644
--- a/include/libnftnl/Makefile.am
+++ b/include/libnftnl/Makefile.am
@@ -1,5 +1,6 @@
 pkginclude_HEADERS = batch.h		\
 		     table.h		\
+		     trace.h		\
 		     chain.h		\
 		     rule.h		\
 		     expr.h		\
diff --git a/include/libnftnl/trace.h b/include/libnftnl/trace.h
new file mode 100644
index 0000000..674bc16
--- /dev/null
+++ b/include/libnftnl/trace.h
@@ -0,0 +1,55 @@
+#ifndef _LIBNFTNL_TRACE_H_
+#define _LIBNFTNL_TRACE_H_
+
+#include <stdint.h>
+#include <stdbool.h>
+#include <sys/types.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+enum nftnl_trace_attr {
+	NFTNL_TRACE_CHAIN = 0,
+	NFTNL_TRACE_FAMILY,
+	NFTNL_TRACE_ID,
+	NFTNL_TRACE_IIF,
+	NFTNL_TRACE_IIFTYPE,
+	NFTNL_TRACE_JUMP_TARGET,
+	NFTNL_TRACE_OIF,
+	NFTNL_TRACE_OIFTYPE,
+	NFTNL_TRACE_MARK,
+	NFTNL_TRACE_LL_HEADER,
+	NFTNL_TRACE_NETWORK_HEADER,
+	NFTNL_TRACE_TRANSPORT_HEADER,
+	NFTNL_TRACE_TABLE,
+	NFTNL_TRACE_TYPE,
+	NFTNL_TRACE_RULE_HANDLE,
+	NFTNL_TRACE_VERDICT,
+	NFTNL_TRACE_NFPROTO,
+	NFTNL_TRACE_POLICY,
+	__NFTNL_TRACE_MAX,
+};
+#define NFTNL_TRACE_MAX (__NFTNL_TRACE_MAX - 1)
+
+struct nftnl_trace;
+
+struct nftnl_trace *nftnl_trace_alloc(void);
+void nftnl_trace_free(struct nftnl_trace *trace);
+
+bool nftnl_trace_is_set(const struct nftnl_trace *trace, uint16_t type);
+
+const void *nftnl_trace_get_data(const struct nftnl_trace *trace,
+				 uint16_t type, uint32_t *data_len);
+
+uint16_t nftnl_trace_get_u16(const struct nftnl_trace *trace, uint16_t type);
+uint32_t nftnl_trace_get_u32(const struct nftnl_trace *trace, uint16_t type);
+uint64_t nftnl_trace_get_u64(const struct nftnl_trace *trace, uint16_t type);
+const char *nftnl_trace_get_str(const struct nftnl_trace *trace, uint16_t type);
+
+int nftnl_trace_nlmsg_parse(const struct nlmsghdr *nlh, struct nftnl_trace *t);
+#ifdef __cplusplus
+} /* extern "C" */
+#endif
+
+#endif /* _LIBNFTNL_TRACE_H_ */
diff --git a/include/linux/netfilter/nf_tables.h b/include/linux/netfilter/nf_tables.h
index 5420c8c..f77693b 100644
--- a/include/linux/netfilter/nf_tables.h
+++ b/include/linux/netfilter/nf_tables.h
@@ -83,6 +83,7 @@ enum nft_verdicts {
  * @NFT_MSG_DELSETELEM: delete a set element (enum nft_set_elem_attributes)
  * @NFT_MSG_NEWGEN: announce a new generation, only for events (enum nft_gen_attributes)
  * @NFT_MSG_GETGEN: get the rule-set generation (enum nft_gen_attributes)
+ * @NFT_MSG_TRACE: trace event (enum nft_trace_attributes)
  */
 enum nf_tables_msg_types {
 	NFT_MSG_NEWTABLE,
@@ -102,6 +103,7 @@ enum nf_tables_msg_types {
 	NFT_MSG_DELSETELEM,
 	NFT_MSG_NEWGEN,
 	NFT_MSG_GETGEN,
+	NFT_MSG_TRACE,
 	NFT_MSG_MAX,
 };
 
@@ -987,4 +989,54 @@ enum nft_gen_attributes {
 };
 #define NFTA_GEN_MAX		(__NFTA_GEN_MAX - 1)
 
+/**
+ * enum nft_trace_attributes - nf_tables trace netlink attributes
+ *
+ * @NFTA_TRACE_TABLE: name of the table (NLA_STRING)
+ * @NFTA_TRACE_CHAIN: name of the chain (NLA_STRING)
+ * @NFTA_TRACE_RULE_HANDLE: numeric handle of the rule (NLA_U64)
+ * @NFTA_TRACE_TYPE: type of the event (NLA_U32: nft_trace_types)
+ * @NFTA_TRACE_VERDICT: verdict returned by hook (NLA_NESTED: nft_verdicts)
+ * @NFTA_TRACE_ID: pseudo-id, same for each skb traced (NLA_U32)
+ * @NFTA_TRACE_LL_HEADER: linklayer header (NLA_BINARY)
+ * @NFTA_TRACE_NETWORK_HEADER: network header (NLA_BINARY)
+ * @NFTA_TRACE_TRANSPORT_HEADER: transport header (NLA_BINARY)
+ * @NFTA_TRACE_IIF: indev ifindex (NLA_U32)
+ * @NFTA_TRACE_IIFTYPE: netdev->type of indev (NLA_U16)
+ * @NFTA_TRACE_OIF: outdev ifindex (NLA_U32)
+ * @NFTA_TRACE_OIFTYPE: netdev->type of outdev (NLA_U16)
+ * @NFTA_TRACE_MARK: nfmark (NLA_U32)
+ * @NFTA_TRACE_NFPROTO: nf protocol processed (NLA_U32)
+ * @NFTA_TRACE_POLICY: policy that decided fate of packet (NLA_U32)
+ */
+enum nft_trace_attibutes {
+	NFTA_TRACE_UNSPEC,
+	NFTA_TRACE_TABLE,
+	NFTA_TRACE_CHAIN,
+	NFTA_TRACE_RULE_HANDLE,
+	NFTA_TRACE_TYPE,
+	NFTA_TRACE_VERDICT,
+	NFTA_TRACE_ID,
+	NFTA_TRACE_LL_HEADER,
+	NFTA_TRACE_NETWORK_HEADER,
+	NFTA_TRACE_TRANSPORT_HEADER,
+	NFTA_TRACE_IIF,
+	NFTA_TRACE_IIFTYPE,
+	NFTA_TRACE_OIF,
+	NFTA_TRACE_OIFTYPE,
+	NFTA_TRACE_MARK,
+	NFTA_TRACE_NFPROTO,
+	NFTA_TRACE_POLICY,
+	__NFTA_TRACE_MAX
+};
+#define NFTA_TRACE_MAX (__NFTA_TRACE_MAX - 1)
+
+enum nft_trace_types {
+	NFT_TRACETYPE_UNSPEC,
+	NFT_TRACETYPE_POLICY,
+	NFT_TRACETYPE_RETURN,
+	NFT_TRACETYPE_RULE,
+	__NFT_TRACETYPE_MAX
+};
+#define NFT_TRACETYPE_MAX (__NFT_TRACETYPE_MAX - 1)
 #endif /* _LINUX_NF_TABLES_H */
diff --git a/src/Makefile.am b/src/Makefile.am
index 107cae5..795307d 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -11,6 +11,7 @@ libnftnl_la_SOURCES = utils.c		\
 		      common.c		\
 		      gen.c		\
 		      table.c		\
+		      trace.c		\
 		      chain.c		\
 		      rule.c		\
 		      set.c		\
diff --git a/src/libnftnl.map b/src/libnftnl.map
index a52b54e..2e193b7 100644
--- a/src/libnftnl.map
+++ b/src/libnftnl.map
@@ -498,3 +498,18 @@ global:
 
 local: *;
 };
+
+LIBNFTNL_4.1 {
+	nftnl_trace_alloc;
+	nftnl_trace_free;
+
+	nftnl_trace_is_set;
+
+	nftnl_trace_get_u16;
+	nftnl_trace_get_u32;
+	nftnl_trace_get_u64;
+	nftnl_trace_get_str;
+	nftnl_trace_get_data;
+
+	nftnl_trace_nlmsg_parse;
+} LIBNFTNL_4;
diff --git a/src/trace.c b/src/trace.c
new file mode 100644
index 0000000..b04abb5
--- /dev/null
+++ b/src/trace.c
@@ -0,0 +1,438 @@
+/*
+ * (C) 2015 Red Hat GmbH
+ * Author: Florian Westphal <fw@strlen.de>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published
+ * by the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+#include "internal.h"
+
+#include <time.h>
+#include <endian.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <netinet/in.h>
+
+#include <libmnl/libmnl.h>
+#include <linux/netfilter.h>
+#include <linux/netfilter/nfnetlink.h>
+#include <linux/netfilter/nf_tables.h>
+
+#include <libnftnl/trace.h>
+
+struct nftnl_header_data {
+	char *data;
+	unsigned int len;
+};
+
+struct nftnl_trace {
+	char *table;
+	char *chain;
+	char *jump_target;
+	uint64_t rule_handle;
+	struct nftnl_header_data ll;
+	struct nftnl_header_data nh;
+	struct nftnl_header_data th;
+	uint32_t family;
+	uint32_t type;
+	uint32_t id;
+	uint32_t iif;
+	uint32_t oif;
+	uint32_t mark;
+	uint32_t verdict;
+	uint32_t nfproto;
+	uint32_t policy;
+	uint16_t iiftype;
+	uint16_t oiftype;
+
+	uint32_t flags;
+};
+
+EXPORT_SYMBOL(nftnl_trace_alloc);
+struct nftnl_trace *nftnl_trace_alloc(void)
+{
+	return calloc(1, sizeof(struct nftnl_trace));
+}
+
+EXPORT_SYMBOL(nftnl_trace_free);
+void nftnl_trace_free(struct nftnl_trace *t)
+{
+	xfree(t->chain);
+	xfree(t->table);
+	xfree(t->jump_target);
+	xfree(t->ll.data);
+	xfree(t->nh.data);
+	xfree(t->th.data);
+	xfree(t);
+}
+
+EXPORT_SYMBOL(nftnl_trace_is_set);
+bool nftnl_trace_is_set(const struct nftnl_trace *t, uint16_t attr)
+{
+	return t->flags & (1 << attr);
+}
+
+static int nftnl_trace_parse_attr_cb(const struct nlattr *attr, void *data)
+{
+	const struct nlattr **tb = data;
+	enum nft_trace_attibutes type = mnl_attr_get_type(attr);
+
+	if (mnl_attr_type_valid(attr, NFTA_TRACE_MAX) < 0)
+		return MNL_CB_OK;
+
+	switch (type) {
+	case NFTA_TRACE_UNSPEC:
+	case __NFTA_TRACE_MAX:
+		break;
+	case NFTA_TRACE_VERDICT:
+                if (mnl_attr_validate(attr, MNL_TYPE_NESTED) < 0)
+			abi_breakage();
+		break;
+	case NFTA_TRACE_IIFTYPE:
+	case NFTA_TRACE_OIFTYPE:
+		if (mnl_attr_validate(attr, MNL_TYPE_U16) < 0)
+			abi_breakage();
+		break;
+	case NFTA_TRACE_ID:
+	case NFTA_TRACE_IIF:
+	case NFTA_TRACE_MARK:
+	case NFTA_TRACE_OIF:
+	case NFTA_TRACE_POLICY:
+	case NFTA_TRACE_NFPROTO:
+	case NFTA_TRACE_TYPE:
+		if (mnl_attr_validate(attr, MNL_TYPE_U32) < 0)
+			abi_breakage();
+		break;
+	case NFTA_TRACE_CHAIN:
+	case NFTA_TRACE_TABLE:
+		if (mnl_attr_validate(attr, MNL_TYPE_STRING) < 0)
+			abi_breakage();
+		break;
+	case NFTA_TRACE_RULE_HANDLE:
+		if (mnl_attr_validate(attr, MNL_TYPE_U64) < 0)
+			abi_breakage();
+		break;
+	case NFTA_TRACE_LL_HEADER:	/* fallthrough */
+	case NFTA_TRACE_NETWORK_HEADER:
+	case NFTA_TRACE_TRANSPORT_HEADER:
+		if (mnl_attr_get_payload_len(attr) == 0)
+			abi_breakage();
+		break;
+	};
+
+	tb[type] = attr;
+	return MNL_CB_OK;
+}
+
+EXPORT_SYMBOL(nftnl_trace_get_data);
+const void *nftnl_trace_get_data(const struct nftnl_trace *trace,
+				 uint16_t type, uint32_t *data_len)
+{
+	enum nftnl_trace_attr attr = type;
+
+	if (!(trace->flags & (1 << type)))
+		return NULL;
+
+	switch (attr) {
+	case NFTNL_TRACE_FAMILY:
+		*data_len = sizeof(uint32_t);
+		return &trace->family;
+	case NFTNL_TRACE_ID:
+		*data_len = sizeof(uint32_t);
+		return &trace->id;
+	case NFTNL_TRACE_IIF:
+		*data_len = sizeof(uint32_t);
+		return &trace->iif;
+	case NFTNL_TRACE_OIF:
+		*data_len = sizeof(uint32_t);
+		return &trace->oif;
+	case NFTNL_TRACE_LL_HEADER:
+		*data_len = trace->ll.len;
+		return trace->ll.data;
+	case NFTNL_TRACE_MARK:
+		*data_len = sizeof(uint32_t);
+		return &trace->mark;
+	case NFTNL_TRACE_NETWORK_HEADER:
+		*data_len = trace->nh.len;
+		return trace->nh.data;
+	case NFTNL_TRACE_TYPE:
+		*data_len = sizeof(uint32_t);
+		return &trace->type;
+	case NFTNL_TRACE_CHAIN:
+		*data_len = strlen(trace->chain);
+		return trace->chain;
+	case NFTNL_TRACE_TABLE:
+		*data_len = strlen(trace->table);
+		return trace->table;
+	case NFTNL_TRACE_JUMP_TARGET:
+		*data_len = strlen(trace->jump_target);
+		return trace->jump_target;
+	case NFTNL_TRACE_TRANSPORT_HEADER:
+		*data_len = trace->th.len;
+		return trace->th.data;
+	case NFTNL_TRACE_RULE_HANDLE:
+		*data_len = sizeof(uint64_t);
+		return &trace->rule_handle;
+	case NFTNL_TRACE_VERDICT:
+		*data_len = sizeof(uint32_t);
+		return &trace->verdict;
+	case NFTNL_TRACE_IIFTYPE:
+		*data_len = sizeof(uint16_t);
+		return &trace->iiftype;
+	case NFTNL_TRACE_OIFTYPE:
+		*data_len = sizeof(uint16_t);
+		return &trace->oiftype;
+	case NFTNL_TRACE_NFPROTO:
+		*data_len = sizeof(uint32_t);
+		return &trace->nfproto;
+	case NFTNL_TRACE_POLICY:
+		*data_len = sizeof(uint32_t);
+		return &trace->policy;
+	case __NFTNL_TRACE_MAX:
+		break;
+	}
+
+	return NULL;
+}
+
+EXPORT_SYMBOL(nftnl_trace_get_str);
+const char *nftnl_trace_get_str(const struct nftnl_trace *trace, uint16_t type)
+{
+	if (!nftnl_trace_is_set(trace, type))
+		return NULL;
+
+	switch (type) {
+	case NFTNL_TRACE_CHAIN: return trace->chain;
+	case NFTNL_TRACE_TABLE: return trace->table;
+	default: break;
+	}
+	return NULL;
+}
+
+EXPORT_SYMBOL(nftnl_trace_get_u16);
+uint16_t nftnl_trace_get_u16(const struct nftnl_trace *trace, uint16_t type)
+{
+	const uint16_t *d;
+	uint32_t dlen;
+
+	d = nftnl_trace_get_data(trace, type, &dlen);
+	if (d && dlen == sizeof(*d))
+		return *d;
+
+	return 0;
+}
+
+EXPORT_SYMBOL(nftnl_trace_get_u32);
+uint32_t nftnl_trace_get_u32(const struct nftnl_trace *trace, uint16_t type)
+{
+	const uint32_t *d;
+	uint32_t dlen;
+
+	d = nftnl_trace_get_data(trace, type, &dlen);
+	if (d && dlen == sizeof(*d))
+		return *d;
+
+	return 0;
+}
+
+EXPORT_SYMBOL(nftnl_trace_get_u64);
+uint64_t nftnl_trace_get_u64(const struct nftnl_trace *trace, uint16_t type)
+{
+	const uint64_t *d;
+	uint32_t dlen;
+
+	d = nftnl_trace_get_data(trace, type, &dlen);
+	if (d && dlen == sizeof(*d))
+		return *d;
+
+	return 0;
+}
+
+static bool nftnl_trace_nlmsg_parse_hdrdata(struct nlattr *attr,
+					    struct nftnl_header_data *header)
+{
+	uint32_t len;
+
+	if (!attr)
+		return false;
+
+	len = mnl_attr_get_payload_len(attr);
+
+	header->data = malloc(len);
+	if (header->data) {
+		memcpy(header->data, mnl_attr_get_payload(attr), len);
+		header->len = len;
+		return true;
+	}
+
+	return false;
+}
+
+static int nftnl_trace_parse_verdict_cb(const struct nlattr *attr, void *data)
+{
+	int type = mnl_attr_get_type(attr);
+	const struct nlattr **tb = data;
+
+	switch (type) {
+	case NFTA_VERDICT_CODE:
+		if (mnl_attr_validate(attr, MNL_TYPE_U32) < 0)
+			abi_breakage();
+		tb[type] = attr;
+		break;
+	case NFTA_VERDICT_CHAIN:
+		if (mnl_attr_validate(attr, MNL_TYPE_STRING) < 0)
+			abi_breakage();
+		tb[type] = attr;
+		break;
+	}
+
+	return MNL_CB_OK;
+}
+
+static void
+nftnl_trace_parse_verdict(const struct nlattr *attr, struct nftnl_trace *t)
+{
+	struct nlattr *tb[NFTA_VERDICT_MAX+1];
+
+	mnl_attr_parse_nested(attr, nftnl_trace_parse_verdict_cb, tb);
+
+	if (!tb[NFTA_VERDICT_CODE])
+		abi_breakage();
+
+	t->verdict = ntohl(mnl_attr_get_u32(tb[NFTA_VERDICT_CODE]));
+	t->flags |= (1 << NFTNL_TRACE_VERDICT);
+
+	switch (t->verdict) {
+	case NFT_GOTO: /* fallthough */
+	case NFT_JUMP:
+		if (!tb[NFTA_VERDICT_CHAIN])
+			abi_breakage();
+		t->jump_target = strdup(mnl_attr_get_str(tb[NFTA_VERDICT_CHAIN]));
+		if (t->jump_target)
+			t->flags |= (1 << NFTNL_TRACE_JUMP_TARGET);
+		break;
+	}
+}
+
+EXPORT_SYMBOL(nftnl_trace_nlmsg_parse);
+int nftnl_trace_nlmsg_parse(const struct nlmsghdr *nlh, struct nftnl_trace *t)
+{
+	struct nfgenmsg *nfg = mnl_nlmsg_get_payload(nlh);
+	struct nlattr *tb[NFTA_TRACE_MAX+1] = {};
+
+	if (mnl_attr_parse(nlh, sizeof(*nfg), nftnl_trace_parse_attr_cb, tb) < 0)
+		return -1;
+
+	if (!tb[NFTA_TRACE_ID])
+		abi_breakage();
+
+	if (!tb[NFTA_TRACE_TYPE])
+		abi_breakage();
+
+	if (tb[NFTA_TRACE_TABLE])
+		t->table = strdup(mnl_attr_get_str(tb[NFTA_TRACE_TABLE]));
+	if (tb[NFTA_TRACE_CHAIN])
+		t->chain = strdup(mnl_attr_get_str(tb[NFTA_TRACE_CHAIN]));
+
+	t->family = nfg->nfgen_family;
+	t->flags |= (1 << NFTNL_TRACE_FAMILY);
+
+	t->type = ntohl(mnl_attr_get_u32(tb[NFTA_TRACE_TYPE]));
+	t->flags |= (1 << NFTNL_TRACE_TYPE);
+
+	t->id = ntohl(mnl_attr_get_u32(tb[NFTA_TRACE_ID]));
+	t->flags |= (1 << NFTNL_TRACE_ID);
+
+	if (tb[NFTA_TRACE_IIFTYPE]) {
+		t->iiftype = ntohs(mnl_attr_get_u16(tb[NFTA_TRACE_IIFTYPE]));
+		t->flags |= (1 << NFTNL_TRACE_IIFTYPE);
+	}
+
+	if (tb[NFTA_TRACE_IIF]) {
+		t->iif = ntohl(mnl_attr_get_u32(tb[NFTA_TRACE_IIF]));
+		t->flags |= (1 << NFTNL_TRACE_IIF);
+	}
+
+	if (tb[NFTA_TRACE_OIFTYPE]) {
+		t->iiftype = ntohs(mnl_attr_get_u16(tb[NFTA_TRACE_OIFTYPE]));
+		t->flags |= (1 << NFTNL_TRACE_OIFTYPE);
+	}
+
+	if (tb[NFTA_TRACE_OIF]) {
+		t->oif = ntohl(mnl_attr_get_u32(tb[NFTA_TRACE_OIF]));
+		t->flags |= (1 << NFTNL_TRACE_OIF);
+	}
+
+	if (tb[NFTA_TRACE_MARK]) {
+		t->mark = ntohl(mnl_attr_get_u32(tb[NFTA_TRACE_MARK]));
+		t->flags |= (1 << NFTNL_TRACE_MARK);
+	}
+
+	if (tb[NFTA_TRACE_RULE_HANDLE]) {
+		t->rule_handle = be64toh(mnl_attr_get_u64(tb[NFTA_TRACE_RULE_HANDLE]));
+		t->flags |= (1 << NFTNL_TRACE_RULE_HANDLE);
+	}
+
+	if (tb[NFTA_TRACE_VERDICT])
+		nftnl_trace_parse_verdict(tb[NFTA_TRACE_VERDICT], t);
+
+	if (nftnl_trace_nlmsg_parse_hdrdata(tb[NFTA_TRACE_LL_HEADER], &t->ll))
+		t->flags |= (1 << NFTNL_TRACE_LL_HEADER);
+
+	if (nftnl_trace_nlmsg_parse_hdrdata(tb[NFTA_TRACE_NETWORK_HEADER], &t->nh))
+		t->flags |= (1 << NFTNL_TRACE_NETWORK_HEADER);
+
+	if (nftnl_trace_nlmsg_parse_hdrdata(tb[NFTA_TRACE_TRANSPORT_HEADER], &t->th))
+		t->flags |= (1 << NFTNL_TRACE_TRANSPORT_HEADER);
+
+	if (tb[NFTA_TRACE_NFPROTO]) {
+		t->iiftype = ntohs(mnl_attr_get_u16(tb[NFTA_TRACE_NFPROTO]));
+		t->flags |= (1 << NFTNL_TRACE_NFPROTO);
+	}
+
+	if (tb[NFTA_TRACE_POLICY]) {
+		t->iiftype = ntohs(mnl_attr_get_u16(tb[NFTA_TRACE_POLICY]));
+		t->flags |= (1 << NFTNL_TRACE_POLICY);
+	}
+
+	if (tb[NFTA_TRACE_OIF]) {
+		t->oif = ntohl(mnl_attr_get_u32(tb[NFTA_TRACE_OIF]));
+		t->flags |= (1 << NFTNL_TRACE_OIF);
+	}
+
+	if (tb[NFTA_TRACE_MARK]) {
+		t->mark = ntohl(mnl_attr_get_u32(tb[NFTA_TRACE_MARK]));
+		t->flags |= (1 << NFTNL_TRACE_MARK);
+	}
+
+	if (tb[NFTA_TRACE_RULE_HANDLE]) {
+		t->rule_handle = be64toh(mnl_attr_get_u64(tb[NFTA_TRACE_RULE_HANDLE]));
+		t->flags |= (1 << NFTNL_TRACE_RULE_HANDLE);
+	}
+
+	if (tb[NFTA_TRACE_VERDICT])
+		nftnl_trace_parse_verdict(tb[NFTA_TRACE_VERDICT], t);
+
+	if (nftnl_trace_nlmsg_parse_hdrdata(tb[NFTA_TRACE_LL_HEADER], &t->ll))
+		t->flags |= (1 << NFTNL_TRACE_LL_HEADER);
+
+	if (nftnl_trace_nlmsg_parse_hdrdata(tb[NFTA_TRACE_NETWORK_HEADER], &t->nh))
+		t->flags |= (1 << NFTNL_TRACE_NETWORK_HEADER);
+
+	if (nftnl_trace_nlmsg_parse_hdrdata(tb[NFTA_TRACE_TRANSPORT_HEADER], &t->th))
+		t->flags |= (1 << NFTNL_TRACE_TRANSPORT_HEADER);
+
+	if (t->chain)
+		t->flags |= (1 << NFTNL_TRACE_CHAIN);
+
+	if (t->chain)
+		t->flags |= (1 << NFTNL_TRACE_CHAIN);
+	if (t->table)
+		t->flags |= (1 << NFTNL_TRACE_TABLE);
+
+	return 0;
+}
diff --git a/src/utils.c b/src/utils.c
index 84fbe94..ba36bc4 100644
--- a/src/utils.c
+++ b/src/utils.c
@@ -146,12 +146,24 @@ const char *nftnl_verdict2str(uint32_t verdict)
 		return "accept";
 	case NF_DROP:
 		return "drop";
+	case NF_STOLEN:
+		return "stolen";
+	case NF_QUEUE:
+		return "queue";
+	case NF_REPEAT:
+		return "repeat";
+	case NF_STOP:
+		return "stop";
 	case NFT_RETURN:
 		return "return";
 	case NFT_JUMP:
 		return "jump";
 	case NFT_GOTO:
 		return "goto";
+	case NFT_CONTINUE:
+		return "continue";
+	case NFT_BREAK:
+		return "break";
 	default:
 		return "unknown";
 	}
-- 
2.4.10


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 nf-next 2/3] netfilter: nf_tables: extend tracing infrastructure
  2015-11-27 18:43 GIT: [PATCH v3 0/3] netfilter trace infrastructure support Florian Westphal
  2015-11-27 18:43 ` [PATCH v3 libnftnl 1/3] src: add " Florian Westphal
@ 2015-11-27 18:43 ` Florian Westphal
  2015-11-28 12:36   ` Patrick McHardy
  2015-11-27 18:43 ` [PATCH v3 nf-next 3/3] netfilter: nf_tables: wrap tracing with a static key Florian Westphal
  2015-11-27 18:55 ` GIT: [PATCH v3 0/3] netfilter trace infrastructure support Patrick McHardy
  3 siblings, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2015-11-27 18:43 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

nft monitor mode can then decode and display this trace data.

Parts of LL/Network/Transport headers are provided as separate
attributes.

Otherwise, printing IP address data becomes virtually impossible
for userspace since in the case of the netdev family we really don't
want userspace to have to know all the possible link layer types
and/or sizes just to display/print an ip address.

We also don't want userspace to have to follow ipv6 header chains
to get the s/dport info, the kernel already did this work for us.

To avoid bloating nft_do_chain all data required for tracing is
encapsulated in nft_traceinfo.

The structure is initialized unconditionally(!) for each nft_do_chain
invocation.

This unconditionall call will be moved under a static key in a
followup patch.

With lots of help from Patrick McHardy and Pablo Neira.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
Changes since v2:
 - set nfgen_family to the rule family, not pkt->pf.
   To get the 'logical' family (e.g. NFPROTO_INET), we
   keep basechain ptr in nft_traceinfo struct.
 - export oif type, policy and pkt->pf
 - miniminze bloating of nft_do_chain
 - revert to a simpler id for now, based on skb address and
   rx hash
 - several cosmetic changes

 Changes since v1:
  - move trace into new file, nf_tables_trace.c (Pablo)
  - add new netlink group for trace messages (Pablo, Patrick)
  - make verdict nested attr (Patrick)
  - add percpu area to store current processed skb id
  (suggested by Patrick)
  - fold arguments to nft_trace_notify into a structure,
  nft_traceinfo.  This would allow easier extension later
  on, e.g. ptr to registers to e.g. log NFT_BREAK/mismatch
  as well.

 include/net/netfilter/nf_tables.h        |  32 ++++
 include/uapi/linux/netfilter/nf_tables.h |  52 ++++++
 include/uapi/linux/netfilter/nfnetlink.h |   2 +
 net/netfilter/Makefile                   |   2 +-
 net/netfilter/nf_tables_api.c            |  12 +-
 net/netfilter/nf_tables_core.c           |  45 ++++--
 net/netfilter/nf_tables_trace.c          | 269 +++++++++++++++++++++++++++++++
 net/netfilter/nfnetlink.c                |   1 +
 8 files changed, 396 insertions(+), 19 deletions(-)
 create mode 100644 net/netfilter/nf_tables_trace.c

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 101d7d7..b313cda 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -888,6 +888,38 @@ void nft_unregister_chain_type(const struct nf_chain_type *);
 int nft_register_expr(struct nft_expr_type *);
 void nft_unregister_expr(struct nft_expr_type *);
 
+int nft_verdict_dump(struct sk_buff *skb, int type,
+		     const struct nft_verdict *v);
+
+/**
+ *	struct nft_traceinfo - nft tracing information and state
+ *
+ *	@pkt: pktinfo currently processed
+ *	@basechain: base chain currently processed
+ *	@chain: chain currently processed
+ *	@rule:  rule that was evaluated
+ *	@verdict: verdict given by rule
+ *	@type: event type (enum nft_trace_types)
+ *	@packet_dumped: packet headers sent in a previous traceinfo message
+ *	@trace: other struct members are initialised
+ */
+struct nft_traceinfo {
+	const struct nft_pktinfo	*pkt;
+	const struct nft_base_chain	*basechain;
+	const struct nft_chain		*chain;
+	const struct nft_rule		*rule;
+	const struct nft_verdict	*verdict;
+	enum nft_trace_types		type;
+	bool				packet_dumped;
+	bool				trace;
+};
+
+void nft_trace_init(struct nft_traceinfo *info, const struct nft_pktinfo *pkt,
+		    const struct nft_verdict *verdict,
+		    const struct nft_chain *basechain);
+
+void nft_trace_notify(struct nft_traceinfo *info);
+
 #define nft_dereference(p)					\
 	nfnl_dereference(p, NFNL_SUBSYS_NFTABLES)
 
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 5f3ecec..b48a3ab 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -83,6 +83,7 @@ enum nft_verdicts {
  * @NFT_MSG_DELSETELEM: delete a set element (enum nft_set_elem_attributes)
  * @NFT_MSG_NEWGEN: announce a new generation, only for events (enum nft_gen_attributes)
  * @NFT_MSG_GETGEN: get the rule-set generation (enum nft_gen_attributes)
+ * @NFT_MSG_TRACE: trace event (enum nft_trace_attributes)
  */
 enum nf_tables_msg_types {
 	NFT_MSG_NEWTABLE,
@@ -102,6 +103,7 @@ enum nf_tables_msg_types {
 	NFT_MSG_DELSETELEM,
 	NFT_MSG_NEWGEN,
 	NFT_MSG_GETGEN,
+	NFT_MSG_TRACE,
 	NFT_MSG_MAX,
 };
 
@@ -987,4 +989,54 @@ enum nft_gen_attributes {
 };
 #define NFTA_GEN_MAX		(__NFTA_GEN_MAX - 1)
 
+/**
+ * enum nft_trace_attributes - nf_tables trace netlink attributes
+ *
+ * @NFTA_TRACE_TABLE: name of the table (NLA_STRING)
+ * @NFTA_TRACE_CHAIN: name of the chain (NLA_STRING)
+ * @NFTA_TRACE_RULE_HANDLE: numeric handle of the rule (NLA_U64)
+ * @NFTA_TRACE_TYPE: type of the event (NLA_U32: nft_trace_types)
+ * @NFTA_TRACE_VERDICT: verdict returned by hook (NLA_NESTED: nft_verdicts)
+ * @NFTA_TRACE_ID: pseudo-id, same for each skb traced (NLA_U32)
+ * @NFTA_TRACE_LL_HEADER: linklayer header (NLA_BINARY)
+ * @NFTA_TRACE_NETWORK_HEADER: network header (NLA_BINARY)
+ * @NFTA_TRACE_TRANSPORT_HEADER: transport header (NLA_BINARY)
+ * @NFTA_TRACE_IIF: indev ifindex (NLA_U32)
+ * @NFTA_TRACE_IIFTYPE: netdev->type of indev (NLA_U16)
+ * @NFTA_TRACE_OIF: outdev ifindex (NLA_U32)
+ * @NFTA_TRACE_OIFTYPE: netdev->type of outdev (NLA_U16)
+ * @NFTA_TRACE_MARK: nfmark (NLA_U32)
+ * @NFTA_TRACE_NFPROTO: nf protocol processed (NLA_U32)
+ * @NFTA_TRACE_POLICY: policy that decided fate of packet (NLA_U32)
+ */
+enum nft_trace_attibutes {
+	NFTA_TRACE_UNSPEC,
+	NFTA_TRACE_TABLE,
+	NFTA_TRACE_CHAIN,
+	NFTA_TRACE_RULE_HANDLE,
+	NFTA_TRACE_TYPE,
+	NFTA_TRACE_VERDICT,
+	NFTA_TRACE_ID,
+	NFTA_TRACE_LL_HEADER,
+	NFTA_TRACE_NETWORK_HEADER,
+	NFTA_TRACE_TRANSPORT_HEADER,
+	NFTA_TRACE_IIF,
+	NFTA_TRACE_IIFTYPE,
+	NFTA_TRACE_OIF,
+	NFTA_TRACE_OIFTYPE,
+	NFTA_TRACE_MARK,
+	NFTA_TRACE_NFPROTO,
+	NFTA_TRACE_POLICY,
+	__NFTA_TRACE_MAX
+};
+#define NFTA_TRACE_MAX (__NFTA_TRACE_MAX - 1)
+
+enum nft_trace_types {
+	NFT_TRACETYPE_UNSPEC,
+	NFT_TRACETYPE_POLICY,
+	NFT_TRACETYPE_RETURN,
+	NFT_TRACETYPE_RULE,
+	__NFT_TRACETYPE_MAX
+};
+#define NFT_TRACETYPE_MAX (__NFT_TRACETYPE_MAX - 1)
 #endif /* _LINUX_NF_TABLES_H */
diff --git a/include/uapi/linux/netfilter/nfnetlink.h b/include/uapi/linux/netfilter/nfnetlink.h
index 354a7e5..4bb8cb7 100644
--- a/include/uapi/linux/netfilter/nfnetlink.h
+++ b/include/uapi/linux/netfilter/nfnetlink.h
@@ -22,6 +22,8 @@ enum nfnetlink_groups {
 #define NFNLGRP_NFTABLES                NFNLGRP_NFTABLES
 	NFNLGRP_ACCT_QUOTA,
 #define NFNLGRP_ACCT_QUOTA		NFNLGRP_ACCT_QUOTA
+	NFNLGRP_NFTRACE,
+#define NFNLGRP_NFTRACE			NFNLGRP_NFTRACE
 	__NFNLGRP_MAX,
 };
 #define NFNLGRP_MAX	(__NFNLGRP_MAX - 1)
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 7638c36..2293484 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -67,7 +67,7 @@ obj-$(CONFIG_NF_NAT_TFTP) += nf_nat_tftp.o
 obj-$(CONFIG_NETFILTER_SYNPROXY) += nf_synproxy_core.o
 
 # nf_tables
-nf_tables-objs += nf_tables_core.o nf_tables_api.o
+nf_tables-objs += nf_tables_core.o nf_tables_api.o nf_tables_trace.o
 nf_tables-objs += nft_immediate.o nft_cmp.o nft_lookup.o nft_dynset.o
 nf_tables-objs += nft_bitwise.o nft_byteorder.o nft_payload.o
 
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 93cc473..c4969a0 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4446,22 +4446,22 @@ static void nft_verdict_uninit(const struct nft_data *data)
 	}
 }
 
-static int nft_verdict_dump(struct sk_buff *skb, const struct nft_data *data)
+int nft_verdict_dump(struct sk_buff *skb, int type, const struct nft_verdict *v)
 {
 	struct nlattr *nest;
 
-	nest = nla_nest_start(skb, NFTA_DATA_VERDICT);
+	nest = nla_nest_start(skb, type);
 	if (!nest)
 		goto nla_put_failure;
 
-	if (nla_put_be32(skb, NFTA_VERDICT_CODE, htonl(data->verdict.code)))
+	if (nla_put_be32(skb, NFTA_VERDICT_CODE, htonl(v->code)))
 		goto nla_put_failure;
 
-	switch (data->verdict.code) {
+	switch (v->code) {
 	case NFT_JUMP:
 	case NFT_GOTO:
 		if (nla_put_string(skb, NFTA_VERDICT_CHAIN,
-				   data->verdict.chain->name))
+				   v->chain->name))
 			goto nla_put_failure;
 	}
 	nla_nest_end(skb, nest);
@@ -4572,7 +4572,7 @@ int nft_data_dump(struct sk_buff *skb, int attr, const struct nft_data *data,
 		err = nft_value_dump(skb, data, len);
 		break;
 	case NFT_DATA_VERDICT:
-		err = nft_verdict_dump(skb, data);
+		err = nft_verdict_dump(skb, NFTA_DATA_VERDICT, &data->verdict);
 		break;
 	default:
 		err = -EINVAL;
diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index f3695a4..2395de7 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -44,22 +44,36 @@ static struct nf_loginfo trace_loginfo = {
 	},
 };
 
-static void __nft_trace_packet(const struct nft_pktinfo *pkt,
-			       const struct nft_chain *chain,
-			       int rulenum, enum nft_trace type)
+static noinline void __nft_trace_packet(struct nft_traceinfo *info,
+					const struct nft_chain *chain,
+					int rulenum, enum nft_trace type)
 {
+	const struct nft_pktinfo *pkt = info->pkt;
+
+	if (!pkt->skb->nf_trace)
+		return;
+
+	info->chain = chain;
+	info->type = type;
+
+	nft_trace_notify(info);
+
 	nf_log_trace(pkt->net, pkt->pf, pkt->hook, pkt->skb, pkt->in,
 		     pkt->out, &trace_loginfo, "TRACE: %s:%s:%s:%u ",
 		     chain->table->name, chain->name, comments[type],
 		     rulenum);
 }
 
-static inline void nft_trace_packet(const struct nft_pktinfo *pkt,
+static inline void nft_trace_packet(struct nft_traceinfo *info,
 				    const struct nft_chain *chain,
-				    int rulenum, enum nft_trace type)
+				    const struct nft_rule *rule,
+				    int rulenum,
+				    enum nft_trace_types type)
 {
-	if (unlikely(pkt->skb->nf_trace))
-		__nft_trace_packet(pkt, chain, rulenum, type);
+	if (unlikely(info->trace)) {
+		info->rule = rule;
+		__nft_trace_packet(info, chain, rulenum, type);
+	}
 }
 
 static void nft_cmp_fast_eval(const struct nft_expr *expr,
@@ -121,7 +135,9 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 	struct nft_stats *stats;
 	int rulenum;
 	unsigned int gencursor = nft_genmask_cur(net);
+	struct nft_traceinfo info;
 
+	nft_trace_init(&info, pkt, &regs.verdict, basechain);
 do_chain:
 	rulenum = 0;
 	rule = list_entry(&chain->rules, struct nft_rule, list);
@@ -151,7 +167,8 @@ next_rule:
 			regs.verdict.code = NFT_CONTINUE;
 			continue;
 		case NFT_CONTINUE:
-			nft_trace_packet(pkt, chain, rulenum, NFT_TRACE_RULE);
+			nft_trace_packet(&info, chain, rule,
+					 rulenum, NFT_TRACETYPE_RULE);
 			continue;
 		}
 		break;
@@ -161,7 +178,8 @@ next_rule:
 	case NF_ACCEPT:
 	case NF_DROP:
 	case NF_QUEUE:
-		nft_trace_packet(pkt, chain, rulenum, NFT_TRACE_RULE);
+		nft_trace_packet(&info, chain, rule,
+				 rulenum, NFT_TRACETYPE_RULE);
 		return regs.verdict.code;
 	}
 
@@ -174,7 +192,8 @@ next_rule:
 		stackptr++;
 		/* fall through */
 	case NFT_GOTO:
-		nft_trace_packet(pkt, chain, rulenum, NFT_TRACE_RULE);
+		nft_trace_packet(&info, chain, rule,
+				 rulenum, NFT_TRACETYPE_RULE);
 
 		chain = regs.verdict.chain;
 		goto do_chain;
@@ -182,7 +201,8 @@ next_rule:
 		rulenum++;
 		/* fall through */
 	case NFT_RETURN:
-		nft_trace_packet(pkt, chain, rulenum, NFT_TRACE_RETURN);
+		nft_trace_packet(&info, chain, rule,
+				 rulenum, NFT_TRACETYPE_RETURN);
 		break;
 	default:
 		WARN_ON(1);
@@ -196,7 +216,8 @@ next_rule:
 		goto next_rule;
 	}
 
-	nft_trace_packet(pkt, basechain, -1, NFT_TRACE_POLICY);
+	nft_trace_packet(&info, basechain, NULL, -1,
+			 NFT_TRACETYPE_POLICY);
 
 	rcu_read_lock_bh();
 	stats = this_cpu_ptr(rcu_dereference(nft_base_chain(basechain)->stats));
diff --git a/net/netfilter/nf_tables_trace.c b/net/netfilter/nf_tables_trace.c
new file mode 100644
index 0000000..672add8
--- /dev/null
+++ b/net/netfilter/nf_tables_trace.c
@@ -0,0 +1,269 @@
+/*
+ * (C) 2015 Red Hat GmbH
+ * Author: Florian Westphal <fw@strlen.de>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/hash.h>
+#include <linux/jhash.h>
+#include <linux/if_vlan.h>
+#include <linux/init.h>
+#include <linux/skbuff.h>
+#include <linux/netlink.h>
+#include <linux/netfilter.h>
+#include <linux/netfilter/nfnetlink.h>
+#include <linux/netfilter/nf_tables.h>
+#include <net/netfilter/nf_tables_core.h>
+#include <net/netfilter/nf_tables.h>
+
+#define NFT_TRACETYPE_LL_HSIZE		20
+#define NFT_TRACETYPE_NETWORK_HSIZE	40
+#define NFT_TRACETYPE_TRANSPORT_HSIZE	20
+
+static int trace_fill_id(struct sk_buff *nlskb, struct sk_buff *skb)
+{
+	__be32 id;
+
+	/* using skb address as ID results in a limited number of
+	 * values (and quick reuse).
+	 *
+	 * So we attempt to use as many skb members that will not
+	 * change while skb is with netfilter.
+	 */
+	id = (__be32)jhash_2words(hash32_ptr(skb), skb_get_hash(skb),
+				  skb->skb_iif);
+
+	return nla_put_be32(nlskb, NFTA_TRACE_ID, id);
+}
+
+static int trace_fill_header(struct sk_buff *nlskb, u16 type,
+			     const struct sk_buff *skb,
+			     int off, unsigned int len)
+{
+	struct nlattr *nla;
+
+	if (len == 0)
+		return 0;
+
+	nla = nla_reserve(nlskb, type, len);
+	if (!nla || skb_copy_bits(skb, off, nla_data(nla), len))
+		return -1;
+
+	return 0;
+}
+
+static int nf_trace_fill_ll_header(struct sk_buff *nlskb,
+				   const struct sk_buff *skb)
+{
+	struct vlan_ethhdr veth;
+	int off;
+
+	BUILD_BUG_ON(sizeof(veth) > NFT_TRACETYPE_LL_HSIZE);
+
+	off = skb_mac_header(skb) - skb->data;
+	if (off != -ETH_HLEN)
+		return -1;
+
+	if (skb_copy_bits(skb, off, &veth, ETH_HLEN))
+		return -1;
+
+	veth.h_vlan_proto = skb->vlan_proto;
+	veth.h_vlan_TCI = htons(skb_vlan_tag_get(skb));
+	veth.h_vlan_encapsulated_proto = skb->protocol;
+
+	return nla_put(nlskb, NFTA_TRACE_LL_HEADER, sizeof(veth), &veth);
+}
+
+static int nf_trace_fill_dev_info(struct sk_buff *nlskb,
+				  const struct net_device *indev,
+				  const struct net_device *outdev)
+{
+	if (indev) {
+		if (nla_put_be32(nlskb, NFTA_TRACE_IIF,
+				 htonl(indev->ifindex)))
+			return -1;
+
+		if (nla_put_be16(nlskb, NFTA_TRACE_IIFTYPE,
+				 htons(indev->type)))
+			return -1;
+	}
+
+	if (outdev) {
+		if (nla_put_be32(nlskb, NFTA_TRACE_OIF,
+				 htonl(outdev->ifindex)))
+			return -1;
+
+		if (nla_put_be16(nlskb, NFTA_TRACE_OIFTYPE,
+				 htons(outdev->type)))
+			return -1;
+	}
+
+	return 0;
+}
+
+static int nf_trace_fill_pkt_info(struct sk_buff *nlskb,
+				  const struct nft_pktinfo *pkt)
+{
+	const struct sk_buff *skb = pkt->skb;
+	unsigned int len = min_t(unsigned int,
+				 pkt->xt.thoff - skb_network_offset(skb),
+				 NFT_TRACETYPE_NETWORK_HSIZE);
+	int off = skb_network_offset(skb);
+
+	if (trace_fill_header(nlskb, NFTA_TRACE_NETWORK_HEADER, skb, off, len))
+		return -1;
+
+	len = min_t(unsigned int, skb->len - pkt->xt.thoff,
+		    NFT_TRACETYPE_TRANSPORT_HSIZE);
+
+	if (trace_fill_header(nlskb, NFTA_TRACE_TRANSPORT_HEADER, skb,
+			      pkt->xt.thoff, len))
+		return -1;
+
+	if (!skb_mac_header_was_set(skb))
+		return 0;
+
+	if (skb_vlan_tag_get(skb))
+		return nf_trace_fill_ll_header(nlskb, skb);
+
+	off = skb_mac_header(skb) - skb->data;
+	len = min_t(unsigned int, -off, NFT_TRACETYPE_LL_HSIZE);
+	return trace_fill_header(nlskb, NFTA_TRACE_LL_HEADER,
+				 skb, off, len);
+}
+
+static int nf_trace_fill_rule_info(struct sk_buff *nlskb,
+				   const struct nft_traceinfo *info)
+{
+	if (!info->rule)
+		return 0;
+
+	/* a continue verdict with ->type == RETURN means that this is
+	 * an implicit return (end of chain reached).
+	 *
+	 * Since no rule matched, the ->rule pointer is invalid.
+	 */
+	if (info->type == NFT_TRACETYPE_RETURN &&
+	    info->verdict->code == NFT_CONTINUE)
+		return 0;
+
+	return nla_put_be64(nlskb, NFTA_TRACE_RULE_HANDLE,
+			    cpu_to_be64(info->rule->handle));
+}
+
+void nft_trace_notify(struct nft_traceinfo *info)
+{
+	const struct nft_pktinfo *pkt = info->pkt;
+	struct nfgenmsg *nfmsg;
+	struct nlmsghdr *nlh;
+	struct sk_buff *skb;
+	unsigned int size;
+	int event = (NFNL_SUBSYS_NFTABLES << 8) | NFT_MSG_TRACE;
+
+	if (!nfnetlink_has_listeners(pkt->net, NFNLGRP_NFTRACE))
+		return;
+
+	size = nlmsg_total_size(sizeof(struct nfgenmsg)) +
+		nla_total_size(NFT_TABLE_MAXNAMELEN) +
+		nla_total_size(NFT_CHAIN_MAXNAMELEN) +
+		nla_total_size(sizeof(__be64)) +	/* rule handle */
+		nla_total_size(sizeof(__be32)) +	/* trace type */
+		nla_total_size(0) +			/* VERDICT, nested */
+			nla_total_size(sizeof(u32)) +	/* verdict code */
+			nla_total_size(NFT_CHAIN_MAXNAMELEN) + /* jump target */
+		nla_total_size(sizeof(u32)) +		/* id */
+		nla_total_size(NFT_TRACETYPE_LL_HSIZE) +
+		nla_total_size(NFT_TRACETYPE_NETWORK_HSIZE) +
+		nla_total_size(NFT_TRACETYPE_TRANSPORT_HSIZE) +
+		nla_total_size(sizeof(u32)) +		/* mark */
+		nla_total_size(sizeof(u32)) +		/* iif */
+		nla_total_size(sizeof(__be16)) +	/* iiftype */
+		nla_total_size(sizeof(u32)) +		/* oif */
+		nla_total_size(sizeof(__be16));		/* oiftype */
+
+	skb = nlmsg_new(size, GFP_ATOMIC);
+	if (!skb)
+		return;
+
+	nlh = nlmsg_put(skb, 0, 0, event, sizeof(struct nfgenmsg), 0);
+	if (!nlh)
+		goto nla_put_failure;
+
+	nfmsg = nlmsg_data(nlh);
+	nfmsg->nfgen_family	= info->basechain->type->family;
+	nfmsg->version		= NFNETLINK_V0;
+	nfmsg->res_id		= 0;
+
+	if (nla_put_be32(skb, NFTA_TRACE_NFPROTO, htonl(pkt->pf)))
+		goto nla_put_failure;
+
+	if (nla_put_be32(skb, NFTA_TRACE_TYPE, htonl(info->type)))
+		goto nla_put_failure;
+
+	if (trace_fill_id(skb, pkt->skb))
+		goto nla_put_failure;
+
+	if (info->chain) {
+		if (nla_put_string(skb, NFTA_TRACE_CHAIN,
+				   info->chain->name))
+			goto nla_put_failure;
+		if (nla_put_string(skb, NFTA_TRACE_TABLE,
+				   info->chain->table->name))
+			goto nla_put_failure;
+	}
+
+	if (nf_trace_fill_rule_info(skb, info))
+		goto nla_put_failure;
+
+	switch (info->type) {
+	case NFT_TRACETYPE_UNSPEC:
+	case __NFT_TRACETYPE_MAX:
+		break;
+	case NFT_TRACETYPE_RETURN:
+	case NFT_TRACETYPE_RULE:
+		if (nft_verdict_dump(skb, NFTA_TRACE_VERDICT, info->verdict))
+			goto nla_put_failure;
+		break;
+	case NFT_TRACETYPE_POLICY:
+		if (nla_put_be32(skb, NFTA_TRACE_POLICY,
+				 info->basechain->policy))
+			goto nla_put_failure;
+		break;
+	}
+
+	if (pkt->skb->mark &&
+	    nla_put_be32(skb, NFTA_TRACE_MARK, htonl(pkt->skb->mark)))
+		goto nla_put_failure;
+
+	if (!info->packet_dumped) {
+		if (nf_trace_fill_dev_info(skb, pkt->in, pkt->out))
+			goto nla_put_failure;
+
+		if (nf_trace_fill_pkt_info(skb, pkt))
+			goto nla_put_failure;
+		info->packet_dumped = true;
+	}
+
+	nlmsg_end(skb, nlh);
+	nfnetlink_send(skb, pkt->net, 0, NFNLGRP_NFTRACE, 0, GFP_ATOMIC);
+	return;
+
+ nla_put_failure:
+	WARN_ON_ONCE(1);
+	kfree_skb(skb);
+}
+
+void nft_trace_init(struct nft_traceinfo *info, const struct nft_pktinfo *pkt,
+		    const struct nft_verdict *verdict,
+		    const struct nft_chain *chain)
+{
+	info->basechain = nft_base_chain(chain);
+	info->trace = true;
+	info->packet_dumped = false;
+	info->pkt = pkt;
+	info->verdict = verdict;
+}
diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
index 46453ab..28591fa 100644
--- a/net/netfilter/nfnetlink.c
+++ b/net/netfilter/nfnetlink.c
@@ -49,6 +49,7 @@ static const int nfnl_group2type[NFNLGRP_MAX+1] = {
 	[NFNLGRP_CONNTRACK_EXP_DESTROY] = NFNL_SUBSYS_CTNETLINK_EXP,
 	[NFNLGRP_NFTABLES]		= NFNL_SUBSYS_NFTABLES,
 	[NFNLGRP_ACCT_QUOTA]		= NFNL_SUBSYS_ACCT,
+	[NFNLGRP_NFTRACE]		= NFNL_SUBSYS_NFTABLES,
 };
 
 void nfnl_lock(__u8 subsys_id)
-- 
2.4.10


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v3 nf-next 3/3] netfilter: nf_tables: wrap tracing with a static key
  2015-11-27 18:43 GIT: [PATCH v3 0/3] netfilter trace infrastructure support Florian Westphal
  2015-11-27 18:43 ` [PATCH v3 libnftnl 1/3] src: add " Florian Westphal
  2015-11-27 18:43 ` [PATCH v2 nf-next 2/3] netfilter: nf_tables: extend tracing infrastructure Florian Westphal
@ 2015-11-27 18:43 ` Florian Westphal
  2015-11-27 18:55 ` GIT: [PATCH v3 0/3] netfilter trace infrastructure support Patrick McHardy
  3 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2015-11-27 18:43 UTC (permalink / raw)
  To: netfilter-devel; +Cc: Florian Westphal

Only needed when meta nftrace rule(s) were added.
The assumption is that no such rules are active, so the call to
nft_trace_init is "never" needed.

When nftrace rules are active, we always call the nft_trace_* functions,
but will only send netlink messages when all of the following are true:

 - traceinfo structure was initialised
 - skb->nf_trace == 1
 - at least one subscriber to trace group.

Adding an extra conditional
(static_branch ... && skb->nf_trace)
	nft_trace_init( ..)

Is possible but results in a larger nft_do_chain footprint.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 Changes since v2:
  - avoid bloat by always calling __nft_trace_packet once
    static key is active.
 Changes since v1:
  - do not wrap rulenum++ into static key, because
  it might get incremented too late when trace
  rules are added/removed at high rate, giving wrong results.
  - add EXPORT_SYMBOL for nft_meta_set_destroy.

 include/net/netfilter/nf_tables_core.h |  1 +
 include/net/netfilter/nft_meta.h       |  3 +++
 net/bridge/netfilter/nft_meta_bridge.c |  1 +
 net/netfilter/nf_tables_core.c         |  9 ++++++---
 net/netfilter/nf_tables_trace.c        |  4 ++++
 net/netfilter/nft_meta.c               | 16 ++++++++++++++++
 6 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/include/net/netfilter/nf_tables_core.h b/include/net/netfilter/nf_tables_core.h
index 4ff5424..a9060dd 100644
--- a/include/net/netfilter/nf_tables_core.h
+++ b/include/net/netfilter/nf_tables_core.h
@@ -57,6 +57,7 @@ struct nft_payload_set {
 };
 
 extern const struct nft_expr_ops nft_payload_fast_ops;
+extern struct static_key_false nft_trace_enabled;
 
 int nft_payload_module_init(void);
 void nft_payload_module_exit(void);
diff --git a/include/net/netfilter/nft_meta.h b/include/net/netfilter/nft_meta.h
index 711887a..d27588c 100644
--- a/include/net/netfilter/nft_meta.h
+++ b/include/net/netfilter/nft_meta.h
@@ -33,4 +33,7 @@ void nft_meta_set_eval(const struct nft_expr *expr,
 		       struct nft_regs *regs,
 		       const struct nft_pktinfo *pkt);
 
+void nft_meta_set_destroy(const struct nft_ctx *ctx,
+			  const struct nft_expr *expr);
+
 #endif
diff --git a/net/bridge/netfilter/nft_meta_bridge.c b/net/bridge/netfilter/nft_meta_bridge.c
index a21269b..4b901d9 100644
--- a/net/bridge/netfilter/nft_meta_bridge.c
+++ b/net/bridge/netfilter/nft_meta_bridge.c
@@ -84,6 +84,7 @@ static const struct nft_expr_ops nft_meta_bridge_set_ops = {
 	.size		= NFT_EXPR_SIZE(sizeof(struct nft_meta)),
 	.eval		= nft_meta_set_eval,
 	.init		= nft_meta_set_init,
+	.destroy	= nft_meta_set_destroy,
 	.dump		= nft_meta_set_dump,
 };
 
diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index 2395de7..67fa41d 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -16,6 +16,7 @@
 #include <linux/skbuff.h>
 #include <linux/netlink.h>
 #include <linux/netfilter.h>
+#include <linux/static_key.h>
 #include <linux/netfilter/nfnetlink.h>
 #include <linux/netfilter/nf_tables.h>
 #include <net/netfilter/nf_tables_core.h>
@@ -50,7 +51,7 @@ static noinline void __nft_trace_packet(struct nft_traceinfo *info,
 {
 	const struct nft_pktinfo *pkt = info->pkt;
 
-	if (!pkt->skb->nf_trace)
+	if (!info->trace || !pkt->skb->nf_trace)
 		return;
 
 	info->chain = chain;
@@ -70,7 +71,7 @@ static inline void nft_trace_packet(struct nft_traceinfo *info,
 				    int rulenum,
 				    enum nft_trace_types type)
 {
-	if (unlikely(info->trace)) {
+	if (static_branch_unlikely(&nft_trace_enabled)) {
 		info->rule = rule;
 		__nft_trace_packet(info, chain, rulenum, type);
 	}
@@ -137,7 +138,9 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 	unsigned int gencursor = nft_genmask_cur(net);
 	struct nft_traceinfo info;
 
-	nft_trace_init(&info, pkt, &regs.verdict, basechain);
+	info.trace = false;
+	if (static_branch_unlikely(&nft_trace_enabled))
+		nft_trace_init(&info, pkt, &regs.verdict, basechain);
 do_chain:
 	rulenum = 0;
 	rule = list_entry(&chain->rules, struct nft_rule, list);
diff --git a/net/netfilter/nf_tables_trace.c b/net/netfilter/nf_tables_trace.c
index 672add8..39beca0 100644
--- a/net/netfilter/nf_tables_trace.c
+++ b/net/netfilter/nf_tables_trace.c
@@ -8,6 +8,7 @@
  */
 
 #include <linux/module.h>
+#include <linux/static_key.h>
 #include <linux/hash.h>
 #include <linux/jhash.h>
 #include <linux/if_vlan.h>
@@ -24,6 +25,9 @@
 #define NFT_TRACETYPE_NETWORK_HSIZE	40
 #define NFT_TRACETYPE_TRANSPORT_HSIZE	20
 
+DEFINE_STATIC_KEY_FALSE(nft_trace_enabled);
+EXPORT_SYMBOL_GPL(nft_trace_enabled);
+
 static int trace_fill_id(struct sk_buff *nlskb, struct sk_buff *skb)
 {
 	__be32 id;
diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
index 9dfaf4d..85a465b 100644
--- a/net/netfilter/nft_meta.c
+++ b/net/netfilter/nft_meta.c
@@ -18,10 +18,12 @@
 #include <linux/ip.h>
 #include <linux/ipv6.h>
 #include <linux/smp.h>
+#include <linux/static_key.h>
 #include <net/dst.h>
 #include <net/sock.h>
 #include <net/tcp_states.h> /* for TCP_TIME_WAIT */
 #include <net/netfilter/nf_tables.h>
+#include <net/netfilter/nf_tables_core.h>
 #include <net/netfilter/nft_meta.h>
 
 void nft_meta_get_eval(const struct nft_expr *expr,
@@ -297,6 +299,9 @@ int nft_meta_set_init(const struct nft_ctx *ctx,
 	if (err < 0)
 		return err;
 
+	if (priv->key == NFT_META_NFTRACE)
+		static_branch_inc(&nft_trace_enabled);
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(nft_meta_set_init);
@@ -334,6 +339,16 @@ nla_put_failure:
 }
 EXPORT_SYMBOL_GPL(nft_meta_set_dump);
 
+void nft_meta_set_destroy(const struct nft_ctx *ctx,
+			  const struct nft_expr *expr)
+{
+	const struct nft_meta *priv = nft_expr_priv(expr);
+
+	if (priv->key == NFT_META_NFTRACE)
+		static_branch_dec(&nft_trace_enabled);
+}
+EXPORT_SYMBOL_GPL(nft_meta_set_destroy);
+
 static struct nft_expr_type nft_meta_type;
 static const struct nft_expr_ops nft_meta_get_ops = {
 	.type		= &nft_meta_type,
@@ -348,6 +363,7 @@ static const struct nft_expr_ops nft_meta_set_ops = {
 	.size		= NFT_EXPR_SIZE(sizeof(struct nft_meta)),
 	.eval		= nft_meta_set_eval,
 	.init		= nft_meta_set_init,
+	.destroy	= nft_meta_set_destroy,
 	.dump		= nft_meta_set_dump,
 };
 
-- 
2.4.10


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: GIT: [PATCH v3 0/3] netfilter trace infrastructure support
  2015-11-27 18:43 GIT: [PATCH v3 0/3] netfilter trace infrastructure support Florian Westphal
                   ` (2 preceding siblings ...)
  2015-11-27 18:43 ` [PATCH v3 nf-next 3/3] netfilter: nf_tables: wrap tracing with a static key Florian Westphal
@ 2015-11-27 18:55 ` Patrick McHardy
  3 siblings, 0 replies; 8+ messages in thread
From: Patrick McHardy @ 2015-11-27 18:55 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On 27.11, Florian Westphal wrote:
> 3rd round.
> 
> See the individual patches for changelog vs. V 2.
> 
> Patrick: Nothing fancy from libnftnl side, just the 3 new attributes
> (oiftype, nfproto, policy).
> 
> On kernel side I addressed all of the comments.
> 
> In case of NFPROTO_INET, the new NFTA_TRACE_NFPROTO will be 'NFPROTO_IPV4'
> (or IPV6), the netlink header family is set to NFPROTO_INET.

That's sounds good.

> I tried to reduce nft_do_chain bloat, +80 on my system.

That's even better than I expected, thanks!

> We have one unconditional foo = false assignment, all nftrace calls
> are avoided in 'no trace rules set' case via static key.
> 
> Let me know if you spot more problems.

I'll review it tommorrow if you don't mind, currently trying to wrap my
head around the sub-byte and offset expressions. We have some limitations
which I'd like to address if I'm already working in that area before I
continue on the tracing part.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 nf-next 2/3] netfilter: nf_tables: extend tracing infrastructure
  2015-11-27 18:43 ` [PATCH v2 nf-next 2/3] netfilter: nf_tables: extend tracing infrastructure Florian Westphal
@ 2015-11-28 12:36   ` Patrick McHardy
  0 siblings, 0 replies; 8+ messages in thread
From: Patrick McHardy @ 2015-11-28 12:36 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On 27.11, Florian Westphal wrote:
> nft monitor mode can then decode and display this trace data.
> 
> Parts of LL/Network/Transport headers are provided as separate
> attributes.
> 
> Otherwise, printing IP address data becomes virtually impossible
> for userspace since in the case of the netdev family we really don't
> want userspace to have to know all the possible link layer types
> and/or sizes just to display/print an ip address.
> 
> We also don't want userspace to have to follow ipv6 header chains
> to get the s/dport info, the kernel already did this work for us.
> 
> To avoid bloating nft_do_chain all data required for tracing is
> encapsulated in nft_traceinfo.
> 
> The structure is initialized unconditionally(!) for each nft_do_chain
> invocation.
> 
> This unconditionall call will be moved under a static key in a
> followup patch.
> 
> With lots of help from Patrick McHardy and Pablo Neira.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>

Looks very good to me, nice work!

Acked-by: Patrick McHardy <kaber@trash.net>

for both kernel patches.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 libnftnl 1/3] src: add trace infrastructure support
  2015-11-27 18:43 ` [PATCH v3 libnftnl 1/3] src: add " Florian Westphal
@ 2015-12-09 13:55   ` Pablo Neira Ayuso
  2015-12-09 14:06     ` Florian Westphal
  0 siblings, 1 reply; 8+ messages in thread
From: Pablo Neira Ayuso @ 2015-12-09 13:55 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel

On Fri, Nov 27, 2015 at 07:43:18PM +0100, Florian Westphal wrote:
> parses trace monitor netlink messages from the kernel and builds
> nftnl_trace struct that contains the dissected information.
> 
> Provides getters to access these attributes.

I leave it to you to push trace code to libnftnl. Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v3 libnftnl 1/3] src: add trace infrastructure support
  2015-12-09 13:55   ` Pablo Neira Ayuso
@ 2015-12-09 14:06     ` Florian Westphal
  0 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2015-12-09 14:06 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel, kaber

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Fri, Nov 27, 2015 at 07:43:18PM +0100, Florian Westphal wrote:
> > parses trace monitor netlink messages from the kernel and builds
> > nftnl_trace struct that contains the dissected information.
> > 
> > Provides getters to access these attributes.
> 
> I leave it to you to push trace code to libnftnl. Thanks.

Pushed libnftnl changes, thanks Pablo.

@Patrick: I won't push anything trace related to nftables.git since I'm
assuming that you'll take care of nftables side eventually.

Cheers,
Florian

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-12-09 14:06 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-11-27 18:43 GIT: [PATCH v3 0/3] netfilter trace infrastructure support Florian Westphal
2015-11-27 18:43 ` [PATCH v3 libnftnl 1/3] src: add " Florian Westphal
2015-12-09 13:55   ` Pablo Neira Ayuso
2015-12-09 14:06     ` Florian Westphal
2015-11-27 18:43 ` [PATCH v2 nf-next 2/3] netfilter: nf_tables: extend tracing infrastructure Florian Westphal
2015-11-28 12:36   ` Patrick McHardy
2015-11-27 18:43 ` [PATCH v3 nf-next 3/3] netfilter: nf_tables: wrap tracing with a static key Florian Westphal
2015-11-27 18:55 ` GIT: [PATCH v3 0/3] netfilter trace infrastructure support Patrick McHardy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).