* [PATCH nf,v2 0/2] nf_tables: limit maximum number of jumps/gotos per netns
@ 2025-10-27 22:17 Pablo Neira Ayuso
  2025-10-27 22:17 ` [PATCH nf 1/2] netfilter: " Pablo Neira Ayuso
  2025-10-27 22:17 ` [PATCH nf 2/2] selftests: netfilter: add test for nf_tables_jumps_max_netns sysctl Pablo Neira Ayuso
  0 siblings, 2 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2025-10-27 22:17 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw, ffmancera, brady.1345

Hi,

This series contains v2 of the patches that add a limit on the number of
jumps/gotos per netns.

Pablo Neira Ayuso (2):
  netfilter: nf_tables: limit maximum number of jumps/gotos per netns
  selftests: netfilter: add test for nf_tables_jumps_max_netns sysctl

 Documentation/networking/netfilter-sysctl.rst |  15 ++
 include/net/netfilter/nf_tables.h             |   7 +
 include/net/netns/netfilter.h                 |   6 +
 net/netfilter/Makefile                        |   2 +-
 net/netfilter/core.c                          |   9 ++
 net/netfilter/nf_tables_api.c                 |  95 +++++++++++-
 net/netfilter/nf_tables_sysctl.c              |  91 +++++++++++
 net/netfilter/nft_immediate.c                 |   4 +
 net/netfilter/nft_lookup.c                    |   9 ++
 .../testing/selftests/net/netfilter/Makefile  |   2 +
 .../net/netfilter/gen_ruleset_many_jumps.c    | 145 ++++++++++++++++++
 .../net/netfilter/nft_ruleset_many_jumps.sh   | 118 ++++++++++++++
 12 files changed, 498 insertions(+), 5 deletions(-)
 create mode 100644 net/netfilter/nf_tables_sysctl.c
 create mode 100644 tools/testing/selftests/net/netfilter/gen_ruleset_many_jumps.c
 create mode 100755 tools/testing/selftests/net/netfilter/nft_ruleset_many_jumps.sh

-- 
2.30.2



* [PATCH nf 1/2] netfilter: nf_tables: limit maximum number of jumps/gotos per netns
  2025-10-27 22:17 [PATCH nf,v2 0/2] nf_tables: limit maximum number of jumps/gotos per netns Pablo Neira Ayuso
@ 2025-10-27 22:17 ` Pablo Neira Ayuso
  2025-10-28 13:06   ` Florian Westphal
                     ` (3 more replies)
  2025-10-27 22:17 ` [PATCH nf 2/2] selftests: netfilter: add test for nf_tables_jumps_max_netns sysctl Pablo Neira Ayuso
  1 sibling, 4 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2025-10-27 22:17 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw, ffmancera, brady.1345

Add a new sysctl:

   net.netfilter.nf_tables_jumps_max_netns

which defaults to 65535, because iptables-nft rulesets are likely to have
more jumps/gotos than native nftables rulesets.

This limit prevents soft lockups on the packet path caused by crafted
rulesets with too many jumps.

The default limit (in Shaun Brady's words) was chosen to account for any
normal use case, and when this value (and associated stressing loop
table) was tested against a 1CPU/256MB machine, the system remained
functional.

This jump/goto limit is global for all tables that are defined in the
netns, regardless of the family type.

Note that verdict maps count as a single jump/goto because a map lookup
provides a single exact match, therefore, this is equivalent to an
immediate jump.

This limit is not the net count of jumps in your ruleset, but the number
of jumps that can be visited when traversing the directed acyclic graph
with a depth-first search. This is done from the control plane, where the
evaluation of the selectors is not possible; therefore, this represents
the hypothetical worst case.
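
To illustrate the difference between the net count and the DFS count, here
is a minimal userspace sketch. It is not the kernel code: struct
chain_model, dfs_jump_count() and example_dfs_count() are hypothetical
names used only for this illustration, and the model ignores everything
about a chain except its jump targets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a chain: only its jump/goto targets matter here. */
struct chain_model {
	const struct chain_model *const *targets;
	size_t ntargets;
};

/* Count every jump visited by a depth-first walk that starts at chain c.
 * A chain reached through N different paths is walked N times, which is
 * why the result can be much larger than the net number of jump rules.
 */
unsigned long dfs_jump_count(const struct chain_model *c)
{
	unsigned long count = 0;
	size_t i;

	assert(c != NULL);
	for (i = 0; i < c->ntargets; i++)
		count += 1 + dfs_jump_count(c->targets[i]);

	return count;
}

/* y0 has 1 jump to y1, y1 has 3 jumps to y2, y2 has 2 jumps to y3. */
unsigned long example_dfs_count(void)
{
	static const struct chain_model y3 = { NULL, 0 };
	static const struct chain_model *const t2[] = { &y3, &y3 };
	static const struct chain_model y2 = { t2, 2 };
	static const struct chain_model *const t1[] = { &y2, &y2, &y2 };
	static const struct chain_model y1 = { t1, 3 };
	static const struct chain_model *const t0[] = { &y1 };
	static const struct chain_model y0 = { t0, 1 };

	return dfs_jump_count(&y0);
}
```

In this example the ruleset contains 1 + 3 + 2 = 6 jump rules, but the
depth-first walk visits 1 + 3 + 3*2 = 10 jumps, and it is the latter value
that would be compared against the limit.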

This patch adds a jump_count[2] field per table which stores the current
number of jumps/gotos in the present [0] and the future [1]. During the
preparation phase, jump_count[1] is updated to store the number of jumps
after the last table validation while processing the batch.

If this batch does not update the number of jumps in this table, then
jump_count[0] provides the current number of jumps and jump_count[1] is
set to -1. When checking whether the number of jumps goes over the limit,
use jump_count[1] if it is >= 0, meaning the number of jumps for this
table has been modified by this batch; otherwise use jump_count[0],
meaning this batch has not changed the number of jumps in this table.

On commit, jump_count[0] is set to jump_count[1] if the latter is >= 0,
and jump_count[1] is then reset to -1. In case of abort, jump_count[1] is
simply reset to -1 to prepare for handling the next batch.
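
The current/next bookkeeping can be sketched in plain userspace C as
follows. This is only an illustrative model under simplifying assumptions:
struct table_model, total_jumps(), batch_commit() and batch_abort() are
hypothetical names, and no locking or transaction handling is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-table counters described above. */
struct table_model {
	int jump_count[2];	/* [0] current, [1] next or -1 if untouched */
};

/* Sum over all tables, preferring the next count when the batch set it. */
long total_jumps(const struct table_model *tables, size_t n)
{
	long total = 0;
	size_t i;

	assert(tables != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (tables[i].jump_count[1] >= 0)
			total += tables[i].jump_count[1];
		else
			total += tables[i].jump_count[0];
	}

	return total;
}

/* Commit: the next count becomes the current one where it was set. */
void batch_commit(struct table_model *tables, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tables[i].jump_count[1] < 0)
			continue;

		tables[i].jump_count[0] = tables[i].jump_count[1];
		tables[i].jump_count[1] = -1;
	}
}

/* Abort: drop the next count, the current one stays as it was. */
void batch_abort(struct table_model *tables, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		tables[i].jump_count[1] = -1;
}
```

Using -1 as the "untouched" marker lets a batch that removes every jump
from a table record a next count of 0 without it being mistaken for "no
change".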

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
v2: - set table's validate_state to NFT_VALIDATE_DO when jump count check fails,
      otherwise validation loops forever if tables already exist
    - disallow user_ns to update nf_tables_jumps_max_netns sysctl
    - move sysctl initialization to core instead of nf_tables module
      (it should be easy to take this back so nf_tables_jumps_max_netns sysctl
       becomes available only when nf_tables module is loaded)
    - move sysctl code to nf_tables_sysctl.c
    - update Documentation
    - remove WARN_ON_ONCE when changing validate_state from SKIP -> DO
      needed when jump_count check fails.

Note: Given that soft lockups have always been possible, the nf-next tree is
      also a good target tree for this series.

 Documentation/networking/netfilter-sysctl.rst | 15 +++
 include/net/netfilter/nf_tables.h             |  7 ++
 include/net/netns/netfilter.h                 |  6 ++
 net/netfilter/Makefile                        |  2 +-
 net/netfilter/core.c                          |  9 ++
 net/netfilter/nf_tables_api.c                 | 95 ++++++++++++++++++-
 net/netfilter/nf_tables_sysctl.c              | 91 ++++++++++++++++++
 net/netfilter/nft_immediate.c                 |  4 +
 net/netfilter/nft_lookup.c                    |  9 ++
 9 files changed, 233 insertions(+), 5 deletions(-)
 create mode 100644 net/netfilter/nf_tables_sysctl.c

diff --git a/Documentation/networking/netfilter-sysctl.rst b/Documentation/networking/netfilter-sysctl.rst
index beb6d7b275d4..f0e6312a8814 100644
--- a/Documentation/networking/netfilter-sysctl.rst
+++ b/Documentation/networking/netfilter-sysctl.rst
@@ -15,3 +15,18 @@ nf_log_all_netns - BOOLEAN
 	with LOG target; this aims to prevent containers from flooding host
 	kernel log. If enabled, this target also works in other network
 	namespaces. This variable is only accessible from init_net.
+
+nf_tables_jumps_max_netns - INTEGER (count)
+	default 65535
+
+	This is the maximum number of jumps/gotos that a netns can have across
+	its tables. This limit does not represent the net count of jumps in
+	your ruleset; rather, it represents the number of jumps that can be
+	reached when traversing the ruleset via a depth-first search (DFS).
+	This limit is determined in the control plane, where evaluating the
+	rule selectors is not possible; therefore, it represents the
+	hypothetical worst case. This limit prevents packet path soft lockups
+	caused by rulesets with too many jumps. This limit only applies to
+	non-init_net namespaces and is read-only in namespaces not owned by
+	init_user_ns. Exceeding this value prevents additional rules from
+	being added and returns an EMLINK error to the user.
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index fab7dc73f738..c50528a77901 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -209,6 +209,7 @@ static inline void nft_data_copy(u32 *dst, const struct nft_data *src,
  *	@family: protocol family
  *	@level: depth of the chains
  *	@report: notify via unicast netlink message
+ *	@jump_count: jump to chain counter
  *	@reg_inited: bitmap of initialised registers
  */
 struct nft_ctx {
@@ -222,6 +223,7 @@ struct nft_ctx {
 	u8				family;
 	u8				level;
 	bool				report;
+	int				jump_count;
 	DECLARE_BITMAP(reg_inited, NFT_REG32_NUM);
 };
 
@@ -1279,6 +1281,7 @@ static inline void nft_use_inc_restore(u32 *use)
  *	@family:address family
  *	@flags: table flag (see enum nft_table_flags)
  *	@genmask: generation mask
+ *	@jump_count: current [0] and next [1] jump to chain counter
  *	@nlpid: netlink port ID
  *	@name: name of the table
  *	@udlen: length of the user data
@@ -1298,6 +1301,7 @@ struct nft_table {
 	u16				family:6,
 					flags:8,
 					genmask:2;
+	int				jump_count[2];
 	u32				nlpid;
 	char				*name;
 	u16				udlen;
@@ -1903,6 +1907,9 @@ __printf(2, 3) int nft_request_module(struct net *net, const char *fmt, ...);
 static inline int nft_request_module(struct net *net, const char *fmt, ...) { return -ENOENT; }
 #endif
 
+int netfilter_nf_tables_sysctl_init(void);
+void netfilter_nf_tables_sysctl_fini(void);
+
 struct nftables_pernet {
 	struct list_head	tables;
 	struct list_head	commit_list;
diff --git a/include/net/netns/netfilter.h b/include/net/netns/netfilter.h
index a6a0bf4a247e..6199f00fa2cb 100644
--- a/include/net/netns/netfilter.h
+++ b/include/net/netns/netfilter.h
@@ -18,6 +18,9 @@ struct netns_nf {
 #ifdef CONFIG_LWTUNNEL
 	struct ctl_table_header *nf_lwtnl_dir_header;
 #endif
+#if IS_ENABLED(CONFIG_NF_TABLES)
+	struct ctl_table_header *nf_tables_dir_header;
+#endif
 #endif
 	struct nf_hook_entries __rcu *hooks_ipv4[NF_INET_NUMHOOKS];
 	struct nf_hook_entries __rcu *hooks_ipv6[NF_INET_NUMHOOKS];
@@ -33,5 +36,8 @@ struct netns_nf {
 #if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
 	unsigned int defrag_ipv6_users;
 #endif
+#if IS_ENABLED(CONFIG_NF_TABLES)
+	unsigned int nf_tables_jumps_max_netns;
+#endif
 };
 #endif
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 6bfc250e474f..cdd9cadbd76c 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-netfilter-objs := core.o nf_log.o nf_queue.o nf_sockopt.o utils.o
+netfilter-objs := core.o nf_log.o nf_tables_sysctl.o nf_queue.o nf_sockopt.o utils.o
 
 nf_conntrack-y	:= nf_conntrack_core.o nf_conntrack_standalone.o nf_conntrack_expect.o nf_conntrack_helper.o \
 		   nf_conntrack_proto.o nf_conntrack_proto_generic.o nf_conntrack_proto_tcp.o nf_conntrack_proto_udp.o \
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 11a702065bab..2753e8aa3f1f 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -24,6 +24,7 @@
 #include <linux/rcupdate.h>
 #include <net/net_namespace.h>
 #include <net/netfilter/nf_queue.h>
+#include <net/netfilter/nf_tables.h>
 #include <net/sock.h>
 
 #include "nf_internals.h"
@@ -814,13 +815,21 @@ int __init netfilter_init(void)
 	ret = netfilter_lwtunnel_init();
 	if (ret < 0)
 		goto err_lwtunnel_pernet;
+#endif
+#if IS_ENABLED(CONFIG_NF_TABLES)
+	ret = netfilter_nf_tables_sysctl_init();
+	if (ret < 0)
+		goto err_nft_pernet;
 #endif
 	ret = netfilter_log_init();
 	if (ret < 0)
 		goto err_log_pernet;
 
 	return 0;
+
 err_log_pernet:
+	netfilter_nf_tables_sysctl_fini();
+err_nft_pernet:
 #ifdef CONFIG_LWTUNNEL
 	netfilter_lwtunnel_fini();
 err_lwtunnel_pernet:
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index eed434e0a970..b3fcdce56e98 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -112,7 +112,6 @@ static void nft_validate_state_update(struct nft_table *table, u8 new_validate_s
 {
 	switch (table->validate_state) {
 	case NFT_VALIDATE_SKIP:
-		WARN_ON_ONCE(new_validate_state == NFT_VALIDATE_DO);
 		break;
 	case NFT_VALIDATE_NEED:
 		break;
@@ -140,6 +139,7 @@ static void nft_ctx_init(struct nft_ctx *ctx,
 	ctx->net	= net;
 	ctx->family	= family;
 	ctx->level	= 0;
+	ctx->jump_count = 0;
 	ctx->table	= table;
 	ctx->chain	= chain;
 	ctx->nla   	= nla;
@@ -1631,6 +1631,8 @@ static int nf_tables_newtable(struct sk_buff *skb, const struct nfnl_info *info,
 	if (table->flags & NFT_TABLE_F_OWNER)
 		table->nlpid = NETLINK_CB(skb).portid;
 
+	table->jump_count[1] = -1;
+
 	nft_ctx_init(&ctx, net, skb, info->nlh, family, table, NULL, nla);
 	err = nft_trans_table_add(&ctx, NFT_MSG_NEWTABLE);
 	if (err < 0)
@@ -4121,13 +4123,14 @@ int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain)
 }
 EXPORT_SYMBOL_GPL(nft_chain_validate);
 
-static int nft_table_validate(struct net *net, const struct nft_table *table)
+static int nft_table_validate(struct net *net, struct nft_table *table)
 {
 	struct nft_chain *chain;
 	struct nft_ctx ctx = {
 		.net	= net,
 		.family	= table->family,
 	};
+	u32 jump_count = 0;
 	int err;
 
 	list_for_each_entry(chain, &table->chains, list) {
@@ -4140,8 +4143,11 @@ static int nft_table_validate(struct net *net, const struct nft_table *table)
 			return err;
 
 		cond_resched();
+		jump_count += ctx.jump_count;
 	}
 
+	table->jump_count[1] = jump_count;
+
 	return 0;
 }
 
@@ -4202,6 +4208,39 @@ int nft_set_catchall_validate(const struct nft_ctx *ctx, struct nft_set *set)
 	return ret;
 }
 
+static u32 nft_jump_count(struct net *net)
+{
+	struct nftables_pernet *nft_net = nft_pernet(net);
+	struct nft_table *table;
+	u32 jump_count = 0;
+
+	list_for_each_entry(table, &nft_net->tables, list) {
+		/* If table has been updated with new jumps in this batch, then
+		 * use future jump count. Otherwise, use current jump count.
+		 */
+		if (table->jump_count[1] < 0)
+			jump_count += table->jump_count[0];
+		else
+			jump_count += table->jump_count[1];
+	}
+
+	return jump_count;
+}
+
+static int nft_jump_count_check(struct net *net)
+{
+	u32 jump_count;
+
+	if (net_eq(net, &init_net))
+		return 0;
+
+	jump_count = nft_jump_count(net);
+	if (jump_count > net->nf.nf_tables_jumps_max_netns)
+		return -EMLINK;
+
+	return 0;
+}
+
 static struct nft_rule *nft_rule_lookup_byid(const struct net *net,
 					     const struct nft_chain *chain,
 					     const struct nlattr *nla);
@@ -4421,8 +4460,17 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info,
 	if (flow)
 		nft_trans_flow_rule(trans) = flow;
 
-	if (table->validate_state == NFT_VALIDATE_DO)
-		return nft_table_validate(net, table);
+	if (table->validate_state == NFT_VALIDATE_DO) {
+		err = nft_table_validate(net, table);
+		if (err < 0)
+			return err;
+
+		/* rule might jump to chain either via immediate or lookup,
+		 * check if jump to chain count goes over the limit.
+		 */
+		if (nft_jump_count_check(net) < 0)
+			return -EMLINK;
+	}
 
 	return 0;
 
@@ -10109,6 +10157,17 @@ static int nf_tables_validate(struct net *net)
 		}
 	}
 
+	if (nft_jump_count_check(net) < 0) {
+		list_for_each_entry(table, &nft_net->tables, list) {
+			if (table->jump_count[1] < 0)
+				continue;
+
+			nft_validate_state_update(table, NFT_VALIDATE_DO);
+		}
+
+		return -EAGAIN;
+	}
+
 	return 0;
 }
 
@@ -10869,6 +10928,30 @@ static void nft_gc_seq_end(struct nftables_pernet *nft_net, unsigned int gc_seq)
 	WRITE_ONCE(nft_net->gc_seq, ++gc_seq);
 }
 
+static void nft_jump_count_reset(struct net *net)
+{
+	struct nftables_pernet *nft_net = nft_pernet(net);
+	struct nft_table *table;
+
+	list_for_each_entry(table, &nft_net->tables, list)
+		table->jump_count[1] = -1;
+}
+
+static void nft_jump_count_update(struct net *net)
+{
+	struct nftables_pernet *nft_net = nft_pernet(net);
+	struct nft_table *table;
+
+	list_for_each_entry(table, &nft_net->tables, list) {
+		/* no new jumps in this table, skip. */
+		if (table->jump_count[1] < 0)
+			continue;
+
+		table->jump_count[0] = table->jump_count[1];
+		table->jump_count[1] = -1;
+	}
+}
+
 static int nf_tables_commit(struct net *net, struct sk_buff *skb)
 {
 	struct nftables_pernet *nft_net = nft_pernet(net);
@@ -10926,6 +11009,8 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
 	if (err < 0)
 		return err;
 
+	nft_jump_count_update(net);
+
 	/* 1.  Allocate space for next generation rules_gen_X[] */
 	list_for_each_entry_safe(trans, next, &nft_net->commit_list, list) {
 		struct nft_table *table = trans->table;
@@ -11266,6 +11351,8 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
 	    nf_tables_validate(net) < 0)
 		err = -EAGAIN;
 
+	nft_jump_count_reset(net);
+
 	list_for_each_entry_safe_reverse(trans, next, &nft_net->commit_list,
 					 list) {
 		struct nft_table *table = trans->table;
diff --git a/net/netfilter/nf_tables_sysctl.c b/net/netfilter/nf_tables_sysctl.c
new file mode 100644
index 000000000000..a4a062898c3d
--- /dev/null
+++ b/net/netfilter/nf_tables_sysctl.c
@@ -0,0 +1,91 @@
+#include <linux/init.h>
+#include <linux/sysctl.h>
+#include <net/netfilter/nf_tables.h>
+#include <net/net_namespace.h>
+
+#ifdef CONFIG_SYSCTL
+enum nf_ct_sysctl_index {
+	NF_SYSCTL_NFT_JUMPS_MAX,
+	NF_SYSCTL_NFT_LAST_SYSCTL
+};
+
+static struct ctl_table nf_tables_sysctl_table[] = {
+	[NF_SYSCTL_NFT_JUMPS_MAX] = {
+		.procname       = "nf_tables_jumps_max_netns",
+		.data           = &init_net.nf.nf_tables_jumps_max_netns,
+		.maxlen         = sizeof(init_net.nf.nf_tables_jumps_max_netns),
+		.mode           = 0644,
+		.proc_handler   = proc_dointvec_minmax,
+		.extra1		= SYSCTL_ONE,
+		.extra2		= SYSCTL_INT_MAX,
+	},
+};
+
+#define NFT_TABLE_DEFAULT_JUMPS_MAX 65535
+
+static int __net_init nf_tables_sysctl_init(struct net *net)
+{
+	struct ctl_table *table = nf_tables_sysctl_table;
+
+	BUILD_BUG_ON(ARRAY_SIZE(nf_tables_sysctl_table) != NF_SYSCTL_NFT_LAST_SYSCTL);
+
+	if (net_eq(net, &init_net)) {
+		net->nf.nf_tables_jumps_max_netns = NFT_TABLE_DEFAULT_JUMPS_MAX;
+	} else {
+		table = kmemdup(nf_tables_sysctl_table,
+				sizeof(nf_tables_sysctl_table), GFP_KERNEL);
+		if (!table)
+			return -ENOMEM;
+
+		net->nf.nf_tables_jumps_max_netns =
+			init_net.nf.nf_tables_jumps_max_netns;
+		table[NF_SYSCTL_NFT_JUMPS_MAX].data =
+			&net->nf.nf_tables_jumps_max_netns;
+
+		if (net->user_ns != &init_user_ns)
+			table[NF_SYSCTL_NFT_JUMPS_MAX].mode &= ~0222;
+	}
+
+	net->nf.nf_tables_dir_header =
+		register_net_sysctl_sz(net, "net/netfilter", table,
+				       ARRAY_SIZE(nf_tables_sysctl_table));
+	if (!net->nf.nf_tables_dir_header)
+		goto err_tbl_free;
+
+	return 0;
+
+err_tbl_free:
+	if (table != nf_tables_sysctl_table)
+		kfree(table);
+
+	return -ENOMEM;
+}
+
+static void nf_tables_sysctl_exit(struct net *net)
+{
+	const struct ctl_table *table;
+
+	unregister_net_sysctl_table(net->nf.nf_tables_dir_header);
+	table = net->nf.nf_tables_dir_header->ctl_table_arg;
+	if (!net_eq(net, &init_net))
+		kfree(table);
+}
+
+static struct pernet_operations nf_tables_sysctl_net_ops = {
+	.init = nf_tables_sysctl_init,
+	.exit = nf_tables_sysctl_exit,
+};
+
+int __init netfilter_nf_tables_sysctl_init(void)
+{
+	return register_pernet_subsys(&nf_tables_sysctl_net_ops);
+}
+
+void netfilter_nf_tables_sysctl_fini(void)
+{
+	unregister_pernet_subsys(&nf_tables_sysctl_net_ops);
+}
+#else
+int __init netfilter_nf_tables_sysctl_init(void) { return 0; }
+void netfilter_nf_tables_sysctl_fini(void) {}
+#endif /* CONFIG_SYSCTL */
diff --git a/net/netfilter/nft_immediate.c b/net/netfilter/nft_immediate.c
index 02ee5fb69871..43f81c81d179 100644
--- a/net/netfilter/nft_immediate.c
+++ b/net/netfilter/nft_immediate.c
@@ -259,6 +259,10 @@ static int nft_immediate_validate(const struct nft_ctx *ctx,
 	switch (data->verdict.code) {
 	case NFT_JUMP:
 	case NFT_GOTO:
+		if (pctx->jump_count >= INT_MAX)
+			return -EMLINK;
+
+		pctx->jump_count++;
 		pctx->level++;
 		err = nft_chain_validate(ctx, data->verdict.chain);
 		if (err < 0)
diff --git a/net/netfilter/nft_lookup.c b/net/netfilter/nft_lookup.c
index 58c5b14889c4..0051aa5574e6 100644
--- a/net/netfilter/nft_lookup.c
+++ b/net/netfilter/nft_lookup.c
@@ -246,12 +246,16 @@ static int nft_lookup_validate(const struct nft_ctx *ctx,
 			       const struct nft_expr *expr)
 {
 	const struct nft_lookup *priv = nft_expr_priv(expr);
+	struct nft_ctx *pctx = (struct nft_ctx *)ctx;
 	struct nft_set_iter iter;
 
 	if (!(priv->set->flags & NFT_SET_MAP) ||
 	    priv->set->dtype != NFT_DATA_VERDICT)
 		return 0;
 
+	if (pctx->jump_count >= INT_MAX)
+		return -EMLINK;
+
 	iter.genmask	= nft_genmask_next(ctx->net);
 	iter.type	= NFT_ITER_UPDATE;
 	iter.skip	= 0;
@@ -266,6 +270,11 @@ static int nft_lookup_validate(const struct nft_ctx *ctx,
 	if (iter.err < 0)
 		return iter.err;
 
+	/* A verdict map lookup yields at most one exact match, so count
+	 * only one jump per set reference.
+	 */
+	pctx->jump_count++;
+
 	return 0;
 }
 
-- 
2.30.2



* [PATCH nf 2/2] selftests: netfilter: add test for nf_tables_jumps_max_netns sysctl
  2025-10-27 22:17 [PATCH nf,v2 0/2] nf_tables: limit maximum number of jumps/gotos per netns Pablo Neira Ayuso
  2025-10-27 22:17 ` [PATCH nf 1/2] netfilter: " Pablo Neira Ayuso
@ 2025-10-27 22:17 ` Pablo Neira Ayuso
  1 sibling, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2025-10-27 22:17 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw, ffmancera, brady.1345

This patch adds gen_ruleset_many_jumps.c, a program that generates a
random ruleset with many jumps and estimates the number of jumps that
results from its evaluation.

nft_ruleset_many_jumps.sh loads the generated ruleset and tests whether it
loads or fails as expected according to the estimated number of jumps.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
v2: new in this series, in gen_ruleset_many_jumps.c:
    - create_ruleset() represents jump rules per chain at different levels
      in an array.
    - count_jumps() provides an estimate of the number of jumps in the worst
      case scenario, similar to the DFS-based count that the kernel
      performs. This code can be generalised later on into a tool for tuning
      nf_tables_jumps_max_netns, if users ever really need one.
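
For the layered rulesets that this generator emits (chain y<i> holds
rules[i] jumps to chain y<i+1>), the worst-case count appears to reduce to
a sum of running products. The sketch below is an assumption-labelled
cross-check, not part of the patch: worst_case_jumps() is a hypothetical
helper that takes the same rules[]/depth layout as create_ruleset().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cross-check for layered rulesets: chain y<i> is entered
 * once per path leading to it, and each entry visits all of its rules[i]
 * jumps. The number of visits at level i is therefore the product
 * rules[0] * ... * rules[i], and the total is the sum of those products.
 */
unsigned long worst_case_jumps(const int *rules, int depth)
{
	unsigned long total = 0, visits = 1;
	int i;

	assert(rules != NULL && depth >= 1);
	for (i = 0; i < depth; i++) {
		visits *= (unsigned long)rules[i];
		total += visits;
	}

	return total;
}
```

For example, rules = { 1, 3, 2 } with depth 3 gives 1 + 3 + 6 = 10. With
MAX_LEVELS set to 10 and at most 4 jumps per chain, the largest possible
value stays well within the range of an unsigned long.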

 .../testing/selftests/net/netfilter/Makefile  |   2 +
 .../net/netfilter/gen_ruleset_many_jumps.c    | 145 ++++++++++++++++++
 .../net/netfilter/nft_ruleset_many_jumps.sh   | 118 ++++++++++++++
 3 files changed, 265 insertions(+)
 create mode 100644 tools/testing/selftests/net/netfilter/gen_ruleset_many_jumps.c
 create mode 100755 tools/testing/selftests/net/netfilter/nft_ruleset_many_jumps.sh

diff --git a/tools/testing/selftests/net/netfilter/Makefile b/tools/testing/selftests/net/netfilter/Makefile
index ee2d1a5254f8..dc1b328be31d 100644
--- a/tools/testing/selftests/net/netfilter/Makefile
+++ b/tools/testing/selftests/net/netfilter/Makefile
@@ -31,6 +31,7 @@ TEST_PROGS := \
 	nft_meta.sh \
 	nft_nat.sh \
 	nft_nat_zones.sh \
+	nft_ruleset_many_jumps.sh \
 	nft_queue.sh \
 	nft_synproxy.sh \
 	nft_tproxy_tcp.sh \
@@ -48,6 +49,7 @@ TEST_GEN_FILES = \
 	connect_close \
 	conntrack_dump_flush \
 	conntrack_reverse_clash \
+	gen_ruleset_many_jumps \
 	nf_queue \
 	sctp_collision \
 	udpclash \
diff --git a/tools/testing/selftests/net/netfilter/gen_ruleset_many_jumps.c b/tools/testing/selftests/net/netfilter/gen_ruleset_many_jumps.c
new file mode 100644
index 000000000000..ddc150131bc7
--- /dev/null
+++ b/tools/testing/selftests/net/netfilter/gen_ruleset_many_jumps.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include <stdio.h>
+#include <time.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/time.h>
+#include <errno.h>
+
+#define MAX_LEVELS 10
+
+static void create_ruleset(int *rules, int *depth)
+{
+	struct timeval tv;
+	int levels;
+	int i;
+
+	gettimeofday(&tv, NULL);
+	srandom(tv.tv_usec);
+
+	levels = (random() % (MAX_LEVELS - 1)) + 2;
+	rules[0] = 1;
+	for (i = 1; i < levels; i++)
+		rules[i] = (random() % 4) + 1;
+
+	*depth = levels;
+
+#if DEBUG_RULESET
+	for (i = 0; i < levels; i++)
+		printf("%d : %d\n", i, rules[i]);
+#endif
+}
+
+static void count_jumps(int *count, int *rules, int depth)
+{
+	int tmp[MAX_LEVELS] = {};
+	int i = 0;
+
+	while (1) {
+		if (tmp[i]++ < rules[i]) {
+			(*count)++;
+			if (i < depth - 1)
+				i++;
+		} else {
+			tmp[i] = 0;
+			if (--i <= 0)
+				break;
+		}
+	}
+}
+
+static int print_ruleset(int *rules, int depth, int jump_count, char *filename)
+{
+	int fd, i, j;
+	FILE *fp;
+
+	fd = mkstemp(filename);
+	if (fd < 0) {
+		fprintf(stderr, "failed to create temporary ruleset file: %s\n", strerror(errno));
+		return -1;
+	}
+
+	fp = fdopen(fd, "w+");
+	if (!fp) {
+		close(fd);
+		fprintf(stderr, "failed to create temporary ruleset file\n");
+		return -1;
+	}
+
+	fprintf(fp, "# jump_count %d\n", jump_count);
+	fprintf(fp, "table ip x {\n");
+	fprintf(fp, "\tchain y%u {\n", depth);
+	fprintf(fp, "\t}\n");
+
+	for (i = depth - 1; i >= 1; i--) {
+		fprintf(fp, "\tchain y%u {\n", i);
+		for (j = 0; j < rules[i]; j++)
+			fprintf(fp, "\t\tjump y%d\n", i+1);
+
+		fprintf(fp, "\t}\n");
+	}
+	fprintf(fp, "\tchain y0 {\n");
+	fprintf(fp, "\t\ttype filter hook input priority 0;\n");
+	fprintf(fp, "\t\tjump y1\n");
+	fprintf(fp, "\t}\n");
+	fprintf(fp, "}\n");
+
+	return 0;
+}
+
+enum {
+	RANDOM = 0,
+	FAIL,
+	OK,
+};
+
+int main(int argc, const char *argv[])
+{
+	unsigned int type = RANDOM, nf_tables_jumps_max_netns = 0;
+	int rules[MAX_LEVELS], depth, jump_count = 0;
+	char filename[] = "/tmp/rulesetXXXXXX";
+
+	if (argc == 3) {
+		if (!strcmp(argv[1], "ok"))
+			type = OK;
+		else if (!strcmp(argv[1], "fail"))
+			type = FAIL;
+
+		nf_tables_jumps_max_netns = atoi(argv[2]);
+	} else {
+		type = RANDOM;
+	}
+
+	switch (type) {
+	case RANDOM:
+		memset(rules, 0, sizeof(rules));
+		create_ruleset(rules, &depth);
+		count_jumps(&jump_count, rules, depth);
+		break;
+	case OK:
+		while (1) {
+			memset(rules, 0, sizeof(rules));
+			create_ruleset(rules, &depth);
+			count_jumps(&jump_count, rules, depth);
+			if (jump_count <= nf_tables_jumps_max_netns)
+				break;
+
+			jump_count = 0;
+		}
+		break;
+	case FAIL:
+		while (1) {
+			memset(rules, 0, sizeof(rules));
+			create_ruleset(rules, &depth);
+			count_jumps(&jump_count, rules, depth);
+			if (jump_count > nf_tables_jumps_max_netns)
+				break;
+
+			jump_count = 0;
+		}
+		break;
+	}
+	print_ruleset(rules, depth, jump_count, filename);
+	printf("%s\n", filename);
+}
diff --git a/tools/testing/selftests/net/netfilter/nft_ruleset_many_jumps.sh b/tools/testing/selftests/net/netfilter/nft_ruleset_many_jumps.sh
new file mode 100755
index 000000000000..c25bf0dbe054
--- /dev/null
+++ b/tools/testing/selftests/net/netfilter/nft_ruleset_many_jumps.sh
@@ -0,0 +1,118 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+SYSCTL_MAX_JUMPS=32
+DEFAULT_SYSCTL=65536
+
+rnd=$(mktemp -u XXXXXXXX)
+ns="nft-$rnd"
+
+cleanup() {
+        ip netns del $ns 2>/dev/null || true
+        rm -f $ruleset
+}
+
+trap cleanup EXIT
+
+set_max_jumps()
+{
+        local max_jumps=$1
+
+        sysctl -w net.netfilter.nf_tables_jumps_max_netns=$max_jumps >/dev/null 2>&1
+        new_value=$(sysctl -n net.netfilter.nf_tables_jumps_max_netns)
+}
+
+get_max_jumps()
+{
+        local init_net_value=$(sysctl -n net.netfilter.nf_tables_jumps_max_netns)
+        echo "$init_net_value"
+}
+
+load_ruleset()
+{
+	local ruleset=$1
+
+	jumps=$(head -1 $ruleset | cut -f3 -d' ')
+
+	ip netns exec $ns nft -f $ruleset &> /dev/null
+	if [ "$?" -eq 0 ];then
+		if [ $jumps -gt $SYSCTL_MAX_JUMPS ];then
+			echo "FAIL: $jumps > $SYSCTL_MAX_JUMPS but ruleset loads"
+			cat $ruleset > /tmp/ruleset.nft
+			exit 1
+		fi
+		echo "OK: good ruleset with $jumps jumps loads as expected"
+	else
+		if [ $jumps -lt $SYSCTL_MAX_JUMPS ];then
+			echo "FAIL: $jumps < $SYSCTL_MAX_JUMPS but ruleset does not load"
+			cat $ruleset > /tmp/ruleset.nft
+			exit 1
+		fi
+		echo "OK: bad ruleset with $jumps jumps fails as expected"
+	fi
+}
+
+load_ruleset_basic()
+{
+	ruleset=$(mktemp nft-tempXXXXXXXX.nft)
+	echo "table ip x {" > $ruleset
+	echo "	chain y0 {" >> $ruleset
+	echo "		type filter hook input priority 0;" >> $ruleset
+	echo "	}" >> $ruleset
+	echo "}" >> $ruleset
+
+	ip netns exec $ns nft -f $ruleset &> /dev/null
+	if [ "$?" -ne 0 ];then
+		echo "FAIL: cannot load basic ruleset"
+		exit 1
+	fi
+}
+
+flush_ruleset()
+{
+	local ruleset=$1
+
+	ip netns exec $ns nft flush ruleset
+	if [ "$?" -ne 0 ];then
+		echo "FAIL: cannot flush ruleset"
+		cat $ruleset > /tmp/ruleset.nft
+		exit 1
+	fi
+	rm -f $ruleset
+}
+
+pre_max_jumps=$(get_max_jumps)
+set_max_jumps $SYSCTL_MAX_JUMPS
+
+ip netns add $ns
+
+for ((i=0;i<10;i++))
+do
+	echo "=== iteration $i ==="
+	filename=$(./gen_ruleset_many_jumps)
+	load_ruleset $filename
+	flush_ruleset $filename
+done
+
+echo "Testing abort path with initial table w/o jumps"
+
+for ((i=0;i<10;i++))
+do
+	echo "=== iteration $i ==="
+	load_ruleset_basic
+	filename=$(./gen_ruleset_many_jumps fail $SYSCTL_MAX_JUMPS)
+	load_ruleset $filename
+	filename=$(./gen_ruleset_many_jumps ok $SYSCTL_MAX_JUMPS)
+	load_ruleset $filename
+	flush_ruleset $filename
+done
+
+set_max_jumps $pre_max_jumps
+post_max_jumps=$(get_max_jumps)
+
+if [ "$pre_max_jumps" -ne "$post_max_jumps" ];then
+	echo "FAIL: sysctl value was not restored: $post_max_jumps (expected $pre_max_jumps)"
+	exit 1
+fi
+
+exit 0
-- 
2.30.2



* Re: [PATCH nf 1/2] netfilter: nf_tables: limit maximum number of jumps/gotos per netns
  2025-10-27 22:17 ` [PATCH nf 1/2] netfilter: " Pablo Neira Ayuso
@ 2025-10-28 13:06   ` Florian Westphal
  2025-10-28 17:26     ` Pablo Neira Ayuso
  2025-10-28 14:32   ` kernel test robot
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Florian Westphal @ 2025-10-28 13:06 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, ffmancera, brady.1345

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Add a new sysctl:
> 
>    net.netfilter.nf_tables_jump_max_netns
> 
> which is 65535 by default, because iptables-nft rulesets are more likely
> to have more jumps/gotos compared to native nftables rulesets.

I have existing / real-world iptables dumps that exceed 64k :-/
I'll forward you one of them.

Seems this patch misses a reset somewhere to deal with
chains/tables being deleted:

sysctl net.netfilter.nf_tables_jumps_max_netns=256000
net.netfilter.nf_tables_jumps_max_netns = 256000
iptables-nft-restore < kubernetes-huge-may-2018.txt; echo $?
0
iptables-nft-restore < kubernetes-huge-may-2018.txt
iptables-nft-restore v1.8.11 (nf_tables):
line 52222: RULE_APPEND failed (Too many links): rule in chain KUBE-SVC-FCXG7AJXWMSO3TT5

works after 'nft flush ruleset'.

I also have a hunch that a followup patch that separates ip and ip6
families (since they are mutually exclusive) will be needed sooner rather
than later.

If even a random old iptables-dump exceeds the 64k limit I would expect
combined ip+ip6tables rulesets to be even more brittle.


* Re: [PATCH nf 1/2] netfilter: nf_tables: limit maximum number of jumps/gotos per netns
  2025-10-27 22:17 ` [PATCH nf 1/2] netfilter: " Pablo Neira Ayuso
  2025-10-28 13:06   ` Florian Westphal
@ 2025-10-28 14:32   ` kernel test robot
  2025-10-28 14:54   ` kernel test robot
  2025-10-29  4:49   ` kernel test robot
  3 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2025-10-28 14:32 UTC (permalink / raw)
  To: Pablo Neira Ayuso, netfilter-devel
  Cc: oe-kbuild-all, fw, ffmancera, brady.1345

Hi Pablo,

kernel test robot noticed the following build errors:

[auto build test ERROR on netfilter-nf/main]
[also build test ERROR on linus/master v6.18-rc3 next-20251028]
[cannot apply to nf-next/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pablo-Neira-Ayuso/netfilter-nf_tables-limit-maximum-number-of-jumps-gotos-per-netns/20251028-062221
base:   https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git main
patch link:    https://lore.kernel.org/r/20251027221722.183398-2-pablo%40netfilter.org
patch subject: [PATCH nf 1/2] netfilter: nf_tables: limit maximum number of jumps/gotos per netns
config: arm-aspeed_g5_defconfig (https://download.01.org/0day-ci/archive/20251028/202510282205.FvXf2zL5-lkp@intel.com/config)
compiler: arm-linux-gnueabi-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251028/202510282205.FvXf2zL5-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510282205.FvXf2zL5-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   net/netfilter/core.c: In function 'netfilter_init':
>> net/netfilter/core.c:832:1: warning: label 'err_nft_pernet' defined but not used [-Wunused-label]
     832 | err_nft_pernet:
         | ^~~~~~~~~~~~~~
--
>> net/netfilter/nf_tables_sysctl.c:15:47: error: 'struct netns_nf' has no member named 'nf_tables_jumps_max_netns'
      15 |                 .data           = &init_net.nf.nf_tables_jumps_max_netns,
         |                                               ^
   net/netfilter/nf_tables_sysctl.c:16:53: error: 'struct netns_nf' has no member named 'nf_tables_jumps_max_netns'
      16 |                 .maxlen         = sizeof(init_net.nf.nf_tables_jumps_max_netns),
         |                                                     ^
   net/netfilter/nf_tables_sysctl.c: In function 'nf_tables_sysctl_init':
   net/netfilter/nf_tables_sysctl.c:33:24: error: 'struct netns_nf' has no member named 'nf_tables_jumps_max_netns'
      33 |                 net->nf.nf_tables_jumps_max_netns = NFT_TABLE_DEFAULT_JUMPS_MAX;
         |                        ^
   net/netfilter/nf_tables_sysctl.c:40:24: error: 'struct netns_nf' has no member named 'nf_tables_jumps_max_netns'
      40 |                 net->nf.nf_tables_jumps_max_netns =
         |                        ^
   net/netfilter/nf_tables_sysctl.c:41:36: error: 'struct netns_nf' has no member named 'nf_tables_jumps_max_netns'
      41 |                         init_net.nf.nf_tables_jumps_max_netns;
         |                                    ^
   net/netfilter/nf_tables_sysctl.c:43:33: error: 'struct netns_nf' has no member named 'nf_tables_jumps_max_netns'
      43 |                         &net->nf.nf_tables_jumps_max_netns;
         |                                 ^
>> net/netfilter/nf_tables_sysctl.c:49:17: error: 'struct netns_nf' has no member named 'nf_tables_dir_header'; did you mean 'nf_log_dir_header'?
      49 |         net->nf.nf_tables_dir_header =
         |                 ^~~~~~~~~~~~~~~~~~~~
         |                 nf_log_dir_header
   net/netfilter/nf_tables_sysctl.c:52:22: error: 'struct netns_nf' has no member named 'nf_tables_dir_header'; did you mean 'nf_log_dir_header'?
      52 |         if (!net->nf.nf_tables_dir_header)
         |                      ^~~~~~~~~~~~~~~~~~~~
         |                      nf_log_dir_header
   net/netfilter/nf_tables_sysctl.c: In function 'nf_tables_sysctl_exit':
   net/netfilter/nf_tables_sysctl.c:68:45: error: 'struct netns_nf' has no member named 'nf_tables_dir_header'; did you mean 'nf_log_dir_header'?
      68 |         unregister_net_sysctl_table(net->nf.nf_tables_dir_header);
         |                                             ^~~~~~~~~~~~~~~~~~~~
         |                                             nf_log_dir_header
   net/netfilter/nf_tables_sysctl.c:69:25: error: 'struct netns_nf' has no member named 'nf_tables_dir_header'; did you mean 'nf_log_dir_header'?
      69 |         table = net->nf.nf_tables_dir_header->ctl_table_arg;
         |                         ^~~~~~~~~~~~~~~~~~~~
         |                         nf_log_dir_header


vim +15 net/netfilter/nf_tables_sysctl.c

    11	
    12	static struct ctl_table nf_tables_sysctl_table[] = {
    13		[NF_SYSCTL_NFT_JUMPS_MAX] = {
    14			.procname       = "nf_tables_jumps_max_netns",
  > 15			.data           = &init_net.nf.nf_tables_jumps_max_netns,
    16			.maxlen         = sizeof(init_net.nf.nf_tables_jumps_max_netns),
    17			.mode           = 0644,
    18			.proc_handler   = proc_dointvec,
    19			.extra1		= SYSCTL_ONE,
    20			.extra2		= SYSCTL_INT_MAX,
    21		},
    22	};
    23	
    24	#define NFT_TABLE_DEFAULT_JUMPS_MAX 65535
    25	
    26	static int __net_init nf_tables_sysctl_init(struct net *net)
    27	{
    28		struct ctl_table *table = nf_tables_sysctl_table;
    29	
    30		BUILD_BUG_ON(ARRAY_SIZE(nf_tables_sysctl_table) != NF_SYSCTL_NFT_LAST_SYSCTL);
    31	
    32		if (net_eq(net, &init_net)) {
    33			net->nf.nf_tables_jumps_max_netns = NFT_TABLE_DEFAULT_JUMPS_MAX;
    34		} else {
    35			table = kmemdup(nf_tables_sysctl_table,
    36					sizeof(nf_tables_sysctl_table), GFP_KERNEL);
    37			if (!table)
    38				return -ENOMEM;
    39	
    40			net->nf.nf_tables_jumps_max_netns =
    41				init_net.nf.nf_tables_jumps_max_netns;
    42			table[NF_SYSCTL_NFT_JUMPS_MAX].data =
    43				&net->nf.nf_tables_jumps_max_netns;
    44	
    45			if (net->user_ns != &init_user_ns)
    46				table[NF_SYSCTL_NFT_JUMPS_MAX].mode &= ~0222;
    47		}
    48	
  > 49		net->nf.nf_tables_dir_header =
    50			register_net_sysctl_sz(net, "net/netfilter", table,
    51					       ARRAY_SIZE(nf_tables_sysctl_table));
    52		if (!net->nf.nf_tables_dir_header)
    53			goto err_tbl_free;
    54	
    55		return 0;
    56	
    57	err_tbl_free:
    58		if (table != nf_tables_sysctl_table)
    59			kfree(table);
    60	
    61		return -ENOMEM;
    62	}
    63	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH nf 1/2] netfilter: nf_tables: limit maximum number of jumps/gotos per netns
  2025-10-27 22:17 ` [PATCH nf 1/2] netfilter: " Pablo Neira Ayuso
  2025-10-28 13:06   ` Florian Westphal
  2025-10-28 14:32   ` kernel test robot
@ 2025-10-28 14:54   ` kernel test robot
  2025-10-29  4:49   ` kernel test robot
  3 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2025-10-28 14:54 UTC (permalink / raw)
  To: Pablo Neira Ayuso, netfilter-devel
  Cc: llvm, oe-kbuild-all, fw, ffmancera, brady.1345

Hi Pablo,

kernel test robot noticed the following build errors:

[auto build test ERROR on netfilter-nf/main]
[also build test ERROR on linus/master v6.18-rc3 next-20251028]
[cannot apply to nf-next/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pablo-Neira-Ayuso/netfilter-nf_tables-limit-maximum-number-of-jumps-gotos-per-netns/20251028-062221
base:   https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git main
patch link:    https://lore.kernel.org/r/20251027221722.183398-2-pablo%40netfilter.org
patch subject: [PATCH nf 1/2] netfilter: nf_tables: limit maximum number of jumps/gotos per netns
config: i386-defconfig (https://download.01.org/0day-ci/archive/20251028/202510282243.y0oDegX0-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251028/202510282243.y0oDegX0-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510282243.y0oDegX0-lkp@intel.com/

All errors (new ones prefixed by >>):

>> net/netfilter/nf_tables_sysctl.c:15:34: error: no member named 'nf_tables_jumps_max_netns' in 'struct netns_nf'
      15 |                 .data           = &init_net.nf.nf_tables_jumps_max_netns,
         |                                    ~~~~~~~~~~~ ^
   net/netfilter/nf_tables_sysctl.c:16:40: error: no member named 'nf_tables_jumps_max_netns' in 'struct netns_nf'
      16 |                 .maxlen         = sizeof(init_net.nf.nf_tables_jumps_max_netns),
         |                                          ~~~~~~~~~~~ ^
   net/netfilter/nf_tables_sysctl.c:33:11: error: no member named 'nf_tables_jumps_max_netns' in 'struct netns_nf'
      33 |                 net->nf.nf_tables_jumps_max_netns = NFT_TABLE_DEFAULT_JUMPS_MAX;
         |                 ~~~~~~~ ^
   net/netfilter/nf_tables_sysctl.c:40:11: error: no member named 'nf_tables_jumps_max_netns' in 'struct netns_nf'
      40 |                 net->nf.nf_tables_jumps_max_netns =
         |                 ~~~~~~~ ^
   net/netfilter/nf_tables_sysctl.c:41:16: error: no member named 'nf_tables_jumps_max_netns' in 'struct netns_nf'
      41 |                         init_net.nf.nf_tables_jumps_max_netns;
         |                         ~~~~~~~~~~~ ^
   net/netfilter/nf_tables_sysctl.c:43:13: error: no member named 'nf_tables_jumps_max_netns' in 'struct netns_nf'
      43 |                         &net->nf.nf_tables_jumps_max_netns;
         |                          ~~~~~~~ ^
>> net/netfilter/nf_tables_sysctl.c:49:10: error: no member named 'nf_tables_dir_header' in 'struct netns_nf'; did you mean 'nf_log_dir_header'?
      49 |         net->nf.nf_tables_dir_header =
         |                 ^~~~~~~~~~~~~~~~~~~~
         |                 nf_log_dir_header
   include/net/netns/netfilter.h:17:27: note: 'nf_log_dir_header' declared here
      17 |         struct ctl_table_header *nf_log_dir_header;
         |                                  ^
   net/netfilter/nf_tables_sysctl.c:52:15: error: no member named 'nf_tables_dir_header' in 'struct netns_nf'; did you mean 'nf_log_dir_header'?
      52 |         if (!net->nf.nf_tables_dir_header)
         |                      ^~~~~~~~~~~~~~~~~~~~
         |                      nf_log_dir_header
   include/net/netns/netfilter.h:17:27: note: 'nf_log_dir_header' declared here
      17 |         struct ctl_table_header *nf_log_dir_header;
         |                                  ^
   net/netfilter/nf_tables_sysctl.c:68:38: error: no member named 'nf_tables_dir_header' in 'struct netns_nf'; did you mean 'nf_log_dir_header'?
      68 |         unregister_net_sysctl_table(net->nf.nf_tables_dir_header);
         |                                             ^~~~~~~~~~~~~~~~~~~~
         |                                             nf_log_dir_header
   include/net/netns/netfilter.h:17:27: note: 'nf_log_dir_header' declared here
      17 |         struct ctl_table_header *nf_log_dir_header;
         |                                  ^
   net/netfilter/nf_tables_sysctl.c:69:18: error: no member named 'nf_tables_dir_header' in 'struct netns_nf'; did you mean 'nf_log_dir_header'?
      69 |         table = net->nf.nf_tables_dir_header->ctl_table_arg;
         |                         ^~~~~~~~~~~~~~~~~~~~
         |                         nf_log_dir_header
   include/net/netns/netfilter.h:17:27: note: 'nf_log_dir_header' declared here
      17 |         struct ctl_table_header *nf_log_dir_header;
         |                                  ^
   10 errors generated.


vim +15 net/netfilter/nf_tables_sysctl.c

    11	
    12	static struct ctl_table nf_tables_sysctl_table[] = {
    13		[NF_SYSCTL_NFT_JUMPS_MAX] = {
    14			.procname       = "nf_tables_jumps_max_netns",
  > 15			.data           = &init_net.nf.nf_tables_jumps_max_netns,
    16			.maxlen         = sizeof(init_net.nf.nf_tables_jumps_max_netns),
    17			.mode           = 0644,
    18			.proc_handler   = proc_dointvec,
    19			.extra1		= SYSCTL_ONE,
    20			.extra2		= SYSCTL_INT_MAX,
    21		},
    22	};
    23	
    24	#define NFT_TABLE_DEFAULT_JUMPS_MAX 65535
    25	
    26	static int __net_init nf_tables_sysctl_init(struct net *net)
    27	{
    28		struct ctl_table *table = nf_tables_sysctl_table;
    29	
    30		BUILD_BUG_ON(ARRAY_SIZE(nf_tables_sysctl_table) != NF_SYSCTL_NFT_LAST_SYSCTL);
    31	
    32		if (net_eq(net, &init_net)) {
    33			net->nf.nf_tables_jumps_max_netns = NFT_TABLE_DEFAULT_JUMPS_MAX;
    34		} else {
    35			table = kmemdup(nf_tables_sysctl_table,
    36					sizeof(nf_tables_sysctl_table), GFP_KERNEL);
    37			if (!table)
    38				return -ENOMEM;
    39	
    40			net->nf.nf_tables_jumps_max_netns =
    41				init_net.nf.nf_tables_jumps_max_netns;
    42			table[NF_SYSCTL_NFT_JUMPS_MAX].data =
    43				&net->nf.nf_tables_jumps_max_netns;
    44	
    45			if (net->user_ns != &init_user_ns)
    46				table[NF_SYSCTL_NFT_JUMPS_MAX].mode &= ~0222;
    47		}
    48	
  > 49		net->nf.nf_tables_dir_header =
    50			register_net_sysctl_sz(net, "net/netfilter", table,
    51					       ARRAY_SIZE(nf_tables_sysctl_table));
    52		if (!net->nf.nf_tables_dir_header)
    53			goto err_tbl_free;
    54	
    55		return 0;
    56	
    57	err_tbl_free:
    58		if (table != nf_tables_sysctl_table)
    59			kfree(table);
    60	
    61		return -ENOMEM;
    62	}
    63	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH nf 1/2] netfilter: nf_tables: limit maximum number of jumps/gotos per netns
  2025-10-28 13:06   ` Florian Westphal
@ 2025-10-28 17:26     ` Pablo Neira Ayuso
  2025-10-28 17:36       ` Florian Westphal
  0 siblings, 1 reply; 9+ messages in thread
From: Pablo Neira Ayuso @ 2025-10-28 17:26 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, ffmancera, brady.1345

On Tue, Oct 28, 2025 at 02:06:38PM +0100, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > Add a new sysctl:
> > 
> >    net.netfilter.nf_tables_jumps_max_netns
> > 
> > which is 65535 by default, because iptables-nft rulesets are more likely
> > to have more jumps/gotos compared to native nftables rulesets.

iptables needs a higher value, while nftables with vmaps could set a
much lower value, because a vmap counts as a single immediate jump.

> I have existing / real-world iptables dumps that exceed 64k :-/

OK, so k8s can load this ruleset inside userns (because netns can
still raise this value). But your concern is the default value, right?

I can extract a good default value from that iptables k8s ruleset.

> I'll forward you one of them.

Yes, you passed me this huge ruleset.

> Seems this patch misses a reset somewhere to deal with
> chains/tables being deleted:
> 
> sysctl net.netfilter.nf_tables_jumps_max_netns=256000
> net.netfilter.nf_tables_jumps_max_netns = 256000
> iptables-nft-restore < kubernetes-huge-may-2018.txt; echo $?
> 0
> iptables-nft-restore < kubernetes-huge-may-2018.txt
> iptables-nft-restore v1.8.11 (nf_tables):
> line 52222: RULE_APPEND failed (Too many links): rule in chain KUBE-SVC-FCXG7AJXWMSO3TT5

Ah I see, let me have a look, it is missing deleted tables, yes.
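
To illustrate the failure mode reported above, here is a minimal user-space
sketch, not the kernel code: if jumps are charged when rules are added but
never given back when tables are deleted, a second load of the same ruleset
runs out of budget. All names (struct jump_budget, jump_charge(),
jump_uncharge(), reload_twice()) are hypothetical, and the jump count used in
the usage note is made up.

```c
#include <errno.h>

/* Toy stand-in for the per-netns accounting state (hypothetical). */
struct jump_budget {
	unsigned int used;	/* jumps currently charged */
	unsigned int max;	/* limit, e.g. the sysctl value */
};

/* Charge n jumps; fail with -EMLINK ("Too many links") if over budget. */
static int jump_charge(struct jump_budget *b, unsigned int n)
{
	if (n > b->max - b->used)
		return -EMLINK;
	b->used += n;
	return 0;
}

/* Give n jumps back when the table or chain that owned them is deleted. */
static void jump_uncharge(struct jump_budget *b, unsigned int n)
{
	b->used -= (n > b->used) ? b->used : n;
}

/*
 * Load the same ruleset twice, as a repeated restore would.
 * Returns the result of the second load.
 */
static int reload_twice(unsigned int max, unsigned int jumps, int uncharge)
{
	struct jump_budget b = { .used = 0, .max = max };
	int err = jump_charge(&b, jumps);

	if (err)
		return err;
	if (uncharge)		/* old tables deleted: release their jumps */
		jump_uncharge(&b, jumps);
	return jump_charge(&b, jumps);
}
```

With a limit of 256000 and a made-up ruleset of 200000 jumps,
reload_twice(256000, 200000, 0) returns -EMLINK, while
reload_twice(256000, 200000, 1) returns 0.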

> works after 'nft flush ruleset'.
> 
> I also have a hunch that a followup patch that separates ip and ip6
> families (since they are mutually exclusive) will be needed sooner than
> later.

I can take a look, you mean:

- IPv4 count => count jumps in all tables except ipv6.
- IPv6 count => count jumps in all tables except ipv4.
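
A rough user-space sketch of the two counts listed above, under the
assumption that each table simply carries a family tag and a jump total. The
enum values, struct table_jumps, count_except() and fits_limit() are
hypothetical names for illustration; the kernel identifies families with
NFPROTO_* values and this is not how the patch is implemented.

```c
#include <stddef.h>

/* Hypothetical family tags; the kernel uses NFPROTO_* values instead. */
enum fam { FAM_IP, FAM_IP6, FAM_INET };

struct table_jumps {
	enum fam family;
	unsigned int jumps;
};

/* Sum the jumps of every table except those of the excluded family. */
static unsigned int count_except(const struct table_jumps *t, size_t n,
				 enum fam excluded)
{
	unsigned int total = 0;

	for (size_t i = 0; i < n; i++)
		if (t[i].family != excluded)
			total += t[i].jumps;
	return total;
}

/* The ruleset fits if neither the IPv4 nor the IPv6 view exceeds max. */
static int fits_limit(const struct table_jumps *t, size_t n, unsigned int max)
{
	return count_except(t, n, FAM_IP6) <= max &&
	       count_except(t, n, FAM_IP) <= max;
}
```

For example, with made-up tables of 40000 ip jumps, 40000 ip6 jumps and 10000
inet jumps, each view sees 50000 jumps and fits under 65535, whereas a single
combined counter would see 90000.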

Here, IIRC, I needed ~8 million jumps (_not_ net jump counts in the
rule, I mean number of jumps according to nft_jump_count_check()) in
the input chain to see softlockup with KASAN+KMEMLEAK.

256k is still far from 8 million.

> If even a random old iptables-dump exceeds the 64k limit I would expect
> combined ip+ip6tables rulesets to be even more brittle.

Yes, the problem with setting this default value is iptables:
nftables can set it very small, but iptables needs a large one.

I guess native nftables users can safely shrink the default value we
are going to set here.


* Re: [PATCH nf 1/2] netfilter: nf_tables: limit maximum number of jumps/gotos per netns
  2025-10-28 17:26     ` Pablo Neira Ayuso
@ 2025-10-28 17:36       ` Florian Westphal
  0 siblings, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2025-10-28 17:36 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, ffmancera, brady.1345

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> iptables needs a higher value, while nftables with vmaps could set a
> much lower value, because a vmap counts as a single immediate jump.

Agreed.  nftables can use something like 100 :-)

> > I have existing / real-world iptables dumps that exceed 64k :-/
> 
> OK, so k8s can load this ruleset inside userns (because netns can
> still raise this value). But your concern is the default value, right?

Yes, exactly.  If you have a system that runs iptables-restore on
startup, then that might fail after a kernel update.

I see that init_net is exempted from any limits and that's a good choice.
I'm concerned about containers here.

> I can take a look, you mean:
> 
> - IPv4 count => count jumps in all tables except ipv6.
> - IPv6 count => count jumps in all tables except ipv4.

Yes, exactly.  That should remove a bit of pressure to
use a super-large default value.

> Here, IIRC, I needed ~8 million jumps (_not_ net jump counts in the
> rule, I mean number of jumps according to nft_jump_count_check()) in
> the input chain to see softlockup with KASAN+KMEMLEAK.
> 
> 256k is still far from 8 million.

Agreed.

> > If even a random old iptables-dump exceeds the 64k limit I would expect
> > combined ip+ip6tables rulesets to be even more brittle.
> 
> Yes, the problem with setting this default value is iptables:
> nftables can set it very small, but iptables needs a large one.

Right.

> I guess native nftables users can safely shrink the default value we
> are going to set here.

Yes, absolutely.  nft --optimize could eventually yell at users when
it sees too many jumps :)


* Re: [PATCH nf 1/2] netfilter: nf_tables: limit maximum number of jumps/gotos per netns
  2025-10-27 22:17 ` [PATCH nf 1/2] netfilter: " Pablo Neira Ayuso
                     ` (2 preceding siblings ...)
  2025-10-28 14:54   ` kernel test robot
@ 2025-10-29  4:49   ` kernel test robot
  3 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2025-10-29  4:49 UTC (permalink / raw)
  To: Pablo Neira Ayuso, netfilter-devel
  Cc: llvm, oe-kbuild-all, fw, ffmancera, brady.1345

Hi Pablo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on netfilter-nf/main]
[also build test WARNING on linus/master v6.18-rc3 next-20251029]
[cannot apply to nf-next/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pablo-Neira-Ayuso/netfilter-nf_tables-limit-maximum-number-of-jumps-gotos-per-netns/20251028-062221
base:   https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git main
patch link:    https://lore.kernel.org/r/20251027221722.183398-2-pablo%40netfilter.org
patch subject: [PATCH nf 1/2] netfilter: nf_tables: limit maximum number of jumps/gotos per netns
config: arm-randconfig-003-20251028 (https://download.01.org/0day-ci/archive/20251029/202510291201.P7nkKt1R-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project d1c086e82af239b245fe8d7832f2753436634990)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251029/202510291201.P7nkKt1R-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510291201.P7nkKt1R-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from net/netfilter/core.c:27:
   In file included from include/net/netfilter/nf_tables.h:13:
   In file included from include/net/netfilter/nf_flow_table.h:13:
   In file included from include/linux/if_pppox.h:17:
   include/uapi/linux/if_pppox.h:71:4: warning: field sa_addr within 'struct sockaddr_pppox' is less aligned than 'union (unnamed union at include/uapi/linux/if_pppox.h:68:2)' and is usually due to 'struct sockaddr_pppox' being packed, which can lead to unaligned accesses [-Wunaligned-access]
      71 |         } sa_addr;
         |           ^
>> net/netfilter/core.c:832:1: warning: unused label 'err_nft_pernet' [-Wunused-label]
     832 | err_nft_pernet:
         | ^~~~~~~~~~~~~~~
   2 warnings generated.


vim +/err_nft_pernet +832 net/netfilter/core.c

   805	
   806	int __init netfilter_init(void)
   807	{
   808		int ret;
   809	
   810		ret = register_pernet_subsys(&netfilter_net_ops);
   811		if (ret < 0)
   812			goto err;
   813	
   814	#ifdef CONFIG_LWTUNNEL
   815		ret = netfilter_lwtunnel_init();
   816		if (ret < 0)
   817			goto err_lwtunnel_pernet;
   818	#endif
   819	#if IS_ENABLED(CONFIG_NF_TABLES)
   820		ret = netfilter_nf_tables_sysctl_init();
   821		if (ret < 0)
   822			goto err_nft_pernet;
   823	#endif
   824		ret = netfilter_log_init();
   825		if (ret < 0)
   826			goto err_log_pernet;
   827	
   828		return 0;
   829	
   830	err_log_pernet:
   831		netfilter_nf_tables_sysctl_fini();
 > 832	err_nft_pernet:
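
One common way to avoid this kind of -Wunused-label warning is to guard the
label with the same condition as the goto that targets it. The following is a
stand-alone user-space sketch of that pattern, not a proposed fix for
net/netfilter/core.c: HAVE_OPTIONAL_STEP stands in for a Kconfig option, and
step_a(), step_opt(), step_b() and init_all() are hypothetical names.

```c
/* Stand-in for a Kconfig option; define it to build the optional step in. */
/* #define HAVE_OPTIONAL_STEP 1 */

static int step_a(void) { return 0; }
static void undo_a(void) { }
#ifdef HAVE_OPTIONAL_STEP
static int step_opt(void) { return 0; }
static void undo_opt(void) { }
#endif
static int step_b(int fail) { return fail ? -1 : 0; }

static int init_all(int fail_b)
{
	int ret;

	ret = step_a();
	if (ret < 0)
		goto err;
#ifdef HAVE_OPTIONAL_STEP
	ret = step_opt();
	if (ret < 0)
		goto err_opt;
#endif
	ret = step_b(fail_b);
	if (ret < 0)
		goto err_b;

	return 0;

err_b:
#ifdef HAVE_OPTIONAL_STEP
	undo_opt();
err_opt:	/* only defined when something can jump to it */
#endif
	undo_a();
err:
	return ret;
}
```

Built with or without HAVE_OPTIONAL_STEP, every label that exists has a
matching goto, so the unwinding order stays the same and no label is left
unused.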

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
