* [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate
@ 2025-05-20 3:08 Shaun Brady
2025-05-20 3:08 ` [PATCH v5 2/2] Add test for nft_max_table_jumps_netns sysctl Shaun Brady
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Shaun Brady @ 2025-05-20 3:08 UTC (permalink / raw)
To: netfilter-devel; +Cc: ppwaskie, fw, pablo
Observing https://bugzilla.netfilter.org/show_bug.cgi?id=1665, I was
able to reproduce the bug using linux-stable. Summarized, when adding
large/repeated jump chains, while still staying under the
NFT_JUMP_STACK_SIZE (currently 16), the kernel soons locks up.
From the bug report:
table='loop-test'
nft add table inet $table
nft add chain inet $table test0 '{type filter hook input priority 0; policy accept;}'
for((i=1;i<16;i++));do nft add chain inet $table test$i;done
nft add rule inet $table test0 jump test1
for((i=1;i<15;i++));do nft add rule inet $table test$i jump test$((i+1));done
nft add rule inet $table test15 tcp dport 8080 drop
After the jump rule is added for 3 to 5 times, the system freezes and even softlockup occurs.
for((i=1;i<15;i++));do nft add rule inet $table test$i jump test$((i+1));done
for((i=1;i<15;i++));do nft add rule inet $table test$i jump test$((i+1));done
for((i=1;i<15;i++));do nft add rule inet $table test$i jump test$((i+1));done
This patch is a different approach than the original proposed approach
found in the bug report. Additionally, the limit, namespace specific,
is only applied to non-init-net namespaces, with the common use case
being to protect against rogue containers.
Add a new counter, total_jump_counter, to nft_ctx. On every call to
nft_table_validate() (rule addition time, versus packet inspection time)
start the counter at the current sum of all jump counts in all other
tables, sans 4 vs 6 differences.
Increment said counter for every jump encountered during table
validation. If the counter ever exceeds the namespaces jump limit
*during validation*, gracefully reject the rule with -EMLINK (the same
behavior as exceeding NFT_JUMP_STACK_SIZE).
This allows immediate feedback to the user about a bad chain, versus the
original idea (from the bug report) of allowing the addition to the
table. It keeps the in memory ruleset consistent, versus catching the
failure during packet inspection at some unknown point in the future and
arbitrarily denying the packet. Most importantly, it removes the
original problem of a kernel crash.
The compile time limit NFT_DEFAULT_MAX_TABLE_JUMPS of 65536 was chosen
to account for any normal use case, and when this value (and associated
stressing loop table) was tested against a 1CPU/256MB machine, the
system remained functional.
A sysctl entry net/netfilter/nf_max_table_jumps_netns for the limit was
also added for any use cases that may exceed this limit. It is network
namespace specific. As it is a control limit, for namespaces with an
owner that is non-init_user_ns, this sysctl is read only.
Example output of nft when patch is applied (and count is reached):
Error: Could not process rule: Too many links
add rule inet loop-test test14 jump test15
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
v2: nf_max_table_jumps renamed to nf_max_table_jumps_netns;
Limit namespace specific, only applies to non-init netns;
Limit raised to 16k
v3: Fix improper conditional compile; removed unneeded pernet subsys;
Include NFPROTO_IPV4|6 for _INET matches; add selftest script
v4: Simplify table jump count logic in for netns
v5: Move jump family test to helper function; move test to second commit
Signed-off-by: Shaun Brady <brady.1345@gmail.com>
---
Documentation/networking/netfilter-sysctl.rst | 9 ++
include/net/netfilter/nf_tables.h | 4 +
include/net/netns/netfilter.h | 2 +
net/netfilter/nf_tables_api.c | 125 +++++++++++++++++-
net/netfilter/nft_immediate.c | 1 +
5 files changed, 139 insertions(+), 2 deletions(-)
diff --git a/Documentation/networking/netfilter-sysctl.rst b/Documentation/networking/netfilter-sysctl.rst
index beb6d7b275d4..8a87c72f0672 100644
--- a/Documentation/networking/netfilter-sysctl.rst
+++ b/Documentation/networking/netfilter-sysctl.rst
@@ -15,3 +15,12 @@ nf_log_all_netns - BOOLEAN
with LOG target; this aims to prevent containers from flooding host
kernel log. If enabled, this target also works in other network
namespaces. This variable is only accessible from init_net.
+
+nf_max_table_jumps_netns - INTEGER (count)
+ default 65536
+
+ The maximum numbers of jumps a netns can have across its tables.
+ This only applies to non-init_net namespaces, and is read only for
+ non-init_user_ns namespaces. Meeting or exceeding this value will
+ cause additional rules to not be added, with EMLINK being return to
+ the user.
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 803d5f1601f9..61b8ae421e4f 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -205,6 +205,7 @@ static inline void nft_data_copy(u32 *dst, const struct nft_data *src,
* @nla: netlink attributes
* @portid: netlink portID of the original message
* @seq: netlink sequence number
+ * @total_jump_count: Found jumps for netns
* @flags: modifiers to new request
* @family: protocol family
* @level: depth of the chains
@@ -218,6 +219,7 @@ struct nft_ctx {
const struct nlattr * const *nla;
u32 portid;
u32 seq;
+ u32 total_jump_count;
u16 flags;
u8 family;
u8 level;
@@ -1275,6 +1277,7 @@ static inline void nft_use_inc_restore(u32 *use)
* @hgenerator: handle generator state
* @handle: table handle
* @use: number of chain references to this table
+ * @total_jump_count: jumps as per last validate
* @family:address family
* @flags: table flag (see enum nft_table_flags)
* @genmask: generation mask
@@ -1294,6 +1297,7 @@ struct nft_table {
u64 hgenerator;
u64 handle;
u32 use;
+ u32 total_jump_count;
u16 family:6,
flags:8,
genmask:2;
diff --git a/include/net/netns/netfilter.h b/include/net/netns/netfilter.h
index a6a0bf4a247e..c0dcd85d108c 100644
--- a/include/net/netns/netfilter.h
+++ b/include/net/netns/netfilter.h
@@ -15,6 +15,7 @@ struct netns_nf {
const struct nf_logger __rcu *nf_loggers[NFPROTO_NUMPROTO];
#ifdef CONFIG_SYSCTL
struct ctl_table_header *nf_log_dir_header;
+ struct ctl_table_header *nf_limit_control_dir_header;
#ifdef CONFIG_LWTUNNEL
struct ctl_table_header *nf_lwtnl_dir_header;
#endif
@@ -33,5 +34,6 @@ struct netns_nf {
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
unsigned int defrag_ipv6_users;
#endif
+ u32 nf_max_table_jumps_netns __maybe_unused;
};
#endif
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index b28f6730e26d..78dcd0381ed7 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -16,6 +16,7 @@
#include <linux/netfilter.h>
#include <linux/netfilter/nfnetlink.h>
#include <linux/netfilter/nf_tables.h>
+#include <linux/sysctl.h>
#include <net/netfilter/nf_flow_table.h>
#include <net/netfilter/nf_tables_core.h>
#include <net/netfilter/nf_tables.h>
@@ -25,6 +26,7 @@
#define NFT_MODULE_AUTOLOAD_LIMIT (MODULE_NAME_LEN - sizeof("nft-expr-255-"))
#define NFT_SET_MAX_ANONLEN 16
+#define NFT_DEFAULT_MAX_TABLE_JUMPS 65536
/* limit compaction to avoid huge kmalloc/krealloc sizes. */
#define NFT_MAX_SET_NELEMS ((2048 - sizeof(struct nft_trans_elem)) / sizeof(struct nft_trans_one_elem))
@@ -4011,8 +4013,16 @@ int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain)
struct nft_expr *expr, *last;
struct nft_rule *rule;
int err;
+ u32 jump_check = NFT_DEFAULT_MAX_TABLE_JUMPS;
- if (ctx->level == NFT_JUMP_STACK_SIZE)
+ if (IS_ENABLED(CONFIG_SYSCTL)) {
+ if (!net_eq(ctx->net, &init_net))
+ jump_check = ctx->net->nf.nf_max_table_jumps_netns;
+ }
+
+ if (ctx->level == NFT_JUMP_STACK_SIZE ||
+ (!net_eq(ctx->net, &init_net) &&
+ ctx->total_jump_count >= jump_check))
return -EMLINK;
list_for_each_entry(rule, &chain->rules, list) {
@@ -4039,14 +4049,62 @@ int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain)
}
EXPORT_SYMBOL_GPL(nft_chain_validate);
-static int nft_table_validate(struct net *net, const struct nft_table *table)
+/** nft_families_inc_jump - Determine if tables should add to the total jump
+ * count for a netns.
+ *
+ * @table: table of interest
+ * @sibling_table: a 'sibling' table to compare against
+ *
+ * Compare attributes of the tables to determine if the sibling tables
+ * total_jump_count should be added to the working context (done by caller).
+ * Mostly concerned with family compatibilities, but also identifying equality
+ * (a tables own addition will be recalculated soon).
+ *
+ * Ex: v4 tables do not apply to v6 packets
+ */
+static bool nft_families_inc_jump(struct nft_table *table, struct nft_table *sibling_table)
+{
+ /* Invert parameters to on require one test for two cases (the reverse) */
+ if (sibling_table->family > table->family) /* include/uapi/linux/netfilter.h */
+ return nft_families_inc_jump(sibling_table, table);
+
+ /* We found ourselves; don't add current jump count (will be counted dynamically) */
+ if (sibling_table == table)
+ return false;
+
+ switch (table->family) {
+ case NFPROTO_IPV4:
+ if (sibling_table->family == NFPROTO_IPV6)
+ return false;
+ break;
+ }
+
+ return true;
+}
+
+static int nft_table_validate(struct net *net, struct nft_table *table)
{
struct nft_chain *chain;
+ struct nftables_pernet *nft_net;
struct nft_ctx ctx = {
.net = net,
.family = table->family,
+ .total_jump_count = 0,
};
int err;
+ u32 total_jumps_before_validate;
+ struct nft_table *sibling_table;
+
+ nft_net = nft_pernet(net);
+
+ if (!net_eq(net, &init_net)) {
+ list_for_each_entry(sibling_table, &nft_net->tables, list) {
+ if(nft_families_inc_jump(table, sibling_table))
+ ctx.total_jump_count += sibling_table->total_jump_count;
+ }
+ }
+
+ total_jumps_before_validate = ctx.total_jump_count;
list_for_each_entry(chain, &table->chains, list) {
if (!nft_is_base_chain(chain))
@@ -4060,6 +4118,8 @@ static int nft_table_validate(struct net *net, const struct nft_table *table)
cond_resched();
}
+ table->total_jump_count = ctx.total_jump_count - total_jumps_before_validate;
+
return 0;
}
@@ -4084,6 +4144,7 @@ int nft_setelem_validate(const struct nft_ctx *ctx, struct nft_set *set,
case NFT_JUMP:
case NFT_GOTO:
pctx->level++;
+ pctx->total_jump_count++;
err = nft_chain_validate(ctx, data->verdict.chain);
if (err < 0)
return err;
@@ -11916,6 +11977,59 @@ static struct notifier_block nft_nl_notifier = {
.notifier_call = nft_rcv_nl_event,
};
+#ifdef CONFIG_SYSCTL
+static struct ctl_table nf_limit_control_sysctl_table[] = {
+ {
+ .procname = "nf_max_table_jumps_netns",
+ .data = &init_net.nf.nf_max_table_jumps_netns,
+ .maxlen = sizeof(init_net.nf.nf_max_table_jumps_netns),
+ .mode = 0644,
+ .proc_handler = proc_dointvec,
+ },
+};
+
+static int netfilter_limit_control_sysctl_init(struct net *net)
+{
+ struct ctl_table *tbl;
+
+ net->nf.nf_max_table_jumps_netns = NFT_DEFAULT_MAX_TABLE_JUMPS;
+ tbl = nf_limit_control_sysctl_table;
+ if (!net_eq(net, &init_net)) {
+ tbl = kmemdup(tbl, sizeof(nf_limit_control_sysctl_table), GFP_KERNEL);
+ tbl->data = &net->nf.nf_max_table_jumps_netns;
+ if (net->user_ns != &init_user_ns)
+ tbl->mode &= ~0222;
+ }
+
+ net->nf.nf_limit_control_dir_header = register_net_sysctl_sz(
+ net, "net/netfilter", tbl, ARRAY_SIZE(nf_limit_control_sysctl_table));
+
+ if (!net->nf.nf_limit_control_dir_header)
+ goto err_alloc;
+
+ return 0;
+
+err_alloc:
+ if (tbl != nf_limit_control_sysctl_table)
+ kfree(tbl);
+ return -ENOMEM;
+}
+
+static void netfilter_limit_control_sysctl_exit(struct net *net)
+{
+ unregister_net_sysctl_table(net->nf.nf_limit_control_dir_header);
+}
+#else
+static int netfilter_limit_control_sysctl_init(struct net *net)
+{
+ return 0;
+}
+
+static void netfilter_limit_control_sysctl_exit(struct net *net)
+{
+}
+#endif /* CONFIG_SYSCTL */
+
static int __net_init nf_tables_init_net(struct net *net)
{
struct nftables_pernet *nft_net = nft_pernet(net);
@@ -11933,6 +12047,11 @@ static int __net_init nf_tables_init_net(struct net *net)
nft_net->validate_state = NFT_VALIDATE_SKIP;
INIT_WORK(&nft_net->destroy_work, nf_tables_trans_destroy_work);
+ int ret = netfilter_limit_control_sysctl_init(net);
+
+ if (ret < 0)
+ return ret;
+
return 0;
}
@@ -11971,6 +12090,8 @@ static void __net_exit nf_tables_exit_net(struct net *net)
WARN_ON_ONCE(!list_empty(&nft_net->module_list));
WARN_ON_ONCE(!list_empty(&nft_net->notify_list));
WARN_ON_ONCE(!list_empty(&nft_net->destroy_list));
+
+ netfilter_limit_control_sysctl_exit(net);
}
static void nf_tables_exit_batch(struct list_head *net_exit_list)
diff --git a/net/netfilter/nft_immediate.c b/net/netfilter/nft_immediate.c
index 02ee5fb69871..b21736e389a4 100644
--- a/net/netfilter/nft_immediate.c
+++ b/net/netfilter/nft_immediate.c
@@ -260,6 +260,7 @@ static int nft_immediate_validate(const struct nft_ctx *ctx,
case NFT_JUMP:
case NFT_GOTO:
pctx->level++;
+ pctx->total_jump_count++;
err = nft_chain_validate(ctx, data->verdict.chain);
if (err < 0)
return err;
--
2.49.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v5 2/2] Add test for nft_max_table_jumps_netns sysctl
2025-05-20 3:08 [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate Shaun Brady
@ 2025-05-20 3:08 ` Shaun Brady
2025-05-20 10:02 ` [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate Florian Westphal
2025-07-22 2:11 ` Pablo Neira Ayuso
2 siblings, 0 replies; 8+ messages in thread
From: Shaun Brady @ 2025-05-20 3:08 UTC (permalink / raw)
To: netfilter-devel; +Cc: ppwaskie, fw, pablo
Introduce test for recently added jump limit functionality. Tests
sysctl behavior with regard to netns, as well as calling user_ns.
Signed-off-by: Shaun Brady <brady.1345@gmail.com>
---
.../testing/selftests/net/netfilter/Makefile | 1 +
.../netfilter/nft_max_table_jumps_netns.sh | 227 ++++++++++++++++++
2 files changed, 228 insertions(+)
create mode 100755 tools/testing/selftests/net/netfilter/nft_max_table_jumps_netns.sh
diff --git a/tools/testing/selftests/net/netfilter/Makefile b/tools/testing/selftests/net/netfilter/Makefile
index ffe161fac8b5..bc7df8feb0f7 100644
--- a/tools/testing/selftests/net/netfilter/Makefile
+++ b/tools/testing/selftests/net/netfilter/Makefile
@@ -23,6 +23,7 @@ TEST_PROGS += nft_concat_range.sh
TEST_PROGS += nft_conntrack_helper.sh
TEST_PROGS += nft_fib.sh
TEST_PROGS += nft_flowtable.sh
+TEST_PROGS += nft_max_table_jumps_netns.sh
TEST_PROGS += nft_meta.sh
TEST_PROGS += nft_nat.sh
TEST_PROGS += nft_nat_zones.sh
diff --git a/tools/testing/selftests/net/netfilter/nft_max_table_jumps_netns.sh b/tools/testing/selftests/net/netfilter/nft_max_table_jumps_netns.sh
new file mode 100755
index 000000000000..9dedd45f4fd2
--- /dev/null
+++ b/tools/testing/selftests/net/netfilter/nft_max_table_jumps_netns.sh
@@ -0,0 +1,227 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# A test script for nf_max_table_jumps_netns limit sysctl
+#
+source lib.sh
+
+DEFAULT_SYSCTL=65536
+
+user_owned_netns="a_user_owned_netns"
+
+cleanup() {
+ ip netns del $user_owned_netns 2>/dev/null || true
+}
+
+trap cleanup EXIT
+
+init_net_value=$(sysctl -n net.netfilter.nf_max_table_jumps_netns)
+
+# Check that init ns inits to default value
+if [ "$init_net_value" -ne "$DEFAULT_SYSCTL" ];then
+ echo "Fail: Does not init default value"
+ exit 1
+fi
+
+# Set to extremely small, demonstrate CAN exceed value
+sysctl -w net.netfilter.nf_max_table_jumps_netns=32 2>&1 >/dev/null
+new_value=$(sysctl -n net.netfilter.nf_max_table_jumps_netns)
+if [ "$new_value" -ne "32" ];then
+ echo "Fail: Set value not respected"
+ exit 1
+fi
+
+nft -f - <<EOF
+table inet loop-test {
+ chain test0 {
+ type filter hook input priority filter; policy accept;
+ jump test1
+ jump test1
+ }
+
+ chain test1 {
+ jump test2
+ jump test2
+ }
+
+ chain test2 {
+ jump test3
+ tcp dport 8080 drop
+ tcp dport 8080 drop
+ }
+
+ chain test3 {
+ jump test4
+ }
+
+ chain test4 {
+ jump test5
+ }
+
+ chain test5 {
+ jump test6
+ }
+
+ chain test6 {
+ jump test7
+ }
+
+ chain test7 {
+ jump test8
+ }
+
+ chain test8 {
+ jump test9
+ }
+
+ chain test9 {
+ jump test10
+ }
+
+ chain test10 {
+ jump test11
+ }
+
+ chain test11 {
+ jump test12
+ }
+
+ chain test12 {
+ jump test13
+ }
+
+ chain test13 {
+ jump test14
+ }
+
+ chain test14 {
+ jump test15
+ jump test15
+ }
+
+ chain test15 {
+ }
+}
+EOF
+
+if [ $? -ne 0 ];then
+ echo "Fail: limit not exceeded when expected"
+ exit 1
+fi
+
+nft flush ruleset
+
+# reset to default
+sysctl -w net.netfilter.nf_max_table_jumps_netns=$DEFAULT_SYSCTL 2>&1 >/dev/null
+
+# Make init_user_ns owned netns, can change value, limit is applied
+ip netns add $user_owned_netns
+ip netns exec $user_owned_netns sysctl -qw net.netfilter.nf_max_table_jumps_netns=32 2>&1
+if [ $? -ne 0 ];then
+ echo "Fail: Can't change value in init_user_ns owned namespace"
+ exit 1
+fi
+
+ip netns exec $user_owned_netns \
+nft -f - 2>&1 <<EOF
+table inet loop-test {
+ chain test0 {
+ type filter hook input priority filter; policy accept;
+ jump test1
+ jump test1
+ }
+
+ chain test1 {
+ jump test2
+ jump test2
+ }
+
+ chain test2 {
+ jump test3
+ tcp dport 8080 drop
+ tcp dport 8080 drop
+ }
+
+ chain test3 {
+ jump test4
+ }
+
+ chain test4 {
+ jump test5
+ }
+
+ chain test5 {
+ jump test6
+ }
+
+ chain test6 {
+ jump test7
+ }
+
+ chain test7 {
+ jump test8
+ }
+
+ chain test8 {
+ jump test9
+ }
+
+ chain test9 {
+ jump test10
+ }
+
+ chain test10 {
+ jump test11
+ }
+
+ chain test11 {
+ jump test12
+ }
+
+ chain test12 {
+ jump test13
+ }
+
+ chain test13 {
+ jump test14
+ }
+
+ chain test14 {
+ jump test15
+ jump test15
+ }
+
+ chain test15 {
+ }
+}
+EOF
+
+if [ $? -eq 0 ];then
+ echo "Fail: Limited incorrectly applied"
+ exit 1
+fi
+ip netns del $user_owned_netns
+
+# Previously set value does not impact root namespace; check value from before
+new_value=$(sysctl -n net.netfilter.nf_max_table_jumps_netns)
+if [ "$new_value" -ne "$DEFAULT_SYSCTL" ];then
+ echo "Fail: Non-init namespace altered init namespace"
+ exit 1
+fi
+
+# Make non-init_user_ns owned netns, can not change value
+unshare -Un sysctl -w net.netfilter.nf_max_table_jumps_netns=1234 2>&1
+if [ $? -ne 0 ];then
+ echo "Fail: Error message incorrect when non-user-init"
+ exit 1
+fi
+
+# Double check user namespace can still see limit
+new_value=(unshare -Un sysctl -n net.netfilter.nf_max_table_jumps_netns)
+if [ "$new_value" -ne "$DEFAULT_SYSCTL" ];then
+ echo "Fail: Unexpected failure when non-user-init"
+ exit 1
+fi
+
+
+exit 0
--
2.49.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate
2025-05-20 3:08 [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate Shaun Brady
2025-05-20 3:08 ` [PATCH v5 2/2] Add test for nft_max_table_jumps_netns sysctl Shaun Brady
@ 2025-05-20 10:02 ` Florian Westphal
2025-05-28 2:39 ` Shaun Brady
2025-07-22 2:11 ` Pablo Neira Ayuso
2 siblings, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2025-05-20 10:02 UTC (permalink / raw)
To: Shaun Brady; +Cc: netfilter-devel, ppwaskie, pablo
Shaun Brady <brady.1345@gmail.com> wrote:
> Observing https://bugzilla.netfilter.org/show_bug.cgi?id=1665, I was
> able to reproduce the bug using linux-stable. Summarized, when adding
> large/repeated jump chains, while still staying under the
> NFT_JUMP_STACK_SIZE (currently 16), the kernel soons locks up.
LGTM, thanks.
Acked-by: Florian Westphal <fw@strlen.de>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate
2025-05-20 10:02 ` [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate Florian Westphal
@ 2025-05-28 2:39 ` Shaun Brady
2025-05-28 10:59 ` Florian Westphal
0 siblings, 1 reply; 8+ messages in thread
From: Shaun Brady @ 2025-05-28 2:39 UTC (permalink / raw)
To: Florian Westphal; +Cc: netfilter-devel, ppwaskie, pablo
On Tue, May 20, 2025 at 12:02:13PM +0200, Florian Westphal wrote:
> LGTM, thanks.
> Acked-by: Florian Westphal <fw@strlen.de>
Much thanks to you for your feedback and patience.
Double checking... is there anything else for me to do in regards to
this patch at this time, or is it in the hands of the merge process of
the team(s)?
Just want to make sure no one is expecting action from me.
Thanks!!
SB
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate
2025-05-28 2:39 ` Shaun Brady
@ 2025-05-28 10:59 ` Florian Westphal
0 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2025-05-28 10:59 UTC (permalink / raw)
To: Shaun Brady; +Cc: netfilter-devel, ppwaskie, pablo
Shaun Brady <brady.1345@gmail.com> wrote:
> On Tue, May 20, 2025 at 12:02:13PM +0200, Florian Westphal wrote:
> > LGTM, thanks.
> > Acked-by: Florian Westphal <fw@strlen.de>
>
> Much thanks to you for your feedback and patience.
>
> Double checking... is there anything else for me to do in regards to
> this patch at this time, or is it in the hands of the merge process of
> the team(s)?
I don't think there is anyhing left do do here except waiting.
As 6.15 was just released I'd expect that there will be a backlog until
the development trees have been re-synced with linus tree.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate
2025-05-20 3:08 [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate Shaun Brady
2025-05-20 3:08 ` [PATCH v5 2/2] Add test for nft_max_table_jumps_netns sysctl Shaun Brady
2025-05-20 10:02 ` [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate Florian Westphal
@ 2025-07-22 2:11 ` Pablo Neira Ayuso
2025-07-23 3:19 ` Shaun Brady
2025-07-28 4:13 ` Shaun Brady
2 siblings, 2 replies; 8+ messages in thread
From: Pablo Neira Ayuso @ 2025-07-22 2:11 UTC (permalink / raw)
To: Shaun Brady; +Cc: netfilter-devel, ppwaskie, fw
On Mon, May 19, 2025 at 11:08:41PM -0400, Shaun Brady wrote:
[...]
> @@ -4039,14 +4049,62 @@ int nft_chain_validate(const struct nft_ctx *ctx, const struct nft_chain *chain)
> }
> EXPORT_SYMBOL_GPL(nft_chain_validate);
>
> -static int nft_table_validate(struct net *net, const struct nft_table *table)
> +/** nft_families_inc_jump - Determine if tables should add to the total jump
> + * count for a netns.
> + *
> + * @table: table of interest
> + * @sibling_table: a 'sibling' table to compare against
> + *
> + * Compare attributes of the tables to determine if the sibling tables
> + * total_jump_count should be added to the working context (done by caller).
> + * Mostly concerned with family compatibilities, but also identifying equality
> + * (a tables own addition will be recalculated soon).
> + *
> + * Ex: v4 tables do not apply to v6 packets
> + */
> +static bool nft_families_inc_jump(struct nft_table *table, struct nft_table *sibling_table)
> +{
I think we mentioned about NFPROTO_BRIDGE here too.
> + /* Invert parameters to on require one test for two cases (the reverse) */
> + if (sibling_table->family > table->family) /* include/uapi/linux/netfilter.h */
> + return nft_families_inc_jump(sibling_table, table);
> +
> + /* We found ourselves; don't add current jump count (will be counted dynamically) */
> + if (sibling_table == table)
> + return false;
> +
> + switch (table->family) {
> + case NFPROTO_IPV4:
> + if (sibling_table->family == NFPROTO_IPV6)
> + return false;
> + break;
> + }
> +
> + return true;
> +}
> +
> +static int nft_table_validate(struct net *net, struct nft_table *table)
This function is called from abort path too. I suspect total_jump_count
for this table will not be OK in such case. And this selftest does cover
many cases.
> {
> struct nft_chain *chain;
> + struct nftables_pernet *nft_net;
> struct nft_ctx ctx = {
> .net = net,
> .family = table->family,
> + .total_jump_count = 0,
> };
> int err;
> + u32 total_jumps_before_validate;
> + struct nft_table *sibling_table;
> +
> + nft_net = nft_pernet(net);
> +
> + if (!net_eq(net, &init_net)) {
> + list_for_each_entry(sibling_table, &nft_net->tables, list) {
> + if(nft_families_inc_jump(table, sibling_table))
^
coding style
> + ctx.total_jump_count += sibling_table->total_jump_count;
> + }
> + }
> +
> + total_jumps_before_validate = ctx.total_jump_count;
>
> list_for_each_entry(chain, &table->chains, list) {
> if (!nft_is_base_chain(chain))
> @@ -4060,6 +4118,8 @@ static int nft_table_validate(struct net *net, const struct nft_table *table)
> cond_resched();
> }
>
> + table->total_jump_count = ctx.total_jump_count - total_jumps_before_validate;
> +
> return 0;
> }
>
> @@ -4084,6 +4144,7 @@ int nft_setelem_validate(const struct nft_ctx *ctx, struct nft_set *set,
> case NFT_JUMP:
> case NFT_GOTO:
> pctx->level++;
> + pctx->total_jump_count++;
> err = nft_chain_validate(ctx, data->verdict.chain);
> if (err < 0)
> return err;
> @@ -11916,6 +11977,59 @@ static struct notifier_block nft_nl_notifier = {
> .notifier_call = nft_rcv_nl_event,
> };
>
> +#ifdef CONFIG_SYSCTL
> +static struct ctl_table nf_limit_control_sysctl_table[] = {
> + {
> + .procname = "nf_max_table_jumps_netns",
> + .data = &init_net.nf.nf_max_table_jumps_netns,
> + .maxlen = sizeof(init_net.nf.nf_max_table_jumps_netns),
> + .mode = 0644,
> + .proc_handler = proc_dointvec,
> + },
> +};
> +
> +static int netfilter_limit_control_sysctl_init(struct net *net)
> +{
> + struct ctl_table *tbl;
> +
> + net->nf.nf_max_table_jumps_netns = NFT_DEFAULT_MAX_TABLE_JUMPS;
> + tbl = nf_limit_control_sysctl_table;
> + if (!net_eq(net, &init_net)) {
> + tbl = kmemdup(tbl, sizeof(nf_limit_control_sysctl_table), GFP_KERNEL);
Not checking error:
if (!tbl)
...
> + tbl->data = &net->nf.nf_max_table_jumps_netns;
> + if (net->user_ns != &init_user_ns)
> + tbl->mode &= ~0222;
> + }
> +
> + net->nf.nf_limit_control_dir_header = register_net_sysctl_sz(
> + net, "net/netfilter", tbl, ARRAY_SIZE(nf_limit_control_sysctl_table));
> +
> + if (!net->nf.nf_limit_control_dir_header)
> + goto err_alloc;
> +
> + return 0;
> +
> +err_alloc:
> + if (tbl != nf_limit_control_sysctl_table)
> + kfree(tbl);
> + return -ENOMEM;
> +}
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate
2025-07-22 2:11 ` Pablo Neira Ayuso
@ 2025-07-23 3:19 ` Shaun Brady
2025-07-28 4:13 ` Shaun Brady
1 sibling, 0 replies; 8+ messages in thread
From: Shaun Brady @ 2025-07-23 3:19 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel, ppwaskie, fw
On Tue, Jul 22, 2025 at 04:11:36AM +0200, Pablo Neira Ayuso wrote:
> On Mon, May 19, 2025 at 11:08:41PM -0400, Shaun Brady wrote:
> > +static bool nft_families_inc_jump(struct nft_table *table, struct nft_table *sibling_table)
> > +{
>
> I think we mentioned about NFPROTO_BRIDGE here too.
>
NFPROTO_BRIDGE was indeed brought up, which brought us to the current
implementation of nft_families_inc_jump.
---- excerpt ----
> Maybe it is better to have a global limit for all tables, regardless
> the family, in a non-init-netns?
Looks like it would be simpler.
The only cases where processing is disjunct is ipv4 vs. ipv6.
---- /excerpt ----
The current implementation includes all tables which are not:
1) the same table we're working with (tbl == other_tbl) (self)
2) tables where the protocol is tbl.family = v4, other_tbl.family = v6
(and vice versa)
I believe from this jumps from NFPROTO_BRIDGE to high protocols will be
counted.
> > +
> > +static int nft_table_validate(struct net *net, struct nft_table *table)
>
> This function is called from abort path too. I suspect total_jump_count
> for this table will not be OK in such case. And this selftest does cover
> many cases.
The abort case is something I did not consider. Would
nft_table_validate be called after a transaction rollback, and thusly a
likely set of tables that previously validated successfully?
Another potential save is that I don't update nft_table.total_jump_count
until after a successful validation of the table.
I might ask, how is the abort state different for validation, or is the
abort state when someone literally kills an update (and thus would be a
rollback?)?
Would a test case that attempts to demonstrate safety of the value
suffice?
>
> > + if(nft_families_inc_jump(table, sibling_table))
> ^
> coding style
I see it (or really, don't). Will fix in v6.
>
> > + if (!net_eq(net, &init_net)) {
> > + tbl = kmemdup(tbl, sizeof(nf_limit_control_sysctl_table), GFP_KERNEL);
>
> Not checking error:
>
> if (!tbl)
> ...
Good catch. I think I can jump down to the same err_alloc label below.
Thanks for the feedback!
SB
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate
2025-07-22 2:11 ` Pablo Neira Ayuso
2025-07-23 3:19 ` Shaun Brady
@ 2025-07-28 4:13 ` Shaun Brady
1 sibling, 0 replies; 8+ messages in thread
From: Shaun Brady @ 2025-07-28 4:13 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel, ppwaskie, fw
On Tue, Jul 22, 2025 at 04:11:36AM +0200, Pablo Neira Ayuso wrote:
>
> This function is called from abort path too. I suspect total_jump_count
> for this table will not be OK in such case. And this selftest does cover
> many cases.
>
Your suspicion was correct. It took a bit to run over the abort path
that used validation (I think it would just be userland sending abort),
but it looked like it updated the total before the roll back.
My solution was to send in the abort state, and when
action != NFNL_ABORT_VALIDATE, we don't update the totals.
Another alternative would be to just run validate again at the end of
the abort (after the rollback), but that seemed wrong to do the costly
validate twice.
See v6 patch (just sent) that also addresses your other feedback.
Thanks!
SB
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2025-07-28 4:13 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-20 3:08 [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate Shaun Brady
2025-05-20 3:08 ` [PATCH v5 2/2] Add test for nft_max_table_jumps_netns sysctl Shaun Brady
2025-05-20 10:02 ` [PATCH v5 1/2] netfilter: nf_tables: Implement jump limit for nft_table_validate Florian Westphal
2025-05-28 2:39 ` Shaun Brady
2025-05-28 10:59 ` Florian Westphal
2025-07-22 2:11 ` Pablo Neira Ayuso
2025-07-23 3:19 ` Shaun Brady
2025-07-28 4:13 ` Shaun Brady
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).