From: Florian Westphal <fw@strlen.de>
To: Shaun Brady <brady.1345@gmail.com>
Cc: netfilter-devel@vger.kernel.org, ppwaskie@kernel.org
Subject: Re: [PATCH] netfilter: nf_tables: Implement jump limit for nft_table_validate
Date: Tue, 22 Apr 2025 07:44:10 +0200 [thread overview]
Message-ID: <20250422054410.GA25299@breakpoint.cc> (raw)
In-Reply-To: <20250422001643.113149-1-brady.1345@gmail.com>
Shaun Brady <brady.1345@gmail.com> wrote:
> The compile time limit NFT_DEFAULT_MAX_TABLE_JUMPS of 8192 was chosen to
> account for any normal use case, and when this value (and associated
> stressing loop table) was tested against a 1CPU/256MB machine, the
> system remained functional.
Keep in mind that one can register 1024 base chains, and that we have
at least 5 hook points (ingress -> prerouting -> forward -> postrouting
-> egress), so one could create a ruleset where a packet visits 41943040
chain jumps while in softirq context.
Furthermore, the largest ruleset I have archived here (iptables-save
kubernetes ruleset dump) has 27k jumps (many who are mutually exclusive
and user-defined chains that are always terminal), but nf_tables_api.c
lacks the ability to detect either of these cases).
With the proposed change, the ruleset won't load anymore.
> +u32 sysctl_nf_max_table_jumps __read_mostly = NFT_DEFAULT_MAX_TABLE_JUMPS;
> +EXPORT_SYMBOL(sysctl_nf_max_table_jumps);
Why is this exported?
> +static int netfilter_limit_control_sysctl_init(struct net *net)
> +{
> + if (net_eq(net, &init_net)) {
> + net->nf.nf_limit_control_dir_header = register_net_sysctl(
> + net,
> + "net/netfilter",
> + nf_limit_control_sysctl_table);
> + if (!net->nf.nf_limit_control_dir_header)
> + goto err_alloc;
> + }
> + return 0;
I think you can just make this a global variable.
Or, thats the alternative, make this a pernet tunable as long as
the owning user_ns is the initial user namespace.
I think the idea is fine, but I'm not sure its going to work as-is.
Possible solutions to soften the impact/breakage potential:
- make the sysctl only affect non-init-net namespaces.
- make the sysctl only affect non-init-user-ns owned namespaces.
- Add the obseved total jump count to the table structure
Then, when validating, do not start from 0 but from the sum
of the total jump count of all registered tables in the same family.
netdev family will need to be counted unconditionally.
This will allow to reject ruleset that create 1k base chains for each
family:hook point combination, which in turn would allow to increase the
default upper limit.
next prev parent reply other threads:[~2025-04-22 5:44 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-22 0:16 [PATCH] netfilter: nf_tables: Implement jump limit for nft_table_validate Shaun Brady
2025-04-22 5:44 ` Florian Westphal [this message]
2025-04-23 4:22 ` Shaun Brady
2025-04-23 10:52 ` Florian Westphal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250422054410.GA25299@breakpoint.cc \
--to=fw@strlen.de \
--cc=brady.1345@gmail.com \
--cc=netfilter-devel@vger.kernel.org \
--cc=ppwaskie@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.