From: Florian Westphal <fw@strlen.de>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel <netfilter-devel@vger.kernel.org>
Subject: Re: [nf-next 0/2] netfilter: nf_tables: make set flush more resistant to memory pressure
Date: Mon, 28 Jul 2025 23:28:50 +0200
Message-ID: <aIfrktUYzla8f9dw@strlen.de>
In-Reply-To: <aIOcq2sdP17aYgAE@calendula>
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Yes, u32 flush_id (or trans_id) needs to be added, then set
> transaction id incrementally.
Not enough, unfortunately.
The key difference between flush (delete all elements) and delset
(remove the set and all elements) is that the set itself gets detached
from the dataplane. Then, when elements get free'd, we can just iterate
the set and free all elements, they are all unreachable from the
dataplane.
But in case of a flush, that's not the case: the set stays attached to
the dataplane, so releasing the elements will
cause use-after-free. Current DELSETELEM method unlinks the elements
from the set, links them to the DELSETELEM transactional container.
Then, on abort they get re-linked to the set, or, in case of commit,
they can be free'd after the final synchronize_rcu().
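To make the unlink / re-link / free sequence concrete, here is a minimal
userspace sketch. It is a toy model, not the nf_tables code: struct elem,
struct del_trans and all function names are made up for illustration, and
the growable pointer array is only an assumption about what the
transactional container looks like. The point is that handing elements to
the container needs memory, so the flush itself can fail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; all names here are hypothetical, not kernel API. */
struct elem {
	int key;
	struct elem *next;		/* link inside the set */
};

struct set {
	struct elem *head;		/* what the "dataplane" can reach */
};

struct del_trans {
	struct elem **elems;		/* unlinked elements, owned by the transaction */
	size_t nelems;
	size_t cap;
};

static int set_add(struct set *s, int key)
{
	struct elem *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->key = key;
	e->next = s->head;
	s->head = e;
	return 0;
}

static size_t set_count(const struct set *s)
{
	size_t n = 0;

	for (const struct elem *e = s->head; e; e = e->next)
		n++;
	return n;
}

/* Growing the container needs memory, so it can fail mid-flush. */
static int trans_push(struct del_trans *t, struct elem *e)
{
	if (t->nelems == t->cap) {
		size_t cap = t->cap ? t->cap * 2 : 4;
		struct elem **tmp = realloc(t->elems, cap * sizeof(*tmp));

		if (!tmp)
			return -1;
		t->elems = tmp;
		t->cap = cap;
	}
	t->elems[t->nelems++] = e;
	return 0;
}

/* Flush: unlink each element from the set, hand it to the transaction. */
static int set_flush(struct set *s, struct del_trans *t)
{
	while (s->head) {
		struct elem *e = s->head;

		if (trans_push(t, e) < 0)
			return -1;	/* element stays in the set; caller aborts */
		s->head = e->next;
		e->next = NULL;
	}
	return 0;
}

/* Abort: re-link everything, restoring the original order. */
static void trans_abort(struct set *s, struct del_trans *t)
{
	while (t->nelems) {
		struct elem *e = t->elems[--t->nelems];

		e->next = s->head;
		s->head = e;
	}
	free(t->elems);
	t->elems = NULL;
	t->cap = 0;
}

/* Commit: the kernel would wait for synchronize_rcu() before this point. */
static void trans_commit(struct del_trans *t)
{
	while (t->nelems)
		free(t->elems[--t->nelems]);
	free(t->elems);
	t->elems = NULL;
	t->cap = 0;
	assert(t->nelems == 0);
}
```

A caller would run set_flush(), then either trans_abort() to get the
original set back or trans_commit() to release the elements.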
That leaves two options:
1. Use the first patchset, that makes delsetelem allocations sleepable.
2. Add a pointer + an id to nft_set_ext.
The drawback of 2) is the added mem cost for every set element (first
patch series only forces it for rhashtable).
The major upside however is that DELSETELEM transaction objects are
simplified a lot, the to-be-deleted elements could be linked to it by
the then-always-available nft_set_ext pointer, i.e., each DELSETELEM
transaction object can take an arbitrary number of elements.
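Roughly what I have in mind for 2), again as a userspace toy rather than
kernel code. struct elem_ext stands in for nft_set_ext; the trans_next and
trans_id fields and every function name are hypothetical. Because the link
lives in the element itself, queueing an element for deletion allocates
nothing in this model, and one transaction object holds any number of
elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace model only; field and function names are hypothetical. */
struct elem;

struct elem_ext {
	struct elem *trans_next;	/* the added pointer: link in a transaction */
	uint32_t trans_id;		/* the added id: owning transaction, 0 = none */
};

struct elem {
	int key;
	struct elem *next;		/* link inside the set */
	struct elem_ext ext;		/* present on every element */
};

struct set {
	struct elem *head;
};

struct del_trans {
	uint32_t id;
	struct elem *head;		/* chained through ext.trans_next */
	size_t count;
};

static int set_add(struct set *s, int key)
{
	struct elem *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->key = key;
	e->ext.trans_next = NULL;
	e->ext.trans_id = 0;
	e->next = s->head;
	s->head = e;
	return 0;
}

static size_t set_count(const struct set *s)
{
	size_t n = 0;

	for (const struct elem *e = s->head; e; e = e->next)
		n++;
	return n;
}

/* Flush allocates nothing here, so it has no failure path. */
static void set_flush(struct set *s, struct del_trans *t)
{
	while (s->head) {
		struct elem *e = s->head;

		assert(e->ext.trans_id == 0);	/* not queued elsewhere */
		s->head = e->next;
		e->next = NULL;
		e->ext.trans_id = t->id;
		e->ext.trans_next = t->head;
		t->head = e;
		t->count++;
	}
}

/* Abort: clear the transaction marks and re-link into the set. */
static void trans_abort(struct set *s, struct del_trans *t)
{
	while (t->head) {
		struct elem *e = t->head;

		t->head = e->ext.trans_next;
		e->ext.trans_next = NULL;
		e->ext.trans_id = 0;
		e->next = s->head;
		s->head = e;
	}
	t->count = 0;
}

/* Commit: the kernel would wait for synchronize_rcu() before this point. */
static void trans_commit(struct del_trans *t)
{
	while (t->head) {
		struct elem *e = t->head;

		t->head = e->ext.trans_next;
		free(e);
	}
	t->count = 0;
}
```

The cost is visible in struct elem: every element carries the extra
pointer and id whether or not it is ever flushed.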
Unless you disagree, I will go for 2).
This will also allow removing the krealloc() compaction for DELSETELEM,
so it should be a net code-removal patch.
Another option might be to replace a flush with delset+newset
internally, but this will get tricky because the set/map is still being
referenced by other rules, we'd have to fixup the ruleset internally to
use the new/empty set while still being able to roll back.
Probably too tricky / hard to get right, but I'll check it anyway.
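The fixup problem in its simplest form, as a userspace toy with made-up
names (struct rule, swap_prepare, swap_abort are not kernel symbols): every
rule that points at the old set must be redirected to the new, empty one,
and the transaction has to remember enough to undo that. The real ruleset
has more kinds of references than this flat array, which is where it gets
hard.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; all names are hypothetical. */
struct set {
	int nelems;
};

struct rule {
	struct set *set;		/* set/map this rule refers to */
};

struct swap_trans {
	struct set *old_set;		/* kept for rollback */
	struct set *new_set;
};

/* Redirect every rule using old_set to new_set and record the swap. */
static void swap_prepare(struct swap_trans *t, struct rule *rules, size_t n,
			 struct set *old_set, struct set *new_set)
{
	assert(old_set != new_set);
	t->old_set = old_set;
	t->new_set = new_set;
	for (size_t i = 0; i < n; i++)
		if (rules[i].set == old_set)
			rules[i].set = new_set;
}

/* Roll back: point the redirected rules at the old set again. */
static void swap_abort(const struct swap_trans *t, struct rule *rules, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (rules[i].set == t->new_set)
			rules[i].set = t->old_set;
}
```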
Thread overview: 18+ messages
2025-07-04 12:30 [nf-next 0/2] netfilter: nf_tables: make set flush more resistant to memory pressure Florian Westphal
2025-07-04 12:30 ` [nf-next 1/2] netfilter: nf_tables: allow iter callbacks to sleep Florian Westphal
2025-07-04 12:30 ` [nf-next 2/2] netfilter: nf_tables: all transaction allocations can now sleep Florian Westphal
2025-07-24 23:19 ` [nf-next 0/2] netfilter: nf_tables: make set flush more resistant to memory pressure Pablo Neira Ayuso
2025-07-25 0:24 ` Florian Westphal
2025-07-25 10:10 ` Pablo Neira Ayuso
2025-07-25 11:15 ` Florian Westphal
2025-07-25 15:03 ` Pablo Neira Ayuso
2025-07-28 21:28 ` Florian Westphal [this message]
2025-07-29 7:22 ` Jozsef Kadlecsik
2025-07-29 10:27 ` Pablo Neira Ayuso
2025-07-29 10:50 ` Jozsef Kadlecsik
2025-07-29 10:38 ` Pablo Neira Ayuso
2025-07-29 11:37 ` Florian Westphal
2025-07-30 16:16 ` Pablo Neira Ayuso
2025-07-30 16:35 ` Florian Westphal
2025-08-19 19:10 ` Florian Westphal
2025-08-19 22:23 ` Pablo Neira Ayuso