netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Florian Westphal <fw@strlen.de>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel <netfilter-devel@vger.kernel.org>
Subject: Re: [nf-next 0/2] netfilter: nf_tables: make set flush more resistant to memory pressure
Date: Fri, 25 Jul 2025 02:24:04 +0200	[thread overview]
Message-ID: <aILOpGOJhR5xQCrc@strlen.de> (raw)
In-Reply-To: <aIK_aSCR67ge5q7s@calendula>

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Fri, Jul 04, 2025 at 02:30:16PM +0200, Florian Westphal wrote:
> > Removal of many set elements, e.g. during set flush or ruleset
> > deletion, can sometimes fail due to memory pressure.
> > Reduce likelyhood of this happening and enable sleeping allocations
> > for this.
> 
> I am exploring to skip the allocation of the transaction objects for
> this case. This needs a closer look to deal with batches like:
> 
>  delelem + flush set + abort
>  flush set + del set + abort
> 
> Special care need to be taken to avoid restoring the state of the
> element twice on abort.

Its possible to defer the flush to until after we've reached the
point of no return.

But I was worried about delete/add from datapath, since it can
happen in parallel.

Also, I think for:
flush set x + delelem x y

You get an error, as the flush marks the element as invalid in
the new generation. Can we handle this with a flag
in nft_set, that disallows all del elem operations on
the set after a flush was seen?

And, is that safe from a backwards-compat point of view?
I tought the answer was: no.
Maybe we can turn delsetelem after flush into a no-op
in case the element existed.  Not sure.

Which then means that we either can't do it, or
need to make sure that the "del elem x" is always
handled before the flush-set.

For maps it becomes even more problematic as we
would elide the deactivate step on chains.

And given walk isn't stable for rhashtable at the
moment, I don't think we can rely on "two walks" scheme.

Right now its fine because even if elements get inserted
during or after the delset operation has done the walk+deactivate,
those elements are not on the transaction list so we don't run into
trouble on abort and always undo only what the walk placed on the
transaction log.

> This would allow to save the memory allocation entirely, as well as
> speeding up the transaction handling.

Sure, it sounds tempting to pursue this.

> From userspace, the idea would be to print this event:
> 
>         flush set inet x y
> 
> to skip a large burst of events when a set is flushed.

I think thats fine.

> Is this worth to be pursued?

Yes, but I am not sure it is doable without
breaking some existing behaviour.

  reply	other threads:[~2025-07-25  0:24 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-04 12:30 [nf-next 0/2] netfilter: nf_tables: make set flush more resistant to memory pressure Florian Westphal
2025-07-04 12:30 ` [nf-next 1/2] netfilter: nf_tables: allow iter callbacks to sleep Florian Westphal
2025-07-04 12:30 ` [nf-next 2/2] netfilter: nf_tables: all transaction allocations can now sleep Florian Westphal
2025-07-24 23:19 ` [nf-next 0/2] netfilter: nf_tables: make set flush more resistant to memory pressure Pablo Neira Ayuso
2025-07-25  0:24   ` Florian Westphal [this message]
2025-07-25 10:10     ` Pablo Neira Ayuso
2025-07-25 11:15       ` Florian Westphal
2025-07-25 15:03         ` Pablo Neira Ayuso
2025-07-28 21:28           ` Florian Westphal
2025-07-29  7:22             ` Jozsef Kadlecsik
2025-07-29 10:27               ` Pablo Neira Ayuso
2025-07-29 10:50                 ` Jozsef Kadlecsik
2025-07-29 10:38             ` Pablo Neira Ayuso
2025-07-29 11:37               ` Florian Westphal
2025-07-30 16:16                 ` Pablo Neira Ayuso
2025-07-30 16:35                   ` Florian Westphal
2025-08-19 19:10                     ` Florian Westphal
2025-08-19 22:23                       ` Pablo Neira Ayuso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aILOpGOJhR5xQCrc@strlen.de \
    --to=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).