From: Florian Westphal <fw@strlen.de>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>, netfilter-devel@vger.kernel.org
Subject: Re: [PATCH nf 1/2] netfilter: nft_set_rbtree: move sync GC from insert path to set->ops->commit
Date: Sun, 1 Oct 2023 23:08:16 +0200 [thread overview]
Message-ID: <20231001210816.GA15564@breakpoint.cc> (raw)
In-Reply-To: <ZRnSGwk40jpUActD@calendula>
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Sat, Sep 30, 2023 at 10:10:38AM +0200, Florian Westphal wrote:
> > It does. If lockless doesn't return a match it falls back to readlock.
>
> And it will in case of the update to remove the expired element, right?
Yes, but I don't think its a huge problem, this is short and we do not
grab wrlock until we actually unlink.
> > > it might be possible to simply grab the spinlock
> > > in write mode and release entries inmediately, without requiring the
> > > sync GC batch infrastructure that pipapo is using.
> >
> > Is there evidence that the on-demand GC is a problem?
>
> After your last fix, not really, other than we have to be care not to
> zap elements that are in any of the pending transaction in this batch.
Right. I don't see a problem with zapping those (newly added ones)
from the point-of-no-return phase, however, we don't have to worry about
rollback then.
> > It only searches in the relevant subtree, it should rarely, if ever,
> > encounter any expired element.
>
> Not a problem now, but this path that we follow blocks a requested
> feature.
>
> I currently do not know how to support for set element timeout update
> with this on-demand GC on the rbtree. This feature will require a new
> update state in the transaction to update the timer for an element. We
> will have to be careful because on-demand GC might zap an expired
> element that got just refreshed in this transaction (unless we
> reintroduce some sort of 'busy' bit for updates again which is
> something I prefer we do not). With 2ee52ae94baa ("netfilter:
> nft_set_rbtree: skip sync GC for new elements in this transaction")
> I could fix by looking at the generation mask to infer that this
> element is 'busy' by some pending transaction, but I do not see a way
> with an element update command in place.
Not convinced. In particular, I don't understand what this has to do
with async vs. sync gc. To me this depends how we want to handle
elements that have expired or expire during transaction.
Example:
Element E1, times out in 1 hour
Element E2, times out in 1 second
Element E3, timed out (1 second ago, 3 minutes ago, doesn't matter).
Userspace batch to kernel:
Update Element E1 to time out in 2 hours.
Update Element E2 to time out in 1 hour.
Update Element E3 to time out in 1 hour.
What is the expected outcome of this request?
Ignore E3 being reaped already and refresh the timeout (resurrection?)
Ignore E3 being reaped already and ignore the request?
Fail the transaction as E3 timed out already ("does not exist")?
Now, what about E2? If transaction is large, it could become
like E3 *during the transaction* unless we introduce some freezing
mechanism. Whats the expected outcome?
Whats the expected outcome if there is some other, unrelated
failure? I assume we have to roll back all the timeout updates, right?
If so, why not temporarily make the timeouts effective right away
and then roll back?
What if userspace asks to *shrink* timeouts? Same scenario:
Update Element E1 to time out in 1 second
Update Element E2 to time out in 1 second
Update Element E3 to time out in 1 second
We could say 'not supported' of course, would avoid the
rollback headache, because in this case long transaction
gc would definitely start to pick some of those elements
up for reaping...
> Maybe take your patch and then follow up to nf-next with this approach
> based once set element timer update is introduced? Otherwise, rbtree
> will block the introduction of this new feature for other sets too.
See above, I would first like to understand the expected
semantics of the timeout update facility.
I've pushed a (not very much tested) version of gc overhaul
to passive lookups based on expiry candidates, this removes
the need for gc sequence counters.
Its vs. nf.git but really should be re-targetted to nf-next, I'll
try to do this next week:
https://git.kernel.org/pub/scm/linux/kernel/git/fwestphal/nf.git/log/?h=nft_set_gc_query_08
next prev parent reply other threads:[~2023-10-01 21:08 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-29 16:44 [PATCH nf 1/2] netfilter: nft_set_rbtree: move sync GC from insert path to set->ops->commit Pablo Neira Ayuso
2023-09-29 16:44 ` [PATCH nf 2/2] netfilter: nft_set_rbtree: remove async GC Pablo Neira Ayuso
2023-09-29 22:25 ` [PATCH nf 1/2] netfilter: nft_set_rbtree: move sync GC from insert path to set->ops->commit Pablo Neira Ayuso
2023-09-30 8:10 ` Florian Westphal
2023-10-01 20:10 ` Pablo Neira Ayuso
2023-10-01 21:08 ` Florian Westphal [this message]
2023-10-02 8:20 ` Pablo Neira Ayuso
2023-10-02 8:47 ` Florian Westphal
2023-10-02 10:24 ` Pablo Neira Ayuso
2023-10-02 12:42 ` update element timeout support [was Re: [PATCH nf 1/2] netfilter: nft_set_rbtree: move sync GC from insert path to set->ops->commit] Pablo Neira Ayuso
2023-10-02 13:58 ` Florian Westphal
2023-10-02 14:21 ` Florian Westphal
2023-10-03 8:22 ` Pablo Neira Ayuso
2023-10-03 9:04 ` Florian Westphal
2023-10-03 9:42 ` Pablo Neira Ayuso
2023-10-03 18:24 ` Florian Westphal
2023-10-04 8:30 ` Pablo Neira Ayuso
2023-10-02 21:10 ` Pablo Neira Ayuso
2023-10-02 21:14 ` Pablo Neira Ayuso
2023-10-02 14:23 ` [PATCH nf 1/2] netfilter: nft_set_rbtree: move sync GC from insert path to set->ops->commit Florian Westphal
2023-10-02 21:37 ` Pablo Neira Ayuso
2023-10-02 21:42 ` Pablo Neira Ayuso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231001210816.GA15564@breakpoint.cc \
--to=fw@strlen.de \
--cc=netfilter-devel@vger.kernel.org \
--cc=pablo@netfilter.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).