All of lore.kernel.org
 help / color / mirror / Atom feed
From: Phil Sutter <phil@nwl.cc>
To: Florian Westphal <fw@strlen.de>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>, netfilter-devel@vger.kernel.org
Subject: Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
Date: Tue, 10 Oct 2023 15:18:00 +0200	[thread overview]
Message-ID: <ZSVPCIMbH9zj22an@orbyte.nwl.cc> (raw)
In-Reply-To: <20231004124845.GA3974@breakpoint.cc>

On Wed, Oct 04, 2023 at 02:48:45PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Wed, Oct 04, 2023 at 10:46:23AM +0200, Florian Westphal wrote:
> > > I don't think the dump paths were ever designed to munge existing
> > > data.  For those parts that provide full/consistent serialization,
> > > e.g. single rule is fetched, its fine.
> > > 
> > > But whenever we may yield in between successive partial dumps, its not.
> > 
> > Yes, netlink dumps do not provide atomic listing semantics, that is
> > why there is the generation ID from userspace. I am trying to think of
> > a way to achieve this from the kernel but I did not come with any so
> > far.
> >
> > From userspace, it should be possible to keep a generation ID per
> > object in the cache, so the cache becomes a collection of objects of
> > different generations, then when listing, just take the objects that
> > are still current. Or it should be possible to disambiguate this from
> > userspace with the u64 handle, ie. keep stale objects around and merge
> > them to new ones when fetching the counters.
> > 
> > But this is lots of work from userspace, and it might get more
> > complicated if more transactions happen while retrying (see test
> > script from Phil where transaction happens in an infinite loop).
> >
> > This is the other open issue we have been discussing lately :)
> 
> Right, there is an amalgamation of different issues, lets not get swamped
> trying to solve everything at once :-)
> 
> I've collected a few patches and pushed them out to :testing.
> Stresstest is still running but so far it looks good to me.
> 
> The updated audit selftest passes, as does Xins sctp test case.
> 
> I expect to do the PR in the next few hours or so.
> 
> I will followup on nf-next once upstream has pulled the PR
> into net and did a net -> net-next merge, which might take a
> while. Followup as in "send rebased patches that get rid of
> the async gc in nft_set_rbtree".
> 
> Let me know if there is anything else I can help
> with.
> 
> Phil, I know all of this sucks from your point of view,
> this is also my fault, I did not pay too close attention
> to the reset patches, and, to be clear, even if I would have
> its likely I would have missed all of the implications
> of reusing the dump infrastructure for this.
> 
> I have not applied Pablos patch to revert the "timeout reset"
> because its relatively fresh compared to the other patches
> in the batch (and the batch is already large enough).
> 
> But I do agree with having a separate facility for timeout
> resets (timeout updates) would be better.

I still think they are orthogonal in practice (even though they
"clash"):

Dynamic sets use a timeout as they are populated from packet path and
may grow exceedingly large over time. Manual intervention means deleting
elements (which "resets" them) or adding ones (as an alternative to
packet path).

Would non-dynamic sets use both a timeout and other state? I can't
imagine a use-case for this.

> I also think we need to find a different strategy for the
> dump-and-reset part when the reset could be interrupted
> by a transaction.
> 
> ATM userspace would have to add a userspace lock like
> iptables-legacy to prevent any generation id changes while
> a "reset dump" is happening and I hope we can all agree that
> this is something we definitely do not want :-)

I suggest to communicate the situation (for now) and make 'reset rule'
return the rule to user space so the non-plural reset commands (apart
from 'reset set') become reliable alternatives for those depending upon
it.

Setting expectations straight is still the easiest fix to both the
inconsistency problem and the "resets also timeouts" problem. In the
latter case, this should be fine already as nft(8) explicitly states
"resets timeout or other state" when describing 'reset element' and
'reset set'.

Then, consistent output for plural reset commands becomes a missing
feature one may add or not. Reset with or without timeout requires a
use-case (see above). The only real bug left is the concurrent reset
messing up values.

Cheers, Phil

      parent reply	other threads:[~2023-10-10 13:18 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-02  9:05 [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element Pablo Neira Ayuso
2023-10-02  9:07 ` Florian Westphal
2023-10-02 18:06 ` Phil Sutter
2023-10-02 21:50   ` Pablo Neira Ayuso
2023-10-02 22:17     ` Pablo Neira Ayuso
2023-10-02 22:55       ` Phil Sutter
2023-10-03  7:46         ` Pablo Neira Ayuso
2023-10-03 15:57           ` Phil Sutter
2023-10-03 17:21             ` Pablo Neira Ayuso
2023-10-03 17:52               ` Phil Sutter
2023-10-03 18:03                 ` Pablo Neira Ayuso
2023-10-03 20:12                   ` Phil Sutter
2023-10-04  8:01                     ` Pablo Neira Ayuso
2023-10-04  8:07                       ` Florian Westphal
2023-10-04  8:23                         ` Pablo Neira Ayuso
2023-10-04  8:46                           ` Florian Westphal
2023-10-04  9:27                             ` Pablo Neira Ayuso
2023-10-04 12:48                               ` Florian Westphal
2023-10-04 14:32                                 ` Pablo Neira Ayuso
2023-10-10 12:48                                   ` Phil Sutter
2023-10-10 13:18                                 ` Phil Sutter [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZSVPCIMbH9zj22an@orbyte.nwl.cc \
    --to=phil@nwl.cc \
    --cc=fw@strlen.de \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.