* [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
@ 2023-10-02 9:05 Pablo Neira Ayuso
2023-10-02 9:07 ` Florian Westphal
2023-10-02 18:06 ` Phil Sutter
0 siblings, 2 replies; 21+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-02 9:05 UTC (permalink / raw)
To: netfilter-devel; +Cc: phil
The dump and reset command should not refresh the timeout, this command
is intended to allow users to list existing stateful objects and reset
them, element expiration should be refresh via transaction instead with
a specific command to achieve this, otherwise this is entering combo
semantics that will be hard to be undone later (eg. a user asking to
retrieve counters but _not_ requiring to refresh expiration).
Fixes: 079cd633219d ("netfilter: nf_tables: Introduce NFT_MSG_GETSETELEM_RESET")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_tables_api.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 3bb5ea5d4eed..0e375b7a7cad 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -5615,9 +5615,6 @@ static int nf_tables_fill_setelem(struct sk_buff *skb,
nf_jiffies64_to_msecs(expires),
NFTA_SET_ELEM_PAD))
goto nla_put_failure;
-
- if (reset)
- *nft_set_ext_expiration(ext) = now + timeout;
}
if (nft_set_ext_exists(ext, NFT_SET_EXT_USERDATA)) {
--
2.30.2
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-02 9:05 [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element Pablo Neira Ayuso
@ 2023-10-02 9:07 ` Florian Westphal
2023-10-02 18:06 ` Phil Sutter
1 sibling, 0 replies; 21+ messages in thread
From: Florian Westphal @ 2023-10-02 9:07 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel, phil
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> The dump and reset command should not refresh the timeout, this command
> is intended to allow users to list existing stateful objects and reset
> them, element expiration should be refresh via transaction instead with
> a specific command to achieve this, otherwise this is entering combo
> semantics that will be hard to be undone later (eg. a user asking to
> retrieve counters but _not_ requiring to refresh expiration).
Agreed.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-02 9:05 [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element Pablo Neira Ayuso
2023-10-02 9:07 ` Florian Westphal
@ 2023-10-02 18:06 ` Phil Sutter
2023-10-02 21:50 ` Pablo Neira Ayuso
1 sibling, 1 reply; 21+ messages in thread
From: Phil Sutter @ 2023-10-02 18:06 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel
On Mon, Oct 02, 2023 at 11:05:16AM +0200, Pablo Neira Ayuso wrote:
> The dump and reset command should not refresh the timeout, this command
> is intended to allow users to list existing stateful objects and reset
> them, element expiration should be refresh via transaction instead with
> a specific command to achieve this, otherwise this is entering combo
> semantics that will be hard to be undone later (eg. a user asking to
> retrieve counters but _not_ requiring to refresh expiration).
From a users' perspective, what is special about the element expires
value disqualifying it from being reset along with any counter/quota
values?
Do you have a PoC for set element reset via transaction yet? Can we
integrate non-timeout resets with it, too? Because IIUC, that's an
alternative to the pending reset locking.
What we have now is a broad 'reset element', not specifying what to
reset. If the above is a feature being asked for, I'd rather implement
'reset element counter', 'reset element timeout', 'reset element quota',
etc. commands.
Cheers, Phil
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-02 18:06 ` Phil Sutter
@ 2023-10-02 21:50 ` Pablo Neira Ayuso
2023-10-02 22:17 ` Pablo Neira Ayuso
0 siblings, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-02 21:50 UTC (permalink / raw)
To: Phil Sutter, netfilter-devel
On Mon, Oct 02, 2023 at 08:06:42PM +0200, Phil Sutter wrote:
> On Mon, Oct 02, 2023 at 11:05:16AM +0200, Pablo Neira Ayuso wrote:
> > The dump and reset command should not refresh the timeout, this command
> > is intended to allow users to list existing stateful objects and reset
> > them, element expiration should be refresh via transaction instead with
> > a specific command to achieve this, otherwise this is entering combo
> > semantics that will be hard to be undone later (eg. a user asking to
> > retrieve counters but _not_ requiring to refresh expiration).
>
> From a users' perspective, what is special about the element expires
> value disqualifying it from being reset along with any counter/quota
> values?
>
> Do you have a PoC for set element reset via transaction yet? Can we
> integrate non-timeout resets with it, too? Because IIUC, that's an
> alternative to the pending reset locking.
Problem is listing is not supported from transaction path, this is
using existing netlink dump infrastructure which runs lockless via
rcu. So we could support reset, but we could not use netlink dump
semantics to fetch the values, and user likely wants this to
fetch-and-reset as in ctnetlink.
> What we have now is a broad 'reset element', not specifying what to
> reset. If the above is a feature being asked for, I'd rather implement
> 'reset element counter', 'reset element timeout', 'reset element quota',
> etc. commands.
We are currently discussing how to implement refresh timeout into the
transaction model.
I would suggest we keep this chunk away by now for the _RESET command,
until we agree on next steps.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-02 21:50 ` Pablo Neira Ayuso
@ 2023-10-02 22:17 ` Pablo Neira Ayuso
2023-10-02 22:55 ` Phil Sutter
0 siblings, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-02 22:17 UTC (permalink / raw)
To: Phil Sutter, netfilter-devel
On Mon, Oct 02, 2023 at 11:50:25PM +0200, Pablo Neira Ayuso wrote:
> On Mon, Oct 02, 2023 at 08:06:42PM +0200, Phil Sutter wrote:
> > On Mon, Oct 02, 2023 at 11:05:16AM +0200, Pablo Neira Ayuso wrote:
> > > The dump and reset command should not refresh the timeout, this command
> > > is intended to allow users to list existing stateful objects and reset
> > > them, element expiration should be refresh via transaction instead with
> > > a specific command to achieve this, otherwise this is entering combo
> > > semantics that will be hard to be undone later (eg. a user asking to
> > > retrieve counters but _not_ requiring to refresh expiration).
> >
> > From a users' perspective, what is special about the element expires
> > value disqualifying it from being reset along with any counter/quota
> > values?
> >
> > Do you have a PoC for set element reset via transaction yet? Can we
> > integrate non-timeout resets with it, too? Because IIUC, that's an
> > alternative to the pending reset locking.
>
> Problem is listing is not supported from transaction path, this is
> using existing netlink dump infrastructure which runs lockless via
> rcu. So we could support reset, but we could not use netlink dump
> semantics to fetch the values, and user likely wants this to
> fetch-and-reset as in ctnetlink.
Well, with NLM_F_ECHO, it should be possible to explore reset under
commit_mutex, but is it really worth the effort?
With two concurrent threads, we just want to ensure that no invalid
state shows in the listing (you mentioned it is possible to list
negative values with two threads listing-and-resetting at the same
time).
I think we should just make sure something valid is included in the
listing, but as for the two threads performing list-and-reset, why
ensure strict serialization?
This is a rare operation to fetch statistics, most likely having a
single process performing this in place? So we are currently
discussing how to fix the (negative) non-sense in the listing.
> > What we have now is a broad 'reset element', not specifying what to
> > reset. If the above is a feature being asked for, I'd rather implement
> > 'reset element counter', 'reset element timeout', 'reset element quota',
> > etc. commands.
>
> We are currently discussing how to implement refresh timeout into the
> transaction model.
>
> I would suggest we keep this chunk away by now for the _RESET command,
> until we agree on next steps.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-02 22:17 ` Pablo Neira Ayuso
@ 2023-10-02 22:55 ` Phil Sutter
2023-10-03 7:46 ` Pablo Neira Ayuso
0 siblings, 1 reply; 21+ messages in thread
From: Phil Sutter @ 2023-10-02 22:55 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel
On Tue, Oct 03, 2023 at 12:17:53AM +0200, Pablo Neira Ayuso wrote:
> On Mon, Oct 02, 2023 at 11:50:25PM +0200, Pablo Neira Ayuso wrote:
> > On Mon, Oct 02, 2023 at 08:06:42PM +0200, Phil Sutter wrote:
> > > On Mon, Oct 02, 2023 at 11:05:16AM +0200, Pablo Neira Ayuso wrote:
> > > > The dump and reset command should not refresh the timeout, this command
> > > > is intended to allow users to list existing stateful objects and reset
> > > > them, element expiration should be refresh via transaction instead with
> > > > a specific command to achieve this, otherwise this is entering combo
> > > > semantics that will be hard to be undone later (eg. a user asking to
> > > > retrieve counters but _not_ requiring to refresh expiration).
> > >
> > > From a users' perspective, what is special about the element expires
> > > value disqualifying it from being reset along with any counter/quota
> > > values?
> > >
> > > Do you have a PoC for set element reset via transaction yet? Can we
> > > integrate non-timeout resets with it, too? Because IIUC, that's an
> > > alternative to the pending reset locking.
> >
> > Problem is listing is not supported from transaction path, this is
> > using existing netlink dump infrastructure which runs lockless via
> > rcu. So we could support reset, but we could not use netlink dump
> > semantics to fetch the values, and user likely wants this to
> > fetch-and-reset as in ctnetlink.
>
> Well, with NLM_F_ECHO, it should be possible to explore reset under
> commit_mutex, but is it really worth the effort?
I don't understand. Wasn't it your proposal to move things into the
transaction? Above you write: "element expiration should be refresh via
transaction instead". I asked what is special about timeout, why not
handle all element state reset the same way?
> With two concurrent threads, we just want to ensure that no invalid
> state shows in the listing (you mentioned it is possible to list
> negative values with two threads listing-and-resetting at the same
> time).
It's not just in the listing, the actual values underrun. A quota e.g.
will immediately deny.
> I think we should just make sure something valid is included in the
> listing, but as for the two threads performing list-and-reset, why
> ensure strict serialization?
I seem to lack context here. Is there an alternative to "strict"
serializing? Expressions' dump+reset callbacks must not run multiple
times at the same time for the same expression. At least not how they
are currently implemented.
> This is a rare operation to fetch statistics, most likely having a
> single process performing this in place? So we are currently
> discussing how to fix the (negative) non-sense in the listing.
>
> > > What we have now is a broad 'reset element', not specifying what to
> > > reset. If the above is a feature being asked for, I'd rather implement
> > > 'reset element counter', 'reset element timeout', 'reset element quota',
> > > etc. commands.
> >
> > We are currently discussing how to implement refresh timeout into the
> > transaction model.
> >
> > I would suggest we keep this chunk away by now for the _RESET command,
> > until we agree on next steps.
I would suggest to leave things as-is until there's hard evidence why it
has to change now or there is a viable alternative implementation.
It's your call to make since you're the maintainer of the project, but
please stick to the standards you demand from other contributors.
Cheers, Phil
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-02 22:55 ` Phil Sutter
@ 2023-10-03 7:46 ` Pablo Neira Ayuso
2023-10-03 15:57 ` Phil Sutter
0 siblings, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-03 7:46 UTC (permalink / raw)
To: Phil Sutter, netfilter-devel
On Tue, Oct 03, 2023 at 12:55:41AM +0200, Phil Sutter wrote:
> On Tue, Oct 03, 2023 at 12:17:53AM +0200, Pablo Neira Ayuso wrote:
> > On Mon, Oct 02, 2023 at 11:50:25PM +0200, Pablo Neira Ayuso wrote:
> > > On Mon, Oct 02, 2023 at 08:06:42PM +0200, Phil Sutter wrote:
> > > > On Mon, Oct 02, 2023 at 11:05:16AM +0200, Pablo Neira Ayuso wrote:
> > > > > The dump and reset command should not refresh the timeout, this command
> > > > > is intended to allow users to list existing stateful objects and reset
> > > > > them, element expiration should be refresh via transaction instead with
> > > > > a specific command to achieve this, otherwise this is entering combo
> > > > > semantics that will be hard to be undone later (eg. a user asking to
> > > > > retrieve counters but _not_ requiring to refresh expiration).
> > > >
> > > > From a users' perspective, what is special about the element expires
> > > > value disqualifying it from being reset along with any counter/quota
> > > > values?
> > > >
> > > > Do you have a PoC for set element reset via transaction yet? Can we
> > > > integrate non-timeout resets with it, too? Because IIUC, that's an
> > > > alternative to the pending reset locking.
> > >
> > > Problem is listing is not supported from transaction path, this is
> > > using existing netlink dump infrastructure which runs lockless via
> > > rcu. So we could support reset, but we could not use netlink dump
> > > semantics to fetch the values, and user likely wants this to
> > > fetch-and-reset as in ctnetlink.
> >
> > Well, with NLM_F_ECHO, it should be possible to explore reset under
> > commit_mutex, but is it really worth the effort?
>
> I don't understand. Wasn't it your proposal to move things into the
> transaction? Above you write: "element expiration should be refresh via
> transaction instead". I asked what is special about timeout, why not
> handle all element state reset the same way?
>
> > With two concurrent threads, we just want to ensure that no invalid
> > state shows in the listing (you mentioned it is possible to list
> > negative values with two threads listing-and-resetting at the same
> > time).
>
> It's not just in the listing, the actual values underrun. A quota e.g.
> will immediately deny.
>
> > I think we should just make sure something valid is included in the
> > listing, but as for the two threads performing list-and-reset, why
> > ensure strict serialization?
>
> I seem to lack context here. Is there an alternative to "strict"
> serializing? Expressions' dump+reset callbacks must not run multiple
> times at the same time for the same expression. At least not how they
> are currently implemented.
>
> > This is a rare operation to fetch statistics, most likely having a
> > single process performing this in place? So we are currently
> > discussing how to fix the (negative) non-sense in the listing.
> >
> > > > What we have now is a broad 'reset element', not specifying what to
> > > > reset. If the above is a feature being asked for, I'd rather implement
> > > > 'reset element counter', 'reset element timeout', 'reset element quota',
> > > > etc. commands.
> > >
> > > We are currently discussing how to implement refresh timeout into the
> > > transaction model.
> > >
> > > I would suggest we keep this chunk away by now for the _RESET command,
> > > until we agree on next steps.
>
> I would suggest to leave things as-is until there's hard evidence why it
> has to change now or there is a viable alternative implementation.
Leave things as is means we will have this implicit refresh in the
element refresh. We have no such semantics in conntrack, for example,
and conntrack can be seen as a hardcoded set with a fixed number of
tuples.
> It's your call to make since you're the maintainer of the project, but
> please stick to the standards you demand from other contributors.
We are discussing a very specific topic here and I am expecting to get
some kind of acknowledgement from you that this revert of this
specific chunk is good to have.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-03 7:46 ` Pablo Neira Ayuso
@ 2023-10-03 15:57 ` Phil Sutter
2023-10-03 17:21 ` Pablo Neira Ayuso
0 siblings, 1 reply; 21+ messages in thread
From: Phil Sutter @ 2023-10-03 15:57 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel
On Tue, Oct 03, 2023 at 09:46:46AM +0200, Pablo Neira Ayuso wrote:
> On Tue, Oct 03, 2023 at 12:55:41AM +0200, Phil Sutter wrote:
> > On Tue, Oct 03, 2023 at 12:17:53AM +0200, Pablo Neira Ayuso wrote:
> > > On Mon, Oct 02, 2023 at 11:50:25PM +0200, Pablo Neira Ayuso wrote:
> > > > On Mon, Oct 02, 2023 at 08:06:42PM +0200, Phil Sutter wrote:
> > > > > On Mon, Oct 02, 2023 at 11:05:16AM +0200, Pablo Neira Ayuso wrote:
> > > > > > The dump and reset command should not refresh the timeout, this command
> > > > > > is intended to allow users to list existing stateful objects and reset
> > > > > > them, element expiration should be refresh via transaction instead with
> > > > > > a specific command to achieve this, otherwise this is entering combo
> > > > > > semantics that will be hard to be undone later (eg. a user asking to
> > > > > > retrieve counters but _not_ requiring to refresh expiration).
> > > > >
> > > > > From a users' perspective, what is special about the element expires
> > > > > value disqualifying it from being reset along with any counter/quota
> > > > > values?
> > > > >
> > > > > Do you have a PoC for set element reset via transaction yet? Can we
> > > > > integrate non-timeout resets with it, too? Because IIUC, that's an
> > > > > alternative to the pending reset locking.
> > > >
> > > > Problem is listing is not supported from transaction path, this is
> > > > using existing netlink dump infrastructure which runs lockless via
> > > > rcu. So we could support reset, but we could not use netlink dump
> > > > semantics to fetch the values, and user likely wants this to
> > > > fetch-and-reset as in ctnetlink.
> > >
> > > Well, with NLM_F_ECHO, it should be possible to explore reset under
> > > commit_mutex, but is it really worth the effort?
> >
> > I don't understand. Wasn't it your proposal to move things into the
> > transaction? Above you write: "element expiration should be refresh via
> > transaction instead". I asked what is special about timeout, why not
> > handle all element state reset the same way?
> >
> > > With two concurrent threads, we just want to ensure that no invalid
> > > state shows in the listing (you mentioned it is possible to list
> > > negative values with two threads listing-and-resetting at the same
> > > time).
> >
> > It's not just in the listing, the actual values underrun. A quota e.g.
> > will immediately deny.
> >
> > > I think we should just make sure something valid is included in the
> > > listing, but as for the two threads performing list-and-reset, why
> > > ensure strict serialization?
> >
> > I seem to lack context here. Is there an alternative to "strict"
> > serializing? Expressions' dump+reset callbacks must not run multiple
> > times at the same time for the same expression. At least not how they
> > are currently implemented.
> >
> > > This is a rare operation to fetch statistics, most likely having a
> > > single process performing this in place? So we are currently
> > > discussing how to fix the (negative) non-sense in the listing.
> > >
> > > > > What we have now is a broad 'reset element', not specifying what to
> > > > > reset. If the above is a feature being asked for, I'd rather implement
> > > > > 'reset element counter', 'reset element timeout', 'reset element quota',
> > > > > etc. commands.
> > > >
> > > > We are currently discussing how to implement refresh timeout into the
> > > > transaction model.
> > > >
> > > > I would suggest we keep this chunk away by now for the _RESET command,
> > > > until we agree on next steps.
> >
> > I would suggest to leave things as-is until there's hard evidence why it
> > has to change now or there is a viable alternative implementation.
>
> Leave things as is means we will have this implicit refresh in the
> element refresh. We have no such semantics in conntrack, for example,
> and conntrack can be seen as a hardcoded set with a fixed number of
> tuples.
It's just as implicit as counter or quota reset. I define "reset
element" command as "reset any state in given element", so from my
perspective it makes perfectly sense to reset the timeout as well.
> > It's your call to make since you're the maintainer of the project, but
> > please stick to the standards you demand from other contributors.
>
> We are discussing a very specific topic here and I am expecting to get
> some kind of acknowledgement from you that this revert of this
> specific chunk is good to have.
Given the data you provided: No, I don't see any sense in this change.
Cheers, Phil
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-03 15:57 ` Phil Sutter
@ 2023-10-03 17:21 ` Pablo Neira Ayuso
2023-10-03 17:52 ` Phil Sutter
0 siblings, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-03 17:21 UTC (permalink / raw)
To: Phil Sutter, netfilter-devel
On Tue, Oct 03, 2023 at 05:57:59PM +0200, Phil Sutter wrote:
> On Tue, Oct 03, 2023 at 09:46:46AM +0200, Pablo Neira Ayuso wrote:
> > On Tue, Oct 03, 2023 at 12:55:41AM +0200, Phil Sutter wrote:
> > > On Tue, Oct 03, 2023 at 12:17:53AM +0200, Pablo Neira Ayuso wrote:
> > > > On Mon, Oct 02, 2023 at 11:50:25PM +0200, Pablo Neira Ayuso wrote:
> > > > > On Mon, Oct 02, 2023 at 08:06:42PM +0200, Phil Sutter wrote:
> > > > > > On Mon, Oct 02, 2023 at 11:05:16AM +0200, Pablo Neira Ayuso wrote:
> > > > > > > The dump and reset command should not refresh the timeout, this command
> > > > > > > is intended to allow users to list existing stateful objects and reset
> > > > > > > them, element expiration should be refresh via transaction instead with
> > > > > > > a specific command to achieve this, otherwise this is entering combo
> > > > > > > semantics that will be hard to be undone later (eg. a user asking to
> > > > > > > retrieve counters but _not_ requiring to refresh expiration).
> > > > > >
> > > > > > From a users' perspective, what is special about the element expires
> > > > > > value disqualifying it from being reset along with any counter/quota
> > > > > > values?
> > > > > >
> > > > > > Do you have a PoC for set element reset via transaction yet? Can we
> > > > > > integrate non-timeout resets with it, too? Because IIUC, that's an
> > > > > > alternative to the pending reset locking.
> > > > >
> > > > > Problem is listing is not supported from transaction path, this is
> > > > > using existing netlink dump infrastructure which runs lockless via
> > > > > rcu. So we could support reset, but we could not use netlink dump
> > > > > semantics to fetch the values, and user likely wants this to
> > > > > fetch-and-reset as in ctnetlink.
> > > >
> > > > Well, with NLM_F_ECHO, it should be possible to explore reset under
> > > > commit_mutex, but is it really worth the effort?
> > >
> > > I don't understand. Wasn't it your proposal to move things into the
> > > transaction? Above you write: "element expiration should be refresh via
> > > transaction instead". I asked what is special about timeout, why not
> > > handle all element state reset the same way?
> > >
> > > > With two concurrent threads, we just want to ensure that no invalid
> > > > state shows in the listing (you mentioned it is possible to list
> > > > negative values with two threads listing-and-resetting at the same
> > > > time).
> > >
> > > It's not just in the listing, the actual values underrun. A quota e.g.
> > > will immediately deny.
> > >
> > > > I think we should just make sure something valid is included in the
> > > > listing, but as for the two threads performing list-and-reset, why
> > > > ensure strict serialization?
> > >
> > > I seem to lack context here. Is there an alternative to "strict"
> > > serializing? Expressions' dump+reset callbacks must not run multiple
> > > times at the same time for the same expression. At least not how they
> > > are currently implemented.
> > >
> > > > This is a rare operation to fetch statistics, most likely having a
> > > > single process performing this in place? So we are currently
> > > > discussing how to fix the (negative) non-sense in the listing.
> > > >
> > > > > > What we have now is a broad 'reset element', not specifying what to
> > > > > > reset. If the above is a feature being asked for, I'd rather implement
> > > > > > 'reset element counter', 'reset element timeout', 'reset element quota',
> > > > > > etc. commands.
> > > > >
> > > > > We are currently discussing how to implement refresh timeout into the
> > > > > transaction model.
> > > > >
> > > > > I would suggest we keep this chunk away by now for the _RESET command,
> > > > > until we agree on next steps.
> > >
> > > I would suggest to leave things as-is until there's hard evidence why it
> > > has to change now or there is a viable alternative implementation.
> >
> > Leave things as is means we will have this implicit refresh in the
> > element refresh. We have no such semantics in conntrack, for example,
> > and conntrack can be seen as a hardcoded set with a fixed number of
> > tuples.
>
> It's just as implicit as counter or quota reset. I define "reset
> element" command as "reset any state in given element", so from my
> perspective it makes perfectly sense to reset the timeout as well.
The timer is usually updated from the packet path. We are now planning
to support for refreshing the timer from control plane which is not
supported.
Note it is possible to set custom timeouts in elements too, for
example:
# nft add element x y { 1.2.3.4 expires 50s }
while default set policy might be different 3m. In this case, you
assume resetting to the default set timeout policy (3m) is fine.
I can keep searching for more scenarios where this timeout refresh
does not make sense in this reset path.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-03 17:21 ` Pablo Neira Ayuso
@ 2023-10-03 17:52 ` Phil Sutter
2023-10-03 18:03 ` Pablo Neira Ayuso
0 siblings, 1 reply; 21+ messages in thread
From: Phil Sutter @ 2023-10-03 17:52 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel
On Tue, Oct 03, 2023 at 07:21:33PM +0200, Pablo Neira Ayuso wrote:
> On Tue, Oct 03, 2023 at 05:57:59PM +0200, Phil Sutter wrote:
> > On Tue, Oct 03, 2023 at 09:46:46AM +0200, Pablo Neira Ayuso wrote:
> > > On Tue, Oct 03, 2023 at 12:55:41AM +0200, Phil Sutter wrote:
> > > > On Tue, Oct 03, 2023 at 12:17:53AM +0200, Pablo Neira Ayuso wrote:
> > > > > On Mon, Oct 02, 2023 at 11:50:25PM +0200, Pablo Neira Ayuso wrote:
> > > > > > On Mon, Oct 02, 2023 at 08:06:42PM +0200, Phil Sutter wrote:
> > > > > > > On Mon, Oct 02, 2023 at 11:05:16AM +0200, Pablo Neira Ayuso wrote:
> > > > > > > > The dump and reset command should not refresh the timeout, this command
> > > > > > > > is intended to allow users to list existing stateful objects and reset
> > > > > > > > them, element expiration should be refresh via transaction instead with
> > > > > > > > a specific command to achieve this, otherwise this is entering combo
> > > > > > > > semantics that will be hard to be undone later (eg. a user asking to
> > > > > > > > retrieve counters but _not_ requiring to refresh expiration).
> > > > > > >
> > > > > > > From a users' perspective, what is special about the element expires
> > > > > > > value disqualifying it from being reset along with any counter/quota
> > > > > > > values?
> > > > > > >
> > > > > > > Do you have a PoC for set element reset via transaction yet? Can we
> > > > > > > integrate non-timeout resets with it, too? Because IIUC, that's an
> > > > > > > alternative to the pending reset locking.
> > > > > >
> > > > > > Problem is listing is not supported from transaction path, this is
> > > > > > using existing netlink dump infrastructure which runs lockless via
> > > > > > rcu. So we could support reset, but we could not use netlink dump
> > > > > > semantics to fetch the values, and user likely wants this to
> > > > > > fetch-and-reset as in ctnetlink.
> > > > >
> > > > > Well, with NLM_F_ECHO, it should be possible to explore reset under
> > > > > commit_mutex, but is it really worth the effort?
> > > >
> > > > I don't understand. Wasn't it your proposal to move things into the
> > > > transaction? Above you write: "element expiration should be refresh via
> > > > transaction instead". I asked what is special about timeout, why not
> > > > handle all element state reset the same way?
> > > >
> > > > > With two concurrent threads, we just want to ensure that no invalid
> > > > > state shows in the listing (you mentioned it is possible to list
> > > > > negative values with two threads listing-and-resetting at the same
> > > > > time).
> > > >
> > > > It's not just in the listing, the actual values underrun. A quota e.g.
> > > > will immediately deny.
> > > >
> > > > > I think we should just make sure something valid is included in the
> > > > > listing, but as for the two threads performing list-and-reset, why
> > > > > ensure strict serialization?
> > > >
> > > > I seem to lack context here. Is there an alternative to "strict"
> > > > serializing? Expressions' dump+reset callbacks must not run multiple
> > > > times at the same time for the same expression. At least not how they
> > > > are currently implemented.
> > > >
> > > > > This is a rare operation to fetch statistics, most likely having a
> > > > > single process performing this in place? So we are currently
> > > > > discussing how to fix the (negative) non-sense in the listing.
> > > > >
> > > > > > > What we have now is a broad 'reset element', not specifying what to
> > > > > > > reset. If the above is a feature being asked for, I'd rather implement
> > > > > > > 'reset element counter', 'reset element timeout', 'reset element quota',
> > > > > > > etc. commands.
> > > > > >
> > > > > > We are currently discussing how to implement refresh timeout into the
> > > > > > transaction model.
> > > > > >
> > > > > > I would suggest we keep this chunk away by now for the _RESET command,
> > > > > > until we agree on next steps.
> > > >
> > > > I would suggest to leave things as-is until there's hard evidence why it
> > > > has to change now or there is a viable alternative implementation.
> > >
> > > Leave things as is means we will have this implicit refresh in the
> > > element refresh. We have no such semantics in conntrack, for example,
> > > and conntrack can be seen as a hardcoded set with a fixed number of
> > > tuples.
> >
> > It's just as implicit as counter or quota reset. I define "reset
> > element" command as "reset any state in given element", so from my
> > perspective it makes perfectly sense to reset the timeout as well.
>
> The timer is usually updated from the packet path. We are now planning
> to support for refreshing the timer from control plane which is not
> supported.
This implies that people create dynamic sets and then use reset command
on the temporary elements. I would assume one either lets packet path
"maintain" elements or control path but not both.
> Note it is possible to set custom timeouts in elements too, for
> example:
>
> # nft add element x y { 1.2.3.4 expires 50s }
I assumed this is for dump'n'restore purposes. If you set a custom
*timeout* when adding a set, the current reset code respects that.
> while default set policy might be different 3m. In this case, you
> assume resetting to the default set timeout policy (3m) is fine.
>
> I can keep searching for more scenarios where this timeout refresh
> does not make sense in this reset path.
Cheers, Phil
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-03 17:52 ` Phil Sutter
@ 2023-10-03 18:03 ` Pablo Neira Ayuso
2023-10-03 20:12 ` Phil Sutter
0 siblings, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-03 18:03 UTC (permalink / raw)
To: Phil Sutter, netfilter-devel
On Tue, Oct 03, 2023 at 07:52:31PM +0200, Phil Sutter wrote:
> On Tue, Oct 03, 2023 at 07:21:33PM +0200, Pablo Neira Ayuso wrote:
> > On Tue, Oct 03, 2023 at 05:57:59PM +0200, Phil Sutter wrote:
> > > On Tue, Oct 03, 2023 at 09:46:46AM +0200, Pablo Neira Ayuso wrote:
> > > > On Tue, Oct 03, 2023 at 12:55:41AM +0200, Phil Sutter wrote:
> > > > > On Tue, Oct 03, 2023 at 12:17:53AM +0200, Pablo Neira Ayuso wrote:
> > > > > > On Mon, Oct 02, 2023 at 11:50:25PM +0200, Pablo Neira Ayuso wrote:
> > > > > > > On Mon, Oct 02, 2023 at 08:06:42PM +0200, Phil Sutter wrote:
> > > > > > > > On Mon, Oct 02, 2023 at 11:05:16AM +0200, Pablo Neira Ayuso wrote:
> > > > > > > > > The dump and reset command should not refresh the timeout, this command
> > > > > > > > > is intended to allow users to list existing stateful objects and reset
> > > > > > > > > them, element expiration should be refresh via transaction instead with
> > > > > > > > > a specific command to achieve this, otherwise this is entering combo
> > > > > > > > > semantics that will be hard to be undone later (eg. a user asking to
> > > > > > > > > retrieve counters but _not_ requiring to refresh expiration).
> > > > > > > >
> > > > > > > > From a users' perspective, what is special about the element expires
> > > > > > > > value disqualifying it from being reset along with any counter/quota
> > > > > > > > values?
> > > > > > > >
> > > > > > > > Do you have a PoC for set element reset via transaction yet? Can we
> > > > > > > > integrate non-timeout resets with it, too? Because IIUC, that's an
> > > > > > > > alternative to the pending reset locking.
> > > > > > >
> > > > > > > Problem is listing is not supported from transaction path, this is
> > > > > > > using existing netlink dump infrastructure which runs lockless via
> > > > > > > rcu. So we could support reset, but we could not use netlink dump
> > > > > > > semantics to fetch the values, and user likely wants this to
> > > > > > > fetch-and-reset as in ctnetlink.
> > > > > >
> > > > > > Well, with NLM_F_ECHO, it should be possible to explore reset under
> > > > > > commit_mutex, but is it really worth the effort?
> > > > >
> > > > > I don't understand. Wasn't it your proposal to move things into the
> > > > > transaction? Above you write: "element expiration should be refresh via
> > > > > transaction instead". I asked what is special about timeout, why not
> > > > > handle all element state reset the same way?
> > > > >
> > > > > > With two concurrent threads, we just want to ensure that no invalid
> > > > > > state shows in the listing (you mentioned it is possible to list
> > > > > > negative values with two threads listing-and-resetting at the same
> > > > > > time).
> > > > >
> > > > > It's not just in the listing, the actual values underrun. A quota e.g.
> > > > > will immediately deny.
> > > > >
> > > > > > I think we should just make sure something valid is included in the
> > > > > > listing, but as for the two threads performing list-and-reset, why
> > > > > > ensure strict serialization?
> > > > >
> > > > > I seem to lack context here. Is there an alternative to "strict"
> > > > > serializing? Expressions' dump+reset callbacks must not run multiple
> > > > > times at the same time for the same expression. At least not how they
> > > > > are currently implemented.
> > > > >
> > > > > > This is a rare operation to fetch statistics, most likely having a
> > > > > > single process performing this in place? So we are currently
> > > > > > discussing how to fix the (negative) non-sense in the listing.
> > > > > >
> > > > > > > > What we have now is a broad 'reset element', not specifying what to
> > > > > > > > reset. If the above is a feature being asked for, I'd rather implement
> > > > > > > > 'reset element counter', 'reset element timeout', 'reset element quota',
> > > > > > > > etc. commands.
> > > > > > >
> > > > > > > We are currently discussing how to implement refresh timeout into the
> > > > > > > transaction model.
> > > > > > >
> > > > > > > I would suggest we keep this chunk away by now for the _RESET command,
> > > > > > > until we agree on next steps.
> > > > >
> > > > > I would suggest to leave things as-is until there's hard evidence why it
> > > > > has to change now or there is a viable alternative implementation.
> > > >
> > > > Leave things as is means we will have this implicit refresh in the
> > > > element refresh. We have no such semantics in conntrack, for example,
> > > > and conntrack can be seen as a hardcoded set with a fixed number of
> > > > tuples.
> > >
> > > It's just as implicit as counter or quota reset. I define "reset
> > > element" command as "reset any state in given element", so from my
> > > perspective it makes perfectly sense to reset the timeout as well.
> >
> > The timer is usually updated from the packet path. We are now planning
> > to support for refreshing the timer from control plane which is not
> > supported.
>
> This implies that people create dynamic sets and then use reset command
> on the temporary elements. I would assume one either lets packet path
> "maintain" elements or control path but not both.
>
> > Note it is possible to set custom timeouts in elements too, for
> > example:
> >
> > # nft add element x y { 1.2.3.4 expires 50s }
>
> I assumed this is for dump'n'restore purposes. If you set a custom
> *timeout* when adding a set, the current reset code respects that.
This could be an exception, ie. a specific timeout for a given
element, I am afraid there are more usecases that just a
dump'n'restore. User is really free to add specific timeouts for the
sets it adds. The default set timeout only applies if user does not
specify a timeout for new elements.
> > while default set policy might be different 3m. In this case, you
> > assume resetting to the default set timeout policy (3m) is fine.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-03 18:03 ` Pablo Neira Ayuso
@ 2023-10-03 20:12 ` Phil Sutter
2023-10-04 8:01 ` Pablo Neira Ayuso
0 siblings, 1 reply; 21+ messages in thread
From: Phil Sutter @ 2023-10-03 20:12 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel
On Tue, Oct 03, 2023 at 08:03:10PM +0200, Pablo Neira Ayuso wrote:
> On Tue, Oct 03, 2023 at 07:52:31PM +0200, Phil Sutter wrote:
> > On Tue, Oct 03, 2023 at 07:21:33PM +0200, Pablo Neira Ayuso wrote:
> > > On Tue, Oct 03, 2023 at 05:57:59PM +0200, Phil Sutter wrote:
> > > > On Tue, Oct 03, 2023 at 09:46:46AM +0200, Pablo Neira Ayuso wrote:
> > > > > On Tue, Oct 03, 2023 at 12:55:41AM +0200, Phil Sutter wrote:
> > > > > > On Tue, Oct 03, 2023 at 12:17:53AM +0200, Pablo Neira Ayuso wrote:
> > > > > > > On Mon, Oct 02, 2023 at 11:50:25PM +0200, Pablo Neira Ayuso wrote:
> > > > > > > > On Mon, Oct 02, 2023 at 08:06:42PM +0200, Phil Sutter wrote:
> > > > > > > > > On Mon, Oct 02, 2023 at 11:05:16AM +0200, Pablo Neira Ayuso wrote:
> > > > > > > > > > The dump and reset command should not refresh the timeout, this command
> > > > > > > > > > is intended to allow users to list existing stateful objects and reset
> > > > > > > > > > them, element expiration should be refresh via transaction instead with
> > > > > > > > > > a specific command to achieve this, otherwise this is entering combo
> > > > > > > > > > semantics that will be hard to be undone later (eg. a user asking to
> > > > > > > > > > retrieve counters but _not_ requiring to refresh expiration).
> > > > > > > > >
> > > > > > > > > From a users' perspective, what is special about the element expires
> > > > > > > > > value disqualifying it from being reset along with any counter/quota
> > > > > > > > > values?
> > > > > > > > >
> > > > > > > > > Do you have a PoC for set element reset via transaction yet? Can we
> > > > > > > > > integrate non-timeout resets with it, too? Because IIUC, that's an
> > > > > > > > > alternative to the pending reset locking.
> > > > > > > >
> > > > > > > > Problem is listing is not supported from transaction path, this is
> > > > > > > > using existing netlink dump infrastructure which runs lockless via
> > > > > > > > rcu. So we could support reset, but we could not use netlink dump
> > > > > > > > semantics to fetch the values, and user likely wants this to
> > > > > > > > fetch-and-reset as in ctnetlink.
> > > > > > >
> > > > > > > Well, with NLM_F_ECHO, it should be possible to explore reset under
> > > > > > > commit_mutex, but is it really worth the effort?
> > > > > >
> > > > > > I don't understand. Wasn't it your proposal to move things into the
> > > > > > transaction? Above you write: "element expiration should be refresh via
> > > > > > transaction instead". I asked what is special about timeout, why not
> > > > > > handle all element state reset the same way?
> > > > > >
> > > > > > > With two concurrent threads, we just want to ensure that no invalid
> > > > > > > state shows in the listing (you mentioned it is possible to list
> > > > > > > negative values with two threads listing-and-resetting at the same
> > > > > > > time).
> > > > > >
> > > > > > It's not just in the listing, the actual values underrun. A quota e.g.
> > > > > > will immediately deny.
> > > > > >
> > > > > > > I think we should just make sure something valid is included in the
> > > > > > > listing, but as for the two threads performing list-and-reset, why
> > > > > > > ensure strict serialization?
> > > > > >
> > > > > > I seem to lack context here. Is there an alternative to "strict"
> > > > > > serializing? Expressions' dump+reset callbacks must not run multiple
> > > > > > times at the same time for the same expression. At least not how they
> > > > > > are currently implemented.
> > > > > >
> > > > > > > This is a rare operation to fetch statistics, most likely having a
> > > > > > > single process performing this in place? So we are currently
> > > > > > > discussing how to fix the (negative) non-sense in the listing.
> > > > > > >
> > > > > > > > > What we have now is a broad 'reset element', not specifying what to
> > > > > > > > > reset. If the above is a feature being asked for, I'd rather implement
> > > > > > > > > 'reset element counter', 'reset element timeout', 'reset element quota',
> > > > > > > > > etc. commands.
> > > > > > > >
> > > > > > > > We are currently discussing how to implement refresh timeout into the
> > > > > > > > transaction model.
> > > > > > > >
> > > > > > > > I would suggest we keep this chunk away by now for the _RESET command,
> > > > > > > > until we agree on next steps.
> > > > > >
> > > > > > I would suggest to leave things as-is until there's hard evidence why it
> > > > > > has to change now or there is a viable alternative implementation.
> > > > >
> > > > > Leave things as is means we will have this implicit refresh in the
> > > > > element refresh. We have no such semantics in conntrack, for example,
> > > > > and conntrack can be seen as a hardcoded set with a fixed number of
> > > > > tuples.
> > > >
> > > > It's just as implicit as counter or quota reset. I define "reset
> > > > element" command as "reset any state in given element", so from my
> > > > perspective it makes perfectly sense to reset the timeout as well.
> > >
> > > The timer is usually updated from the packet path. We are now planning
> > > to support for refreshing the timer from control plane which is not
> > > supported.
> >
> > This implies that people create dynamic sets and then use reset command
> > on the temporary elements. I would assume one either lets packet path
> > "maintain" elements or control path but not both.
> >
> > > Note it is possible to set custom timeouts in elements too, for
> > > example:
> > >
> > > # nft add element x y { 1.2.3.4 expires 50s }
> >
> > I assumed this is for dump'n'restore purposes. If you set a custom
> > *timeout* when adding a set, the current reset code respects that.
>
> This could be an exception, ie. a specific timeout for a given
> element, I am afraid there are more usecases that just a
> dump'n'restore. User is really free to add specific timeouts for the
> sets it adds. The default set timeout only applies if user does not
> specify a timeout for new elements.
Yes, and if a user chooses a custom timeout like so:
| nft add element x y { 1.2.3.4 timeout 50s }
the element will timeout in 50s and reset command will reset expires
value to 50s not the set's default. Adding an element with a defined
expires value makes it expire within that time but the set's timeout
value applies.
Or am I missing a use-case requiring to use expires instead of timeout?
> > > while default set policy might be different 3m. In this case, you
> > > assume resetting to the default set timeout policy (3m) is fine.
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-03 20:12 ` Phil Sutter
@ 2023-10-04 8:01 ` Pablo Neira Ayuso
2023-10-04 8:07 ` Florian Westphal
0 siblings, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-04 8:01 UTC (permalink / raw)
To: Phil Sutter, netfilter-devel
On Tue, Oct 03, 2023 at 10:12:18PM +0200, Phil Sutter wrote:
> On Tue, Oct 03, 2023 at 08:03:10PM +0200, Pablo Neira Ayuso wrote:
> > On Tue, Oct 03, 2023 at 07:52:31PM +0200, Phil Sutter wrote:
> > > On Tue, Oct 03, 2023 at 07:21:33PM +0200, Pablo Neira Ayuso wrote:
> > > > On Tue, Oct 03, 2023 at 05:57:59PM +0200, Phil Sutter wrote:
> > > > > On Tue, Oct 03, 2023 at 09:46:46AM +0200, Pablo Neira Ayuso wrote:
> > > > > > On Tue, Oct 03, 2023 at 12:55:41AM +0200, Phil Sutter wrote:
> > > > > > > On Tue, Oct 03, 2023 at 12:17:53AM +0200, Pablo Neira Ayuso wrote:
> > > > > > > > On Mon, Oct 02, 2023 at 11:50:25PM +0200, Pablo Neira Ayuso wrote:
> > > > > > > > > On Mon, Oct 02, 2023 at 08:06:42PM +0200, Phil Sutter wrote:
> > > > > > > > > > On Mon, Oct 02, 2023 at 11:05:16AM +0200, Pablo Neira Ayuso wrote:
> > > > > > > > > > > The dump and reset command should not refresh the timeout, this command
> > > > > > > > > > > is intended to allow users to list existing stateful objects and reset
> > > > > > > > > > > them, element expiration should be refresh via transaction instead with
> > > > > > > > > > > a specific command to achieve this, otherwise this is entering combo
> > > > > > > > > > > semantics that will be hard to be undone later (eg. a user asking to
> > > > > > > > > > > retrieve counters but _not_ requiring to refresh expiration).
> > > > > > > > > >
> > > > > > > > > > From a users' perspective, what is special about the element expires
> > > > > > > > > > value disqualifying it from being reset along with any counter/quota
> > > > > > > > > > values?
> > > > > > > > > >
> > > > > > > > > > Do you have a PoC for set element reset via transaction yet? Can we
> > > > > > > > > > integrate non-timeout resets with it, too? Because IIUC, that's an
> > > > > > > > > > alternative to the pending reset locking.
> > > > > > > > >
> > > > > > > > > Problem is listing is not supported from transaction path, this is
> > > > > > > > > using existing netlink dump infrastructure which runs lockless via
> > > > > > > > > rcu. So we could support reset, but we could not use netlink dump
> > > > > > > > > semantics to fetch the values, and user likely wants this to
> > > > > > > > > fetch-and-reset as in ctnetlink.
> > > > > > > >
> > > > > > > > Well, with NLM_F_ECHO, it should be possible to explore reset under
> > > > > > > > commit_mutex, but is it really worth the effort?
> > > > > > >
> > > > > > > I don't understand. Wasn't it your proposal to move things into the
> > > > > > > transaction? Above you write: "element expiration should be refresh via
> > > > > > > transaction instead". I asked what is special about timeout, why not
> > > > > > > handle all element state reset the same way?
> > > > > > >
> > > > > > > > With two concurrent threads, we just want to ensure that no invalid
> > > > > > > > state shows in the listing (you mentioned it is possible to list
> > > > > > > > negative values with two threads listing-and-resetting at the same
> > > > > > > > time).
> > > > > > >
> > > > > > > It's not just in the listing, the actual values underrun. A quota e.g.
> > > > > > > will immediately deny.
> > > > > > >
> > > > > > > > I think we should just make sure something valid is included in the
> > > > > > > > listing, but as for the two threads performing list-and-reset, why
> > > > > > > > ensure strict serialization?
> > > > > > >
> > > > > > > I seem to lack context here. Is there an alternative to "strict"
> > > > > > > serializing? Expressions' dump+reset callbacks must not run multiple
> > > > > > > times at the same time for the same expression. At least not how they
> > > > > > > are currently implemented.
> > > > > > >
> > > > > > > > This is a rare operation to fetch statistics, most likely having a
> > > > > > > > single process performing this in place? So we are currently
> > > > > > > > discussing how to fix the (negative) non-sense in the listing.
> > > > > > > >
> > > > > > > > > > What we have now is a broad 'reset element', not specifying what to
> > > > > > > > > > reset. If the above is a feature being asked for, I'd rather implement
> > > > > > > > > > 'reset element counter', 'reset element timeout', 'reset element quota',
> > > > > > > > > > etc. commands.
> > > > > > > > >
> > > > > > > > > We are currently discussing how to implement refresh timeout into the
> > > > > > > > > transaction model.
> > > > > > > > >
> > > > > > > > > I would suggest we keep this chunk away by now for the _RESET command,
> > > > > > > > > until we agree on next steps.
> > > > > > >
> > > > > > > I would suggest to leave things as-is until there's hard evidence why it
> > > > > > > has to change now or there is a viable alternative implementation.
> > > > > >
> > > > > > Leave things as is means we will have this implicit refresh in the
> > > > > > element refresh. We have no such semantics in conntrack, for example,
> > > > > > and conntrack can be seen as a hardcoded set with a fixed number of
> > > > > > tuples.
> > > > >
> > > > > It's just as implicit as counter or quota reset. I define "reset
> > > > > element" command as "reset any state in given element", so from my
> > > > > perspective it makes perfectly sense to reset the timeout as well.
> > > >
> > > > The timer is usually updated from the packet path. We are now planning
> > > > to support for refreshing the timer from control plane which is not
> > > > supported.
> > >
> > > This implies that people create dynamic sets and then use reset command
> > > on the temporary elements. I would assume one either lets packet path
> > > "maintain" elements or control path but not both.
> > >
> > > > Note it is possible to set custom timeouts in elements too, for
> > > > example:
> > > >
> > > > # nft add element x y { 1.2.3.4 expires 50s }
> > >
> > > I assumed this is for dump'n'restore purposes. If you set a custom
> > > *timeout* when adding a set, the current reset code respects that.
> >
> > This could be an exception, ie. a specific timeout for a given
> > element, I am afraid there are more usecases that just a
> > dump'n'restore. User is really free to add specific timeouts for the
> > sets it adds. The default set timeout only applies if user does not
> > specify a timeout for new elements.
>
> Yes, and if a user chooses a custom timeout like so:
>
> | nft add element x y { 1.2.3.4 timeout 50s }
>
> the element will timeout in 50s and reset command will reset expires
> value to 50s not the set's default. Adding an element with a defined
> expires value makes it expire within that time but the set's timeout
> value applies.
That is correct.
> Or am I missing a use-case requiring to use expires instead of timeout?
You are correct.
We will soon need NFT_MSG_GETRULE_RESET_NO_TIMEOUT to undo this combo
command semantics, from userspace this will require some sort of 'nft
reset table x notimeout' syntax.
I really don't understand why you consider a timer is a stateful
property, ctnetlink does not reset timeouts when you list-and-reset
counters, I do not see why you have to do this with set elements.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-04 8:01 ` Pablo Neira Ayuso
@ 2023-10-04 8:07 ` Florian Westphal
2023-10-04 8:23 ` Pablo Neira Ayuso
0 siblings, 1 reply; 21+ messages in thread
From: Florian Westphal @ 2023-10-04 8:07 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: Phil Sutter, netfilter-devel
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> We will soon need NFT_MSG_GETRULE_RESET_NO_TIMEOUT to undo this combo
> command semantics, from userspace this will require some sort of 'nft
> reset table x notimeout' syntax.
NFT_MSG_GETRULE_RESET_NO_TIMEOUT sounds super ugly :/
Do you think we can add a flags attr that describes which parts
to reset?
No flags attr would reset everything.
Do you consider reset of timers to be something that must
be handled via transaction infra or do you think it can
(re)use the dump-and-reset approach?
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-04 8:07 ` Florian Westphal
@ 2023-10-04 8:23 ` Pablo Neira Ayuso
2023-10-04 8:46 ` Florian Westphal
0 siblings, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-04 8:23 UTC (permalink / raw)
To: Florian Westphal; +Cc: Phil Sutter, netfilter-devel
On Wed, Oct 04, 2023 at 10:07:02AM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > We will soon need NFT_MSG_GETRULE_RESET_NO_TIMEOUT to undo this combo
> > command semantics, from userspace this will require some sort of 'nft
> > reset table x notimeout' syntax.
>
> NFT_MSG_GETRULE_RESET_NO_TIMEOUT sounds super ugly :/
>
> Do you think we can add a flags attr that describes which parts
> to reset?
Sure. This will require one attribute for each object type, also
reject it where it does not make sense.
> No flags attr would reset everything.
Refreshing timers is a bad default behaviour.
And how does this mix with the set element timeout model from
transaction? Now timers becomes a "moving target" again with this
refresh? Oh, this will drag commit_mutex to netlink dump path to avoid
that.
> Do you consider reset of timers to be something that must
> be handled via transaction infra or do you think it can
> (re)use the dump-and-reset approach?
The question why user wants to reset the timers in this path.
For counters, this is to collect stats while leaving remaining things
as is. Refreshing timers make no sense to me.
For quota, this is to fetch the consumed quota and restart it, it
might make sense to refresh the timer, but transaction sounds like a
better path for this usecase?
For limit, they do not expose internal stateful information, so this
just a reset. Timer refresh makes no sense to me here.
If this is for a dynamic set, user is refreshing/extending the
timeout, but usually it is packet path that refreshes this timeouts
via update.
This reset feature is just there to collect stateful properties and
leave things as is.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-04 8:23 ` Pablo Neira Ayuso
@ 2023-10-04 8:46 ` Florian Westphal
2023-10-04 9:27 ` Pablo Neira Ayuso
0 siblings, 1 reply; 21+ messages in thread
From: Florian Westphal @ 2023-10-04 8:46 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: Florian Westphal, Phil Sutter, netfilter-devel
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Wed, Oct 04, 2023 at 10:07:02AM +0200, Florian Westphal wrote:
> > Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > > We will soon need NFT_MSG_GETRULE_RESET_NO_TIMEOUT to undo this combo
> > > command semantics, from userspace this will require some sort of 'nft
> > > reset table x notimeout' syntax.
> >
> > NFT_MSG_GETRULE_RESET_NO_TIMEOUT sounds super ugly :/
> >
> > Do you think we can add a flags attr that describes which parts
> > to reset?
>
> Sure. This will require one attribute for each object type, also
> reject it where it does not make sense.
>
> > No flags attr would reset everything.
>
> Refreshing timers is a bad default behaviour.
Looking at the "evolution" of the reset command in nftables.git
I agree. "reset counters" etc. was rather specific as to what
is reset.
> And how does this mix with the set element timeout model from
> transaction? Now timers becomes a "moving target" again with this
> refresh? Oh, this will drag commit_mutex to netlink dump path to avoid
> that.
The lock is needed to prevent *two* reset calls messing up the
interal object state, e.g. quota or counters.
We will need *some* type of lock for the commands where
the reset logic is handled via dump path.
At this point I think we should consider reverting ALL
reset changes that use the dump path, because we also have
the consistency issue:
Cpu 1: nft list ruleset
Cpu 2: nft reset ruleset
Cpu 3: transaction, seq is bumped
AFAIU Cpu2 will restart dump due to interrupt, so the listing
will be consistent but will, on retry -- show counters zeroed
in previous, inconsitent and suppressed dump.
So I think the reset logic should be reworked to walk the rules
and use the transaction infra to reset everything manually.
I think we can optimize by letting userspace skip rules that
lack a stateful object that cannot be reset.
I don't think the dump paths were ever designed to munge existing
data. For those parts that provide full/consistent serialization,
e.g. single rule is fetched, its fine.
But whenever we may yield in between successive partial dumps, its not.
> For counters, this is to collect stats while leaving remaining things
> as is. Refreshing timers make no sense to me.
Looking at the history, I agree... for something like "reset counters"
its clear what should happen.
> For quota, this is to fetch the consumed quota and restart it, it
> might make sense to refresh the timer, but transaction sounds like a
> better path for this usecase?
See above, I agree.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-04 8:46 ` Florian Westphal
@ 2023-10-04 9:27 ` Pablo Neira Ayuso
2023-10-04 12:48 ` Florian Westphal
0 siblings, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-04 9:27 UTC (permalink / raw)
To: Florian Westphal; +Cc: Phil Sutter, netfilter-devel
On Wed, Oct 04, 2023 at 10:46:23AM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Wed, Oct 04, 2023 at 10:07:02AM +0200, Florian Westphal wrote:
> > > Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > > > We will soon need NFT_MSG_GETRULE_RESET_NO_TIMEOUT to undo this combo
> > > > command semantics, from userspace this will require some sort of 'nft
> > > > reset table x notimeout' syntax.
> > >
> > > NFT_MSG_GETRULE_RESET_NO_TIMEOUT sounds super ugly :/
> > >
> > > Do you think we can add a flags attr that describes which parts
> > > to reset?
> >
> > Sure. This will require one attribute for each object type, also
> > reject it where it does not make sense.
> >
> > > No flags attr would reset everything.
> >
> > Refreshing timers is a bad default behaviour.
>
> Looking at the "evolution" of the reset command in nftables.git
> I agree. "reset counters" etc. was rather specific as to what
> is reset.
>
> > And how does this mix with the set element timeout model from
> > transaction? Now timers becomes a "moving target" again with this
> > refresh? Oh, this will drag commit_mutex to netlink dump path to avoid
> > that.
>
> The lock is needed to prevent *two* reset calls messing up the
> interal object state, e.g. quota or counters.
Agreed, no question there is a need for a lock from that path, the
discussion so far has been what type of lock, whether reset_spinlock
or commit_mutex.
reset_spinlock just makes sure that:
Cpu 1: nft list ruleset
Cpu 2: nft reset ruleset
do not show negative counters, as Phil reported with a test script.
commit_mutex just mitigates this issue, for reason see my reply below.
> We will need *some* type of lock for the commands where
> the reset logic is handled via dump path.
>
> At this point I think we should consider reverting ALL
> reset changes that use the dump path, because we also have
> the consistency issue:
>
> Cpu 1: nft list ruleset
> Cpu 2: nft reset ruleset
> Cpu 3: transaction, seq is bumped
>
> AFAIU Cpu2 will restart dump due to interrupt, so the listing
> will be consistent but will, on retry -- show counters zeroed
> in previous, inconsitent and suppressed dump.
>
> So I think the reset logic should be reworked to walk the rules
> and use the transaction infra to reset everything manually.
> I think we can optimize by letting userspace skip rules that
> lack a stateful object that cannot be reset.
>
> I don't think the dump paths were ever designed to munge existing
> data. For those parts that provide full/consistent serialization,
> e.g. single rule is fetched, its fine.
>
> But whenever we may yield in between successive partial dumps, its not.
Yes, netlink dumps do not provide atomic listing semantics, that is
why there is the generation ID from userspace. I am trying to think of
a way to achieve this from the kernel but I did not come with any so
far.
From userspace, it should be possible to keep a generation ID per
object in the cache, so the cache becomes a collection of objects of
different generations, then when listing, just take the objects that
are still current. Or it should be possible to disambiguate this from
userspace with the u64 handle, ie. keep stale objects around and merge
them to new ones when fetching the counters.
But this is lots of work from userspace, and it might get more
complicated if more transactions happen while retrying (see test
script from Phil where transaction happens in an infinite loop).
This is the other open issue we have been discussing lately :)
> > For counters, this is to collect stats while leaving remaining things
> > as is. Refreshing timers make no sense to me.
>
> Looking at the history, I agree... for something like "reset counters"
> its clear what should happen.
>
> > For quota, this is to fetch the consumed quota and restart it, it
> > might make sense to refresh the timer, but transaction sounds like a
> > better path for this usecase?
>
> See above, I agree.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-04 9:27 ` Pablo Neira Ayuso
@ 2023-10-04 12:48 ` Florian Westphal
2023-10-04 14:32 ` Pablo Neira Ayuso
2023-10-10 13:18 ` Phil Sutter
0 siblings, 2 replies; 21+ messages in thread
From: Florian Westphal @ 2023-10-04 12:48 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: Florian Westphal, Phil Sutter, netfilter-devel
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Wed, Oct 04, 2023 at 10:46:23AM +0200, Florian Westphal wrote:
> > I don't think the dump paths were ever designed to munge existing
> > data. For those parts that provide full/consistent serialization,
> > e.g. single rule is fetched, its fine.
> >
> > But whenever we may yield in between successive partial dumps, its not.
>
> Yes, netlink dumps do not provide atomic listing semantics, that is
> why there is the generation ID from userspace. I am trying to think of
> a way to achieve this from the kernel but I did not come with any so
> far.
>
> From userspace, it should be possible to keep a generation ID per
> object in the cache, so the cache becomes a collection of objects of
> different generations, then when listing, just take the objects that
> are still current. Or it should be possible to disambiguate this from
> userspace with the u64 handle, ie. keep stale objects around and merge
> them to new ones when fetching the counters.
>
> But this is lots of work from userspace, and it might get more
> complicated if more transactions happen while retrying (see test
> script from Phil where transaction happens in an infinite loop).
>
> This is the other open issue we have been discussing lately :)
Right, there is an amalgamation of different issues, lets not get swamped
trying to solve everything at once :-)
I've collected a few patches and pushed them out to :testing.
Stresstest is still running but so far it looks good to me.
The updated audit selftest passes, as does Xins sctp test case.
I expect to do the PR in the next few hours or so.
I will followup on nf-next once upstream has pulled the PR
into net and did a net -> net-next merge, which might take a
while. Followup as in "send rebased patches that get rid of
the async gc in nft_set_rbtree".
Let me know if there is anything else I can help
with.
Phil, I know all of this sucks from your point of view,
this is also my fault, I did not pay too close attention
to the reset patches, and, to be clear, even if I would have
its likely I would have missed all of the implications
of reusing the dump infrastructure for this.
I have not applied Pablos patch to revert the "timeout reset"
because its relatively fresh compared to the other patches
in the batch (and the batch is already large enough).
But I do agree with having a separate facility for timeout
resets (timeout updates) would be better.
I also think we need to find a different strategy for the
dump-and-reset part when the reset could be interrupted
by a transaction.
ATM userspace would have to add a userspace lock like
iptables-legacy to prevent any generation id changes while
a "reset dump" is happening and I hope we can all agree that
this is something we definitely do not want :-)
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-04 12:48 ` Florian Westphal
@ 2023-10-04 14:32 ` Pablo Neira Ayuso
2023-10-10 12:48 ` Phil Sutter
2023-10-10 13:18 ` Phil Sutter
1 sibling, 1 reply; 21+ messages in thread
From: Pablo Neira Ayuso @ 2023-10-04 14:32 UTC (permalink / raw)
To: Florian Westphal; +Cc: Phil Sutter, netfilter-devel
On Wed, Oct 04, 2023 at 02:48:45PM +0200, Florian Westphal wrote:
> I also think we need to find a different strategy for the
> dump-and-reset part when the reset could be interrupted
> by a transaction.
I think it should be possible to deal with this from userspace.
The idea would be to keep the old cache. Then, from the new cache, if
EINTR happened before, iterate over the list of objects in the new
cache and then lookup for the old objects, then pour the stats from
the old to the new objects, then release old cache. Then only one old
cache is kept around in worst case. This needs a lookup function for
each stateful object type on the old cache based on the handle.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-04 14:32 ` Pablo Neira Ayuso
@ 2023-10-10 12:48 ` Phil Sutter
0 siblings, 0 replies; 21+ messages in thread
From: Phil Sutter @ 2023-10-10 12:48 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel
On Wed, Oct 04, 2023 at 04:32:26PM +0200, Pablo Neira Ayuso wrote:
> On Wed, Oct 04, 2023 at 02:48:45PM +0200, Florian Westphal wrote:
> > I also think we need to find a different strategy for the
> > dump-and-reset part when the reset could be interrupted
> > by a transaction.
>
> I think it should be possible to deal with this from userspace.
>
> The idea would be to keep the old cache. Then, from the new cache, if
> EINTR happened before, iterate over the list of objects in the new
> cache and then lookup for the old objects, then pour the stats from
> the old to the new objects, then release old cache. Then only one old
> cache is kept around in worst case. This needs a lookup function for
> each stateful object type on the old cache based on the handle.
In case of EINTR, user space will reset multiple times, right? So
returned state from generation X must be combined with that from
generation X+1. Easy with counters (val1 + val2) but funny and/or
inconsistent with quotas (val1 - val2 might be < 0).
I'd still just declare reset command for multiple items unreliable and
point out the need for scripts to use a combination of list command and
a number of reset commands for the individual items if the actual values
matter.
I just noticed, for the above to be viable 'reset rule' command must be
changed - it is silent right now.
Cheers, Phil
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element
2023-10-04 12:48 ` Florian Westphal
2023-10-04 14:32 ` Pablo Neira Ayuso
@ 2023-10-10 13:18 ` Phil Sutter
1 sibling, 0 replies; 21+ messages in thread
From: Phil Sutter @ 2023-10-10 13:18 UTC (permalink / raw)
To: Florian Westphal; +Cc: Pablo Neira Ayuso, netfilter-devel
On Wed, Oct 04, 2023 at 02:48:45PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Wed, Oct 04, 2023 at 10:46:23AM +0200, Florian Westphal wrote:
> > > I don't think the dump paths were ever designed to munge existing
> > > data. For those parts that provide full/consistent serialization,
> > > e.g. single rule is fetched, its fine.
> > >
> > > But whenever we may yield in between successive partial dumps, its not.
> >
> > Yes, netlink dumps do not provide atomic listing semantics, that is
> > why there is the generation ID from userspace. I am trying to think of
> > a way to achieve this from the kernel but I did not come with any so
> > far.
> >
> > From userspace, it should be possible to keep a generation ID per
> > object in the cache, so the cache becomes a collection of objects of
> > different generations, then when listing, just take the objects that
> > are still current. Or it should be possible to disambiguate this from
> > userspace with the u64 handle, ie. keep stale objects around and merge
> > them to new ones when fetching the counters.
> >
> > But this is lots of work from userspace, and it might get more
> > complicated if more transactions happen while retrying (see test
> > script from Phil where transaction happens in an infinite loop).
> >
> > This is the other open issue we have been discussing lately :)
>
> Right, there is an amalgamation of different issues, lets not get swamped
> trying to solve everything at once :-)
>
> I've collected a few patches and pushed them out to :testing.
> Stresstest is still running but so far it looks good to me.
>
> The updated audit selftest passes, as does Xins sctp test case.
>
> I expect to do the PR in the next few hours or so.
>
> I will followup on nf-next once upstream has pulled the PR
> into net and did a net -> net-next merge, which might take a
> while. Followup as in "send rebased patches that get rid of
> the async gc in nft_set_rbtree".
>
> Let me know if there is anything else I can help
> with.
>
> Phil, I know all of this sucks from your point of view,
> this is also my fault, I did not pay too close attention
> to the reset patches, and, to be clear, even if I would have
> its likely I would have missed all of the implications
> of reusing the dump infrastructure for this.
>
> I have not applied Pablos patch to revert the "timeout reset"
> because its relatively fresh compared to the other patches
> in the batch (and the batch is already large enough).
>
> But I do agree with having a separate facility for timeout
> resets (timeout updates) would be better.
I still think they are orthogonal in practice (even though they
"clash"):
Dynamic sets use a timeout as they are populated from packet path and
may grow exceedingly large over time. Manual intervention means deleting
elements (which "resets" them) or adding ones (as an alternative to
packet path).
Would non-dynamic sets use both a timeout and other state? I can't
imagine a use-case for this.
> I also think we need to find a different strategy for the
> dump-and-reset part when the reset could be interrupted
> by a transaction.
>
> ATM userspace would have to add a userspace lock like
> iptables-legacy to prevent any generation id changes while
> a "reset dump" is happening and I hope we can all agree that
> this is something we definitely do not want :-)
I suggest to communicate the situation (for now) and make 'reset rule'
return the rule to user space so the non-plural reset commands (apart
from 'reset set') become reliable alternatives for those depending upon
it.
Setting expectations straight is still the easiest fix to both the
inconsistency problem and the "resets also timeouts" problem. In the
latter case, this should be fine already as nft(8) explicitly states
"resets timeout or other state" when describing 'reset element' and
'reset set'.
Then, consistent output for plural reset commands becomes a missing
feature one may add or not. Reset with or without timeout requires a
use-case (see above). The only real bug left is the concurrent reset
messing up values.
Cheers, Phil
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2023-10-10 13:18 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-02 9:05 [PATCH nf] netfilter: nf_tables: do not refresh timeout when resetting element Pablo Neira Ayuso
2023-10-02 9:07 ` Florian Westphal
2023-10-02 18:06 ` Phil Sutter
2023-10-02 21:50 ` Pablo Neira Ayuso
2023-10-02 22:17 ` Pablo Neira Ayuso
2023-10-02 22:55 ` Phil Sutter
2023-10-03 7:46 ` Pablo Neira Ayuso
2023-10-03 15:57 ` Phil Sutter
2023-10-03 17:21 ` Pablo Neira Ayuso
2023-10-03 17:52 ` Phil Sutter
2023-10-03 18:03 ` Pablo Neira Ayuso
2023-10-03 20:12 ` Phil Sutter
2023-10-04 8:01 ` Pablo Neira Ayuso
2023-10-04 8:07 ` Florian Westphal
2023-10-04 8:23 ` Pablo Neira Ayuso
2023-10-04 8:46 ` Florian Westphal
2023-10-04 9:27 ` Pablo Neira Ayuso
2023-10-04 12:48 ` Florian Westphal
2023-10-04 14:32 ` Pablo Neira Ayuso
2023-10-10 12:48 ` Phil Sutter
2023-10-10 13:18 ` Phil Sutter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).