All of lore.kernel.org
 help / color / mirror / Atom feed
* Sets update
@ 2024-07-21  7:44 Slavko
  2024-07-21  9:58 ` Kerin Millar
  0 siblings, 1 reply; 18+ messages in thread
From: Slavko @ 2024-07-21  7:44 UTC (permalink / raw)
  To: netfilter ML

Hi all,

i recently start to port my ipsets to nftables set. I was using these
ipsets (beside other) in manner, where they was some timeout set
and its content was regulary updated (from various sources, online
and local). If some IP(v6) was removed, i didn't bother to remove it
from ipset, as it was removed by timeout...

Now i fight with the same approach in nftables sets (kernel 5.15, nft
1.0.8). I learned, that to update element's timeout i need to remove
element and then (re)add it again. As here is not simple way to do
it from shell (simple shell command) i play with python script, which
consumes IP list on input and produces appropriate nft commands,
eg.:

   fetch someiplist | myscript.py | nft -f -

Now i am not sure, how to produce that output. Have i do it per IP?
Eg.:

    add element ... {IP1}
    delete element ... {IP1}
    add element ... {IP1 ...}
    add element ... {IP2}
    delete element ... {IP2}
    add element ... {IP2 ...}
    etc

Or have i produce output for all IPs at once? Eg:

    add element ... {IP1, IP2, ...}
    delete element ... {IP1, IP2, ...}
    add element ... {IP1 ..., IP2 ..., ...}

Please, is here technical difference or something other to consider?
The IP lists ranges from some hundreds to some thounsand of IPs,
thus nothing really big, but not small.

Another question/problem is, that this approach (delete/add) does't
preserve counters. Please, how to preserve counters? Is only way
to fetch and parse counters before i delete element and then add
them into final add? IMO, this isn't very memory friendly (in script...).

While these counters are not crucial for me, i use them in some
ipsets for statistics, eg. i fill items from various sources into one
ipset and group/count them by comment then. In ipset it is really
simple task for awk and it was working even on small OpenWrt
devices. But nftables sets doesn't produce as straighforward output.
Please, how i can/have to parse element' counters? I am even not
sure, if i will able to parse it by python (but i didn't try it yet), is here
some tool for that?

regards

-- 
Slavko
https://www.slavino.sk/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-21  7:44 Sets update Slavko
@ 2024-07-21  9:58 ` Kerin Millar
  2024-07-21 10:04   ` Kerin Millar
                     ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Kerin Millar @ 2024-07-21  9:58 UTC (permalink / raw)
  To: Slavko, netfilter ML

On Sun, 21 Jul 2024, at 8:44 AM, Slavko wrote:
> Hi all,
>
> i recently start to port my ipsets to nftables set. I was using these
> ipsets (beside other) in manner, where they was some timeout set
> and its content was regulary updated (from various sources, online
> and local). If some IP(v6) was removed, i didn't bother to remove it
> from ipset, as it was removed by timeout...
>
> Now i fight with the same approach in nftables sets (kernel 5.15, nft
> 1.0.8). I learned, that to update element's timeout i need to remove
> element and then (re)add it again. As here is not simple way to do
> it from shell (simple shell command) i play with python script, which
> consumes IP list on input and produces appropriate nft commands,
> eg.:
>
>    fetch someiplist | myscript.py | nft -f -
>
> Now i am not sure, how to produce that output. Have i do it per IP?
> Eg.:
>
>     add element ... {IP1}
>     delete element ... {IP1}
>     add element ... {IP1 ...}
>     add element ... {IP2}
>     delete element ... {IP2}
>     add element ... {IP2 ...}
>     etc

Firstly, you should not use "delete element". It will never be reliable because a given element may have expired by the time the command is processed, in which case nft will raise an error and abort. Use "destroy element" instead.

Secondly, though you could handle each element one by one, I would not recommend it.

>
> Or have i produce output for all IPs at once? Eg:
>
>     add element ... {IP1, IP2, ...}
>     delete element ... {IP1, IP2, ...}
>     add element ... {IP1 ..., IP2 ..., ...}

Yes, grouping elements in this way is sensible. Note that there is no need to worry about the ARG_MAX constraint because nft is being made to read from the standard input rather than use the -c option. However, I would suggest generating one element per line because the way in which nft reports errors is atrocious in the case of long lines.

{
   echo "add element ... {"
   sed 's/$/,/'
   echo "}"
} | nft -f -

>
> Please, is here technical difference or something other to consider?
> The IP lists ranges from some hundreds to some thounsand of IPs,
> thus nothing really big, but not small.
>
> Another question/problem is, that this approach (delete/add) does't
> preserve counters. Please, how to preserve counters? Is only way
> to fetch and parse counters before i delete element and then add
> them into final add? IMO, this isn't very memory friendly (in script...).

As far as I am aware, it is impossible to add elements in a way that merely reset their expiry times in the case that they exist, which is a ridiculous state of affairs. Though there exists the "update" set statement, only rules can take advantage of it. Neither of the following commands achieve the intended outcome, despite being processed without error.

add element ... { <existing-ip> timeout <new-timeout> }
add element ... { <existing-ip> expires <new-timeout> }

>
> While these counters are not crucial for me, i use them in some
> ipsets for statistics, eg. i fill items from various sources into one
> ipset and group/count them by comment then. In ipset it is really
> simple task for awk and it was working even on small OpenWrt
> devices. But nftables sets doesn't produce as straighforward output.
> Please, how i can/have to parse element' counters? I am even not
> sure, if i will able to parse it by python (but i didn't try it yet), is here
> some tool for that?

I doubt it. Perhaps we need a new "update element" command to act similarly to the set statement bearing the same name.

Mind you, it gets worse. While the nft utility supports a JSON output mode to ease parsing, its output is less accurate than the conventional output mode in so far as all "timeout" and "expires" values are truncated to integer seconds and conveyed as JSON integers.

-- 
Kerin Millar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-21  9:58 ` Kerin Millar
@ 2024-07-21 10:04   ` Kerin Millar
  2024-07-21 11:23   ` Slavko
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Kerin Millar @ 2024-07-21 10:04 UTC (permalink / raw)
  To: Slavko, netfilter ML

Hi Slavko,

My previous message contained a minor error.

> Yes, grouping elements in this way is sensible. Note that there is no 
> need to worry about the ARG_MAX constraint because nft is being made to 
> read from the standard input rather than use the -c option. However, I 

Of course, the -c option isn't relevant. I should have written something like "rather than convey the commands as a collective argument".

--
Kerin Millar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-21  9:58 ` Kerin Millar
  2024-07-21 10:04   ` Kerin Millar
@ 2024-07-21 11:23   ` Slavko
  2024-07-21 12:29     ` Kerin Millar
  2024-07-21 16:09   ` Eric
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Slavko @ 2024-07-21 11:23 UTC (permalink / raw)
  To: netfilter ML

[-- Attachment #1: Type: text/plain, Size: 2796 bytes --]

Hi,

Dňa Sun, 21 Jul 2024 10:58:30 +0100 "Kerin Millar" <kfm@plushkava.net>
napísal:

> Firstly, you should not use "delete element". It will never be

I am aware of this, but AFAIK the "destroy" is available from kernel 6.3
and i am on 5.15, thus i am out of luck. With any luck, the timeout will
not happen during update (due timeout value and set update interval),
thus in my case it can work reliable.

> Secondly, though you could handle each element one by one, I would
> not recommend it.

OK, i was in doubt.

> Yes, grouping elements in this way is sensible. Note that there is no
> need to worry about the ARG_MAX constraint because nft is being made
> to read from the standard input rather than use the -c option.

I hope in that, don't worry about -c option, i can understand the point
of answer ;-)

> However, I would suggest generating one element per line because the
> way in which nft reports errors is atrocious in the case of long
> lines.

Of course, it was simplified example in mail... Beside of nft errors,
it is more readable in any case...

> {
>    echo "add element ... {"
>    sed 's/$/,/'
>    echo "}"
> } | nft -f -

Eh, that looks really nice (simple), but IMO it is good for (nonexistent
yet) update command. As one needs any IP twice/thrice, thus list must be
read into memory in script, or i miss something?

> add element ... { <existing-ip> timeout <new-timeout> }
> add element ... { <existing-ip> expires <new-timeout> }

In case of existing IP i understand the "add element" as noop (does
nothing) command, thus this is IMO expected.

> I doubt it. Perhaps we need a new "update element" command to act
> similarly to the set statement bearing the same name.

Yes, and while it is possible to update timeout from packet path, it is
surprising, that it is impossible from command line, the code itself
must exists, just user interface is missing?

> Mind you, it gets worse. While the nft utility supports a JSON output
> mode to ease parsing, its output is less accurate than the
> conventional output mode in so far as all "timeout" and "expires"
> values are truncated to integer seconds and conveyed as JSON integers.

I am fain with that timeout precision, mostly because i am not
interested on per element timeouts :-D

I start to play with jq in shell for that, it seems as possible, but
that is in only on my "playing" machine, where i use nftables already.

I am curios why is not possible to directly use ipsets from nftables.
Yes, the nftables's sets are powerful and becomes better, but still are
not direct replacement of ipsets in some cases. Perhaps someone
competent can provide answer (other than "nobody coded it").

regards

-- 
Slavko
https://www.slavino.sk

[-- Attachment #2: Digitálny podpis OpenPGP --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-21 11:23   ` Slavko
@ 2024-07-21 12:29     ` Kerin Millar
  2024-07-21 14:38       ` Slavko
  2024-07-23  7:24       ` Slavko
  0 siblings, 2 replies; 18+ messages in thread
From: Kerin Millar @ 2024-07-21 12:29 UTC (permalink / raw)
  To: Slavko, netfilter ML

On Sun, 21 Jul 2024, at 12:23 PM, Slavko wrote:
> Hi,
>
> Dňa Sun, 21 Jul 2024 10:58:30 +0100 "Kerin Millar" <kfm@plushkava.net>
> napísal:
>
>> Firstly, you should not use "delete element". It will never be
>
> I am aware of this, but AFAIK the "destroy" is available from kernel 6.3
> and i am on 5.15, thus i am out of luck. With any luck, the timeout will
> not happen during update (due timeout value and set update interval),
> thus in my case it can work reliable.

I see.

>
>> Secondly, though you could handle each element one by one, I would
>> not recommend it.
>
> OK, i was in doubt.
>
>> Yes, grouping elements in this way is sensible. Note that there is no
>> need to worry about the ARG_MAX constraint because nft is being made
>> to read from the standard input rather than use the -c option.
>
> I hope in that, don't worry about -c option, i can understand the point
> of answer ;-)
>
>> However, I would suggest generating one element per line because the
>> way in which nft reports errors is atrocious in the case of long
>> lines.
>
> Of course, it was simplified example in mail... Beside of nft errors,
> it is more readable in any case...
>
>> {
>>    echo "add element ... {"
>>    sed 's/$/,/'
>>    echo "}"
>> } | nft -f -
>
> Eh, that looks really nice (simple), but IMO it is good for (nonexistent
> yet) update command. As one needs any IP twice/thrice, thus list must be
> read into memory in script, or i miss something?

You have a point. If on a severely RAM-constrained system, you may prefer to avoid the use of asynchronous pipelines and generate temporary files instead. Doing so would minimise RAM consumption.

#!/bin/sh
set -e
tmpfile1=
tmpfile2=
trap 'rm -f -- "$tmpfile1" "$tmpfile2"' EXIT
tmpfile1=$(mktemp)
tmpfile2=$(mktemp)
cat > "$tmpfile1"
{
   echo "delete element $* {"; sed 's/$/,/' "$tmpfile1"; echo "}"
   echo "add    element $* {"; sed 's/$/,/' "$tmpfile1"; echo "}"
} > "$tmpfile2"
nft -f "$tmpfile2"

In case you are unaware, $* joins the positional parameters by the first character of IFS, which is the space character by default. In other words, the above script consumes newline-delimited addresses from the standard input while expecting to be given the qualified set name as its arguments. Further, it does not require that the input be piped. From a shell, "the-script < address-list.txt" would also work.

>
>> add element ... { <existing-ip> timeout <new-timeout> }
>> add element ... { <existing-ip> expires <new-timeout> }
>
> In case of existing IP i understand the "add element" as noop (does
> nothing) command, thus this is IMO expected.
>
>> I doubt it. Perhaps we need a new "update element" command to act
>> similarly to the set statement bearing the same name.
>
> Yes, and while it is possible to update timeout from packet path, it is
> surprising, that it is impossible from command line, the code itself
> must exists, just user interface is missing?

So it would appear.

>
>> Mind you, it gets worse. While the nft utility supports a JSON output
>> mode to ease parsing, its output is less accurate than the
>> conventional output mode in so far as all "timeout" and "expires"
>> values are truncated to integer seconds and conveyed as JSON integers.
>
> I am fain with that timeout precision, mostly because i am not
> interested on per element timeouts :-D
>
> I start to play with jq in shell for that, it seems as possible, but
> that is in only on my "playing" machine, where i use nftables already.
>
> I am curios why is not possible to directly use ipsets from nftables.
> Yes, the nftables's sets are powerful and becomes better, but still are
> not direct replacement of ipsets in some cases. Perhaps someone
> competent can provide answer (other than "nobody coded it").

You are definitely not the only one to think so. Indeed, the matter came up again on IRC just recently. Another complaint that I have frequently encountered is that there is no way to atomically swap sets by name.

-- 
Kerin Millar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-21 12:29     ` Kerin Millar
@ 2024-07-21 14:38       ` Slavko
  2024-07-21 15:46         ` Kerin Millar
  2024-07-23  7:24       ` Slavko
  1 sibling, 1 reply; 18+ messages in thread
From: Slavko @ 2024-07-21 14:38 UTC (permalink / raw)
  To: netfilter ML

Hi,

Dňa 21. júla 2024 12:29:44 UTC používateľ Kerin Millar <kfm@plushkava.net> napísal:

>You have a point. If on a severely RAM-constrained system, you may prefer to avoid the use of asynchronous pipelines and generate temporary files instead. Doing so would minimise RAM consumption.

Correct me, if i am wrong, please...

AFAIK memory is not problem in pipes. Yes, any pipe consumes some
memory, but AFAIK it is somewhat limited (i don't know exact numbers)
and if full, the writing process is paused, thus stops to produce more
output, and thus not consumes more memory.

On other side, nowadays linux systems uses tmpfs for /tmp, thus storing
output in temp file can be even worse (in mean of RAM) than using pipes,
as whole output is in that file. Storing temporary file on disk is possible too,
but doing that too often can drain Flash's lifetime, as modern Flash storage
are really limited in that (in comparison with magnetic disks). IIRC the disk
was used for pipes in old Unixes...

But yes, without temp file, one cannot produce the same IP list multiple
times (reading it into variable is the same), but at least second temp file
have to be avoided...

>You are definitely not the only one to think so.

:-) nice to know

regards


-- 
Slavko
https://www.slavino.sk/

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-21 14:38       ` Slavko
@ 2024-07-21 15:46         ` Kerin Millar
  0 siblings, 0 replies; 18+ messages in thread
From: Kerin Millar @ 2024-07-21 15:46 UTC (permalink / raw)
  To: Slavko, netfilter ML

On Sun, 21 Jul 2024, at 3:38 PM, Slavko wrote:
> Hi,
>
> Dňa 21. júla 2024 12:29:44 UTC používateľ Kerin Millar 
> <kfm@plushkava.net> napísal:
>
>>You have a point. If on a severely RAM-constrained system, you may prefer to avoid the use of asynchronous pipelines and generate temporary files instead. Doing so would minimise RAM consumption.
>
> Correct me, if i am wrong, please...
>
> AFAIK memory is not problem in pipes. Yes, any pipe consumes some
> memory, but AFAIK it is somewhat limited (i don't know exact numbers)
> and if full, the writing process is paused, thus stops to produce more
> output, and thus not consumes more memory.

Yes, the default size of a pipe buffer is just 16 pages in Linux. However, one must also take into account that pipelines are fundamentally asynchronous. That is, all of the commands that comprise a shell pipeline are executed at approximately the same time and run in parallel to one another. In that particular respect, there is a potential bearing on memory consumption. I must emphasise that I am definitely not trying to discourage the use of pipelines - far from it! I only brought it up because I had the impression that you might be contending with a system that is severely short on RAM, in which case running programs serially can be helpful as a deliberate memory-saving measure.

>
> On other side, nowadays linux systems uses tmpfs for /tmp, thus storing
> output in temp file can be even worse (in mean of RAM) than using pipes,
> as whole output is in that file. Storing temporary file on disk is possible too,
> but doing that too often can drain Flash's lifetime, as modern Flash storage
> are really limited in that (in comparison with magnetic disks). IIRC the disk
> was used for pipes in old Unixes...

It depends on the exact circumstances but yes, it could easily be worse. If you have a mktemp(1) utility, it is probable that it responds to the TMPDIR variable, in which case you are free to point it to something other than a tmpfs-backed filesystem such as /var/tmp.

>
> But yes, without temp file, one cannot produce the same IP list multiple
> times (reading it into variable is the same), but at least second temp file
> have to be avoided...

It is not quite the same but I understand your point. Assuming that TMPDIR is backed by a tmpfs, both cases would result in non-disk-backed pages being allocated that may later be paged out to a swap - if any - under sufficient memory pressure.

--
Kerin Millar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-21  9:58 ` Kerin Millar
  2024-07-21 10:04   ` Kerin Millar
  2024-07-21 11:23   ` Slavko
@ 2024-07-21 16:09   ` Eric
  2024-07-21 16:46     ` Kerin Millar
  2024-07-21 17:58   ` Slavko
  2024-07-22 20:36   ` Pablo Neira Ayuso
  4 siblings, 1 reply; 18+ messages in thread
From: Eric @ 2024-07-21 16:09 UTC (permalink / raw)
  To: Kerin Millar; +Cc: Slavko, netfilter ML

On Sunday, July 21st, 2024 at 02:58, Kerin Millar <kfm@plushkava.net> wrote:
> As far as I am aware, it is impossible to add elements in a way that merely reset their expiry times in the case that they exist

'reset element' was added in 1.0.8, but requires kernel 6.5 or later:

nft add   element inet xyz set_ipv4 '{ 1.0.0.1 expires 1h }'
sleep 10
nft reset element inet xyz set_ipv4 '{ 1.0.0.1 }'
nft get   element inet xyz set_ipv4 '{ 1.0.0.1 }'

See https://bugzilla.netfilter.org/show_bug.cgi?id=1689

Eric

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-21 16:09   ` Eric
@ 2024-07-21 16:46     ` Kerin Millar
  0 siblings, 0 replies; 18+ messages in thread
From: Kerin Millar @ 2024-07-21 16:46 UTC (permalink / raw)
  To: Eric; +Cc: Slavko, netfilter ML

On Sun, 21 Jul 2024, at 5:09 PM, Eric wrote:
> On Sunday, July 21st, 2024 at 02:58, Kerin Millar <kfm@plushkava.net> wrote:
>> As far as I am aware, it is impossible to add elements in a way that merely reset their expiry times in the case that they exist
>
> 'reset element' was added in 1.0.8, but requires kernel 6.5 or later:
>
> nft add   element inet xyz set_ipv4 '{ 1.0.0.1 expires 1h }'
> sleep 10
> nft reset element inet xyz set_ipv4 '{ 1.0.0.1 }'
> nft get   element inet xyz set_ipv4 '{ 1.0.0.1 }'
>
> See https://bugzilla.netfilter.org/show_bug.cgi?id=1689

Thank you. Putting aside my disappointment that the terminology shall now forever be disjoint with that employed by packet path updates, it appears to be a broken feature.

# nft -v; uname -r; nft list ruleset
nftables v1.0.9 (Old Doc Yak #3)
6.6.36-1-lts
table ip filter {
        set x {
                type ipv4_addr
                size 65536
                flags dynamic,timeout
                timeout 1h
                elements = { 1.2.3.4 expires 57m38s723ms }
        }
}
# nft 'reset element ip filter x { 1.2.3.4 }'
table ip filter {
        set x {
                type ipv4_addr
                size 65536
                flags dynamic,timeout
                timeout 1h
                elements = { 1.2.3.4 expires 56m37s83ms }
        }
}
# nft list ruleset
table ip filter {
        set x {
                type ipv4_addr
                size 65536
                flags dynamic,timeout
                timeout 1h
                elements = { 1.2.3.4 expires 56m5s366ms }
        }
}

Further, defining the initial timeout on a per-element basis has no impact on the proceedings here, nor does defining a new timeout and/or expires value in the course of issuing the reset command.

An additional problem with it is that it raises an error in the case that the specified element does not already exist, just as delete does. So, even if it were working as intended, it would continue to fall short of achieving feature parity with ipset(8).

--
Kerin Millar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-21  9:58 ` Kerin Millar
                     ` (2 preceding siblings ...)
  2024-07-21 16:09   ` Eric
@ 2024-07-21 17:58   ` Slavko
  2024-07-22 20:36   ` Pablo Neira Ayuso
  4 siblings, 0 replies; 18+ messages in thread
From: Slavko @ 2024-07-21 17:58 UTC (permalink / raw)
  To: netfilter ML

[-- Attachment #1: Type: text/plain, Size: 1243 bytes --]

Hi,

Dňa Sun, 21 Jul 2024 10:58:30 +0100 "Kerin Millar" <kfm@plushkava.net>
napísal:

> Mind you, it gets worse. While the nft utility supports a JSON output
> mode to ease parsing, its output is less accurate than the
> conventional output mode in so far as all "timeout" and "expires"
> values are truncated to integer seconds and conveyed as JSON integers.

Even worse, the JSON output lacks counters at all, the nft output:

    ...
    elements = { IP.AD.DR.ES last used never counter packets 0 bytes 0\
                     expires 1d23h36m40s340ms
    ...

But JSON output lacks counters:

    nft -j list set ... | jq '.nftables[] | select(.set != null).set.elem[0]'
    { "elem": {
        "val": "IP.AD.DR.ES",
        "expires": 171653,
        "last": null
      }
    }

And yes, error messages are more than cryptic... We was talking about
memory usage, and my first attempt to add/update list of ~2700 items
ended with "Out of memory" error. First i was shocked, 2700 items
consumes 6GB RAM??? It took some time to realize, that i had duplicate
IPs in list. Really not helpful error message, and one element per line
doesn't help with that...

regards

-- 
Slavko
https://www.slavino.sk

[-- Attachment #2: Digitálny podpis OpenPGP --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-21  9:58 ` Kerin Millar
                     ` (3 preceding siblings ...)
  2024-07-21 17:58   ` Slavko
@ 2024-07-22 20:36   ` Pablo Neira Ayuso
  2024-07-23  7:26     ` Slavko
  4 siblings, 1 reply; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-07-22 20:36 UTC (permalink / raw)
  To: Kerin Millar; +Cc: Slavko, netfilter ML, netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 360 bytes --]

On Sun, Jul 21, 2024 at 10:58:30AM +0100, Kerin Millar wrote:
> I doubt it. Perhaps we need a new "update element" command to act
> similarly to the set statement bearing the same name.

I have these two patches that have remained in my local tree. They
need a rebase, update tests and so on.

I can aim at merging this upstream in the next development cycle.

[-- Attachment #2: 0001-netfilter-nf_tables-add-timeout-extension-to-element.patch --]
[-- Type: text/x-diff, Size: 1778 bytes --]

From deb81f87834b117f60609177e1c8dd58bf09cd00 Mon Sep 17 00:00:00 2001
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 18 Oct 2023 22:49:03 +0200
Subject: [PATCH nf-next,RFC 1/2] netfilter: nf_tables: add timeout extension
 to elements to prepare for updates

Timeout extension is not allocated in case that the default set timeout
value is the same. However, with set element updates, this can be updated
too so, allocate it but do not include it in netlink messages so users
do not observe any change in the existing listings / events.

This updates c3e1b005ed1c ("netfilter: nf_tables: add set element
timeout support").

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
Need rebase and extend tests.

 net/netfilter/nf_tables_api.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index ec616bbe75de..b7ede2aba06d 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -5642,6 +5642,7 @@ static int nf_tables_fill_setelem(struct sk_buff *skb,
 		goto nla_put_failure;
 
 	if (nft_set_ext_exists(ext, NFT_SET_EXT_TIMEOUT) &&
+	    *nft_set_ext_timeout(ext) != READ_ONCE(set->timeout) &&
 	    nla_put_be64(skb, NFTA_SET_ELEM_TIMEOUT,
 			 nf_jiffies64_to_msecs(*nft_set_ext_timeout(ext)),
 			 NFTA_SET_ELEM_PAD))
@@ -6752,11 +6753,9 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 		if (err < 0)
 			goto err_parse_key_end;
 
-		if (timeout != READ_ONCE(set->timeout)) {
-			err = nft_set_ext_add(&tmpl, NFT_SET_EXT_TIMEOUT);
-			if (err < 0)
-				goto err_parse_key_end;
-		}
+		err = nft_set_ext_add(&tmpl, NFT_SET_EXT_TIMEOUT);
+		if (err < 0)
+			goto err_parse_key_end;
 	}
 
 	if (num_exprs) {
-- 
2.30.2


[-- Attachment #3: 0002-netfilter-nf_tables-set-element-timeout-update-suppo.patch --]
[-- Type: text/x-diff, Size: 5937 bytes --]

From 6f9de93f7a4e9cc004689cbffe836a5d80ce9c5f Mon Sep 17 00:00:00 2001
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 18 Oct 2023 23:07:21 +0200
Subject: [PATCH nf-next,RFC 2/2] netfilter: nf_tables: set element timeout
 update support

Store new timeout and expiration to be updated from .commit path.
Simply discard the timeout update if .abort path is exercised.

This patch requires ("netfilter: nf_tables: use timestamp to check for
set element timeout") to make sure an element does not expire while
transaction is ongoing.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
Need rebase and extend tests.

 include/net/netfilter/nf_tables.h | 19 +++++++++-
 net/netfilter/nf_tables_api.c     | 62 ++++++++++++++++++++++++++-----
 2 files changed, 69 insertions(+), 12 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index d791dbfd12f9..9e48b5ab5f39 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -801,8 +801,14 @@ static inline struct nft_set_elem_expr *nft_set_ext_expr(const struct nft_set_ex
 static inline bool __nft_set_elem_expired(const struct nft_set_ext *ext,
 					  u64 tstamp)
 {
-	return nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION) &&
-	       time_after_eq64(tstamp, *nft_set_ext_expiration(ext));
+	u64 expiration;
+
+	if (!nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION))
+		return false;
+
+	expiration = READ_ONCE(*nft_set_ext_expiration(ext));
+
+	return time_after_eq64(tstamp, expiration);
 }
 
 static inline bool nft_set_elem_expired(const struct nft_set_ext *ext)
@@ -1657,6 +1663,9 @@ struct nft_trans_table {
 struct nft_trans_elem {
 	struct nft_set			*set;
 	struct nft_elem_priv		*elem_priv;
+	u64				timeout;
+	u64				expiration;
+	bool				update;
 	bool				bound;
 };
 
@@ -1664,6 +1673,12 @@ struct nft_trans_elem {
 	(((struct nft_trans_elem *)trans->data)->set)
 #define nft_trans_elem_priv(trans)	\
 	(((struct nft_trans_elem *)trans->data)->elem_priv)
+#define nft_trans_elem_update(trans)	\
+	(((struct nft_trans_elem *)trans->data)->update)
+#define nft_trans_elem_timeout(trans)	\
+	(((struct nft_trans_elem *)trans->data)->timeout)
+#define nft_trans_elem_expiration(trans)	\
+	(((struct nft_trans_elem *)trans->data)->expiration)
 #define nft_trans_elem_set_bound(trans)	\
 	(((struct nft_trans_elem *)trans->data)->bound)
 
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index b7ede2aba06d..74696ec4a491 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -5641,17 +5641,20 @@ static int nf_tables_fill_setelem(struct sk_buff *skb,
 		         htonl(*nft_set_ext_flags(ext))))
 		goto nla_put_failure;
 
-	if (nft_set_ext_exists(ext, NFT_SET_EXT_TIMEOUT) &&
-	    *nft_set_ext_timeout(ext) != READ_ONCE(set->timeout) &&
-	    nla_put_be64(skb, NFTA_SET_ELEM_TIMEOUT,
-			 nf_jiffies64_to_msecs(*nft_set_ext_timeout(ext)),
-			 NFTA_SET_ELEM_PAD))
-		goto nla_put_failure;
+	if (nft_set_ext_exists(ext, NFT_SET_EXT_TIMEOUT)) {
+		u64 timeout = READ_ONCE(*nft_set_ext_timeout(ext));
+
+		if (timeout != READ_ONCE(set->timeout) &&
+		    nla_put_be64(skb, NFTA_SET_ELEM_TIMEOUT,
+				 nf_jiffies64_to_msecs(timeout),
+				 NFTA_SET_ELEM_PAD))
+			goto nla_put_failure;
+	}
 
 	if (nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION)) {
 		u64 expires, now = get_jiffies_64();
 
-		expires = *nft_set_ext_expiration(ext);
+		expires = READ_ONCE(*nft_set_ext_expiration(ext));
 		if (time_before64(now, expires))
 			expires -= now;
 		else
@@ -6584,6 +6587,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	struct nft_data_desc desc;
 	enum nft_registers dreg;
 	struct nft_trans *trans;
+	bool update = false;
 	u64 expiration;
 	u64 timeout;
 	int err, i;
@@ -6893,8 +6897,28 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 			     nft_set_ext_exists(ext2, NFT_SET_EXT_OBJREF) &&
 			     *nft_set_ext_obj(ext) != *nft_set_ext_obj(ext2)))
 				goto err_element_clash;
-			else if (!(nlmsg_flags & NLM_F_EXCL))
+			else if (!(nlmsg_flags & NLM_F_EXCL)) {
 				err = 0;
+				if (timeout) {
+					nft_trans_elem_timeout(trans) = timeout;
+					if (expiration == 0)
+						expiration = timeout;
+
+					update = true;
+				}
+				if (expiration) {
+					nft_trans_elem_expiration(trans) =
+						expiration;
+					update = true;
+				}
+
+				if (update) {
+					nft_trans_elem_priv(trans) = elem_priv;
+					nft_trans_elem_update(trans) = true;
+					nft_trans_commit_list_add_tail(ctx->net, trans);
+					goto err_elem_free;
+				}
+			}
 		} else if (err == -ENOTEMPTY) {
 			/* ENOTEMPTY reports overlapping between this element
 			 * and an existing one.
@@ -10096,7 +10120,24 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
 		case NFT_MSG_NEWSETELEM:
 			te = (struct nft_trans_elem *)trans->data;
 
-			nft_setelem_activate(net, te->set, te->elem_priv);
+			if (te->update) {
+				const struct nft_set_ext *ext =
+					nft_set_elem_ext(te->set, te->elem_priv);
+
+				if (te->timeout &&
+				    nft_set_ext_exists(ext, NFT_SET_EXT_TIMEOUT)) {
+					WRITE_ONCE(*nft_set_ext_timeout(ext),
+						   te->timeout);
+				}
+				if (te->expiration &&
+				    nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION)) {
+					WRITE_ONCE(*nft_set_ext_expiration(ext),
+						   get_jiffies_64() + te->expiration);
+				}
+			} else {
+				nft_setelem_activate(net, te->set, te->elem_priv);
+			}
+
 			nf_tables_setelem_notify(&trans->ctx, te->set,
 						 te->elem_priv,
 						 NFT_MSG_NEWSETELEM);
@@ -10377,7 +10418,8 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
 			nft_trans_destroy(trans);
 			break;
 		case NFT_MSG_NEWSETELEM:
-			if (nft_trans_elem_set_bound(trans)) {
+			if (nft_trans_elem_update(trans) ||
+			    nft_trans_elem_set_bound(trans)) {
 				nft_trans_destroy(trans);
 				break;
 			}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-21 12:29     ` Kerin Millar
  2024-07-21 14:38       ` Slavko
@ 2024-07-23  7:24       ` Slavko
  2024-07-23  7:37         ` Slavko
  2024-07-23  9:39         ` Pablo Neira Ayuso
  1 sibling, 2 replies; 18+ messages in thread
From: Slavko @ 2024-07-23  7:24 UTC (permalink / raw)
  To: netfilter ML

[-- Attachment #1: Type: text/plain, Size: 1837 bytes --]

Hi,

Dňa Sun, 21 Jul 2024 13:29:44 +0100 "Kerin Millar" <kfm@plushkava.net>
napísal:

> #!/bin/sh
> set -e
> tmpfile1=
> tmpfile2=
> trap 'rm -f -- "$tmpfile1" "$tmpfile2"' EXIT
> tmpfile1=$(mktemp)
> tmpfile2=$(mktemp)
> cat > "$tmpfile1"
> {
>    echo "delete element $* {"; sed 's/$/,/' "$tmpfile1"; echo "}"
>    echo "add    element $* {"; sed 's/$/,/' "$tmpfile1"; echo "}"
> } > "$tmpfile2"
> nft -f "$tmpfile2"

I play with that, i add one more "add" before "delete" and seems to
work, except after boot...

After boot the set is (obviously) empty, i try to fill it with something
as:

    curl -s https://my_local_URL | ./nft-addset.sh inet fw4 myset | nft -f-

But that fail with::

    /dev/stdin:1:1-2: Error: Could not process rule: Out of memory
    add    element inet fw4 myset {
    ^^

(The host has >6 GB of free RAM, the list has ~1700 items)

Initially i blame IP duplicates in downloaded list of IPs, but that was
unrelated to this error. When i remove the first "add" and "delete"
lines from script (thus just one "add"), it works. After initial fill it
works with all three commands (add+delete+add) on already filled set
and even when i flush that set, it still works. Only first fill (after
boot) ends with that error.

When i delete and create that set, i got "Out of memory" again, the set
is defined as::

    table inet fw4 {
	set myset {
		type ipv4_addr
		last counter
		timeout 2d
	}
    }

I tried to add "size" into it, but that doesn't help.

I roughly remember, that i read something about some memory limit in
container (i am not in container), but i am not able to find that
again to check if that is problem.

Please, what can cause that initial fill error, how i can debug/solve
it?

regards

-- 
Slavko
https://www.slavino.sk

[-- Attachment #2: Digitálny podpis OpenPGP --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-22 20:36   ` Pablo Neira Ayuso
@ 2024-07-23  7:26     ` Slavko
  0 siblings, 0 replies; 18+ messages in thread
From: Slavko @ 2024-07-23  7:26 UTC (permalink / raw)
  To: Pablo Neira Ayuso, netfilter ML

[-- Attachment #1: Type: text/plain, Size: 619 bytes --]

Ahoj,

Dňa Mon, 22 Jul 2024 22:36:49 +0200 Pablo Neira Ayuso
<pablo@netfilter.org> napísal:

> On Sun, Jul 21, 2024 at 10:58:30AM +0100, Kerin Millar wrote:
> > I doubt it. Perhaps we need a new "update element" command to act
> > similarly to the set statement bearing the same name.  
> 
> I have these two patches that have remained in my local tree. They
> need a rebase, update tests and so on.
> 
> I can aim at merging this upstream in the next development cycle.

While i don't understand the code, i will be happy if things will be
improved.

regards

-- 
Slavko
https://www.slavino.sk

[-- Attachment #2: Digitálny podpis OpenPGP --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-23  7:24       ` Slavko
@ 2024-07-23  7:37         ` Slavko
  2024-07-23  9:39         ` Pablo Neira Ayuso
  1 sibling, 0 replies; 18+ messages in thread
From: Slavko @ 2024-07-23  7:37 UTC (permalink / raw)
  To: netfilter ML

[-- Attachment #1: Type: text/plain, Size: 358 bytes --]

Ahoj,

Dňa Tue, 23 Jul 2024 09:24:39 +0200 Slavko <linux@slavino.sk> napísal:

> When i delete and create that set, i got "Out of memory" again, the
> set is defined as::

Strange, after several delete/add set the initial fill works on empty
set (with all three commands).

I am out of ideas...

regards

-- 
Slavko
https://www.slavino.sk

[-- Attachment #2: Digitálny podpis OpenPGP --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-23  7:24       ` Slavko
  2024-07-23  7:37         ` Slavko
@ 2024-07-23  9:39         ` Pablo Neira Ayuso
  2024-07-23 10:23           ` Slavko
  2024-07-23 11:32           ` Kerin Millar
  1 sibling, 2 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-07-23  9:39 UTC (permalink / raw)
  To: Slavko; +Cc: netfilter ML

On Tue, Jul 23, 2024 at 09:24:39AM +0200, Slavko wrote:
[...]
> (The host has >6 GB of free RAM, the list has ~1700 items)
> 
> Initially i blame IP duplicates in downloaded list of IPs, but that was
> unrelated to this error. When i remove the first "add" and "delete"
> lines from script (thus just one "add"), it works. After initial fill it
> works with all three commands (add+delete+add) on already filled set
> and even when i flush that set, it still works. Only first fill (after
> boot) ends with that error.
> 
> When i delete and create that set, i got "Out of memory" again, the set
> is defined as::
> 
>     table inet fw4 {
> 	set myset {
> 		type ipv4_addr
> 		last counter
> 		timeout 2d
> 	}
>     }
> 
> I tried to add "size" into it, but that doesn't help.
> 
> I roughly remember, that i read something about some memory limit in
> container (i am not in container), but i am not able to find that
> again to check if that is problem.
> 
> Please, what can cause that initial fill error, how i can debug/solve
> it?

Kernel is likely missing this patch:

commit fa23e0d4b756d25829e124d6b670a4c6bbd4bf7e
Author: Florian Westphal <fw@strlen.de>
Date:   Wed May 8 14:52:47 2024 +0200

    netfilter: nf_tables: allow clone callbacks to sleep

You mentioned you are on 5.15, right?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-23  9:39         ` Pablo Neira Ayuso
@ 2024-07-23 10:23           ` Slavko
  2024-07-23 11:32           ` Kerin Millar
  1 sibling, 0 replies; 18+ messages in thread
From: Slavko @ 2024-07-23 10:23 UTC (permalink / raw)
  To: netfilter ML

[-- Attachment #1: Type: text/plain, Size: 1007 bytes --]

Ahoj,

Dňa Tue, 23 Jul 2024 11:39:24 +0200 Pablo Neira Ayuso
<pablo@netfilter.org> napísal:

> commit fa23e0d4b756d25829e124d6b670a4c6bbd4bf7e

Found i properly that is (will be) in 6.10?

> You mentioned you are on 5.15, right?

Yes, 5.15.150 (openwrt), i choose to use openwrt mostly due its luci
(web interface), but as i usually don't work with openwrt (only on my
home router), i have not set build environment for it, nor i follow
its patches, thus no simple way to get newer kernel nor to patch
existing for me.

For now i installed ip(6)tables-nft (+ ipset), it seems to work. The
only downside is, that i see no way how to customize openwrt firewall's
(chain's) hooks priority (to ensure that they are run after iptables)...
I hope that it will not be problem. I hope that can live with both,
dropped before or after nftables.

But i will like to abandon iptables at all, is here some workaround for
that commit, please?

regards

-- 
Slavko
https://www.slavino.sk

[-- Attachment #2: Digitálny podpis OpenPGP --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-23  9:39         ` Pablo Neira Ayuso
  2024-07-23 10:23           ` Slavko
@ 2024-07-23 11:32           ` Kerin Millar
  2024-07-23 12:19             ` Pablo Neira Ayuso
  1 sibling, 1 reply; 18+ messages in thread
From: Kerin Millar @ 2024-07-23 11:32 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Slavko; +Cc: netfilter ML

On Tue, 23 Jul 2024, at 10:39 AM, Pablo Neira Ayuso wrote:
> On Tue, Jul 23, 2024 at 09:24:39AM +0200, Slavko wrote:
> [...]
>> (The host has >6 GB of free RAM, the list has ~1700 items)
>> 
>> Initially i blame IP duplicates in downloaded list of IPs, but that was
>> unrelated to this error. When i remove the first "add" and "delete"
>> lines from script (thus just one "add"), it works. After initial fill it
>> works with all three commands (add+delete+add) on already filled set
>> and even when i flush that set, it still works. Only first fill (after
>> boot) ends with that error.
>> 
>> When i delete and create that set, i got "Out of memory" again, the set
>> is defined as::
>> 
>>     table inet fw4 {
>> 	set myset {
>> 		type ipv4_addr
>> 		last counter
>> 		timeout 2d
>> 	}
>>     }
>> 
>> I tried to add "size" into it, but that doesn't help.
>> 
>> I roughly remember, that i read something about some memory limit in
>> container (i am not in container), but i am not able to find that
>> again to check if that is problem.
>> 
>> Please, what can cause that initial fill error, how i can debug/solve
>> it?
>
> Kernel is likely missing this patch:
>
> commit fa23e0d4b756d25829e124d6b670a4c6bbd4bf7e
> Author: Florian Westphal <fw@strlen.de>
> Date:   Wed May 8 14:52:47 2024 +0200
>
>     netfilter: nf_tables: allow clone callbacks to sleep 

Would it be safe to backport the following two commits to linux-6.6.y in isolation?

3c13725f43dcf43ad8a9bcd6a9f12add19a8f93e (bail out if stateful expression provides no .clone)
fa23e0d4b756d25829e124d6b670a4c6bbd4bf7e (allow clone callbacks to sleep)

-- 
Kerin Millar

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Sets update
  2024-07-23 11:32           ` Kerin Millar
@ 2024-07-23 12:19             ` Pablo Neira Ayuso
  0 siblings, 0 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-07-23 12:19 UTC (permalink / raw)
  To: Kerin Millar; +Cc: Slavko, netfilter ML

On Tue, Jul 23, 2024 at 12:32:50PM +0100, Kerin Millar wrote:
> On Tue, 23 Jul 2024, at 10:39 AM, Pablo Neira Ayuso wrote:
> > On Tue, Jul 23, 2024 at 09:24:39AM +0200, Slavko wrote:
> > [...]
> >> (The host has >6 GB of free RAM, the list has ~1700 items)
> >> 
> >> Initially i blame IP duplicates in downloaded list of IPs, but that was
> >> unrelated to this error. When i remove the first "add" and "delete"
> >> lines from script (thus just one "add"), it works. After initial fill it
> >> works with all three commands (add+delete+add) on already filled set
> >> and even when i flush that set, it still works. Only first fill (after
> >> boot) ends with that error.
> >> 
> >> When i delete and create that set, i got "Out of memory" again, the set
> >> is defined as::
> >> 
> >>     table inet fw4 {
> >> 	set myset {
> >> 		type ipv4_addr
> >> 		last counter
> >> 		timeout 2d
> >> 	}
> >>     }
> >> 
> >> I tried to add "size" into it, but that doesn't help.
> >> 
> >> I roughly remember, that i read something about some memory limit in
> >> container (i am not in container), but i am not able to find that
> >> again to check if that is problem.
> >> 
> >> Please, what can cause that initial fill error, how i can debug/solve
> >> it?
> >
> > Kernel is likely missing this patch:
> >
> > commit fa23e0d4b756d25829e124d6b670a4c6bbd4bf7e
> > Author: Florian Westphal <fw@strlen.de>
> > Date:   Wed May 8 14:52:47 2024 +0200
> >
> >     netfilter: nf_tables: allow clone callbacks to sleep 
> 
> Would it be safe to backport the following two commits to linux-6.6.y in isolation?
> 
> 3c13725f43dcf43ad8a9bcd6a9f12add19a8f93e (bail out if stateful expression provides no .clone)
> fa23e0d4b756d25829e124d6b670a4c6bbd4bf7e (allow clone callbacks to sleep)

I have enqueued those for -stable submission.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-07-23 12:19 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-21  7:44 Sets update Slavko
2024-07-21  9:58 ` Kerin Millar
2024-07-21 10:04   ` Kerin Millar
2024-07-21 11:23   ` Slavko
2024-07-21 12:29     ` Kerin Millar
2024-07-21 14:38       ` Slavko
2024-07-21 15:46         ` Kerin Millar
2024-07-23  7:24       ` Slavko
2024-07-23  7:37         ` Slavko
2024-07-23  9:39         ` Pablo Neira Ayuso
2024-07-23 10:23           ` Slavko
2024-07-23 11:32           ` Kerin Millar
2024-07-23 12:19             ` Pablo Neira Ayuso
2024-07-21 16:09   ` Eric
2024-07-21 16:46     ` Kerin Millar
2024-07-21 17:58   ` Slavko
2024-07-22 20:36   ` Pablo Neira Ayuso
2024-07-23  7:26     ` Slavko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.