* nftables rule optimization - evaluating efficiency
@ 2024-07-02 19:03 William N.
2024-07-03 9:37 ` Reindl Harald
2024-07-10 18:34 ` William N.
0 siblings, 2 replies; 8+ messages in thread
From: William N. @ 2024-07-02 19:03 UTC (permalink / raw)
To: netfilter
Hi,
Since the same thing can be done with different rules, I am
looking for the most efficient (low resource usage, high speed) way to
write my rules.
Here is just a very simple test to compare the different approaches:
#!/usr/sbin/nft -f

flush ruleset

table ip6 t {
        # Goal: fast processing through early "exit"
        chain A {
                ip6 hoplimit != 255 return
                icmpv6 type != 133 return
                icmpv6 code != 0 return
                accept
        }

        # Goal: compact syntax
        chain B {
                icmpv6 type . icmpv6 code . ip6 hoplimit {
                        133 . 0 . 255
                } \
                        accept
                return
        }

        # Goal: nothing specific, using "general" syntax
        chain C {
                icmpv6 type 133 icmpv6 code 0 ip6 hoplimit 255 \
                        accept
                return
        }
}
Looking at the output of 'nft -c --debug=netlink -f <this file>', it
seems:
- chain A would work best (fewest instructions to verdict) if there is
no match (e.g. if hoplimit is indeed not 255), but in all other cases
the total number of instructions to be processed is greater
- chain B and C seem to have the same number of instructions, but
perhaps B would outperform C in case of multiple elements in the set
(e.g. more types or codes to check)
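To illustrate the multi-element case mentioned for chain B, here is a minimal sketch. The chain names B_multi and C_multi and the extra type values 134 to 136 are assumptions added for illustration; they are not part of the original test, and whether the set variant is actually faster would still have to be measured.

```nft
#!/usr/sbin/nft -f

# Sketch only: chain names and the additional ICMPv6 types are assumptions.
table ip6 t {
        # Set variant: one rule with a single concatenated set lookup.
        chain B_multi {
                icmpv6 type . icmpv6 code . ip6 hoplimit {
                        133 . 0 . 255,
                        134 . 0 . 255,
                        135 . 0 . 255,
                        136 . 0 . 255
                } accept
        }

        # "General" variant: one rule per combination, evaluated in order.
        chain C_multi {
                icmpv6 type 133 icmpv6 code 0 ip6 hoplimit 255 accept
                icmpv6 type 134 icmpv6 code 0 ip6 hoplimit 255 accept
                icmpv6 type 135 icmpv6 code 0 ip6 hoplimit 255 accept
                icmpv6 type 136 icmpv6 code 0 ip6 hoplimit 255 accept
        }
}
```

Checking this file with 'nft -c --debug=netlink -f <file>' should show a single rule with one lookup for B_multi and four separate rules for C_multi, which makes the difference in instruction count visible as the number of elements grows.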
Also, it is not clear what the actual "load" of different
instructions is in terms of CPU cycles and memory, i.e. one instruction
may look like "one" but may actually cost more than two others, right?
What is the proper way to evaluate and optimize rule efficiency?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: nftables rule optimization - evaluating efficiency
2024-07-02 19:03 nftables rule optimization - evaluating efficiency William N.
@ 2024-07-03 9:37 ` Reindl Harald
2024-07-03 10:44 ` William N.
2024-07-10 18:34 ` William N.
1 sibling, 1 reply; 8+ messages in thread
From: Reindl Harald @ 2024-07-03 9:37 UTC (permalink / raw)
To: netfilter
On 02.07.24 at 21:03, William N. wrote:
> - chain A would work best (least instructions to verdict) if there is
> no match (e.g. if hoplimit is indeed not 255) but in all other cases
> the total number of instructions to be processed is greater
>
> - chain B and C seem to have the same number of instructions but
> perhaps B would outperform C in case of multiple elements in the set
> (e.g. more types or codes to check)
>
> Also, it is not clear what is the actual "load" of different
> instructions in terms of CPU cycles and memory, i.e. one instruction
> may look as "one" but may actually cost more than another 2, right?
>
> What is the proper way to evaluate and optimize rule efficiency?
By understanding what your primary load is and making final decisions as
soon as possible.
"ctstate RELATED,ESTABLISHED" hits 99% of all packets, and after that
you only handle new connections.
When 99% of your load is on port 443 and 50 other rules for whatever
services come before the ACCEPT rule, they are all evaluated.
The same goes for drop/reject rules: put the ones that hit most of the
time on top.
You have rule counters that show how many packets each rule matched.
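In nftables syntax, the ordering advice above could be sketched roughly as follows. The table name, the chain name and the choice of ports 443 and 22 are assumptions made for illustration only.

```nft
#!/usr/sbin/nft -f

# Sketch only: names and ports are assumptions. Note that "policy drop"
# on an input hook can lock you out of a remote machine.
table inet ordering_sketch {
        chain input {
                type filter hook input priority 0; policy drop;

                # Most packets belong to known connections: decide on them first.
                ct state established,related counter accept

                # Busiest service next, rarely used services after it.
                tcp dport 443 counter accept
                tcp dport 22 counter accept
        }
}
```

Each "counter" statement records the packets and bytes that reached it, and 'nft list ruleset' prints those values next to the rule, so the rules that match most often can be identified and moved up.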
* Re: nftables rule optimization - evaluating efficiency
2024-07-03 9:37 ` Reindl Harald
@ 2024-07-03 10:44 ` William N.
0 siblings, 0 replies; 8+ messages in thread
From: William N. @ 2024-07-03 10:44 UTC (permalink / raw)
To: netfilter
On Wed, 3 Jul 2024 11:37:10 +0200 Reindl Harald wrote:
> understanding what is your primary load and make final decisions as
> soon as possible
>
> "ctstate RELATED,ESTABLISHED" hits 99% of all packets and after that
> you only handle new connections
That particular problem was discussed in another thread:
https://marc.info/?t=171360284600001&r=1&w=2
A little side note: The capitalized words imply iptables syntax. In
case I have somehow been misunderstood, please let me note, just for the
sake of clarity, that the actual question is about nftables.
> when you have 99% of your load on port 443 and before the ACCEPT rule
> are 50 other rules for whatever services they are all evaluated
>
> the same for drop/reject rules - on top the ones which hit most of
> the time
Sure. That is clear. The question is not how to order rules but how to
write a rule in the most efficient way and how to evaluate its
performance, i.e. I would like to go beyond ordering and into the rule
itself.
> you have rule counters how much packets every rule triggered
Counters don't tell you how much of the system's resources a rule consumes.
* Re: nftables rule optimization - evaluating efficiency
2024-07-02 19:03 nftables rule optimization - evaluating efficiency William N.
2024-07-03 9:37 ` Reindl Harald
@ 2024-07-10 18:34 ` William N.
2024-07-10 21:27 ` Kerin Millar
1 sibling, 1 reply; 8+ messages in thread
From: William N. @ 2024-07-10 18:34 UTC (permalink / raw)
To: netfilter
On Tue, 2 Jul 2024 19:03:18 -0000 William N. wrote:
> [...]
> Also, it is not clear what is the actual "load" of different
> instructions in terms of CPU cycles and memory, i.e. one instruction
> may look as "one" but may actually cost more than another 2, right?
>
> What is the proper way to evaluate and optimize rule efficiency?
Are those difficult or somehow inappropriate questions?
Or is there a better place to ask?
* Re: nftables rule optimization - evaluating efficiency
2024-07-10 18:34 ` William N.
@ 2024-07-10 21:27 ` Kerin Millar
2024-07-10 21:39 ` Florian Westphal
2024-07-11 19:14 ` William N.
0 siblings, 2 replies; 8+ messages in thread
From: Kerin Millar @ 2024-07-10 21:27 UTC (permalink / raw)
To: netfilter
On Wed, 10 Jul 2024, at 7:34 PM, William N. wrote:
> On Tue, 2 Jul 2024 19:03:18 -0000 William N. wrote:
>
>> [...]
>> Also, it is not clear what is the actual "load" of different
>> instructions in terms of CPU cycles and memory, i.e. one instruction
>> may look as "one" but may actually cost more than another 2, right?
Indeed. It cannot be presumed that all instructions are equal in expense.
>>
>> What is the proper way to evaluate and optimize rule efficiency?
>
> Are those difficult or somehow inappropriate questions?
> Or is there a better place to ask?
You are enquiring as to how to assess the relative efficiency of the bytecode instructions through the direct understanding of how they are processed by the nftables VM. Said VM is unique and - as far as I know - wholly undocumented. As regards difficulty, the intersection of people who follow the list and possess the necessary expertise can probably be conveyed quite comfortably by a single digit. It is a pity; I would also have been interested to see an informed reply but the absence of one wasn't altogether surprising to me.
--
Kerin Millar
* Re: nftables rule optimization - evaluating efficiency
2024-07-10 21:27 ` Kerin Millar
@ 2024-07-10 21:39 ` Florian Westphal
2024-07-11 19:15 ` William N.
2024-07-11 19:14 ` William N.
1 sibling, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2024-07-10 21:39 UTC (permalink / raw)
To: Kerin Millar; +Cc: netfilter
Kerin Millar <kfm@plushkava.net> wrote:
> On Wed, 10 Jul 2024, at 7:34 PM, William N. wrote:
> > On Tue, 2 Jul 2024 19:03:18 -0000 William N. wrote:
> >
> >> [...]
> >> Also, it is not clear what is the actual "load" of different
> >> instructions in terms of CPU cycles and memory, i.e. one instruction
> >> may look as "one" but may actually cost more than another 2, right?
>
> Indeed. It cannot be presumed that all instructions are equal in expense.
Yes, but then again nft will try to make reasonable choices.
tcp dport { 22, 80 } will not allocate a huge hash table for two values.
Is it faster than
tcp dport 22
tcp dport 80
?
No idea. And given the volume of bugs I don't really care too much.
> >> What is the proper way to evaluate and optimize rule efficiency?
> >
> > Are those difficult or somehow inappropriate questions?
> > Or is there a better place to ask?
>
> You are enquiring as to how to assess the relative efficiency of the bytecode instructions through the direct understanding of how they are processed by the nftables VM. Said VM is unique and - as far as I know - wholly undocumented. As regards difficulty, the intersection of people who follow the list and possess the necessary expertise can probably be conveyed quite comfortably by a single digit. It is a pity; I would also have been interested to see an informed reply but the absence of one wasn't altogether surprising to me.
The additional issue is that it's hard to give a useful answer.
'Use perf record -a ... and pktgen' is likely a rather useless reply.
And it might even be incorrect, depending on what one wants to measure.
In general, it's best to compact the ruleset as much as possible
and use sets/maps/vmaps wherever possible.
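As a rough illustration of that advice, the sketch below shows an anonymous set and a verdict map. The table and chain names and the port-to-chain mapping are assumptions for illustration only, and no claim is made here about the measured speed of either form.

```nft
#!/usr/sbin/nft -f

# Sketch only: all names and the port assignments are assumptions.
table inet compact_sketch {
        chain ssh_in {
                counter accept
        }

        chain web_in {
                counter accept
        }

        # Anonymous set: one rule instead of separate rules for
        # "tcp dport 22" and "tcp dport 80".
        chain with_set {
                tcp dport { 22, 80 } accept
        }

        # Verdict map: one lookup selects the chain to jump to,
        # instead of one "tcp dport N jump ..." rule per service.
        chain with_vmap {
                tcp dport vmap { 22 : jump ssh_in, 80 : jump web_in, 443 : jump web_in }
        }
}
```

The chains here are regular chains without a hook, like those in the original test, so the file can be checked with 'nft -c -f <file>' without affecting live traffic.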
* Re: nftables rule optimization - evaluating efficiency
2024-07-10 21:27 ` Kerin Millar
2024-07-10 21:39 ` Florian Westphal
@ 2024-07-11 19:14 ` William N.
1 sibling, 0 replies; 8+ messages in thread
From: William N. @ 2024-07-11 19:14 UTC (permalink / raw)
To: netfilter
On Wed, 10 Jul 2024 22:27:02 +0100 Kerin Millar wrote:
> Indeed. It cannot be presumed that all instructions are equal in expense.
I suppose it may actually be even more complicated if different CPU
architectures (all capable of running nft) are considered in a
comparison. So, for simplicity, I was hoping to be able to evaluate
things at least on a single architecture.
> It is a pity
Well, I guess I have touched on a subject that is not interesting to
many (hence not documented).
* Re: nftables rule optimization - evaluating efficiency
2024-07-10 21:39 ` Florian Westphal
@ 2024-07-11 19:15 ` William N.
0 siblings, 0 replies; 8+ messages in thread
From: William N. @ 2024-07-11 19:15 UTC (permalink / raw)
To: netfilter
On Wed, 10 Jul 2024 23:39:01 +0200 Florian Westphal wrote:
> In general, it's best to compact the ruleset as much as possible
> and use sets/maps/vmaps wherever possible.
That's what I suspected too.