From: Phil Sutter <phil@nwl.cc>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Florian Westphal <fw@strlen.de>, netfilter-devel@vger.kernel.org
Subject: Re: [net-next PATCH 0/2] netfilter: Improve inverted IP prefix matches
Date: Mon, 26 Oct 2020 13:29:23 +0100
Message-ID: <20201026122923.GY13016@orbyte.nwl.cc>
In-Reply-To: <20201021104952.GA31026@salvia>
Hi Pablo,
On Wed, Oct 21, 2020 at 12:49:52PM +0200, Pablo Neira Ayuso wrote:
> On Wed, Oct 21, 2020 at 12:43:21PM +0200, Pablo Neira Ayuso wrote:
> > Hi Phil,
> >
> > On Fri, Oct 02, 2020 at 11:00:33AM +0200, Phil Sutter wrote:
> > > Hi Florian,
> > >
> > > On Fri, Oct 02, 2020 at 12:25:36AM +0200, Florian Westphal wrote:
> > > > Phil Sutter <phil@nwl.cc> wrote:
> > > > > The following two patches improve packet throughput in a test setup
> > > > > sending UDP packets (using iperf3) between two netns. The ruleset used
> > > > > on receiver side is like this:
> > > > >
> > > > > | *filter
> > > > > | :test - [0:0]
> > > > > | -A INPUT -j test
> > > > > | -A INPUT -j ACCEPT
> > > > > | -A test ! -s 10.0.0.0/10 -j DROP # this line repeats 10000 times
> > > > > | COMMIT
> > > > >
> > > > > These are the generated VM instructions for each rule:
> > > > >
> > > > > | [ payload load 4b @ network header + 12 => reg 1 ]
> > > > > | [ bitwise reg 1 = (reg=1 & 0x0000c0ff ) ^ 0x00000000 ]
> > > >
> > > > Not related to this patch, but we should avoid the bitop if the
> > > > netmask is divisible by 8 (can adjust the cmp -- adjusting the
> > > > payload expr is probably not worth it).
> > >
> > > See the patch I just sent to this list. I adjusted both - it simply
> > > didn't appear to me that I could get by with reducing the cmp expression
> > > size only. The upside though is that detecting the prefix match based on
> > > payload expression length is quick and easy.
> > >
> > > Someone will have to adjust nft tool, though. ;)
> > >
> > > > > | [ cmp eq reg 1 0x0000000a ]
> > > > > | [ counter pkts 0 bytes 0 ]
> > > >
> > > > Out of curiosity, does omitting 'counter' help?
> > > >
> > > > nft counter is rather expensive due to bh disable,
> > > > iptables does it once at the evaluation loop only.
> > >
> > > I changed the test to create the base ruleset using iptables-nft-restore
> > > just as before, but create the rules in 'test' chain like so:
> > >
> > > | nft add rule filter test ip saddr != 10.0.0.0/10 drop
> > >
> > > The VM code is as expected:
> > >
> > > | [ payload load 4b @ network header + 12 => reg 1 ]
> > > | [ bitwise reg 1 = (reg=1 & 0x0000c0ff ) ^ 0x00000000 ]
> > > | [ cmp eq reg 1 0x0000000a ]
> > > | [ immediate reg 0 drop ]
> > >
> > > Performance is ~7000pkt/s. So while it's faster than iptables-nft, it's
> > > still quite a bit slower than legacy iptables despite the skipped
> > > counters.
> >
> > iptables is optimized for matching on input/output device name and
> > IPv4 address + mask (see ip_packet_match()) for historical reasons,
> > iptables does not use a match for this since the beginning.
Ah, thanks for the pointer. That function (and the code therein) pretty
clearly shows why rule-shredding is so much slower in iptables-nft than
legacy despite the attempts at improving it.
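
For readers without the source at hand, here is a rough userspace sketch of
the kind of check meant here: the masked address compare and the inversion
flag are folded into a single inline test instead of being split across
separate payload, bitwise and cmp expressions. This is not the kernel code.
The struct and helper names are made up for illustration, and addresses are
kept in host byte order for readability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the source-address part of a rule.
 * Host byte order is an assumption made for readability only. */
struct saddr_rule {
	uint32_t addr;   /* network address, already masked */
	uint32_t mask;   /* prefix mask */
	bool invert;     /* true for "! -s ..." */
};

/* Build a prefix mask, e.g. 10 -> 0xffc00000. */
static uint32_t prefix_mask(unsigned int prefix_len)
{
	assert(prefix_len <= 32);
	return prefix_len ? ~UINT32_C(0) << (32 - prefix_len) : 0;
}

/* Assemble a dotted quad into a host byte order value. */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return (uint32_t)a << 24 | (uint32_t)b << 16 | (uint32_t)c << 8 | d;
}

/* One masked compare plus the inversion flag, evaluated inline:
 * a plain rule matches when the masked address is equal,
 * an inverted rule matches when it differs. */
static bool saddr_rule_matches(const struct saddr_rule *r, uint32_t saddr)
{
	bool differs = (saddr & r->mask) != r->addr;

	return differs == r->invert;
}
```

With the rule from the test setup (! -s 10.0.0.0/10), a packet from 10.0.0.1
does not match and traversal continues with the next rule, while a packet
from 10.64.0.1 matches.
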
> For clarity here, I mean: iptables does not use the generic match
> infrastructure for matching on these fields, instead it is using
> ip_packet_match() which is called from ipt_do_table() which is the
> core function that evaluates the packet.
>
> > One possibility (in the short-term) is to add an internal kernel
> > expression to achieve the same behaviour. The kernel needs to detect:
> >
> > payload (nh, offset to ip saddr or ip daddr or ip protocol) + cmp
> > payload (nh, offset to ip saddr or ip daddr) + bitwise + cmp
> > meta (iifname or oifname) + bitwise + cmp
> > meta (iifname or oifname) + cmp
> >
> > at the very beginning of the rule.
> >
> > and squash these expressions into the "built-in" iptables match
> > expression which emulates ip_packet_match().
> >
> > Not nice, but if microbenchmarks using thousands of rules really matter
> > (this is worst case O(n) linear list evaluation...) then it might make
> > sense to explore this.
I appreciate the effort to identify a solution which "just works",
though I am not sure if we really should implement such hacks (yet). That
said, the "fast" expressions strictly speaking are hacks as well ...
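
To make the four patterns quoted above a bit more concrete, the detection
step could be sketched roughly like this. It is purely illustrative: the
enums, the struct and the function name are hypothetical and do not
correspond to existing nf_tables types, and a real implementation would have
to inspect payload base, offset and length rather than a pre-decoded field
tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a rule's expression list. */
enum expr_kind { EXPR_PAYLOAD, EXPR_META, EXPR_BITWISE, EXPR_CMP, EXPR_OTHER };

enum field {
	FIELD_NONE,
	FIELD_IP_SADDR,
	FIELD_IP_DADDR,
	FIELD_IP_PROTOCOL,
	FIELD_IIFNAME,
	FIELD_OIFNAME,
};

struct expr {
	enum expr_kind kind;
	enum field field;   /* only meaningful for payload and meta */
};

static bool is_addr(enum field f)
{
	return f == FIELD_IP_SADDR || f == FIELD_IP_DADDR;
}

static bool is_ifname(enum field f)
{
	return f == FIELD_IIFNAME || f == FIELD_OIFNAME;
}

/* Return how many leading expressions form one of the four patterns
 * (2 for load + cmp, 3 for load + bitwise + cmp), or 0 if none does. */
static size_t squashable_len(const struct expr *e, size_t n)
{
	bool masked;

	assert(e != NULL || n == 0);
	if (n < 2)
		return 0;

	masked = n >= 3 && e[1].kind == EXPR_BITWISE && e[2].kind == EXPR_CMP;
	if (!masked && e[1].kind != EXPR_CMP)
		return 0;

	switch (e[0].kind) {
	case EXPR_PAYLOAD:
		/* ip protocol is only listed for the unmasked variant */
		if (is_addr(e[0].field) ||
		    (!masked && e[0].field == FIELD_IP_PROTOCOL))
			return masked ? 3 : 2;
		return 0;
	case EXPR_META:
		return is_ifname(e[0].field) ? (masked ? 3 : 2) : 0;
	default:
		return 0;
	}
}
```

For the ruleset from the cover letter (payload load of ip saddr, bitwise,
cmp, then counter or verdict), this returns 3, i.e. the first three
expressions would be the ones to fold into a single built-in match.
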
Cheers, Phil