From: Florian Westphal <fw@strlen.de>
To: David Miller <davem@davemloft.net>
Cc: phil@nwl.cc, laforge@gnumonks.org, fw@strlen.de,
daniel@iogearbox.net, netdev@vger.kernel.org,
netfilter-devel@vger.kernel.org, alexei.starovoitov@gmail.com
Subject: Re: [PATCH RFC 0/4] net: add bpfilter
Date: Mon, 19 Feb 2018 22:13:09 +0100
Message-ID: <20180219211309.GF23857@breakpoint.cc>
In-Reply-To: <20180219.122226.896334578399862770.davem@davemloft.net>
David Miller <davem@davemloft.net> wrote:
> From: Phil Sutter <phil@nwl.cc>
> Date: Mon, 19 Feb 2018 18:14:11 +0100
>
> > OK, so reading between the lines you're saying that nftables project
> > has failed to provide an adequate successor to iptables?
>
> Whilst it is great that the atomic table update problem was solved, I
> think the emphasis on flexibility often at the expense of performance
> was a bad move.
That's not true, IMO.
One idea previously discussed was to add a 'freeze' option
to our nftables syntax. Essentially what would happen is that further
updates to the table become impossible, with the exception of named sets
(which can be changed independently, similar to ebpf maps, I suppose).
As further updates to the table are then no longer allowed, this would
make it possible to e.g. jit all rules into a single program.
The table could still be removed (and recreated) of course, so it's
not impossible to make changes, but no longer at the rule level.
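To make the 'freeze' semantics concrete, here is a minimal userspace sketch. It is not nf_tables code: struct toy_table and the toy_* helpers are hypothetical names invented for illustration. It only models the behaviour described above: after the freeze, rule-level updates are rejected while the named set stays mutable.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, NOT the real nf_tables data structures. */
#define TOY_MAX_RULES 8
#define TOY_MAX_ELEMS 8

struct toy_table {
	bool frozen;                       /* set once by toy_table_freeze() */
	size_t nrules;
	int rules[TOY_MAX_RULES];          /* stand-in for rule expressions */
	size_t nelems;
	unsigned int set[TOY_MAX_ELEMS];   /* one named set, e.g. addresses */
};

/* Rule-level change: refused once the table is frozen. */
static int toy_rule_add(struct toy_table *t, int rule)
{
	if (t->frozen)
		return -EPERM;
	if (t->nrules == TOY_MAX_RULES)
		return -ENOSPC;
	t->rules[t->nrules++] = rule;
	return 0;
}

/* Set-level change: still allowed after the freeze. */
static int toy_set_add(struct toy_table *t, unsigned int elem)
{
	if (t->nelems == TOY_MAX_ELEMS)
		return -ENOSPC;
	t->set[t->nelems++] = elem;
	return 0;
}

/* From here on the rule list is immutable, so (in the real thing)
 * it could be compiled into a single program. */
static void toy_table_freeze(struct toy_table *t)
{
	t->frozen = true;
}
```

The only way back to a mutable rule list in this model is to throw the table away and build a new one, which matches the remove-and-recreate escape hatch mentioned above.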
> Netfilter's chronic performance differential is why a lot of mindshare
> was lost to userspace networking technologies.
I think this is an unfair statement and also not true.
If you refer to the linear-ruleset-evaluation of iptables, this is
what ipset was added for.
Yes, it's a band-aid. But again, that problem comes from the UAPI
format/limitations of only having one source or destination address per
rule, a limitation not present in nftables.
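A rough illustration of the difference, as a userspace sketch. linear_match() walks one address per rule, as an iptables-style ruleset would; set_match() is a single lookup against a sorted set. Both function names are made up for this example, and the binary search is only a stand-in for a set lookup, not how ipset or nftables sets are actually implemented.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One address per rule: rules are checked one after another. */
static bool linear_match(const uint32_t *rule_addrs, size_t n,
			 uint32_t saddr, size_t *compares)
{
	for (size_t i = 0; i < n; i++) {
		(*compares)++;
		if (rule_addrs[i] == saddr)
			return true;
	}
	return false;
}

/* One rule referencing a set: here a binary search over a sorted array. */
static bool set_match(const uint32_t *sorted, size_t n,
		      uint32_t saddr, size_t *compares)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		(*compares)++;
		if (sorted[mid] == saddr)
			return true;
		if (sorted[mid] < saddr)
			lo = mid + 1;
		else
			hi = mid;
	}
	return false;
}
```

With n addresses the first variant needs up to n comparisons per packet, the second about log2(n); the compares counter is only there to make that visible.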
Another reason why iptables is a bit more costly than needed (although it
IS rather fast, given there are no spinlocks in the main eval loop) is the
rule counter updates, which were built into the design all those years ago.
Again, a problem solved in nftables by making the counters optional.
If you want to speed up the forward path with XDP -- fine.
But AFAIU it's still possible with XDP to have packets sent to the
full stack, right?
If so, it would even be possible to combine nftables with XDP, e.g.
by allowing an ebpf program running on the host CPU to query netfilter
conntrack.
  No entry -> push to normal path.
  Entry -> check 'fastpath' flag (which would be in nf_conn struct).
    Not set -> also normal path.
    Otherwise continue XDP, stack bypass.
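The decision above could be sketched like this. It is plain userspace C, not an XDP program, and no ebpf helper for conntrack lookups is assumed to exist; struct toy_tuple, struct toy_conn, toy_ct_lookup() and the verdict names are all hypothetical. In a real XDP program "normal path" would presumably correspond to returning XDP_PASS.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts for this sketch. */
enum toy_verdict { VERDICT_STACK, VERDICT_BYPASS };

struct toy_tuple {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t proto;
};

/* Stand-in for a conntrack entry carrying the proposed flag. */
struct toy_conn {
	struct toy_tuple tuple;
	bool fastpath;
};

static bool toy_tuple_equal(const struct toy_tuple *a,
			    const struct toy_tuple *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport &&
	       a->proto == b->proto;
}

/* Linear scan standing in for the conntrack hash lookup. */
static const struct toy_conn *toy_ct_lookup(const struct toy_conn *tbl,
					    size_t n,
					    const struct toy_tuple *key)
{
	for (size_t i = 0; i < n; i++)
		if (toy_tuple_equal(&tbl[i].tuple, key))
			return &tbl[i];
	return NULL;
}

static enum toy_verdict toy_classify(const struct toy_conn *tbl, size_t n,
				     const struct toy_tuple *key)
{
	const struct toy_conn *ct = toy_ct_lookup(tbl, n, key);

	if (!ct)
		return VERDICT_STACK;	/* no entry -> normal path */
	if (!ct->fastpath)
		return VERDICT_STACK;	/* entry, flag not set -> normal path */
	return VERDICT_BYPASS;		/* continue in XDP, bypass the stack */
}
```

Only flows that the normal path has already seen and explicitly marked ever take the bypass; everything else, including the first packets of every connection, still goes through the full stack and thus through the ruleset.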
nftables would have a rule similar to this:
  nft add rule inet forward ct state established ct label set fastpath
to switch such a conntrack entry to xdp mode.
This decision can then be combined with nftables infra,
for example 'fastpath for tcp flows that saw more than 1 Mbit of data
in either direction' or the like.
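As a sketch of what such a policy boils down to, in plain C rather than nft syntax: struct toy_flow_acct and toy_wants_fastpath() are hypothetical, and reading '1 Mbit' as 1,000,000 bits (125,000 bytes) is an assumption of this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_IPPROTO_TCP     6
/* Assumption: 1 Mbit taken as 1,000,000 bits, i.e. 125,000 bytes. */
#define TOY_FASTPATH_BYTES  (1000000u / 8)

/* Hypothetical per-flow byte accounting, one counter per direction. */
struct toy_flow_acct {
	uint8_t proto;
	uint64_t bytes_orig;
	uint64_t bytes_reply;
};

/* True for tcp flows that saw more than the threshold in either direction. */
static bool toy_wants_fastpath(const struct toy_flow_acct *f)
{
	if (f->proto != TOY_IPPROTO_TCP)
		return false;
	return f->bytes_orig > TOY_FASTPATH_BYTES ||
	       f->bytes_reply > TOY_FASTPATH_BYTES;
}
```

The point is that the policy stays in the ruleset: the fast path only consumes the resulting flag and does not need to know how it was decided.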
Yes, this needs ebpf support for conntrack and NAT transformations,
and it does raise the question of how to handle the details, e.g. conntrack
timeouts. I don't see any unsolvable issues with this though.
This also has similarities with the 'flow offload' proposal, i.e. we
could perhaps even reuse what we already have to provide flow
offload in software using eBPF/XDP as the offload backend.