From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Lorenzo Bianconi <lorenzo@kernel.org>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Nikolay Aleksandrov <nikolay@nvidia.com>,
bpf <bpf@vger.kernel.org>,
Network Development <netdev@vger.kernel.org>,
Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,
"David S. Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
David Ahern <dsahern@kernel.org>,
Yoshiki Komachi <komachi.yoshiki@gmail.com>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Andrii Nakryiko <andrii.nakryiko@gmail.com>,
Roopa Prabhu <roopa@nvidia.com>,
"bridge@lists.linux-foundation.org"
<bridge@lists.linux-foundation.org>,
Ido Schimmel <idosch@idosch.org>
Subject: Re: [RFC bpf-next 1/2] net: bridge: add unstable br_fdb_find_port_from_ifindex helper
Date: Wed, 26 Jan 2022 13:02:24 +0100 [thread overview]
Message-ID: <87ee4u3dtb.fsf@toke.dk> (raw)
In-Reply-To: <YfEr3Soy8YuJczHk@lore-desk>
Lorenzo Bianconi <lorenzo@kernel.org> writes:
>> On Mon, Jan 24, 2022 at 10:32 AM Nikolay Aleksandrov <nikolay@nvidia.com> wrote:
>> > >
>> > > +int br_fdb_find_port_from_ifindex(struct xdp_md *xdp_ctx,
>> > > + struct bpf_fdb_lookup *opt,
>> > > + u32 opt__sz)
>> > > +{
>> > > + struct xdp_buff *ctx = (struct xdp_buff *)xdp_ctx;
>> > > + struct net_bridge_port *port;
>> > > + struct net_device *dev;
>> > > + int ret = -ENODEV;
>> > > +
>> > > + BUILD_BUG_ON(sizeof(struct bpf_fdb_lookup) != NF_BPF_FDB_OPTS_SZ);
>> > > + if (!opt || opt__sz != sizeof(struct bpf_fdb_lookup))
>> > > + return -ENODEV;
>> > > +
>> > > + rcu_read_lock();
>> > > +
>> > > + dev = dev_get_by_index_rcu(dev_net(ctx->rxq->dev), opt->ifindex);
>> > > + if (!dev)
>> > > + goto out;
>>
>> imo that is way too much wrapping for an unstable helper.
>> The dev lookup is not cheap.
>>
>> With all the extra checks the XDP acceleration gets reduced.
>> I think it would be better to use kprobe/fentry on bridge
>> functions that operate on fdb and replicate necessary
>> data into bpf map.
>> Then xdp prog would do a single cheap lookup from that map
>> to figure out 'port'.
>
> ack, right. This is a very interesting approach. I will investigate
> it. Thanks.
I think it would be interesting to try both, and compare their
performance. I'm a bit sceptical about Alexei's assertion that
dev_get_by_index_rcu() is that expensive: we do such a lookup in the XDP
redirect code when using the non-map bpf_redirect() helper, and I have
not been able to measure a significant performance difference between
the map and non-map variants (after we added bulking to the latter).
If looking up devices by ifindex does turn out to be too expensive,
maybe what we really need is a way to pass around 'struct net_device'
pointers to BPF helpers, so a given BPF program only has to do the
lookup once if it's calling multiple dev-based helpers? I think this
should be doable with BTF, no?
-Toke
WARNING: multiple messages have this Message-ID (diff)
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Lorenzo Bianconi <lorenzo@kernel.org>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: "bridge@lists.linux-foundation.org"
<bridge@lists.linux-foundation.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Network Development <netdev@vger.kernel.org>,
David Ahern <dsahern@kernel.org>, Roopa Prabhu <roopa@nvidia.com>,
Yoshiki Komachi <komachi.yoshiki@gmail.com>,
Alexei Starovoitov <ast@kernel.org>,
Lorenzo Bianconi <lorenzo.bianconi@redhat.com>,
Ido Schimmel <idosch@idosch.org>,
Nikolay Aleksandrov <nikolay@nvidia.com>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Jakub Kicinski <kuba@kernel.org>, bpf <bpf@vger.kernel.org>,
Andrii Nakryiko <andrii.nakryiko@gmail.com>,
"David S. Miller" <davem@davemloft.net>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>
Subject: Re: [Bridge] [RFC bpf-next 1/2] net: bridge: add unstable br_fdb_find_port_from_ifindex helper
Date: Wed, 26 Jan 2022 13:02:24 +0100 [thread overview]
Message-ID: <87ee4u3dtb.fsf@toke.dk> (raw)
In-Reply-To: <YfEr3Soy8YuJczHk@lore-desk>
Lorenzo Bianconi <lorenzo@kernel.org> writes:
>> On Mon, Jan 24, 2022 at 10:32 AM Nikolay Aleksandrov <nikolay@nvidia.com> wrote:
>> > >
>> > > +int br_fdb_find_port_from_ifindex(struct xdp_md *xdp_ctx,
>> > > + struct bpf_fdb_lookup *opt,
>> > > + u32 opt__sz)
>> > > +{
>> > > + struct xdp_buff *ctx = (struct xdp_buff *)xdp_ctx;
>> > > + struct net_bridge_port *port;
>> > > + struct net_device *dev;
>> > > + int ret = -ENODEV;
>> > > +
>> > > + BUILD_BUG_ON(sizeof(struct bpf_fdb_lookup) != NF_BPF_FDB_OPTS_SZ);
>> > > + if (!opt || opt__sz != sizeof(struct bpf_fdb_lookup))
>> > > + return -ENODEV;
>> > > +
>> > > + rcu_read_lock();
>> > > +
>> > > + dev = dev_get_by_index_rcu(dev_net(ctx->rxq->dev), opt->ifindex);
>> > > + if (!dev)
>> > > + goto out;
>>
>> imo that is way too much wrapping for an unstable helper.
>> The dev lookup is not cheap.
>>
>> With all the extra checks the XDP acceleration gets reduced.
>> I think it would be better to use kprobe/fentry on bridge
>> functions that operate on fdb and replicate necessary
>> data into bpf map.
>> Then xdp prog would do a single cheap lookup from that map
>> to figure out 'port'.
>
> ack, right. This is a very interesting approach. I will investigate
> it. Thanks.
I think it would be interesting to try both, and compare their
performance. I'm a bit sceptical about Alexei's assertion that
dev_get_by_index_rcu() is that expensive: we do such a lookup in the XDP
redirect code when using the non-map bpf_redirect() helper, and I have
not been able to measure a significant performance difference between
the map and non-map variants (after we added bulking to the latter).
If looking up devices by ifindex does turn out to be too expensive,
maybe what we really need is a way to pass around 'struct net_device'
pointers to BPF helpers, so a given BPF program only has to do the
lookup once if it's calling multiple dev-based helpers? I think this
should be doable with BTF, no?
-Toke
next prev parent reply other threads:[~2022-01-26 12:02 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-01-24 17:20 [RFC bpf-next 0/2] introduce bpf fdb lookup helper for xdp Lorenzo Bianconi
2022-01-24 17:20 ` [RFC bpf-next 1/2] net: bridge: add unstable br_fdb_find_port_from_ifindex helper Lorenzo Bianconi
2022-01-24 17:50 ` Toke Høiland-Jørgensen
2022-01-26 11:42 ` Lorenzo Bianconi
2022-01-26 12:03 ` Kumar Kartikeya Dwivedi
2022-01-26 20:11 ` Andrii Nakryiko
2022-01-26 12:03 ` Toke Høiland-Jørgensen
2022-01-26 14:04 ` Lorenzo Bianconi
2022-01-24 18:32 ` Nikolay Aleksandrov
2022-01-24 18:32 ` [Bridge] " Nikolay Aleksandrov
2022-01-25 5:09 ` Alexei Starovoitov
2022-01-25 5:09 ` [Bridge] " Alexei Starovoitov
2022-01-26 11:09 ` Lorenzo Bianconi
2022-01-26 11:09 ` [Bridge] " Lorenzo Bianconi
2022-01-26 12:02 ` Toke Høiland-Jørgensen [this message]
2022-01-26 12:02 ` Toke Høiland-Jørgensen
2022-01-26 11:27 ` Lorenzo Bianconi
2022-01-26 11:27 ` [Bridge] " Lorenzo Bianconi
2022-01-26 12:08 ` Kumar Kartikeya Dwivedi
2022-01-26 12:08 ` [Bridge] " Kumar Kartikeya Dwivedi
2022-01-26 12:39 ` Nikolay Aleksandrov
2022-01-26 12:39 ` [Bridge] " Nikolay Aleksandrov
2022-01-26 12:50 ` Toke Høiland-Jørgensen
2022-01-26 12:50 ` [Bridge] " Toke Høiland-Jørgensen
2022-01-26 12:57 ` Nikolay Aleksandrov
2022-01-26 12:57 ` [Bridge] " Nikolay Aleksandrov
2022-01-26 15:00 ` Lorenzo Bianconi
2022-01-26 15:00 ` [Bridge] " Lorenzo Bianconi
2022-01-24 17:20 ` [RFC bpf-next 2/2] samples: bpf: add xdp fdb lookup program Lorenzo Bianconi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ee4u3dtb.fsf@toke.dk \
--to=toke@redhat.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=bridge@lists.linux-foundation.org \
--cc=brouer@redhat.com \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=idosch@idosch.org \
--cc=komachi.yoshiki@gmail.com \
--cc=kuba@kernel.org \
--cc=lorenzo.bianconi@redhat.com \
--cc=lorenzo@kernel.org \
--cc=memxor@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=nikolay@nvidia.com \
--cc=roopa@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.