From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Hangbin Liu <liuhangbin@gmail.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
"Jiri Benc" <jbenc@redhat.com>,
"Jesper Dangaard Brouer" <brouer@redhat.com>,
"Eelco Chaudron" <echaudro@redhat.com>,
ast@kernel.org, "Daniel Borkmann" <daniel@iogearbox.net>,
"Lorenzo Bianconi" <lorenzo.bianconi@redhat.com>,
"David Ahern" <dsahern@gmail.com>,
"Andrii Nakryiko" <andrii.nakryiko@gmail.com>,
"Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
"Björn Töpel" <bjorn.topel@gmail.com>
Subject: Re: [PATCHv3 bpf-next 2/4] xdp: extend xdp_redirect_map with broadcast support
Date: Thu, 01 Apr 2021 13:10:29 +0200 [thread overview]
Message-ID: <87k0pmntui.fsf@toke.dk> (raw)
In-Reply-To: <20210401043004.GE2900@Leo-laptop-t470s>
Hangbin Liu <liuhangbin@gmail.com> writes:
> On Wed, Mar 31, 2021 at 03:41:17PM +0200, Toke Høiland-Jørgensen wrote:
>> > @@ -1491,13 +1492,20 @@ static __always_inline int __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifind
>> > */
>> > ri->map_id = INT_MAX; /* Valid map id idr range: [1,INT_MAX[ */
>> > ri->map_type = BPF_MAP_TYPE_UNSPEC;
>> > - return flags;
>> > + return flags & BPF_F_ACTION_MASK;
>> > }
>> >
>> > ri->tgt_index = ifindex;
>> > ri->map_id = map->id;
>> > ri->map_type = map->map_type;
>> >
>> > + if ((map->map_type == BPF_MAP_TYPE_DEVMAP ||
>> > + map->map_type == BPF_MAP_TYPE_DEVMAP_HASH) &&
>> > + (flags & BPF_F_BROADCAST)) {
>> > + ri->flags = flags;
>>
>> This, combined with this:
>>
>> [...]
>>
>> > + if (ri->flags & BPF_F_BROADCAST) {
>> > + map = READ_ONCE(ri->map);
>> > + WRITE_ONCE(ri->map, NULL);
>> > + }
>> > +
>> > switch (map_type) {
>> > case BPF_MAP_TYPE_DEVMAP:
>> > fallthrough;
>> > case BPF_MAP_TYPE_DEVMAP_HASH:
>> > - err = dev_map_enqueue(fwd, xdp, dev);
>> > + if (ri->flags & BPF_F_BROADCAST)
>> > + err = dev_map_enqueue_multi(xdp, dev, map,
>> > + ri->flags & BPF_F_EXCLUDE_INGRESS);
>> > + else
>> > + err = dev_map_enqueue(fwd, xdp, dev);
>>
>> Means that (since the flags value is never cleared again) once someone
>> has done a broadcast redirect, that's all they'll ever get until the
>> next reboot ;)
>
> How about just get the ri->flags first and clean it directly. e.g.
>
> flags = ri->flags;
> ri->flags = 0;
That would fix the "until next reboot" issue, but would still render
bpf_clear_redirect_map() useless. So you still need to check ri->map and
if you do that you don't actually need to clear the flag field as long
as it is set correctly whenever the map pointer is.
> With this we don't need to add an extra field ri->exclude_ingress as you
> mentioned later.
The ri->exclude_ingress would be *instead* of the flags field. You could
of course also just keep the flags field, but turning it into a bool
makes it obvious that only one of the bits is actually used (and thus
easier to see that it's correct to not clear it).
> People may also need the flags field in future.
In which case they can add it back at that time :)
>> Also, the bpf_clear_redirect_map() call has no effect since the call to
>> dev_map_enqueue_multi() only checks the flags and not the value of the
>> map pointer before deciding which enqueue function to call.
>>
>> To fix both of these, how about changing the logic so that:
>>
>> - __bpf_xdp_redirect_map() sets the map pointer if the broadcast flag is
>> set (and clears it if the flag isn't set!)
>
> OK
>>
>> - xdp_do_redirect() does the READ_ONCE/WRITE_ONCE on ri->map
>> unconditionally and then dispatches to dev_map_enqueue_multi() if the
>> read resulted in a non-NULL pointer
>>
>> Also, it should be invalid to set the broadcast flag with a map type
>> other than a devmap; __bpf_xdp_redirect_map() should check that.
>
> The current code only do if (unlikely(flags > XDP_TX)) and didn't check the
> map type. I also only set the map when there has devmap + broadcast flag.
> Do you mean we need a more strict check? like
>
> if (unlikely((flags & ~(BPF_F_ACTION_MASK | BPF_F_REDIR_MASK)) ||
> (map->map_type != BPF_MAP_TYPE_DEVMAP &&
> map->map_type != BPF_MAP_TYPE_DEVMAP_HASH &&
> flags & BPF_F_REDIR_MASK)))
Yeah, that's what I meant, but when you type it out that does seem like
a bit too many checks.
However, I think we can do something different: Since Björn has
helpfully split out the helper functions for the different map types, we
can add another argument to __bpf_xdp_redirect_map() which is the mask
of valid flag values. With this, dev_{hash_,}map_redirect() can include
BPF_F_REDIR_MASK in the valid flags, and {xsk,cpu}_map_redirect() can
leave them out. That makes the code do the right thing without actually
adding any more checks in the fast path :)
(You'd still need to AND the return value with BPF_F_ACTION_MASK when
returning, of course).
-Toke
next prev parent reply other threads:[~2021-04-01 11:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-25 9:27 [PATCHv3 bpf-next 0/4] xdp: extend xdp_redirect_map with broadcast support Hangbin Liu
2021-03-25 9:27 ` [PATCHv3 bpf-next 1/4] bpf: run devmap xdp_prog on flush instead of bulk enqueue Hangbin Liu
2021-03-25 9:27 ` [PATCHv3 bpf-next 2/4] xdp: extend xdp_redirect_map with broadcast support Hangbin Liu
2021-03-31 13:41 ` Toke Høiland-Jørgensen
2021-04-01 4:30 ` Hangbin Liu
2021-04-01 11:10 ` Toke Høiland-Jørgensen [this message]
2021-03-25 9:27 ` [PATCHv3 bpf-next 3/4] sample/bpf: add xdp_redirect_map_multi for redirect_map broadcast test Hangbin Liu
2021-03-25 9:27 ` [PATCHv3 bpf-next 4/4] selftests/bpf: add xdp_redirect_multi test Hangbin Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87k0pmntui.fsf@toke.dk \
--to=toke@redhat.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=ast@kernel.org \
--cc=bjorn.topel@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=brouer@redhat.com \
--cc=daniel@iogearbox.net \
--cc=dsahern@gmail.com \
--cc=echaudro@redhat.com \
--cc=jbenc@redhat.com \
--cc=john.fastabend@gmail.com \
--cc=liuhangbin@gmail.com \
--cc=lorenzo.bianconi@redhat.com \
--cc=maciej.fijalkowski@intel.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).