From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Hangbin Liu <liuhangbin@gmail.com>
Cc: bpf@vger.kernel.org, netdev@vger.kernel.org,
"Jiri Benc" <jbenc@redhat.com>,
"Jesper Dangaard Brouer" <brouer@redhat.com>,
"Eelco Chaudron" <echaudro@redhat.com>,
ast@kernel.org, "Daniel Borkmann" <daniel@iogearbox.net>,
"Lorenzo Bianconi" <lorenzo.bianconi@redhat.com>,
"David Ahern" <dsahern@gmail.com>,
"Andrii Nakryiko" <andrii.nakryiko@gmail.com>,
"Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
"Björn Töpel" <bjorn.topel@gmail.com>
Subject: Re: [PATCHv3 bpf-next 2/4] xdp: extend xdp_redirect_map with broadcast support
Date: Thu, 01 Apr 2021 13:10:29 +0200 [thread overview]
Message-ID: <87k0pmntui.fsf@toke.dk> (raw)
In-Reply-To: <20210401043004.GE2900@Leo-laptop-t470s>
Hangbin Liu <liuhangbin@gmail.com> writes:
> On Wed, Mar 31, 2021 at 03:41:17PM +0200, Toke Høiland-Jørgensen wrote:
>> > @@ -1491,13 +1492,20 @@ static __always_inline int __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifind
>> > */
>> > ri->map_id = INT_MAX; /* Valid map id idr range: [1,INT_MAX[ */
>> > ri->map_type = BPF_MAP_TYPE_UNSPEC;
>> > - return flags;
>> > + return flags & BPF_F_ACTION_MASK;
>> > }
>> >
>> > ri->tgt_index = ifindex;
>> > ri->map_id = map->id;
>> > ri->map_type = map->map_type;
>> >
>> > + if ((map->map_type == BPF_MAP_TYPE_DEVMAP ||
>> > + map->map_type == BPF_MAP_TYPE_DEVMAP_HASH) &&
>> > + (flags & BPF_F_BROADCAST)) {
>> > + ri->flags = flags;
>>
>> This, combined with this:
>>
>> [...]
>>
>> > + if (ri->flags & BPF_F_BROADCAST) {
>> > + map = READ_ONCE(ri->map);
>> > + WRITE_ONCE(ri->map, NULL);
>> > + }
>> > +
>> > switch (map_type) {
>> > case BPF_MAP_TYPE_DEVMAP:
>> > fallthrough;
>> > case BPF_MAP_TYPE_DEVMAP_HASH:
>> > - err = dev_map_enqueue(fwd, xdp, dev);
>> > + if (ri->flags & BPF_F_BROADCAST)
>> > + err = dev_map_enqueue_multi(xdp, dev, map,
>> > + ri->flags & BPF_F_EXCLUDE_INGRESS);
>> > + else
>> > + err = dev_map_enqueue(fwd, xdp, dev);
>>
>> Means that (since the flags value is never cleared again) once someone
>> has done a broadcast redirect, that's all they'll ever get until the
>> next reboot ;)
>
> How about just get the ri->flags first and clean it directly. e.g.
>
> flags = ri->flags;
> ri->flags = 0;
That would fix the "until next reboot" issue, but would still render
bpf_clear_redirect_map() useless. So you still need to check ri->map and
if you do that you don't actually need to clear the flag field as long
as it is set correctly whenever the map pointer is.
> With this we don't need to add an extra field ri->exclude_ingress as you
> mentioned later.
The ri->exclude_ingress would be *instead* of the flags field. You could
of course also just keep the flags field, but turning it into a bool
makes it obvious that only one of the bits is actually used (and thus
easier to see that it's correct to not clear it).
> People may also need the flags field in future.
In which case they can add it back at that time :)
>> Also, the bpf_clear_redirect_map() call has no effect since the call to
>> dev_map_enqueue_multi() only checks the flags and not the value of the
>> map pointer before deciding which enqueue function to call.
>>
>> To fix both of these, how about changing the logic so that:
>>
>> - __bpf_xdp_redirect_map() sets the map pointer if the broadcast flag is
>> set (and clears it if the flag isn't set!)
>
> OK
>>
>> - xdp_do_redirect() does the READ_ONCE/WRITE_ONCE on ri->map
>> unconditionally and then dispatches to dev_map_enqueue_multi() if the
>> read resulted in a non-NULL pointer
>>
>> Also, it should be invalid to set the broadcast flag with a map type
>> other than a devmap; __bpf_xdp_redirect_map() should check that.
>
> The current code only do if (unlikely(flags > XDP_TX)) and didn't check the
> map type. I also only set the map when there has devmap + broadcast flag.
> Do you mean we need a more strict check? like
>
> if (unlikely((flags & ~(BPF_F_ACTION_MASK | BPF_F_REDIR_MASK)) ||
> (map->map_type != BPF_MAP_TYPE_DEVMAP &&
> map->map_type != BPF_MAP_TYPE_DEVMAP_HASH &&
> flags & BPF_F_REDIR_MASK)))
Yeah, that's what I meant, but when you type it out that does seem like
a bit too many checks.
However, I think we can do something different: Since Björn has
helpfully split out the helper functions for the different map types, we
can add another argument to __bpf_xdp_redirect_map() which is the mask
of valid flag values. With this, dev_{hash_,}map_redirect() can include
BPF_F_REDIR_MASK in the valid flags, and {xsk,cpu}_map_redirect() can
leave them out. That makes the code do the right thing without actually
adding any more checks in the fast path :)
(You'd still need to AND the return value with BPF_F_ACTION_MASK when
returning, of course).
-Toke
next prev parent reply other threads:[~2021-04-01 11:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-25 9:27 [PATCHv3 bpf-next 0/4] xdp: extend xdp_redirect_map with broadcast support Hangbin Liu
2021-03-25 9:27 ` [PATCHv3 bpf-next 1/4] bpf: run devmap xdp_prog on flush instead of bulk enqueue Hangbin Liu
2021-03-25 9:27 ` [PATCHv3 bpf-next 2/4] xdp: extend xdp_redirect_map with broadcast support Hangbin Liu
2021-03-31 13:41 ` Toke Høiland-Jørgensen
2021-04-01 4:30 ` Hangbin Liu
2021-04-01 11:10 ` Toke Høiland-Jørgensen [this message]
2021-03-25 9:27 ` [PATCHv3 bpf-next 3/4] sample/bpf: add xdp_redirect_map_multi for redirect_map broadcast test Hangbin Liu
2021-03-25 9:27 ` [PATCHv3 bpf-next 4/4] selftests/bpf: add xdp_redirect_multi test Hangbin Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87k0pmntui.fsf@toke.dk \
--to=toke@redhat.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii.nakryiko@gmail.com \
--cc=ast@kernel.org \
--cc=bjorn.topel@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=brouer@redhat.com \
--cc=daniel@iogearbox.net \
--cc=dsahern@gmail.com \
--cc=echaudro@redhat.com \
--cc=jbenc@redhat.com \
--cc=john.fastabend@gmail.com \
--cc=liuhangbin@gmail.com \
--cc=lorenzo.bianconi@redhat.com \
--cc=maciej.fijalkowski@intel.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.