From: John Fastabend <john.fastabend@gmail.com>
To: Alexei Starovoitov <ast@fb.com>,
Jesper Dangaard Brouer <brouer@redhat.com>,
Andy Gospodarek <andy@greyhouse.net>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Daniel Borkmann <daniel@iogearbox.net>,
Daniel Borkmann <borkmann@iogearbox.net>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"xdp-newbies@vger.kernel.org" <xdp-newbies@vger.kernel.org>
Subject: Re: xdp_redirect ifindex vs port. Was: best API for returning/setting egress port?
Date: Thu, 27 Apr 2017 22:06:13 -0700
Message-ID: <5902CDC5.5010209@gmail.com>
In-Reply-To: <53e9dd2f-f40a-b43b-99c9-62f5ce3a665c@fb.com>
On 17-04-27 04:31 PM, Alexei Starovoitov wrote:
> On 4/27/17 1:41 AM, Jesper Dangaard Brouer wrote:
>> When registering/attaching an XDP/bpf program, we would just send the
>> file-descriptor for this port-map along (like we do with the bpf_prog
>> FD), plus the program's own ingress-port number in the port-map.
>>
>> It is not clear to me, in-which-data-structure on the kernel-side we
>> store this reference to the port-map and ingress-port. As today we only
>> have the "raw" struct bpf_prog pointer. I see several options:
>>
>> 1. Create a new xdp_prog struct that contains existing bpf_prog,
>> a port-map pointer and ingress-port. (IMHO easiest solution)
>>
>> 2. Just create a new pointer to port-map and store it in driver rx-ring
>> struct (like existing bpf_prog), but this creates a race when
>> replacing (cmpxchg) the program (or perhaps it's not a problem as it
>> runs under rcu and RTNL-lock).
>>
>> 3. Extend bpf_prog to store this port-map and ingress-port, and have a
>> fast-way to access it. I assume it will be accessible via
>> bpf_prog->bpf_prog_aux->used_maps[X] but it will be too slow for XDP.
>
> I'm not sure I completely follow the 3 proposals.
> Are you suggesting to have only one netdev_array per program?
> Why not to allow any number like we do for tailcall+prog_array, etc?
> We can teach verifier to allow new helper
> bpf_tx_port(netdev_array, port_num);
> to only be used with netdev_array map type.
> It will fetch netdevice pointer from netdev_array[port_num]
> and will tx the packet into it.
> We can make it similar to bpf_tail_call(), so that program will
> finish on successful bpf_tx_port() or
> make it into 'delayed' tx which will be executed when program finishes.
> Not sure which approach is better.
My inclination is to make it finish on success, but I would like to write
a few programs first and see. I can't think of any use case where we would
_not_ want to terminate, but maybe there is something I'm missing.
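
To make the two options concrete, here is a rough userspace model, not
kernel code. Everything in it is made up for illustration: the struct
netdev_array layout, struct run_ctx, and the tx_port_terminating() and
tx_port_delayed() names. The terminating variant uses longjmp() to mimic
bpf_tail_call() not returning on success; the delayed variant only records
the target and lets the program run to its end.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

#define MAX_PORTS 4

/* Hypothetical stand-in for a netdev_array map: slot -> ifindex, 0 = empty. */
struct netdev_array {
	int ifindex[MAX_PORTS];
};

/* Per-run state kept by the mock "kernel" side (illustrative only). */
struct run_ctx {
	jmp_buf exit_point;  /* where a terminating tx jumps back to */
	int tx_ifindex;      /* device chosen for transmit, 0 = none */
	int steps_after_tx;  /* program work executed after the tx call */
};

/* Variant A: like bpf_tail_call(), does not return on success. */
static int tx_port_terminating(struct run_ctx *ctx,
			       const struct netdev_array *map,
			       unsigned int port)
{
	if (port >= MAX_PORTS || map->ifindex[port] == 0)
		return -1;              /* failure: the program keeps running */
	ctx->tx_ifindex = map->ifindex[port];
	longjmp(ctx->exit_point, 1);    /* success: the program ends here */
}

/* Variant B: only records the target; tx happens after the program ends. */
static int tx_port_delayed(struct run_ctx *ctx,
			   const struct netdev_array *map,
			   unsigned int port)
{
	if (port >= MAX_PORTS || map->ifindex[port] == 0)
		return -1;
	ctx->tx_ifindex = map->ifindex[port];
	return 0;
}

typedef void (*prog_fn)(struct run_ctx *, const struct netdev_array *,
			unsigned int);

static void prog_terminating(struct run_ctx *ctx,
			     const struct netdev_array *map, unsigned int port)
{
	tx_port_terminating(ctx, map, port);
	ctx->steps_after_tx++;          /* reached only if the tx failed */
}

static void prog_delayed(struct run_ctx *ctx,
			 const struct netdev_array *map, unsigned int port)
{
	tx_port_delayed(ctx, map, port);
	ctx->steps_after_tx++;          /* always reached */
}

/* Mock of the driver hook: run the program, then act on the recorded port. */
static void run(prog_fn prog, struct run_ctx *ctx,
		const struct netdev_array *map, unsigned int port)
{
	assert(prog != NULL && ctx != NULL && map != NULL);
	ctx->tx_ifindex = 0;
	ctx->steps_after_tx = 0;
	if (setjmp(ctx->exit_point) == 0)
		prog(ctx, map, port);
	/* A real implementation would transmit to ctx->tx_ifindex here. */
}
```

In this model the only observable difference is whether code after the
helper call still runs on success, which is the case I have not found a
use for yet.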
>
> We can also extend this netdev_array into broadcast/multicast. Like
> bpf_tx_allports(&netdev_array);
> call from the program will xmit the packet to all netdevices
> in that 'netdev_array' map type.
Yep, nice solution to the multicast problem.
>
> The map-in-map support can be trivially extended to allow netdev_array,
> then the program can create N multicast groups of netdevices.
> Each multicast group == one netdev_array map.
> The user space will populate a hashmap with these netdev_arrays and
> bpf kernel side can select dynamically which multicast group to use
> to send the packets to.
> bpf kernel side may look like:
>
>   struct bpf_netdev_array *netdev_array = bpf_map_lookup_elem(&hash, key);
>
>   if (!netdev_array)
>       ...
>   if (my_condition)
>       bpf_tx_allports(netdev_array); /* broadcast to all netdevices */
>   else
>       bpf_tx_port(netdev_array, port_num); /* tx into one netdevice */
>
> that's an artificial example. Just trying to point out
> that we shouldn't restrict the feature too soon.
>
That is more or less what I was thinking as well. The other question I
have, though, is whether we should also have a bpf_redirect() call for the
simple case where I use the ifindex directly. This would help when porting
existing programs from tc_cls to xdp. I think it makes sense to have all
three: bpf_tx_allports(), bpf_tx_port(), and bpf_redirect().
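
A rough userspace sketch of how the three calls would differ from the
program's point of view. This is not kernel code, and the mock_* names,
struct netdev_array layout, and struct tx_log are assumptions for
illustration only. The point is just the addressing: the redirect takes an
ifindex directly, the single-port helper takes a slot in a map, and the
all-ports helper walks every populated slot.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PORTS 4
#define MAX_TX    8

/* Hypothetical stand-in for a netdev_array map: slot -> ifindex, 0 = empty. */
struct netdev_array {
	int ifindex[MAX_PORTS];
};

/* Records which devices a packet was sent to, in order. */
struct tx_log {
	int ifindex[MAX_TX];
	size_t count;
};

/* Common tail: "transmit" by appending the target ifindex to the log. */
static int emit(struct tx_log *log, int ifindex)
{
	assert(log != NULL);
	if (ifindex <= 0 || log->count >= MAX_TX)
		return -1;
	log->ifindex[log->count++] = ifindex;
	return 0;
}

/* Simple case: the program already knows the ifindex (as tc_cls programs do). */
static int mock_redirect(struct tx_log *log, int ifindex)
{
	return emit(log, ifindex);
}

/* Map case: the program names a port; the map supplies the device. */
static int mock_tx_port(struct tx_log *log, const struct netdev_array *map,
			unsigned int port)
{
	if (port >= MAX_PORTS)
		return -1;
	return emit(log, map->ifindex[port]);
}

/* Broadcast case: send to every populated slot; returns devices reached. */
static int mock_tx_allports(struct tx_log *log, const struct netdev_array *map)
{
	int sent = 0;

	for (unsigned int port = 0; port < MAX_PORTS; port++) {
		if (map->ifindex[port] != 0 &&
		    emit(log, map->ifindex[port]) == 0)
			sent++;
	}
	return sent;
}
```

With a map holding ifindex 3 in slot 0 and ifindex 5 in slot 2,
mock_redirect(&log, 3) and mock_tx_port(&log, &map, 0) reach the same
device; only the second one lets user space retarget the port without
touching the program.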
.John