From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: Cong Wang <xiyou.wangcong@gmail.com>
Cc: "Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Andrii Nakryiko" <andrii@kernel.org>,
"Martin KaFai Lau" <martin.lau@linux.dev>,
"Song Liu" <song@kernel.org>, "Yonghong Song" <yhs@fb.com>,
"John Fastabend" <john.fastabend@gmail.com>,
"KP Singh" <kpsingh@kernel.org>,
"Stanislav Fomichev" <sdf@google.com>,
"Hao Luo" <haoluo@google.com>, "Jiri Olsa" <jolsa@kernel.org>,
"David S. Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"Björn Töpel" <bjorn@kernel.org>,
"Magnus Karlsson" <magnus.karlsson@intel.com>,
"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
"Jonathan Lemon" <jonathan.lemon@gmail.com>,
"Mykola Lysenko" <mykolal@fb.com>,
"Kumar Kartikeya Dwivedi" <memxor@gmail.com>,
netdev@vger.kernel.org, bpf@vger.kernel.org,
"Freysteinn Alfredsson" <freysteinn.alfredsson@kau.se>
Subject: Re: [RFC PATCH 00/17] xdp: Add packet queueing and scheduling capabilities
Date: Mon, 18 Jul 2022 14:45:05 +0200
Message-ID: <87sfmylhda.fsf@toke.dk>
In-Reply-To: <YtRLC5ILXZOre8D7@pop-os.localdomain>

Cong Wang <xiyou.wangcong@gmail.com> writes:

> On Wed, Jul 13, 2022 at 01:14:08PM +0200, Toke Høiland-Jørgensen wrote:
>> Packet forwarding is an important use case for XDP, which offers
>> significant performance improvements compared to forwarding using the
>> regular networking stack. However, XDP currently offers no mechanism to
>> delay, queue or schedule packets, which limits the practical uses for
>> XDP-based forwarding to those where the capacity of input and output links
>> always match each other (i.e., no rate transitions or many-to-one
>> forwarding). It also prevents an XDP-based router from doing any kind of
>> traffic shaping or reordering to enforce policy.
>>
>
> Sorry for forgetting to respond to your email to my patchset.
>
> The most important question from you is actually why I give up on PIFO.
> Actually its limitation is already in its name, its name Push In First
> Out already says clearly that it only allows to dequeue the first one.
> Still confusing?
>
> You can take a look at your pifo_map_pop_elem(), which is the
> implementation for bpf_map_pop_elem(), which is:
>
> long bpf_map_pop_elem(struct bpf_map *map, void *value)
>
> Clearly, there is no even 'key' in its parameter list. If you just
> compare it to mine:
>
> BPF_CALL_2(bpf_skb_map_pop, struct bpf_map *, map, u64, key)
>
> Is their difference now 100% clear? :)
>
> The next question is why this is important (actually it is the most
> important)? Because we (I mean for eBPF Qdisc users, not sure about you)
> want the programmability, which I have been emphasizing since V1...

Right, I realise that in a strictly abstract sense, only being able to
dequeue at the head is a limitation. However, what I'm missing is what
concrete thing that limitation prevents you from implementing (see my
reply to your other email about LSTF)? I'm really not trying to be
disingenuous, I have no interest in ending up with a map primitive that
turns out to be limiting down the road...

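To make the distinction being discussed concrete, here is a minimal
userspace sketch of the two dequeue styles: a head-only pop (no key
argument, as in the bpf_map_pop_elem() signature quoted above) next to a
keyed pop. Everything in it is an assumption made for illustration: the
names struct pifo, pifo_push(), pifo_pop() and keyed_pop() are
hypothetical, and the sorted-array layout is not how the PIFO map in the
patch series, or the skb map in the other patchset, is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PIFO_CAP 16 /* arbitrary capacity, chosen for the illustration */

struct pifo_entry {
	uint64_t rank;  /* scheduling priority; lowest rank dequeues first */
	uint32_t value; /* stand-in for a packet reference */
};

struct pifo {
	struct pifo_entry e[PIFO_CAP]; /* kept sorted by ascending rank */
	size_t len;
};

/* "Push in": insert at the position given by rank. Entries of equal
 * rank keep their arrival order. Returns false if the queue is full. */
bool pifo_push(struct pifo *q, uint64_t rank, uint32_t value)
{
	size_t i;

	assert(q);
	if (q->len == PIFO_CAP)
		return false;
	/* Shift strictly higher-ranked entries up by one slot. */
	for (i = q->len; i > 0 && q->e[i - 1].rank > rank; i--)
		q->e[i] = q->e[i - 1];
	q->e[i].rank = rank;
	q->e[i].value = value;
	q->len++;
	return true;
}

/* Hypothetical helper: remove the entry at index idx, closing the gap. */
static void pifo_remove_at(struct pifo *q, size_t idx, uint32_t *value)
{
	size_t i;

	*value = q->e[idx].value;
	for (i = idx + 1; i < q->len; i++)
		q->e[i - 1] = q->e[i];
	q->len--;
}

/* "First out": there is no key argument, only the head can be removed. */
bool pifo_pop(struct pifo *q, uint32_t *value)
{
	assert(q && value);
	if (q->len == 0)
		return false;
	pifo_remove_at(q, 0, value);
	return true;
}

/* Keyed variant, for contrast: remove the first entry whose rank
 * equals key, wherever it sits in the queue. */
bool keyed_pop(struct pifo *q, uint64_t key, uint32_t *value)
{
	size_t i;

	assert(q && value);
	for (i = 0; i < q->len; i++) {
		if (q->e[i].rank == key) {
			pifo_remove_at(q, i, value);
			return true;
		}
	}
	return false;
}
```

With entries pushed at ranks 30, 10 and 20, pifo_pop() can only ever
return the rank-10 entry first, while keyed_pop() with key 20 can take
the middle entry directly. Whether a real scheduling algorithm needs the
latter is exactly the open question here.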
> BTW, what is _your_ use case for skb map and user-space PIFO map? I am
> sure you have uses case for XDP, it is unclear what you have for other
> cases. Please don't piggy back use cases you don't have, we all have
> to justify all use cases. :)

You keep talking about the SKB and XDP paths as though they're
completely separate things. They're not: we're adding a general
capability to the kernel to implement programmable packet queueing using
BPF. One is for packets forwarded from the XDP hook, the other is for
packets coming from the stack. In an ideal world we'd only need one hook
that could handle both paths, but as the discussion I linked to in my
cover letter shows, that is probably not going to be feasible.

So we'll most likely end up with two hooks, but as far as use cases are
concerned they are the same: I want to make sure that the primitives we
add are expressive enough to implement every conceivable queueing and
scheduling algorithm. I don't view the two efforts as being in
competition with each other either; we're really trying to achieve the
same thing here, so let's work together to make sure we end up with
something that works for both the XDP and qdisc layers? :)

The reason I mention the SKB path in particular in this series is that I
want to make sure we make the two as compatible as we can, so as not to
unnecessarily fragment the ecosystem. Sharing primitives is the obvious
way to do that.

-Toke