public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Toke Høiland-Jørgensen" <toke@redhat.com>
To: 黄杰 <huangjie.albert@bytedance.com>
Cc: "David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Björn Töpel" <bjorn@kernel.org>,
	"Magnus Karlsson" <magnus.karlsson@intel.com>,
	"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
	"Jonathan Lemon" <jonathan.lemon@gmail.com>,
	"Pavel Begunkov" <asml.silence@gmail.com>,
	"Yunsheng Lin" <linyunsheng@huawei.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Richard Gobert" <richardbgobert@gmail.com>,
	"open list:NETWORKING DRIVERS" <netdev@vger.kernel.org>,
	"open list" <linux-kernel@vger.kernel.org>,
	"open list:XDP (eXpress Data Path)" <bpf@vger.kernel.org>
Subject: Re: Re: [RFC v3 Optimizing veth xsk performance 0/9]
Date: Wed, 09 Aug 2023 11:06:23 +0200	[thread overview]
Message-ID: <87msz04mb4.fsf@toke.dk> (raw)
In-Reply-To: <CABKxMyNrwSOrzpq6mhqtU_kEk5B9nKPODtmfjJO5_NmGpw_Oag@mail.gmail.com>

黄杰 <huangjie.albert@bytedance.com> writes:

> Toke Høiland-Jørgensen <toke@redhat.com> 于2023年8月8日周二 20:01写道:
>>
>> Albert Huang <huangjie.albert@bytedance.com> writes:
>>
>> > AF_XDP is a kernel bypass technology that can greatly improve performance.
>> > However,for virtual devices like veth,even with the use of AF_XDP sockets,
>> > there are still many additional software paths that consume CPU resources.
>> > This patch series focuses on optimizing the performance of AF_XDP sockets
>> > for veth virtual devices. Patches 1 to 4 mainly involve preparatory work.
>> > Patch 5 introduces tx queue and tx napi for packet transmission, while
>> > patch 8 primarily implements batch sending for IPv4 UDP packets, and patch 9
>> > add support for AF_XDP tx need_wakup feature. These optimizations significantly
>> > reduce the software path and support checksum offload.
>> >
>> > I tested those feature with
>> > A typical topology is shown below:
>> > client(send):                                        server:(recv)
>> > veth<-->veth-peer                                    veth1-peer<--->veth1
>> >   1       |                                                  |   7
>> >           |2                                                6|
>> >           |                                                  |
>> >         bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
>> >                   3                    4                 5
>> >              (machine1)                              (machine2)
>>
>> I definitely applaud the effort to improve the performance of af_xdp
>> over veth, this is something we have flagged as in need of improvement
>> as well.
>>
>> However, looking through your patch series, I am less sure that the
>> approach you're taking here is the right one.
>>
>> AFAIU (speaking about the TX side here), the main difference between
>> AF_XDP ZC and the regular transmit mode is that in the regular TX mode
>> the stack will allocate an skb to hold the frame and push that down the
>> stack. Whereas in ZC mode, there's a driver NDO that gets called
>> directly, bypassing the skb allocation entirely.
>>
>> In this series, you're implementing the ZC mode for veth, but the driver
>> code ends up allocating an skb anyway. Which seems to be a bit of a
>> weird midpoint between the two modes, and adds a lot of complexity to
>> the driver that (at least conceptually) is mostly just a
>> reimplementation of what the stack does in non-ZC mode (allocate an skb
>> and push it through the stack).
>>
>> So my question is, why not optimise the non-zc path in the stack instead
>> of implementing the zc logic for veth? It seems to me that it would be
>> quite feasible to apply the same optimisations (bulking, and even GRO)
>> to that path and achieve the same benefits, without having to add all
>> this complexity to the veth driver?
>>
>> -Toke
>>
> thanks!
> This idea is really good indeed. You've reminded me, and that's
> something I overlooked. I will now consider implementing the solution
> you've proposed and test the performance enhancement.

Sounds good, thanks! :)

-Toke


  reply	other threads:[~2023-08-09  9:07 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-08  3:19 [RFC v3 Optimizing veth xsk performance 0/9] Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 1/9] veth: Implement ethtool's get_ringparam() callback Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 2/9] xsk: add dma_check_skip for skipping dma check Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 3/9] veth: add support for send queue Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 4/9] xsk: add xsk_tx_completed_addr function Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 5/9] veth: use send queue tx napi to xmit xsk tx desc Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 6/9] veth: add ndo_xsk_wakeup callback for veth Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 7/9] sk_buff: add destructor_arg_xsk_pool for zero copy Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 8/9] veth: af_xdp tx batch support for ipv4 udp Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 9/9] veth: add support for AF_XDP tx need_wakup feature Albert Huang
2023-08-08 12:01 ` [RFC v3 Optimizing veth xsk performance 0/9] Toke Høiland-Jørgensen
2023-08-09  7:13   ` 黄杰
2023-08-09  9:06     ` Toke Høiland-Jørgensen [this message]
2023-08-09 11:09       ` Jesper Dangaard Brouer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87msz04mb4.fsf@toke.dk \
    --to=toke@redhat.com \
    --cc=asml.silence@gmail.com \
    --cc=ast@kernel.org \
    --cc=bjorn@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=hawk@kernel.org \
    --cc=huangjie.albert@bytedance.com \
    --cc=john.fastabend@gmail.com \
    --cc=jonathan.lemon@gmail.com \
    --cc=keescook@chromium.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linyunsheng@huawei.com \
    --cc=maciej.fijalkowski@intel.com \
    --cc=magnus.karlsson@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=richardbgobert@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox