From: Jesper Dangaard Brouer <hawk@kernel.org>
To: "huangjie.albert" <huangjie.albert@bytedance.com>,
	davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, Maryam Tahhan <mtahhan@redhat.com>,
	Keith Wiles <keith.wiles@intel.com>,
	Liang Chen <liangchen.linux@gmail.com>
Cc: "Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"Björn Töpel" <bjorn@kernel.org>,
	"Magnus Karlsson" <magnus.karlsson@intel.com>,
	"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
	"Jonathan Lemon" <jonathan.lemon@gmail.com>,
	"Pavel Begunkov" <asml.silence@gmail.com>,
	"Yunsheng Lin" <linyunsheng@huawei.com>,
	"Kees Cook" <keescook@chromium.org>,
	"Richard Gobert" <richardbgobert@gmail.com>,
	"open list:NETWORKING DRIVERS" <netdev@vger.kernel.org>,
	"open list" <linux-kernel@vger.kernel.org>,
	"open list:XDP (eXpress Data Path)" <bpf@vger.kernel.org>
Subject: Re: [RFC Optimizing veth xsk performance 00/10]
Date: Thu, 3 Aug 2023 17:01:37 +0200
Message-ID: <ae2ef15a-c601-eb5d-66bc-edaae6bda1c3@kernel.org>
In-Reply-To: <20230803140441.53596-1-huangjie.albert@bytedance.com>



On 03/08/2023 16.04, huangjie.albert wrote:
> AF_XDP is a kernel bypass technology that can greatly improve performance.
> However, for virtual devices like veth, even with the use of AF_XDP sockets,
> there are still many additional software paths that consume CPU resources.
> This patch series focuses on optimizing the performance of AF_XDP sockets
> for veth virtual devices. Patches 1 to 4 mainly involve preparatory work.
> Patch 5 introduces tx queue and tx napi for packet transmission, while
> patch 9 primarily implements zero-copy, and patch 10 adds support for
> batch sending of IPv4 UDP packets. These optimizations significantly reduce
> the software path and support checksum offload.
> 
> I tested these features with the typical topology shown below:
> veth<-->veth-peer                                    veth1-peer<--->veth1
> 	1       |                                                  |   7
> 	        |2                                                6|
> 	        |                                                  |
> 	      bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
>                    3                    4                 5
>               (machine1)                              (machine2)
> AF_XDP sockets are attached to veth and veth1 and send packets to the physical NIC (eth0).
> veth:(172.17.0.2/24)
> bridge:(172.17.0.1/24)
> eth0:(192.168.156.66/24)
> 
> veth1:(172.17.0.2/24)
> bridge1:(172.17.0.1/24)
> eth1:(192.168.156.88/24)
> 
> After setting the default route, SNAT, and DNAT, we can run tests
> to get the performance results.
> 
> packets sent from veth to veth1:
> af_xdp test tool:
> link:https://github.com/cclinuxer/libxudp
> send:(veth)
> ./objs/xudpperf send --dst 192.168.156.88:6002 -l 1300
> recv:(veth1)
> ./objs/xudpperf recv --src 172.17.0.2:6002
> 
> udp test tool:iperf3
> send:(veth)
> iperf3 -c 192.168.156.88 -p 6002 -l 1300 -b 60G -u
> recv:(veth1)
> iperf3 -s -p 6002
> 
> performance (tested with the libxdp lib):
> UDP                              : 250 Kpps (with 100% cpu)
> AF_XDP   no  zerocopy + no batch : 480 Kpps (with ksoftirqd 100% cpu)
> AF_XDP  with zerocopy + no batch : 540 Kpps (with ksoftirqd 100% cpu)
> AF_XDP  with  batch  +  zerocopy : 1.5 Mpps (with ksoftirqd 15% cpu)
> 
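
For scale, the packet rates quoted above can be converted into approximate
payload throughput and into speedups over the plain UDP baseline. This is
only a rough sketch: it assumes "-l 1300" means 1300 bytes of UDP payload
per packet (the tool's exact semantics are not stated in this thread), and
it ignores all header overhead. The names payload_gbps and speedup are
illustrative helpers, not part of any tool mentioned here.

```python
# Packet rates taken from the quoted table, in packets per second.
RATES_PPS = {
    "UDP": 250e3,
    "AF_XDP no zerocopy + no batch": 480e3,
    "AF_XDP zerocopy + no batch": 540e3,
    "AF_XDP batch + zerocopy": 1.5e6,
}

# Assumption: "-l 1300" is the UDP payload size in bytes.
PAYLOAD_BYTES = 1300


def payload_gbps(pps, payload_bytes=PAYLOAD_BYTES):
    """Payload throughput in Gbit/s, ignoring Ethernet/IP/UDP headers."""
    return pps * payload_bytes * 8 / 1e9


def speedup(pps, baseline_pps=RATES_PPS["UDP"]):
    """Ratio of a packet rate to the plain UDP baseline."""
    return pps / baseline_pps


if __name__ == "__main__":
    for name, pps in RATES_PPS.items():
        print(f"{name:32s} {payload_gbps(pps):6.2f} Gbit/s "
              f"{speedup(pps):5.2f}x vs UDP")
```

Under these assumptions the batch + zerocopy case comes out at roughly
15.6 Gbit/s of payload, about 6x the UDP baseline.
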
> With af_xdp batch, the libxdp user-space program reaches a bottleneck.

Do you mean libxdp [1] or libxudp ?

[1] https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp

> Therefore, the softirq did not reach the limit.
> 
> This is just an RFC patch series, and some code details still need
> further consideration. Please review this proposal.
>

I find this performance work interesting as we have customer requests
(via Maryam (cc)) to improve AF_XDP performance both native and on veth.

Our benchmark is stored at:
  https://github.com/maryamtahhan/veth-benchmark

Great to see other companies also interested in this area.

--Jesper

> thanks!
> 
> huangjie.albert (10):
>    veth: Implement ethtool's get_ringparam() callback
>    xsk: add dma_check_skip for  skipping dma check
>    veth: add support for send queue
>    xsk: add xsk_tx_completed_addr function
>    veth: use send queue tx napi to xmit xsk tx desc
>    veth: add ndo_xsk_wakeup callback for veth
>    sk_buff: add destructor_arg_xsk_pool for zero copy
>    xdp: add xdp_mem_type MEM_TYPE_XSK_BUFF_POOL_TX
>    veth: support zero copy for af xdp
>    veth: af_xdp tx batch support for ipv4 udp
> 
>   drivers/net/veth.c          | 729 +++++++++++++++++++++++++++++++++++-
>   include/linux/skbuff.h      |   1 +
>   include/net/xdp.h           |   1 +
>   include/net/xdp_sock_drv.h  |   1 +
>   include/net/xsk_buff_pool.h |   1 +
>   net/xdp/xsk.c               |   6 +
>   net/xdp/xsk_buff_pool.c     |   3 +-
>   net/xdp/xsk_queue.h         |  11 +
>   8 files changed, 751 insertions(+), 2 deletions(-)
> 
