netdev.vger.kernel.org archive mirror
* [RFC v3 Optimizing veth xsk performance 0/9]
@ 2023-08-08  3:19 Albert Huang
  2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 1/9] veth: Implement ethtool's get_ringparam() callback Albert Huang
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: Albert Huang @ 2023-08-08  3:19 UTC
  To: davem, edumazet, kuba, pabeni
  Cc: Albert Huang, Alexei Starovoitov, Daniel Borkmann,
	Jesper Dangaard Brouer, John Fastabend, Björn Töpel,
	Magnus Karlsson, Maciej Fijalkowski, Jonathan Lemon,
	Pavel Begunkov, Yunsheng Lin, Kees Cook, Richard Gobert,
	open list:NETWORKING DRIVERS, open list,
	open list:XDP (eXpress Data Path)

AF_XDP is a kernel bypass technology that can greatly improve performance.
However, for virtual devices like veth, even with AF_XDP sockets, packets
still traverse many additional software paths that consume CPU resources.
This patch series focuses on optimizing the performance of AF_XDP sockets
for veth virtual devices. Patches 1 to 4 are mainly preparatory work.
Patch 5 introduces a tx queue and tx napi for packet transmission, patch 8
implements batch sending for IPv4 UDP packets, and patch 9 adds support for
the AF_XDP tx need_wakeup feature. These optimizations significantly
shorten the software path and support checksum offload.

I tested these features using the typical topology shown below:
client(send):                                        server:(recv)
veth<-->veth-peer                                    veth1-peer<--->veth1
  1       |                                                  |   7
          |2                                                6|
          |                                                  |
        bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
                  3                    4                 5    
             (machine1)                              (machine2)    
The AF_XDP sockets are attached to veth and veth1, and packets are sent to
the physical NIC (eth0).
veth:(172.17.0.2/24)
bridge:(172.17.0.1/24)
eth0:(192.168.156.66/24)

veth1:(172.17.0.2/24)
bridge1:(172.17.0.1/24)
eth1:(192.168.156.88/24)

After setting up the default route, SNAT, and DNAT, we can run tests
to get the performance results.
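
For readers who want to reproduce the setup, the machine1 (client) side could
be brought up roughly as sketched below. This is only an illustration under
stated assumptions: the namespace name "client", moving veth into that
namespace, and using MASQUERADE as the SNAT rule are not taken from the
description above; only the interface names and addresses are.

```shell
#!/bin/sh
# Sketch of the machine1 (client) side of the topology; needs root.
# Assumptions (not from the cover letter): the namespace name "client",
# moving veth into that namespace, and MASQUERADE as the SNAT rule.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

setup_client() {
    run ip netns add client &&
    run ip link add name veth type veth peer name veth-peer &&
    run ip link set veth netns client &&
    run ip link add name bridge type bridge &&
    run ip link set veth-peer master bridge &&
    run ip addr add 172.17.0.1/24 dev bridge &&
    run ip link set bridge up &&
    run ip link set veth-peer up &&
    run ip netns exec client ip addr add 172.17.0.2/24 dev veth &&
    run ip netns exec client ip link set veth up &&
    run ip netns exec client ip route add default via 172.17.0.1 &&
    run sysctl -w net.ipv4.ip_forward=1 &&
    run iptables -t nat -A POSTROUTING -s 172.17.0.0/24 -o eth0 -j MASQUERADE
}
```

Running `DRY_RUN=1 setup_client` prints the commands without executing them.
The machine2 side would presumably mirror this, with a DNAT rule forwarding
UDP port 6002 to 172.17.0.2 in place of the SNAT rule.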

Packets are sent from veth to veth1:
af_xdp test tool:
link:https://github.com/cclinuxer/libxudp
send:(veth)
./objs/xudpperf send --dst 192.168.156.88:6002 -l 1300
recv:(veth1)
./objs/xudpperf recv --src 172.17.0.2:6002

udp test tool:iperf3
send:(veth)
iperf3 -c 192.168.156.88 -p 6002 -l 1300 -b 0 -u
recv:(veth1)
iperf3 -s -p 6002

performance:(tested with the libxudp lib)
UDP                              : 320 Kpps (with 100% cpu)
AF_XDP   no  zerocopy + no batch : 480 Kpps (with ksoftirqd 100% cpu)
AF_XDP  with  batch  +  zerocopy : 1.5 Mpps (with ksoftirqd 15% cpu)

With AF_XDP batching, the libxudp user-space program becomes the bottleneck,
so ksoftirqd does not reach its CPU limit.
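
To put the figures in proportion, the relative speedups and the implied
payload throughput can be worked out as in the sketch below. It assumes that
the 1300 passed via -l in the commands above is the UDP payload size in bytes
and ignores all header overhead; the helper names are made up for this
illustration.

```shell
#!/bin/sh
# Ratio of two packet rates (both in Kpps), one decimal place.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Payload throughput in Gbit/s for a rate in Kpps and a payload in bytes.
# Assumption: headers are ignored, so wire throughput is somewhat higher.
payload_gbps() {
    awk -v kpps="$1" -v len="$2" \
        'BEGIN { printf "%.1f\n", kpps * 1000 * len * 8 / 1e9 }'
}

speedup 480 320          # AF_XDP without batch/zerocopy vs. plain UDP
speedup 1500 320         # AF_XDP with batch + zerocopy vs. plain UDP
speedup 1500 480         # effect of batch + zerocopy on top of AF_XDP
payload_gbps 1500 1300   # payload rate at 1.5 Mpps with 1300-byte payloads
```

So batch plus zerocopy is roughly a 3x gain over plain AF_XDP on this setup,
while also using far less ksoftirqd CPU.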

This is just an RFC patch series, and some code details still need 
further consideration. Please review this proposal.

v2->v3:
- fix build error found by the kernel test robot.

v1->v2:
- all patches now pass checkpatch.pl, as suggested by Simon Horman.
- test iperf3 with -b 0 and update the test results, as suggested by
  Paolo Abeni.
- refactor code to make the code structure clearer.
- delete some useless code logic in the veth_xsk_tx_xmit function.
- add support for the AF_XDP tx need_wakeup feature.

Albert Huang (9):
  veth: Implement ethtool's get_ringparam() callback
  xsk: add dma_check_skip for skipping dma check
  veth: add support for send queue
  xsk: add xsk_tx_completed_addr function
  veth: use send queue tx napi to xmit xsk tx desc
  veth: add ndo_xsk_wakeup callback for veth
  sk_buff: add destructor_arg_xsk_pool for zero copy
  veth: af_xdp tx batch support for ipv4 udp
  veth: add support for AF_XDP tx need_wakup feature

 drivers/net/veth.c          | 679 +++++++++++++++++++++++++++++++++++-
 include/linux/skbuff.h      |   2 +
 include/net/xdp_sock_drv.h  |   5 +
 include/net/xsk_buff_pool.h |   1 +
 net/xdp/xsk.c               |   6 +
 net/xdp/xsk_buff_pool.c     |   3 +-
 net/xdp/xsk_queue.h         |  10 +
 7 files changed, 704 insertions(+), 2 deletions(-)

-- 
2.20.1


Thread overview: 14+ messages
2023-08-08  3:19 [RFC v3 Optimizing veth xsk performance 0/9] Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 1/9] veth: Implement ethtool's get_ringparam() callback Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 2/9] xsk: add dma_check_skip for skipping dma check Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 3/9] veth: add support for send queue Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 4/9] xsk: add xsk_tx_completed_addr function Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 5/9] veth: use send queue tx napi to xmit xsk tx desc Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 6/9] veth: add ndo_xsk_wakeup callback for veth Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 7/9] sk_buff: add destructor_arg_xsk_pool for zero copy Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 8/9] veth: af_xdp tx batch support for ipv4 udp Albert Huang
2023-08-08  3:19 ` [RFC v3 Optimizing veth xsk performance 9/9] veth: add support for AF_XDP tx need_wakup feature Albert Huang
2023-08-08 12:01 ` [RFC v3 Optimizing veth xsk performance 0/9] Toke Høiland-Jørgensen
2023-08-09  7:13   ` 黄杰
2023-08-09  9:06     ` Toke Høiland-Jørgensen
2023-08-09 11:09       ` Jesper Dangaard Brouer
