netdev.vger.kernel.org archive mirror
* [PATCH net-next v2 0/9] tun: optimize SKB allocation with NAPI cache
@ 2025-11-25 20:00 Jon Kohler
From: Jon Kohler @ 2025-11-25 20:00 UTC
  To: netdev, Alexei Starovoitov, Daniel Borkmann, David S. Miller,
	Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev,
	open list:XDP (eXpress Data Path):Keyword:(?:\b|_)xdp(?:\b|_)
  Cc: Jon Kohler

Use the per-CPU NAPI cache for SKB allocation in most places, and
leverage bulk allocation for tun_xdp_one since the batch size is known
at submission time. Additionally, utilize napi_build_skb and
napi_consume_skb to further benefit from the NAPI cache. This all
improves efficiency by reducing allocation overhead. 
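
For readers less familiar with the idea, here is a minimal userspace
sketch of why a recycling cache plus a bulk get reduces allocation
overhead. It is a toy model, not code from this series: struct obj_cache,
cache_put, cache_get_bulk and CACHE_SIZE are hypothetical names, and
malloc() merely stands in for the slow allocation path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SIZE 64	/* hypothetical capacity, not a kernel value */

/* Toy stand-in for a per-CPU object cache (not the kernel structure). */
struct obj_cache {
	void *slots[CACHE_SIZE];
	size_t count;		/* recycled objects ready for reuse */
	size_t obj_size;	/* size of each object in bytes */
	size_t slow_allocs;	/* how often we fell back to malloc() */
};

static void cache_init(struct obj_cache *c, size_t obj_size)
{
	assert(obj_size > 0);
	c->count = 0;
	c->obj_size = obj_size;
	c->slow_allocs = 0;
}

/* Return an object to the cache, or free it if the cache is full. */
static void cache_put(struct obj_cache *c, void *obj)
{
	if (c->count < CACHE_SIZE)
		c->slots[c->count++] = obj;
	else
		free(obj);
}

/* Fill out[0..n) in one call; returns how many objects were obtained. */
static size_t cache_get_bulk(struct obj_cache *c, void **out, size_t n)
{
	size_t got = 0;

	/* Fast path: pop recycled objects without touching the allocator. */
	while (got < n && c->count > 0)
		out[got++] = c->slots[--c->count];

	/* Slow path: allocate whatever the cache could not supply. */
	while (got < n) {
		void *obj = malloc(c->obj_size);

		if (!obj)
			break;
		c->slow_allocs++;
		out[got++] = obj;
	}
	return got;
}
```

In this model, a batch whose size is known up front is served with a
single call, and any object returned via cache_put is reused by the next
batch instead of going back to the allocator.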

Note: This series does not address the large payload path in
tun_alloc_skb, which spans sock.c and skbuff.c. A separate series will
handle privatizing the allocation code in tun and integrating the NAPI
cache for that path.

Results using basic iperf3 UDP test:
TX guest: taskset -c 2 iperf3 -c rx-ip-here -t 30 -p 5200 -b 0 -u -i 30
RX guest: taskset -c 2 iperf3 -s -p 5200 -D

        Bitrate       
Before: 6.08 Gbits/sec
After : 6.36 Gbits/sec

However, the basic test doesn't tell the whole story. Comparing
flamegraphs from before and after, fewer cycles are spent on the RX
vhost thread in the guest-to-guest case on a single host, and fewer
cycles are also spent in the guest-to-guest case across separate hosts,
as the host NIC handlers benefit from these NAPI-allocated SKBs (and
deferred free) as well.

Speaking of deferred free, v2 adds exporting deferred free from net
core and using it immediately prior in tun_put_user. This not only
keeps the cache as warm as you can get, but also prevents a TX heavy
vhost thread from getting IPI'd like it's going out of style. This
approach is similar in concept to what happens from the NAPI loop in
net_rx_action.
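
To make the reasoning concrete, here is a small userspace sketch of the
flush-before-bulk-get idea. It is a toy model under stated assumptions,
not the kernel implementation: struct cpu_state, defer_free, defer_flush,
bulk_get and DEFER_KICK_THRESHOLD are hypothetical names, objects are
plain integer IDs, and the kicks counter only stands in for an IPI sent
when the deferred list grows too long.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJS 16		/* hypothetical list capacity */
#define DEFER_KICK_THRESHOLD 4	/* hypothetical length that triggers a kick */

/* Toy per-CPU state: a local cache plus a list of deferred frees. */
struct cpu_state {
	int cache[MAX_OBJS];
	size_t cache_count;
	int defer[MAX_OBJS];
	size_t defer_count;
	unsigned int kicks;	/* models IPIs sent to the owning CPU */
};

/* Remote side: queue an object for the owner to free later. */
static void defer_free(struct cpu_state *s, int obj)
{
	assert(s->defer_count < MAX_OBJS);
	s->defer[s->defer_count++] = obj;
	/* A list that reaches the threshold forces a kick to the owner. */
	if (s->defer_count == DEFER_KICK_THRESHOLD)
		s->kicks++;
}

/* Owner side: move deferred objects into the local cache. */
static size_t defer_flush(struct cpu_state *s)
{
	size_t moved = 0;

	while (s->defer_count > 0 && s->cache_count < MAX_OBJS) {
		s->cache[s->cache_count++] = s->defer[--s->defer_count];
		moved++;
	}
	return moved;
}

/* Owner side: flush first, then serve the batch from the warm cache. */
static size_t bulk_get(struct cpu_state *s, int *out, size_t n)
{
	size_t got = 0;

	defer_flush(s);
	while (got < n && s->cache_count > 0)
		out[got++] = s->cache[--s->cache_count];
	return got;
}
```

In this model, an owner that calls bulk_get regularly drains the deferred
list before it reaches the threshold, so the kick counter stays at zero
and the freed objects are reused right away.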

I've also merged this series with a small series about cleaning up
packet drop statistics along the various error paths in tun, as I want
to make sure those all go through kfree_skb_reason(), and we'd have
merge conflicts separating the two. If the maintainers want to take
them separately, I'm happy to break them apart. Otherwise, it is fairly
clean keeping them together.

Thanks all,
Jon

v2:
- Added drop statistics cleanup series, else merge conflicts abound
- Removed xdp_prog change (Willem)
- Clarified drop scenario in tun_put_user, where it is an extension of
  current behavior (Willem comment from v1)
- Export skb_defer_free_flush
- Use deferred skb free to immediately refill cache prior to bulk alloc,
  which also prevents IPIs from pounding TX heavy / TX only cores

v1: https://patchwork.kernel.org/project/netdevbpf/cover/20250506145530.2877229-1-jon@nutanix.com/

Jon Kohler (9):
  tun: cleanup out label in tun_xdp_one
  tun: correct drop statistics in tun_xdp_one
  tun: correct drop statistics in tun_put_user
  tun: correct drop statistics in tun_get_user
  tun: use bulk NAPI cache allocation in tun_xdp_one
  tun: use napi_build_skb in __tun_build_skb
  tun: use napi_consume_skb() in tun_put_user
  net: core: export skb_defer_free_flush
  tun: flush deferred skb free list before bulk NAPI cache get

 drivers/net/tun.c      | 170 +++++++++++++++++++++++++++++------------
 include/linux/skbuff.h |   1 +
 net/core/dev.c         |   3 +-
 3 files changed, 126 insertions(+), 48 deletions(-)

-- 
2.43.0


Thread overview: 29+ messages
2025-11-25 20:00 [PATCH net-next v2 0/9] tun: optimize SKB allocation with NAPI cache Jon Kohler
2025-11-25 20:00 ` [PATCH net-next v2 1/9] tun: cleanup out label in tun_xdp_one Jon Kohler
2025-11-25 20:00 ` [PATCH net-next v2 2/9] tun: correct drop statistics in tun_xdp_one Jon Kohler
2025-11-25 20:00 ` [PATCH net-next v2 3/9] tun: correct drop statistics in tun_put_user Jon Kohler
2025-11-29  3:07   ` Willem de Bruijn
2025-12-02 16:40     ` Jon Kohler
2025-12-02 21:34       ` Willem de Bruijn
2025-12-02 21:36         ` Jon Kohler
2025-11-25 20:00 ` [PATCH net-next v2 4/9] tun: correct drop statistics in tun_get_user Jon Kohler
2025-11-25 20:00 ` [PATCH net-next v2 5/9] tun: use bulk NAPI cache allocation in tun_xdp_one Jon Kohler
2025-11-28  3:02   ` Jason Wang
2025-12-02 16:49     ` Jon Kohler
2025-12-02 17:32       ` Jesper Dangaard Brouer
2025-12-02 17:45         ` Jon Kohler
2025-12-03  4:10           ` Jason Wang
2025-12-03  4:34             ` Jon Kohler
2025-12-03  6:40               ` Jason Wang
2025-12-03  8:47         ` Sebastian Andrzej Siewior
2025-12-03 15:35           ` Jon Kohler
2025-12-05  7:58             ` Sebastian Andrzej Siewior
2025-12-05 13:21               ` Jesper Dangaard Brouer
2025-12-05 16:56                 ` Jon Kohler
2025-12-08 11:04                 ` Sebastian Andrzej Siewior
2025-11-25 20:00 ` [PATCH net-next v2 6/9] tun: use napi_build_skb in __tun_build_skb Jon Kohler
2025-11-25 20:00 ` [PATCH net-next v2 7/9] tun: use napi_consume_skb() in tun_put_user Jon Kohler
2025-11-25 20:00 ` [PATCH net-next v2 8/9] net: core: export skb_defer_free_flush Jon Kohler
2025-11-25 20:00 ` [PATCH net-next v2 9/9] tun: flush deferred skb free list before bulk NAPI cache get Jon Kohler
2025-11-29  3:08 ` [PATCH net-next v2 0/9] tun: optimize SKB allocation with NAPI cache Willem de Bruijn
2025-12-02 16:38   ` Jon Kohler
