netdev.vger.kernel.org archive mirror
* [PATCH net-next v2 0/3] net: devmem: improve cpu cost of RX token management
@ 2025-09-12  5:28 Bobby Eshleman
  2025-09-12  5:28 ` [PATCH net-next v2 1/3] net: devmem: rename tx_vec to vec in dmabuf binding Bobby Eshleman
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Bobby Eshleman @ 2025-09-12  5:28 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, Neal Cardwell,
	David Ahern
  Cc: netdev, linux-kernel, Stanislav Fomichev, Mina Almasry,
	Bobby Eshleman

This series improves the CPU cost of RX token management by replacing
the xarray allocator with a normal array of atomics. Similar to devmem
TX's page-index lookup scheme for niovs, RX also uses page indices to
look up the corresponding atomic in the array.
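
As a rough illustration of the scheme described above, here is a minimal
userspace sketch (C11 atomics, not kernel code). Every name in it
(sketch_binding, sketch_token_get, sketch_token_put, SKETCH_PAGE_SHIFT) is
hypothetical, and treating the token as the page index of the buffer offset
is an assumption made for the example, not something taken from the patches.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for a binding: one atomic counter per page. */
struct sketch_binding {
	size_t num_pages;
	atomic_int *urefs; /* urefs[i] = user references held on page i */
};

static int sketch_binding_init(struct sketch_binding *b, size_t num_pages)
{
	b->urefs = malloc(num_pages * sizeof(*b->urefs));
	if (!b->urefs)
		return -1;
	for (size_t i = 0; i < num_pages; i++)
		atomic_init(&b->urefs[i], 0);
	b->num_pages = num_pages;
	return 0;
}

static void sketch_binding_destroy(struct sketch_binding *b)
{
	free(b->urefs);
	b->urefs = NULL;
	b->num_pages = 0;
}

/* Receive side: take a reference; the token is just the page index. */
static int sketch_token_get(struct sketch_binding *b, uint64_t offset,
			    uint32_t *token)
{
	uint64_t idx = offset >> SKETCH_PAGE_SHIFT;

	assert(b && token);
	if (idx >= b->num_pages)
		return -1;
	atomic_fetch_add_explicit(&b->urefs[idx], 1, memory_order_relaxed);
	*token = (uint32_t)idx;
	return 0;
}

/* Release side: O(1) array lookup; bad tokens are rejected quietly. */
static int sketch_token_put(struct sketch_binding *b, uint32_t token)
{
	int old;

	if (token >= b->num_pages)
		return -1;
	old = atomic_load(&b->urefs[token]);
	while (old > 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&b->urefs[token], &old,
						 old - 1))
			return 0;
	}
	return -1; /* no outstanding reference for this token */
}
```

The point of the sketch is that both paths reduce to a bounds check plus one
atomic operation on a fixed slot, with no allocator or tree walk involved.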

Improvement is ~5% per RX user thread.

Two other approaches were tested, but with no improvement. Namely, 1)
using a hashmap for tokens and 2) keeping an xarray of atomic counters
but using RCU so that the hotpath could be mostly lockless. Neither of
these approaches proved better than the simple array in terms of CPU.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
Changes in v2:
- net: ethtool: prevent user from breaking devmem single-binding rule
  (Mina)
- pre-assign niovs in binding->vec for RX case (Mina)
- remove WARNs on invalid user input (Mina)
- remove extraneous binding ref get (Mina)
- remove WARN for changed binding (Mina)
- always use GFP_ZERO for binding->vec (Mina)
- fix length of alloc for urefs
- use atomic_set(, 0) to initialize sk_user_frags.urefs
- Link to v1: https://lore.kernel.org/r/20250902-scratch-bobbyeshleman-devmem-tcp-token-upstream-v1-0-d946169b5550@meta.com

---
Bobby Eshleman (3):
      net: devmem: rename tx_vec to vec in dmabuf binding
      net: devmem: use niov array for token management
      net: ethtool: prevent user from breaking devmem single-binding rule

 include/net/sock.h       |   6 +-
 net/core/devmem.c        |  29 +++++-----
 net/core/devmem.h        |   4 +-
 net/core/sock.c          |  23 +++++---
 net/ethtool/ioctl.c      | 144 +++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp.c           | 120 ++++++++++++++++-----------------------
 net/ipv4/tcp_ipv4.c      |  45 +++++++++++++--
 net/ipv4/tcp_minisocks.c |   2 -
 8 files changed, 266 insertions(+), 107 deletions(-)
---
base-commit: dc2f650f7e6857bf384069c1a56b2937a1ee370d
change-id: 20250829-scratch-bobbyeshleman-devmem-tcp-token-upstream-292be174d503

Best regards,
-- 
Bobby Eshleman <bobbyeshleman@meta.com>


* [PATCH net-next v4 0/2] net: devmem: improve cpu cost of RX token management
@ 2025-09-26 16:31 Bobby Eshleman
  2025-09-27  6:00 ` [syzbot ci] " syzbot ci
  0 siblings, 1 reply; 11+ messages in thread
From: Bobby Eshleman @ 2025-09-26 16:31 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, Neal Cardwell,
	David Ahern
  Cc: netdev, linux-kernel, Stanislav Fomichev, Mina Almasry,
	Bobby Eshleman

This series improves the CPU cost of RX token management by replacing
the xarray allocator with an niov array and a uref field in niov.
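
To make the per-niov variant concrete, the following is a small userspace
sketch, again with C11 atomics and purely hypothetical names (sketch_niov,
sketch_niov_binding, sketch_bind, sketch_unbind). It assumes the counter
lives in each niov and that unbind drops whatever references were leaked,
as the v3 changelog describes; the real structures and locking differ.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical niov: the user reference count is a field of the niov. */
struct sketch_niov {
	atomic_int uref;
};

struct sketch_niov_binding {
	size_t num_niovs;
	struct sketch_niov *niovs; /* backing storage */
	struct sketch_niov **vec;  /* vec[i] -> niov for page index i */
};

static int sketch_bind(struct sketch_niov_binding *b, size_t n)
{
	assert(n > 0); /* sketch only: empty bindings are not modeled */

	b->niovs = malloc(n * sizeof(*b->niovs));
	b->vec = calloc(n, sizeof(*b->vec)); /* zeroed, like GFP_ZERO */
	if (!b->niovs || !b->vec) {
		free(b->niovs);
		free(b->vec);
		return -1;
	}
	/* Pre-assign every slot so lookups never see a NULL entry. */
	for (size_t i = 0; i < n; i++) {
		atomic_init(&b->niovs[i].uref, 0);
		b->vec[i] = &b->niovs[i];
	}
	b->num_niovs = n;
	return 0;
}

/* Unbind fallback: drop leaked references, return how many there were. */
static int sketch_unbind(struct sketch_niov_binding *b)
{
	int leaked = 0;

	for (size_t i = 0; i < b->num_niovs; i++)
		leaked += atomic_exchange(&b->vec[i]->uref, 0);
	free(b->vec);
	free(b->niovs);
	b->vec = NULL;
	b->niovs = NULL;
	b->num_niovs = 0;
	return leaked;
}
```

Keeping the counter in the niov means the count is shared by the whole
binding rather than duplicated per socket, which is where the memory saving
noted in the v3 changelog would come from.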

Improvement is ~5% per RX user thread.

Two other approaches were tested, but with no improvement. Namely, 1)
using a hashmap for tokens and 2) keeping an xarray of atomic counters
but using RCU so that the hotpath could be mostly lockless. Neither of
these approaches proved better than the simple array in terms of CPU.

Running with a NCCL workload is still TODO, but I will follow up on this
thread with those results when done.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
Changes in v4:
- rebase to net-next
- Link to v3: https://lore.kernel.org/r/20250926-scratch-bobbyeshleman-devmem-tcp-token-upstream-v3-0-084b46bda88f@meta.com

Changes in v3:
- make urefs per-binding instead of per-socket, reducing memory
  footprint
- fallback to cleaning up references in dmabuf unbind if socket
  leaked tokens
- drop ethtool patch
- Link to v2: https://lore.kernel.org/r/20250911-scratch-bobbyeshleman-devmem-tcp-token-upstream-v2-0-c80d735bd453@meta.com

Changes in v2:
- net: ethtool: prevent user from breaking devmem single-binding rule
  (Mina)
- pre-assign niovs in binding->vec for RX case (Mina)
- remove WARNs on invalid user input (Mina)
- remove extraneous binding ref get (Mina)
- remove WARN for changed binding (Mina)
- always use GFP_ZERO for binding->vec (Mina)
- fix length of alloc for urefs
- use atomic_set(, 0) to initialize sk_user_frags.urefs
- Link to v1: https://lore.kernel.org/r/20250902-scratch-bobbyeshleman-devmem-tcp-token-upstream-v1-0-d946169b5550@meta.com

---
Bobby Eshleman (2):
      net: devmem: rename tx_vec to vec in dmabuf binding
      net: devmem: use niov array for token management

 include/net/netmem.h     |  1 +
 include/net/sock.h       |  4 +--
 net/core/devmem.c        | 46 +++++++++++++++---------
 net/core/devmem.h        |  4 +--
 net/core/sock.c          | 34 ++++++++++++------
 net/ipv4/tcp.c           | 94 +++++++++++-------------------------------------
 net/ipv4/tcp_ipv4.c      | 18 ++--------
 net/ipv4/tcp_minisocks.c |  2 +-
 8 files changed, 82 insertions(+), 121 deletions(-)
---
base-commit: 203e3beb73e53584ca90bc2a6d8240b9b12b9bcf
change-id: 20250829-scratch-bobbyeshleman-devmem-tcp-token-upstream-292be174d503

Best regards,
-- 
Bobby Eshleman <bobbyeshleman@meta.com>


* [PATCH net-next 0/2] net: devmem: improve cpu cost of RX token management
@ 2025-09-02 21:36 Bobby Eshleman
  2025-09-03 17:46 ` [syzbot ci] " syzbot ci
  0 siblings, 1 reply; 11+ messages in thread
From: Bobby Eshleman @ 2025-09-02 21:36 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, Neal Cardwell,
	David Ahern
  Cc: netdev, linux-kernel, Stanislav Fomichev, Mina Almasry,
	Bobby Eshleman

This series improves the CPU cost of RX token management by replacing
the xarray allocator with a normal array of atomics. Similar to devmem
TX's page-index lookup scheme for niovs, RX also uses page indices to
look up the corresponding atomic in the array.

Improvement is ~5% per RX user thread.

Two other approaches were tested, but with no improvement. Namely, 1)
using a hashmap for tokens and 2) keeping an xarray of atomic counters
but using RCU so that the hotpath could be mostly lockless. Neither of
these approaches proved better than the simple array in terms of CPU.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
Bobby Eshleman (2):
      net: devmem: rename tx_vec to vec in dmabuf binding
      net: devmem: use niov array for token management

 include/net/sock.h       |   5 ++-
 net/core/devmem.c        |  31 +++++++-------
 net/core/devmem.h        |   4 +-
 net/core/sock.c          |  24 +++++++----
 net/ipv4/tcp.c           | 107 +++++++++++++++--------------------------------
 net/ipv4/tcp_ipv4.c      |  40 +++++++++++++++---
 net/ipv4/tcp_minisocks.c |   2 -
 7 files changed, 107 insertions(+), 106 deletions(-)
---
base-commit: cd8a4cfa6bb43a441901e82f5c222dddc75a18a3
change-id: 20250829-scratch-bobbyeshleman-devmem-tcp-token-upstream-292be174d503

Best regards,
-- 
Bobby Eshleman <bobbyeshleman@meta.com>



Thread overview: 11+ messages
2025-09-12  5:28 [PATCH net-next v2 0/3] net: devmem: improve cpu cost of RX token management Bobby Eshleman
2025-09-12  5:28 ` [PATCH net-next v2 1/3] net: devmem: rename tx_vec to vec in dmabuf binding Bobby Eshleman
2025-09-12  5:28 ` [PATCH net-next v2 2/3] net: devmem: use niov array for token management Bobby Eshleman
2025-09-17 23:55   ` Mina Almasry
2025-09-18 14:19     ` Bobby Eshleman
2025-09-12  5:28 ` [PATCH net-next v2 3/3] net: ethtool: prevent user from breaking devmem single-binding rule Bobby Eshleman
2025-09-12 22:23   ` Stanislav Fomichev
2025-09-17 23:07     ` Mina Almasry
2025-09-12  9:40 ` [syzbot ci] Re: net: devmem: improve cpu cost of RX token management syzbot ci
  -- strict thread matches above, loose matches on Subject: below --
2025-09-26 16:31 [PATCH net-next v4 0/2] " Bobby Eshleman
2025-09-27  6:00 ` [syzbot ci] " syzbot ci
2025-09-02 21:36 [PATCH net-next 0/2] " Bobby Eshleman
2025-09-03 17:46 ` [syzbot ci] " syzbot ci
