* [syzbot ci] Re: net: devmem: improve cpu cost of RX token management
  2025-09-02 21:36 [PATCH net-next 0/2] " Bobby Eshleman
@ 2025-09-03 17:46 ` syzbot ci
  0 siblings, 0 replies; 11+ messages in thread
From: syzbot ci @ 2025-09-03 17:46 UTC (permalink / raw)
  To: almasrymina, bobbyeshleman, bobbyeshleman, davem, dsahern,
	edumazet, horms, kuba, kuniyu, linux-kernel, ncardwell, netdev,
	pabeni, sdf, willemb
  Cc: syzbot, syzkaller-bugs

syzbot ci has tested the following series

[v1] net: devmem: improve cpu cost of RX token management
https://lore.kernel.org/all/20250902-scratch-bobbyeshleman-devmem-tcp-token-upstream-v1-0-d946169b5550@meta.com
* [PATCH net-next 1/2] net: devmem: rename tx_vec to vec in dmabuf binding
* [PATCH net-next 2/2] net: devmem: use niov array for token management

and found the following issue:
general protection fault in sock_devmem_dontneed

Full report is available here:
https://ci.syzbot.org/series/c0dc7223-4222-461c-b04b-b6f0004c7509

***

general protection fault in sock_devmem_dontneed

tree:      net-next
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git
base:      864ecc4a6dade82d3f70eab43dad0e277aa6fc78
arch:      amd64
compiler:  Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
config:    https://ci.syzbot.org/builds/6c521908-0a73-48bc-a0fd-2576cf868361/config
C repro:   https://ci.syzbot.org/findings/9374f388-d643-42ea-831b-872eb5000b3c/c_repro
syz repro: https://ci.syzbot.org/findings/9374f388-d643-42ea-831b-872eb5000b3c/syz_repro

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 UID: 0 PID: 5993 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:sock_devmem_dontneed+0x3fd/0x960 net/core/sock.c:1112
Code: 8b 44 24 40 44 8b 28 44 03 6c 24 14 48 8b 44 24 20 42 80 3c 20 00 74 08 4c 89 ff e8 4d eb c9 f8 4d 8b 3f 4c 89 f8 48 c1 e8 03 <42> 80 3c 20 00 74 08 4c 89 ff e8 34 eb c9 f8 4d 8b 3f 4c 89 f8 48
RSP: 0018:ffffc90001c4fac0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff11004ddcfe0
RDX: ffff88801f6a8000 RSI: 0000000000000400 RDI: 0000000000000000
RBP: ffffc90001c4fc50 R08: ffffc90001c4fbdf R09: 0000000000000000
R10: ffffc90001c4fb60 R11: fffff52000389f7c R12: dffffc0000000000
R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000000
FS:  000055558634d500(0000) GS:ffff8880b8614000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b30e63fff CR3: 0000000026b68000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 sk_setsockopt+0x682/0x2dc0 net/core/sock.c:1302
 do_sock_setsockopt+0x11b/0x1b0 net/socket.c:2340
 __sys_setsockopt net/socket.c:2369 [inline]
 __do_sys_setsockopt net/socket.c:2375 [inline]
 __se_sys_setsockopt net/socket.c:2372 [inline]
 __x64_sys_setsockopt+0x13f/0x1b0 net/socket.c:2372
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fea2bb8ebe9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffc81c63348 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00007fea2bdc5fa0 RCX: 00007fea2bb8ebe9
RDX: 0000000000000050 RSI: 0000000000000001 RDI: 0000000000000003
RBP: 00007fea2bc11e19 R08: 0000000000000048 R09: 0000000000000000
R10: 0000200000000100 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fea2bdc5fa0 R14: 00007fea2bdc5fa0 R15: 0000000000000005
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:sock_devmem_dontneed+0x3fd/0x960 net/core/sock.c:1112
Code: 8b 44 24 40 44 8b 28 44 03 6c 24 14 48 8b 44 24 20 42 80 3c 20 00 74 08 4c 89 ff e8 4d eb c9 f8 4d 8b 3f 4c 89 f8 48 c1 e8 03 <42> 80 3c 20 00 74 08 4c 89 ff e8 34 eb c9 f8 4d 8b 3f 4c 89 f8 48
RSP: 0018:ffffc90001c4fac0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff11004ddcfe0
RDX: ffff88801f6a8000 RSI: 0000000000000400 RDI: 0000000000000000
RBP: ffffc90001c4fc50 R08: ffffc90001c4fbdf R09: 0000000000000000
R10: ffffc90001c4fb60 R11: fffff52000389f7c R12: dffffc0000000000
R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000000
FS:  000055558634d500(0000) GS:ffff8880b8614000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b30e63fff CR3: 0000000026b68000 CR4: 00000000000006f0
----------------
Code disassembly (best guess):
   0:	8b 44 24 40          	mov    0x40(%rsp),%eax
   4:	44 8b 28             	mov    (%rax),%r13d
   7:	44 03 6c 24 14       	add    0x14(%rsp),%r13d
   c:	48 8b 44 24 20       	mov    0x20(%rsp),%rax
  11:	42 80 3c 20 00       	cmpb   $0x0,(%rax,%r12,1)
  16:	74 08                	je     0x20
  18:	4c 89 ff             	mov    %r15,%rdi
  1b:	e8 4d eb c9 f8       	call   0xf8c9eb6d
  20:	4d 8b 3f             	mov    (%r15),%r15
  23:	4c 89 f8             	mov    %r15,%rax
  26:	48 c1 e8 03          	shr    $0x3,%rax
* 2a:	42 80 3c 20 00       	cmpb   $0x0,(%rax,%r12,1) <-- trapping instruction
  2f:	74 08                	je     0x39
  31:	4c 89 ff             	mov    %r15,%rdi
  34:	e8 34 eb c9 f8       	call   0xf8c9eb6d
  39:	4d 8b 3f             	mov    (%r15),%r15
  3c:	4c 89 f8             	mov    %r15,%rax
  3f:	48                   	rex.W


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
  Tested-by: syzbot@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.


* [PATCH net-next v2 0/3] net: devmem: improve cpu cost of RX token management
@ 2025-09-12  5:28 Bobby Eshleman
  2025-09-12  5:28 ` [PATCH net-next v2 1/3] net: devmem: rename tx_vec to vec in dmabuf binding Bobby Eshleman
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Bobby Eshleman @ 2025-09-12  5:28 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, Neal Cardwell,
	David Ahern
  Cc: netdev, linux-kernel, Stanislav Fomichev, Mina Almasry,
	Bobby Eshleman

This series reduces the CPU cost of RX token management by replacing
the xarray allocator with a plain array of atomics. Similar to devmem
TX's page-index lookup scheme for niovs, RX also uses page indices to
look up the corresponding atomic in the array.
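
For illustration, the lookup reduces to plain array indexing; a minimal
sketch (the names here are illustrative, not the exact kernel symbols):

  struct net_iov **vec;  /* one niov pointer per dmabuf page      */
  atomic_t *urefs;       /* one outstanding-token count per page  */

  static inline struct net_iov *token_to_niov(u32 token)
  {
          return vec[token];  /* direct load, no xarray locking */
  }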

Improvement is ~5% per RX user thread.

Two other approaches were tested, with no improvement: 1) using a
hashmap for tokens, and 2) keeping an xarray of atomic counters but
using RCU so that the hot path could be mostly lockless. Neither
approach proved better than the simple array in terms of CPU.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
Changes in v2:
- net: ethtool: prevent user from breaking devmem single-binding rule
  (Mina)
- pre-assign niovs in binding->vec for RX case (Mina)
- remove WARNs on invalid user input (Mina)
- remove extraneous binding ref get (Mina)
- remove WARN for changed binding (Mina)
- always use GFP_ZERO for binding->vec (Mina)
- fix length of alloc for urefs
- use atomic_set(, 0) to initialize sk_user_frags.urefs
- Link to v1: https://lore.kernel.org/r/20250902-scratch-bobbyeshleman-devmem-tcp-token-upstream-v1-0-d946169b5550@meta.com

---
Bobby Eshleman (3):
      net: devmem: rename tx_vec to vec in dmabuf binding
      net: devmem: use niov array for token management
      net: ethtool: prevent user from breaking devmem single-binding rule

 include/net/sock.h       |   6 +-
 net/core/devmem.c        |  29 +++++-----
 net/core/devmem.h        |   4 +-
 net/core/sock.c          |  23 +++++---
 net/ethtool/ioctl.c      | 144 +++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp.c           | 120 ++++++++++++++++-----------------------
 net/ipv4/tcp_ipv4.c      |  45 +++++++++++++--
 net/ipv4/tcp_minisocks.c |   2 -
 8 files changed, 266 insertions(+), 107 deletions(-)
---
base-commit: dc2f650f7e6857bf384069c1a56b2937a1ee370d
change-id: 20250829-scratch-bobbyeshleman-devmem-tcp-token-upstream-292be174d503

Best regards,
-- 
Bobby Eshleman <bobbyeshleman@meta.com>



* [PATCH net-next v2 1/3] net: devmem: rename tx_vec to vec in dmabuf binding
  2025-09-12  5:28 [PATCH net-next v2 0/3] net: devmem: improve cpu cost of RX token management Bobby Eshleman
@ 2025-09-12  5:28 ` Bobby Eshleman
  2025-09-12  5:28 ` [PATCH net-next v2 2/3] net: devmem: use niov array for token management Bobby Eshleman
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Bobby Eshleman @ 2025-09-12  5:28 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, Neal Cardwell,
	David Ahern
  Cc: netdev, linux-kernel, Stanislav Fomichev, Mina Almasry,
	Bobby Eshleman

From: Bobby Eshleman <bobbyeshleman@meta.com>

Rename the 'tx_vec' field in struct net_devmem_dmabuf_binding to 'vec'.
This field holds pointers to net_iov structures. The rename prepares for
reusing 'vec' for both TX and RX directions.

No functional change intended.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
 net/core/devmem.c | 22 +++++++++++-----------
 net/core/devmem.h |  2 +-
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/net/core/devmem.c b/net/core/devmem.c
index d9de31a6cc7f..b4c570d4f37a 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -74,7 +74,7 @@ void __net_devmem_dmabuf_binding_free(struct work_struct *wq)
 	dma_buf_detach(binding->dmabuf, binding->attachment);
 	dma_buf_put(binding->dmabuf);
 	xa_destroy(&binding->bound_rxqs);
-	kvfree(binding->tx_vec);
+	kvfree(binding->vec);
 	kfree(binding);
 }
 
@@ -231,10 +231,10 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 	}
 
 	if (direction == DMA_TO_DEVICE) {
-		binding->tx_vec = kvmalloc_array(dmabuf->size / PAGE_SIZE,
-						 sizeof(struct net_iov *),
-						 GFP_KERNEL);
-		if (!binding->tx_vec) {
+		binding->vec = kvmalloc_array(dmabuf->size / PAGE_SIZE,
+					      sizeof(struct net_iov *),
+					      GFP_KERNEL);
+		if (!binding->vec) {
 			err = -ENOMEM;
 			goto err_unmap;
 		}
@@ -248,7 +248,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 					      dev_to_node(&dev->dev));
 	if (!binding->chunk_pool) {
 		err = -ENOMEM;
-		goto err_tx_vec;
+		goto err_vec;
 	}
 
 	virtual = 0;
@@ -294,7 +294,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 			page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov),
 						      net_devmem_get_dma_addr(niov));
 			if (direction == DMA_TO_DEVICE)
-				binding->tx_vec[owner->area.base_virtual / PAGE_SIZE + i] = niov;
+				binding->vec[owner->area.base_virtual / PAGE_SIZE + i] = niov;
 		}
 
 		virtual += len;
@@ -314,8 +314,8 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 	gen_pool_for_each_chunk(binding->chunk_pool,
 				net_devmem_dmabuf_free_chunk_owner, NULL);
 	gen_pool_destroy(binding->chunk_pool);
-err_tx_vec:
-	kvfree(binding->tx_vec);
+err_vec:
+	kvfree(binding->vec);
 err_unmap:
 	dma_buf_unmap_attachment_unlocked(binding->attachment, binding->sgt,
 					  direction);
@@ -361,7 +361,7 @@ struct net_devmem_dmabuf_binding *net_devmem_get_binding(struct sock *sk,
 	int err = 0;
 
 	binding = net_devmem_lookup_dmabuf(dmabuf_id);
-	if (!binding || !binding->tx_vec) {
+	if (!binding || !binding->vec) {
 		err = -EINVAL;
 		goto out_err;
 	}
@@ -393,7 +393,7 @@ net_devmem_get_niov_at(struct net_devmem_dmabuf_binding *binding,
 	*off = virt_addr % PAGE_SIZE;
 	*size = PAGE_SIZE - *off;
 
-	return binding->tx_vec[virt_addr / PAGE_SIZE];
+	return binding->vec[virt_addr / PAGE_SIZE];
 }
 
 /*** "Dmabuf devmem memory provider" ***/
diff --git a/net/core/devmem.h b/net/core/devmem.h
index 101150d761af..2ada54fb63d7 100644
--- a/net/core/devmem.h
+++ b/net/core/devmem.h
@@ -63,7 +63,7 @@ struct net_devmem_dmabuf_binding {
 	 * address. This array is convenient to map the virtual addresses to
 	 * net_iovs in the TX path.
 	 */
-	struct net_iov **tx_vec;
+	struct net_iov **vec;
 
 	struct work_struct unbind_w;
 };

-- 
2.47.3



* [PATCH net-next v2 2/3] net: devmem: use niov array for token management
  2025-09-12  5:28 [PATCH net-next v2 0/3] net: devmem: improve cpu cost of RX token management Bobby Eshleman
  2025-09-12  5:28 ` [PATCH net-next v2 1/3] net: devmem: rename tx_vec to vec in dmabuf binding Bobby Eshleman
@ 2025-09-12  5:28 ` Bobby Eshleman
  2025-09-17 23:55   ` Mina Almasry
  2025-09-12  5:28 ` [PATCH net-next v2 3/3] net: ethtool: prevent user from breaking devmem single-binding rule Bobby Eshleman
  2025-09-12  9:40 ` [syzbot ci] Re: net: devmem: improve cpu cost of RX token management syzbot ci
  3 siblings, 1 reply; 11+ messages in thread
From: Bobby Eshleman @ 2025-09-12  5:28 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, Neal Cardwell,
	David Ahern
  Cc: netdev, linux-kernel, Stanislav Fomichev, Mina Almasry,
	Bobby Eshleman

From: Bobby Eshleman <bobbyeshleman@meta.com>

Improve CPU performance of devmem token management by using page
offsets as dmabuf tokens, turning token lookups into direct array
accesses instead of xarray lookups. Consequently, the xarray can be
removed. The result is an average 5% reduction in CPU cycles spent by
devmem RX user threads.

This patch changes the meaning of tokens. Tokens previously referred to
unique fragments of pages. In this patch tokens instead represent
references to pages, not fragments.  Because of this, multiple tokens
may refer to the same page and so have identical value (e.g., two small
fragments may coexist on the same page). The token and offset pair that
the user receives uniquely identifies fragments if needed.  This assumes
that the user is not attempting to sort / uniq the token list using
tokens alone.
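
A userspace sketch of the resulting uniqueness rule (the struct
dmabuf_cmsg field names are real, the rest is illustrative):

  /* Two fragments on one page now share frag_token, so a receiver
   * must key on the (token, offset) pair, never on the token alone.
   */
  struct frag_key {
          __u32 token;    /* dmabuf_cmsg.frag_token: page index */
          __u64 offset;   /* dmabuf_cmsg.frag_offset            */
  };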

A new restriction is added to the implementation: devmem RX sockets
cannot switch dmabuf bindings. In practice, switching is a symptom of
invalid configuration, as the flow would have to be steered to a
different queue or device with a different binding, which is generally
bad for TCP flows. This restriction is necessary because the 32-bit
dmabuf token does not have enough bits to represent both the pages in a
large dmabuf and also a binding or dmabuf ID. For example, a system
with 8 NICs of 32 queues each requires 8 bits for a binding / queue ID
(8 NICs * 32 queues == 256 queues total == 2^8), which leaves only 24
bits for dmabuf pages (2^24 * 4096 / (1<<30) == 64GB). That is
insufficient for the device and queue counts on many current systems,
or for systems that need larger GPU dmabufs (as a concrete bound, my
current H100 has 80GB of GPU memory per device).
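
Spelled out, the bit budget from the example above (macro names are
illustrative):

  #define ID_BITS    8               /* 8 NICs * 32 queues == 2^8 */
  #define PAGE_BITS  (32 - ID_BITS)  /* 24 bits left for pages    */
  /* 2^24 pages * 4096 bytes/page == 2^36 bytes == 64GB max dmabuf,
   * already smaller than a single 80GB H100.
   */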

Using kperf[1] with 4 flows and 4 workers, this patch improves receive
worker CPU utilization by ~4.9% with slightly better throughput.

Before, mean cpu util for rx workers ~83.6%:

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:       4    2.30    0.00   79.43    0.00    0.65    0.21    0.00    0.00    0.00   17.41
Average:       5    2.27    0.00   80.40    0.00    0.45    0.21    0.00    0.00    0.00   16.67
Average:       6    2.28    0.00   80.47    0.00    0.46    0.25    0.00    0.00    0.00   16.54
Average:       7    2.42    0.00   82.05    0.00    0.46    0.21    0.00    0.00    0.00   14.86

After, mean cpu util % for rx workers ~78.7%:

Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
Average:       4    2.61    0.00   73.31    0.00    0.76    0.11    0.00    0.00    0.00   23.20
Average:       5    2.95    0.00   74.24    0.00    0.66    0.22    0.00    0.00    0.00   21.94
Average:       6    2.81    0.00   73.38    0.00    0.97    0.11    0.00    0.00    0.00   22.73
Average:       7    3.05    0.00   78.76    0.00    0.76    0.11    0.00    0.00    0.00   17.32

Mean throughput improves, but the gain falls within one standard
deviation (~45GB/s for 4 flows on a 50GB/s NIC, one hop).

This patch adds an array of atomics that counts the tokens outstanding
to the user for each page. There is one 4-byte atomic per dmabuf page,
per socket. For a 2GB dmabuf, this array is 2MB.
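
The overhead math, as a sketch (a PAGE_SIZE of 4096 is assumed):

  size_t pages = dmabuf_size / PAGE_SIZE;    /* 2GB / 4KB == 512Ki */
  size_t bytes = pages * sizeof(atomic_t);   /* 512Ki * 4B == 2MB  */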

[1]: https://github.com/facebookexperimental/kperf

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
Changes in v2:
- always use GFP_ZERO for binding->vec (Mina)
- remove WARN for changed binding (Mina)
- remove extraneous binding ref get (Mina)
- remove WARNs on invalid user input (Mina)
- pre-assign niovs in binding->vec for RX case (Mina)
- use atomic_set(, 0) to initialize sk_user_frags.urefs
- fix length of alloc for urefs
---
 include/net/sock.h       |   5 ++-
 net/core/devmem.c        |  17 +++-----
 net/core/devmem.h        |   2 +-
 net/core/sock.c          |  23 +++++++---
 net/ipv4/tcp.c           | 111 ++++++++++++++++-------------------------------
 net/ipv4/tcp_ipv4.c      |  39 ++++++++++++++---
 net/ipv4/tcp_minisocks.c |   2 -
 7 files changed, 99 insertions(+), 100 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 896bec2d2176..304aad494764 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -575,7 +575,10 @@ struct sock {
 #endif
 	struct rcu_head		sk_rcu;
 	netns_tracker		ns_tracker;
-	struct xarray		sk_user_frags;
+	struct {
+		struct net_devmem_dmabuf_binding	*binding;
+		atomic_t				*urefs;
+	} sk_user_frags;
 
 #if IS_ENABLED(CONFIG_PROVE_LOCKING) && IS_ENABLED(CONFIG_MODULES)
 	struct module		*sk_owner;
diff --git a/net/core/devmem.c b/net/core/devmem.c
index b4c570d4f37a..1dae43934942 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -230,14 +230,12 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 		goto err_detach;
 	}
 
-	if (direction == DMA_TO_DEVICE) {
-		binding->vec = kvmalloc_array(dmabuf->size / PAGE_SIZE,
-					      sizeof(struct net_iov *),
-					      GFP_KERNEL);
-		if (!binding->vec) {
-			err = -ENOMEM;
-			goto err_unmap;
-		}
+	binding->vec = kvmalloc_array(dmabuf->size / PAGE_SIZE,
+				      sizeof(struct net_iov *),
+				      GFP_KERNEL | __GFP_ZERO);
+	if (!binding->vec) {
+		err = -ENOMEM;
+		goto err_unmap;
 	}
 
 	/* For simplicity we expect to make PAGE_SIZE allocations, but the
@@ -293,8 +291,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 			niov->owner = &owner->area;
 			page_pool_set_dma_addr_netmem(net_iov_to_netmem(niov),
 						      net_devmem_get_dma_addr(niov));
-			if (direction == DMA_TO_DEVICE)
-				binding->vec[owner->area.base_virtual / PAGE_SIZE + i] = niov;
+			binding->vec[owner->area.base_virtual / PAGE_SIZE + i] = niov;
 		}
 
 		virtual += len;
diff --git a/net/core/devmem.h b/net/core/devmem.h
index 2ada54fb63d7..d4eb28d079bb 100644
--- a/net/core/devmem.h
+++ b/net/core/devmem.h
@@ -61,7 +61,7 @@ struct net_devmem_dmabuf_binding {
 
 	/* Array of net_iov pointers for this binding, sorted by virtual
 	 * address. This array is convenient to map the virtual addresses to
-	 * net_iovs in the TX path.
+	 * net_iovs.
 	 */
 	struct net_iov **vec;
 
diff --git a/net/core/sock.c b/net/core/sock.c
index 1f8ef4d8bcd9..15e198842b4a 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -87,6 +87,7 @@
 
 #include <linux/unaligned.h>
 #include <linux/capability.h>
+#include <linux/dma-buf.h>
 #include <linux/errno.h>
 #include <linux/errqueue.h>
 #include <linux/types.h>
@@ -151,6 +152,7 @@
 #include <uapi/linux/pidfd.h>
 
 #include "dev.h"
+#include "devmem.h"
 
 static DEFINE_MUTEX(proto_list_mutex);
 static LIST_HEAD(proto_list);
@@ -1100,32 +1102,39 @@ sock_devmem_dontneed(struct sock *sk, sockptr_t optval, unsigned int optlen)
 		return -EFAULT;
 	}
 
-	xa_lock_bh(&sk->sk_user_frags);
 	for (i = 0; i < num_tokens; i++) {
 		for (j = 0; j < tokens[i].token_count; j++) {
+			struct net_iov *niov;
+			unsigned int token;
+			netmem_ref netmem;
+
+			token = tokens[i].token_start + j;
+			if (token >= sk->sk_user_frags.binding->dmabuf->size / PAGE_SIZE)
+				break;
+
 			if (++num_frags > MAX_DONTNEED_FRAGS)
 				goto frag_limit_reached;
-
-			netmem_ref netmem = (__force netmem_ref)__xa_erase(
-				&sk->sk_user_frags, tokens[i].token_start + j);
+			niov = sk->sk_user_frags.binding->vec[token];
+			netmem = net_iov_to_netmem(niov);
 
 			if (!netmem || WARN_ON_ONCE(!netmem_is_net_iov(netmem)))
 				continue;
 
+			if (atomic_dec_if_positive(&sk->sk_user_frags.urefs[token])
+						< 0)
+				continue;
+
 			netmems[netmem_num++] = netmem;
 			if (netmem_num == ARRAY_SIZE(netmems)) {
-				xa_unlock_bh(&sk->sk_user_frags);
 				for (k = 0; k < netmem_num; k++)
 					WARN_ON_ONCE(!napi_pp_put_page(netmems[k]));
 				netmem_num = 0;
-				xa_lock_bh(&sk->sk_user_frags);
 			}
 			ret++;
 		}
 	}
 
 frag_limit_reached:
-	xa_unlock_bh(&sk->sk_user_frags);
 	for (k = 0; k < netmem_num; k++)
 		WARN_ON_ONCE(!napi_pp_put_page(netmems[k]));
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7f9c671b1ee0..438b8132ed89 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -261,6 +261,7 @@
 #include <linux/memblock.h>
 #include <linux/highmem.h>
 #include <linux/cache.h>
+#include <linux/dma-buf.h>
 #include <linux/err.h>
 #include <linux/time.h>
 #include <linux/slab.h>
@@ -491,7 +492,8 @@ void tcp_init_sock(struct sock *sk)
 
 	set_bit(SOCK_SUPPORT_ZC, &sk->sk_socket->flags);
 	sk_sockets_allocated_inc(sk);
-	xa_init_flags(&sk->sk_user_frags, XA_FLAGS_ALLOC1);
+	sk->sk_user_frags.binding = NULL;
+	sk->sk_user_frags.urefs = NULL;
 }
 EXPORT_IPV6_MOD(tcp_init_sock);
 
@@ -2402,68 +2404,6 @@ static int tcp_inq_hint(struct sock *sk)
 	return inq;
 }
 
-/* batch __xa_alloc() calls and reduce xa_lock()/xa_unlock() overhead. */
-struct tcp_xa_pool {
-	u8		max; /* max <= MAX_SKB_FRAGS */
-	u8		idx; /* idx <= max */
-	__u32		tokens[MAX_SKB_FRAGS];
-	netmem_ref	netmems[MAX_SKB_FRAGS];
-};
-
-static void tcp_xa_pool_commit_locked(struct sock *sk, struct tcp_xa_pool *p)
-{
-	int i;
-
-	/* Commit part that has been copied to user space. */
-	for (i = 0; i < p->idx; i++)
-		__xa_cmpxchg(&sk->sk_user_frags, p->tokens[i], XA_ZERO_ENTRY,
-			     (__force void *)p->netmems[i], GFP_KERNEL);
-	/* Rollback what has been pre-allocated and is no longer needed. */
-	for (; i < p->max; i++)
-		__xa_erase(&sk->sk_user_frags, p->tokens[i]);
-
-	p->max = 0;
-	p->idx = 0;
-}
-
-static void tcp_xa_pool_commit(struct sock *sk, struct tcp_xa_pool *p)
-{
-	if (!p->max)
-		return;
-
-	xa_lock_bh(&sk->sk_user_frags);
-
-	tcp_xa_pool_commit_locked(sk, p);
-
-	xa_unlock_bh(&sk->sk_user_frags);
-}
-
-static int tcp_xa_pool_refill(struct sock *sk, struct tcp_xa_pool *p,
-			      unsigned int max_frags)
-{
-	int err, k;
-
-	if (p->idx < p->max)
-		return 0;
-
-	xa_lock_bh(&sk->sk_user_frags);
-
-	tcp_xa_pool_commit_locked(sk, p);
-
-	for (k = 0; k < max_frags; k++) {
-		err = __xa_alloc(&sk->sk_user_frags, &p->tokens[k],
-				 XA_ZERO_ENTRY, xa_limit_31b, GFP_KERNEL);
-		if (err)
-			break;
-	}
-
-	xa_unlock_bh(&sk->sk_user_frags);
-
-	p->max = k;
-	p->idx = 0;
-	return k ? 0 : err;
-}
-
 /* On error, returns the -errno. On success, returns number of bytes sent to the
  * user. May not consume all of @remaining_len.
  */
@@ -2472,14 +2412,11 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
 			      int remaining_len)
 {
 	struct dmabuf_cmsg dmabuf_cmsg = { 0 };
-	struct tcp_xa_pool tcp_xa_pool;
 	unsigned int start;
 	int i, copy, n;
 	int sent = 0;
 	int err = 0;
 
-	tcp_xa_pool.max = 0;
-	tcp_xa_pool.idx = 0;
 	do {
 		start = skb_headlen(skb);
 
@@ -2526,8 +2463,11 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
 		 */
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+			struct net_devmem_dmabuf_binding *binding;
 			struct net_iov *niov;
 			u64 frag_offset;
+			size_t len;
+			u32 token;
 			int end;
 
 			/* !skb_frags_readable() should indicate that ALL the
@@ -2560,13 +2500,39 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
 					      start;
 				dmabuf_cmsg.frag_offset = frag_offset;
 				dmabuf_cmsg.frag_size = copy;
-				err = tcp_xa_pool_refill(sk, &tcp_xa_pool,
-							 skb_shinfo(skb)->nr_frags - i);
-				if (err)
+
+				binding = net_devmem_iov_binding(niov);
+
+				if (!sk->sk_user_frags.binding) {
+					sk->sk_user_frags.binding = binding;
+
+					len = binding->dmabuf->size / PAGE_SIZE;
+					sk->sk_user_frags.urefs = kzalloc(len * sizeof(*sk->sk_user_frags.urefs),
+									  GFP_KERNEL);
+					if (!sk->sk_user_frags.urefs) {
+						sk->sk_user_frags.binding = NULL;
+						err = -ENOMEM;
+						goto out;
+					}
+
+					for (token = 0; token < len; token++)
+						atomic_set(&sk->sk_user_frags.urefs[token],
+							   0);
+
+					spin_lock_bh(&devmem_sockets_lock);
+					list_add(&sk->sk_devmem_list, &devmem_sockets_list);
+					spin_unlock_bh(&devmem_sockets_lock);
+				}
+
+				if (sk->sk_user_frags.binding != binding) {
+					err = -EFAULT;
 					goto out;
+				}
+
+				token = net_iov_virtual_addr(niov) >> PAGE_SHIFT;
+				dmabuf_cmsg.frag_token = token;
 
 				/* Will perform the exchange later */
-				dmabuf_cmsg.frag_token = tcp_xa_pool.tokens[tcp_xa_pool.idx];
 				dmabuf_cmsg.dmabuf_id = net_devmem_iov_binding_id(niov);
 
 				offset += copy;
@@ -2579,8 +2545,9 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
 				if (err)
 					goto out;
 
+				atomic_inc(&sk->sk_user_frags.urefs[token]);
+
 				atomic_long_inc(&niov->pp_ref_count);
-				tcp_xa_pool.netmems[tcp_xa_pool.idx++] = skb_frag_netmem(frag);
 
 				sent += copy;
 
@@ -2590,7 +2557,6 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
 			start = end;
 		}
 
-		tcp_xa_pool_commit(sk, &tcp_xa_pool);
 		if (!remaining_len)
 			goto out;
 
@@ -2608,7 +2574,6 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
 	}
 
 out:
-	tcp_xa_pool_commit(sk, &tcp_xa_pool);
 	if (!sent)
 		sent = err;
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 2a0602035729..68ebf96d06f8 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -87,6 +87,9 @@
 #include <crypto/hash.h>
 #include <linux/scatterlist.h>
 
+#include <linux/dma-buf.h>
+#include "../core/devmem.h"
+
 #include <trace/events/tcp.h>
 
 #ifdef CONFIG_TCP_MD5SIG
@@ -2525,11 +2528,37 @@ static int tcp_v4_init_sock(struct sock *sk)
 static void tcp_release_user_frags(struct sock *sk)
 {
 #ifdef CONFIG_PAGE_POOL
-	unsigned long index;
-	void *netmem;
+	struct net_devmem_dmabuf_binding *binding;
+	struct net_iov *niov;
+	unsigned int token;
+	netmem_ref netmem;
+
+	if (!sk->sk_user_frags.urefs)
+		return;
+
+	binding = sk->sk_user_frags.binding;
+	if (!binding || !binding->vec)
+		return;
+
+	for (token = 0; token < binding->dmabuf->size / PAGE_SIZE; token++) {
+		niov = binding->vec[token];
+
+		/* never used by recvmsg() */
+		if (!niov)
+			continue;
+
+		if (!net_is_devmem_iov(niov))
+			continue;
+
+		netmem = net_iov_to_netmem(niov);
 
-	xa_for_each(&sk->sk_user_frags, index, netmem)
-		WARN_ON_ONCE(!napi_pp_put_page((__force netmem_ref)netmem));
+		while (atomic_dec_return(&sk->sk_user_frags.urefs[token]) >= 0)
+			WARN_ON_ONCE(!napi_pp_put_page(netmem));
+	}
+
+	sk->sk_user_frags.binding = NULL;
+	kvfree(sk->sk_user_frags.urefs);
+	sk->sk_user_frags.urefs = NULL;
 #endif
 }
 
@@ -2539,8 +2568,6 @@ void tcp_v4_destroy_sock(struct sock *sk)
 
 	tcp_release_user_frags(sk);
 
-	xa_destroy(&sk->sk_user_frags);
-
 	trace_tcp_destroy_sock(sk);
 
 	tcp_clear_xmit_timers(sk);
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 7c2ae07d8d5d..6a44df3074df 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -630,8 +630,6 @@ struct sock *tcp_create_openreq_child(const struct sock *sk,
 
 	__TCP_INC_STATS(sock_net(sk), TCP_MIB_PASSIVEOPENS);
 
-	xa_init_flags(&newsk->sk_user_frags, XA_FLAGS_ALLOC1);
-
 	return newsk;
 }
 EXPORT_SYMBOL(tcp_create_openreq_child);

-- 
2.47.3



* [PATCH net-next v2 3/3] net: ethtool: prevent user from breaking devmem single-binding rule
  2025-09-12  5:28 [PATCH net-next v2 0/3] net: devmem: improve cpu cost of RX token management Bobby Eshleman
  2025-09-12  5:28 ` [PATCH net-next v2 1/3] net: devmem: rename tx_vec to vec in dmabuf binding Bobby Eshleman
  2025-09-12  5:28 ` [PATCH net-next v2 2/3] net: devmem: use niov array for token management Bobby Eshleman
@ 2025-09-12  5:28 ` Bobby Eshleman
  2025-09-12 22:23   ` Stanislav Fomichev
  2025-09-12  9:40 ` [syzbot ci] Re: net: devmem: improve cpu cost of RX token management syzbot ci
  3 siblings, 1 reply; 11+ messages in thread
From: Bobby Eshleman @ 2025-09-12  5:28 UTC (permalink / raw)
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, Neal Cardwell,
	David Ahern
  Cc: netdev, linux-kernel, Stanislav Fomichev, Mina Almasry,
	Bobby Eshleman

From: Bobby Eshleman <bobbyeshleman@meta.com>

Prevent the user from breaking devmem's single-binding rule by rejecting
ethtool TCP/IP requests to modify or delete rules that would redirect a
devmem socket to a queue with a different dmabuf binding. This is done
on a "best effort" basis because not all steering rule types are
validated.

If an ethtool_rxnfc flow steering rule evaluates true for:

1) matching a devmem socket's IP address
2) selecting a queue with a different dmabuf binding
3) being TCP/IP (v4 or v6)

... then reject the ethtool_rxnfc request with -EBUSY to indicate a
devmem socket is using the current rules that steer it to its dmabuf
binding.

Non-TCP/IP rules are ignored entirely, so if they match a devmem flow
they can still break devmem sockets (for example, rules on bytes 0 and
1 of L2 headers). It is still unknown to me whether these are possible
to evaluate at the time of the ethtool call, so they are left to future
work (or never, if not possible).

FLOW_RSS rules which guide flows to an RSS context are also not
evaluated yet. This seems feasible, but the correct path towards
retrieving the RSS context and scanning the queues for dmabuf bindings
seems unclear and maybe overkill (re-use parts of ethtool_get_rxnfc?).
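
The address checks below build on the usual masked-compare idiom; a
minimal sketch, assuming set mask bits mean "compare this bit":

  /* A rule field matches the socket's address when every bit
   * selected by the mask agrees.
   */
  static bool masked_match(__be32 rule, __be32 addr, __be32 mask)
  {
          return !((rule ^ addr) & mask);  /* zero mask == wildcard */
  }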

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
 include/net/sock.h  |   1 +
 net/ethtool/ioctl.c | 144 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp.c      |   9 ++++
 net/ipv4/tcp_ipv4.c |   6 +++
 4 files changed, 160 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 304aad494764..73a1ff59dcde 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -579,6 +579,7 @@ struct sock {
 		struct net_devmem_dmabuf_binding	*binding;
 		atomic_t				*urefs;
 	} sk_user_frags;
+	struct list_head	sk_devmem_list;
 
 #if IS_ENABLED(CONFIG_PROVE_LOCKING) && IS_ENABLED(CONFIG_MODULES)
 	struct module		*sk_owner;
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 0b2a4d0573b3..99676ac9bbaa 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -29,11 +29,16 @@
 #include <linux/utsname.h>
 #include <net/devlink.h>
 #include <net/ipv6.h>
+#include <net/netdev_rx_queue.h>
 #include <net/xdp_sock_drv.h>
 #include <net/flow_offload.h>
 #include <net/netdev_lock.h>
 #include <linux/ethtool_netlink.h>
 #include "common.h"
+#include "../core/devmem.h"
+
+extern struct list_head devmem_sockets_list;
+extern spinlock_t devmem_sockets_lock;
 
 /* State held across locks and calls for commands which have devlink fallback */
 struct ethtool_devlink_compat {
@@ -1169,6 +1174,142 @@ ethtool_get_rxfh_fields(struct net_device *dev, u32 cmd, void __user *useraddr)
 	return ethtool_rxnfc_copy_to_user(useraddr, &info, info_size, NULL);
 }
 
+static bool
+__ethtool_rx_flow_spec_breaks_devmem_sk(struct ethtool_rx_flow_spec *fs,
+					struct net_device *dev,
+					struct sock *sk)
+{
+	struct in6_addr saddr6, smask6, daddr6, dmask6;
+	struct sockaddr_storage saddr, daddr;
+	struct sockaddr_in6 *src6, *dst6;
+	struct sockaddr_in *src4, *dst4;
+	struct netdev_rx_queue *rxq;
+	__u32 flow_type;
+
+	if (dev != __sk_dst_get(sk)->dev)
+		return false;
+
+	src6 = (struct sockaddr_in6 *)&saddr;
+	dst6 = (struct sockaddr_in6 *)&daddr;
+	src4 = (struct sockaddr_in *)&saddr;
+	dst4 = (struct sockaddr_in *)&daddr;
+
+	if (sk->sk_family == AF_INET6) {
+		src6->sin6_port = inet_sk(sk)->inet_sport;
+		src6->sin6_addr = inet6_sk(sk)->saddr;
+		dst6->sin6_port = inet_sk(sk)->inet_dport;
+		dst6->sin6_addr = sk->sk_v6_daddr;
+	} else {
+		src4->sin_port = inet_sk(sk)->inet_sport;
+		src4->sin_addr.s_addr = inet_sk(sk)->inet_saddr;
+		dst4->sin_port = inet_sk(sk)->inet_dport;
+		dst4->sin_addr.s_addr = inet_sk(sk)->inet_daddr;
+	}
+
+	flow_type = fs->flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
+
+	rxq = __netif_get_rx_queue(dev, fs->ring_cookie);
+	if (!rxq)
+		return false;
+
+	/* If the requested binding and the sk binding is equal then we know
+	 * this rule can't redirect to a different binding.
+	 */
+	if (rxq->mp_params.mp_priv == sk->sk_user_frags.binding)
+		return false;
+
+	/* Reject rules that redirect RX devmem sockets to a queue with a
+	 * different dmabuf binding. Because these sockets are on the RX side
+	 * (registered in the recvmsg() path), we compare the opposite
+	 * endpoints: the socket source with the rule destination, and the
+	 * socket destination with the rule source.
+	 *
+	 * Only perform checks on the simplest rules to check, that is, IP/TCP
+	 * rules. Flow hash options are not verified, so may still break TCP
+	 * devmem flows in theory (VLAN tag, bytes 0 and 1 of L4 header,
+	 * etc...). The author of this function was simply not sure how
+	 * to validate these at the time of the ethtool call.
+	 */
+	switch (flow_type) {
+	case IPV4_USER_FLOW: {
+		const struct ethtool_usrip4_spec *v4_usr_spec, *v4_usr_m_spec;
+
+		v4_usr_spec = &fs->h_u.usr_ip4_spec;
+		v4_usr_m_spec = &fs->m_u.usr_ip4_spec;
+
+		if (((v4_usr_spec->ip4src ^ dst4->sin_addr.s_addr) & v4_usr_m_spec->ip4src) ||
+		    (v4_usr_spec->ip4dst ^ src4->sin_addr.s_addr) & v4_usr_m_spec->ip4dst) {
+			return true;
+		}
+
+		return false;
+	}
+	case TCP_V4_FLOW: {
+		const struct ethtool_tcpip4_spec *v4_spec, *v4_m_spec;
+
+		v4_spec = &fs->h_u.tcp_ip4_spec;
+		v4_m_spec = &fs->m_u.tcp_ip4_spec;
+
+		if (((v4_spec->ip4src ^ dst4->sin_addr.s_addr) & v4_m_spec->ip4src) ||
+		    ((v4_spec->ip4dst ^ src4->sin_addr.s_addr) & v4_m_spec->ip4dst))
+			return true;
+
+		return false;
+	}
+	case IPV6_USER_FLOW: {
+		const struct ethtool_usrip6_spec *v6_usr_spec, *v6_usr_m_spec;
+
+		v6_usr_spec = &fs->h_u.usr_ip6_spec;
+		v6_usr_m_spec = &fs->m_u.usr_ip6_spec;
+
+		memcpy(&daddr6, v6_usr_spec->ip6dst, sizeof(daddr6));
+		memcpy(&dmask6, v6_usr_m_spec->ip6dst, sizeof(dmask6));
+		memcpy(&saddr6, v6_usr_spec->ip6src, sizeof(saddr6));
+		memcpy(&smask6, v6_usr_m_spec->ip6src, sizeof(smask6));
+
+		return !ipv6_masked_addr_cmp(&saddr6, &smask6, &dst6->sin6_addr) &&
+		       !ipv6_masked_addr_cmp(&daddr6, &dmask6, &src6->sin6_addr);
+	}
+	case TCP_V6_FLOW: {
+		const struct ethtool_tcpip6_spec *v6_spec, *v6_m_spec;
+
+		v6_spec = &fs->h_u.tcp_ip6_spec;
+		v6_m_spec = &fs->m_u.tcp_ip6_spec;
+
+		memcpy(&daddr6, v6_spec->ip6dst, sizeof(daddr6));
+		memcpy(&dmask6, v6_m_spec->ip6dst, sizeof(dmask6));
+		memcpy(&saddr6, v6_spec->ip6src, sizeof(saddr6));
+		memcpy(&smask6, v6_m_spec->ip6src, sizeof(smask6));
+
+		return !ipv6_masked_addr_cmp(&daddr6, &dmask6, &src6->sin6_addr) &&
+		       !ipv6_masked_addr_cmp(&saddr6, &smask6, &dst6->sin6_addr);
+	}
+	default:
+		return false;
+	}
+}
+
+static bool
+ethtool_rx_flow_spec_breaks_devmem_sk(struct ethtool_rx_flow_spec *fs,
+				      struct net_device *dev)
+{
+	struct sock *sk;
+	bool ret;
+
+	ret = false;
+
+	spin_lock_bh(&devmem_sockets_lock);
+	list_for_each_entry(sk, &devmem_sockets_list, sk_devmem_list) {
+		if (__ethtool_rx_flow_spec_breaks_devmem_sk(fs, dev, sk)) {
+			ret = true;
+			break;
+		}
+	}
+	spin_unlock_bh(&devmem_sockets_lock);
+
+	return ret;
+}
+
 static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev,
 						u32 cmd, void __user *useraddr)
 {
@@ -1197,6 +1338,9 @@ static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev,
 			return -EINVAL;
 	}
 
+	if (ethtool_rx_flow_spec_breaks_devmem_sk(&info.fs, dev))
+		return -EBUSY;
+
 	rc = ops->set_rxnfc(dev, &info);
 	if (rc)
 		return rc;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 438b8132ed89..3f57e658ea80 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -311,6 +311,12 @@ DEFINE_STATIC_KEY_FALSE(tcp_have_smc);
 EXPORT_SYMBOL(tcp_have_smc);
 #endif
 
+struct list_head devmem_sockets_list;
+EXPORT_SYMBOL_GPL(devmem_sockets_list);
+
+DEFINE_SPINLOCK(devmem_sockets_lock);
+EXPORT_SYMBOL_GPL(devmem_sockets_lock);
+
 /*
  * Current number of TCP sockets.
  */
@@ -5229,4 +5235,7 @@ void __init tcp_init(void)
 	BUG_ON(tcp_register_congestion_control(&tcp_reno) != 0);
 	tcp_tsq_work_init();
 	mptcp_init();
+
+	spin_lock_init(&devmem_sockets_lock);
+	INIT_LIST_HEAD(&devmem_sockets_list);
 }
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 68ebf96d06f8..a3213c97aed9 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -92,6 +92,9 @@
 
 #include <trace/events/tcp.h>
 
+extern struct list_head devmem_sockets_list;
+extern spinlock_t devmem_sockets_lock;
+
 #ifdef CONFIG_TCP_MD5SIG
 static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
 			       __be32 daddr, __be32 saddr, const struct tcphdr *th);
@@ -2559,6 +2562,9 @@ static void tcp_release_user_frags(struct sock *sk)
 	sk->sk_user_frags.binding = NULL;
 	kvfree(sk->sk_user_frags.urefs);
 	sk->sk_user_frags.urefs = NULL;
+	spin_lock_bh(&devmem_sockets_lock);
+	list_del(&sk->sk_devmem_list);
+	spin_unlock_bh(&devmem_sockets_lock);
 #endif
 }
 

-- 
2.47.3



* [syzbot ci] Re: net: devmem: improve cpu cost of RX token management
  2025-09-12  5:28 [PATCH net-next v2 0/3] net: devmem: improve cpu cost of RX token management Bobby Eshleman
                   ` (2 preceding siblings ...)
  2025-09-12  5:28 ` [PATCH net-next v2 3/3] net: ethtool: prevent user from breaking devmem single-binding rule Bobby Eshleman
@ 2025-09-12  9:40 ` syzbot ci
  3 siblings, 0 replies; 11+ messages in thread
From: syzbot ci @ 2025-09-12  9:40 UTC (permalink / raw)
  To: almasrymina, bobbyeshleman, bobbyeshleman, davem, dsahern,
	edumazet, horms, kuba, kuniyu, linux-kernel, ncardwell, netdev,
	pabeni, sdf, willemb
  Cc: syzbot, syzkaller-bugs

syzbot ci has tested the following series

[v2] net: devmem: improve cpu cost of RX token management
https://lore.kernel.org/all/20250911-scratch-bobbyeshleman-devmem-tcp-token-upstream-v2-0-c80d735bd453@meta.com
* [PATCH net-next v2 1/3] net: devmem: rename tx_vec to vec in dmabuf binding
* [PATCH net-next v2 2/3] net: devmem: use niov array for token management
* [PATCH net-next v2 3/3] net: ethtool: prevent user from breaking devmem single-binding rule

and found the following issue:
general protection fault in sock_devmem_dontneed

Full report is available here:
https://ci.syzbot.org/series/40b2252a-f8bb-4cec-bfc1-2ff8a3c55336

***

general protection fault in sock_devmem_dontneed

tree:      net-next
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git
base:      5adf6f2b9972dbb69f4dd11bae52ba251c64ecb7
arch:      amd64
compiler:  Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
config:    https://ci.syzbot.org/builds/2c30c608-f14f-4e6d-9772-cc5e129939fc/config
C repro:   https://ci.syzbot.org/findings/c89c36f8-4666-47d0-bc39-35662a268e4d/c_repro
syz repro: https://ci.syzbot.org/findings/c89c36f8-4666-47d0-bc39-35662a268e4d/syz_repro

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 UID: 0 PID: 6028 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:sock_devmem_dontneed+0x40b/0x910 net/core/sock.c:1112
Code: 8b 44 24 18 44 8b 20 44 03 64 24 14 48 8b 44 24 68 80 3c 18 00 74 08 4c 89 ef e8 f0 bb c9 f8 4d 8b 7d 00 4c 89 f8 48 c1 e8 03 <80> 3c 18 00 74 08 4c 89 ff e8 d7 bb c9 f8 4d 8b 2f 4c 89 e8 48 c1
RSP: 0018:ffffc90002987ac0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: dffffc0000000000 RCX: 1ffff11020d27e78
RDX: ffff88810a039cc0 RSI: 0000000000000003 RDI: 0000000000000000
RBP: ffffc90002987c50 R08: ffffc90002987bdf R09: 0000000000000000
R10: ffffc90002987b60 R11: fffff52000530f7c R12: 0000000000000006
R13: ffff8881235cb710 R14: 0000000000000000 R15: 0000000000000000
FS:  000055555e866500(0000) GS:ffff8881a3c14000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b31b63fff CR3: 0000000027a20000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 sk_setsockopt+0x682/0x2dc0 net/core/sock.c:1301
 do_sock_setsockopt+0x11b/0x1b0 net/socket.c:2340
 __sys_setsockopt net/socket.c:2369 [inline]
 __do_sys_setsockopt net/socket.c:2375 [inline]
 __se_sys_setsockopt net/socket.c:2372 [inline]
 __x64_sys_setsockopt+0x13f/0x1b0 net/socket.c:2372
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7faf24f8eba9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffc3eb96018 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00007faf251d5fa0 RCX: 00007faf24f8eba9
RDX: 0000000000000050 RSI: 0000000000000001 RDI: 0000000000000003
RBP: 00007faf25011e19 R08: 0000000000000048 R09: 0000000000000000
R10: 0000200000000100 R11: 0000000000000246 R12: 0000000000000000
R13: 00007faf251d5fa0 R14: 00007faf251d5fa0 R15: 0000000000000005
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:sock_devmem_dontneed+0x40b/0x910 net/core/sock.c:1112
Code: 8b 44 24 18 44 8b 20 44 03 64 24 14 48 8b 44 24 68 80 3c 18 00 74 08 4c 89 ef e8 f0 bb c9 f8 4d 8b 7d 00 4c 89 f8 48 c1 e8 03 <80> 3c 18 00 74 08 4c 89 ff e8 d7 bb c9 f8 4d 8b 2f 4c 89 e8 48 c1
RSP: 0018:ffffc90002987ac0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: dffffc0000000000 RCX: 1ffff11020d27e78
RDX: ffff88810a039cc0 RSI: 0000000000000003 RDI: 0000000000000000
RBP: ffffc90002987c50 R08: ffffc90002987bdf R09: 0000000000000000
R10: ffffc90002987b60 R11: fffff52000530f7c R12: 0000000000000006
R13: ffff8881235cb710 R14: 0000000000000000 R15: 0000000000000000
FS:  000055555e866500(0000) GS:ffff8881a3c14000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b31b63fff CR3: 0000000027a20000 CR4: 00000000000006f0
----------------
Code disassembly (best guess):
   0:	8b 44 24 18          	mov    0x18(%rsp),%eax
   4:	44 8b 20             	mov    (%rax),%r12d
   7:	44 03 64 24 14       	add    0x14(%rsp),%r12d
   c:	48 8b 44 24 68       	mov    0x68(%rsp),%rax
  11:	80 3c 18 00          	cmpb   $0x0,(%rax,%rbx,1)
  15:	74 08                	je     0x1f
  17:	4c 89 ef             	mov    %r13,%rdi
  1a:	e8 f0 bb c9 f8       	call   0xf8c9bc0f
  1f:	4d 8b 7d 00          	mov    0x0(%r13),%r15
  23:	4c 89 f8             	mov    %r15,%rax
  26:	48 c1 e8 03          	shr    $0x3,%rax
* 2a:	80 3c 18 00          	cmpb   $0x0,(%rax,%rbx,1) <-- trapping instruction
  2e:	74 08                	je     0x38
  30:	4c 89 ff             	mov    %r15,%rdi
  33:	e8 d7 bb c9 f8       	call   0xf8c9bc0f
  38:	4d 8b 2f             	mov    (%r15),%r13
  3b:	4c 89 e8             	mov    %r13,%rax
  3e:	48                   	rex.W
  3f:	c1                   	.byte 0xc1


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
  Tested-by: syzbot@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.


* Re: [PATCH net-next v2 3/3] net: ethtool: prevent user from breaking devmem single-binding rule
  2025-09-12  5:28 ` [PATCH net-next v2 3/3] net: ethtool: prevent user from breaking devmem single-binding rule Bobby Eshleman
@ 2025-09-12 22:23   ` Stanislav Fomichev
  2025-09-17 23:07     ` Mina Almasry
  0 siblings, 1 reply; 11+ messages in thread
From: Stanislav Fomichev @ 2025-09-12 22:23 UTC (permalink / raw)
  To: Bobby Eshleman
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, Neal Cardwell,
	David Ahern, netdev, linux-kernel, Stanislav Fomichev,
	Mina Almasry, Bobby Eshleman

On 09/11, Bobby Eshleman wrote:
> From: Bobby Eshleman <bobbyeshleman@meta.com>
> 
> Prevent the user from breaking devmem's single-binding rule by rejecting
> ethtool TCP/IP requests to modify or delete rules that would redirect a
> devmem socket to a queue with a different dmabuf binding. This is done
> on a "best effort" basis because not all steering rule types are
> validated.
> 
> If an ethtool_rxnfc flow steering rule evaluates true for:
> 
> 1) matching a devmem socket's IP address
> 2) selecting a queue with a different dmabuf binding
> 3) being TCP/IP (v4 or v6)
> 
> ... then reject the ethtool_rxnfc request with -EBUSY to indicate a
> devmem socket is using the current rules that steer it to its dmabuf
> binding.
> 
> Non-TCP/IP rules are ignored entirely, so if they match a devmem flow
> they can still break devmem sockets (for example, rules on bytes 0 and
> 1 of L2 headers). It is still unknown to me whether these are possible
> to evaluate at the time of the ethtool call, so they are left to future
> work (or never, if not possible).
> 
> FLOW_RSS rules which guide flows to an RSS context are also not
> evaluated yet. This seems feasible, but the correct path towards
> retrieving the RSS context and scanning the queues for dmabuf bindings
> seems unclear and maybe overkill (re-use parts of ethtool_get_rxnfc?).
> 
> Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
> ---
>  include/net/sock.h  |   1 +
>  net/ethtool/ioctl.c | 144 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  net/ipv4/tcp.c      |   9 ++++
>  net/ipv4/tcp_ipv4.c |   6 +++
>  4 files changed, 160 insertions(+)
> 
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 304aad494764..73a1ff59dcde 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -579,6 +579,7 @@ struct sock {
>  		struct net_devmem_dmabuf_binding	*binding;
>  		atomic_t				*urefs;
>  	} sk_user_frags;
> +	struct list_head	sk_devmem_list;
>  
>  #if IS_ENABLED(CONFIG_PROVE_LOCKING) && IS_ENABLED(CONFIG_MODULES)
>  	struct module		*sk_owner;
> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
> index 0b2a4d0573b3..99676ac9bbaa 100644
> --- a/net/ethtool/ioctl.c
> +++ b/net/ethtool/ioctl.c
> @@ -29,11 +29,16 @@
>  #include <linux/utsname.h>
>  #include <net/devlink.h>
>  #include <net/ipv6.h>
> +#include <net/netdev_rx_queue.h>
>  #include <net/xdp_sock_drv.h>
>  #include <net/flow_offload.h>
>  #include <net/netdev_lock.h>
>  #include <linux/ethtool_netlink.h>
>  #include "common.h"
> +#include "../core/devmem.h"
> +
> +extern struct list_head devmem_sockets_list;
> +extern spinlock_t devmem_sockets_lock;
>  
>  /* State held across locks and calls for commands which have devlink fallback */
>  struct ethtool_devlink_compat {
> @@ -1169,6 +1174,142 @@ ethtool_get_rxfh_fields(struct net_device *dev, u32 cmd, void __user *useraddr)
>  	return ethtool_rxnfc_copy_to_user(useraddr, &info, info_size, NULL);
>  }
>  
> +static bool
> +__ethtool_rx_flow_spec_breaks_devmem_sk(struct ethtool_rx_flow_spec *fs,
> +					struct net_device *dev,
> +					struct sock *sk)
> +{
> +	struct in6_addr saddr6, smask6, daddr6, dmask6;
> +	struct sockaddr_storage saddr, daddr;
> +	struct sockaddr_in6 *src6, *dst6;
> +	struct sockaddr_in *src4, *dst4;
> +	struct netdev_rx_queue *rxq;
> +	__u32 flow_type;
> +
> +	if (dev != __sk_dst_get(sk)->dev)
> +		return false;
> +
> +	src6 = (struct sockaddr_in6 *)&saddr;
> +	dst6 = (struct sockaddr_in6 *)&daddr;
> +	src4 = (struct sockaddr_in *)&saddr;
> +	dst4 = (struct sockaddr_in *)&daddr;
> +
> +	if (sk->sk_family == AF_INET6) {
> +		src6->sin6_port = inet_sk(sk)->inet_sport;
> +		src6->sin6_addr = inet6_sk(sk)->saddr;
> +		dst6->sin6_port = inet_sk(sk)->inet_dport;
> +		dst6->sin6_addr = sk->sk_v6_daddr;
> +	} else {
> +		src4->sin_port = inet_sk(sk)->inet_sport;
> +		src4->sin_addr.s_addr = inet_sk(sk)->inet_saddr;
> +		dst4->sin_port = inet_sk(sk)->inet_dport;
> +		dst4->sin_addr.s_addr = inet_sk(sk)->inet_daddr;
> +	}
> +
> +	flow_type = fs->flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
> +
> +	rxq = __netif_get_rx_queue(dev, fs->ring_cookie);
> +	if (!rxq)
> +		return false;
> +
> +	/* If the requested binding and the sk binding is equal then we know
> +	 * this rule can't redirect to a different binding.
> +	 */
> +	if (rxq->mp_params.mp_priv == sk->sk_user_frags.binding)
> +		return false;
> +
> +	/* Reject rules that redirect RX devmem sockets to a queue with a
> +	 * different dmabuf binding. Because these sockets are on the RX side
> +	 * (registered in the recvmsg() path), we compare the opposite
> +	 * endpoints: the socket source with the rule destination, and the
> +	 * socket destination with the rule source.
> +	 *
> +	 * Only perform checks on the simplest rules to check, that is, IP/TCP
> +	 * rules. Flow hash options are not verified, so may still break TCP
> +	 * devmem flows in theory (VLAN tag, bytes 0 and 1 of L4 header,
> +	 * etc...). The author of this function was simply not sure how
> +	 * to validate these at the time of the ethtool call.
> +	 */
> +	switch (flow_type) {
> +	case IPV4_USER_FLOW: {
> +		const struct ethtool_usrip4_spec *v4_usr_spec, *v4_usr_m_spec;
> +
> +		v4_usr_spec = &fs->h_u.usr_ip4_spec;
> +		v4_usr_m_spec = &fs->m_u.usr_ip4_spec;
> +
> +		if (((v4_usr_spec->ip4src ^ dst4->sin_addr.s_addr) & v4_usr_m_spec->ip4src) ||
> +		    (v4_usr_spec->ip4dst ^ src4->sin_addr.s_addr) & v4_usr_m_spec->ip4dst) {
> +			return true;
> +		}
> +
> +		return false;
> +	}
> +	case TCP_V4_FLOW: {
> +		const struct ethtool_tcpip4_spec *v4_spec, *v4_m_spec;
> +
> +		v4_spec = &fs->h_u.tcp_ip4_spec;
> +		v4_m_spec = &fs->m_u.tcp_ip4_spec;
> +
> +		if (((v4_spec->ip4src ^ dst4->sin_addr.s_addr) & v4_m_spec->ip4src) ||
> +		    ((v4_spec->ip4dst ^ src4->sin_addr.s_addr) & v4_m_spec->ip4dst))
> +			return true;
> +

Don't the ports need to be checked as well? But my overall preference
would be to go back to checking this condition during recvmsg. We can
pick some new obscure errno to clearly explain to the user what
happened: EPIPE or something similar, to mean that the socket is
cooked. But let's see if Mina has a different opinion...
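
Concretely, something along these lines in the recvmsg path (a sketch;
the errno choice is illustrative):

  if (sk->sk_user_frags.binding != binding) {
          err = -EPIPE;  /* flow was re-steered; socket is cooked */
          goto out;
  }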


* Re: [PATCH net-next v2 3/3] net: ethtool: prevent user from breaking devmem single-binding rule
  2025-09-12 22:23   ` Stanislav Fomichev
@ 2025-09-17 23:07     ` Mina Almasry
  0 siblings, 0 replies; 11+ messages in thread
From: Mina Almasry @ 2025-09-17 23:07 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Bobby Eshleman, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman, Kuniyuki Iwashima, Willem de Bruijn,
	Neal Cardwell, David Ahern, netdev, linux-kernel,
	Stanislav Fomichev, Bobby Eshleman

On Fri, Sep 12, 2025 at 3:23 PM Stanislav Fomichev <stfomichev@gmail.com> wrote:
>
> On 09/11, Bobby Eshleman wrote:
> > From: Bobby Eshleman <bobbyeshleman@meta.com>
> >
> > Prevent the user from breaking devmem's single-binding rule by rejecting
> > ethtool TCP/IP requests to modify or delete rules that would redirect a
> > devmem socket to a queue with a different dmabuf binding. This is done
> > on a "best effort" basis because not all steering rule types are
> > validated.
> >
> > If an ethtool_rxnfc flow steering rule evaluates true for:
> >
> > 1) matching a devmem socket's IP address
> > 2) selecting a queue with a different dmabuf binding
> > 3) being TCP/IP (v4 or v6)
> >
> > ... then reject the ethtool_rxnfc request with -EBUSY to indicate a
> > devmem socket is using the current rules that steer it to its dmabuf
> > binding.
> >
> > Non-TCP/IP rules are ignored entirely, so if they match a devmem flow
> > they can still break devmem sockets (for example, rules on bytes 0 and
> > 1 of L2 headers). It is still unknown to me whether these are possible
> > to evaluate at the time of the ethtool call, so they are left to future
> > work (or never, if not possible).
> >
> > FLOW_RSS rules which guide flows to an RSS context are also not
> > evaluated yet. This seems feasible, but the correct path towards
> > retrieving the RSS context and scanning the queues for dmabuf bindings
> > seems unclear and maybe overkill (re-use parts of ethtool_get_rxnfc?).
> >
> > Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
> > ---
> >  include/net/sock.h  |   1 +
> >  net/ethtool/ioctl.c | 144 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  net/ipv4/tcp.c      |   9 ++++
> >  net/ipv4/tcp_ipv4.c |   6 +++
> >  4 files changed, 160 insertions(+)
> >
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index 304aad494764..73a1ff59dcde 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -579,6 +579,7 @@ struct sock {
> >               struct net_devmem_dmabuf_binding        *binding;
> >               atomic_t                                *urefs;
> >       } sk_user_frags;
> > +     struct list_head        sk_devmem_list;
> >
> >  #if IS_ENABLED(CONFIG_PROVE_LOCKING) && IS_ENABLED(CONFIG_MODULES)
> >       struct module           *sk_owner;
> > diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
> > index 0b2a4d0573b3..99676ac9bbaa 100644
> > --- a/net/ethtool/ioctl.c
> > +++ b/net/ethtool/ioctl.c
> > @@ -29,11 +29,16 @@
> >  #include <linux/utsname.h>
> >  #include <net/devlink.h>
> >  #include <net/ipv6.h>
> > +#include <net/netdev_rx_queue.h>
> >  #include <net/xdp_sock_drv.h>
> >  #include <net/flow_offload.h>
> >  #include <net/netdev_lock.h>
> >  #include <linux/ethtool_netlink.h>
> >  #include "common.h"
> > +#include "../core/devmem.h"
> > +
> > +extern struct list_head devmem_sockets_list;
> > +extern spinlock_t devmem_sockets_lock;
> >
> >  /* State held across locks and calls for commands which have devlink fallback */
> >  struct ethtool_devlink_compat {
> > @@ -1169,6 +1174,142 @@ ethtool_get_rxfh_fields(struct net_device *dev, u32 cmd, void __user *useraddr)
> >       return ethtool_rxnfc_copy_to_user(useraddr, &info, info_size, NULL);
> >  }
> >
> > +static bool
> > +__ethtool_rx_flow_spec_breaks_devmem_sk(struct ethtool_rx_flow_spec *fs,
> > +                                     struct net_device *dev,
> > +                                     struct sock *sk)
> > +{
> > +     struct in6_addr saddr6, smask6, daddr6, dmask6;
> > +     struct sockaddr_storage saddr, daddr;
> > +     struct sockaddr_in6 *src6, *dst6;
> > +     struct sockaddr_in *src4, *dst4;
> > +     struct netdev_rx_queue *rxq;
> > +     __u32 flow_type;
> > +
> > +     if (dev != __sk_dst_get(sk)->dev)
> > +             return false;
> > +
> > +     src6 = (struct sockaddr_in6 *)&saddr;
> > +     dst6 = (struct sockaddr_in6 *)&daddr;
> > +     src4 = (struct sockaddr_in *)&saddr;
> > +     dst4 = (struct sockaddr_in *)&daddr;
> > +
> > +     if (sk->sk_family == AF_INET6) {
> > +             src6->sin6_port = inet_sk(sk)->inet_sport;
> > +             src6->sin6_addr = inet6_sk(sk)->saddr;
> > +             dst6->sin6_port = inet_sk(sk)->inet_dport;
> > +             dst6->sin6_addr = sk->sk_v6_daddr;
> > +     } else {
> > +             src4->sin_port = inet_sk(sk)->inet_sport;
> > +             src4->sin_addr.s_addr = inet_sk(sk)->inet_saddr;
> > +             dst4->sin_port = inet_sk(sk)->inet_dport;
> > +             dst4->sin_addr.s_addr = inet_sk(sk)->inet_daddr;
> > +     }
> > +
> > +     flow_type = fs->flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
> > +
> > +     rxq = __netif_get_rx_queue(dev, fs->ring_cookie);
> > +     if (!rxq)
> > +             return false;
> > +
> > +     /* If the requested binding and the sk binding are equal, then we know
> > +      * this rule can't redirect to a different binding.
> > +      */
> > +     if (rxq->mp_params.mp_priv == sk->sk_user_frags.binding)
> > +             return false;
> > +
> > +     /* Reject rules that redirect RX devmem sockets to a queue with a
> > +      * different dmabuf binding. Because these sockets are on the RX side
> > +      * (registered in the recvmsg() path), we compare the opposite
> > +      * endpoints: the socket source with the rule destination, and the
> > +      * socket destination with the rule source.
> > +      *
> > +      * Only the simplest rules are checked, that is, IP/TCP rules. Flow
> > +      * hash options are not verified, so they may still break TCP devmem
> > +      * flows in theory (VLAN tag, bytes 0 and 1 of the L4 header, etc.).
> > +      * The author of this function was simply not sure how to validate
> > +      * these at the time of the ethtool call.
> > +      */
> > +     switch (flow_type) {
> > +     case IPV4_USER_FLOW: {
> > +             const struct ethtool_usrip4_spec *v4_usr_spec, *v4_usr_m_spec;
> > +
> > +             v4_usr_spec = &fs->h_u.usr_ip4_spec;
> > +             v4_usr_m_spec = &fs->m_u.usr_ip4_spec;
> > +
> > +             if (((v4_usr_spec->ip4src ^ dst4->sin_addr.s_addr) & v4_usr_m_spec->ip4src) ||
> > +                 ((v4_usr_spec->ip4dst ^ src4->sin_addr.s_addr) & v4_usr_m_spec->ip4dst))
> > +                     return true;
> > +
> > +             return false;
> > +     }
> > +     case TCP_V4_FLOW: {
> > +             const struct ethtool_tcpip4_spec *v4_spec, *v4_m_spec;
> > +
> > +             v4_spec = &fs->h_u.tcp_ip4_spec;
> > +             v4_m_spec = &fs->m_u.tcp_ip4_spec;
> > +
> > +             if (((v4_spec->ip4src ^ dst4->sin_addr.s_addr) & v4_m_spec->ip4src) ||
> > +                 ((v4_spec->ip4dst ^ src4->sin_addr.s_addr) & v4_m_spec->ip4dst))
> > +                     return true;
> > +
>
> The ports need to be checked as well? But my preference overall would
> be to go back to checking this condition during recvmsg. We can pick
> some new obscure errno number to clearly explain to the user what
> happened. EPIPE or something similar, to mean that the socket is cooked.
> But let's see if Mina has a different opinion...

Sorry for the late reply.

IIUC it looks to me like AF_XDP set the precedent that the user can
break the socket if they mess with the flow steering rules, and I'm
guessing io_uring zc does something similar. Only devmem tries to have
the socket work regardless of which rxqueue the incoming packets land
on, but that was predicated on being able to do the tracking
efficiently, which no longer seems entirely to be the case.

I think I'm OK with dropping this patch. We should probably document
the new restriction on devmem sockets. In our prod code we don't
reprogram rules while the socket is running, so I don't think this
will break us; I don't know whether it will break anyone else, but it
seems unlikely.
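
For reference, the recvmsg-time check suggested upthread might look
roughly like this. This is only a sketch: the hook point and the
net_devmem_iov_binding() helper are assumptions on my part, not part
of the series.

	/* Reject frags coming from a different binding than the one this
	 * socket first consumed from; EPIPE marks the socket as cooked.
	 */
	static int sk_devmem_check_niov(struct sock *sk, struct net_iov *niov)
	{
		struct net_devmem_dmabuf_binding *binding;

		binding = net_devmem_iov_binding(niov);	/* assumed helper */
		if (!sk->sk_user_frags.binding)
			/* first devmem frag pins the binding */
			sk->sk_user_frags.binding = binding;
		else if (sk->sk_user_frags.binding != binding)
			/* flow was re-steered to another binding */
			return -EPIPE;
		return 0;
	}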

-- 
Thanks,
Mina

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next v2 2/3] net: devmem: use niov array for token management
  2025-09-12  5:28 ` [PATCH net-next v2 2/3] net: devmem: use niov array for token management Bobby Eshleman
@ 2025-09-17 23:55   ` Mina Almasry
  2025-09-18 14:19     ` Bobby Eshleman
  0 siblings, 1 reply; 11+ messages in thread
From: Mina Almasry @ 2025-09-17 23:55 UTC (permalink / raw)
  To: Bobby Eshleman
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, Neal Cardwell,
	David Ahern, netdev, linux-kernel, Stanislav Fomichev,
	Bobby Eshleman

On Thu, Sep 11, 2025 at 10:28 PM Bobby Eshleman <bobbyeshleman@gmail.com> wrote:
>
> From: Bobby Eshleman <bobbyeshleman@meta.com>
>
> Improve CPU performance of devmem token management by using page offsets
> as dmabuf tokens and using them for direct array access lookups instead
> of xarray lookups. Consequently, the xarray can be removed. The result
> is an average 5% reduction in CPU cycles spent by devmem RX user
> threads.
>
> This patch changes the meaning of tokens. Tokens previously referred to
> unique fragments of pages. In this patch, tokens instead represent
> references to pages, not fragments. Because of this, multiple tokens
> may refer to the same page and so may have identical values (e.g., two
> small fragments may coexist on the same page). The token and offset
> pair that the user receives uniquely identifies a fragment if needed.
> This assumes that the user is not attempting to sort / uniq the token
> list using tokens alone.
>
> A new restriction is added to the implementation: devmem RX sockets
> cannot switch dmabuf bindings. In practice, this is a symptom of invalid
> configuration as a flow would have to be steered to a different queue or
> device where there is a different binding, which is generally bad for
> TCP flows. This restriction is necessary because the 32-bit dmabuf token
> does not have enough bits to represent both the pages in a large dmabuf
> and also a binding or dmabuf ID. For example, a system with 8 NICs and
> 32 queues requires 8 bits for a binding / queue ID (8 NICs * 32 queues
> == 256 queues total == 2^8), which leaves only 24 bits for dmabuf pages
> (2^24 * 4096 / (1<<30) == 64GB). This is insufficient for the device and
> queue numbers on many current systems or systems that may need larger
> GPU dmabufs (as for hard limits, my current H100 has 80GB GPU memory per
> device).
>
> Using kperf[1] with 4 flows and workers, this patch improves receive
> worker CPU util by ~4.9% with slightly better throughput.
>
> Before, mean cpu util for rx workers ~83.6%:
>
> Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> Average:       4    2.30    0.00   79.43    0.00    0.65    0.21    0.00    0.00    0.00   17.41
> Average:       5    2.27    0.00   80.40    0.00    0.45    0.21    0.00    0.00    0.00   16.67
> Average:       6    2.28    0.00   80.47    0.00    0.46    0.25    0.00    0.00    0.00   16.54
> Average:       7    2.42    0.00   82.05    0.00    0.46    0.21    0.00    0.00    0.00   14.86
>
> After, mean cpu util % for rx workers ~78.7%:
>
> Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
> Average:       4    2.61    0.00   73.31    0.00    0.76    0.11    0.00    0.00    0.00   23.20
> Average:       5    2.95    0.00   74.24    0.00    0.66    0.22    0.00    0.00    0.00   21.94
> Average:       6    2.81    0.00   73.38    0.00    0.97    0.11    0.00    0.00    0.00   22.73
> Average:       7    3.05    0.00   78.76    0.00    0.76    0.11    0.00    0.00    0.00   17.32
>
> Mean throughput improves, but falls within a standard deviation (~45GB/s
> for 4 flows on a 50GB/s NIC, one hop).
>
> This patch adds an array of atomics for counting the tokens returned to
> the user for a given page. There is a 4-byte atomic per page in the
> dmabuf per socket. Given a 2GB dmabuf, this array is 2MB.
>

I think this may be an issue. A typical devmem application doing real
work will probably use a dmabuf around this size and will have
thousands of connections. For algorithms like all-to-all, I believe
every node needs a number of connections to each other node, and it's
common to see 10K devmem connections while a training job is running.

Requiring (2MB * 10K) = 20GB of extra memory just for this
book-keeping is a bit hard to swallow. Do you know the existing memory
footprint of the xarrays? Were they large anyway (in which case we're
not actually adding more memory), or is the 2MB entirely new?

If it's entirely new, I think we may need to resolve that somehow. One
option is to implement a resizable array... IDK if that would be more
efficient, especially since we would need to lock it in
tcp_recvmsg_dmabuf and in the setsockopt.
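
A rough sketch of what the resizable option could look like, assuming
the caller serializes growth against tcp_recvmsg_dmabuf and the
setsockopt, and with a hypothetical urefs_len field:

	static int sk_urefs_ensure(struct sock *sk, u32 idx)
	{
		u32 len = sk->sk_user_frags.urefs_len;	/* hypothetical field */
		atomic_t *new;
		u32 cap;

		if (idx < len)
			return 0;

		cap = max_t(u32, roundup_pow_of_two(idx + 1), 64);
		new = krealloc_array(sk->sk_user_frags.urefs, cap,
				     sizeof(*new), GFP_KERNEL);
		if (!new)
			return -ENOMEM;
		/* krealloc_array() does not zero the grown tail */
		memset(new + len, 0, (cap - len) * sizeof(*new));
		sk->sk_user_frags.urefs = new;
		sk->sk_user_frags.urefs_len = cap;
		return 0;
	}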

Another option is to track the userrefs per-binding, not per-socket.
If we do that, we can't free user refs that the user leaves behind
when they close the socket (or crash). We can only clear refs on
dmabuf unbind, so we have to trust the user to do the right thing. I'm
finding it hard to verify that our current userspace is careful about
not leaving refs behind; we'd have to run thorough tests against your
series.
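
And a sketch of the per-binding variant, with a hypothetical
binding->urefs table; refs left behind by closed or crashed sockets
can only be swept when the dmabuf is unbound:

	/* One counter per niov, shared by all sockets on the binding;
	 * stale user refs are reclaimed only at unbind time.
	 */
	static void net_devmem_sweep_urefs(struct net_devmem_dmabuf_binding *binding)
	{
		size_t npages = binding->dmabuf->size / PAGE_SIZE;
		size_t i;

		for (i = 0; i < npages; i++) {
			int left = atomic_xchg(&binding->urefs[i], 0);

			while (left-- > 0)
				net_devmem_dmabuf_binding_put(binding);
		}
	}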

--
Thanks,
Mina

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net-next v2 2/3] net: devmem: use niov array for token management
  2025-09-17 23:55   ` Mina Almasry
@ 2025-09-18 14:19     ` Bobby Eshleman
  0 siblings, 0 replies; 11+ messages in thread
From: Bobby Eshleman @ 2025-09-18 14:19 UTC (permalink / raw)
  To: Mina Almasry
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Kuniyuki Iwashima, Willem de Bruijn, Neal Cardwell,
	David Ahern, netdev, linux-kernel, Stanislav Fomichev,
	Bobby Eshleman

On Wed, Sep 17, 2025 at 04:55:34PM -0700, Mina Almasry wrote:
> On Thu, Sep 11, 2025 at 10:28 PM Bobby Eshleman <bobbyeshleman@gmail.com> wrote:
> >
> > From: Bobby Eshleman <bobbyeshleman@meta.com>
> >
> > [... commit message trimmed; quoted in full upthread ...]
> >
> > This patch adds an array of atomics for counting the tokens returned to
> > the user for a given page. There is a 4-byte atomic per page in the
> > dmabuf per socket. Given a 2GB dmabuf, this array is 2MB.
> >
> 
> I think this may be an issue. A typical devmem application doing real
> work will probably use a dmabuf around this size and will have
> thousands of connections. For algorithms like all-to-all, I believe
> every node needs a number of connections to each other node, and it's
> common to see 10K devmem connections while a training job is running.
> 
> Requiring (2MB * 10K) = 20GB of extra memory just for this
> book-keeping is a bit hard to swallow. Do you know the existing memory
> footprint of the xarrays? Were they large anyway (in which case we're
> not actually adding more memory), or is the 2MB entirely new?
> 
> If it's entirely new, I think we may need to resolve that somehow. One
> option is to implement a resizable array... IDK if that would be more
> efficient, especially since we would need to lock it in
> tcp_recvmsg_dmabuf and in the setsockopt.
> 

I can measure the xarray's footprint on some workloads and see. My
guess is it'll be quite a bit smaller than the aggregate of the
per-socket arrays.

> Another option is to track the userrefs per-binding, not per-socket.
> If we do that, we can't free user refs that the user leaves behind
> when they close the socket (or crash). We can only clear refs on
> dmabuf unbind, so we have to trust the user to do the right thing. I'm
> finding it hard to verify that our current userspace is careful about
> not leaving refs behind; we'd have to run thorough tests against your
> series.
> 

I can give this a try and test it on our end too; this would work for us.

Thanks!

Best,
Bobby

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [syzbot ci] Re: net: devmem: improve cpu cost of RX token management
  2025-09-26 16:31 [PATCH net-next v4 0/2] " Bobby Eshleman
@ 2025-09-27  6:00 ` syzbot ci
  0 siblings, 0 replies; 11+ messages in thread
From: syzbot ci @ 2025-09-27  6:00 UTC (permalink / raw)
  To: almasrymina, bobbyeshleman, bobbyeshleman, davem, dsahern,
	edumazet, horms, kuba, kuniyu, linux-kernel, ncardwell, netdev,
	pabeni, sdf, willemb
  Cc: syzbot, syzkaller-bugs

syzbot ci has tested the following series

[v4] net: devmem: improve cpu cost of RX token management
https://lore.kernel.org/all/20250926-scratch-bobbyeshleman-devmem-tcp-token-upstream-v4-0-39156563c3ea@meta.com
* [PATCH net-next v4 1/2] net: devmem: rename tx_vec to vec in dmabuf binding
* [PATCH net-next v4 2/2] net: devmem: use niov array for token management

and found the following issue:
general protection fault in sock_devmem_dontneed

Full report is available here:
https://ci.syzbot.org/series/b8209bd4-e9f0-4c54-bad3-613e8431151b

***

general protection fault in sock_devmem_dontneed

tree:      net-next
URL:       https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next.git
base:      dc1dea796b197aba2c3cae25bfef45f4b3ad46fe
arch:      amd64
compiler:  Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
config:    https://ci.syzbot.org/builds/b4d90fd9-9fbe-4e17-8fc0-3d6603df09da/config
C repro:   https://ci.syzbot.org/findings/ce81b3c3-3db8-4643-9731-cbe331c65fdb/c_repro
syz repro: https://ci.syzbot.org/findings/ce81b3c3-3db8-4643-9731-cbe331c65fdb/syz_repro

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 UID: 0 PID: 5996 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) 
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:sock_devmem_dontneed+0x372/0x920 net/core/sock.c:1113
Code: 48 44 8b 20 44 89 74 24 54 45 01 f4 48 8b 44 24 60 42 80 3c 28 00 74 08 48 89 df e8 e8 5a c9 f8 4c 8b 33 4c 89 f0 48 c1 e8 03 <42> 80 3c 28 00 74 08 4c 89 f7 e8 cf 5a c9 f8 4d 8b 3e 4c 89 f8 48
RSP: 0018:ffffc90002a1fac0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810a8ab710 RCX: 1ffff11023002f45
RDX: ffff88801b339cc0 RSI: 0000000000002000 RDI: 0000000000000000
RBP: ffffc90002a1fc50 R08: ffffc90002a1fbdf R09: 0000000000000000
R10: ffffc90002a1fb60 R11: fffff52000543f7c R12: 0000000000f07000
R13: dffffc0000000000 R14: 0000000000000000 R15: ffff88810a8ab710
FS:  000055556f85a500(0000) GS:ffff8881a3c3d000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00002000000a2000 CR3: 0000000024516000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 sk_setsockopt+0x682/0x2dc0 net/core/sock.c:1304
 do_sock_setsockopt+0x11b/0x1b0 net/socket.c:2340
 __sys_setsockopt net/socket.c:2369 [inline]
 __do_sys_setsockopt net/socket.c:2375 [inline]
 __se_sys_setsockopt net/socket.c:2372 [inline]
 __x64_sys_setsockopt+0x13f/0x1b0 net/socket.c:2372
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fea0438ec29
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd04a8f368 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
RAX: ffffffffffffffda RBX: 00007fea045d5fa0 RCX: 00007fea0438ec29
RDX: 0000000000000050 RSI: 0000000000000001 RDI: 0000000000000003
RBP: 00007fea04411e41 R08: 0000000000000010 R09: 0000000000000000
R10: 00002000000a2000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fea045d5fa0 R14: 00007fea045d5fa0 R15: 0000000000000005
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:sock_devmem_dontneed+0x372/0x920 net/core/sock.c:1113
Code: 48 44 8b 20 44 89 74 24 54 45 01 f4 48 8b 44 24 60 42 80 3c 28 00 74 08 48 89 df e8 e8 5a c9 f8 4c 8b 33 4c 89 f0 48 c1 e8 03 <42> 80 3c 28 00 74 08 4c 89 f7 e8 cf 5a c9 f8 4d 8b 3e 4c 89 f8 48
RSP: 0018:ffffc90002a1fac0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88810a8ab710 RCX: 1ffff11023002f45
RDX: ffff88801b339cc0 RSI: 0000000000002000 RDI: 0000000000000000
RBP: ffffc90002a1fc50 R08: ffffc90002a1fbdf R09: 0000000000000000
R10: ffffc90002a1fb60 R11: fffff52000543f7c R12: 0000000000f07000
R13: dffffc0000000000 R14: 0000000000000000 R15: ffff88810a8ab710
FS:  000055556f85a500(0000) GS:ffff8881a3c3d000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00002000000a2000 CR3: 0000000024516000 CR4: 00000000000006f0
----------------
Code disassembly (best guess):
   0:	48                   	rex.W
   1:	44 8b 20             	mov    (%rax),%r12d
   4:	44 89 74 24 54       	mov    %r14d,0x54(%rsp)
   9:	45 01 f4             	add    %r14d,%r12d
   c:	48 8b 44 24 60       	mov    0x60(%rsp),%rax
  11:	42 80 3c 28 00       	cmpb   $0x0,(%rax,%r13,1)
  16:	74 08                	je     0x20
  18:	48 89 df             	mov    %rbx,%rdi
  1b:	e8 e8 5a c9 f8       	call   0xf8c95b08
  20:	4c 8b 33             	mov    (%rbx),%r14
  23:	4c 89 f0             	mov    %r14,%rax
  26:	48 c1 e8 03          	shr    $0x3,%rax
* 2a:	42 80 3c 28 00       	cmpb   $0x0,(%rax,%r13,1) <-- trapping instruction
  2f:	74 08                	je     0x39
  31:	4c 89 f7             	mov    %r14,%rdi
  34:	e8 cf 5a c9 f8       	call   0xf8c95b08
  39:	4d 8b 3e             	mov    (%r14),%r15
  3c:	4c 89 f8             	mov    %r15,%rax
  3f:	48                   	rex.W


***

If these findings have caused you to resend the series or submit a
separate fix, please add the following tag to your commit message:
  Tested-by: syzbot@syzkaller.appspotmail.com

---
This report is generated by a bot. It may contain errors.
syzbot ci engineers can be reached at syzkaller@googlegroups.com.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-09-27  6:00 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-09-12  5:28 [PATCH net-next v2 0/3] net: devmem: improve cpu cost of RX token management Bobby Eshleman
2025-09-12  5:28 ` [PATCH net-next v2 1/3] net: devmem: rename tx_vec to vec in dmabuf binding Bobby Eshleman
2025-09-12  5:28 ` [PATCH net-next v2 2/3] net: devmem: use niov array for token management Bobby Eshleman
2025-09-17 23:55   ` Mina Almasry
2025-09-18 14:19     ` Bobby Eshleman
2025-09-12  5:28 ` [PATCH net-next v2 3/3] net: ethtool: prevent user from breaking devmem single-binding rule Bobby Eshleman
2025-09-12 22:23   ` Stanislav Fomichev
2025-09-17 23:07     ` Mina Almasry
2025-09-12  9:40 ` [syzbot ci] Re: net: devmem: improve cpu cost of RX token management syzbot ci
  -- strict thread matches above, loose matches on Subject: below --
2025-09-26 16:31 [PATCH net-next v4 0/2] " Bobby Eshleman
2025-09-27  6:00 ` [syzbot ci] " syzbot ci
2025-09-02 21:36 [PATCH net-next 0/2] " Bobby Eshleman
2025-09-03 17:46 ` [syzbot ci] " syzbot ci
