From: Bobby Eshleman <bobbyeshleman@gmail.com>
To: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
Kuniyuki Iwashima <kuniyu@google.com>,
Willem de Bruijn <willemb@google.com>,
Neal Cardwell <ncardwell@google.com>,
David Ahern <dsahern@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
Jonathan Corbet <corbet@lwn.net>,
Andrew Lunn <andrew+netdev@lunn.ch>,
Shuah Khan <shuah@kernel.org>,
Mina Almasry <almasrymina@google.com>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-arch@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kselftest@vger.kernel.org,
Stanislav Fomichev <sdf@fomichev.me>,
Bobby Eshleman <bobbyeshleman@meta.com>
Subject: [PATCH net-next v6 4/6] net: devmem: add SO_DEVMEM_AUTORELEASE for autorelease control
Date: Tue, 04 Nov 2025 17:23:23 -0800
Message-ID: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-4-ea98cf4d40b3@meta.com>
In-Reply-To: <20251104-scratch-bobbyeshleman-devmem-tcp-token-upstream-v6-0-ea98cf4d40b3@meta.com>
From: Bobby Eshleman <bobbyeshleman@meta.com>
Add SO_DEVMEM_AUTORELEASE socket option to allow applications to
control token release behavior on a per-socket basis.

The socket option accepts boolean values (0 or 1):

- 1 (true): outstanding tokens are automatically released when the
  socket closes
- 0 (false): outstanding tokens are released when the dmabuf is unbound

The option can only be changed when the socket has no outstanding
tokens, enforced by checking:

1. The frags xarray is empty (no tokens in autorelease mode)
2. The outstanding_urefs counter is zero (no tokens in manual mode)

This restriction prevents inconsistent token tracking state between
acquisition and release calls. If either condition fails, setsockopt
returns -EBUSY.

The default state is autorelease off.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
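Illustration only, not part of the patch: a minimal userspace sketch of how
an application might toggle and query the option. The helper names
devmem_set_autorelease() and devmem_get_autorelease() are hypothetical. The
fallback value 85 is copied from the asm-generic uapi hunk in this patch and
is an assumption for any header set that does not yet carry the define.

```c
#include <errno.h>
#include <sys/socket.h>

/* Assumption: value taken from this patch's asm-generic uapi hunk;
 * released headers may not define it yet.
 */
#ifndef SO_DEVMEM_AUTORELEASE
#define SO_DEVMEM_AUTORELEASE 85
#endif

/* Hypothetical helper: enable (1) or disable (0) autorelease on fd.
 * Returns 0 on success or a negative errno. Per the patch, expect
 * -EBUSY while tokens are outstanding and -EBADF for non-TCP sockets.
 */
static int devmem_set_autorelease(int fd, int enable)
{
	int val = enable ? 1 : 0;

	if (setsockopt(fd, SOL_SOCKET, SO_DEVMEM_AUTORELEASE,
		       &val, sizeof(val)) < 0)
		return -errno;
	return 0;
}

/* Hypothetical helper: read back the current mode.
 * Returns the option value (0 or 1) or a negative errno.
 */
static int devmem_get_autorelease(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, SOL_SOCKET, SO_DEVMEM_AUTORELEASE,
		       &val, &len) < 0)
		return -errno;
	return val;
}
```

Because the option can only change while no tokens are outstanding, the
setter is presumably best called right after the socket is created, before
the first devmem receive on it.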
include/uapi/asm-generic/socket.h | 2 ++
net/core/sock.c | 51 +++++++++++++++++++++++++++++++++
net/ipv4/tcp.c | 2 +-
tools/include/uapi/asm-generic/socket.h | 2 ++
4 files changed, 56 insertions(+), 1 deletion(-)
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 53b5a8c002b1..59302318bb34 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -150,6 +150,8 @@
#define SO_INQ 84
#define SCM_INQ SO_INQ
+#define SO_DEVMEM_AUTORELEASE 85
+
#if !defined(__KERNEL__)
#if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
diff --git a/net/core/sock.c b/net/core/sock.c
index 465645c1d74f..27af476f3cd3 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1160,6 +1160,46 @@ sock_devmem_dontneed_autorelease(struct sock *sk, struct dmabuf_token *tokens,
return ret;
}
+static noinline_for_stack int
+sock_devmem_set_autorelease(struct sock *sk, sockptr_t optval, unsigned int optlen)
+{
+ int val;
+
+ if (!sk_is_tcp(sk))
+ return -EBADF;
+
+ if (optlen < sizeof(int))
+ return -EINVAL;
+
+ if (copy_from_sockptr(&val, optval, sizeof(val)))
+ return -EFAULT;
+
+ /* Validate that val is 0 or 1 */
+ if (val != 0 && val != 1)
+ return -EINVAL;
+
+ sockopt_lock_sock(sk);
+
+ /* Can only change autorelease if:
+ * 1. No tokens in the frags xarray (autorelease mode)
+ * 2. No outstanding urefs (manual release mode)
+ */
+ if (!xa_empty(&sk->sk_devmem_info.frags)) {
+ sockopt_release_sock(sk);
+ return -EBUSY;
+ }
+
+ if (atomic_read(&sk->sk_devmem_info.outstanding_urefs) > 0) {
+ sockopt_release_sock(sk);
+ return -EBUSY;
+ }
+
+ sk->sk_devmem_info.autorelease = !!val;
+
+ sockopt_release_sock(sk);
+ return 0;
+}
+
static noinline_for_stack int
sock_devmem_dontneed(struct sock *sk, sockptr_t optval, unsigned int optlen)
{
@@ -1351,6 +1391,9 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
#ifdef CONFIG_PAGE_POOL
case SO_DEVMEM_DONTNEED:
return sock_devmem_dontneed(sk, optval, optlen);
+
+ case SO_DEVMEM_AUTORELEASE:
+ return sock_devmem_set_autorelease(sk, optval, optlen);
#endif
case SO_SNDTIMEO_OLD:
case SO_SNDTIMEO_NEW:
@@ -2208,6 +2251,14 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
v.val = READ_ONCE(sk->sk_txrehash);
break;
+#ifdef CONFIG_PAGE_POOL
+ case SO_DEVMEM_AUTORELEASE:
+ if (!sk_is_tcp(sk))
+ return -EBADF;
+ v.val = sk->sk_devmem_info.autorelease;
+ break;
+#endif
+
default:
/* We implement the SO_SNDLOWAT etc to not be settable
* (1003.1g 7).
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 052875c1b547..8226ba892b36 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -496,7 +496,7 @@ void tcp_init_sock(struct sock *sk)
xa_init_flags(&sk->sk_devmem_info.frags, XA_FLAGS_ALLOC1);
sk->sk_devmem_info.binding = NULL;
atomic_set(&sk->sk_devmem_info.outstanding_urefs, 0);
- sk->sk_devmem_info.autorelease = true;
+ sk->sk_devmem_info.autorelease = false;
}
EXPORT_IPV6_MOD(tcp_init_sock);
diff --git a/tools/include/uapi/asm-generic/socket.h b/tools/include/uapi/asm-generic/socket.h
index f333a0ac4ee4..9710a3d7cc4d 100644
--- a/tools/include/uapi/asm-generic/socket.h
+++ b/tools/include/uapi/asm-generic/socket.h
@@ -147,6 +147,8 @@
#define SO_PASSRIGHTS 83
+#define SO_DEVMEM_AUTORELEASE 85
+
#if !defined(__KERNEL__)
#if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
--
2.47.3
Thread overview: 18+ messages
2025-11-05 1:23 [PATCH net-next v6 0/6] net: devmem: improve cpu cost of RX token management Bobby Eshleman
2025-11-05 1:23 ` [PATCH net-next v6 1/6] net: devmem: rename tx_vec to vec in dmabuf binding Bobby Eshleman
2025-11-05 1:23 ` [PATCH net-next v6 2/6] net: devmem: refactor sock_devmem_dontneed for autorelease split Bobby Eshleman
2025-11-05 1:23 ` [PATCH net-next v6 3/6] net: devmem: prepare for autorelease rx token management Bobby Eshleman
2025-11-05 16:02 ` kernel test robot
2025-11-05 20:55 ` kernel test robot
2025-11-06 15:11 ` Dan Carpenter
2025-11-06 15:14 ` Dan Carpenter
2025-11-05 1:23 ` Bobby Eshleman [this message]
2025-11-05 17:16 ` [PATCH net-next v6 4/6] net: devmem: add SO_DEVMEM_AUTORELEASE for autorelease control kernel test robot
2025-11-05 1:23 ` [PATCH net-next v6 5/6] net: devmem: document SO_DEVMEM_AUTORELEASE socket option Bobby Eshleman
2025-11-05 17:34 ` Stanislav Fomichev
2025-11-05 17:44 ` Mina Almasry
2025-11-05 19:31 ` Stanislav Fomichev
2025-11-05 23:17 ` Stanislav Fomichev
2025-11-07 2:22 ` Bobby Eshleman
2025-11-05 17:59 ` Bobby Eshleman
2025-11-05 1:23 ` [PATCH net-next v6 6/6] net: devmem: add tests for " Bobby Eshleman