From: Bobby Eshleman <bobbyeshleman@gmail.com>
To: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
	Kuniyuki Iwashima <kuniyu@google.com>,
	Willem de Bruijn <willemb@google.com>,
	Neal Cardwell <ncardwell@google.com>,
	David Ahern <dsahern@kernel.org>,
	Mina Almasry <almasrymina@google.com>
Cc: Stanislav Fomichev <sdf@fomichev.me>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	Bobby Eshleman <bobbyeshleman@meta.com>
Subject: [PATCH net-next v5 4/4] net: add per-netns sysctl for devmem autorelease
Date: Thu, 23 Oct 2025 13:58:23 -0700
Message-ID: <20251023-scratch-bobbyeshleman-devmem-tcp-token-upstream-v5-4-47cb85f5259e@meta.com>
In-Reply-To: <20251023-scratch-bobbyeshleman-devmem-tcp-token-upstream-v5-0-47cb85f5259e@meta.com>

From: Bobby Eshleman <bobbyeshleman@meta.com>

Add a new per-namespace sysctl to control the autorelease
behavior of devmem dmabuf bindings. The sysctl is found at:

  /proc/sys/net/core/devmem_autorelease

When a binding is created, it inherits the autorelease setting from the
network namespace of the device to which it's being bound.

If autorelease is enabled (1):
- Tokens are stored in the socket's xarray
- Tokens are automatically released when the socket is closed

If autorelease is disabled (0):
- Tokens are tracked via a uref counter in each net_iov
- User must manually release tokens via SO_DEVMEM_DONTNEED
- Lingering tokens are released when the dmabuf is unbound
- This is the new default behavior for better performance

This allows application developers to choose between automatic cleanup
(easier, backwards compatible) and manual control (more explicit token
management, but more performant).

Change the default to autorelease=0, so that users gain the performance
benefit by default.

Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
---
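Not part of the patch: a minimal userspace sketch of reading and setting
the sysctl described above through procfs. The path is the one given in
the commit message; the helper names devmem_autorelease_get() and
devmem_autorelease_set() are hypothetical, and the path is passed as a
parameter so the helpers can also be pointed at an ordinary file. Writing
the real sysctl will presumably require sufficient privileges, given the
0644 mode in the table entry.

```c
#include <stdio.h>

/* Path from the commit message; the helpers below are illustrative only. */
#define DEVMEM_AUTORELEASE_PATH "/proc/sys/net/core/devmem_autorelease"

/* Return the setting (0 or 1) stored at path, or -1 if it cannot be read. */
static int devmem_autorelease_get(const char *path)
{
	FILE *f = fopen(path, "r");
	int val = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &val) != 1)
		val = -1;
	fclose(f);

	/* The table entry bounds the value to [0, 1]; reject anything else. */
	return (val == 0 || val == 1) ? val : -1;
}

/* Write 0 or 1 to path. Return 0 on success, -1 on failure. */
static int devmem_autorelease_set(const char *path, int enable)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", enable ? 1 : 0) > 0;
	/* A rejected procfs write may only surface when the stream is flushed. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller would pass DEVMEM_AUTORELEASE_PATH from within the network
namespace it wants to inspect or change.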
include/net/netns/core.h | 1 +
net/core/devmem.c | 2 +-
net/core/net_namespace.c | 1 +
net/core/sysctl_net_core.c | 9 +++++++++
4 files changed, 12 insertions(+), 1 deletion(-)
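Not part of the patch: the devmem.c hunk copies the namespace setting into
the binding with a single assignment at bind time. Judging from this patch
alone, that suggests a later write to the sysctl affects only bindings
created afterwards. The userspace model below illustrates that reading; the
model_netns and model_binding types and model_bind() are hypothetical
stand-ins, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the per-netns sysctl storage (hypothetical). */
struct model_netns {
	uint8_t sysctl_devmem_autorelease;
};

/* Simplified stand-in for a dmabuf binding (hypothetical). */
struct model_binding {
	bool autorelease;
};

/*
 * Mirrors the assignment in net_devmem_bind_dmabuf(): the namespace value
 * is copied once, when the binding is created.
 */
static void model_bind(struct model_binding *b, const struct model_netns *ns)
{
	b->autorelease = ns->sysctl_devmem_autorelease;
}
```

In this model, a binding created while the value is 0 keeps
autorelease == false even if the namespace value is later set to 1.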
diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 9ef3d70e5e9c..7af5ab0d757b 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -18,6 +18,7 @@ struct netns_core {
 	u8 sysctl_txrehash;
 	u8 sysctl_tstamp_allow_data;
 	u8 sysctl_bypass_prot_mem;
+	u8 sysctl_devmem_autorelease;
 
 #ifdef CONFIG_PROC_FS
 	struct prot_inuse __percpu *prot_inuse;
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 8f3199fe0f7b..9cd6d93676f9 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -331,7 +331,7 @@ net_devmem_bind_dmabuf(struct net_device *dev,
 		goto err_free_chunks;
 
 	list_add(&binding->list, &priv->bindings);
-	binding->autorelease = true;
+	binding->autorelease = dev_net(dev)->core.sysctl_devmem_autorelease;
 
 	return binding;
 
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index adcfef55a66f..890826b113d6 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -396,6 +396,7 @@ static __net_init void preinit_net_sysctl(struct net *net)
 	net->core.sysctl_txrehash = SOCK_TXREHASH_ENABLED;
 	net->core.sysctl_tstamp_allow_data = 1;
 	net->core.sysctl_txq_reselection = msecs_to_jiffies(1000);
+	net->core.sysctl_devmem_autorelease = 0;
 }
 
 /* init code that must occur even if setup_net() is not called. */
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 8d4decb2606f..375ec395227e 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -692,6 +692,15 @@ static struct ctl_table netns_core_table[] = {
 		.extra1 = SYSCTL_ZERO,
 		.extra2 = SYSCTL_ONE
 	},
+	{
+		.procname = "devmem_autorelease",
+		.data = &init_net.core.sysctl_devmem_autorelease,
+		.maxlen = sizeof(u8),
+		.mode = 0644,
+		.proc_handler = proc_dou8vec_minmax,
+		.extra1 = SYSCTL_ZERO,
+		.extra2 = SYSCTL_ONE
+	},
 	/* sysctl_core_net_init() will set the values after this
 	 * to readonly in network namespaces
 	 */
--
2.47.3
Thread overview: 16+ messages
2025-10-23 20:58 [PATCH net-next v5 0/4] net: devmem: improve cpu cost of RX token management Bobby Eshleman
2025-10-23 20:58 ` [PATCH net-next v5 1/4] net: devmem: rename tx_vec to vec in dmabuf binding Bobby Eshleman
2025-10-28 0:33 ` Mina Almasry
2025-10-23 20:58 ` [PATCH net-next v5 2/4] net: devmem: refactor sock_devmem_dontneed for autorelease split Bobby Eshleman
2025-10-28 0:36 ` Mina Almasry
2025-10-28 18:33 ` Bobby Eshleman
2025-10-23 20:58 ` [PATCH net-next v5 3/4] net: devmem: use niov array for token management Bobby Eshleman
2025-10-28 1:20 ` Mina Almasry
2025-10-28 20:49 ` Bobby Eshleman
2025-10-29 2:04 ` Mina Almasry
2025-10-29 14:46 ` Bobby Eshleman
2025-10-23 20:58 ` Bobby Eshleman [this message]
2025-10-28 1:22 ` [PATCH net-next v5 4/4] net: add per-netns sysctl for devmem autorelease Mina Almasry
2025-10-28 21:14 ` Bobby Eshleman
2025-10-29 2:09 ` Mina Almasry
2025-10-29 15:00 ` Bobby Eshleman