From: Joel Granados <joel.granados@kernel.org>
To: "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Joel Granados <joel.granados@kernel.org>
Subject: [PATCH RFC] net: enforce net sysctl registration
Date: Mon, 29 Jun 2026 14:22:37 +0200 [thread overview]
Message-ID: <20260629-jag-net_const_qualify-v1-1-ee98b8fc400c@kernel.org> (raw)
Replace the warning and file permission change with an error when an
"unsafe" net sysctl registration is detected.
One of the barriers preventing the const qualification of the ctl_tables
in the net directory is the permission (->mode) change in
ensure_safe_net_sysctl. This prep commit removes that barrier allowing
the const qualification of net ctl_tables.
Signed-off-by: Joel Granados <joel.granados@kernel.org>
---
What?
=====
Replace warning and file permission change with an error (reject
registration) when an "unsafe" net sysctl registration is detected.
Why?
====
The main motivation for this is to continue with the const qualification
of the ctl_table arrays [1]. The permission change inside
ensure_safe_net_sysctl disallows cons qualifiaction as it basically
modifies the entries before running the sysctl registration.
ent->mode &= ~0222;
Analysis
========
* I believe that there is currently now way that the permission change
gets executed [2]
* I found one case where the warning message was posted to lore
(vsock_sysctl_register) [3], but it made its to mainline as part of
the second case in [2].
* We should error anyway because writing to the global sysctl value
through a child netns is indicative of a bug [4].
RFC
===
I'm sending it out as an RFC as I would like to discuss the change to
ensure_safe_net_sysctl in isolation, but my idea is to send out a series
that actually const qualifies the clt_tables in the net directory. I
would be very thankful if you point me to anything that I have missed in
my analysis that shows that this cannot/shouldn't be done.
Best
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl.git/commit/?h=constfy-sysctl-6.14-rc1&id=1751f872cc97f992ed5c4c72c55588db1f0021e1
[2]
I have identified 4 contexts relevant to the ensure_safe_net_sysctl call
inside the network sysctl registration.
1. When the (struct net) == &init_net (like in iw_cm_init): In this case
ensure_safe_net_sysctl is not executed and permission modification
never happens.
2. When the ctl_table data (->data) gets "manually" assigned to
something other init_net (like in vsock_sysctl_register): In this
case ensure_safe_net_sysctl *is* executed but the data that is passed
is neither a module address (!is_module_address) nor a kernel core
address (!is_kernel_core_data); so the permission modification never
happens.
3. When the permissions are explicitly changed on a kmemdup'ed ctl_table
array (like in sysctl_core_net_init): in this case
ensure_safe_net_sysctl *is* executed but the permission modification
never happens as the mode is not writable.
4. When ctl have custom proc_handlers (like in nf_lwtunnel_net_init): In
this case ->data is NULL so it is not a module address
(!is_module_address) nor a kernel core address
(!is_kernel_core_data), so permission modification never happens.
It seems like there is no way of executing the permission change in
ensure_safe_net_sysctl. Please correct me if this is inacurate and help
me find the case that I missed.
[3]
https://lore.kernel.org/all/20260302194926.90378-1-graf@amazon.com/
[4]
The ensure_safe_net_sysctl function was introduced in Commit:
31c4d2f160eb7b17cbead24dc6efed06505a3fee ("net: Ensure net namespace
isolation of sysctls") which states that it is trying to prevent a
leak (indicative of a bug).
---
net/sysctl_net.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index 19e8048241bacb18de853d3b904d0f97fd2fe78a..c1630a266f8436d8962ceb87dc629964b2d71260 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -114,16 +114,16 @@ __init int net_sysctl_init(void)
goto out;
}
-/* Verify that sysctls for non-init netns are safe by either:
+/* Return error when sysctls for non-init netns are unsafe by verifying:
* 1) being read-only, or
* 2) having a data pointer which points outside of the global kernel/module
* data segment, and rather into the heap where a per-net object was
* allocated.
*/
-static void ensure_safe_net_sysctl(struct net *net, const char *path,
- struct ctl_table *table, size_t table_size)
+static int ensure_safe_net_sysctl(struct net *net, const char *path,
+ const struct ctl_table *table, size_t table_size)
{
- struct ctl_table *ent;
+ const struct ctl_table *ent;
pr_debug("Registering net sysctl (net %p): %s\n", net, path);
ent = table;
@@ -149,15 +149,14 @@ static void ensure_safe_net_sysctl(struct net *net, const char *path,
else
continue;
- /* If it is writable and points to kernel/module global
- * data, then it's probably a netns leak.
- */
+ /* Warn on netns leak. */
WARN(1, "sysctl %s/%s: data points to %s global data: %ps\n",
- path, ent->procname, where, ent->data);
+ path, ent->procname, where, ent->data);
- /* Make it "safe" by dropping writable perms */
- ent->mode &= ~0222;
+ return -EACCES;
}
+
+ return 0;
}
struct ctl_table_header *register_net_sysctl_sz(struct net *net,
@@ -166,7 +165,8 @@ struct ctl_table_header *register_net_sysctl_sz(struct net *net,
size_t table_size)
{
if (!net_eq(net, &init_net))
- ensure_safe_net_sysctl(net, path, table, table_size);
+ if (ensure_safe_net_sysctl(net, path, table, table_size))
+ return NULL;
return __register_sysctl_table(&net->sysctls, path, table, table_size);
}
---
base-commit: 8cd9520d35a6c38db6567e97dd93b1f11f185dc6
change-id: 20260629-jag-net_const_qualify-f4e09759dac7
Best regards,
--
Joel Granados <joel.granados@kernel.org>
reply other threads:[~2026-06-29 12:22 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260629-jag-net_const_qualify-v1-1-ee98b8fc400c@kernel.org \
--to=joel.granados@kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox