From: Simon Horman <horms@kernel.org>
To: maoyixie.tju@gmail.com
Cc: 'Simon Horman' <horms@kernel.org>,
achender@kernel.org, davem@davemloft.net, edumazet@google.com,
kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org,
linux-rdma@vger.kernel.org, rds-devel@oss.oracle.com,
linux-kernel@vger.kernel.org, maoyi.xie@ntu.edu.sg,
praveen.kakkolangara@aumovio.com
Subject: Re: [PATCH net v2] rds: filter RDS_INFO_* getsockopt by caller's netns
Date: Sun, 10 May 2026 15:54:27 +0100 [thread overview]
Message-ID: <20260510145425.1372018-3-horms@kernel.org> (raw)
In-Reply-To: <20260507081332.2868770-1-maoyixie.tju@gmail.com>
From: 'Simon Horman' <horms@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
rds: filter RDS_INFO_* getsockopt by caller's netns
This commit updates the RDS_INFO_* getsockopt handlers to filter sockets and
connections by the caller's network namespace. This prevents users in
isolated namespaces from reading host-wide socket state.
> diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
> index 76f625986a7f2..98f3cfd48b813 100644
> --- a/net/rds/af_rds.c
> +++ b/net/rds/af_rds.c
[ ... ]
> @@ -847,17 +859,24 @@ static void rds6_sock_info(struct socket *sock, unsigned int len,
> struct rds_info_iterator *iter,
> struct rds_info_lengths *lens)
> {
> + struct net *net = sock_net(sock->sk);
> struct rds6_info_socket sinfo6;
> + unsigned int cnt = 0;
> struct rds_sock *rs;
>
> len /= sizeof(struct rds6_info_socket);
>
> spin_lock_bh(&rds_sock_lock);
>
> - if (len < rds_sock_count)
> + if (len < rds_sock_count) {
> + cnt = rds_sock_count;
> goto out;
> + }
Does this early-out check using the global rds_sock_count break the namespace
isolation and force callers to over-allocate memory?
If a user in an isolated network namespace provides a 0-length buffer to probe
the required size, len < rds_sock_count evaluates to true and the global count
is returned to userspace. Could this allow a user to track the exact host-wide
number of RDS sockets?
Additionally, if an isolated container has 1 socket but the host system has
thousands, the kernel might reject a correctly sized buffer with -ENOSPC and
demand a buffer large enough for all system-wide sockets. Is it possible to
remove this global counter check and always iterate the list to return the
exact netns-filtered count?
Simon says: I'm not convinced that the issue flagged above is a bug.
But it does seem suboptimal to me.
> list_for_each_entry(rs, &rds_sock_list, rs_item) {
> + /* Only show sockets in the caller's netns. */
> + if (!net_eq(sock_net(rds_rs_to_sk(rs)), net))
> + continue;
[ ... ]
> diff --git a/net/rds/tcp.c b/net/rds/tcp.c
> index 654e23d13e3df..ef9e958cafb4f 100644
> --- a/net/rds/tcp.c
> +++ b/net/rds/tcp.c
[ ... ]
> @@ -235,20 +235,27 @@ static void rds_tcp_tc_info(struct socket *rds_sock, unsigned int len,
> struct rds_info_iterator *iter,
> struct rds_info_lengths *lens)
> {
[ ... ]
> list_for_each_entry(tc, &rds_tcp_tc_list, t_list_item) {
> struct inet_sock *inet = inet_sk(tc->t_sock->sk);
Simon says: As a pre-existing issue I don't think this needs to block
progress of this patch. But it does seem worth investigating
(perhaps that is already happening?).
Can concurrent getsockopt calls trigger a NULL pointer dereference here?
While this isn't a regression introduced by this patch, it appears there is
a race condition during list traversal. Looking at rds_tcp_set_callbacks(), a
newly allocated connection tc is added to rds_tcp_tc_list under the lock, but
the lock is released before tc->t_sock is assigned:
rds_tcp_set_callbacks() {
spin_lock(&rds_tcp_tc_list_lock);
list_add_tail(&tc->t_list_item, &rds_tcp_tc_list);
...
spin_unlock(&rds_tcp_tc_list_lock);
...
tc->t_sock = sock;
}
If a caller concurrently executes this getsockopt handler during that window,
it would acquire the lock, observe the new entry, and attempt to evaluate
inet_sk(tc->t_sock->sk). Since tc->t_sock is still NULL, would dereferencing
NULL->sk result in a panic?
> if (tc->t_cpath->cp_conn->c_isv6)
> continue;
> + /* Only show connections in the caller's netns. */
> + if (!net_eq(rds_conn_net(tc->t_cpath->cp_conn), net))
> + continue;
>
> tsinfo.local_addr = inet->inet_saddr;
next prev parent reply other threads:[~2026-05-10 14:57 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-07 8:13 [PATCH net v2] rds: filter RDS_INFO_* getsockopt by caller's netns Maoyi Xie
2026-05-10 14:54 ` Simon Horman [this message]
2026-05-11 6:41 ` Maoyi Xie
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260510145425.1372018-3-horms@kernel.org \
--to=horms@kernel.org \
--cc=achender@kernel.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=maoyi.xie@ntu.edu.sg \
--cc=maoyixie.tju@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=praveen.kakkolangara@aumovio.com \
--cc=rds-devel@oss.oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox