* hsr: netlink notifications leak across network namespaces
@ 2026-05-18 6:44 Maoyi Xie
2026-05-27 7:59 ` [PATCH net] hsr: broadcast netlink notifications in the device's net namespace Maoyi Xie
0 siblings, 1 reply; 4+ messages in thread
From: Maoyi Xie @ 2026-05-18 6:44 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Simon Horman, netdev, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 2889 bytes --]
Hi all,
(Resending from a personal address. My previous attempt from
my NTU corporate account carried an auto-appended confidentiality
disclaimer that you've declined to accept. The content below is
unchanged.)
I noticed what looks like a cross-namespace leak in
net/hsr/hsr_netlink.c on linux-7.1-rc1. Could you confirm if this
is a real defect, and worth fixing?
# Where it is
net/hsr/hsr_netlink.c line 250 in hsr_nl_ringerror()
net/hsr/hsr_netlink.c line 287 in hsr_nl_nodedown()
Both call genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC).
# What looks wrong
hsr_genl_family is registered with .netnsok = true (line 557),
so HSR devices can live in non-init network namespaces. But
genlmsg_multicast() delivers on the kernel default generic
netlink socket. That socket lives in init_net. Events from any
HSR device end up there, regardless of which network namespace
the device sits in.
Other .netnsok = true networking subsystems use
genlmsg_multicast_netns() instead. Examples:
drivers/net/gtp.c
drivers/net/ovpn/netlink.c
drivers/net/team/team_core.c
net/batman-adv/netlink.c
net/core/netdev-genl.c
net/ethtool/netlink.c
net/handshake/netlink.c
# Two effects
1. A privileged listener in init_net subscribed to the
"hsr-network" group receives ring-error and node-down events
for HSR devices in any network namespace. The payload carries
the peer MAC (HSR_A_NODE_ADDR) and the slave port ifindex
(HSR_A_IFINDEX). Looks like a cross-namespace information
leak.
2. A listener inside the HSR device's own network namespace
never sees its own events. Functional defect for any
namespaced HSR consumer.
# How I tested
Attached proof of concept sets up vethA0/vethB0 with hsr1 in
init_net and vethA1/vethB1 with hsr0 in a child network
namespace. Veth pairs bridge the namespace boundary. Both HSRs
talk for 4 seconds so node tables fill. Then vethA0/vethB0 go
down. Each side's prune timer fires hsr_nl_nodedown() for the
peer MAC. Two listeners report what they got.
Vanilla linux-7.1-rc1:
init_net received 2 HSR notifications
child namespace received 0 HSR notifications
With genlmsg_multicast_netns() routed via
dev_net(master->dev) and dev_net(port->dev):
init_net received 1 HSR notification (hsr1's event)
child namespace received 1 HSR notification (hsr0's event)
The proof of concept source poc_hsr_pernet.c and the run log
poc_hsr_pernet.log are attached. The proof of concept build
overrides HSR_NODE_FORGET_TIME from 60000 ms to 3000 ms to
keep wall clock short; the bug shape is the same at the
stock 60 s.
If you confirm this is a real bug, I have a small patch ready
that switches the two callers to genlmsg_multicast_netns().
Please let me know if you would like me to send it as a
follow-up.
Thanks,
Maoyi
--
Nanyang Technological University
https://maoyixie.com/
[-- Attachment #2: poc_hsr_pernet.log --]
[-- Type: application/octet-stream, Size: 1814 bytes --]
# Unpatched (linux-7.1-rc1 vanilla)
$ rmmod hsr; insmod /root/hsr.ko; /root/poc_hsr_pernet 15
[child] bringing up vethA1, vethB1, hsr0
[child] HSR family_id=37 mcgrp_id=15
[child] hsr0 up
[child] listening on HSR mcast in child namespace for 15s...
[child] received 0 HSR notifications in child namespace
[parent] creating veth pairs vethA0<->vethA1, vethB0<->vethB1
[parent] moving vethA1, vethB1 into the child namespace
[parent] creating hsr1 on vethA0+vethB0 in init_net
[parent] HSR family_id=37 mcgrp_id=15 (init_net)
[parent] soak 4s
[parent] bringing vethA0, vethB0 down
[parent] listening 11s for nodedown
[parent(init_net)] HSR notification cmd=2 len=32
[parent(init_net)] HSR notification cmd=2 len=32
========== RESULT ==========
init_net received 2 HSR notifications
child namespace received 0 HSR notifications
UNPATCHED: init_net got events that belong to the child namespace.
# Patched (linux-7.1-rc1 + the two-line genlmsg_multicast_netns conversion)
$ rmmod hsr; insmod /root/hsr.ko; /root/poc_hsr_pernet 15
[child] bringing up vethA1, vethB1, hsr0
[child] HSR family_id=38 mcgrp_id=15
[child] hsr0 up
[child] listening on HSR mcast in child namespace for 15s...
[child] HSR notification cmd=2 len=32
[child] received 1 HSR notifications in child namespace
[parent] creating veth pairs vethA0<->vethA1, vethB0<->vethB1
[parent] moving vethA1, vethB1 into the child namespace
[parent] creating hsr1 on vethA0+vethB0 in init_net
[parent] HSR family_id=38 mcgrp_id=15 (init_net)
[parent] soak 4s
[parent] bringing vethA0, vethB0 down
[parent] listening 11s for nodedown
[parent(init_net)] HSR notification cmd=2 len=32
========== RESULT ==========
init_net received 1 HSR notifications
child namespace received >=1 HSR notifications
PATCHED: each namespace receives only its own events.
[-- Attachment #3: poc_hsr_pernet.c --]
[-- Type: application/octet-stream, Size: 17541 bytes --]
/*
* PoC for HSR genlmsg_multicast cross-namespace leak.
*
* Setup:
* parent in init_net.
* parent creates two veth pairs: vethA0/vethA1 and vethB0/vethB1.
* parent moves vethA1, vethB1 into a child network namespace.
* parent builds hsr1 on vethA0 + vethB0 in init_net.
* child builds hsr0 on vethA1 + vethB1 in the child namespace.
* parent listens on the "hsr-network" mcast group in init_net.
* child listens on the "hsr-network" mcast group in the child
* namespace.
*
* Trigger:
* let both HSRs talk for 4 s.
* parent brings vethA0 and vethB0 down.
* each HSR's prune timer eventually fires hsr_nl_nodedown for the
* peer MAC.
*
* Expected:
* unpatched: parent counts 2 events (hsr1 own + hsr0 leak),
* child counts 0.
* patched: parent counts 1 (hsr1 own), child counts 1 (hsr0 own).
*
* Build: cc -O0 -g poc_hsr_pernet.c -o poc_hsr_pernet
* Run as root in a VM with hsr.ko loaded.
*
* Note on timing:
* The trigger relies on hsr_nl_nodedown firing once each side's
* node-table entry for the peer expires. The expiry time is
* HSR_NODE_FORGET_TIME in net/hsr/hsr_main.h, default 60000 ms.
* To finish in 15 s, rebuild the hsr module with
* HSR_NODE_FORGET_TIME = 3000. To run against a stock kernel,
* pass a timeout argument >= 70:
*
* ./poc_hsr_pernet 75
*
* The bug shape is identical in both cases.
*/
#define _GNU_SOURCE
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/genetlink.h>
#include <linux/if.h>
#include <linux/if_link.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/veth.h>
#include <poll.h>
#include <sched.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>
/* if_nametoindex is normally in <net/if.h>, which clashes with
* <linux/if.h>. Forward-declare to avoid the conflict.
*/
extern unsigned int if_nametoindex(const char *);
/* HSR netlink UAPI (mirror of include/uapi/linux/hsr_netlink.h) */
#define HSR_GENL_NAME "HSR"
#define HSR_GENL_VERSION 1
#define HSR_GENL_MCGRP_NAME "hsr-network"
#define HSR_C_RING_ERROR 1
#define HSR_C_NODE_DOWN 2
#define HSR_ATTR_NODE_ADDR 2
#define HSR_ATTR_IFINDEX 3
/* IFLA_INFO_* for nested netlink */
#define IFLA_INFO_KIND 1
#define IFLA_INFO_DATA 2
/* HSR rtnetlink attrs (include/uapi/linux/if_link.h IFLA_HSR_*) */
#define IFLA_HSR_SLAVE1 1
#define IFLA_HSR_SLAVE2 2
static int nl_open(int *pid_out)
{
int sk = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (sk < 0) { perror("socket(NETLINK_ROUTE)"); return -1; }
struct sockaddr_nl sa = { .nl_family = AF_NETLINK };
if (bind(sk, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
perror("bind NETLINK_ROUTE"); return -1;
}
socklen_t slen = sizeof(sa);
getsockname(sk, (struct sockaddr *)&sa, &slen);
if (pid_out) *pid_out = sa.nl_pid;
return sk;
}
static int nl_open_generic(void)
{
int sk = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
if (sk < 0) { perror("socket(NETLINK_GENERIC)"); return -1; }
struct sockaddr_nl sa = { .nl_family = AF_NETLINK };
if (bind(sk, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
perror("bind NETLINK_GENERIC"); return -1;
}
return sk;
}
static int nl_send_recv(int sk, struct nlmsghdr *nh)
{
if (send(sk, nh, nh->nlmsg_len, 0) < 0) {
perror("send"); return -1;
}
char buf[8192];
int n = recv(sk, buf, sizeof(buf), 0);
if (n < 0) { perror("recv"); return -1; }
struct nlmsghdr *rnh = (struct nlmsghdr *)buf;
if (rnh->nlmsg_type == NLMSG_ERROR) {
struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(rnh);
if (err->error) {
fprintf(stderr, "nlmsgerr %d (%s)\n",
err->error, strerror(-err->error));
return err->error;
}
}
return 0;
}
/* helper: add an nla to existing nh */
static struct nlattr *nla_put(struct nlmsghdr *nh, int type,
const void *data, int len)
{
struct nlattr *na = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
na->nla_type = type;
na->nla_len = NLA_HDRLEN + len;
if (data) memcpy((char *)na + NLA_HDRLEN, data, len);
nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_ALIGN(na->nla_len);
return na;
}
/* Create a veth pair via RTM_NEWLINK */
static int create_veth_pair(int sk, const char *a, const char *b)
{
char buf[4096] = {0};
struct nlmsghdr *nh = (struct nlmsghdr *)buf;
nh->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
nh->nlmsg_type = RTM_NEWLINK;
nh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_EXCL;
nh->nlmsg_seq = 1;
struct ifinfomsg *ifi = NLMSG_DATA(nh);
ifi->ifi_family = AF_UNSPEC;
nla_put(nh, IFLA_IFNAME, a, strlen(a) + 1);
/* nested: linkinfo = { kind="veth", data = { peer = { ifname=b } } } */
struct nlattr *linfo = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
linfo->nla_type = IFLA_LINKINFO;
int linfo_start = nh->nlmsg_len;
linfo->nla_len = NLA_HDRLEN;
nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_HDRLEN;
nla_put(nh, IFLA_INFO_KIND, "veth", strlen("veth") + 1);
struct nlattr *info_data = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
info_data->nla_type = IFLA_INFO_DATA;
int info_data_start = nh->nlmsg_len;
info_data->nla_len = NLA_HDRLEN;
nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_HDRLEN;
struct nlattr *peer = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
peer->nla_type = VETH_INFO_PEER;
int peer_start = nh->nlmsg_len;
peer->nla_len = NLA_HDRLEN + sizeof(struct ifinfomsg);
/* zero ifinfomsg for peer */
memset((char *)peer + NLA_HDRLEN, 0, sizeof(struct ifinfomsg));
nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + peer->nla_len;
nla_put(nh, IFLA_IFNAME, b, strlen(b) + 1);
peer->nla_len = nh->nlmsg_len - peer_start;
info_data->nla_len = nh->nlmsg_len - info_data_start;
linfo->nla_len = nh->nlmsg_len - linfo_start;
return nl_send_recv(sk, nh);
}
/* Move netdev (by ifname) to target netns_fd */
static int set_link_netns(int sk, const char *ifname, int netns_fd)
{
char buf[4096] = {0};
struct nlmsghdr *nh = (struct nlmsghdr *)buf;
nh->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
nh->nlmsg_type = RTM_NEWLINK;
nh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
nh->nlmsg_seq = 2;
struct ifinfomsg *ifi = NLMSG_DATA(nh);
ifi->ifi_family = AF_UNSPEC;
nla_put(nh, IFLA_IFNAME, ifname, strlen(ifname) + 1);
__u32 fd = netns_fd;
nla_put(nh, IFLA_NET_NS_FD, &fd, sizeof(fd));
return nl_send_recv(sk, nh);
}
/* Bring up an interface by name */
static int link_up(int sk, const char *ifname)
{
char buf[4096] = {0};
struct nlmsghdr *nh = (struct nlmsghdr *)buf;
nh->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
nh->nlmsg_type = RTM_NEWLINK;
nh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
nh->nlmsg_seq = 3;
struct ifinfomsg *ifi = NLMSG_DATA(nh);
ifi->ifi_family = AF_UNSPEC;
ifi->ifi_flags = IFF_UP;
ifi->ifi_change = IFF_UP;
nla_put(nh, IFLA_IFNAME, ifname, strlen(ifname) + 1);
return nl_send_recv(sk, nh);
}
static int if_nametoindex_via_proc(const char *name)
{
return if_nametoindex(name);
}
/* Create an HSR device with the two named slaves. */
static int create_hsr_in_ns(int sk, const char *hsr_name,
const char *slave1, const char *slave2)
{
char buf[4096] = {0};
struct nlmsghdr *nh = (struct nlmsghdr *)buf;
nh->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
nh->nlmsg_type = RTM_NEWLINK;
nh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_EXCL;
nh->nlmsg_seq = 4;
struct ifinfomsg *ifi = NLMSG_DATA(nh);
ifi->ifi_family = AF_UNSPEC;
nla_put(nh, IFLA_IFNAME, hsr_name, strlen(hsr_name) + 1);
struct nlattr *linfo = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
linfo->nla_type = IFLA_LINKINFO;
int linfo_start = nh->nlmsg_len;
linfo->nla_len = NLA_HDRLEN;
nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_HDRLEN;
nla_put(nh, IFLA_INFO_KIND, "hsr", 4);
struct nlattr *idata = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
idata->nla_type = IFLA_INFO_DATA;
int idata_start = nh->nlmsg_len;
idata->nla_len = NLA_HDRLEN;
nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_HDRLEN;
__u32 sidx1 = if_nametoindex_via_proc(slave1);
__u32 sidx2 = if_nametoindex_via_proc(slave2);
if (!sidx1 || !sidx2) {
fprintf(stderr, "if_nametoindex %s=%u %s=%u\n",
slave1, sidx1, slave2, sidx2);
return -1;
}
nla_put(nh, IFLA_HSR_SLAVE1, &sidx1, sizeof(sidx1));
nla_put(nh, IFLA_HSR_SLAVE2, &sidx2, sizeof(sidx2));
idata->nla_len = nh->nlmsg_len - idata_start;
linfo->nla_len = nh->nlmsg_len - linfo_start;
return nl_send_recv(sk, nh);
}
/* Resolve HSR genl family id and mcast group id */
static int resolve_hsr_genl(int sk, __u16 *family_id, __u32 *mc_id)
{
char buf[8192] = {0};
struct nlmsghdr *nh = (struct nlmsghdr *)buf;
nh->nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN);
nh->nlmsg_type = GENL_ID_CTRL;
nh->nlmsg_flags = NLM_F_REQUEST;
nh->nlmsg_seq = 10;
struct genlmsghdr *gh = NLMSG_DATA(nh);
gh->cmd = CTRL_CMD_GETFAMILY;
gh->version = 1;
nla_put(nh, CTRL_ATTR_FAMILY_NAME, HSR_GENL_NAME, strlen(HSR_GENL_NAME) + 1);
if (send(sk, buf, nh->nlmsg_len, 0) < 0) {
perror("send GETFAMILY"); return -1;
}
int n = recv(sk, buf, sizeof(buf), 0);
if (n < 0) { perror("recv GETFAMILY"); return -1; }
struct nlmsghdr *rnh = (struct nlmsghdr *)buf;
if (rnh->nlmsg_type == NLMSG_ERROR) {
struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(rnh);
fprintf(stderr, "GETFAMILY error %d\n", err->error);
return -1;
}
char *p = (char *)NLMSG_DATA(rnh) + GENL_HDRLEN;
int rem = rnh->nlmsg_len - NLMSG_LENGTH(GENL_HDRLEN);
*family_id = 0;
*mc_id = 0;
while (rem > 0) {
struct nlattr *a = (struct nlattr *)p;
if (a->nla_type == CTRL_ATTR_FAMILY_ID) {
*family_id = *(__u16 *)((char *)a + NLA_HDRLEN);
} else if (a->nla_type == CTRL_ATTR_MCAST_GROUPS) {
char *q = (char *)a + NLA_HDRLEN;
int qrem = a->nla_len - NLA_HDRLEN;
while (qrem > 0) {
struct nlattr *g = (struct nlattr *)q;
char *gq = (char *)g + NLA_HDRLEN;
int grem = g->nla_len - NLA_HDRLEN;
__u32 gid = 0;
const char *gname = NULL;
while (grem > 0) {
struct nlattr *ga = (struct nlattr *)gq;
if (ga->nla_type == CTRL_ATTR_MCAST_GRP_ID)
gid = *(__u32 *)((char *)ga + NLA_HDRLEN);
else if (ga->nla_type == CTRL_ATTR_MCAST_GRP_NAME)
gname = (char *)ga + NLA_HDRLEN;
int gal = NLA_ALIGN(ga->nla_len);
gq += gal; grem -= gal;
}
if (gname && !strcmp(gname, HSR_GENL_MCGRP_NAME))
*mc_id = gid;
int gl = NLA_ALIGN(g->nla_len);
q += gl; qrem -= gl;
}
}
int al = NLA_ALIGN(a->nla_len);
p += al; rem -= al;
}
return (*family_id && *mc_id) ? 0 : -1;
}
/* Open a NETLINK_GENERIC socket and subscribe to HSR mcast group */
static int hsr_listener(__u32 mc_id)
{
int sk = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
if (sk < 0) { perror("socket(GEN)"); return -1; }
struct sockaddr_nl sa = { .nl_family = AF_NETLINK };
if (bind(sk, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
perror("bind GEN"); return -1;
}
if (setsockopt(sk, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
&mc_id, sizeof(mc_id)) < 0) {
perror("setsockopt ADD_MEMBERSHIP");
return -1;
}
return sk;
}
/* Drain & count notifications received within timeout_ms */
static int drain_notifications(int sk, int timeout_ms, const char *label)
{
int count = 0;
struct pollfd pfd = { .fd = sk, .events = POLLIN };
int remain = timeout_ms;
struct timespec t0, t1;
clock_gettime(CLOCK_MONOTONIC, &t0);
while (remain > 0) {
int n = poll(&pfd, 1, remain);
if (n < 0) break;
if (n == 0) break;
char buf[4096];
int r = recv(sk, buf, sizeof(buf), MSG_DONTWAIT);
if (r < 0) break;
struct nlmsghdr *nh = (struct nlmsghdr *)buf;
while (NLMSG_OK(nh, r)) {
struct genlmsghdr *gh = NLMSG_DATA(nh);
printf("[%s] HSR notification cmd=%u len=%u\n",
label, gh->cmd, nh->nlmsg_len);
count++;
nh = NLMSG_NEXT(nh, r);
}
clock_gettime(CLOCK_MONOTONIC, &t1);
long elapsed_ms = (t1.tv_sec - t0.tv_sec) * 1000 +
(t1.tv_nsec - t0.tv_nsec) / 1000000;
remain = timeout_ms - (int)elapsed_ms;
}
return count;
}
int main(int argc, char **argv)
{
int timeout_s = 30;
if (argc >= 2) timeout_s = atoi(argv[1]);
/* Spawn helper child in fresh netns. Parent stays in init_net. */
int sync_p[2], ready_p[2];
if (pipe(sync_p) < 0 || pipe(ready_p) < 0) { perror("pipe"); return 1; }
pid_t pid = fork();
if (pid < 0) { perror("fork"); return 1; }
if (pid == 0) {
setbuf(stdout, NULL); setbuf(stderr, NULL);
/* CHILD */
close(sync_p[0]); close(ready_p[1]);
if (unshare(CLONE_NEWNET) < 0) {
perror("unshare(NEWNET)"); _exit(1);
}
/* Signal "I am ready to receive vethA1/vethB1" */
write(sync_p[1], "x", 1);
/* Wait for parent to finish device moves + HSR creation */
char c; read(ready_p[0], &c, 1);
close(sync_p[1]); close(ready_p[0]);
/* In child netns: bring up slaves + create HSR. */
int sk = nl_open(NULL);
if (sk < 0) _exit(1);
fprintf(stderr, "[child] bringing up vethA1, vethB1, hsr0\n");
link_up(sk, "vethA1");
link_up(sk, "vethB1");
__u16 fid; __u32 mcid;
int gsk = nl_open_generic();
if (gsk < 0) _exit(1);
if (resolve_hsr_genl(gsk, &fid, &mcid) < 0) {
fprintf(stderr, "[child] resolve HSR genl failed\n");
_exit(1);
}
fprintf(stderr, "[child] HSR family_id=%u mcgrp_id=%u\n", fid, mcid);
if (create_hsr_in_ns(sk, "hsr0", "vethA1", "vethB1") < 0) {
fprintf(stderr, "[child] create_hsr_in_ns failed\n");
_exit(1);
}
link_up(sk, "hsr0");
fprintf(stderr, "[child] hsr0 up\n");
/* Open genl mcast listener for HSR_MCGRP */
int lsk = hsr_listener(mcid);
if (lsk < 0) _exit(1);
fprintf(stderr, "[child] listening on HSR mcast in child namespace for %ds...\n",
timeout_s);
int n = drain_notifications(lsk, timeout_s * 1000, "child");
fprintf(stderr, "[child] received %d HSR notifications in child namespace\n", n);
_exit(n > 0 ? 0 : 42);
}
/* PARENT (init_net) */
close(sync_p[1]); close(ready_p[0]);
char c; read(sync_p[0], &c, 1); /* wait until child unshare'd */
close(sync_p[0]);
int sk = nl_open(NULL);
if (sk < 0) { kill(pid, SIGKILL); return 1; }
printf("[parent] creating veth pairs vethA0<->vethA1, vethB0<->vethB1\n");
if (create_veth_pair(sk, "vethA0", "vethA1") < 0 ||
create_veth_pair(sk, "vethB0", "vethB1") < 0) {
kill(pid, SIGKILL); return 1;
}
char path[64];
snprintf(path, sizeof(path), "/proc/%d/ns/net", pid);
int netns_fd = open(path, O_RDONLY);
if (netns_fd < 0) { perror("open child namespace"); kill(pid, SIGKILL); return 1; }
printf("[parent] moving vethA1, vethB1 into the child namespace\n");
if (set_link_netns(sk, "vethA1", netns_fd) < 0 ||
set_link_netns(sk, "vethB1", netns_fd) < 0) {
kill(pid, SIGKILL); return 1;
}
link_up(sk, "vethA0");
link_up(sk, "vethB0");
close(netns_fd);
/* hsr1 is the peer that hsr0 in N will time out once we cut frames. */
printf("[parent] creating hsr1 on vethA0+vethB0 in init_net\n");
if (create_hsr_in_ns(sk, "hsr1", "vethA0", "vethB0") < 0) {
fprintf(stderr, "[parent] create hsr1 failed\n");
kill(pid, SIGKILL); return 1;
}
link_up(sk, "hsr1");
/* Parent subscribes to HSR mcast in init_net */
__u16 fid; __u32 mcid;
int gsk = nl_open_generic();
if (gsk < 0) { kill(pid, SIGKILL); return 1; }
if (resolve_hsr_genl(gsk, &fid, &mcid) < 0) {
fprintf(stderr, "[parent] resolve HSR genl failed\n");
kill(pid, SIGKILL); return 1;
}
printf("[parent] HSR family_id=%u mcgrp_id=%u (init_net)\n", fid, mcid);
int lsk = hsr_listener(mcid);
if (lsk < 0) { kill(pid, SIGKILL); return 1; }
/* Signal child to proceed with HSR creation */
write(ready_p[1], "x", 1);
close(ready_p[1]);
/* let both HSRs talk for 4 s so node tables populate */
printf("[parent] soak 4s\n");
drain_notifications(lsk, 4000, "parent(init_net)");
/* cut frames by bringing the init_net slaves down */
printf("[parent] bringing vethA0, vethB0 down\n");
char buf2[4096] = {0};
struct nlmsghdr *nh2 = (struct nlmsghdr *)buf2;
nh2->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
nh2->nlmsg_type = RTM_NEWLINK;
nh2->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
nh2->nlmsg_seq = 5;
struct ifinfomsg *ifi2 = NLMSG_DATA(nh2);
ifi2->ifi_family = AF_UNSPEC;
ifi2->ifi_change = IFF_UP;
ifi2->ifi_flags = 0;
nla_put(nh2, IFLA_IFNAME, "vethA0", 7);
nl_send_recv(sk, nh2);
nh2->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
ifi2->ifi_family = AF_UNSPEC;
ifi2->ifi_change = IFF_UP;
ifi2->ifi_flags = 0;
nh2->nlmsg_seq = 6;
nla_put(nh2, IFLA_IFNAME, "vethB0", 7);
nl_send_recv(sk, nh2);
printf("[parent] listening %ds for nodedown\n", timeout_s - 4);
int n = drain_notifications(lsk, (timeout_s - 4) * 1000, "parent(init_net)");
int status;
waitpid(pid, &status, 0);
int child_count = (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 :
(WIFEXITED(status) && WEXITSTATUS(status) == 42) ? 0 :
-1;
printf("\n========== RESULT ==========\n");
printf("init_net received %d HSR notifications\n", n);
printf("child namespace received %s HSR notifications\n",
child_count == 1 ? ">=1" : "0");
if (n > 0 && child_count == 0)
printf("UNPATCHED: init_net got events that belong to the child namespace.\n");
else if (n >= 1 && child_count >= 1)
printf("PATCHED: each namespace receives only its own events.\n");
else
printf("No events in the chosen window. Try a longer timeout.\n");
return 0;
}
^ permalink raw reply [flat|nested] 4+ messages in thread* [PATCH net] hsr: broadcast netlink notifications in the device's net namespace
2026-05-18 6:44 hsr: netlink notifications leak across network namespaces Maoyi Xie
@ 2026-05-27 7:59 ` Maoyi Xie
2026-05-27 11:11 ` Fernando Fernandez Mancera
0 siblings, 1 reply; 4+ messages in thread
From: Maoyi Xie @ 2026-05-27 7:59 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman
Cc: Fernando Fernandez Mancera, Jan Vaclav, Andrew Lunn, Taehee Yoo,
netdev, linux-kernel, Maoyi Xie, stable
The HSR generic netlink family sets .netnsok = true. HSR devices can
live in network namespaces other than init_net.
Two async notifiers broadcast events with genlmsg_multicast(). They
are hsr_nl_ringerror() and hsr_nl_nodedown(). That helper delivers
only on the default genl socket in init_net. So the events always land
in init_net. The network namespace of the device does not matter.
This has two effects. A listener in the device's own namespace never
sees its own ring error and node down events. A privileged listener in
init_net receives events from HSR devices in other namespaces. The
payload carries the peer node MAC (HSR_A_NODE_ADDR) and the slave port
ifindex (HSR_A_IFINDEX). It leaks information across network
namespaces.
Switch both callers to genlmsg_multicast_netns(). Other families with
.netnsok = true already do this. Examples are gtp, ovpn, team,
batman-adv, netdev-genl, ethtool and handshake.
hsr_nl_ringerror() already has the slave port. It uses
dev_net(port->dev). hsr_nl_nodedown() takes the namespace from the
master port via hsr_port_get_hsr().
Fixes: 09e91dbea0aa ("hsr: set .netnsok flag")
Cc: stable@vger.kernel.org
Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com>
---
This is the fix for the problem I reported on netdev on 2026-05-18 [1].
That thread had no reply, so I am sending the patch and adding the HSR
maintainers to Cc. The proof of concept and the test numbers are in
that message.
[1] https://lore.kernel.org/netdev/CAHPEe=GO=2qqWZPwBB4rrXc3mkD0dznp2K78nCsKwF=c-QwxEw@mail.gmail.com/
net/hsr/hsr_netlink.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c
index db0b0af7a692..067ceaf7304b 100644
--- a/net/hsr/hsr_netlink.c
+++ b/net/hsr/hsr_netlink.c
@@ -247,7 +247,8 @@ void hsr_nl_ringerror(struct hsr_priv *hsr, unsigned char addr[ETH_ALEN],
goto nla_put_failure;
genlmsg_end(skb, msg_head);
- genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC);
+ genlmsg_multicast_netns(&hsr_genl_family, dev_net(port->dev),
+ skb, 0, 0, GFP_ATOMIC);
return;
@@ -283,8 +284,17 @@ void hsr_nl_nodedown(struct hsr_priv *hsr, unsigned char addr[ETH_ALEN])
if (res < 0)
goto nla_put_failure;
+ rcu_read_lock();
+ master = hsr_port_get_hsr(hsr, HSR_PT_MASTER);
+ if (!master) {
+ rcu_read_unlock();
+ goto nla_put_failure;
+ }
+
genlmsg_end(skb, msg_head);
- genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC);
+ genlmsg_multicast_netns(&hsr_genl_family, dev_net(master->dev),
+ skb, 0, 0, GFP_ATOMIC);
+ rcu_read_unlock();
return;
--
2.34.1
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH net] hsr: broadcast netlink notifications in the device's net namespace
2026-05-27 7:59 ` [PATCH net] hsr: broadcast netlink notifications in the device's net namespace Maoyi Xie
@ 2026-05-27 11:11 ` Fernando Fernandez Mancera
2026-05-27 23:18 ` Jakub Kicinski
0 siblings, 1 reply; 4+ messages in thread
From: Fernando Fernandez Mancera @ 2026-05-27 11:11 UTC (permalink / raw)
To: Maoyi Xie, David S . Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman
Cc: Jan Vaclav, Andrew Lunn, Taehee Yoo, netdev, linux-kernel, stable
On 5/27/26 9:59 AM, Maoyi Xie wrote:
> The HSR generic netlink family sets .netnsok = true. HSR devices can
> live in network namespaces other than init_net.
>
> Two async notifiers broadcast events with genlmsg_multicast(). They
> are hsr_nl_ringerror() and hsr_nl_nodedown(). That helper delivers
> only on the default genl socket in init_net. So the events always land
> in init_net. The network namespace of the device does not matter.
>
> This has two effects. A listener in the device's own namespace never
> sees its own ring error and node down events. A privileged listener in
> init_net receives events from HSR devices in other namespaces. The
> payload carries the peer node MAC (HSR_A_NODE_ADDR) and the slave port
> ifindex (HSR_A_IFINDEX). It leaks information across network
> namespaces.
>
> Switch both callers to genlmsg_multicast_netns(). Other families with
> .netnsok = true already do this. Examples are gtp, ovpn, team,
> batman-adv, netdev-genl, ethtool and handshake.
>
> hsr_nl_ringerror() already has the slave port. It uses
> dev_net(port->dev). hsr_nl_nodedown() takes the namespace from the
> master port via hsr_port_get_hsr().
>
> Fixes: 09e91dbea0aa ("hsr: set .netnsok flag")
> Cc: stable@vger.kernel.org
> Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com>
> ---
> This is the fix for the problem I reported on netdev on 2026-05-18 [1].
> That thread had no reply, so I am sending the patch and adding the HSR
> maintainers to Cc. The proof of concept and the test numbers are in
> that message.
>
> [1] https://lore.kernel.org/netdev/CAHPEe=GO=2qqWZPwBB4rrXc3mkD0dznp2K78nCsKwF=c-QwxEw@mail.gmail.com/
>
> net/hsr/hsr_netlink.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c
> index db0b0af7a692..067ceaf7304b 100644
> --- a/net/hsr/hsr_netlink.c
> +++ b/net/hsr/hsr_netlink.c
> @@ -247,7 +247,8 @@ void hsr_nl_ringerror(struct hsr_priv *hsr, unsigned char addr[ETH_ALEN],
> goto nla_put_failure;
>
> genlmsg_end(skb, msg_head);
> - genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC);
> + genlmsg_multicast_netns(&hsr_genl_family, dev_net(port->dev),
> + skb, 0, 0, GFP_ATOMIC);
>
> return;
>
> @@ -283,8 +284,17 @@ void hsr_nl_nodedown(struct hsr_priv *hsr, unsigned char addr[ETH_ALEN])
> if (res < 0)
> goto nla_put_failure;
>
> + rcu_read_lock();
> + master = hsr_port_get_hsr(hsr, HSR_PT_MASTER);
> + if (!master) {
> + rcu_read_unlock();
> + goto nla_put_failure;
> + }
> +
> genlmsg_end(skb, msg_head);
> - genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC);
> + genlmsg_multicast_netns(&hsr_genl_family, dev_net(master->dev),
> + skb, 0, 0, GFP_ATOMIC);
> + rcu_read_unlock();
>
The patch makes sense to me. Anyway, I think we can reduce the RCU
critical section here.
What about moving rcu_read_unlock() after the if and extract the net
before unlocking? The benefit is minimal but I think it is worth it.
Thanks,
Fernando.
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH net] hsr: broadcast netlink notifications in the device's net namespace
2026-05-27 11:11 ` Fernando Fernandez Mancera
@ 2026-05-27 23:18 ` Jakub Kicinski
0 siblings, 0 replies; 4+ messages in thread
From: Jakub Kicinski @ 2026-05-27 23:18 UTC (permalink / raw)
To: Fernando Fernandez Mancera
Cc: Maoyi Xie, David S . Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Jan Vaclav, Andrew Lunn, Taehee Yoo, netdev,
linux-kernel, stable
On Wed, 27 May 2026 13:11:57 +0200 Fernando Fernandez Mancera wrote:
> > genlmsg_end(skb, msg_head);
> > - genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC);
> > + genlmsg_multicast_netns(&hsr_genl_family, dev_net(master->dev),
> > + skb, 0, 0, GFP_ATOMIC);
> > + rcu_read_unlock();
> >
>
> The patch makes sense to me. Anyway, I think we can reduce the RCU
> critical section here.
>
> What about moving rcu_read_unlock() after the if and extract the net
> before unlocking? The benefit is minimal but I think it is worth it.
Not sure TBH, we'd need to take a ref on the netns and allocate
a tracker (on DEBUG kernels). One could go either way.
I'm replying because I wanted to question whether this is Fixes+stable@
worthy. Sending the notifications to the namespace where the device is
makes sense. But it's as much a behavior changes as it is a fix.
The commit in question was merged to 5.6, real users clearly don't care.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-27 23:18 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-18 6:44 hsr: netlink notifications leak across network namespaces Maoyi Xie
2026-05-27 7:59 ` [PATCH net] hsr: broadcast netlink notifications in the device's net namespace Maoyi Xie
2026-05-27 11:11 ` Fernando Fernandez Mancera
2026-05-27 23:18 ` Jakub Kicinski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox