Netdev List
 help / color / mirror / Atom feed
* hsr: netlink notifications leak across network namespaces
@ 2026-05-18  6:44 Maoyi Xie
  2026-05-27  7:59 ` [PATCH net] hsr: broadcast netlink notifications in the device's net namespace Maoyi Xie
  0 siblings, 1 reply; 4+ messages in thread
From: Maoyi Xie @ 2026-05-18  6:44 UTC (permalink / raw)
  To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: Simon Horman, netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2889 bytes --]

Hi all,

(Resending from a personal address. My previous attempt from
my NTU corporate account carried an auto-appended confidentiality
disclaimer that you've declined to accept. The content below is
unchanged.)

I noticed what looks like a cross-namespace leak in
net/hsr/hsr_netlink.c on linux-7.1-rc1. Could you confirm if this
is a real defect, and worth fixing?

# Where it is

net/hsr/hsr_netlink.c line 250 in hsr_nl_ringerror()
net/hsr/hsr_netlink.c line 287 in hsr_nl_nodedown()

Both call genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC).

# What looks wrong

hsr_genl_family is registered with .netnsok = true (line 557),
so HSR devices can live in non-init network namespaces. But
genlmsg_multicast() delivers on the kernel default generic
netlink socket. That socket lives in init_net. Events from any
HSR device end up there, regardless of which network namespace
the device sits in.

Other .netnsok = true networking subsystems use
genlmsg_multicast_netns() instead. Examples:

  drivers/net/gtp.c
  drivers/net/ovpn/netlink.c
  drivers/net/team/team_core.c
  net/batman-adv/netlink.c
  net/core/netdev-genl.c
  net/ethtool/netlink.c
  net/handshake/netlink.c

# Two effects

1. A privileged listener in init_net subscribed to the
  "hsr-network" group receives ring-error and node-down events
   for HSR devices in any network namespace. The payload carries
   the peer MAC (HSR_A_NODE_ADDR) and the slave port ifindex
   (HSR_A_IFINDEX). Looks like a cross-namespace information
   leak.

2. A listener inside the HSR device's own network namespace
   never sees its own events. Functional defect for any
   namespaced HSR consumer.

# How I tested

Attached proof of concept sets up vethA0/vethB0 with hsr1 in
init_net and vethA1/vethB1 with hsr0 in a child network
namespace. Veth pairs bridge the namespace boundary. Both HSRs
talk for 4 seconds so node tables fill. Then vethA0/vethB0 go
down. Each side's prune timer fires hsr_nl_nodedown() for the
peer MAC. Two listeners report what they got.

Vanilla linux-7.1-rc1:
    init_net          received 2 HSR notifications
    child namespace   received 0 HSR notifications

With genlmsg_multicast_netns() routed via
dev_net(master->dev) and dev_net(port->dev):
    init_net          received 1 HSR notification    (hsr1's event)
    child namespace   received 1 HSR notification    (hsr0's event)

The proof of concept source poc_hsr_pernet.c and the run log
poc_hsr_pernet.log are attached. The proof of concept build
overrides HSR_NODE_FORGET_TIME from 60000 ms to 3000 ms to
keep wall clock short; the bug shape is the same at the
stock 60 s.

If you confirm this is a real bug, I have a small patch ready
that switches the two callers to genlmsg_multicast_netns().
Please let me know if you would like me to send it as a
follow-up.

Thanks,
Maoyi
--
Nanyang Technological University
https://maoyixie.com/

[-- Attachment #2: poc_hsr_pernet.log --]
[-- Type: application/octet-stream, Size: 1814 bytes --]

# Unpatched (linux-7.1-rc1 vanilla)

$ rmmod hsr; insmod /root/hsr.ko; /root/poc_hsr_pernet 15
[child] bringing up vethA1, vethB1, hsr0
[child] HSR family_id=37 mcgrp_id=15
[child] hsr0 up
[child] listening on HSR mcast in child namespace for 15s...
[child] received 0 HSR notifications in child namespace
[parent] creating veth pairs vethA0<->vethA1, vethB0<->vethB1
[parent] moving vethA1, vethB1 into the child namespace
[parent] creating hsr1 on vethA0+vethB0 in init_net
[parent] HSR family_id=37 mcgrp_id=15 (init_net)
[parent] soak 4s
[parent] bringing vethA0, vethB0 down
[parent] listening 11s for nodedown
[parent(init_net)] HSR notification cmd=2 len=32
[parent(init_net)] HSR notification cmd=2 len=32

========== RESULT ==========
init_net   received 2 HSR notifications
child namespace  received 0 HSR notifications
UNPATCHED: init_net got events that belong to the child namespace.


# Patched (linux-7.1-rc1 + the two-line genlmsg_multicast_netns conversion)

$ rmmod hsr; insmod /root/hsr.ko; /root/poc_hsr_pernet 15
[child] bringing up vethA1, vethB1, hsr0
[child] HSR family_id=38 mcgrp_id=15
[child] hsr0 up
[child] listening on HSR mcast in child namespace for 15s...
[child] HSR notification cmd=2 len=32
[child] received 1 HSR notifications in child namespace
[parent] creating veth pairs vethA0<->vethA1, vethB0<->vethB1
[parent] moving vethA1, vethB1 into the child namespace
[parent] creating hsr1 on vethA0+vethB0 in init_net
[parent] HSR family_id=38 mcgrp_id=15 (init_net)
[parent] soak 4s
[parent] bringing vethA0, vethB0 down
[parent] listening 11s for nodedown
[parent(init_net)] HSR notification cmd=2 len=32

========== RESULT ==========
init_net   received 1 HSR notifications
child namespace  received >=1 HSR notifications
PATCHED: each namespace receives only its own events.

[-- Attachment #3: poc_hsr_pernet.c --]
[-- Type: application/octet-stream, Size: 17541 bytes --]

/*
 * PoC for HSR genlmsg_multicast cross-namespace leak.
 *
 * Setup:
 *   parent in init_net.
 *   parent creates two veth pairs: vethA0/vethA1 and vethB0/vethB1.
 *   parent moves vethA1, vethB1 into a child network namespace.
 *   parent builds hsr1 on vethA0 + vethB0 in init_net.
 *   child  builds hsr0 on vethA1 + vethB1 in the child namespace.
 *   parent listens on the "hsr-network" mcast group in init_net.
 *   child  listens on the "hsr-network" mcast group in the child
 *          namespace.
 *
 * Trigger:
 *   let both HSRs talk for 4 s.
 *   parent brings vethA0 and vethB0 down.
 *   each HSR's prune timer eventually fires hsr_nl_nodedown for the
 *   peer MAC.
 *
 * Expected:
 *   unpatched: parent counts 2 events (hsr1 own + hsr0 leak),
 *              child  counts 0.
 *   patched:   parent counts 1 (hsr1 own), child counts 1 (hsr0 own).
 *
 * Build: cc -O0 -g poc_hsr_pernet.c -o poc_hsr_pernet
 * Run as root in a VM with hsr.ko loaded.
 *
 * Note on timing:
 *   The trigger relies on hsr_nl_nodedown firing once each side's
 *   node-table entry for the peer expires. The expiry time is
 *   HSR_NODE_FORGET_TIME in net/hsr/hsr_main.h, default 60000 ms.
 *   To finish in 15 s, rebuild the hsr module with
 *   HSR_NODE_FORGET_TIME = 3000. To run against a stock kernel,
 *   pass a timeout argument >= 70:
 *
 *     ./poc_hsr_pernet 75
 *
 *   The bug shape is identical in both cases.
 */

#define _GNU_SOURCE
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/genetlink.h>
#include <linux/if.h>
#include <linux/if_link.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/veth.h>
#include <poll.h>
#include <sched.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* if_nametoindex is normally in <net/if.h>, which clashes with
 * <linux/if.h>. Forward-declare to avoid the conflict.
 */
extern unsigned int if_nametoindex(const char *);

/* HSR netlink UAPI (mirror of include/uapi/linux/hsr_netlink.h) */
#define HSR_GENL_NAME		"HSR"
#define HSR_GENL_VERSION	1
#define HSR_GENL_MCGRP_NAME	"hsr-network"
#define HSR_C_RING_ERROR	1
#define HSR_C_NODE_DOWN		2

#define HSR_ATTR_NODE_ADDR	2
#define HSR_ATTR_IFINDEX	3

/* IFLA_INFO_* for nested netlink */
#define IFLA_INFO_KIND		1
#define IFLA_INFO_DATA		2

/* HSR rtnetlink attrs (include/uapi/linux/if_link.h IFLA_HSR_*) */
#define IFLA_HSR_SLAVE1		1
#define IFLA_HSR_SLAVE2		2

static int nl_open(int *pid_out)
{
	int sk = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
	if (sk < 0) { perror("socket(NETLINK_ROUTE)"); return -1; }
	struct sockaddr_nl sa = { .nl_family = AF_NETLINK };
	if (bind(sk, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		perror("bind NETLINK_ROUTE"); return -1;
	}
	socklen_t slen = sizeof(sa);
	getsockname(sk, (struct sockaddr *)&sa, &slen);
	if (pid_out) *pid_out = sa.nl_pid;
	return sk;
}

static int nl_open_generic(void)
{
	int sk = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
	if (sk < 0) { perror("socket(NETLINK_GENERIC)"); return -1; }
	struct sockaddr_nl sa = { .nl_family = AF_NETLINK };
	if (bind(sk, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		perror("bind NETLINK_GENERIC"); return -1;
	}
	return sk;
}

static int nl_send_recv(int sk, struct nlmsghdr *nh)
{
	if (send(sk, nh, nh->nlmsg_len, 0) < 0) {
		perror("send"); return -1;
	}
	char buf[8192];
	int n = recv(sk, buf, sizeof(buf), 0);
	if (n < 0) { perror("recv"); return -1; }
	struct nlmsghdr *rnh = (struct nlmsghdr *)buf;
	if (rnh->nlmsg_type == NLMSG_ERROR) {
		struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(rnh);
		if (err->error) {
			fprintf(stderr, "nlmsgerr %d (%s)\n",
				err->error, strerror(-err->error));
			return err->error;
		}
	}
	return 0;
}

/* helper: add an nla to existing nh */
static struct nlattr *nla_put(struct nlmsghdr *nh, int type,
			      const void *data, int len)
{
	struct nlattr *na = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
	na->nla_type = type;
	na->nla_len = NLA_HDRLEN + len;
	if (data) memcpy((char *)na + NLA_HDRLEN, data, len);
	nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_ALIGN(na->nla_len);
	return na;
}

/* Create a veth pair via RTM_NEWLINK */
static int create_veth_pair(int sk, const char *a, const char *b)
{
	char buf[4096] = {0};
	struct nlmsghdr *nh = (struct nlmsghdr *)buf;
	nh->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	nh->nlmsg_type = RTM_NEWLINK;
	nh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_EXCL;
	nh->nlmsg_seq = 1;
	struct ifinfomsg *ifi = NLMSG_DATA(nh);
	ifi->ifi_family = AF_UNSPEC;

	nla_put(nh, IFLA_IFNAME, a, strlen(a) + 1);

	/* nested: linkinfo = { kind="veth", data = { peer = { ifname=b } } } */
	struct nlattr *linfo = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
	linfo->nla_type = IFLA_LINKINFO;
	int linfo_start = nh->nlmsg_len;
	linfo->nla_len = NLA_HDRLEN;
	nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_HDRLEN;
	nla_put(nh, IFLA_INFO_KIND, "veth", strlen("veth") + 1);

	struct nlattr *info_data = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
	info_data->nla_type = IFLA_INFO_DATA;
	int info_data_start = nh->nlmsg_len;
	info_data->nla_len = NLA_HDRLEN;
	nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_HDRLEN;

	struct nlattr *peer = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
	peer->nla_type = VETH_INFO_PEER;
	int peer_start = nh->nlmsg_len;
	peer->nla_len = NLA_HDRLEN + sizeof(struct ifinfomsg);
	/* zero ifinfomsg for peer */
	memset((char *)peer + NLA_HDRLEN, 0, sizeof(struct ifinfomsg));
	nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + peer->nla_len;
	nla_put(nh, IFLA_IFNAME, b, strlen(b) + 1);
	peer->nla_len = nh->nlmsg_len - peer_start;
	info_data->nla_len = nh->nlmsg_len - info_data_start;
	linfo->nla_len = nh->nlmsg_len - linfo_start;

	return nl_send_recv(sk, nh);
}

/* Move netdev (by ifname) to target netns_fd */
static int set_link_netns(int sk, const char *ifname, int netns_fd)
{
	char buf[4096] = {0};
	struct nlmsghdr *nh = (struct nlmsghdr *)buf;
	nh->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	nh->nlmsg_type = RTM_NEWLINK;
	nh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
	nh->nlmsg_seq = 2;
	struct ifinfomsg *ifi = NLMSG_DATA(nh);
	ifi->ifi_family = AF_UNSPEC;

	nla_put(nh, IFLA_IFNAME, ifname, strlen(ifname) + 1);
	__u32 fd = netns_fd;
	nla_put(nh, IFLA_NET_NS_FD, &fd, sizeof(fd));

	return nl_send_recv(sk, nh);
}

/* Bring up an interface by name */
static int link_up(int sk, const char *ifname)
{
	char buf[4096] = {0};
	struct nlmsghdr *nh = (struct nlmsghdr *)buf;
	nh->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	nh->nlmsg_type = RTM_NEWLINK;
	nh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
	nh->nlmsg_seq = 3;
	struct ifinfomsg *ifi = NLMSG_DATA(nh);
	ifi->ifi_family = AF_UNSPEC;
	ifi->ifi_flags = IFF_UP;
	ifi->ifi_change = IFF_UP;
	nla_put(nh, IFLA_IFNAME, ifname, strlen(ifname) + 1);
	return nl_send_recv(sk, nh);
}

static int if_nametoindex_via_proc(const char *name)
{
	return if_nametoindex(name);
}

/* Create an HSR device with the two named slaves. */
static int create_hsr_in_ns(int sk, const char *hsr_name,
			    const char *slave1, const char *slave2)
{
	char buf[4096] = {0};
	struct nlmsghdr *nh = (struct nlmsghdr *)buf;
	nh->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	nh->nlmsg_type = RTM_NEWLINK;
	nh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_EXCL;
	nh->nlmsg_seq = 4;
	struct ifinfomsg *ifi = NLMSG_DATA(nh);
	ifi->ifi_family = AF_UNSPEC;
	nla_put(nh, IFLA_IFNAME, hsr_name, strlen(hsr_name) + 1);

	struct nlattr *linfo = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
	linfo->nla_type = IFLA_LINKINFO;
	int linfo_start = nh->nlmsg_len;
	linfo->nla_len = NLA_HDRLEN;
	nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_HDRLEN;
	nla_put(nh, IFLA_INFO_KIND, "hsr", 4);

	struct nlattr *idata = (struct nlattr *)((char *)nh + NLMSG_ALIGN(nh->nlmsg_len));
	idata->nla_type = IFLA_INFO_DATA;
	int idata_start = nh->nlmsg_len;
	idata->nla_len = NLA_HDRLEN;
	nh->nlmsg_len = NLMSG_ALIGN(nh->nlmsg_len) + NLA_HDRLEN;

	__u32 sidx1 = if_nametoindex_via_proc(slave1);
	__u32 sidx2 = if_nametoindex_via_proc(slave2);
	if (!sidx1 || !sidx2) {
		fprintf(stderr, "if_nametoindex %s=%u %s=%u\n",
			slave1, sidx1, slave2, sidx2);
		return -1;
	}
	nla_put(nh, IFLA_HSR_SLAVE1, &sidx1, sizeof(sidx1));
	nla_put(nh, IFLA_HSR_SLAVE2, &sidx2, sizeof(sidx2));

	idata->nla_len = nh->nlmsg_len - idata_start;
	linfo->nla_len = nh->nlmsg_len - linfo_start;

	return nl_send_recv(sk, nh);
}

/* Resolve HSR genl family id and mcast group id */
static int resolve_hsr_genl(int sk, __u16 *family_id, __u32 *mc_id)
{
	char buf[8192] = {0};
	struct nlmsghdr *nh = (struct nlmsghdr *)buf;
	nh->nlmsg_len = NLMSG_LENGTH(GENL_HDRLEN);
	nh->nlmsg_type = GENL_ID_CTRL;
	nh->nlmsg_flags = NLM_F_REQUEST;
	nh->nlmsg_seq = 10;
	struct genlmsghdr *gh = NLMSG_DATA(nh);
	gh->cmd = CTRL_CMD_GETFAMILY;
	gh->version = 1;
	nla_put(nh, CTRL_ATTR_FAMILY_NAME, HSR_GENL_NAME, strlen(HSR_GENL_NAME) + 1);

	if (send(sk, buf, nh->nlmsg_len, 0) < 0) {
		perror("send GETFAMILY"); return -1;
	}
	int n = recv(sk, buf, sizeof(buf), 0);
	if (n < 0) { perror("recv GETFAMILY"); return -1; }
	struct nlmsghdr *rnh = (struct nlmsghdr *)buf;
	if (rnh->nlmsg_type == NLMSG_ERROR) {
		struct nlmsgerr *err = (struct nlmsgerr *)NLMSG_DATA(rnh);
		fprintf(stderr, "GETFAMILY error %d\n", err->error);
		return -1;
	}
	char *p = (char *)NLMSG_DATA(rnh) + GENL_HDRLEN;
	int rem = rnh->nlmsg_len - NLMSG_LENGTH(GENL_HDRLEN);
	*family_id = 0;
	*mc_id = 0;
	while (rem > 0) {
		struct nlattr *a = (struct nlattr *)p;
		if (a->nla_type == CTRL_ATTR_FAMILY_ID) {
			*family_id = *(__u16 *)((char *)a + NLA_HDRLEN);
		} else if (a->nla_type == CTRL_ATTR_MCAST_GROUPS) {
			char *q = (char *)a + NLA_HDRLEN;
			int qrem = a->nla_len - NLA_HDRLEN;
			while (qrem > 0) {
				struct nlattr *g = (struct nlattr *)q;
				char *gq = (char *)g + NLA_HDRLEN;
				int grem = g->nla_len - NLA_HDRLEN;
				__u32 gid = 0;
				const char *gname = NULL;
				while (grem > 0) {
					struct nlattr *ga = (struct nlattr *)gq;
					if (ga->nla_type == CTRL_ATTR_MCAST_GRP_ID)
						gid = *(__u32 *)((char *)ga + NLA_HDRLEN);
					else if (ga->nla_type == CTRL_ATTR_MCAST_GRP_NAME)
						gname = (char *)ga + NLA_HDRLEN;
					int gal = NLA_ALIGN(ga->nla_len);
					gq += gal; grem -= gal;
				}
				if (gname && !strcmp(gname, HSR_GENL_MCGRP_NAME))
					*mc_id = gid;
				int gl = NLA_ALIGN(g->nla_len);
				q += gl; qrem -= gl;
			}
		}
		int al = NLA_ALIGN(a->nla_len);
		p += al; rem -= al;
	}
	return (*family_id && *mc_id) ? 0 : -1;
}

/* Open a NETLINK_GENERIC socket and subscribe to HSR mcast group */
static int hsr_listener(__u32 mc_id)
{
	int sk = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
	if (sk < 0) { perror("socket(GEN)"); return -1; }
	struct sockaddr_nl sa = { .nl_family = AF_NETLINK };
	if (bind(sk, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		perror("bind GEN"); return -1;
	}
	if (setsockopt(sk, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
		       &mc_id, sizeof(mc_id)) < 0) {
		perror("setsockopt ADD_MEMBERSHIP");
		return -1;
	}
	return sk;
}

/* Drain & count notifications received within timeout_ms */
static int drain_notifications(int sk, int timeout_ms, const char *label)
{
	int count = 0;
	struct pollfd pfd = { .fd = sk, .events = POLLIN };
	int remain = timeout_ms;
	struct timespec t0, t1;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	while (remain > 0) {
		int n = poll(&pfd, 1, remain);
		if (n < 0) break;
		if (n == 0) break;
		char buf[4096];
		int r = recv(sk, buf, sizeof(buf), MSG_DONTWAIT);
		if (r < 0) break;
		struct nlmsghdr *nh = (struct nlmsghdr *)buf;
		while (NLMSG_OK(nh, r)) {
			struct genlmsghdr *gh = NLMSG_DATA(nh);
			printf("[%s] HSR notification cmd=%u len=%u\n",
			       label, gh->cmd, nh->nlmsg_len);
			count++;
			nh = NLMSG_NEXT(nh, r);
		}
		clock_gettime(CLOCK_MONOTONIC, &t1);
		long elapsed_ms = (t1.tv_sec - t0.tv_sec) * 1000 +
				  (t1.tv_nsec - t0.tv_nsec) / 1000000;
		remain = timeout_ms - (int)elapsed_ms;
	}
	return count;
}

int main(int argc, char **argv)
{
	int timeout_s = 30;
	if (argc >= 2) timeout_s = atoi(argv[1]);

	/* Spawn helper child in fresh netns. Parent stays in init_net. */
	int sync_p[2], ready_p[2];
	if (pipe(sync_p) < 0 || pipe(ready_p) < 0) { perror("pipe"); return 1; }

	pid_t pid = fork();
	if (pid < 0) { perror("fork"); return 1; }

	if (pid == 0) {
		setbuf(stdout, NULL); setbuf(stderr, NULL);
		/* CHILD */
		close(sync_p[0]); close(ready_p[1]);
		if (unshare(CLONE_NEWNET) < 0) {
			perror("unshare(NEWNET)"); _exit(1);
		}
		/* Signal "I am ready to receive vethA1/vethB1" */
		write(sync_p[1], "x", 1);
		/* Wait for parent to finish device moves + HSR creation */
		char c; read(ready_p[0], &c, 1);
		close(sync_p[1]); close(ready_p[0]);

		/* In child netns: bring up slaves + create HSR. */
		int sk = nl_open(NULL);
		if (sk < 0) _exit(1);
		fprintf(stderr, "[child] bringing up vethA1, vethB1, hsr0\n");
		link_up(sk, "vethA1");
		link_up(sk, "vethB1");

		__u16 fid; __u32 mcid;
		int gsk = nl_open_generic();
		if (gsk < 0) _exit(1);
		if (resolve_hsr_genl(gsk, &fid, &mcid) < 0) {
			fprintf(stderr, "[child] resolve HSR genl failed\n");
			_exit(1);
		}
		fprintf(stderr, "[child] HSR family_id=%u mcgrp_id=%u\n", fid, mcid);

		if (create_hsr_in_ns(sk, "hsr0", "vethA1", "vethB1") < 0) {
			fprintf(stderr, "[child] create_hsr_in_ns failed\n");
			_exit(1);
		}
		link_up(sk, "hsr0");
		fprintf(stderr, "[child] hsr0 up\n");

		/* Open genl mcast listener for HSR_MCGRP */
		int lsk = hsr_listener(mcid);
		if (lsk < 0) _exit(1);
		fprintf(stderr, "[child] listening on HSR mcast in child namespace for %ds...\n",
		       timeout_s);
		int n = drain_notifications(lsk, timeout_s * 1000, "child");
		fprintf(stderr, "[child] received %d HSR notifications in child namespace\n", n);
		_exit(n > 0 ? 0 : 42);
	}

	/* PARENT (init_net) */
	close(sync_p[1]); close(ready_p[0]);
	char c; read(sync_p[0], &c, 1);  /* wait until child unshare'd */
	close(sync_p[0]);

	int sk = nl_open(NULL);
	if (sk < 0) { kill(pid, SIGKILL); return 1; }

	printf("[parent] creating veth pairs vethA0<->vethA1, vethB0<->vethB1\n");
	if (create_veth_pair(sk, "vethA0", "vethA1") < 0 ||
	    create_veth_pair(sk, "vethB0", "vethB1") < 0) {
		kill(pid, SIGKILL); return 1;
	}

	char path[64];
	snprintf(path, sizeof(path), "/proc/%d/ns/net", pid);
	int netns_fd = open(path, O_RDONLY);
	if (netns_fd < 0) { perror("open child namespace"); kill(pid, SIGKILL); return 1; }

	printf("[parent] moving vethA1, vethB1 into the child namespace\n");
	if (set_link_netns(sk, "vethA1", netns_fd) < 0 ||
	    set_link_netns(sk, "vethB1", netns_fd) < 0) {
		kill(pid, SIGKILL); return 1;
	}

	link_up(sk, "vethA0");
	link_up(sk, "vethB0");
	close(netns_fd);

	/* hsr1 is the peer that hsr0 in N will time out once we cut frames. */
	printf("[parent] creating hsr1 on vethA0+vethB0 in init_net\n");
	if (create_hsr_in_ns(sk, "hsr1", "vethA0", "vethB0") < 0) {
		fprintf(stderr, "[parent] create hsr1 failed\n");
		kill(pid, SIGKILL); return 1;
	}
	link_up(sk, "hsr1");

	/* Parent subscribes to HSR mcast in init_net */
	__u16 fid; __u32 mcid;
	int gsk = nl_open_generic();
	if (gsk < 0) { kill(pid, SIGKILL); return 1; }
	if (resolve_hsr_genl(gsk, &fid, &mcid) < 0) {
		fprintf(stderr, "[parent] resolve HSR genl failed\n");
		kill(pid, SIGKILL); return 1;
	}
	printf("[parent] HSR family_id=%u mcgrp_id=%u (init_net)\n", fid, mcid);

	int lsk = hsr_listener(mcid);
	if (lsk < 0) { kill(pid, SIGKILL); return 1; }

	/* Signal child to proceed with HSR creation */
	write(ready_p[1], "x", 1);
	close(ready_p[1]);

	/* let both HSRs talk for 4 s so node tables populate */
	printf("[parent] soak 4s\n");
	drain_notifications(lsk, 4000, "parent(init_net)");

	/* cut frames by bringing the init_net slaves down */
	printf("[parent] bringing vethA0, vethB0 down\n");
	char buf2[4096] = {0};
	struct nlmsghdr *nh2 = (struct nlmsghdr *)buf2;
	nh2->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	nh2->nlmsg_type = RTM_NEWLINK;
	nh2->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
	nh2->nlmsg_seq = 5;
	struct ifinfomsg *ifi2 = NLMSG_DATA(nh2);
	ifi2->ifi_family = AF_UNSPEC;
	ifi2->ifi_change = IFF_UP;
	ifi2->ifi_flags = 0;
	nla_put(nh2, IFLA_IFNAME, "vethA0", 7);
	nl_send_recv(sk, nh2);
	nh2->nlmsg_len = NLMSG_LENGTH(sizeof(struct ifinfomsg));
	ifi2->ifi_family = AF_UNSPEC;
	ifi2->ifi_change = IFF_UP;
	ifi2->ifi_flags = 0;
	nh2->nlmsg_seq = 6;
	nla_put(nh2, IFLA_IFNAME, "vethB0", 7);
	nl_send_recv(sk, nh2);

	printf("[parent] listening %ds for nodedown\n", timeout_s - 4);
	int n = drain_notifications(lsk, (timeout_s - 4) * 1000, "parent(init_net)");

	int status;
	waitpid(pid, &status, 0);
	int child_count = (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 :
			  (WIFEXITED(status) && WEXITSTATUS(status) == 42) ? 0 :
			  -1;

	printf("\n========== RESULT ==========\n");
	printf("init_net   received %d HSR notifications\n", n);
	printf("child namespace  received %s HSR notifications\n",
	       child_count == 1 ? ">=1" : "0");
	if (n > 0 && child_count == 0)
		printf("UNPATCHED: init_net got events that belong to the child namespace.\n");
	else if (n >= 1 && child_count >= 1)
		printf("PATCHED: each namespace receives only its own events.\n");
	else
		printf("No events in the chosen window. Try a longer timeout.\n");
	return 0;
}

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH net] hsr: broadcast netlink notifications in the device's net namespace
  2026-05-18  6:44 hsr: netlink notifications leak across network namespaces Maoyi Xie
@ 2026-05-27  7:59 ` Maoyi Xie
  2026-05-27 11:11   ` Fernando Fernandez Mancera
  0 siblings, 1 reply; 4+ messages in thread
From: Maoyi Xie @ 2026-05-27  7:59 UTC (permalink / raw)
  To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: Fernando Fernandez Mancera, Jan Vaclav, Andrew Lunn, Taehee Yoo,
	netdev, linux-kernel, Maoyi Xie, stable

The HSR generic netlink family sets .netnsok = true. HSR devices can
live in network namespaces other than init_net.

Two async notifiers broadcast events with genlmsg_multicast(). They
are hsr_nl_ringerror() and hsr_nl_nodedown(). That helper delivers
only on the default genl socket in init_net. So the events always land
in init_net. The network namespace of the device does not matter.

This has two effects. A listener in the device's own namespace never
sees its own ring error and node down events. A privileged listener in
init_net receives events from HSR devices in other namespaces. The
payload carries the peer node MAC (HSR_A_NODE_ADDR) and the slave port
ifindex (HSR_A_IFINDEX). It leaks information across network
namespaces.

Switch both callers to genlmsg_multicast_netns(). Other families with
.netnsok = true already do this. Examples are gtp, ovpn, team,
batman-adv, netdev-genl, ethtool and handshake.

hsr_nl_ringerror() already has the slave port. It uses
dev_net(port->dev). hsr_nl_nodedown() takes the namespace from the
master port via hsr_port_get_hsr().

Fixes: 09e91dbea0aa ("hsr: set .netnsok flag")
Cc: stable@vger.kernel.org
Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com>
---
This is the fix for the problem I reported on netdev on 2026-05-18 [1].
That thread had no reply, so I am sending the patch and adding the HSR
maintainers to Cc. The proof of concept and the test numbers are in
that message.

[1] https://lore.kernel.org/netdev/CAHPEe=GO=2qqWZPwBB4rrXc3mkD0dznp2K78nCsKwF=c-QwxEw@mail.gmail.com/

 net/hsr/hsr_netlink.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c
index db0b0af7a692..067ceaf7304b 100644
--- a/net/hsr/hsr_netlink.c
+++ b/net/hsr/hsr_netlink.c
@@ -247,7 +247,8 @@ void hsr_nl_ringerror(struct hsr_priv *hsr, unsigned char addr[ETH_ALEN],
 		goto nla_put_failure;
 
 	genlmsg_end(skb, msg_head);
-	genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC);
+	genlmsg_multicast_netns(&hsr_genl_family, dev_net(port->dev),
+				skb, 0, 0, GFP_ATOMIC);
 
 	return;
 
@@ -283,8 +284,17 @@ void hsr_nl_nodedown(struct hsr_priv *hsr, unsigned char addr[ETH_ALEN])
 	if (res < 0)
 		goto nla_put_failure;
 
+	rcu_read_lock();
+	master = hsr_port_get_hsr(hsr, HSR_PT_MASTER);
+	if (!master) {
+		rcu_read_unlock();
+		goto nla_put_failure;
+	}
+
 	genlmsg_end(skb, msg_head);
-	genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC);
+	genlmsg_multicast_netns(&hsr_genl_family, dev_net(master->dev),
+				skb, 0, 0, GFP_ATOMIC);
+	rcu_read_unlock();
 
 	return;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net] hsr: broadcast netlink notifications in the device's net namespace
  2026-05-27  7:59 ` [PATCH net] hsr: broadcast netlink notifications in the device's net namespace Maoyi Xie
@ 2026-05-27 11:11   ` Fernando Fernandez Mancera
  2026-05-27 23:18     ` Jakub Kicinski
  0 siblings, 1 reply; 4+ messages in thread
From: Fernando Fernandez Mancera @ 2026-05-27 11:11 UTC (permalink / raw)
  To: Maoyi Xie, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman
  Cc: Jan Vaclav, Andrew Lunn, Taehee Yoo, netdev, linux-kernel, stable

On 5/27/26 9:59 AM, Maoyi Xie wrote:
> The HSR generic netlink family sets .netnsok = true. HSR devices can
> live in network namespaces other than init_net.
> 
> Two async notifiers broadcast events with genlmsg_multicast(). They
> are hsr_nl_ringerror() and hsr_nl_nodedown(). That helper delivers
> only on the default genl socket in init_net. So the events always land
> in init_net. The network namespace of the device does not matter.
> 
> This has two effects. A listener in the device's own namespace never
> sees its own ring error and node down events. A privileged listener in
> init_net receives events from HSR devices in other namespaces. The
> payload carries the peer node MAC (HSR_A_NODE_ADDR) and the slave port
> ifindex (HSR_A_IFINDEX). It leaks information across network
> namespaces.
> 
> Switch both callers to genlmsg_multicast_netns(). Other families with
> .netnsok = true already do this. Examples are gtp, ovpn, team,
> batman-adv, netdev-genl, ethtool and handshake.
> 
> hsr_nl_ringerror() already has the slave port. It uses
> dev_net(port->dev). hsr_nl_nodedown() takes the namespace from the
> master port via hsr_port_get_hsr().
> 
> Fixes: 09e91dbea0aa ("hsr: set .netnsok flag")
> Cc: stable@vger.kernel.org
> Signed-off-by: Maoyi Xie <maoyixie.tju@gmail.com>
> ---
> This is the fix for the problem I reported on netdev on 2026-05-18 [1].
> That thread had no reply, so I am sending the patch and adding the HSR
> maintainers to Cc. The proof of concept and the test numbers are in
> that message.
> 
> [1] https://lore.kernel.org/netdev/CAHPEe=GO=2qqWZPwBB4rrXc3mkD0dznp2K78nCsKwF=c-QwxEw@mail.gmail.com/
> 
>   net/hsr/hsr_netlink.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/net/hsr/hsr_netlink.c b/net/hsr/hsr_netlink.c
> index db0b0af7a692..067ceaf7304b 100644
> --- a/net/hsr/hsr_netlink.c
> +++ b/net/hsr/hsr_netlink.c
> @@ -247,7 +247,8 @@ void hsr_nl_ringerror(struct hsr_priv *hsr, unsigned char addr[ETH_ALEN],
>   		goto nla_put_failure;
>   
>   	genlmsg_end(skb, msg_head);
> -	genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC);
> +	genlmsg_multicast_netns(&hsr_genl_family, dev_net(port->dev),
> +				skb, 0, 0, GFP_ATOMIC);
>   
>   	return;
>   
> @@ -283,8 +284,17 @@ void hsr_nl_nodedown(struct hsr_priv *hsr, unsigned char addr[ETH_ALEN])
>   	if (res < 0)
>   		goto nla_put_failure;
>   
> +	rcu_read_lock();
> +	master = hsr_port_get_hsr(hsr, HSR_PT_MASTER);
> +	if (!master) {
> +		rcu_read_unlock();
> +		goto nla_put_failure;
> +	}
> +
>   	genlmsg_end(skb, msg_head);
> -	genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC);
> +	genlmsg_multicast_netns(&hsr_genl_family, dev_net(master->dev),
> +				skb, 0, 0, GFP_ATOMIC);
> +	rcu_read_unlock();
>   

The patch makes sense to me. Anyway, I think we can reduce the RCU 
critical section here.

What about moving rcu_read_unlock() after the if and extract the net 
before unlocking? The benefit is minimal but I think it is worth it.

Thanks,
Fernando.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH net] hsr: broadcast netlink notifications in the device's net namespace
  2026-05-27 11:11   ` Fernando Fernandez Mancera
@ 2026-05-27 23:18     ` Jakub Kicinski
  0 siblings, 0 replies; 4+ messages in thread
From: Jakub Kicinski @ 2026-05-27 23:18 UTC (permalink / raw)
  To: Fernando Fernandez Mancera
  Cc: Maoyi Xie, David S . Miller, Eric Dumazet, Paolo Abeni,
	Simon Horman, Jan Vaclav, Andrew Lunn, Taehee Yoo, netdev,
	linux-kernel, stable

On Wed, 27 May 2026 13:11:57 +0200 Fernando Fernandez Mancera wrote:
> >   	genlmsg_end(skb, msg_head);
> > -	genlmsg_multicast(&hsr_genl_family, skb, 0, 0, GFP_ATOMIC);
> > +	genlmsg_multicast_netns(&hsr_genl_family, dev_net(master->dev),
> > +				skb, 0, 0, GFP_ATOMIC);
> > +	rcu_read_unlock();
> >     
> 
> The patch makes sense to me. Anyway, I think we can reduce the RCU 
> critical section here.
> 
> What about moving rcu_read_unlock() after the if and extract the net 
> before unlocking? The benefit is minimal but I think it is worth it.

Not sure TBH, we'd need to take a ref on the netns and allocate 
a tracker (on DEBUG kernels). One could go either way.

I'm replying because I wanted to question whether this is Fixes+stable@ 
worthy. Sending the notifications to the namespace where the device is
makes sense. But it's as much a behavior changes as it is a fix.
The commit in question was merged to 5.6, real users clearly don't care.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-05-27 23:18 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-18  6:44 hsr: netlink notifications leak across network namespaces Maoyi Xie
2026-05-27  7:59 ` [PATCH net] hsr: broadcast netlink notifications in the device's net namespace Maoyi Xie
2026-05-27 11:11   ` Fernando Fernandez Mancera
2026-05-27 23:18     ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox