* [PATCH net 0/3] bridge: Fix sleep in atomic context
@ 2026-05-26 6:48 Ido Schimmel
2026-05-26 6:48 ` [PATCH net 1/3] bridge: Fix sleep in atomic context in netlink path Ido Schimmel
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Ido Schimmel @ 2026-05-26 6:48 UTC (permalink / raw)
To: netdev, bridge
Cc: davem, kuba, pabeni, edumazet, razor, horms, sdf, Ido Schimmel
Under certain circumstances the bridge driver can call
dev_set_promiscuity() while holding the bridge spin lock. This is a
problem as dev_set_promiscuity() might sleep.
Patches #1-#2 fix the problem in the netlink and sysfs configuration
paths by only taking the lock where it is actually needed, thereby
avoiding calling dev_set_promiscuity() from an atomic context.
Patch #3 adds test cases for both configuration paths in rtnetlink.sh
which already includes test cases for similar issues.
Note that dev_set_promiscuity() can sleep either when it takes the net
device mutex or when calling netif_rx_mode_sync(). I encountered the
problem with the latter, but blamed the former since it came earlier.
Ido Schimmel (3):
bridge: Fix sleep in atomic context in netlink path
bridge: Fix sleep in atomic context in sysfs path
selftests: rtnetlink: Add bridge promiscuity tests
net/bridge/br_netlink.c | 17 +++----
net/bridge/br_switchdev.c | 1 -
net/bridge/br_sysfs_if.c | 30 ++++++++---
tools/testing/selftests/net/rtnetlink.sh | 63 ++++++++++++++++++++++++
4 files changed, 92 insertions(+), 19 deletions(-)
--
2.54.0
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH net 1/3] bridge: Fix sleep in atomic context in netlink path
2026-05-26 6:48 [PATCH net 0/3] bridge: Fix sleep in atomic context Ido Schimmel
@ 2026-05-26 6:48 ` Ido Schimmel
2026-05-26 6:48 ` [PATCH net 2/3] bridge: Fix sleep in atomic context in sysfs path Ido Schimmel
2026-05-26 6:48 ` [PATCH net 3/3] selftests: rtnetlink: Add bridge promiscuity tests Ido Schimmel
2 siblings, 0 replies; 4+ messages in thread
From: Ido Schimmel @ 2026-05-26 6:48 UTC (permalink / raw)
To: netdev, bridge
Cc: davem, kuba, pabeni, edumazet, razor, horms, sdf, Ido Schimmel
Since the introduction of the netlink configuration path for bridge
ports in commit 25c71c75ac87 ("bridge: bridge port parameters over
netlink"), br_setport() was always called with the bridge lock held
around it. Back then this decision made sense: The bridge lock protects
the STP state of the bridge and its ports and at that time the function
only processed three STP related netlink attributes (cost, priority and
state).
Nowadays, br_setport() processes a lot more attributes and most of them
do not need the bridge lock:
* Bridge flags: Only require RTNL. Read locklessly by the data path.
Annotations can be added in net-next.
* FDB port flushing: Only requires the FDB lock.
* Multicast attributes: Only require the multicast lock.
* Group forward mask: Only requires RTNL. Read locklessly by the data
path. Annotations can be added in net-next.
* Backup port and NHID: Only require RTNL. Read locklessly by the data
path.
This is a problem as the bridge calls dev_set_promiscuity() when certain
bridge port flags change and this function can sleep since the commit
cited below, resulting in a splat such as [1].
Fix this by reducing the scope of the bridge lock and only take it when
processing the three STP related attributes that require it. This is
consistent with the multicast attributes where each attribute acquires
the multicast lock instead of having one critical section for all
relevant attributes.
[1]
BUG: sleeping function called from invalid context at net/core/dev_addr_lists.c:1262
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 356, name: bridge
preempt_count: 201, expected: 0
RCU nest depth: 0, expected: 0
2 locks held by bridge/356:
#0: ffffffff919473a0 (rtnl_mutex){+.+.}-{4:4}, at: rtnetlink_rcv_msg (net/core/rtnetlink.c:80 net/core/rtnetlink.c:7002)
#1: ffff888115072d58 (&br->lock){+...}-{3:3}, at: br_setlink (./include/linux/spinlock.h:348 net/bridge/br_netlink.c:1117)
Preemption disabled at:
0x0
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
__might_resched.cold (kernel/sched/core.c:9163)
netif_rx_mode_run (net/core/dev_addr_lists.c:1262)
netif_rx_mode_sync (net/core/dev_addr_lists.c:1428)
dev_set_promiscuity (net/core/dev_api.c:289)
br_manage_promisc (net/bridge/br_if.c:135 net/bridge/br_if.c:172)
br_port_flags_change (net/bridge/br_if.c:242 net/bridge/br_if.c:747)
br_setport (net/bridge/br_netlink.c:1000)
br_setlink (net/bridge/br_netlink.c:1118)
rtnl_bridge_setlink (net/core/rtnetlink.c:5572)
rtnetlink_rcv_msg (net/core/rtnetlink.c:7005)
netlink_rcv_skb (net/netlink/af_netlink.c:2550)
netlink_unicast (net/netlink/af_netlink.c:1318 net/netlink/af_netlink.c:1344)
netlink_sendmsg (net/netlink/af_netlink.c:1894)
__sock_sendmsg (net/socket.c:787 (discriminator 4) net/socket.c:802 (discriminator 4))
____sys_sendmsg (net/socket.c:2698)
___sys_sendmsg (net/socket.c:2752)
__sys_sendmsg (net/socket.c:2784)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
Fixes: 78cd408356fe ("net: add missing instance lock to dev_set_promiscuity")
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
net/bridge/br_netlink.c | 17 +++++++----------
1 file changed, 7 insertions(+), 10 deletions(-)
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index c04a4d0889ae..b9591dd755f9 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1000,19 +1000,25 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[],
br_port_flags_change(p, changed_mask);
if (tb[IFLA_BRPORT_COST]) {
+ spin_lock_bh(&p->br->lock);
err = br_stp_set_path_cost(p, nla_get_u32(tb[IFLA_BRPORT_COST]));
+ spin_unlock_bh(&p->br->lock);
if (err)
return err;
}
if (tb[IFLA_BRPORT_PRIORITY]) {
+ spin_lock_bh(&p->br->lock);
err = br_stp_set_port_priority(p, nla_get_u16(tb[IFLA_BRPORT_PRIORITY]));
+ spin_unlock_bh(&p->br->lock);
if (err)
return err;
}
if (tb[IFLA_BRPORT_STATE]) {
+ spin_lock_bh(&p->br->lock);
err = br_set_port_state(p, nla_get_u8(tb[IFLA_BRPORT_STATE]));
+ spin_unlock_bh(&p->br->lock);
if (err)
return err;
}
@@ -1114,9 +1120,7 @@ int br_setlink(struct net_device *dev, struct nlmsghdr *nlh, u16 flags,
if (err)
return err;
- spin_lock_bh(&p->br->lock);
err = br_setport(p, tb, extack);
- spin_unlock_bh(&p->br->lock);
} else {
/* Binary compatibility with old RSTP */
if (nla_len(protinfo) < sizeof(u8))
@@ -1203,17 +1207,10 @@ static int br_port_slave_changelink(struct net_device *brdev,
struct nlattr *data[],
struct netlink_ext_ack *extack)
{
- struct net_bridge *br = netdev_priv(brdev);
- int ret;
-
if (!data)
return 0;
- spin_lock_bh(&br->lock);
- ret = br_setport(br_port_get_rtnl(dev), data, extack);
- spin_unlock_bh(&br->lock);
-
- return ret;
+ return br_setport(br_port_get_rtnl(dev), data, extack);
}
static int br_port_fill_slave_info(struct sk_buff *skb,
--
2.54.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH net 2/3] bridge: Fix sleep in atomic context in sysfs path
2026-05-26 6:48 [PATCH net 0/3] bridge: Fix sleep in atomic context Ido Schimmel
2026-05-26 6:48 ` [PATCH net 1/3] bridge: Fix sleep in atomic context in netlink path Ido Schimmel
@ 2026-05-26 6:48 ` Ido Schimmel
2026-05-26 6:48 ` [PATCH net 3/3] selftests: rtnetlink: Add bridge promiscuity tests Ido Schimmel
2 siblings, 0 replies; 4+ messages in thread
From: Ido Schimmel @ 2026-05-26 6:48 UTC (permalink / raw)
To: netdev, bridge
Cc: davem, kuba, pabeni, edumazet, razor, horms, sdf, Ido Schimmel
Since the start of the git history, brport_store() always acquired the
bridge lock. Back then this decision made sense: The bridge lock
protects the STP state of the bridge and its ports and at that time the
function was only used by two STP related attributes (cost and
priority).
Nowadays, brport_store() processes a lot more attributes and most of
them do not need the bridge lock:
* Bridge flags: Only require RTNL. Read locklessly by the data path.
Annotations can be added in net-next.
* FDB port flushing: Only requires the FDB lock.
* Multicast attributes: Only require the multicast lock.
* Group forward mask: Only requires RTNL. Read locklessly by the data
path. Annotations can be added in net-next.
* Backup port: Only requires RTNL. Read locklessly by the data path.
This is a problem as the bridge calls dev_set_promiscuity() when certain
bridge port flags change and this function can sleep since the commit
cited below, resulting in a splat such as [1].
Fix this by reducing the scope of the bridge lock and only take it when
processing the two STP related attributes that require it. Remove the
now stale comment from br_switchdev_set_port_flag(). The
SWITCHDEV_F_DEFER flag can be removed in net-next.
[1]
BUG: sleeping function called from invalid context at net/core/dev_addr_lists.c:1262
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 372, name: bash
preempt_count: 201, expected: 0
RCU nest depth: 0, expected: 0
5 locks held by bash/372:
#0: ffff88810c51c3f0 (sb_writers#7){.+.+}-{0:0}, at: ksys_write (fs/read_write.c:740)
#1: ffff888115ce9480 (&of->mutex){+.+.}-{4:4}, at: kernfs_fop_write_iter (fs/kernfs/file.c:343)
#2: ffff88810b9fd330 (kn->active#37){.+.+}-{0:0}, at: kernfs_fop_write_iter (fs/kernfs/file.c:80 fs/kernfs/file.c:344)
#3: ffffffffa59473a0 (rtnl_mutex){+.+.}-{4:4}, at: brport_store (net/bridge/br_sysfs_if.c:326)
#4: ffff8881099d2d58 (&br->lock){+...}-{3:3}, at: brport_store (./include/linux/spinlock.h:348 net/bridge/br_sysfs_if.c:345)
Preemption disabled at:
0x0
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
__might_resched.cold (kernel/sched/core.c:9163)
netif_rx_mode_run (net/core/dev_addr_lists.c:1262)
netif_rx_mode_sync (net/core/dev_addr_lists.c:1428)
dev_set_promiscuity (net/core/dev_api.c:289)
br_manage_promisc (net/bridge/br_if.c:135 net/bridge/br_if.c:172)
br_port_flags_change (net/bridge/br_if.c:242 net/bridge/br_if.c:747)
store_learning (net/bridge/br_sysfs_if.c:79 net/bridge/br_sysfs_if.c:235)
brport_store (net/bridge/br_sysfs_if.c:346)
kernfs_fop_write_iter (fs/kernfs/file.c:352)
new_sync_write (fs/read_write.c:595)
vfs_write (fs/read_write.c:688)
ksys_write (fs/read_write.c:740)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
Fixes: 78cd408356fe ("net: add missing instance lock to dev_set_promiscuity")
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
net/bridge/br_switchdev.c | 1 -
net/bridge/br_sysfs_if.c | 30 ++++++++++++++++++++++--------
2 files changed, 22 insertions(+), 9 deletions(-)
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 18b558a931ad..ee3ad9dfbab9 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -99,7 +99,6 @@ int br_switchdev_set_port_flag(struct net_bridge_port *p,
attr.u.brport_flags.val = flags;
attr.u.brport_flags.mask = mask;
- /* We run from atomic context here */
err = call_switchdev_notifiers(SWITCHDEV_PORT_ATTR_SET, p->dev,
&info.info, extack);
err = notifier_to_errno(err);
diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index 1f57c36a7fc0..d6df81fa0d13 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -86,16 +86,34 @@ static ssize_t show_path_cost(struct net_bridge_port *p, char *buf)
return sysfs_emit(buf, "%d\n", p->path_cost);
}
-static BRPORT_ATTR(path_cost, 0644,
- show_path_cost, br_stp_set_path_cost);
+static int store_path_cost(struct net_bridge_port *p, unsigned long v)
+{
+ int ret;
+
+ spin_lock_bh(&p->br->lock);
+ ret = br_stp_set_path_cost(p, v);
+ spin_unlock_bh(&p->br->lock);
+ return ret;
+}
+
+static BRPORT_ATTR(path_cost, 0644, show_path_cost, store_path_cost);
static ssize_t show_priority(struct net_bridge_port *p, char *buf)
{
return sysfs_emit(buf, "%d\n", p->priority);
}
-static BRPORT_ATTR(priority, 0644,
- show_priority, br_stp_set_port_priority);
+static int store_priority(struct net_bridge_port *p, unsigned long v)
+{
+ int ret;
+
+ spin_lock_bh(&p->br->lock);
+ ret = br_stp_set_port_priority(p, v);
+ spin_unlock_bh(&p->br->lock);
+ return ret;
+}
+
+static BRPORT_ATTR(priority, 0644, show_priority, store_priority);
static ssize_t show_designated_root(struct net_bridge_port *p, char *buf)
{
@@ -334,17 +352,13 @@ static ssize_t brport_store(struct kobject *kobj,
ret = -ENOMEM;
goto out_unlock;
}
- spin_lock_bh(&p->br->lock);
ret = brport_attr->store_raw(p, buf_copy);
- spin_unlock_bh(&p->br->lock);
kfree(buf_copy);
} else if (brport_attr->store) {
val = simple_strtoul(buf, &endp, 0);
if (endp == buf)
goto out_unlock;
- spin_lock_bh(&p->br->lock);
ret = brport_attr->store(p, val);
- spin_unlock_bh(&p->br->lock);
}
if (!ret) {
--
2.54.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH net 3/3] selftests: rtnetlink: Add bridge promiscuity tests
2026-05-26 6:48 [PATCH net 0/3] bridge: Fix sleep in atomic context Ido Schimmel
2026-05-26 6:48 ` [PATCH net 1/3] bridge: Fix sleep in atomic context in netlink path Ido Schimmel
2026-05-26 6:48 ` [PATCH net 2/3] bridge: Fix sleep in atomic context in sysfs path Ido Schimmel
@ 2026-05-26 6:48 ` Ido Schimmel
2 siblings, 0 replies; 4+ messages in thread
From: Ido Schimmel @ 2026-05-26 6:48 UTC (permalink / raw)
To: netdev, bridge
Cc: davem, kuba, pabeni, edumazet, razor, horms, sdf, Ido Schimmel
Add two test cases that always pass, but trigger sleeping in atomic
context BUGs without "bridge: Fix sleep in atomic context in netlink
path" and "bridge: Fix sleep in atomic context in sysfs path".
Assisted-by: Claude:claude-opus-4-7
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
---
tools/testing/selftests/net/rtnetlink.sh | 63 ++++++++++++++++++++++++
1 file changed, 63 insertions(+)
diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh
index c499953d4885..ace3a99023ed 100755
--- a/tools/testing/selftests/net/rtnetlink.sh
+++ b/tools/testing/selftests/net/rtnetlink.sh
@@ -24,6 +24,8 @@ ALL_TESTS="
kci_test_macsec
kci_test_macsec_vlan
kci_test_team_bridge_macvlan
+ kci_test_bridge_promisc_netlink
+ kci_test_bridge_promisc_sysfs
kci_test_ipsec
kci_test_ipsec_offload
kci_test_fdb_get
@@ -61,6 +63,14 @@ check_fail()
fi
}
+sysfs_write()
+{
+ local val="$1"
+ local path="$2"
+
+ echo "$val" > "$path"
+}
+
run_cmd_common()
{
local cmd="$*"
@@ -680,6 +690,59 @@ kci_test_team_bridge_macvlan()
end_test "PASS: team_bridge_macvlan"
}
+# Test that changing bridge port flags via the netlink path does not sleep with
+# the bridge spin lock held.
+kci_test_bridge_promisc_netlink()
+{
+ local dummy="test_dummy1"
+ local bridge="test_br1"
+ local team="test_team1"
+ local ret=0
+
+ run_cmd ip link add $team up type team
+ run_cmd ip link add $bridge up type bridge vlan_filtering 1
+ run_cmd ip link add $dummy up type dummy
+ run_cmd ip link set $dummy master $bridge
+ run_cmd ip link set $team master $bridge
+
+ # This causes the bridge driver to sync all the static FDB entries to
+ # the team device (which supports unicast filtering) and remove it from
+ # promiscuous mode. The call to dev_set_promiscuity() can sleep due to
+ # Rx mode inlining, which is a problem if the bridge spin lock is held.
+ run_cmd bridge link set dev $dummy flood off learning off
+
+ run_cmd ip link del $dummy
+ run_cmd ip link del $bridge
+ run_cmd ip link del $team
+
+ end_test "PASS: bridge_promisc_netlink"
+}
+
+# Same as kci_test_bridge_promisc_netlink(), but the flags are changed via the
+# sysfs path.
+kci_test_bridge_promisc_sysfs()
+{
+ local dummy="test_dummy1"
+ local bridge="test_br1"
+ local team="test_team1"
+ local ret=0
+
+ run_cmd ip link add $team up type team
+ run_cmd ip link add $bridge up type bridge vlan_filtering 1
+ run_cmd ip link add $dummy up type dummy
+ run_cmd ip link set $dummy master $bridge
+ run_cmd ip link set $team master $bridge
+
+ run_cmd sysfs_write 0 /sys/class/net/$dummy/brport/unicast_flood
+ run_cmd sysfs_write 0 /sys/class/net/$dummy/brport/learning
+
+ run_cmd ip link del $dummy
+ run_cmd ip link del $bridge
+ run_cmd ip link del $team
+
+ end_test "PASS: bridge_promisc_sysfs"
+}
+
#-------------------------------------------------------------------
# Example commands
# ip x s add proto esp src 14.0.0.52 dst 14.0.0.70 \
--
2.54.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-26 6:49 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-26 6:48 [PATCH net 0/3] bridge: Fix sleep in atomic context Ido Schimmel
2026-05-26 6:48 ` [PATCH net 1/3] bridge: Fix sleep in atomic context in netlink path Ido Schimmel
2026-05-26 6:48 ` [PATCH net 2/3] bridge: Fix sleep in atomic context in sysfs path Ido Schimmel
2026-05-26 6:48 ` [PATCH net 3/3] selftests: rtnetlink: Add bridge promiscuity tests Ido Schimmel
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox