* [PATCH net] net: switchdev: Convert blocking notification chain to a raw one
@ 2025-03-05 12:15 Amit Cohen
2025-03-10 6:17 ` Simon Horman
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Amit Cohen @ 2025-03-05 12:15 UTC (permalink / raw)
To: netdev
Cc: idosch, petrm, jiri, ivecera, davem, edumazet, kuba, pabeni,
horms, olteanv, tobias, Amit Cohen
A blocking notification chain uses a read-write semaphore to protect the
integrity of the chain. The semaphore is acquired for writing when
adding / removing notifiers to / from the chain and acquired for reading
when traversing the chain and informing notifiers about an event.
In case of the blocking switchdev notification chain, recursive
notifications are possible which leads to the semaphore being acquired
twice for reading and to lockdep warnings being generated [1].
Specifically, this can happen when the bridge driver processes a
SWITCHDEV_BRPORT_UNOFFLOADED event which causes it to emit notifications
about deferred events when calling switchdev_deferred_process().
Fix this by converting the notification chain to a raw notification
chain in a similar fashion to the netdev notification chain. Protect
the chain using the RTNL mutex by acquiring it when modifying the chain.
Events are always informed under the RTNL mutex, but add an assertion in
call_switchdev_blocking_notifiers() to make sure this is not violated in
the future.
Maintain the "blocking" prefix as events are always emitted from process
context and listeners are allowed to block.
[1]:
WARNING: possible recursive locking detected
6.14.0-rc4-custom-g079270089484 #1 Not tainted
--------------------------------------------
ip/52731 is trying to acquire lock:
ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
but task is already holding lock:
ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock((switchdev_blocking_notif_chain).rwsem);
lock((switchdev_blocking_notif_chain).rwsem);
*** DEADLOCK ***
May be due to missing lock nesting notation
3 locks held by ip/52731:
#0: ffffffff84f795b0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x727/0x1dc0
#1: ffffffff8731f628 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x790/0x1dc0
#2: ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
stack backtrace:
...
? __pfx_down_read+0x10/0x10
? __pfx_mark_lock+0x10/0x10
? __pfx_switchdev_port_attr_set_deferred+0x10/0x10
blocking_notifier_call_chain+0x58/0xa0
switchdev_port_attr_notify.constprop.0+0xb3/0x1b0
? __pfx_switchdev_port_attr_notify.constprop.0+0x10/0x10
? mark_held_locks+0x94/0xe0
? switchdev_deferred_process+0x11a/0x340
switchdev_port_attr_set_deferred+0x27/0xd0
switchdev_deferred_process+0x164/0x340
br_switchdev_port_unoffload+0xc8/0x100 [bridge]
br_switchdev_blocking_event+0x29f/0x580 [bridge]
notifier_call_chain+0xa2/0x440
blocking_notifier_call_chain+0x6e/0xa0
switchdev_bridge_port_unoffload+0xde/0x1a0
...
Fixes: f7a70d650b0b6 ("net: bridge: switchdev: Ensure deferred event delivery on unoffload")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
---
net/switchdev/switchdev.c | 25 ++++++++++++++++++-------
1 file changed, 18 insertions(+), 7 deletions(-)
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index 6488ead9e464..4d5fbacef496 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -472,7 +472,7 @@ bool switchdev_port_obj_act_is_deferred(struct net_device *dev,
EXPORT_SYMBOL_GPL(switchdev_port_obj_act_is_deferred);
static ATOMIC_NOTIFIER_HEAD(switchdev_notif_chain);
-static BLOCKING_NOTIFIER_HEAD(switchdev_blocking_notif_chain);
+static RAW_NOTIFIER_HEAD(switchdev_blocking_notif_chain);
/**
* register_switchdev_notifier - Register notifier
@@ -518,17 +518,27 @@ EXPORT_SYMBOL_GPL(call_switchdev_notifiers);
int register_switchdev_blocking_notifier(struct notifier_block *nb)
{
- struct blocking_notifier_head *chain = &switchdev_blocking_notif_chain;
+ struct raw_notifier_head *chain = &switchdev_blocking_notif_chain;
+ int err;
+
+ rtnl_lock();
+ err = raw_notifier_chain_register(chain, nb);
+ rtnl_unlock();
- return blocking_notifier_chain_register(chain, nb);
+ return err;
}
EXPORT_SYMBOL_GPL(register_switchdev_blocking_notifier);
int unregister_switchdev_blocking_notifier(struct notifier_block *nb)
{
- struct blocking_notifier_head *chain = &switchdev_blocking_notif_chain;
+ struct raw_notifier_head *chain = &switchdev_blocking_notif_chain;
+ int err;
- return blocking_notifier_chain_unregister(chain, nb);
+ rtnl_lock();
+ err = raw_notifier_chain_unregister(chain, nb);
+ rtnl_unlock();
+
+ return err;
}
EXPORT_SYMBOL_GPL(unregister_switchdev_blocking_notifier);
@@ -536,10 +546,11 @@ int call_switchdev_blocking_notifiers(unsigned long val, struct net_device *dev,
struct switchdev_notifier_info *info,
struct netlink_ext_ack *extack)
{
+ ASSERT_RTNL();
info->dev = dev;
info->extack = extack;
- return blocking_notifier_call_chain(&switchdev_blocking_notif_chain,
- val, info);
+ return raw_notifier_call_chain(&switchdev_blocking_notif_chain,
+ val, info);
}
EXPORT_SYMBOL_GPL(call_switchdev_blocking_notifiers);
--
2.47.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH net] net: switchdev: Convert blocking notification chain to a raw one
2025-03-05 12:15 [PATCH net] net: switchdev: Convert blocking notification chain to a raw one Amit Cohen
@ 2025-03-10 6:17 ` Simon Horman
2025-03-11 8:12 ` Kuniyuki Iwashima
2025-03-11 0:11 ` Vladimir Oltean
2025-03-11 11:10 ` patchwork-bot+netdevbpf
2 siblings, 1 reply; 6+ messages in thread
From: Simon Horman @ 2025-03-10 6:17 UTC (permalink / raw)
To: Amit Cohen
Cc: netdev, idosch, petrm, jiri, ivecera, davem, edumazet, kuba,
pabeni, olteanv, tobias
On Wed, Mar 05, 2025 at 02:15:09PM +0200, Amit Cohen wrote:
> A blocking notification chain uses a read-write semaphore to protect the
> integrity of the chain. The semaphore is acquired for writing when
> adding / removing notifiers to / from the chain and acquired for reading
> when traversing the chain and informing notifiers about an event.
>
> In case of the blocking switchdev notification chain, recursive
> notifications are possible which leads to the semaphore being acquired
> twice for reading and to lockdep warnings being generated [1].
>
> Specifically, this can happen when the bridge driver processes a
> SWITCHDEV_BRPORT_UNOFFLOADED event which causes it to emit notifications
> about deferred events when calling switchdev_deferred_process().
>
> Fix this by converting the notification chain to a raw notification
> chain in a similar fashion to the netdev notification chain. Protect
> the chain using the RTNL mutex by acquiring it when modifying the chain.
> Events are always informed under the RTNL mutex, but add an assertion in
> call_switchdev_blocking_notifiers() to make sure this is not violated in
> the future.
Hi Amit,
As you may be aware there is quite some activity to reduce the reliance on
RTNL. However, as the events in question are already protected by RTNL
I think the approach you have taken here is entirely reasonable.
>
> Maintain the "blocking" prefix as events are always emitted from process
> context and listeners are allowed to block.
>
> [1]:
> WARNING: possible recursive locking detected
> 6.14.0-rc4-custom-g079270089484 #1 Not tainted
> --------------------------------------------
> ip/52731 is trying to acquire lock:
> ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
>
> but task is already holding lock:
> ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
> CPU0
> ----
> lock((switchdev_blocking_notif_chain).rwsem);
> lock((switchdev_blocking_notif_chain).rwsem);
>
> *** DEADLOCK ***
> May be due to missing lock nesting notation
> 3 locks held by ip/52731:
> #0: ffffffff84f795b0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x727/0x1dc0
> #1: ffffffff8731f628 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x790/0x1dc0
> #2: ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
>
> stack backtrace:
> ...
> ? __pfx_down_read+0x10/0x10
> ? __pfx_mark_lock+0x10/0x10
> ? __pfx_switchdev_port_attr_set_deferred+0x10/0x10
> blocking_notifier_call_chain+0x58/0xa0
> switchdev_port_attr_notify.constprop.0+0xb3/0x1b0
> ? __pfx_switchdev_port_attr_notify.constprop.0+0x10/0x10
> ? mark_held_locks+0x94/0xe0
> ? switchdev_deferred_process+0x11a/0x340
> switchdev_port_attr_set_deferred+0x27/0xd0
> switchdev_deferred_process+0x164/0x340
> br_switchdev_port_unoffload+0xc8/0x100 [bridge]
> br_switchdev_blocking_event+0x29f/0x580 [bridge]
> notifier_call_chain+0xa2/0x440
> blocking_notifier_call_chain+0x6e/0xa0
> switchdev_bridge_port_unoffload+0xde/0x1a0
> ...
>
> Fixes: f7a70d650b0b6 ("net: bridge: switchdev: Ensure deferred event delivery on unoffload")
> Signed-off-by: Amit Cohen <amcohen@nvidia.com>
> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
...
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH net] net: switchdev: Convert blocking notification chain to a raw one
2025-03-10 6:17 ` Simon Horman
@ 2025-03-11 8:12 ` Kuniyuki Iwashima
2025-03-12 17:50 ` Ido Schimmel
0 siblings, 1 reply; 6+ messages in thread
From: Kuniyuki Iwashima @ 2025-03-11 8:12 UTC (permalink / raw)
To: amcohen
Cc: horms, davem, edumazet, idosch, ivecera, jiri, kuba, netdev,
olteanv, pabeni, petrm, tobias
From: Simon Horman <horms@kernel.org>
Date: Mon, 10 Mar 2025 06:17:37 +0000
> On Wed, Mar 05, 2025 at 02:15:09PM +0200, Amit Cohen wrote:
> > A blocking notification chain uses a read-write semaphore to protect the
> > integrity of the chain. The semaphore is acquired for writing when
> > adding / removing notifiers to / from the chain and acquired for reading
> > when traversing the chain and informing notifiers about an event.
> >
> > In case of the blocking switchdev notification chain, recursive
> > notifications are possible which leads to the semaphore being acquired
> > twice for reading and to lockdep warnings being generated [1].
> >
> > Specifically, this can happen when the bridge driver processes a
> > SWITCHDEV_BRPORT_UNOFFLOADED event which causes it to emit notifications
> > about deferred events when calling switchdev_deferred_process().
> >
> > Fix this by converting the notification chain to a raw notification
> > chain in a similar fashion to the netdev notification chain. Protect
> > the chain using the RTNL mutex by acquiring it when modifying the chain.
> > Events are always informed under the RTNL mutex, but add an assertion in
> > call_switchdev_blocking_notifiers() to make sure this is not violated in
> > the future.
>
> Hi Amit,
>
> As you may be aware there is quite some activity to reduce the reliance on
> RTNL. However, as the events in question are already protected by RTNL
> I think the approach you have taken here is entirely reasonable.
It would be appreicated if Amit you can post a follow-up patch against
net-next.git to convert the rtnl_lock() to another lock or rtnl_net_lock().
Thanks!
p.s. thanks for ccing me, Vladimir!
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net] net: switchdev: Convert blocking notification chain to a raw one
2025-03-11 8:12 ` Kuniyuki Iwashima
@ 2025-03-12 17:50 ` Ido Schimmel
0 siblings, 0 replies; 6+ messages in thread
From: Ido Schimmel @ 2025-03-12 17:50 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: amcohen, horms, davem, edumazet, ivecera, jiri, kuba, netdev,
olteanv, pabeni, petrm, tobias
On Tue, Mar 11, 2025 at 01:12:50AM -0700, Kuniyuki Iwashima wrote:
> From: Simon Horman <horms@kernel.org>
> > As you may be aware there is quite some activity to reduce the reliance on
> > RTNL. However, as the events in question are already protected by RTNL
> > I think the approach you have taken here is entirely reasonable.
>
> It would be appreicated if Amit you can post a follow-up patch against
> net-next.git to convert the rtnl_lock() to another lock or rtnl_net_lock().
We're obviously aware of the RTNL related work and we thought about
making these notification chains (atomic and blocking) per-netns, but
it's not something that can be submitted as a fix.
I will look into it, but there are some listeners that I'm not sure how
to convert. I can register them with "&init_net" for RFC. Hopefully the
relevant maintainers will be able to help with that.
Note that we will need to keep ASSERT_RTNL() in
call_switchdev_blocking_notifiers() until all the callers are converted
to per-netns RTNL.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net] net: switchdev: Convert blocking notification chain to a raw one
2025-03-05 12:15 [PATCH net] net: switchdev: Convert blocking notification chain to a raw one Amit Cohen
2025-03-10 6:17 ` Simon Horman
@ 2025-03-11 0:11 ` Vladimir Oltean
2025-03-11 11:10 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 6+ messages in thread
From: Vladimir Oltean @ 2025-03-11 0:11 UTC (permalink / raw)
To: Amit Cohen, Kuniyuki Iwashima
Cc: netdev, idosch, petrm, jiri, ivecera, davem, edumazet, kuba,
pabeni, horms, tobias
Hi Amit,
On Wed, Mar 05, 2025 at 02:15:09PM +0200, Amit Cohen wrote:
> A blocking notification chain uses a read-write semaphore to protect the
> integrity of the chain. The semaphore is acquired for writing when
> adding / removing notifiers to / from the chain and acquired for reading
> when traversing the chain and informing notifiers about an event.
>
> In case of the blocking switchdev notification chain, recursive
> notifications are possible which leads to the semaphore being acquired
> twice for reading and to lockdep warnings being generated [1].
>
> Specifically, this can happen when the bridge driver processes a
> SWITCHDEV_BRPORT_UNOFFLOADED event which causes it to emit notifications
> about deferred events when calling switchdev_deferred_process().
>
> Fix this by converting the notification chain to a raw notification
> chain in a similar fashion to the netdev notification chain. Protect
> the chain using the RTNL mutex by acquiring it when modifying the chain.
> Events are always informed under the RTNL mutex, but add an assertion in
> call_switchdev_blocking_notifiers() to make sure this is not violated in
> the future.
>
> Maintain the "blocking" prefix as events are always emitted from process
> context and listeners are allowed to block.
>
> [1]:
> WARNING: possible recursive locking detected
> 6.14.0-rc4-custom-g079270089484 #1 Not tainted
> --------------------------------------------
> ip/52731 is trying to acquire lock:
> ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
>
> but task is already holding lock:
> ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
> CPU0
> ----
> lock((switchdev_blocking_notif_chain).rwsem);
> lock((switchdev_blocking_notif_chain).rwsem);
>
> *** DEADLOCK ***
> May be due to missing lock nesting notation
> 3 locks held by ip/52731:
> #0: ffffffff84f795b0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x727/0x1dc0
> #1: ffffffff8731f628 (&net->rtnl_mutex){+.+.}-{4:4}, at: rtnl_newlink+0x790/0x1dc0
> #2: ffffffff850918d8 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x58/0xa0
>
> stack backtrace:
> ...
> ? __pfx_down_read+0x10/0x10
> ? __pfx_mark_lock+0x10/0x10
> ? __pfx_switchdev_port_attr_set_deferred+0x10/0x10
> blocking_notifier_call_chain+0x58/0xa0
> switchdev_port_attr_notify.constprop.0+0xb3/0x1b0
> ? __pfx_switchdev_port_attr_notify.constprop.0+0x10/0x10
> ? mark_held_locks+0x94/0xe0
> ? switchdev_deferred_process+0x11a/0x340
> switchdev_port_attr_set_deferred+0x27/0xd0
> switchdev_deferred_process+0x164/0x340
> br_switchdev_port_unoffload+0xc8/0x100 [bridge]
> br_switchdev_blocking_event+0x29f/0x580 [bridge]
> notifier_call_chain+0xa2/0x440
> blocking_notifier_call_chain+0x6e/0xa0
> switchdev_bridge_port_unoffload+0xde/0x1a0
> ...
>
> Fixes: f7a70d650b0b6 ("net: bridge: switchdev: Ensure deferred event delivery on unoffload")
> Signed-off-by: Amit Cohen <amcohen@nvidia.com>
> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
> ---
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
^ permalink raw reply [flat|nested] 6+ messages in thread* Re: [PATCH net] net: switchdev: Convert blocking notification chain to a raw one
2025-03-05 12:15 [PATCH net] net: switchdev: Convert blocking notification chain to a raw one Amit Cohen
2025-03-10 6:17 ` Simon Horman
2025-03-11 0:11 ` Vladimir Oltean
@ 2025-03-11 11:10 ` patchwork-bot+netdevbpf
2 siblings, 0 replies; 6+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-03-11 11:10 UTC (permalink / raw)
To: Amit Cohen
Cc: netdev, idosch, petrm, jiri, ivecera, davem, edumazet, kuba,
pabeni, horms, olteanv, tobias
Hello:
This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Wed, 5 Mar 2025 14:15:09 +0200 you wrote:
> A blocking notification chain uses a read-write semaphore to protect the
> integrity of the chain. The semaphore is acquired for writing when
> adding / removing notifiers to / from the chain and acquired for reading
> when traversing the chain and informing notifiers about an event.
>
> In case of the blocking switchdev notification chain, recursive
> notifications are possible which leads to the semaphore being acquired
> twice for reading and to lockdep warnings being generated [1].
>
> [...]
Here is the summary with links:
- [net] net: switchdev: Convert blocking notification chain to a raw one
https://git.kernel.org/netdev/net/c/62531a1effa8
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2025-03-12 17:50 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-05 12:15 [PATCH net] net: switchdev: Convert blocking notification chain to a raw one Amit Cohen
2025-03-10 6:17 ` Simon Horman
2025-03-11 8:12 ` Kuniyuki Iwashima
2025-03-12 17:50 ` Ido Schimmel
2025-03-11 0:11 ` Vladimir Oltean
2025-03-11 11:10 ` patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).