* [PATCH net] bridge: cfm: Fix race condition in peer_mep deletion
@ 2026-03-10 18:18 Hyunwoo Kim
2026-03-11 3:02 ` Hyunwoo Kim
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Hyunwoo Kim @ 2026-03-10 18:18 UTC
To: razor, idosch, davem, edumazet, kuba, pabeni, horms
Cc: bridge, netdev, imv4bel
When a peer MEP is being deleted, cancel_delayed_work_sync() is called
on ccm_rx_dwork before freeing. However, br_cfm_frame_rx() runs in
softirq context under rcu_read_lock (without RTNL) and can re-schedule
ccm_rx_dwork via ccm_rx_timer_start() between cancel_delayed_work_sync()
returning and kfree_rcu() being called.

The following is a simple race scenario:

           cpu0                                     cpu1

mep_delete_implementation()
  cancel_delayed_work_sync(ccm_rx_dwork);
                                           br_cfm_frame_rx()
                                             // peer_mep still in hlist
                                             if (peer_mep->ccm_defect)
                                               ccm_rx_timer_start()
                                                 queue_delayed_work(ccm_rx_dwork)
  hlist_del_rcu(&peer_mep->head);
  kfree_rcu(peer_mep, rcu);
                                           ccm_rx_work_expired()
                                             // on freed peer_mep

To prevent this, cancel_delayed_work_sync() is replaced with
disable_delayed_work_sync() in both peer MEP deletion paths, so
that subsequent queue_delayed_work() calls from br_cfm_frame_rx()
are silently rejected.

The cc_peer_disable() helper retains cancel_delayed_work_sync()
because it is also used for the CC enable/disable toggle path where
the work must remain re-schedulable.

Fixes: dc32cbb3dbd7 ("bridge: cfm: Kernel space implementation of CFM. CCM frame RX added.")
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
---
 net/bridge/br_cfm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_cfm.c b/net/bridge/br_cfm.c
index 2c70fe47de38..118c7ea48c35 100644
--- a/net/bridge/br_cfm.c
+++ b/net/bridge/br_cfm.c
@@ -576,7 +576,7 @@ static void mep_delete_implementation(struct net_bridge *br,
 
 	/* Empty and free peer MEP list */
 	hlist_for_each_entry_safe(peer_mep, n_store, &mep->peer_mep_list, head) {
-		cancel_delayed_work_sync(&peer_mep->ccm_rx_dwork);
+		disable_delayed_work_sync(&peer_mep->ccm_rx_dwork);
 		hlist_del_rcu(&peer_mep->head);
 		kfree_rcu(peer_mep, rcu);
 	}
@@ -732,7 +732,7 @@ int br_cfm_cc_peer_mep_remove(struct net_bridge *br, const u32 instance,
 		return -ENOENT;
 	}
 
-	cc_peer_disable(peer_mep);
+	disable_delayed_work_sync(&peer_mep->ccm_rx_dwork);
 
 	hlist_del_rcu(&peer_mep->head);
 	kfree_rcu(peer_mep, rcu);
--
2.43.0
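
Editorial sketch, not part of the patch: the difference between the two
teardown calls described in the commit message can be illustrated with a
small userspace model. Everything below (struct model_dwork and the
model_*() helpers) is hypothetical and only mimics the documented
behaviour, namely that cancelling leaves the work re-queueable while
disabling makes later queue attempts fail. It is not the kernel's
workqueue implementation and it does not model timers, RCU, or real
concurrency; the "race" is replayed as a fixed sequence of calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; NOT the kernel's struct delayed_work. */
struct model_dwork {
	bool pending;          /* queued, handler has not run yet */
	unsigned int disabled; /* disable depth; non-zero rejects queueing */
};

/* Models queue_delayed_work(): returns true only if the work got queued. */
static bool model_queue(struct model_dwork *w)
{
	if (w->disabled || w->pending)
		return false;
	w->pending = true;
	return true;
}

/* Models cancel_delayed_work_sync(): drops a pending work, nothing more. */
static bool model_cancel_sync(struct model_dwork *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Models disable_delayed_work_sync(): cancel, and reject later queueing. */
static bool model_disable_sync(struct model_dwork *w)
{
	w->disabled++;
	return model_cancel_sync(w);
}

/*
 * Replays the deletion path against one racing RX-side queue attempt.
 * Returns true if the work is pending when the object would be freed.
 */
static bool model_pending_at_free(bool use_disable)
{
	struct model_dwork w = { .pending = true, .disabled = 0 };

	if (use_disable)
		model_disable_sync(&w);
	else
		model_cancel_sync(&w);

	model_queue(&w); /* stands in for br_cfm_frame_rx() racing in */

	return w.pending;
}
```

In this model, model_pending_at_free(false) leaves the work pending at
the point where the peer MEP would be freed, which corresponds to the
use-after-free in the scenario above, while model_pending_at_free(true)
does not. It also hints at why a toggle path would keep using the cancel
variant: a disabled work stays un-queueable until it is explicitly
re-enabled.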
* Re: [PATCH net] bridge: cfm: Fix race condition in peer_mep deletion
From: Hyunwoo Kim @ 2026-03-11 3:02 UTC
To: razor, idosch, davem, edumazet, kuba, pabeni, horms
Cc: bridge, netdev, v4bel, henrik.bjoernlund, horatiu.vultur, nikolay,
sd
CC'ing the Fixes patch authors and Sabrina, who is familiar with this bug pattern.
Best regards,
Hyunwoo Kim
* Re: [PATCH net] bridge: cfm: Fix race condition in peer_mep deletion
From: Ido Schimmel @ 2026-03-12 15:35 UTC
To: Hyunwoo Kim; +Cc: razor, davem, edumazet, kuba, pabeni, horms, bridge, netdev
On Wed, Mar 11, 2026 at 03:18:09AM +0900, Hyunwoo Kim wrote:
> When a peer MEP is being deleted, cancel_delayed_work_sync() is called
> on ccm_rx_dwork before freeing. However, br_cfm_frame_rx() runs in
> softirq context under rcu_read_lock (without RTNL) and can re-schedule
> ccm_rx_dwork via ccm_rx_timer_start() between cancel_delayed_work_sync()
> returning and kfree_rcu() being called.
>
> The following is a simple race scenario:
>
>            cpu0                                     cpu1
>
> mep_delete_implementation()
>   cancel_delayed_work_sync(ccm_rx_dwork);
>                                            br_cfm_frame_rx()
>                                              // peer_mep still in hlist
>                                              if (peer_mep->ccm_defect)
>                                                ccm_rx_timer_start()
>                                                  queue_delayed_work(ccm_rx_dwork)
>   hlist_del_rcu(&peer_mep->head);
>   kfree_rcu(peer_mep, rcu);
>                                            ccm_rx_work_expired()
>                                              // on freed peer_mep
>
> To prevent this, cancel_delayed_work_sync() is replaced with
> disable_delayed_work_sync() in both peer MEP deletion paths, so
> that subsequent queue_delayed_work() calls from br_cfm_frame_rx()
> are silently rejected.
>
> The cc_peer_disable() helper retains cancel_delayed_work_sync()
> because it is also used for the CC enable/disable toggle path where
> the work must remain re-schedulable.
>
> Fixes: dc32cbb3dbd7 ("bridge: cfm: Kernel space implementation of CFM. CCM frame RX added.")
> Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Not familiar with CFM, but your explanation makes sense.
AFAICT it's not needed for ccm_tx_dwork since the delayed work re-queues
itself.
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
* Re: [PATCH net] bridge: cfm: Fix race condition in peer_mep deletion
From: patchwork-bot+netdevbpf @ 2026-03-13 1:40 UTC
To: Hyunwoo Kim
Cc: razor, idosch, davem, edumazet, kuba, pabeni, horms, bridge,
netdev
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 11 Mar 2026 03:18:09 +0900 you wrote:
> When a peer MEP is being deleted, cancel_delayed_work_sync() is called
> on ccm_rx_dwork before freeing. However, br_cfm_frame_rx() runs in
> softirq context under rcu_read_lock (without RTNL) and can re-schedule
> ccm_rx_dwork via ccm_rx_timer_start() between cancel_delayed_work_sync()
> returning and kfree_rcu() being called.
>
> The following is a simple race scenario:
>
> [...]
Here is the summary with links:
- [net] bridge: cfm: Fix race condition in peer_mep deletion
https://git.kernel.org/netdev/net/c/3715a0085531
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html