* [PATCH net] net: Lock lower level devices when updating features @ 2025-05-06 14:21 Cosmin Ratiu 2025-05-06 17:12 ` Stanislav Fomichev 0 siblings, 1 reply; 10+ messages in thread From: Cosmin Ratiu @ 2025-05-06 14:21 UTC (permalink / raw) To: netdev, cratiu Cc: Stanislav Fomichev, David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, jiri @ resnulli . us, Saeed Mahameed, Dragos Tatulea, linux-kselftest __netdev_update_features() expects the netdevice to be ops-locked, but it gets called recursively on the lower level netdevices to sync their features, and nothing locks those. This commit fixes that, with the assumption that it shouldn't be possible for both higher-level and lover-level netdevices to require the instance lock, because that would lead to lock dependency warnings. Without this, playing with higher level (e.g. vxlan) netdevices on top of netdevices with instance locking enabled can run into issues: WARNING: CPU: 59 PID: 206496 at ./include/net/netdev_lock.h:17 netif_napi_add_weight_locked+0x753/0xa60 [...] Call Trace: <TASK> mlx5e_open_channel+0xc09/0x3740 [mlx5_core] mlx5e_open_channels+0x1f0/0x770 [mlx5_core] mlx5e_safe_switch_params+0x1b5/0x2e0 [mlx5_core] set_feature_lro+0x1c2/0x330 [mlx5_core] mlx5e_handle_feature+0xc8/0x140 [mlx5_core] mlx5e_set_features+0x233/0x2e0 [mlx5_core] __netdev_update_features+0x5be/0x1670 __netdev_update_features+0x71f/0x1670 dev_ethtool+0x21c5/0x4aa0 dev_ioctl+0x438/0xae0 sock_ioctl+0x2ba/0x690 __x64_sys_ioctl+0xa78/0x1700 do_syscall_64+0x6d/0x140 entry_SYSCALL_64_after_hwframe+0x4b/0x53 </TASK> Fixes: 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> --- net/core/dev.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/net/core/dev.c b/net/core/dev.c index 1be7cb73a602..77472364225c 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -10620,8 +10620,11 @@ int __netdev_update_features(struct net_device *dev) /* some features must be disabled on lower devices when disabled * on an upper device (think: bonding master or bridge) */ - netdev_for_each_lower_dev(dev, lower, iter) + netdev_for_each_lower_dev(dev, lower, iter) { + netdev_lock_ops(lower); netdev_sync_lower_features(dev, lower, features); + netdev_unlock_ops(lower); + } if (!err) { netdev_features_t diff = features ^ dev->features; -- 2.45.0 ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH net] net: Lock lower level devices when updating features 2025-05-06 14:21 [PATCH net] net: Lock lower level devices when updating features Cosmin Ratiu @ 2025-05-06 17:12 ` Stanislav Fomichev 2025-05-06 17:40 ` Stanislav Fomichev 2025-05-06 17:47 ` Cosmin Ratiu 0 siblings, 2 replies; 10+ messages in thread From: Stanislav Fomichev @ 2025-05-06 17:12 UTC (permalink / raw) To: Cosmin Ratiu Cc: netdev, Stanislav Fomichev, David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, jiri @ resnulli . us, Saeed Mahameed, Dragos Tatulea, linux-kselftest On 05/06, Cosmin Ratiu wrote: > __netdev_update_features() expects the netdevice to be ops-locked, but > it gets called recursively on the lower level netdevices to sync their > features, and nothing locks those. > > This commit fixes that, with the assumption that it shouldn't be possible > for both higher-level and lover-level netdevices to require the instance > lock, because that would lead to lock dependency warnings. > > Without this, playing with higher level (e.g. vxlan) netdevices on top > of netdevices with instance locking enabled can run into issues: Mentioning vxlan is a bit confusing here; it shouldn't let you flip lro (I think). Which upper are you testing against? Trying to understand if we can cover this case in the selftests. netdevsim also doesn't expose F_LRO feature... (yet?) ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net] net: Lock lower level devices when updating features 2025-05-06 17:12 ` Stanislav Fomichev @ 2025-05-06 17:40 ` Stanislav Fomichev 2025-05-06 17:47 ` Cosmin Ratiu 1 sibling, 0 replies; 10+ messages in thread From: Stanislav Fomichev @ 2025-05-06 17:40 UTC (permalink / raw) To: Cosmin Ratiu Cc: netdev, Stanislav Fomichev, David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, jiri @ resnulli . us, Saeed Mahameed, Dragos Tatulea, linux-kselftest On 05/06, Stanislav Fomichev wrote: > On 05/06, Cosmin Ratiu wrote: > > __netdev_update_features() expects the netdevice to be ops-locked, but > > it gets called recursively on the lower level netdevices to sync their > > features, and nothing locks those. > > > > This commit fixes that, with the assumption that it shouldn't be possible > > for both higher-level and lover-level netdevices to require the instance > > lock, because that would lead to lock dependency warnings. > > > > Without this, playing with higher level (e.g. vxlan) netdevices on top > > of netdevices with instance locking enabled can run into issues: > > Mentioning vxlan is a bit confusing here; it shouldn't let you flip lro (I > think). Which upper are you testing against? > > Trying to understand if we can cover this case in the selftests. > netdevsim also doesn't expose F_LRO feature... (yet?) Ok, yeah, I can see it with the teaming on top of netdevsim and the following patch applied. Might try to send a testcase to cover this and the promisc part that I missed. Acked-by: Stanislav Fomichev <sdf@fomichev.me> diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c index 2aa999345fe1..af545d42961c 100644 --- a/drivers/net/netdevsim/netdev.c +++ b/drivers/net/netdevsim/netdev.c @@ -881,11 +881,13 @@ static void nsim_setup(struct net_device *dev) NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | + NETIF_F_LRO | NETIF_F_TSO; dev->hw_features |= NETIF_F_HW_TC | NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | + NETIF_F_LRO | NETIF_F_TSO; dev->max_mtu = ETH_MAX_MTU; dev->xdp_features = NETDEV_XDP_ACT_HW_OFFLOAD; ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH net] net: Lock lower level devices when updating features 2025-05-06 17:12 ` Stanislav Fomichev 2025-05-06 17:40 ` Stanislav Fomichev @ 2025-05-06 17:47 ` Cosmin Ratiu 2025-05-06 18:13 ` Stanislav Fomichev 1 sibling, 1 reply; 10+ messages in thread From: Cosmin Ratiu @ 2025-05-06 17:47 UTC (permalink / raw) To: stfomichev@gmail.com Cc: davem@davemloft.net, kuba@kernel.org, linux-kselftest@vger.kernel.org, Dragos Tatulea, sdf@fomichev.me, pabeni@redhat.com, jiri@resnulli.us, edumazet@google.com, Saeed Mahameed, netdev@vger.kernel.org On Tue, 2025-05-06 at 10:12 -0700, Stanislav Fomichev wrote: > On 05/06, Cosmin Ratiu wrote: > > __netdev_update_features() expects the netdevice to be ops-locked, > > but > > it gets called recursively on the lower level netdevices to sync > > their > > features, and nothing locks those. > > > > This commit fixes that, with the assumption that it shouldn't be > > possible > > for both higher-level and lover-level netdevices to require the > > instance > > lock, because that would lead to lock dependency warnings. > > > > Without this, playing with higher level (e.g. vxlan) netdevices on > > top > > of netdevices with instance locking enabled can run into issues: > > Mentioning vxlan is a bit confusing here; it shouldn't let you flip > lro (I > think). Which upper are you testing against? It is vxlan, but LRO is just a red herring in this case, mlx5e_set_features calls every feature handler in turn, and this is just the example I picked from the sea of stack traces. > > Trying to understand if we can cover this case in the selftests. > netdevsim also doesn't expose F_LRO feature... (yet?) I see you found a way with teaming, but I think in essence a sequence of commands that makes __netdev_update_features call itself recursively once on the lower dev will trigger the netdev_ops_assert_locked on the lower dev. Thanks for the review! Cosmin. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net] net: Lock lower level devices when updating features 2025-05-06 17:47 ` Cosmin Ratiu @ 2025-05-06 18:13 ` Stanislav Fomichev 2025-05-07 14:35 ` Cosmin Ratiu 0 siblings, 1 reply; 10+ messages in thread From: Stanislav Fomichev @ 2025-05-06 18:13 UTC (permalink / raw) To: Cosmin Ratiu Cc: davem@davemloft.net, kuba@kernel.org, linux-kselftest@vger.kernel.org, Dragos Tatulea, sdf@fomichev.me, pabeni@redhat.com, jiri@resnulli.us, edumazet@google.com, Saeed Mahameed, netdev@vger.kernel.org On 05/06, Cosmin Ratiu wrote: > On Tue, 2025-05-06 at 10:12 -0700, Stanislav Fomichev wrote: > > On 05/06, Cosmin Ratiu wrote: > > > __netdev_update_features() expects the netdevice to be ops-locked, > > > but > > > it gets called recursively on the lower level netdevices to sync > > > their > > > features, and nothing locks those. > > > > > > This commit fixes that, with the assumption that it shouldn't be > > > possible > > > for both higher-level and lover-level netdevices to require the > > > instance > > > lock, because that would lead to lock dependency warnings. > > > > > > Without this, playing with higher level (e.g. vxlan) netdevices on > > > top > > > of netdevices with instance locking enabled can run into issues: > > > > Mentioning vxlan is a bit confusing here; it shouldn't let you flip > > lro (I > > think). Which upper are you testing against? > > It is vxlan, but LRO is just a red herring in this case, > mlx5e_set_features calls every feature handler in turn, and this is > just the example I picked from the sea of stack traces. > > > > > Trying to understand if we can cover this case in the selftests. > > netdevsim also doesn't expose F_LRO feature... (yet?) > > I see you found a way with teaming, but I think in essence a sequence > of commands that makes __netdev_update_features call itself recursively > once on the lower dev will trigger the netdev_ops_assert_locked on the > lower dev. Right, but netdev_sync_lower_features calls lower's __netdev_update_features only for NETIF_F_UPPER_DISABLES. So it doesn't propagate all features, only LRO AFAICT. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net] net: Lock lower level devices when updating features 2025-05-06 18:13 ` Stanislav Fomichev @ 2025-05-07 14:35 ` Cosmin Ratiu 2025-05-07 15:13 ` Cosmin Ratiu 0 siblings, 1 reply; 10+ messages in thread From: Cosmin Ratiu @ 2025-05-07 14:35 UTC (permalink / raw) To: stfomichev@gmail.com Cc: davem@davemloft.net, linux-kselftest@vger.kernel.org, Dragos Tatulea, sdf@fomichev.me, pabeni@redhat.com, jiri@resnulli.us, edumazet@google.com, kuba@kernel.org, Saeed Mahameed, netdev@vger.kernel.org On Tue, 2025-05-06 at 11:13 -0700, Stanislav Fomichev wrote: > On 05/06, Cosmin Ratiu wrote: > > > Right, but netdev_sync_lower_features calls lower's > __netdev_update_features > only for NETIF_F_UPPER_DISABLES. So it doesn't propagate all > features, > only LRO AFAICT. Got it, I didn't look into what netdev_sync_lower_features actually does besides noticing it can call __netdev_update_feature... In any case, please hold off with picking this patch up, it seems there's a possibility of a real deadlock. Here's the scenario: ============================================ WARNING: possible recursive locking detected -------------------------------------------- ethtool/44150 is trying to acquire lock: ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: __netdev_update_features+0x31e/0xe20 but task is already holding lock: ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: ethnl_set_features+0xbc/0x4b0 and the lock comparison function returns 0: other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&dev_instance_lock_key#7); lock(&dev_instance_lock_key#7); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by ethtool/44150: #0: ffffffff830e5a50 (cb_lock){++++}-{4:4}, at: genl_rcv+0x15/0x40 #1: ffffffff830cf708 (rtnl_mutex){+.+.}-{4:4}, at: ethnl_set_features+0x88/0x4b0 #2: ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: ethnl_set_features+0xbc/0x4b0 stack backtrace: Call Trace: <TASK> dump_stack_lvl+0x69/0xa0 print_deadlock_bug.cold+0xbd/0xca __lock_acquire+0x163c/0x2f00 lock_acquire+0xd3/0x2e0 __mutex_lock+0x98/0xf10 __netdev_update_features+0x31e/0xe20 netdev_update_features+0x1f/0x60 vlan_device_event+0x57d/0x930 [8021q] notifier_call_chain+0x3d/0x100 netdev_features_change+0x32/0x50 ethnl_set_features+0x17e/0x4b0 genl_family_rcv_msg_doit+0xe0/0x130 genl_rcv_msg+0x188/0x290 [...] I'm not sure how to solve this yet... Cosmin. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net] net: Lock lower level devices when updating features 2025-05-07 14:35 ` Cosmin Ratiu @ 2025-05-07 15:13 ` Cosmin Ratiu 2025-05-07 20:29 ` Cosmin Ratiu 0 siblings, 1 reply; 10+ messages in thread From: Cosmin Ratiu @ 2025-05-07 15:13 UTC (permalink / raw) To: stfomichev@gmail.com Cc: davem@davemloft.net, linux-kselftest@vger.kernel.org, Dragos Tatulea, sdf@fomichev.me, kuba@kernel.org, jiri@resnulli.us, edumazet@google.com, pabeni@redhat.com, Saeed Mahameed, netdev@vger.kernel.org On Wed, 2025-05-07 at 14:35 +0000, Cosmin Ratiu wrote: > On Tue, 2025-05-06 at 11:13 -0700, Stanislav Fomichev wrote: > > On 05/06, Cosmin Ratiu wrote: > > > > > > Right, but netdev_sync_lower_features calls lower's > > __netdev_update_features > > only for NETIF_F_UPPER_DISABLES. So it doesn't propagate all > > features, > > only LRO AFAICT. > > Got it, I didn't look into what netdev_sync_lower_features actually > does besides noticing it can call __netdev_update_feature... > > In any case, please hold off with picking this patch up, it seems > there's a possibility of a real deadlock. Here's the scenario: > > ============================================ > WARNING: possible recursive locking detected > -------------------------------------------- > ethtool/44150 is trying to acquire lock: > ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: > __netdev_update_features+0x31e/0xe20 > > but task is already holding lock: > ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: > ethnl_set_features+0xbc/0x4b0 > and the lock comparison function returns 0: > > other info that might help us debug this: > Possible unsafe locking scenario: > > CPU0 > ---- > lock(&dev_instance_lock_key#7); > lock(&dev_instance_lock_key#7); > > *** DEADLOCK *** > > May be due to missing lock nesting notation > > 3 locks held by ethtool/44150: > #0: ffffffff830e5a50 (cb_lock){++++}-{4:4}, at: genl_rcv+0x15/0x40 > #1: ffffffff830cf708 (rtnl_mutex){+.+.}-{4:4}, at: > ethnl_set_features+0x88/0x4b0 > #2: ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: > ethnl_set_features+0xbc/0x4b0 > > stack backtrace: > Call Trace: > <TASK> > dump_stack_lvl+0x69/0xa0 > print_deadlock_bug.cold+0xbd/0xca > __lock_acquire+0x163c/0x2f00 > lock_acquire+0xd3/0x2e0 > __mutex_lock+0x98/0xf10 > __netdev_update_features+0x31e/0xe20 > netdev_update_features+0x1f/0x60 > vlan_device_event+0x57d/0x930 [8021q] > notifier_call_chain+0x3d/0x100 > netdev_features_change+0x32/0x50 > ethnl_set_features+0x17e/0x4b0 > genl_family_rcv_msg_doit+0xe0/0x130 > genl_rcv_msg+0x188/0x290 > [...] > > I'm not sure how to solve this yet... > Cosmin. If it's not clear, the problem is that: 1. the lower device is already ops locked 2. netdev_feature_change gets called. 3. __netdev_update_features gets called for the vlan (upper) dev. 4. It tries to acquire the same lock instance as 1 (this patch). 5. Deadlock. One solution I can think of would be to run device notifiers for changing features outside the lock, it doesn't seem like the netdev lock has anything to do with what other devices might do with this information. This can be triggered from many scenarios, I have another similar stack involving bonding. What do you think? Cosmin. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH net] net: Lock lower level devices when updating features 2025-05-07 15:13 ` Cosmin Ratiu @ 2025-05-07 20:29 ` Cosmin Ratiu 2025-05-07 21:20 ` Stanislav Fomichev 0 siblings, 1 reply; 10+ messages in thread From: Cosmin Ratiu @ 2025-05-07 20:29 UTC (permalink / raw) To: stfomichev@gmail.com Cc: davem@davemloft.net, linux-kselftest@vger.kernel.org, Dragos Tatulea, sdf@fomichev.me, pabeni@redhat.com, jiri@resnulli.us, edumazet@google.com, kuba@kernel.org, Saeed Mahameed, netdev@vger.kernel.org On Wed, 2025-05-07 at 15:13 +0000, Cosmin Ratiu wrote: > > In any case, please hold off with picking this patch up, it seems > > there's a possibility of a real deadlock. Here's the scenario: > > > > ============================================ > > WARNING: possible recursive locking detected > > -------------------------------------------- > > ethtool/44150 is trying to acquire lock: > > ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: > > __netdev_update_features+0x31e/0xe20 > > > > but task is already holding lock: > > ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: > > ethnl_set_features+0xbc/0x4b0 > > and the lock comparison function returns 0: > > > > other info that might help us debug this: > > Possible unsafe locking scenario: > > > > CPU0 > > ---- > > lock(&dev_instance_lock_key#7); > > lock(&dev_instance_lock_key#7); > > > > *** DEADLOCK *** > > > > May be due to missing lock nesting notation > > > > 3 locks held by ethtool/44150: > > #0: ffffffff830e5a50 (cb_lock){++++}-{4:4}, at: genl_rcv+0x15/0x40 > > #1: ffffffff830cf708 (rtnl_mutex){+.+.}-{4:4}, at: > > ethnl_set_features+0x88/0x4b0 > > #2: ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: > > ethnl_set_features+0xbc/0x4b0 > > > > stack backtrace: > > Call Trace: > > <TASK> > > dump_stack_lvl+0x69/0xa0 > > print_deadlock_bug.cold+0xbd/0xca > > __lock_acquire+0x163c/0x2f00 > > lock_acquire+0xd3/0x2e0 > > __mutex_lock+0x98/0xf10 > > __netdev_update_features+0x31e/0xe20 > > netdev_update_features+0x1f/0x60 > > vlan_device_event+0x57d/0x930 [8021q] > > notifier_call_chain+0x3d/0x100 > > netdev_features_change+0x32/0x50 > > ethnl_set_features+0x17e/0x4b0 > > genl_family_rcv_msg_doit+0xe0/0x130 > > genl_rcv_msg+0x188/0x290 > > [...] > > > > I'm not sure how to solve this yet... > > Cosmin. > > If it's not clear, the problem is that: > 1. the lower device is already ops locked > 2. netdev_feature_change gets called. > 3. __netdev_update_features gets called for the vlan (upper) dev. > 4. It tries to acquire the same lock instance as 1 (this patch). > 5. Deadlock. > > One solution I can think of would be to run device notifiers for > changing features outside the lock, it doesn't seem like the netdev > lock has anything to do with what other devices might do with this > information. > > This can be triggered from many scenarios, I have another similar > stack > involving bonding. > > What do you think? All I could think of was to drop the lock during the netdev_features_changed notifier calls, like in the following hunk. I'm running this through regressions, let's see if it's a good idea or not. diff --git a/net/core/dev.c b/net/core/dev.c index 1be7cb73a602..817fd5bc21b1 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1514,7 +1514,12 @@ int dev_get_alias(const struct net_device *dev, char *name, size_t len) */ void netdev_features_change(struct net_device *dev) { + /* Drop the lock to avoid potential deadlocks from e.g. upper dev + * notifiers altering features of 'dev' and acquiring dev lock again. + */ + netdev_unlock_ops(dev); call_netdevice_notifiers(NETDEV_FEAT_CHANGE, dev); + netdev_lock_ops(dev); } EXPORT_SYMBOL(netdev_features_change); ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH net] net: Lock lower level devices when updating features 2025-05-07 20:29 ` Cosmin Ratiu @ 2025-05-07 21:20 ` Stanislav Fomichev 2025-05-08 10:33 ` Cosmin Ratiu 0 siblings, 1 reply; 10+ messages in thread From: Stanislav Fomichev @ 2025-05-07 21:20 UTC (permalink / raw) To: Cosmin Ratiu Cc: davem@davemloft.net, linux-kselftest@vger.kernel.org, Dragos Tatulea, sdf@fomichev.me, pabeni@redhat.com, jiri@resnulli.us, edumazet@google.com, kuba@kernel.org, Saeed Mahameed, netdev@vger.kernel.org On 05/07, Cosmin Ratiu wrote: > On Wed, 2025-05-07 at 15:13 +0000, Cosmin Ratiu wrote: > > > In any case, please hold off with picking this patch up, it seems > > > there's a possibility of a real deadlock. Here's the scenario: > > > > > > ============================================ > > > WARNING: possible recursive locking detected > > > -------------------------------------------- > > > ethtool/44150 is trying to acquire lock: > > > ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: > > > __netdev_update_features+0x31e/0xe20 > > > > > > but task is already holding lock: > > > ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: > > > ethnl_set_features+0xbc/0x4b0 > > > and the lock comparison function returns 0: > > > > > > other info that might help us debug this: > > > Possible unsafe locking scenario: > > > > > > CPU0 > > > ---- > > > lock(&dev_instance_lock_key#7); > > > lock(&dev_instance_lock_key#7); > > > > > > *** DEADLOCK *** > > > > > > May be due to missing lock nesting notation > > > > > > 3 locks held by ethtool/44150: > > > #0: ffffffff830e5a50 (cb_lock){++++}-{4:4}, at: genl_rcv+0x15/0x40 > > > #1: ffffffff830cf708 (rtnl_mutex){+.+.}-{4:4}, at: > > > ethnl_set_features+0x88/0x4b0 > > > #2: ffff8881364e8c80 (&dev_instance_lock_key#7){+.+.}-{4:4}, at: > > > ethnl_set_features+0xbc/0x4b0 > > > > > > stack backtrace: > > > Call Trace: > > > <TASK> > > > dump_stack_lvl+0x69/0xa0 > > > print_deadlock_bug.cold+0xbd/0xca > > > __lock_acquire+0x163c/0x2f00 > > > lock_acquire+0xd3/0x2e0 > > > __mutex_lock+0x98/0xf10 > > > __netdev_update_features+0x31e/0xe20 > > > netdev_update_features+0x1f/0x60 > > > vlan_device_event+0x57d/0x930 [8021q] > > > notifier_call_chain+0x3d/0x100 > > > netdev_features_change+0x32/0x50 > > > ethnl_set_features+0x17e/0x4b0 > > > genl_family_rcv_msg_doit+0xe0/0x130 > > > genl_rcv_msg+0x188/0x290 > > > [...] > > > > > > I'm not sure how to solve this yet... > > > Cosmin. > > > > If it's not clear, the problem is that: > > 1. the lower device is already ops locked > > 2. netdev_feature_change gets called. > > 3. __netdev_update_features gets called for the vlan (upper) dev. > > 4. It tries to acquire the same lock instance as 1 (this patch). > > 5. Deadlock. > > > > One solution I can think of would be to run device notifiers for > > changing features outside the lock, it doesn't seem like the netdev > > lock has anything to do with what other devices might do with this > > information. > > > > This can be triggered from many scenarios, I have another similar > > stack > > involving bonding. > > > > What do you think? > > All I could think of was to drop the lock during the > netdev_features_changed notifier calls, like in the following hunk. > I'm running this through regressions, let's see if it's a good idea or > not. > > diff --git a/net/core/dev.c b/net/core/dev.c > index 1be7cb73a602..817fd5bc21b1 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -1514,7 +1514,12 @@ int dev_get_alias(const struct net_device *dev, > char *name, size_t len) > */ > void netdev_features_change(struct net_device *dev) > { > + /* Drop the lock to avoid potential deadlocks from e.g. upper > dev > + * notifiers altering features of 'dev' and acquiring dev lock > again. > + */ > + netdev_unlock_ops(dev); > call_netdevice_notifiers(NETDEV_FEAT_CHANGE, dev); > + netdev_lock_ops(dev); > } > EXPORT_SYMBOL(netdev_features_change); > Hmm, are you sure you're calling __netdev_update_features on the upper? I don't see how the lower would be locked in that case. From my POW, this is what happens: 1. your dev (lower) has a vlan on it (upper) 2. you call lro=off on the _lower_ 3. this triggers FEAT_CHANGE notifier and vlan_device_event catches it 4. since the lower has a vlan device (dev->vlan_info != NULL), it goes over every other vlan in the group and triggers netdev_update_features for the upper (netdev_update_features vlandev) 5. the upper tries to sync the features into the lower (including the one that triggered FEAT_CHANGE) and that's where the deadlock happens But I think (5) should be largely a no-op for the device triggering the notification, because the features have been already applied in ethnl_set_features. I'd move the lock into netdev_sync_lower_features, and only after checking the features (and making sure that we are going to change them). The feature check might be racy, but I think it should still work? Can you also share the bonding stacktrace as well to confirm it's the same issue? diff --git a/net/core/dev.c b/net/core/dev.c index bcb266ab2912..b5fc8a740e8b 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -10312,6 +10312,7 @@ static void netdev_sync_lower_features(struct net_device *upper, for_each_netdev_feature(upper_disables, feature_bit) { feature = __NETIF_F_BIT(feature_bit); if (!(features & feature) && (lower->features & feature)) { + netdev_lock_ops(lower); netdev_dbg(upper, "Disabling feature %pNF on lower dev %s.\n", &feature, lower->name); lower->wanted_features &= ~feature; @@ -10322,6 +10323,7 @@ static void netdev_sync_lower_features(struct net_device *upper, &feature, lower->name); else netdev_features_change(lower); + netdev_unlock_ops(lower); } } } ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH net] net: Lock lower level devices when updating features 2025-05-07 21:20 ` Stanislav Fomichev @ 2025-05-08 10:33 ` Cosmin Ratiu 0 siblings, 0 replies; 10+ messages in thread From: Cosmin Ratiu @ 2025-05-08 10:33 UTC (permalink / raw) To: stfomichev@gmail.com Cc: davem@davemloft.net, linux-kselftest@vger.kernel.org, Dragos Tatulea, sdf@fomichev.me, kuba@kernel.org, jiri@resnulli.us, edumazet@google.com, pabeni@redhat.com, Saeed Mahameed, netdev@vger.kernel.org On Wed, 2025-05-07 at 14:20 -0700, Stanislav Fomichev wrote: > On 05/07, Cosmin Ratiu wrote: > > On Wed, 2025-05-07 at 15:13 +0000, Cosmin Ratiu wrote: > > > > In any case, please hold off with picking this patch up, it > > > > seems > > > > there's a possibility of a real deadlock. Here's the scenario: > > Hmm, are you sure you're calling __netdev_update_features on the > upper? > I don't see how the lower would be locked in that case. From my POW, > this is what happens: > > 1. your dev (lower) has a vlan on it (upper) > 2. you call lro=off on the _lower_ > 3. this triggers FEAT_CHANGE notifier and vlan_device_event catches > it > 4. since the lower has a vlan device (dev->vlan_info != NULL), it > goes > over every other vlan in the group and triggers > netdev_update_features > for the upper (netdev_update_features vlandev) > 5. the upper tries to sync the features into the lower (including the > one that triggered FEAT_CHANGE) and that's where the deadlock > happens > > But I think (5) should be largely a no-op for the device triggering > the > notification, because the features have been already applied in > ethnl_set_features. > I'd move the lock into netdev_sync_lower_features, and only after > checking > the features (and making sure that we are going to change them). The > feature > check might be racy, but I think it should still work? > You are right, if I restrict the lower dev critical section to only the call to __netdev_update_features for the lower dev there's no deadlock any more, because the device with the lock held already had its features updated. I will send a new version of this patch soon after the full regression suite finishes and I convince myself there are no more issues related to this that we can encounter. > Can you also share the bonding stacktrace as well to confirm it's the > same issue? Sure, here it is, it's the same scenario. bond_netdev_event gets called on a slave dev, it recomputes features and updates all slaves (bond_compute_features), and then the same lock is reacquired. But this is also fixed with your suggestion above. ============================================ WARNING: possible recursive locking detected devlink/14341 is trying to acquire lock: ffff88810ebd8c80 (&dev_instance_lock_key#9){+.+.}-{4:4}, at: __netdev_update_features+0x31e/0xe20 but task is already holding lock: ffff88810ebd8c80 (&dev_instance_lock_key#9){+.+.}-{4:4}, at: mlx5e_attach_netdev+0x31f/0x360 [mlx5_core] and the lock comparison function returns 0: other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&dev_instance_lock_key#9); lock(&dev_instance_lock_key#9); *** DEADLOCK *** May be due to missing lock nesting notation 4 locks held by devlink/14341: #0: ffffffff830e5a50 (cb_lock){++++}-{4:4}, at: genl_rcv+0x15/0x40 #1: ffff888164a5c250 (&devlink->lock_key){+.+.}-{4:4}, at: devlink_get_from_attrs_lock+0xbc/0x180 #2: ffffffff830cf708 (rtnl_mutex){+.+.}-{4:4}, at: mlx5e_attach_netdev+0x30d/0x360 [mlx5_core] #3: ffff88810ebd8c80 (&dev_instance_lock_key#9){+.+.}-{4:4}, at: mlx5e_attach_netdev+0x31f/0x360 [mlx5_core] Call Trace: <TASK> dump_stack_lvl+0x69/0xa0 print_deadlock_bug.cold+0xbd/0xca __lock_acquire+0x163c/0x2f00 lock_acquire+0xd3/0x2e0 __mutex_lock+0x98/0xf10 __netdev_update_features+0x31e/0xe20 netdev_change_features+0x1f/0x60 bond_compute_features+0x24e/0x300 [bonding] bond_netdev_event+0x2e0/0x400 [bonding] notifier_call_chain+0x3d/0x100 netdev_update_features+0x52/0x60 mlx5e_attach_netdev+0x32f/0x360 [mlx5_core] mlx5e_netdev_attach_profile+0x48/0x90 [mlx5_core] mlx5e_netdev_change_profile+0x90/0xf0 [mlx5_core] mlx5e_vport_rep_load+0x414/0x490 [mlx5_core] __esw_offloads_load_rep+0x87/0xd0 [mlx5_core] mlx5_esw_offloads_rep_load+0x45/0xe0 [mlx5_core] esw_offloads_enable+0xb7b/0xca0 [mlx5_core] mlx5_eswitch_enable_locked+0x293/0x430 [mlx5_core] mlx5_devlink_eswitch_mode_set+0x229/0x620 [mlx5_core] devlink_nl_eswitch_set_doit+0x60/0xd0 genl_family_rcv_msg_doit+0xe0/0x130 genl_rcv_msg+0x188/0x290 netlink_rcv_skb+0x4b/0xf0 genl_rcv+0x24/0x40 netlink_unicast+0x1e1/0x2c0 netlink_sendmsg+0x210/0x450 Cosmin. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-05-08 10:33 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-05-06 14:21 [PATCH net] net: Lock lower level devices when updating features Cosmin Ratiu 2025-05-06 17:12 ` Stanislav Fomichev 2025-05-06 17:40 ` Stanislav Fomichev 2025-05-06 17:47 ` Cosmin Ratiu 2025-05-06 18:13 ` Stanislav Fomichev 2025-05-07 14:35 ` Cosmin Ratiu 2025-05-07 15:13 ` Cosmin Ratiu 2025-05-07 20:29 ` Cosmin Ratiu 2025-05-07 21:20 ` Stanislav Fomichev 2025-05-08 10:33 ` Cosmin Ratiu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).