* [PATCH v5 net 0/3] net: Fix race of rtnl_net_lock(dev_net(dev)).
@ 2025-02-17 19:11 Kuniyuki Iwashima
2025-02-17 19:11 ` [PATCH v5 net 1/3] net: Add net_passive_inc() and net_passive_dec() Kuniyuki Iwashima
` (3 more replies)
0 siblings, 4 replies; 13+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-17 19:11 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman
Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev
Yael Chemla reported that commit 7fb1073300a2 ("net: Hold rtnl_net_lock()
in (un)?register_netdevice_notifier_dev_net().") started to trigger KASAN's
use-after-free splat.
The problem is that dev_net(dev) fetched before rtnl_net_lock() might be
different after rtnl_net_lock().
The patch 2 fixes the issue by checking dev_net(dev) after rtnl_net_lock(),
and the patch 3 fixes the same potential issue that would emerge once RTNL
is removed.
Changes:
v5:
* Use do-while loop instead of goto
v4: https://lore.kernel.org/netdev/20250212064206.18159-1-kuniyu@amazon.com/
* Add patch 1
* Fix build failure for !CONFIG_NET_NS in patch 2
v3: https://lore.kernel.org/netdev/20250211051217.12613-1-kuniyu@amazon.com/
* Bump net->passive instead of maybe_get_net()
* Remove msleep(1) loop
* Use rcu_access_pointer() instead of rcu_read_lock().
v2: https://lore.kernel.org/netdev/20250207044251.65421-1-kuniyu@amazon.com/
* Use dev_net_rcu()
* Use msleep(1) instead of cond_resched() after maybe_get_net()
* Remove cond_resched() after net_eq() check
v1: https://lore.kernel.org/netdev/20250130232435.43622-1-kuniyu@amazon.com/
Kuniyuki Iwashima (3):
net: Add net_passive_inc() and net_passive_dec().
net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net().
dev: Use rtnl_net_dev_lock() in unregister_netdev().
include/net/net_namespace.h | 11 ++++++++
net/core/dev.c | 54 +++++++++++++++++++++++++++++++------
net/core/net_namespace.c | 8 +++---
3 files changed, 61 insertions(+), 12 deletions(-)
--
2.39.5 (Apple Git-154)
^ permalink raw reply [flat|nested] 13+ messages in thread* [PATCH v5 net 1/3] net: Add net_passive_inc() and net_passive_dec(). 2025-02-17 19:11 [PATCH v5 net 0/3] net: Fix race of rtnl_net_lock(dev_net(dev)) Kuniyuki Iwashima @ 2025-02-17 19:11 ` Kuniyuki Iwashima 2025-02-17 19:11 ` [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net() Kuniyuki Iwashima ` (2 subsequent siblings) 3 siblings, 0 replies; 13+ messages in thread From: Kuniyuki Iwashima @ 2025-02-17 19:11 UTC (permalink / raw) To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev net_drop_ns() is NULL when CONFIG_NET_NS is disabled. The next patch introduces a function that increments and decrements net->passive. As a prep, let's rename and export net_free() to net_passive_dec() and add net_passive_inc(). Suggested-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/netdev/CANn89i+oUCt2VGvrbrweniTendZFEh+nwS=uonc004-aPkWy-Q@mail.gmail.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> --- include/net/net_namespace.h | 11 +++++++++++ net/core/net_namespace.c | 8 ++++---- 2 files changed, 15 insertions(+), 4 deletions(-) diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h index 7ba1402ca779..f467a66abc6b 100644 --- a/include/net/net_namespace.h +++ b/include/net/net_namespace.h @@ -297,6 +297,7 @@ static inline int check_net(const struct net *net) } void net_drop_ns(void *); +void net_passive_dec(struct net *net); #else @@ -326,8 +327,18 @@ static inline int check_net(const struct net *net) } #define net_drop_ns NULL + +static inline void net_passive_dec(struct net *net) +{ + refcount_dec(&net->passive); +} #endif +static inline void net_passive_inc(struct net *net) +{ + refcount_inc(&net->passive); +} + /* Returns true if the netns initialization is completed successfully */ static inline bool net_initialized(const struct net *net) { diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c index cb39a12b2f82..4303f2a49262 100644 --- a/net/core/net_namespace.c +++ b/net/core/net_namespace.c @@ -464,7 +464,7 @@ static void net_complete_free(void) } -static void net_free(struct net *net) +void net_passive_dec(struct net *net) { if (refcount_dec_and_test(&net->passive)) { kfree(rcu_access_pointer(net->gen)); @@ -482,7 +482,7 @@ void net_drop_ns(void *p) struct net *net = (struct net *)p; if (net) - net_free(net); + net_passive_dec(net); } struct net *copy_net_ns(unsigned long flags, @@ -523,7 +523,7 @@ struct net *copy_net_ns(unsigned long flags, key_remove_domain(net->key_domain); #endif put_user_ns(user_ns); - net_free(net); + net_passive_dec(net); dec_ucounts: dec_net_namespaces(ucounts); return ERR_PTR(rv); @@ -672,7 +672,7 @@ static void cleanup_net(struct work_struct *work) key_remove_domain(net->key_domain); #endif put_user_ns(net->user_ns); - net_free(net); + net_passive_dec(net); } cleanup_net_task = NULL; } -- 2.39.5 (Apple Git-154) ^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). 2025-02-17 19:11 [PATCH v5 net 0/3] net: Fix race of rtnl_net_lock(dev_net(dev)) Kuniyuki Iwashima 2025-02-17 19:11 ` [PATCH v5 net 1/3] net: Add net_passive_inc() and net_passive_dec() Kuniyuki Iwashima @ 2025-02-17 19:11 ` Kuniyuki Iwashima 2025-03-26 13:46 ` Yael Chemla 2025-02-17 19:11 ` [PATCH v5 net 3/3] dev: Use rtnl_net_dev_lock() in unregister_netdev() Kuniyuki Iwashima 2025-02-19 2:50 ` [PATCH v5 net 0/3] net: Fix race of rtnl_net_lock(dev_net(dev)) patchwork-bot+netdevbpf 3 siblings, 1 reply; 13+ messages in thread From: Kuniyuki Iwashima @ 2025-02-17 19:11 UTC (permalink / raw) To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Yael Chemla After the cited commit, dev_net(dev) is fetched before holding RTNL and passed to __unregister_netdevice_notifier_net(). However, dev_net(dev) might be different after holding RTNL. In the reported case [0], while removing a VF device, its netns was being dismantled and the VF was moved to init_net. So the following sequence is basically illegal when dev was fetched without lookup: net = dev_net(dev); rtnl_net_lock(net); Let's use a new helper rtnl_net_dev_lock() to fix the race. It fetches dev_net_rcu(dev), bumps its net->passive, and checks if dev_net_rcu(dev) is changed after rtnl_net_lock(). [0]: BUG: KASAN: slab-use-after-free in notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) Read of size 8 at addr ffff88810cefb4c8 by task test-bridge-lag/21127 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:123) print_report (mm/kasan/report.c:379 mm/kasan/report.c:489) kasan_report (mm/kasan/report.c:604) notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) call_netdevice_notifiers_info (net/core/dev.c:2011) unregister_netdevice_many_notify (net/core/dev.c:11551) unregister_netdevice_queue (net/core/dev.c:11487) unregister_netdev (net/core/dev.c:11635) mlx5e_remove (drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6552 drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6579) mlx5_core auxiliary_bus_remove (drivers/base/auxiliary.c:230) device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) bus_remove_device (./include/linux/kobject.h:193 drivers/base/base.h:73 drivers/base/bus.c:583) device_del (drivers/base/power/power.h:142 drivers/base/core.c:3855) mlx5_rescan_drivers_locked (./include/linux/auxiliary_bus.h:241 drivers/net/ethernet/mellanox/mlx5/core/dev.c:333 drivers/net/ethernet/mellanox/mlx5/core/dev.c:535 drivers/net/ethernet/mellanox/mlx5/core/dev.c:549) mlx5_core mlx5_unregister_device (drivers/net/ethernet/mellanox/mlx5/core/dev.c:468) mlx5_core mlx5_uninit_one (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 drivers/net/ethernet/mellanox/mlx5/core/main.c:1563) mlx5_core remove_one (drivers/net/ethernet/mellanox/mlx5/core/main.c:965 drivers/net/ethernet/mellanox/mlx5/core/main.c:2019) mlx5_core pci_device_remove (./include/linux/pm_runtime.h:129 drivers/pci/pci-driver.c:475) device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) unbind_store (drivers/base/bus.c:245) kernfs_fop_write_iter (fs/kernfs/file.c:338) vfs_write (fs/read_write.c:587 (discriminator 1) fs/read_write.c:679 (discriminator 1)) ksys_write (fs/read_write.c:732) do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) RIP: 0033:0x7f6a4d5018b7 Fixes: 7fb1073300a2 ("net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().") Reported-by: Yael Chemla <ychemla@nvidia.com> Closes: https://lore.kernel.org/netdev/146eabfe-123c-4970-901e-e961b4c09bc3@nvidia.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> --- v5: * Use do-while loop v4: * Fix build failure when !CONFIG_NET_NS * Use net_passive_dec() v3: * Bump net->passive instead of maybe_get_net() * Remove msleep(1) loop * Use rcu_access_pointer() instead of rcu_read_lock(). v2: * Use dev_net_rcu(). * Use msleep(1) instead of cond_resched() after maybe_get_net() * Remove cond_resched() after net_eq() check v1: https://lore.kernel.org/netdev/20250130232435.43622-2-kuniyu@amazon.com/ --- net/core/dev.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 44 insertions(+), 4 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index b91658e8aedb..19e268568282 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2070,6 +2070,42 @@ static void __move_netdevice_notifier_net(struct net *src_net, __register_netdevice_notifier_net(dst_net, nb, true); } +static void rtnl_net_dev_lock(struct net_device *dev) +{ + bool again; + + do { + struct net *net; + + again = false; + + /* netns might be being dismantled. */ + rcu_read_lock(); + net = dev_net_rcu(dev); + net_passive_inc(net); + rcu_read_unlock(); + + rtnl_net_lock(net); + +#ifdef CONFIG_NET_NS + /* dev might have been moved to another netns. */ + if (!net_eq(net, rcu_access_pointer(dev->nd_net.net))) { + rtnl_net_unlock(net); + net_passive_dec(net); + again = true; + } +#endif + } while (again); +} + +static void rtnl_net_dev_unlock(struct net_device *dev) +{ + struct net *net = dev_net(dev); + + rtnl_net_unlock(net); + net_passive_dec(net); +} + int register_netdevice_notifier_dev_net(struct net_device *dev, struct notifier_block *nb, struct netdev_net_notifier *nn) @@ -2077,6 +2113,11 @@ int register_netdevice_notifier_dev_net(struct net_device *dev, struct net *net = dev_net(dev); int err; + /* rtnl_net_lock() assumes dev is not yet published by + * register_netdevice(). + */ + DEBUG_NET_WARN_ON_ONCE(!list_empty(&dev->dev_list)); + rtnl_net_lock(net); err = __register_netdevice_notifier_net(net, nb, false); if (!err) { @@ -2093,13 +2134,12 @@ int unregister_netdevice_notifier_dev_net(struct net_device *dev, struct notifier_block *nb, struct netdev_net_notifier *nn) { - struct net *net = dev_net(dev); int err; - rtnl_net_lock(net); + rtnl_net_dev_lock(dev); list_del(&nn->list); - err = __unregister_netdevice_notifier_net(net, nb); - rtnl_net_unlock(net); + err = __unregister_netdevice_notifier_net(dev_net(dev), nb); + rtnl_net_dev_unlock(dev); return err; } -- 2.39.5 (Apple Git-154) ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). 2025-02-17 19:11 ` [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net() Kuniyuki Iwashima @ 2025-03-26 13:46 ` Yael Chemla 2025-03-26 21:54 ` Kuniyuki Iwashima 0 siblings, 1 reply; 13+ messages in thread From: Yael Chemla @ 2025-03-26 13:46 UTC (permalink / raw) To: Kuniyuki Iwashima, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman Cc: Kuniyuki Iwashima, netdev On 17/02/2025 21:11, Kuniyuki Iwashima wrote: > After the cited commit, dev_net(dev) is fetched before holding RTNL > and passed to __unregister_netdevice_notifier_net(). > > However, dev_net(dev) might be different after holding RTNL. > > In the reported case [0], while removing a VF device, its netns was > being dismantled and the VF was moved to init_net. > > So the following sequence is basically illegal when dev was fetched > without lookup: > > net = dev_net(dev); > rtnl_net_lock(net); > > Let's use a new helper rtnl_net_dev_lock() to fix the race. > > It fetches dev_net_rcu(dev), bumps its net->passive, and checks if > dev_net_rcu(dev) is changed after rtnl_net_lock(). > > [0]: > BUG: KASAN: slab-use-after-free in notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) > Read of size 8 at addr ffff88810cefb4c8 by task test-bridge-lag/21127 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 > Call Trace: > <TASK> > dump_stack_lvl (lib/dump_stack.c:123) > print_report (mm/kasan/report.c:379 mm/kasan/report.c:489) > kasan_report (mm/kasan/report.c:604) > notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) > call_netdevice_notifiers_info (net/core/dev.c:2011) > unregister_netdevice_many_notify (net/core/dev.c:11551) > unregister_netdevice_queue (net/core/dev.c:11487) > unregister_netdev (net/core/dev.c:11635) > mlx5e_remove (drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6552 drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6579) mlx5_core > auxiliary_bus_remove (drivers/base/auxiliary.c:230) > device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) > bus_remove_device (./include/linux/kobject.h:193 drivers/base/base.h:73 drivers/base/bus.c:583) > device_del (drivers/base/power/power.h:142 drivers/base/core.c:3855) > mlx5_rescan_drivers_locked (./include/linux/auxiliary_bus.h:241 drivers/net/ethernet/mellanox/mlx5/core/dev.c:333 drivers/net/ethernet/mellanox/mlx5/core/dev.c:535 drivers/net/ethernet/mellanox/mlx5/core/dev.c:549) mlx5_core > mlx5_unregister_device (drivers/net/ethernet/mellanox/mlx5/core/dev.c:468) mlx5_core > mlx5_uninit_one (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 drivers/net/ethernet/mellanox/mlx5/core/main.c:1563) mlx5_core > remove_one (drivers/net/ethernet/mellanox/mlx5/core/main.c:965 drivers/net/ethernet/mellanox/mlx5/core/main.c:2019) mlx5_core > pci_device_remove (./include/linux/pm_runtime.h:129 drivers/pci/pci-driver.c:475) > device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) > unbind_store (drivers/base/bus.c:245) > kernfs_fop_write_iter (fs/kernfs/file.c:338) > vfs_write (fs/read_write.c:587 (discriminator 1) fs/read_write.c:679 (discriminator 1)) > ksys_write (fs/read_write.c:732) > do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) > entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) > RIP: 0033:0x7f6a4d5018b7 > > Fixes: 7fb1073300a2 ("net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().") > Reported-by: Yael Chemla <ychemla@nvidia.com> > Closes: https://lore.kernel.org/netdev/146eabfe-123c-4970-901e-e961b4c09bc3@nvidia.com/ > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> > Reviewed-by: Eric Dumazet <edumazet@google.com> > --- > v5: > * Use do-while loop > > v4: > * Fix build failure when !CONFIG_NET_NS > * Use net_passive_dec() > > v3: > * Bump net->passive instead of maybe_get_net() > * Remove msleep(1) loop > * Use rcu_access_pointer() instead of rcu_read_lock(). > > v2: > * Use dev_net_rcu(). > * Use msleep(1) instead of cond_resched() after maybe_get_net() > * Remove cond_resched() after net_eq() check > > v1: https://lore.kernel.org/netdev/20250130232435.43622-2-kuniyu@amazon.com/ > --- > net/core/dev.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---- > 1 file changed, 44 insertions(+), 4 deletions(-) > > diff --git a/net/core/dev.c b/net/core/dev.c > index b91658e8aedb..19e268568282 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -2070,6 +2070,42 @@ static void __move_netdevice_notifier_net(struct net *src_net, > __register_netdevice_notifier_net(dst_net, nb, true); > } > > +static void rtnl_net_dev_lock(struct net_device *dev) > +{ > + bool again; > + > + do { > + struct net *net; > + > + again = false; > + > + /* netns might be being dismantled. */ > + rcu_read_lock(); > + net = dev_net_rcu(dev); > + net_passive_inc(net); Hi Kuniyuki, It seems we are still encountering the previously reorted issue, even when running with your latest fix. However, the problem has become less frequent, now requiring multiple test iterations to reproduce. 1) we identified the following warnings (each accompanied by a call trace; only one is detailed below, though others are similar): refcount_t: addition on 0; use-after-free. WARNING: CPU: 6 PID: 1105 at lib/refcount.c:25 refcount_warn_saturate (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) and also refcount_t: underflow; use-after-free. WARNING: CPU: 6 PID: 1105 at lib/refcount.c:28 refcount_warn_saturate (/usr/work/linux/lib/refcount.c:28 (discriminator 1)) 2) test scenario: sets up a network topology of two VFs on different eSwitch, performs ping tests between them, verifies traffic rules offloading, and cleans up the environment afterward. 3) the warning is triggered upon reaching the unregister_netdevice_notifier_dev_net when both net->ns.count and net->passive reference counts are zero. Call Trace: <TASK> ? __warn (/usr/work/linux/kernel/panic.c:748) ? refcount_warn_saturate (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) ? report_bug (/usr/work/linux/lib/bug.c:180 /usr/work/linux/lib/bug.c:219) ? handle_bug (/usr/work/linux/arch/x86/kernel/traps.c:285) ? exc_invalid_op (/usr/work/linux/arch/x86/kernel/traps.c:309 (discriminator 1)) ? asm_exc_invalid_op (/usr/work/linux/./arch/x86/include/asm/idtentry.h:574) ? refcount_warn_saturate (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) ? refcount_warn_saturate (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) rtnl_net_dev_lock (/usr/work/linux/./include/linux/refcount.h:190 /usr/work/linux/./include/linux/refcount.h:241 /usr/work/linux/./include/linux/refcount.h:258 /usr/work/linux/./include/net/net_namespace.h:339 /usr/work/linux/net/core/dev.c:2076) unregister_netdevice_notifier_dev_net (/usr/work/linux/./include/linux/list.h:218 /usr/work/linux/./include/linux/list.h:229 /usr/work/linux/net/core/dev.c:2125) mlx5e_tc_nic_cleanup (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:5352) mlx5_core mlx5e_cleanup_nic_rx (/usr/work/linux/./drivers/net/ethernet/mellanox/mlx5/core/en/fs.h:161 /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:5765) mlx5_core mlx5e_detach_netdev (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6231) mlx5_core _mlx5e_suspend (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6399) mlx5_core mlx5e_remove (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6526 /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6553) mlx5_core auxiliary_bus_remove (/usr/work/linux/drivers/base/auxiliary.c:230) device_release_driver_internal (/usr/work/linux/drivers/base/dd.c:1275 /usr/work/linux/drivers/base/dd.c:1296) bus_remove_device (/usr/work/linux/./include/linux/kobject.h:193 /usr/work/linux/drivers/base/base.h:73 /usr/work/linux/drivers/base/bus.c:586) device_del (/usr/work/linux/drivers/base/power/power.h:142 /usr/work/linux/drivers/base/core.c:3856) ? devl_param_driverinit_value_get (/usr/work/linux/net/devlink/param.c:778) mlx5_rescan_drivers_locked (/usr/work/linux/./include/linux/auxiliary_bus.h:242 /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/dev.c:333 /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/dev.c:535 /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/dev.c:549) mlx5_core mlx5_unregister_device (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/dev.c:468) mlx5_core mlx5_uninit_one (/usr/work/linux/./arch/x86/include/asm/bitops.h:206 /usr/work/linux/./arch/x86/include/asm/bitops.h:238 /usr/work/linux/./include/asm-generic/bitops/instrumented-non-atomic.h:142 /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/main.c:1576) mlx5_core remove_one (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/main.c:969 /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/main.c:2034) mlx5_core pci_device_remove (/usr/work/linux/./arch/x86/include/asm/atomic.h:23 /usr/work/linux/./include/linux/atomic/atomic-arch-fallback.h:457 /usr/work/linux/./include/linux/atomic/atomic-arch-fallback.h:2426 /usr/work/linux/./include/linux/atomic/atomic-arch-fallback.h:2456 /usr/work/linux/./include/linux/atomic/atomic-instrumented.h:1518 /usr/work/linux/./include/linux/pm_runtime.h:129 /usr/work/linux/drivers/pci/pci-driver.c:475) device_release_driver_internal (/usr/work/linux/drivers/base/dd.c:1275 /usr/work/linux/drivers/base/dd.c:1296) pci_stop_bus_device (/usr/work/linux/drivers/pci/remove.c:46 /usr/work/linux/drivers/pci/remove.c:106) pci_stop_and_remove_bus_device (/usr/work/linux/drivers/pci/remove.c:141) pci_iov_remove_virtfn (/usr/work/linux/drivers/pci/iov.c:377) sriov_disable (/usr/work/linux/drivers/pci/iov.c:712 (discriminator 1) /usr/work/linux/drivers/pci/iov.c:723 (discriminator 1)) mlx5_sriov_disable (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/sriov.c:208) mlx5_core mlx5_core_sriov_configure (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/sriov.c:229) mlx5_core sriov_numvfs_store (/usr/work/linux/./include/linux/device.h:1045 /usr/work/linux/drivers/pci/iov.c:471) kernfs_fop_write_iter (/usr/work/linux/fs/kernfs/file.c:338) vfs_write (/usr/work/linux/fs/read_write.c:587 (discriminator 1) /usr/work/linux/fs/read_write.c:679 (discriminator 1)) ksys_write (/usr/work/linux/fs/read_write.c:732) do_syscall_64 (/usr/work/linux/arch/x86/entry/common.c:52 (discriminator 1) /usr/work/linux/arch/x86/entry/common.c:83 (discriminator 1)) entry_SYSCALL_64_after_hwframe (/usr/work/linux/arch/x86/entry/entry_64.S:130) RIP: 0033:0x7fbe3e9018b7 let me know if you need more information Yael > + rcu_read_unlock(); > + > + rtnl_net_lock(net); > + > +#ifdef CONFIG_NET_NS > + /* dev might have been moved to another netns. */ > + if (!net_eq(net, rcu_access_pointer(dev->nd_net.net))) { > + rtnl_net_unlock(net); > + net_passive_dec(net); > + again = true; > + } > +#endif > + } while (again); > +} > + > +static void rtnl_net_dev_unlock(struct net_device *dev) > +{ > + struct net *net = dev_net(dev); > + > + rtnl_net_unlock(net); > + net_passive_dec(net); > +} > + > int register_netdevice_notifier_dev_net(struct net_device *dev, > struct notifier_block *nb, > struct netdev_net_notifier *nn) > @@ -2077,6 +2113,11 @@ int register_netdevice_notifier_dev_net(struct net_device *dev, > struct net *net = dev_net(dev); > int err; > > + /* rtnl_net_lock() assumes dev is not yet published by > + * register_netdevice(). > + */ > + DEBUG_NET_WARN_ON_ONCE(!list_empty(&dev->dev_list)); > + > rtnl_net_lock(net); > err = __register_netdevice_notifier_net(net, nb, false); > if (!err) { > @@ -2093,13 +2134,12 @@ int unregister_netdevice_notifier_dev_net(struct net_device *dev, > struct notifier_block *nb, > struct netdev_net_notifier *nn) > { > - struct net *net = dev_net(dev); > int err; > > - rtnl_net_lock(net); > + rtnl_net_dev_lock(dev); > list_del(&nn->list); > - err = __unregister_netdevice_notifier_net(net, nb); > - rtnl_net_unlock(net); > + err = __unregister_netdevice_notifier_net(dev_net(dev), nb); > + rtnl_net_dev_unlock(dev); > > return err; > } ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). 2025-03-26 13:46 ` Yael Chemla @ 2025-03-26 21:54 ` Kuniyuki Iwashima 2025-03-26 22:10 ` Eric Dumazet 0 siblings, 1 reply; 13+ messages in thread From: Kuniyuki Iwashima @ 2025-03-26 21:54 UTC (permalink / raw) To: ychemla; +Cc: davem, edumazet, horms, kuba, kuni1840, kuniyu, netdev, pabeni From: Yael Chemla <ychemla@nvidia.com> Date: Wed, 26 Mar 2025 15:46:40 +0200 > On 17/02/2025 21:11, Kuniyuki Iwashima wrote: > > After the cited commit, dev_net(dev) is fetched before holding RTNL > > and passed to __unregister_netdevice_notifier_net(). > > > > However, dev_net(dev) might be different after holding RTNL. > > > > In the reported case [0], while removing a VF device, its netns was > > being dismantled and the VF was moved to init_net. > > > > So the following sequence is basically illegal when dev was fetched > > without lookup: > > > > net = dev_net(dev); > > rtnl_net_lock(net); > > > > Let's use a new helper rtnl_net_dev_lock() to fix the race. > > > > It fetches dev_net_rcu(dev), bumps its net->passive, and checks if > > dev_net_rcu(dev) is changed after rtnl_net_lock(). > > > > [0]: > > BUG: KASAN: slab-use-after-free in notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) > > Read of size 8 at addr ffff88810cefb4c8 by task test-bridge-lag/21127 > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 > > Call Trace: > > <TASK> > > dump_stack_lvl (lib/dump_stack.c:123) > > print_report (mm/kasan/report.c:379 mm/kasan/report.c:489) > > kasan_report (mm/kasan/report.c:604) > > notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) > > call_netdevice_notifiers_info (net/core/dev.c:2011) > > unregister_netdevice_many_notify (net/core/dev.c:11551) > > unregister_netdevice_queue (net/core/dev.c:11487) > > unregister_netdev (net/core/dev.c:11635) > > mlx5e_remove (drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6552 drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6579) mlx5_core > > auxiliary_bus_remove (drivers/base/auxiliary.c:230) > > device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) > > bus_remove_device (./include/linux/kobject.h:193 drivers/base/base.h:73 drivers/base/bus.c:583) > > device_del (drivers/base/power/power.h:142 drivers/base/core.c:3855) > > mlx5_rescan_drivers_locked (./include/linux/auxiliary_bus.h:241 drivers/net/ethernet/mellanox/mlx5/core/dev.c:333 drivers/net/ethernet/mellanox/mlx5/core/dev.c:535 drivers/net/ethernet/mellanox/mlx5/core/dev.c:549) mlx5_core > > mlx5_unregister_device (drivers/net/ethernet/mellanox/mlx5/core/dev.c:468) mlx5_core > > mlx5_uninit_one (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 drivers/net/ethernet/mellanox/mlx5/core/main.c:1563) mlx5_core > > remove_one (drivers/net/ethernet/mellanox/mlx5/core/main.c:965 drivers/net/ethernet/mellanox/mlx5/core/main.c:2019) mlx5_core > > pci_device_remove (./include/linux/pm_runtime.h:129 drivers/pci/pci-driver.c:475) > > device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) > > unbind_store (drivers/base/bus.c:245) > > kernfs_fop_write_iter (fs/kernfs/file.c:338) > > vfs_write (fs/read_write.c:587 (discriminator 1) fs/read_write.c:679 (discriminator 1)) > > ksys_write (fs/read_write.c:732) > > do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) > > entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) > > RIP: 0033:0x7f6a4d5018b7 > > > > Fixes: 7fb1073300a2 ("net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().") > > Reported-by: Yael Chemla <ychemla@nvidia.com> > > Closes: https://lore.kernel.org/netdev/146eabfe-123c-4970-901e-e961b4c09bc3@nvidia.com/ > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> > > Reviewed-by: Eric Dumazet <edumazet@google.com> > > --- > > v5: > > * Use do-while loop > > > > v4: > > * Fix build failure when !CONFIG_NET_NS > > * Use net_passive_dec() > > > > v3: > > * Bump net->passive instead of maybe_get_net() > > * Remove msleep(1) loop > > * Use rcu_access_pointer() instead of rcu_read_lock(). > > > > v2: > > * Use dev_net_rcu(). > > * Use msleep(1) instead of cond_resched() after maybe_get_net() > > * Remove cond_resched() after net_eq() check > > > > v1: https://lore.kernel.org/netdev/20250130232435.43622-2-kuniyu@amazon.com/ > > --- > > net/core/dev.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---- > > 1 file changed, 44 insertions(+), 4 deletions(-) > > > > diff --git a/net/core/dev.c b/net/core/dev.c > > index b91658e8aedb..19e268568282 100644 > > --- a/net/core/dev.c > > +++ b/net/core/dev.c > > @@ -2070,6 +2070,42 @@ static void __move_netdevice_notifier_net(struct net *src_net, > > __register_netdevice_notifier_net(dst_net, nb, true); > > } > > > > +static void rtnl_net_dev_lock(struct net_device *dev) > > +{ > > + bool again; > > + > > + do { > > + struct net *net; > > + > > + again = false; > > + > > + /* netns might be being dismantled. */ > > + rcu_read_lock(); > > + net = dev_net_rcu(dev); > > + net_passive_inc(net); > > Hi Kuniyuki, > It seems we are still encountering the previously reorted issue, > even when running with your latest fix. However, the problem has become > less frequent, now requiring multiple test iterations to reproduce. Thanks for reporting! > > 1) we identified the following warnings (each accompanied by a call > trace; only one is detailed below, though others are similar): > > refcount_t: addition on 0; use-after-free. > WARNING: CPU: 6 PID: 1105 at lib/refcount.c:25 refcount_warn_saturate > (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) > > and also > > refcount_t: underflow; use-after-free. > WARNING: CPU: 6 PID: 1105 at lib/refcount.c:28 refcount_warn_saturate > (/usr/work/linux/lib/refcount.c:28 (discriminator 1)) > > > 2) test scenario: > sets up a network topology of two VFs on different eSwitch, performs > ping tests between them, verifies traffic rules offloading, and cleans > up the environment afterward. > > 3) the warning is triggered upon reaching the > unregister_netdevice_notifier_dev_net when both net->ns.count and > net->passive reference counts are zero. It looks unlikely but I missed there is still a race window. If dev_net_rcu() is called between synchronize_net() and dev_net_set() in netif_change_net_namespace(), there might be no synchronize_rcu() called after that and net_passive_dec() could be called in cleanup_net() earlier than net_passive_inc() ... ? Could you test this ? ---8<--- diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h index bd57d8fb54f1..c275e95c83ab 100644 --- a/include/net/net_namespace.h +++ b/include/net/net_namespace.h @@ -337,9 +337,9 @@ static inline void net_passive_dec(struct net *net) } #endif -static inline void net_passive_inc(struct net *net) +static inline bool net_passive_inc(struct net *net) { - refcount_inc(&net->passive); + return refcount_inc_not_zero(&net->passive); } /* Returns true if the netns initialization is completed successfully */ diff --git a/net/core/dev.c b/net/core/dev.c index b597cc27a115..baff9568a34f 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2087,7 +2087,11 @@ static void rtnl_net_dev_lock(struct net_device *dev) /* netns might be being dismantled. */ rcu_read_lock(); net = dev_net_rcu(dev); - net_passive_inc(net); + if (!net_passive_inc(net)) { + rcu_read_unlock(); + msleep(1); + continue; + } rcu_read_unlock(); rtnl_net_lock(net); ---8<--- ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). 2025-03-26 21:54 ` Kuniyuki Iwashima @ 2025-03-26 22:10 ` Eric Dumazet 2025-03-26 22:32 ` Kuniyuki Iwashima 0 siblings, 1 reply; 13+ messages in thread From: Eric Dumazet @ 2025-03-26 22:10 UTC (permalink / raw) To: Kuniyuki Iwashima; +Cc: ychemla, davem, horms, kuba, kuni1840, netdev, pabeni On Wed, Mar 26, 2025 at 10:55 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote: > > From: Yael Chemla <ychemla@nvidia.com> > Date: Wed, 26 Mar 2025 15:46:40 +0200 > > On 17/02/2025 21:11, Kuniyuki Iwashima wrote: > > > After the cited commit, dev_net(dev) is fetched before holding RTNL > > > and passed to __unregister_netdevice_notifier_net(). > > > > > > However, dev_net(dev) might be different after holding RTNL. > > > > > > In the reported case [0], while removing a VF device, its netns was > > > being dismantled and the VF was moved to init_net. > > > > > > So the following sequence is basically illegal when dev was fetched > > > without lookup: > > > > > > net = dev_net(dev); > > > rtnl_net_lock(net); > > > > > > Let's use a new helper rtnl_net_dev_lock() to fix the race. > > > > > > It fetches dev_net_rcu(dev), bumps its net->passive, and checks if > > > dev_net_rcu(dev) is changed after rtnl_net_lock(). > > > > > > [0]: > > > BUG: KASAN: slab-use-after-free in notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) > > > Read of size 8 at addr ffff88810cefb4c8 by task test-bridge-lag/21127 > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 > > > Call Trace: > > > <TASK> > > > dump_stack_lvl (lib/dump_stack.c:123) > > > print_report (mm/kasan/report.c:379 mm/kasan/report.c:489) > > > kasan_report (mm/kasan/report.c:604) > > > notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) > > > call_netdevice_notifiers_info (net/core/dev.c:2011) > > > unregister_netdevice_many_notify (net/core/dev.c:11551) > > > unregister_netdevice_queue (net/core/dev.c:11487) > > > unregister_netdev (net/core/dev.c:11635) > > > mlx5e_remove (drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6552 drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6579) mlx5_core > > > auxiliary_bus_remove (drivers/base/auxiliary.c:230) > > > device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) > > > bus_remove_device (./include/linux/kobject.h:193 drivers/base/base.h:73 drivers/base/bus.c:583) > > > device_del (drivers/base/power/power.h:142 drivers/base/core.c:3855) > > > mlx5_rescan_drivers_locked (./include/linux/auxiliary_bus.h:241 drivers/net/ethernet/mellanox/mlx5/core/dev.c:333 drivers/net/ethernet/mellanox/mlx5/core/dev.c:535 drivers/net/ethernet/mellanox/mlx5/core/dev.c:549) mlx5_core > > > mlx5_unregister_device (drivers/net/ethernet/mellanox/mlx5/core/dev.c:468) mlx5_core > > > mlx5_uninit_one (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 drivers/net/ethernet/mellanox/mlx5/core/main.c:1563) mlx5_core > > > remove_one (drivers/net/ethernet/mellanox/mlx5/core/main.c:965 drivers/net/ethernet/mellanox/mlx5/core/main.c:2019) mlx5_core > > > pci_device_remove (./include/linux/pm_runtime.h:129 drivers/pci/pci-driver.c:475) > > > device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) > > > unbind_store (drivers/base/bus.c:245) > > > kernfs_fop_write_iter (fs/kernfs/file.c:338) > > > vfs_write (fs/read_write.c:587 (discriminator 1) fs/read_write.c:679 (discriminator 1)) > > > ksys_write (fs/read_write.c:732) > > > do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) > > > entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) > > > RIP: 0033:0x7f6a4d5018b7 > > > > > > Fixes: 7fb1073300a2 ("net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().") > > > Reported-by: Yael Chemla <ychemla@nvidia.com> > > > Closes: https://lore.kernel.org/netdev/146eabfe-123c-4970-901e-e961b4c09bc3@nvidia.com/ > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> > > > Reviewed-by: Eric Dumazet <edumazet@google.com> > > > --- > > > v5: > > > * Use do-while loop > > > > > > v4: > > > * Fix build failure when !CONFIG_NET_NS > > > * Use net_passive_dec() > > > > > > v3: > > > * Bump net->passive instead of maybe_get_net() > > > * Remove msleep(1) loop > > > * Use rcu_access_pointer() instead of rcu_read_lock(). > > > > > > v2: > > > * Use dev_net_rcu(). > > > * Use msleep(1) instead of cond_resched() after maybe_get_net() > > > * Remove cond_resched() after net_eq() check > > > > > > v1: https://lore.kernel.org/netdev/20250130232435.43622-2-kuniyu@amazon.com/ > > > --- > > > net/core/dev.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---- > > > 1 file changed, 44 insertions(+), 4 deletions(-) > > > > > > diff --git a/net/core/dev.c b/net/core/dev.c > > > index b91658e8aedb..19e268568282 100644 > > > --- a/net/core/dev.c > > > +++ b/net/core/dev.c > > > @@ -2070,6 +2070,42 @@ static void __move_netdevice_notifier_net(struct net *src_net, > > > __register_netdevice_notifier_net(dst_net, nb, true); > > > } > > > > > > +static void rtnl_net_dev_lock(struct net_device *dev) > > > +{ > > > + bool again; > > > + > > > + do { > > > + struct net *net; > > > + > > > + again = false; > > > + > > > + /* netns might be being dismantled. */ > > > + rcu_read_lock(); > > > + net = dev_net_rcu(dev); > > > + net_passive_inc(net); > > > > Hi Kuniyuki, > > It seems we are still encountering the previously reorted issue, > > even when running with your latest fix. However, the problem has become > > less frequent, now requiring multiple test iterations to reproduce. > > Thanks for reporting! > > > > > > 1) we identified the following warnings (each accompanied by a call > > trace; only one is detailed below, though others are similar): > > > > refcount_t: addition on 0; use-after-free. > > WARNING: CPU: 6 PID: 1105 at lib/refcount.c:25 refcount_warn_saturate > > (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) > > > > and also > > > > refcount_t: underflow; use-after-free. > > WARNING: CPU: 6 PID: 1105 at lib/refcount.c:28 refcount_warn_saturate > > (/usr/work/linux/lib/refcount.c:28 (discriminator 1)) > > > > > > 2) test scenario: > > sets up a network topology of two VFs on different eSwitch, performs > > ping tests between them, verifies traffic rules offloading, and cleans > > up the environment afterward. > > > > 3) the warning is triggered upon reaching the > > unregister_netdevice_notifier_dev_net when both net->ns.count and > > net->passive reference counts are zero. > > It looks unlikely but I missed there is still a race window. > > If dev_net_rcu() is called between synchronize_net() and dev_net_set() > in netif_change_net_namespace(), there might be no synchronize_rcu() > called after that and net_passive_dec() could be called in cleanup_net() > earlier than net_passive_inc() ... ? > > Could you test this ? > > ---8<--- > diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h > index bd57d8fb54f1..c275e95c83ab 100644 > --- a/include/net/net_namespace.h > +++ b/include/net/net_namespace.h > @@ -337,9 +337,9 @@ static inline void net_passive_dec(struct net *net) > } > #endif > > -static inline void net_passive_inc(struct net *net) > +static inline bool net_passive_inc(struct net *net) > { > - refcount_inc(&net->passive); > + return refcount_inc_not_zero(&net->passive); Hmm We want the two oher net_passive_inc() callers to have refcount safety. Please add a new net_passive_inc_not_zero() ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). 2025-03-26 22:10 ` Eric Dumazet @ 2025-03-26 22:32 ` Kuniyuki Iwashima 2025-04-01 20:49 ` Yael Chemla 0 siblings, 1 reply; 13+ messages in thread From: Kuniyuki Iwashima @ 2025-03-26 22:32 UTC (permalink / raw) To: edumazet; +Cc: davem, horms, kuba, kuni1840, kuniyu, netdev, pabeni, ychemla From: Eric Dumazet <edumazet@google.com> Date: Wed, 26 Mar 2025 23:10:04 +0100 > On Wed, Mar 26, 2025 at 10:55 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote: > > > > From: Yael Chemla <ychemla@nvidia.com> > > Date: Wed, 26 Mar 2025 15:46:40 +0200 > > > On 17/02/2025 21:11, Kuniyuki Iwashima wrote: > > > > After the cited commit, dev_net(dev) is fetched before holding RTNL > > > > and passed to __unregister_netdevice_notifier_net(). > > > > > > > > However, dev_net(dev) might be different after holding RTNL. > > > > > > > > In the reported case [0], while removing a VF device, its netns was > > > > being dismantled and the VF was moved to init_net. > > > > > > > > So the following sequence is basically illegal when dev was fetched > > > > without lookup: > > > > > > > > net = dev_net(dev); > > > > rtnl_net_lock(net); > > > > > > > > Let's use a new helper rtnl_net_dev_lock() to fix the race. > > > > > > > > It fetches dev_net_rcu(dev), bumps its net->passive, and checks if > > > > dev_net_rcu(dev) is changed after rtnl_net_lock(). > > > > > > > > [0]: > > > > BUG: KASAN: slab-use-after-free in notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) > > > > Read of size 8 at addr ffff88810cefb4c8 by task test-bridge-lag/21127 > > > > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 > > > > Call Trace: > > > > <TASK> > > > > dump_stack_lvl (lib/dump_stack.c:123) > > > > print_report (mm/kasan/report.c:379 mm/kasan/report.c:489) > > > > kasan_report (mm/kasan/report.c:604) > > > > notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) > > > > call_netdevice_notifiers_info (net/core/dev.c:2011) > > > > unregister_netdevice_many_notify (net/core/dev.c:11551) > > > > unregister_netdevice_queue (net/core/dev.c:11487) > > > > unregister_netdev (net/core/dev.c:11635) > > > > mlx5e_remove (drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6552 drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6579) mlx5_core > > > > auxiliary_bus_remove (drivers/base/auxiliary.c:230) > > > > device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) > > > > bus_remove_device (./include/linux/kobject.h:193 drivers/base/base.h:73 drivers/base/bus.c:583) > > > > device_del (drivers/base/power/power.h:142 drivers/base/core.c:3855) > > > > mlx5_rescan_drivers_locked (./include/linux/auxiliary_bus.h:241 drivers/net/ethernet/mellanox/mlx5/core/dev.c:333 drivers/net/ethernet/mellanox/mlx5/core/dev.c:535 drivers/net/ethernet/mellanox/mlx5/core/dev.c:549) mlx5_core > > > > mlx5_unregister_device (drivers/net/ethernet/mellanox/mlx5/core/dev.c:468) mlx5_core > > > > mlx5_uninit_one (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 drivers/net/ethernet/mellanox/mlx5/core/main.c:1563) mlx5_core > > > > remove_one (drivers/net/ethernet/mellanox/mlx5/core/main.c:965 drivers/net/ethernet/mellanox/mlx5/core/main.c:2019) mlx5_core > > > > pci_device_remove (./include/linux/pm_runtime.h:129 drivers/pci/pci-driver.c:475) > > > > device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) > > > > unbind_store (drivers/base/bus.c:245) > > > > kernfs_fop_write_iter (fs/kernfs/file.c:338) > > > > vfs_write (fs/read_write.c:587 (discriminator 1) fs/read_write.c:679 (discriminator 1)) > > > > ksys_write (fs/read_write.c:732) > > > > do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) > > > > entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) > > > > RIP: 0033:0x7f6a4d5018b7 > > > > > > > > Fixes: 7fb1073300a2 ("net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().") > > > > Reported-by: Yael Chemla <ychemla@nvidia.com> > > > > Closes: https://lore.kernel.org/netdev/146eabfe-123c-4970-901e-e961b4c09bc3@nvidia.com/ > > > > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> > > > > Reviewed-by: Eric Dumazet <edumazet@google.com> > > > > --- > > > > v5: > > > > * Use do-while loop > > > > > > > > v4: > > > > * Fix build failure when !CONFIG_NET_NS > > > > * Use net_passive_dec() > > > > > > > > v3: > > > > * Bump net->passive instead of maybe_get_net() > > > > * Remove msleep(1) loop > > > > * Use rcu_access_pointer() instead of rcu_read_lock(). > > > > > > > > v2: > > > > * Use dev_net_rcu(). > > > > * Use msleep(1) instead of cond_resched() after maybe_get_net() > > > > * Remove cond_resched() after net_eq() check > > > > > > > > v1: https://lore.kernel.org/netdev/20250130232435.43622-2-kuniyu@amazon.com/ > > > > --- > > > > net/core/dev.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---- > > > > 1 file changed, 44 insertions(+), 4 deletions(-) > > > > > > > > diff --git a/net/core/dev.c b/net/core/dev.c > > > > index b91658e8aedb..19e268568282 100644 > > > > --- a/net/core/dev.c > > > > +++ b/net/core/dev.c > > > > @@ -2070,6 +2070,42 @@ static void __move_netdevice_notifier_net(struct net *src_net, > > > > __register_netdevice_notifier_net(dst_net, nb, true); > > > > } > > > > > > > > +static void rtnl_net_dev_lock(struct net_device *dev) > > > > +{ > > > > + bool again; > > > > + > > > > + do { > > > > + struct net *net; > > > > + > > > > + again = false; > > > > + > > > > + /* netns might be being dismantled. */ > > > > + rcu_read_lock(); > > > > + net = dev_net_rcu(dev); > > > > + net_passive_inc(net); > > > > > > Hi Kuniyuki, > > > It seems we are still encountering the previously reorted issue, > > > even when running with your latest fix. However, the problem has become > > > less frequent, now requiring multiple test iterations to reproduce. > > > > Thanks for reporting! > > > > > > > > > > 1) we identified the following warnings (each accompanied by a call > > > trace; only one is detailed below, though others are similar): > > > > > > refcount_t: addition on 0; use-after-free. > > > WARNING: CPU: 6 PID: 1105 at lib/refcount.c:25 refcount_warn_saturate > > > (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) > > > > > > and also > > > > > > refcount_t: underflow; use-after-free. > > > WARNING: CPU: 6 PID: 1105 at lib/refcount.c:28 refcount_warn_saturate > > > (/usr/work/linux/lib/refcount.c:28 (discriminator 1)) > > > > > > > > > 2) test scenario: > > > sets up a network topology of two VFs on different eSwitch, performs > > > ping tests between them, verifies traffic rules offloading, and cleans > > > up the environment afterward. > > > > > > 3) the warning is triggered upon reaching the > > > unregister_netdevice_notifier_dev_net when both net->ns.count and > > > net->passive reference counts are zero. > > > > It looks unlikely but I missed there is still a race window. > > > > If dev_net_rcu() is called between synchronize_net() and dev_net_set() > > in netif_change_net_namespace(), there might be no synchronize_rcu() > > called after that and net_passive_dec() could be called in cleanup_net() > > earlier than net_passive_inc() ... ? > > > > Could you test this ? > > > > ---8<--- > > diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h > > index bd57d8fb54f1..c275e95c83ab 100644 > > --- a/include/net/net_namespace.h > > +++ b/include/net/net_namespace.h > > @@ -337,9 +337,9 @@ static inline void net_passive_dec(struct net *net) > > } > > #endif > > > > -static inline void net_passive_inc(struct net *net) > > +static inline bool net_passive_inc(struct net *net) > > { > > - refcount_inc(&net->passive); > > + return refcount_inc_not_zero(&net->passive); > > Hmm > > We want the two oher net_passive_inc() callers to have refcount safety. > > Please add a new net_passive_inc_not_zero() Sure, will do. Also, I'll remove the weird do-while {} and do everything under ifdef CONFIG_NET_NS guard. If it's disabled, just calling rtnl_net_lock() for &init_net is fine. ---8<--- diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h index bd57d8fb54f1..c91b13e6fb12 100644 --- a/include/net/net_namespace.h +++ b/include/net/net_namespace.h @@ -342,6 +342,11 @@ static inline void net_passive_inc(struct net *net) refcount_inc(&net->passive); } +static inline bool net_passive_inc_not_zero(struct net *net) +{ + return refcount_inc_not_zero(&net->passive); +} + /* Returns true if the netns initialization is completed successfully */ static inline bool net_initialized(const struct net *net) { diff --git a/net/core/dev.c b/net/core/dev.c index b597cc27a115..408659d238b8 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2077,38 +2077,43 @@ static void __move_netdevice_notifier_net(struct net *src_net, static void rtnl_net_dev_lock(struct net_device *dev) { - bool again; - - do { - struct net *net; - - again = false; +#ifdef CONFIG_NET_NS + struct net *net; - /* netns might be being dismantled. */ - rcu_read_lock(); - net = dev_net_rcu(dev); - net_passive_inc(net); +again: + /* netns might be being dismantled. */ + rcu_read_lock(); + net = dev_net_rcu(dev); + if (!net_passive_inc_not_zero(net)) { rcu_read_unlock(); + msleep(1); + goto again; + } + rcu_read_unlock(); - rtnl_net_lock(net); + rtnl_net_lock(net); -#ifdef CONFIG_NET_NS - /* dev might have been moved to another netns. */ - if (!net_eq(net, rcu_access_pointer(dev->nd_net.net))) { - rtnl_net_unlock(net); - net_passive_dec(net); - again = true; - } + /* dev might have been moved to another netns. */ + if (!net_eq(net, rcu_access_pointer(dev->nd_net.net))) { + rtnl_net_unlock(net); + net_passive_dec(net); + goto again; + } +#else + rtnl_net_lock(&init_net); #endif - } while (again); } static void rtnl_net_dev_unlock(struct net_device *dev) { +#ifdef CONFIG_NET_NS struct net *net = dev_net(dev); rtnl_net_unlock(net); net_passive_dec(net); +#else + rtnl_net_unlock(&init_net); +#endif } int register_netdevice_notifier_dev_net(struct net_device *dev, ---8<--- ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). 2025-03-26 22:32 ` Kuniyuki Iwashima @ 2025-04-01 20:49 ` Yael Chemla 2025-04-01 21:58 ` Kuniyuki Iwashima 0 siblings, 1 reply; 13+ messages in thread From: Yael Chemla @ 2025-04-01 20:49 UTC (permalink / raw) To: Kuniyuki Iwashima, edumazet; +Cc: davem, horms, kuba, kuni1840, netdev, pabeni On 27/03/2025 0:32, Kuniyuki Iwashima wrote: > From: Eric Dumazet <edumazet@google.com> > Date: Wed, 26 Mar 2025 23:10:04 +0100 >> On Wed, Mar 26, 2025 at 10:55 PM Kuniyuki Iwashima <kuniyu@amazon.com> wrote: >>> >>> From: Yael Chemla <ychemla@nvidia.com> >>> Date: Wed, 26 Mar 2025 15:46:40 +0200 >>>> On 17/02/2025 21:11, Kuniyuki Iwashima wrote: >>>>> After the cited commit, dev_net(dev) is fetched before holding RTNL >>>>> and passed to __unregister_netdevice_notifier_net(). >>>>> >>>>> However, dev_net(dev) might be different after holding RTNL. >>>>> >>>>> In the reported case [0], while removing a VF device, its netns was >>>>> being dismantled and the VF was moved to init_net. >>>>> >>>>> So the following sequence is basically illegal when dev was fetched >>>>> without lookup: >>>>> >>>>> net = dev_net(dev); >>>>> rtnl_net_lock(net); >>>>> >>>>> Let's use a new helper rtnl_net_dev_lock() to fix the race. >>>>> >>>>> It fetches dev_net_rcu(dev), bumps its net->passive, and checks if >>>>> dev_net_rcu(dev) is changed after rtnl_net_lock(). >>>>> >>>>> [0]: >>>>> BUG: KASAN: slab-use-after-free in notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) >>>>> Read of size 8 at addr ffff88810cefb4c8 by task test-bridge-lag/21127 >>>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 >>>>> Call Trace: >>>>> <TASK> >>>>> dump_stack_lvl (lib/dump_stack.c:123) >>>>> print_report (mm/kasan/report.c:379 mm/kasan/report.c:489) >>>>> kasan_report (mm/kasan/report.c:604) >>>>> notifier_call_chain (kernel/notifier.c:75 (discriminator 2)) >>>>> call_netdevice_notifiers_info (net/core/dev.c:2011) >>>>> unregister_netdevice_many_notify (net/core/dev.c:11551) >>>>> unregister_netdevice_queue (net/core/dev.c:11487) >>>>> unregister_netdev (net/core/dev.c:11635) >>>>> mlx5e_remove (drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6552 drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6579) mlx5_core >>>>> auxiliary_bus_remove (drivers/base/auxiliary.c:230) >>>>> device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) >>>>> bus_remove_device (./include/linux/kobject.h:193 drivers/base/base.h:73 drivers/base/bus.c:583) >>>>> device_del (drivers/base/power/power.h:142 drivers/base/core.c:3855) >>>>> mlx5_rescan_drivers_locked (./include/linux/auxiliary_bus.h:241 drivers/net/ethernet/mellanox/mlx5/core/dev.c:333 drivers/net/ethernet/mellanox/mlx5/core/dev.c:535 drivers/net/ethernet/mellanox/mlx5/core/dev.c:549) mlx5_core >>>>> mlx5_unregister_device (drivers/net/ethernet/mellanox/mlx5/core/dev.c:468) mlx5_core >>>>> mlx5_uninit_one (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 drivers/net/ethernet/mellanox/mlx5/core/main.c:1563) mlx5_core >>>>> remove_one (drivers/net/ethernet/mellanox/mlx5/core/main.c:965 drivers/net/ethernet/mellanox/mlx5/core/main.c:2019) mlx5_core >>>>> pci_device_remove (./include/linux/pm_runtime.h:129 drivers/pci/pci-driver.c:475) >>>>> device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296) >>>>> unbind_store (drivers/base/bus.c:245) >>>>> kernfs_fop_write_iter (fs/kernfs/file.c:338) >>>>> vfs_write (fs/read_write.c:587 (discriminator 1) fs/read_write.c:679 (discriminator 1)) >>>>> ksys_write (fs/read_write.c:732) >>>>> do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1)) >>>>> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) >>>>> RIP: 0033:0x7f6a4d5018b7 >>>>> >>>>> Fixes: 7fb1073300a2 ("net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().") >>>>> Reported-by: Yael Chemla <ychemla@nvidia.com> >>>>> Closes: https://lore.kernel.org/netdev/146eabfe-123c-4970-901e-e961b4c09bc3@nvidia.com/ >>>>> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> >>>>> Reviewed-by: Eric Dumazet <edumazet@google.com> >>>>> --- >>>>> v5: >>>>> * Use do-while loop >>>>> >>>>> v4: >>>>> * Fix build failure when !CONFIG_NET_NS >>>>> * Use net_passive_dec() >>>>> >>>>> v3: >>>>> * Bump net->passive instead of maybe_get_net() >>>>> * Remove msleep(1) loop >>>>> * Use rcu_access_pointer() instead of rcu_read_lock(). >>>>> >>>>> v2: >>>>> * Use dev_net_rcu(). >>>>> * Use msleep(1) instead of cond_resched() after maybe_get_net() >>>>> * Remove cond_resched() after net_eq() check >>>>> >>>>> v1: https://lore.kernel.org/netdev/20250130232435.43622-2-kuniyu@amazon.com/ >>>>> --- >>>>> net/core/dev.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---- >>>>> 1 file changed, 44 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/net/core/dev.c b/net/core/dev.c >>>>> index b91658e8aedb..19e268568282 100644 >>>>> --- a/net/core/dev.c >>>>> +++ b/net/core/dev.c >>>>> @@ -2070,6 +2070,42 @@ static void __move_netdevice_notifier_net(struct net *src_net, >>>>> __register_netdevice_notifier_net(dst_net, nb, true); >>>>> } >>>>> >>>>> +static void rtnl_net_dev_lock(struct net_device *dev) >>>>> +{ >>>>> + bool again; >>>>> + >>>>> + do { >>>>> + struct net *net; >>>>> + >>>>> + again = false; >>>>> + >>>>> + /* netns might be being dismantled. */ >>>>> + rcu_read_lock(); >>>>> + net = dev_net_rcu(dev); >>>>> + net_passive_inc(net); >>>> >>>> Hi Kuniyuki, >>>> It seems we are still encountering the previously reorted issue, >>>> even when running with your latest fix. However, the problem has become >>>> less frequent, now requiring multiple test iterations to reproduce. >>> >>> Thanks for reporting! >>> >>> >>>> >>>> 1) we identified the following warnings (each accompanied by a call >>>> trace; only one is detailed below, though others are similar): >>>> >>>> refcount_t: addition on 0; use-after-free. >>>> WARNING: CPU: 6 PID: 1105 at lib/refcount.c:25 refcount_warn_saturate >>>> (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) >>>> >>>> and also >>>> >>>> refcount_t: underflow; use-after-free. >>>> WARNING: CPU: 6 PID: 1105 at lib/refcount.c:28 refcount_warn_saturate >>>> (/usr/work/linux/lib/refcount.c:28 (discriminator 1)) >>>> >>>> >>>> 2) test scenario: >>>> sets up a network topology of two VFs on different eSwitch, performs >>>> ping tests between them, verifies traffic rules offloading, and cleans >>>> up the environment afterward. >>>> >>>> 3) the warning is triggered upon reaching the >>>> unregister_netdevice_notifier_dev_net when both net->ns.count and >>>> net->passive reference counts are zero. >>> >>> It looks unlikely but I missed there is still a race window. >>> >>> If dev_net_rcu() is called between synchronize_net() and dev_net_set() >>> in netif_change_net_namespace(), there might be no synchronize_rcu() >>> called after that and net_passive_dec() could be called in cleanup_net() >>> earlier than net_passive_inc() ... ? >>> >>> Could you test this ? >>> >>> ---8<--- >>> diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h >>> index bd57d8fb54f1..c275e95c83ab 100644 >>> --- a/include/net/net_namespace.h >>> +++ b/include/net/net_namespace.h >>> @@ -337,9 +337,9 @@ static inline void net_passive_dec(struct net *net) >>> } >>> #endif >>> >>> -static inline void net_passive_inc(struct net *net) >>> +static inline bool net_passive_inc(struct net *net) >>> { >>> - refcount_inc(&net->passive); >>> + return refcount_inc_not_zero(&net->passive); >> >> Hmm >> >> We want the two oher net_passive_inc() callers to have refcount safety. >> >> Please add a new net_passive_inc_not_zero() > > Sure, will do. > > Also, I'll remove the weird do-while {} and do everything under > ifdef CONFIG_NET_NS guard. If it's disabled, just calling > rtnl_net_lock() for &init_net is fine. > > ---8<--- > diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h > index bd57d8fb54f1..c91b13e6fb12 100644 > --- a/include/net/net_namespace.h > +++ b/include/net/net_namespace.h > @@ -342,6 +342,11 @@ static inline void net_passive_inc(struct net *net) > refcount_inc(&net->passive); > } > > +static inline bool net_passive_inc_not_zero(struct net *net) > +{ > + return refcount_inc_not_zero(&net->passive); > +} > + > /* Returns true if the netns initialization is completed successfully */ > static inline bool net_initialized(const struct net *net) > { > diff --git a/net/core/dev.c b/net/core/dev.c > index b597cc27a115..408659d238b8 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -2077,38 +2077,43 @@ static void __move_netdevice_notifier_net(struct net *src_net, > > static void rtnl_net_dev_lock(struct net_device *dev) > { > - bool again; > - > - do { > - struct net *net; > - > - again = false; > +#ifdef CONFIG_NET_NS > + struct net *net; > > - /* netns might be being dismantled. */ > - rcu_read_lock(); > - net = dev_net_rcu(dev); > - net_passive_inc(net); > +again: > + /* netns might be being dismantled. */ > + rcu_read_lock(); > + net = dev_net_rcu(dev); > + if (!net_passive_inc_not_zero(net)) { > rcu_read_unlock(); > + msleep(1); > + goto again; > + } > + rcu_read_unlock(); > > - rtnl_net_lock(net); > + rtnl_net_lock(net); > > -#ifdef CONFIG_NET_NS > - /* dev might have been moved to another netns. */ > - if (!net_eq(net, rcu_access_pointer(dev->nd_net.net))) { > - rtnl_net_unlock(net); > - net_passive_dec(net); > - again = true; > - } > + /* dev might have been moved to another netns. */ > + if (!net_eq(net, rcu_access_pointer(dev->nd_net.net))) { > + rtnl_net_unlock(net); > + net_passive_dec(net); > + goto again; > + } > +#else > + rtnl_net_lock(&init_net); > #endif > - } while (again); > } > > static void rtnl_net_dev_unlock(struct net_device *dev) > { > +#ifdef CONFIG_NET_NS > struct net *net = dev_net(dev); > > rtnl_net_unlock(net); > net_passive_dec(net); > +#else > + rtnl_net_unlock(&init_net); > +#endif > } > > int register_netdevice_notifier_dev_net(struct net_device *dev, > ---8<--- Hi Kuniyuki, Sorry for the delay (I was OOO). I tested your patch, and while the race occurs much less frequently, it still happens—see the warnings and call traces below. Additionally, in some cases, the test which reproduce the race hang. Debugging shows that we're stuck in an endless loop inside rtnl_net_dev_lock because the passive refcount is already zero, causing net_passive_inc_not_zero to return false, thus it go to "again" and this repeats without ending. I suspect, as you mentioned before, that in such cases, the passive refcount was decreased from cleanup_net. warnings and call traces: refcount_t: addition on 0; use-after-free. WARNING: CPU: 4 PID: 27219 at lib/refcount.c:25 refcount_warn_saturate (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) Call Trace: <TASK> ? __warn (/usr/work/linux/kernel/panic.c:746) ? refcount_warn_saturate (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) ? report_bug (/usr/work/linux/lib/bug.c:207) ? handle_bug (/usr/work/linux/arch/x86/kernel/traps.c:285) ? exc_invalid_op (/usr/work/linux/arch/x86/kernel/traps.c:309 (discriminator 1)) ? asm_exc_invalid_op (/usr/work/linux/./arch/x86/include/asm/idtentry.h:574) ? refcount_warn_saturate (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) ? refcount_warn_saturate (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) rtnl_net_dev_lock (/usr/work/linux/net/core/dev.c:2091) unregister_netdevice_notifier_dev_net (/usr/work/linux/./include/linux/list.h:218 /usr/work/linux/./include/linux/list.h:229 /usr/work/linux/net/core/dev.c:2130) mlx5e_tc_nic_cleanup (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:5355) mlx5_core mlx5e_cleanup_nic_rx (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:5765 (discriminator 1)) mlx5_core mlx5e_detach_netdev (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6391) mlx5_core _mlx5e_suspend (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6403) mlx5_core mlx5e_remove (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6431 (discriminator 1) /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6507 (discriminator 1)) mlx5_core auxiliary_bus_remove (/usr/work/linux/drivers/base/auxiliary.c:230) device_release_driver_internal (/usr/work/linux/drivers/base/dd.c:1273 /usr/work/linux/drivers/base/dd.c:1296) bus_remove_device (/usr/work/linux/./include/linux/kobject.h:193 /usr/work/linux/drivers/base/base.h:73 /usr/work/linux/drivers/base/bus.c:586) device_del (/usr/work/linux/drivers/base/power/power.h:142 /usr/work/linux/drivers/base/core.c:3856) ? devl_param_driverinit_value_get (/usr/work/linux/net/devlink/param.c:778) mlx5_rescan_drivers_locked (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/dev.c:550) mlx5_core mlx5_unregister_device (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/dev.c:448) mlx5_core mlx5_uninit_one (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/main.c:1587) mlx5_core remove_one (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/main.c:281 /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/main.c:973 /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/main.c:2034) mlx5_core pci_device_remove (/usr/work/linux/./arch/x86/include/asm/atomic.h:23 /usr/work/linux/./include/linux/atomic/atomic-arch-fallback.h:457 /usr/work/linux/./include/linux/atomic/atomic-arch-fallback.h:2426 /usr/work/linux/./include/linux/atomic/atomic-arch-fallback.h:2456 /usr/work/linux/./include/linux/atomic/atomic-instrumented.h:1518 /usr/work/linux/./include/linux/pm_runtime.h:129 /usr/work/linux/drivers/pci/pci-driver.c:475) device_release_driver_internal (/usr/work/linux/drivers/base/dd.c:1273 /usr/work/linux/drivers/base/dd.c:1296) pci_stop_bus_device (/usr/work/linux/drivers/pci/remove.c:46 /usr/work/linux/drivers/pci/remove.c:106) pci_stop_and_remove_bus_device (/usr/work/linux/drivers/pci/remove.c:141) pci_iov_remove_virtfn (/usr/work/linux/drivers/pci/iov.c:377) sriov_disable (/usr/work/linux/drivers/pci/iov.c:712 (discriminator 1) /usr/work/linux/drivers/pci/iov.c:723 (discriminator 1)) mlx5_sriov_disable (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/sriov.c:209) mlx5_core mlx5_core_sriov_configure (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/sriov.c:195 (discriminator 1) /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/sriov.c:222 (discriminator 1)) mlx5_core sriov_numvfs_store (/usr/work/linux/drivers/pci/iov.c:450) kernfs_fop_write_iter (/usr/work/linux/fs/kernfs/file.c:334) vfs_write (/usr/work/linux/./include/linux/rcu_sync.h:36 (discriminator 2) /usr/work/linux/./include/linux/percpu-rwsem.h:62 (discriminator 2) /usr/work/linux/./include/linux/fs.h:1785 (discriminator 2) /usr/work/linux/./include/linux/fs.h:1921 (discriminator 2) /usr/work/linux/./include/linux/fs.h:3035 (discriminator 2) /usr/work/linux/fs/read_write.c:675 (discriminator 2)) ksys_write (/usr/work/linux/fs/read_write.c:732) do_syscall_64 (/usr/work/linux/arch/x86/entry/common.c:52 (discriminator 1) /usr/work/linux/arch/x86/entry/common.c:83 (discriminator 1)) entry_SYSCALL_64_after_hwframe (/usr/work/linux/arch/x86/entry/entry_64.S:130) RIP: 0033:0x7fb7b6f018b7 and another warning: refcount_t: underflow; use-after-free. [ 3860.462923] WARNING: CPU: 4 PID: 27219 at lib/refcount.c:28 refcount_warn_saturate (/usr/work/linux/lib/refcount.c:28 (discriminator 1)). [ 3860.479887] Call Trace: [ 3860.480199] <TASK> [ 3860.480478] ? __warn (/usr/work/linux/kernel/panic.c:746) [ 3860.480849] ? refcount_warn_saturate (/usr/work/linux/lib/refcount.c:28 (discriminator 1)) [ 3860.481327] ? report_bug (/usr/work/linux/lib/bug.c:207) [ 3860.481797] ? handle_bug (/usr/work/linux/arch/x86/kernel/traps.c:285) [ 3860.482190] ? exc_invalid_op (/usr/work/linux/arch/x86/kernel/traps.c:309 (discriminator 1)) [ 3860.482610] ? asm_exc_invalid_op (/usr/work/linux/./arch/x86/include/asm/idtentry.h:574) [ 3860.483057] ? refcount_warn_saturate (/usr/work/linux/lib/refcount.c:28 (discriminator 1)) [ 3860.483535] ? refcount_warn_saturate (/usr/work/linux/lib/refcount.c:28 (discriminator 1)) [ 3860.484007] unregister_netdevice_notifier_dev_net (/usr/work/linux/net/core/dev.c:2135) [ 3860.484569] mlx5e_tc_nic_cleanup (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:5355) mlx5_core [ 3860.485148] mlx5e_cleanup_nic_rx (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:5765 (discriminator 1)) mlx5_core [ 3860.485720] mlx5e_detach_netdev (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6391) mlx5_core [ 3860.486337] _mlx5e_suspend (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6403) mlx5_core [ 3860.486853] mlx5e_remove (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6431 (discriminator 1) /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6507 (discriminator 1)) mlx5_core [ 3860.487377] auxiliary_bus_remove (/usr/work/linux/drivers/base/auxiliary.c:230) [ 3860.487824] device_release_driver_internal (/usr/work/linux/drivers/base/dd.c:1273 /usr/work/linux/drivers/base/dd.c:1296) [ 3860.488347] bus_remove_device (/usr/work/linux/./include/linux/kobject.h:193 /usr/work/linux/drivers/base/base.h:73 /usr/work/linux/drivers/base/bus.c:586) [ 3860.488776] device_del (/usr/work/linux/drivers/base/power/power.h:142 /usr/work/linux/drivers/base/core.c:3856) [ 3860.489167] ? devl_param_driverinit_value_get (/usr/work/linux/net/devlink/param.c:778) [ 3860.489717] mlx5_rescan_drivers_locked (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/dev.c:550) mlx5_core [ 3860.490376] mlx5_unregister_device (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/dev.c:448) mlx5_core [ 3860.490937] mlx5_uninit_one (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/main.c:1587) mlx5_core [ 3860.491471] remove_one (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/main.c:281 /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/main.c:973 /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/main.c:2034) mlx5_core [ 3860.491948] pci_device_remove (/usr/work/linux/./arch/x86/include/asm/atomic.h:23 /usr/work/linux/./include/linux/atomic/atomic-arch-fallback.h:457 /usr/work/linux/./include/linux/atomic/atomic-arch-fallback.h:2426 /usr/work/linux/./include/linux/atomic/atomic-arch-fallback. h:2456 /usr/work/linux/./include/linux/atomic/atomic-instrumented.h:1518 /usr/work/linux/./include/linux/pm_runtime.h:129 /usr/work/linux/drivers/pci/pci-driver.c:475) [ 3860.492378] device_release_driver_internal (/usr/work/linux/drivers/base/dd.c:1273 /usr/work/linux/drivers/base/dd.c:1296) [ 3860.492908] pci_stop_bus_device (/usr/work/linux/drivers/pci/remove.c:46 /usr/work/linux/drivers/pci/remove.c:106) [ 3860.493344] pci_stop_and_remove_bus_device (/usr/work/linux/drivers/pci/remove.c:141) [ 3860.493918] pci_iov_remove_virtfn (/usr/work/linux/drivers/pci/iov.c:377) [ 3860.494386] sriov_disable (/usr/work/linux/drivers/pci/iov.c:712 (discriminator 1) /usr/work/linux/drivers/pci/iov.c:723 (discriminator 1)) [ 3860.494786] mlx5_sriov_disable (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/sriov.c:209) mlx5_core [ 3860.495327] mlx5_core_sriov_configure (/usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/sriov.c:195 (discriminator 1) /usr/work/linux/drivers/net/ethernet/mellanox/mlx5/core/sriov.c:222 (discriminator 1)) mlx5_core [ 3860.495919] sriov_numvfs_store (/usr/work/linux/drivers/pci/iov.c:450) [ 3860.496358] kernfs_fop_write_iter (/usr/work/linux/fs/kernfs/file.c:334) [ 3860.496819] vfs_write (/usr/work/linux/./include/linux/rcu_sync.h:36 (discriminator 2) /usr/work/linux/./include/linux/percpu-rwsem.h:62 (discriminator 2) /usr/work/linux/./include/linux/fs.h:1785 (discriminator 2) /usr/work/linux/./include/linux/fs.h:1921 (discriminator 2) /usr/work/linux/./include/linux/fs.h:3035 (discriminator 2) /usr/work/linux/fs/read_write.c:675 (discriminator 2)) [ 3860.497201] ksys_write (/usr/work/linux/fs/read_write.c:732) [ 3860.497598] do_syscall_64 (/usr/work/linux/arch/x86/entry/common.c:52 (discriminator 1) /usr/work/linux/arch/x86/entry/common.c:83 (discriminator 1)) [ 3860.498070] entry_SYSCALL_64_after_hwframe (/usr/work/linux/arch/x86/entry/entry_64.S:130) [ 3860.498580] RIP: 0033:0x7fb7b6f018b7 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). 2025-04-01 20:49 ` Yael Chemla @ 2025-04-01 21:58 ` Kuniyuki Iwashima 2025-04-06 15:37 ` Yael Chemla 0 siblings, 1 reply; 13+ messages in thread From: Kuniyuki Iwashima @ 2025-04-01 21:58 UTC (permalink / raw) To: ychemla; +Cc: davem, edumazet, horms, kuba, kuni1840, kuniyu, netdev, pabeni Hi Yael, Thanks for testing! From: Yael Chemla <ychemla@nvidia.com> Date: Tue, 1 Apr 2025 23:49:42 +0300 > Hi Kuniyuki, > Sorry for the delay (I was OOO). I tested your patch, and while the race > occurs much less frequently, it still happens—see the warnings and call > traces below. > Additionally, in some cases, the test which reproduce the race hang. > Debugging shows that we're stuck in an endless loop inside > rtnl_net_dev_lock because the passive refcount is already zero, causing > net_passive_inc_not_zero to return false, thus it go to "again" and this > repeats without ending. > I suspect, as you mentioned before, that in such cases, the passive > refcount was decreased from cleanup_net. This sounds weird. We assumed vif will be moved to init_net, then the infinite loop should never happen. So the assumption was wrong and vif belonged to the dead netns and was not moved to init_net ... ?? Even if dev_change_net_namespace() fails, it leads to BUG(). > > > warnings and call traces: > > refcount_t: addition on 0; use-after-free. I guess this is from the old log or the test patch was not applied because _inc_not_zero() will trigger REFCOUNT_ADD_NOT_ZERO_OVF and then the message will be refcount_t: saturated; leaking memory , see __refcount_add_not_zero() and refcount_warn_saturate(). > WARNING: CPU: 4 PID: 27219 at lib/refcount.c:25 refcount_warn_saturate > (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). 2025-04-01 21:58 ` Kuniyuki Iwashima @ 2025-04-06 15:37 ` Yael Chemla 2025-04-23 11:47 ` Yael Chemla 0 siblings, 1 reply; 13+ messages in thread From: Yael Chemla @ 2025-04-06 15:37 UTC (permalink / raw) To: Kuniyuki Iwashima; +Cc: davem, edumazet, horms, kuba, kuni1840, netdev, pabeni On 02/04/2025 0:58, Kuniyuki Iwashima wrote: > Hi Yael, > > Thanks for testing! > > From: Yael Chemla <ychemla@nvidia.com> > Date: Tue, 1 Apr 2025 23:49:42 +0300 >> Hi Kuniyuki, >> Sorry for the delay (I was OOO). I tested your patch, and while the race >> occurs much less frequently, it still happens—see the warnings and call >> traces below. >> Additionally, in some cases, the test which reproduce the race hang. >> Debugging shows that we're stuck in an endless loop inside >> rtnl_net_dev_lock because the passive refcount is already zero, causing >> net_passive_inc_not_zero to return false, thus it go to "again" and this >> repeats without ending. >> I suspect, as you mentioned before, that in such cases, the passive >> refcount was decreased from cleanup_net. > > This sounds weird. > > We assumed vif will be moved to init_net, then the infinite loop > should never happen. > > So the assumption was wrong and vif belonged to the dead netns and > was not moved to init_net ... ?? > > Even if dev_change_net_namespace() fails, it leads to BUG(). > Hi Kuniyuki, In failure scenarios, we observe that cleanup_net is invoked, followed by net_passive_dec, which reduces the passive refcount to zero. These are called before the call to unregister_netdevice_notifier_dev_net. During the test, dev_change_net_namespace is called once, but it operates on different net_device poiner than the one passed to final call of unregister_netdevice_notifier_dev_net, a call which enter infinite loop (with net->passive=0 and net->ns.count=0, inside rtnl_net_dev_lock, as explained in previous mail). Do you need additional debug information, perhaps specific details regarding reassigning the netns to init_net? Please let me know how I can help further. >> >> >> warnings and call traces: >> >> refcount_t: addition on 0; use-after-free. > > I guess this is from the old log or the test patch was not applied > because _inc_not_zero() will trigger REFCOUNT_ADD_NOT_ZERO_OVF and > then the message will be > > refcount_t: saturated; leaking memory > > , see __refcount_add_not_zero() and refcount_warn_saturate(). > you are right it's a mistake, i was unable to reproduce another failure with call trace info. Test either succeeds or hang (infinite loop). > >> WARNING: CPU: 4 PID: 27219 at lib/refcount.c:25 refcount_warn_saturate >> (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). 2025-04-06 15:37 ` Yael Chemla @ 2025-04-23 11:47 ` Yael Chemla 0 siblings, 0 replies; 13+ messages in thread From: Yael Chemla @ 2025-04-23 11:47 UTC (permalink / raw) To: Kuniyuki Iwashima; +Cc: davem, edumazet, horms, kuba, kuni1840, netdev, pabeni On 06/04/2025 18:37, Yael Chemla wrote: > On 02/04/2025 0:58, Kuniyuki Iwashima wrote: >> Hi Yael, >> >> Thanks for testing! >> >> From: Yael Chemla <ychemla@nvidia.com> >> Date: Tue, 1 Apr 2025 23:49:42 +0300 >>> Hi Kuniyuki, >>> Sorry for the delay (I was OOO). I tested your patch, and while the race >>> occurs much less frequently, it still happens—see the warnings and call >>> traces below. >>> Additionally, in some cases, the test which reproduce the race hang. >>> Debugging shows that we're stuck in an endless loop inside >>> rtnl_net_dev_lock because the passive refcount is already zero, causing >>> net_passive_inc_not_zero to return false, thus it go to "again" and this >>> repeats without ending. >>> I suspect, as you mentioned before, that in such cases, the passive >>> refcount was decreased from cleanup_net. >> >> This sounds weird. >> >> We assumed vif will be moved to init_net, then the infinite loop >> should never happen. >> >> So the assumption was wrong and vif belonged to the dead netns and >> was not moved to init_net ... ?? >> >> Even if dev_change_net_namespace() fails, it leads to BUG(). >> > > Hi Kuniyuki, > In failure scenarios, we observe that cleanup_net is invoked, followed > by net_passive_dec, which reduces the passive refcount to zero. These > are called before the call to unregister_netdevice_notifier_dev_net. > > During the test, dev_change_net_namespace is called once, but it > operates on different net_device poiner than the one passed to final > call of unregister_netdevice_notifier_dev_net, a call which enter > infinite loop (with net->passive=0 and net->ns.count=0, inside > rtnl_net_dev_lock, as explained in previous mail). > > Do you need additional debug information, perhaps specific details > regarding reassigning the netns to init_net? Please let me know how I > can help further. > Hi Kuniyuki, any updates on this? thanks, Yael >>> >>> >>> warnings and call traces: >>> >>> refcount_t: addition on 0; use-after-free. >> >> I guess this is from the old log or the test patch was not applied >> because _inc_not_zero() will trigger REFCOUNT_ADD_NOT_ZERO_OVF and >> then the message will be >> >> refcount_t: saturated; leaking memory >> >> , see __refcount_add_not_zero() and refcount_warn_saturate(). >> > > you are right it's a mistake, i was unable to reproduce another failure > with call trace info. Test either succeeds or hang (infinite loop). > >> >>> WARNING: CPU: 4 PID: 27219 at lib/refcount.c:25 refcount_warn_saturate >>> (/usr/work/linux/lib/refcount.c:25 (discriminator 1)) > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v5 net 3/3] dev: Use rtnl_net_dev_lock() in unregister_netdev(). 2025-02-17 19:11 [PATCH v5 net 0/3] net: Fix race of rtnl_net_lock(dev_net(dev)) Kuniyuki Iwashima 2025-02-17 19:11 ` [PATCH v5 net 1/3] net: Add net_passive_inc() and net_passive_dec() Kuniyuki Iwashima 2025-02-17 19:11 ` [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net() Kuniyuki Iwashima @ 2025-02-17 19:11 ` Kuniyuki Iwashima 2025-02-19 2:50 ` [PATCH v5 net 0/3] net: Fix race of rtnl_net_lock(dev_net(dev)) patchwork-bot+netdevbpf 3 siblings, 0 replies; 13+ messages in thread From: Kuniyuki Iwashima @ 2025-02-17 19:11 UTC (permalink / raw) To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev The following sequence is basically illegal when dev was fetched without lookup because dev_net(dev) might be different after holding rtnl_net_lock(): net = dev_net(dev); rtnl_net_lock(net); Let's use rtnl_net_dev_lock() in unregister_netdev(). Note that there is no real bug in unregister_netdev() for now because RTNL protects the scope even if dev_net(dev) is changed before/after RTNL. Fixes: 00fb9823939e ("dev: Hold per-netns RTNL in (un)?register_netdev().") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> --- net/core/dev.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/net/core/dev.c b/net/core/dev.c index 19e268568282..fafd2f4b5d5d 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -11920,11 +11920,9 @@ EXPORT_SYMBOL(unregister_netdevice_many); */ void unregister_netdev(struct net_device *dev) { - struct net *net = dev_net(dev); - - rtnl_net_lock(net); + rtnl_net_dev_lock(dev); unregister_netdevice(dev); - rtnl_net_unlock(net); + rtnl_net_dev_unlock(dev); } EXPORT_SYMBOL(unregister_netdev); -- 2.39.5 (Apple Git-154) ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v5 net 0/3] net: Fix race of rtnl_net_lock(dev_net(dev)). 2025-02-17 19:11 [PATCH v5 net 0/3] net: Fix race of rtnl_net_lock(dev_net(dev)) Kuniyuki Iwashima ` (2 preceding siblings ...) 2025-02-17 19:11 ` [PATCH v5 net 3/3] dev: Use rtnl_net_dev_lock() in unregister_netdev() Kuniyuki Iwashima @ 2025-02-19 2:50 ` patchwork-bot+netdevbpf 3 siblings, 0 replies; 13+ messages in thread From: patchwork-bot+netdevbpf @ 2025-02-19 2:50 UTC (permalink / raw) To: Kuniyuki Iwashima; +Cc: davem, edumazet, kuba, pabeni, horms, kuni1840, netdev Hello: This series was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Mon, 17 Feb 2025 11:11:26 -0800 you wrote: > Yael Chemla reported that commit 7fb1073300a2 ("net: Hold rtnl_net_lock() > in (un)?register_netdevice_notifier_dev_net().") started to trigger KASAN's > use-after-free splat. > > The problem is that dev_net(dev) fetched before rtnl_net_lock() might be > different after rtnl_net_lock(). > > [...] Here is the summary with links: - [v5,net,1/3] net: Add net_passive_inc() and net_passive_dec(). https://git.kernel.org/netdev/net/c/e57a6320215c - [v5,net,2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net(). https://git.kernel.org/netdev/net/c/65161fb544aa - [v5,net,3/3] dev: Use rtnl_net_dev_lock() in unregister_netdev(). https://git.kernel.org/netdev/net/c/d4c6bfc83936 You are awesome, thank you! -- Deet-doot-dot, I am a bot. https://korg.docs.kernel.org/patchwork/pwbot.html ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-04-23 11:48 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-02-17 19:11 [PATCH v5 net 0/3] net: Fix race of rtnl_net_lock(dev_net(dev)) Kuniyuki Iwashima 2025-02-17 19:11 ` [PATCH v5 net 1/3] net: Add net_passive_inc() and net_passive_dec() Kuniyuki Iwashima 2025-02-17 19:11 ` [PATCH v5 net 2/3] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net() Kuniyuki Iwashima 2025-03-26 13:46 ` Yael Chemla 2025-03-26 21:54 ` Kuniyuki Iwashima 2025-03-26 22:10 ` Eric Dumazet 2025-03-26 22:32 ` Kuniyuki Iwashima 2025-04-01 20:49 ` Yael Chemla 2025-04-01 21:58 ` Kuniyuki Iwashima 2025-04-06 15:37 ` Yael Chemla 2025-04-23 11:47 ` Yael Chemla 2025-02-17 19:11 ` [PATCH v5 net 3/3] dev: Use rtnl_net_dev_lock() in unregister_netdev() Kuniyuki Iwashima 2025-02-19 2:50 ` [PATCH v5 net 0/3] net: Fix race of rtnl_net_lock(dev_net(dev)) patchwork-bot+netdevbpf
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).