public inbox for netdev@vger.kernel.org
* [PATCH v2 net 0/2] net: Fix race of rtnl_net_lock(dev_net(dev)).
@ 2025-02-07  4:42 Kuniyuki Iwashima
  2025-02-07  4:42 ` [PATCH v2 net 1/2] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net() Kuniyuki Iwashima
  2025-02-07  4:42 ` [PATCH v2 net 2/2] dev: Use rtnl_net_dev_lock() in unregister_netdev() Kuniyuki Iwashima
  0 siblings, 2 replies; 7+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-07  4:42 UTC
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

Yael Chemla reported that commit 7fb1073300a2 ("net: Hold rtnl_net_lock()
in (un)?register_netdevice_notifier_dev_net().") started triggering a KASAN
use-after-free splat.

The problem is that dev_net(dev), fetched before rtnl_net_lock(), might
have changed by the time the lock is acquired.

Patch 1 fixes the issue by re-checking dev_net(dev) after rtnl_net_lock(),
and patch 2 fixes the same potential issue in unregister_netdev(), which
would emerge once RTNL is removed.


Changes:
  v2:
    * Use dev_net_rcu()
    * Use msleep(1) instead of cond_resched() after maybe_get_net()
    * Remove cond_resched() after net_eq() check

  v1: https://lore.kernel.org/netdev/20250130232435.43622-1-kuniyu@amazon.com/


Kuniyuki Iwashima (2):
  net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net().
  dev: Use rtnl_net_dev_lock() in unregister_netdev().

 net/core/dev.c | 69 +++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 52 insertions(+), 17 deletions(-)

-- 
2.39.5 (Apple Git-154)



* [PATCH v2 net 1/2] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net().
  2025-02-07  4:42 [PATCH v2 net 0/2] net: Fix race of rtnl_net_lock(dev_net(dev)) Kuniyuki Iwashima
@ 2025-02-07  4:42 ` Kuniyuki Iwashima
  2025-02-07  6:42   ` Eric Dumazet
  2025-02-07  4:42 ` [PATCH v2 net 2/2] dev: Use rtnl_net_dev_lock() in unregister_netdev() Kuniyuki Iwashima
  1 sibling, 1 reply; 7+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-07  4:42 UTC
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev, Yael Chemla

After the cited commit, dev_net(dev) is fetched before holding RTNL
and passed to __unregister_netdevice_notifier_net().

However, dev_net(dev) might be different after holding RTNL.

In the reported case [0], while removing a VF device, its netns was
being dismantled and the VF was moved to init_net.

So the following sequence is basically illegal when dev was fetched
without lookup:

  net = dev_net(dev);
  rtnl_net_lock(net);

Let's use a new helper rtnl_net_dev_lock() to fix the race.

It calls maybe_get_net() for dev_net_rcu(dev) and checks dev_net_rcu(dev)
before/after rtnl_net_lock().

The dev_net_rcu(dev) pointer itself is valid, thanks to RCU API, but the
netns might be being dismantled.  maybe_get_net() is to avoid the race.
This can be done by holding pernet_ops_rwsem, but it will be overkill.
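
The lock-then-recheck idea can be tried outside the kernel.  Below is a
minimal userspace sketch, not the kernel code: struct ns, struct device,
dev_ns_lock() and demo_locks_current_ns() are hypothetical names, a pthread
mutex stands in for the per-netns RTNL, and object lifetime (the part
maybe_get_net() covers) is not modelled at all.  It assumes that whoever
moves the device holds the source namespace's lock while doing so.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net; the mutex models per-netns RTNL. */
struct ns {
	pthread_mutex_t lock;
};

/* Hypothetical stand-in for struct net_device; ns models dev->nd_net. */
struct device {
	_Atomic(struct ns *) ns;
};

/* Lock the namespace that owns dev, retrying if dev moved while we waited. */
static struct ns *dev_ns_lock(struct device *dev)
{
	struct ns *ns;

	for (;;) {
		ns = atomic_load(&dev->ns);
		pthread_mutex_lock(&ns->lock);
		/* Movers hold the source lock, so a match here is stable. */
		if (ns == atomic_load(&dev->ns))
			return ns;
		pthread_mutex_unlock(&ns->lock);
	}
}

static void *locker(void *arg)
{
	struct ns *ns = dev_ns_lock(arg);

	pthread_mutex_unlock(&ns->lock);
	return ns;	/* report which namespace ended up locked */
}

/* Move dev to new_ns while another thread tries to lock its namespace. */
int demo_locks_current_ns(void)
{
	struct ns old_ns, new_ns;
	struct device dev;
	pthread_t thread;
	void *locked;
	int err;

	pthread_mutex_init(&old_ns.lock, NULL);
	pthread_mutex_init(&new_ns.lock, NULL);
	atomic_init(&dev.ns, &old_ns);

	pthread_mutex_lock(&old_ns.lock);
	err = pthread_create(&thread, NULL, locker, &dev);
	assert(err == 0);
	atomic_store(&dev.ns, &new_ns);	/* the move, under the source lock */
	pthread_mutex_unlock(&old_ns.lock);

	err = pthread_join(thread, &locked);
	assert(err == 0);
	(void)err;

	pthread_mutex_destroy(&old_ns.lock);
	pthread_mutex_destroy(&new_ns.lock);
	return locked == &new_ns;
}
```

Whether the locker thread samples dev->ns before or after the move, it can
only return once the pointer it locked still matches, so
demo_locks_current_ns() returns 1 in both interleavings.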

[0]:
BUG: KASAN: slab-use-after-free in notifier_call_chain (kernel/notifier.c:75 (discriminator 2))
Read of size 8 at addr ffff88810cefb4c8 by task test-bridge-lag/21127
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl (lib/dump_stack.c:123)
 print_report (mm/kasan/report.c:379 mm/kasan/report.c:489)
 kasan_report (mm/kasan/report.c:604)
 notifier_call_chain (kernel/notifier.c:75 (discriminator 2))
 call_netdevice_notifiers_info (net/core/dev.c:2011)
 unregister_netdevice_many_notify (net/core/dev.c:11551)
 unregister_netdevice_queue (net/core/dev.c:11487)
 unregister_netdev (net/core/dev.c:11635)
 mlx5e_remove (drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6552 drivers/net/ethernet/mellanox/mlx5/core/en_main.c:6579) mlx5_core
 auxiliary_bus_remove (drivers/base/auxiliary.c:230)
 device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296)
 bus_remove_device (./include/linux/kobject.h:193 drivers/base/base.h:73 drivers/base/bus.c:583)
 device_del (drivers/base/power/power.h:142 drivers/base/core.c:3855)
 mlx5_rescan_drivers_locked (./include/linux/auxiliary_bus.h:241 drivers/net/ethernet/mellanox/mlx5/core/dev.c:333 drivers/net/ethernet/mellanox/mlx5/core/dev.c:535 drivers/net/ethernet/mellanox/mlx5/core/dev.c:549) mlx5_core
 mlx5_unregister_device (drivers/net/ethernet/mellanox/mlx5/core/dev.c:468) mlx5_core
 mlx5_uninit_one (./include/linux/instrumented.h:68 ./include/asm-generic/bitops/instrumented-non-atomic.h:141 drivers/net/ethernet/mellanox/mlx5/core/main.c:1563) mlx5_core
 remove_one (drivers/net/ethernet/mellanox/mlx5/core/main.c:965 drivers/net/ethernet/mellanox/mlx5/core/main.c:2019) mlx5_core
 pci_device_remove (./include/linux/pm_runtime.h:129 drivers/pci/pci-driver.c:475)
 device_release_driver_internal (drivers/base/dd.c:1275 drivers/base/dd.c:1296)
 unbind_store (drivers/base/bus.c:245)
 kernfs_fop_write_iter (fs/kernfs/file.c:338)
 vfs_write (fs/read_write.c:587 (discriminator 1) fs/read_write.c:679 (discriminator 1))
 ksys_write (fs/read_write.c:732)
 do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
RIP: 0033:0x7f6a4d5018b7

Fixes: 7fb1073300a2 ("net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().")
Reported-by: Yael Chemla <ychemla@nvidia.com>
Closes: https://lore.kernel.org/netdev/146eabfe-123c-4970-901e-e961b4c09bc3@nvidia.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Tested-by: Yael Chemla <ychemla@nvidia.com>
---
v2:
  * Use dev_net_rcu().
  * Use msleep(1) instead of cond_resched() after maybe_get_net()
  * Remove cond_resched() after net_eq() check

v1: https://lore.kernel.org/netdev/20250130232435.43622-2-kuniyu@amazon.com/
---
 net/core/dev.c | 63 +++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 13 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index b91658e8aedb..f7430c9d9bc3 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2070,6 +2070,51 @@ static void __move_netdevice_notifier_net(struct net *src_net,
 	__register_netdevice_notifier_net(dst_net, nb, true);
 }
 
+static bool from_cleanup_net(void)
+{
+#ifdef CONFIG_NET_NS
+	return current == cleanup_net_task;
+#else
+	return false;
+#endif
+}
+
+static void rtnl_net_dev_lock(struct net_device *dev)
+{
+	struct net *net;
+
+	DEBUG_NET_WARN_ON_ONCE(from_cleanup_net());
+again:
+	/* netns might be being dismantled. */
+	rcu_read_lock();
+	net = maybe_get_net(dev_net_rcu(dev));
+	rcu_read_unlock();
+	if (!net) {
+		msleep(1);
+		goto again;
+	}
+
+	rtnl_net_lock(net);
+
+	/* dev might have been moved to another netns. */
+	rcu_read_lock();
+	if (!net_eq(net, dev_net_rcu(dev))) {
+		rcu_read_unlock();
+		rtnl_net_unlock(net);
+		put_net(net);
+		goto again;
+	}
+	rcu_read_unlock();
+}
+
+static void rtnl_net_dev_unlock(struct net_device *dev)
+{
+	struct net *net = dev_net(dev);
+
+	rtnl_net_unlock(net);
+	put_net(net);
+}
+
 int register_netdevice_notifier_dev_net(struct net_device *dev,
 					struct notifier_block *nb,
 					struct netdev_net_notifier *nn)
@@ -2077,6 +2122,8 @@ int register_netdevice_notifier_dev_net(struct net_device *dev,
 	struct net *net = dev_net(dev);
 	int err;
 
+	DEBUG_NET_WARN_ON_ONCE(!list_empty(&dev->dev_list));
+
 	rtnl_net_lock(net);
 	err = __register_netdevice_notifier_net(net, nb, false);
 	if (!err) {
@@ -2093,13 +2140,12 @@ int unregister_netdevice_notifier_dev_net(struct net_device *dev,
 					  struct notifier_block *nb,
 					  struct netdev_net_notifier *nn)
 {
-	struct net *net = dev_net(dev);
 	int err;
 
-	rtnl_net_lock(net);
+	rtnl_net_dev_lock(dev);
 	list_del(&nn->list);
-	err = __unregister_netdevice_notifier_net(net, nb);
-	rtnl_net_unlock(net);
+	err = __unregister_netdevice_notifier_net(dev_net(dev), nb);
+	rtnl_net_dev_unlock(dev);
 
 	return err;
 }
@@ -10255,15 +10301,6 @@ static void dev_index_release(struct net *net, int ifindex)
 	WARN_ON(xa_erase(&net->dev_by_index, ifindex));
 }
 
-static bool from_cleanup_net(void)
-{
-#ifdef CONFIG_NET_NS
-	return current == cleanup_net_task;
-#else
-	return false;
-#endif
-}
-
 /* Delayed registration/unregisteration */
 LIST_HEAD(net_todo_list);
 DECLARE_WAIT_QUEUE_HEAD(netdev_unregistering_wq);
-- 
2.39.5 (Apple Git-154)



* [PATCH v2 net 2/2] dev: Use rtnl_net_dev_lock() in unregister_netdev().
  2025-02-07  4:42 [PATCH v2 net 0/2] net: Fix race of rtnl_net_lock(dev_net(dev)) Kuniyuki Iwashima
  2025-02-07  4:42 ` [PATCH v2 net 1/2] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net() Kuniyuki Iwashima
@ 2025-02-07  4:42 ` Kuniyuki Iwashima
  1 sibling, 0 replies; 7+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-07  4:42 UTC
  To: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman
  Cc: Kuniyuki Iwashima, Kuniyuki Iwashima, netdev

The following sequence is basically illegal when dev was fetched
without lookup because dev_net(dev) might be different after holding
rtnl_net_lock():

  net = dev_net(dev);
  rtnl_net_lock(net);

Let's use rtnl_net_dev_lock() in unregister_netdev().

Note that there is no real bug in unregister_netdev() for now
because RTNL protects the scope even if dev_net(dev) is changed
before/after RTNL.

Fixes: 00fb9823939e ("dev: Hold per-netns RTNL in (un)?register_netdev().")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 net/core/dev.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index f7430c9d9bc3..385f307291d0 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11917,11 +11917,9 @@ EXPORT_SYMBOL(unregister_netdevice_many);
  */
 void unregister_netdev(struct net_device *dev)
 {
-	struct net *net = dev_net(dev);
-
-	rtnl_net_lock(net);
+	rtnl_net_dev_lock(dev);
 	unregister_netdevice(dev);
-	rtnl_net_unlock(net);
+	rtnl_net_dev_unlock(dev);
 }
 EXPORT_SYMBOL(unregister_netdev);
 
-- 
2.39.5 (Apple Git-154)



* Re: [PATCH v2 net 1/2] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net().
  2025-02-07  4:42 ` [PATCH v2 net 1/2] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net() Kuniyuki Iwashima
@ 2025-02-07  6:42   ` Eric Dumazet
  2025-02-07  6:58     ` Kuniyuki Iwashima
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2025-02-07  6:42 UTC
  To: Kuniyuki Iwashima
  Cc: David S. Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Kuniyuki Iwashima, netdev, Yael Chemla

On Fri, Feb 7, 2025 at 5:43 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> After the cited commit, dev_net(dev) is fetched before holding RTNL
> and passed to __unregister_netdevice_notifier_net().
>
> However, dev_net(dev) might be different after holding RTNL.
>
> In the reported case [0], while removing a VF device, its netns was
> being dismantled and the VF was moved to init_net.
>
> So the following sequence is basically illegal when dev was fetched
> without lookup:
>
>   net = dev_net(dev);
>   rtnl_net_lock(net);
>
> Let's use a new helper rtnl_net_dev_lock() to fix the race.
>
> It calls maybe_get_net() for dev_net_rcu(dev) and checks dev_net_rcu(dev)
> before/after rtnl_net_lock().
>
> The dev_net_rcu(dev) pointer itself is valid, thanks to RCU API, but the
> netns might be being dismantled.  maybe_get_net() is to avoid the race.
> This can be done by holding pernet_ops_rwsem, but it will be overkill.
>
>
> Fixes: 7fb1073300a2 ("net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().")
> Reported-by: Yael Chemla <ychemla@nvidia.com>
> Closes: https://lore.kernel.org/netdev/146eabfe-123c-4970-901e-e961b4c09bc3@nvidia.com/
> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> Tested-by: Yael Chemla <ychemla@nvidia.com>
> ---
> v2:
>   * Use dev_net_rcu().
>   * Use msleep(1) instead of cond_resched() after maybe_get_net()
>   * Remove cond_resched() after net_eq() check
>
> v1: https://lore.kernel.org/netdev/20250130232435.43622-2-kuniyu@amazon.com/
> ---
>  net/core/dev.c | 63 +++++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 50 insertions(+), 13 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index b91658e8aedb..f7430c9d9bc3 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2070,6 +2070,51 @@ static void __move_netdevice_notifier_net(struct net *src_net,
>         __register_netdevice_notifier_net(dst_net, nb, true);
>  }
>
> +static bool from_cleanup_net(void)
> +{
> +#ifdef CONFIG_NET_NS
> +       return current == cleanup_net_task;
> +#else
> +       return false;
> +#endif
> +}
> +
> +static void rtnl_net_dev_lock(struct net_device *dev)
> +{
> +       struct net *net;
> +
> +       DEBUG_NET_WARN_ON_ONCE(from_cleanup_net());

I would rather make sure rtnl_net_dev_lock() _can_ be called from cleanup_net()


> +again:
> +       /* netns might be being dismantled. */
> +       rcu_read_lock();
> +       net = maybe_get_net(dev_net_rcu(dev));

I do not think maybe_get_net() is what we want here.

If the netns is already in dismantle phase, the count will be zero.

Instead:

net = dev_net_rcu(dev);
refcount_inc(&net->passive);


> +       rcu_read_unlock();

> +       if (!net) {
> +               msleep(1);
> +               goto again;
> +       }

> +
> +       rtnl_net_lock(net);
> +
> +       /* dev might have been moved to another netns. */
> +       rcu_read_lock();

As we do not dereference the net pointer, I would not acquire
rcu_read_lock() and instead use

if (!net_eq(net, rcu_access_pointer(dev->nd_net.net))) {



> +       if (!net_eq(net, dev_net_rcu(dev))) {
> +               rcu_read_unlock();
> +               rtnl_net_unlock(net);

> +               put_net(net);
instead :
         net_drop_ns(net);

> +               goto again;
> +       }
> +       rcu_read_unlock();
> +}
> +
> +static void rtnl_net_dev_unlock(struct net_device *dev)
> +{
> +       struct net *net = dev_net(dev);
> +
> +       rtnl_net_unlock(net);

And replace the put_net() here and above with:

net_drop_ns(net);

> +       put_net(net);
> +}
> +
>  int register_netdevice_notifier_dev_net(struct net_device *dev,
>                                         struct notifier_block *nb,
>                                         struct netdev_net_notifier *nn)
> @@ -2077,6 +2122,8 @@ int register_netdevice_notifier_dev_net(struct net_device *dev,
>         struct net *net = dev_net(dev);
>         int err;
>

> +       DEBUG_NET_WARN_ON_ONCE(!list_empty(&dev->dev_list));
/* Why is this needed ? */

> +
>         rtnl_net_lock(net);
>         err = __register_netdevice_notifier_net(net, nb, false);
>         if (!err) {
> @@ -2093,13 +2140,12 @@ int unregister_netdevice_notifier_dev_net(struct net_device *dev,
>                                           struct notifier_block *nb,
>                                           struct netdev_net_notifier *nn)
>  {
> -       struct net *net = dev_net(dev);
>         int err;
>
> -       rtnl_net_lock(net);
> +       rtnl_net_dev_lock(dev);
>         list_del(&nn->list);
> -       err = __unregister_netdevice_notifier_net(net, nb);
> -       rtnl_net_unlock(net);
> +       err = __unregister_netdevice_notifier_net(dev_net(dev), nb);
> +       rtnl_net_dev_unlock(dev);
>
>         return err;
>  }
> @@ -10255,15 +10301,6 @@ static void dev_index_release(struct net *net, int ifindex)
>         WARN_ON(xa_erase(&net->dev_by_index, ifindex));
>  }
>
> -static bool from_cleanup_net(void)
> -{
> -#ifdef CONFIG_NET_NS
> -       return current == cleanup_net_task;
> -#else
> -       return false;
> -#endif
> -}
> -
>  /* Delayed registration/unregisteration */
>  LIST_HEAD(net_todo_list);
>  DECLARE_WAIT_QUEUE_HEAD(netdev_unregistering_wq);
> --
> 2.39.5 (Apple Git-154)
>


* Re: [PATCH v2 net 1/2] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net().
  2025-02-07  6:42   ` Eric Dumazet
@ 2025-02-07  6:58     ` Kuniyuki Iwashima
  2025-02-07  7:01       ` Eric Dumazet
  0 siblings, 1 reply; 7+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-07  6:58 UTC
  To: edumazet; +Cc: davem, horms, kuba, kuni1840, kuniyu, netdev, pabeni, ychemla

From: Eric Dumazet <edumazet@google.com>
Date: Fri, 7 Feb 2025 07:42:13 +0100
> On Fri, Feb 7, 2025 at 5:43 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > After the cited commit, dev_net(dev) is fetched before holding RTNL
> > and passed to __unregister_netdevice_notifier_net().
> >
> > However, dev_net(dev) might be different after holding RTNL.
> >
> > In the reported case [0], while removing a VF device, its netns was
> > being dismantled and the VF was moved to init_net.
> >
> > So the following sequence is basically illegal when dev was fetched
> > without lookup:
> >
> >   net = dev_net(dev);
> >   rtnl_net_lock(net);
> >
> > Let's use a new helper rtnl_net_dev_lock() to fix the race.
> >
> > It calls maybe_get_net() for dev_net_rcu(dev) and checks dev_net_rcu(dev)
> > before/after rtnl_net_lock().
> >
> > The dev_net_rcu(dev) pointer itself is valid, thanks to RCU API, but the
> > netns might be being dismantled.  maybe_get_net() is to avoid the race.
> > This can be done by holding pernet_ops_rwsem, but it will be overkill.
> >
> >
> > Fixes: 7fb1073300a2 ("net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().")
> > Reported-by: Yael Chemla <ychemla@nvidia.com>
> > Closes: https://lore.kernel.org/netdev/146eabfe-123c-4970-901e-e961b4c09bc3@nvidia.com/
> > Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
> > Tested-by: Yael Chemla <ychemla@nvidia.com>
> > ---
> > v2:
> >   * Use dev_net_rcu().
> >   * Use msleep(1) instead of cond_resched() after maybe_get_net()
> >   * Remove cond_resched() after net_eq() check
> >
> > v1: https://lore.kernel.org/netdev/20250130232435.43622-2-kuniyu@amazon.com/
> > ---
> >  net/core/dev.c | 63 +++++++++++++++++++++++++++++++++++++++-----------
> >  1 file changed, 50 insertions(+), 13 deletions(-)
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index b91658e8aedb..f7430c9d9bc3 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -2070,6 +2070,51 @@ static void __move_netdevice_notifier_net(struct net *src_net,
> >         __register_netdevice_notifier_net(dst_net, nb, true);
> >  }
> >
> > +static bool from_cleanup_net(void)
> > +{
> > +#ifdef CONFIG_NET_NS
> > +       return current == cleanup_net_task;
> > +#else
> > +       return false;
> > +#endif
> > +}
> > +
> > +static void rtnl_net_dev_lock(struct net_device *dev)
> > +{
> > +       struct net *net;
> > +
> > +       DEBUG_NET_WARN_ON_ONCE(from_cleanup_net());
> 
> I would rather make sure rtnl_net_dev_lock() _can_ be called from cleanup_net()
> 
> 
> > +again:
> > +       /* netns might be being dismantled. */
> > +       rcu_read_lock();
> > +       net = maybe_get_net(dev_net_rcu(dev));
> 
> I do not think maybe_get_net() is what we want here.
> 
> If the netns is already in dismantle phase, the count will be zero.

Yes, so I placed the warning above.

Will use net->passive instead, thanks for suggestion!
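
For readers following along, the difference between the two counts can be
modelled in userspace.  This is only a sketch under assumptions: struct
ns_model and the model_*() helpers are hypothetical names, C11 atomics stand
in for refcount_t, and a flag stands in for actually freeing the object.
count plays the role of the active reference that maybe_get_net() tries to
take, and passive the one that net_drop_ns() releases.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a namespace with two reference counts. */
struct ns_model {
	atomic_int count;	/* active users; zero once dismantle starts */
	atomic_int passive;	/* keeps the memory itself alive */
	bool freed;
};

/* Like maybe_get_net(): succeeds only while active users remain. */
static bool model_maybe_get(struct ns_model *ns)
{
	int old = atomic_load(&ns->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&ns->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Like refcount_inc(&net->passive): unconditional on a live object. */
static void model_hold_passive(struct ns_model *ns)
{
	atomic_fetch_add(&ns->passive, 1);
}

/* Like net_drop_ns(): the last passive reference frees the object. */
static void model_drop_passive(struct ns_model *ns)
{
	if (atomic_fetch_sub(&ns->passive, 1) == 1)
		ns->freed = true;
}

/* Returns 1 if a namespace in dismantle behaves as discussed above. */
int demo_dismantle(void)
{
	struct ns_model ns = { .freed = false };

	atomic_init(&ns.count, 0);	/* dismantle has started */
	atomic_init(&ns.passive, 1);	/* base reference still held */

	if (model_maybe_get(&ns))	/* must fail: count is zero */
		return 0;

	model_hold_passive(&ns);	/* a locker pins the memory */
	model_drop_passive(&ns);	/* base reference goes away */
	if (ns.freed)			/* must still be alive */
		return 0;

	model_drop_passive(&ns);	/* locker drops its reference */
	assert(atomic_load(&ns.passive) == 0);
	return ns.freed;
}
```

In this model a caller that retries until model_maybe_get() succeeds would
spin forever once count has dropped to zero, whereas the passive reference
can still be taken and only delays the final free.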


> 
> Instead:
> 
> net = dev_net_rcu(dev);
> refcount_inc(&net->passive);
> 
> 
> > +       rcu_read_unlock();
> 
> > +       if (!net) {
> > +               msleep(1);
> > +               goto again;
> > +       }
> 
> > +
> > +       rtnl_net_lock(net);
> > +
> > +       /* dev might have been moved to another netns. */
> > +       rcu_read_lock();
> 
> As we do not dereference the net pointer, I would not acquire
> rcu_read_lock() and instead use
> 
> if (!net_eq(net, rcu_access_pointer(dev->nd_net.net))) {

Exactly, will use rcu_access_pointer().


> 
> 
> 
> > +       if (!net_eq(net, dev_net_rcu(dev))) {
> > +               rcu_read_unlock();
> > +               rtnl_net_unlock(net);
> 
> > +               put_net(net);
> instead :
>          net_drop_ns(net);
> 
> > +               goto again;
> > +       }
> > +       rcu_read_unlock();
> > +}
> > +
> > +static void rtnl_net_dev_unlock(struct net_device *dev)
> > +{
> > +       struct net *net = dev_net(dev);
> > +
> > +       rtnl_net_unlock(net);
> 
> And replace the put_net() here and above with:
> 
> net_drop_ns(net);
> 
> > +       put_net(net);
> > +}
> > +
> >  int register_netdevice_notifier_dev_net(struct net_device *dev,
> >                                         struct notifier_block *nb,
> >                                         struct netdev_net_notifier *nn)
> > @@ -2077,6 +2122,8 @@ int register_netdevice_notifier_dev_net(struct net_device *dev,
> >         struct net *net = dev_net(dev);
> >         int err;
> >
> 
> > +       DEBUG_NET_WARN_ON_ONCE(!list_empty(&dev->dev_list));
> /* Why is this needed ? */

The following rtnl_net_lock() assumes the dev is not yet published
by register_netdevice(), and I think there's no such users calling
register_netdevice_notifier_dev_net() after that, so just a paranoid..


> 
> > +
> >         rtnl_net_lock(net);
> >         err = __register_netdevice_notifier_net(net, nb, false);
> >         if (!err) {


* Re: [PATCH v2 net 1/2] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net().
  2025-02-07  6:58     ` Kuniyuki Iwashima
@ 2025-02-07  7:01       ` Eric Dumazet
  2025-02-07  7:07         ` Kuniyuki Iwashima
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Dumazet @ 2025-02-07  7:01 UTC
  To: Kuniyuki Iwashima; +Cc: davem, horms, kuba, kuni1840, netdev, pabeni, ychemla

On Fri, Feb 7, 2025 at 7:59 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From: Eric Dumazet <edumazet@google.com>

> > /* Why is this needed ? */
>
> The following rtnl_net_lock() assumes the dev is not yet published
> by register_netdevice(), and I think there's no such users calling
> register_netdevice_notifier_dev_net() after that, so just a paranoid..

Please add a comment then ;)


* Re: [PATCH v2 net 1/2] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net().
  2025-02-07  7:01       ` Eric Dumazet
@ 2025-02-07  7:07         ` Kuniyuki Iwashima
  0 siblings, 0 replies; 7+ messages in thread
From: Kuniyuki Iwashima @ 2025-02-07  7:07 UTC
  To: edumazet; +Cc: davem, horms, kuba, kuni1840, kuniyu, netdev, pabeni, ychemla

From: Eric Dumazet <edumazet@google.com>
Date: Fri, 7 Feb 2025 08:01:36 +0100
> On Fri, Feb 7, 2025 at 7:59 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
> >
> > From: Eric Dumazet <edumazet@google.com>
> 
> > > /* Why is this needed ? */
> >
> > The following rtnl_net_lock() assumes the dev is not yet published
> > by register_netdevice(), and I think there's no such users calling
> > register_netdevice_notifier_dev_net() after that, so just a paranoid..
> 
> Please add a comment then ;)

Sure, will add a comment there!

Thanks!

