* [PATCH net-next] net: let lockdep compare instance locks
@ 2025-05-16 1:24 Jakub Kicinski
2025-05-16 1:49 ` Kuniyuki Iwashima
0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2025-05-16 1:24 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms, sdf, kuniyu,
Jakub Kicinski
AFAIU always returning -1 from lockdep's compare function
basically disables checking of dependencies between given
locks. Try to be a little more precise about what guarantees
that instance locks won't deadlock.

Right now we only nest them under protection of rtnl_lock.
Mostly in unregister_netdevice_many() and dev_close_many().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
include/net/netdev_lock.h | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/include/net/netdev_lock.h b/include/net/netdev_lock.h
index 2a753813f849..75a2da23100d 100644
--- a/include/net/netdev_lock.h
+++ b/include/net/netdev_lock.h
@@ -99,16 +99,29 @@ static inline void netdev_unlock_ops_compat(struct net_device *dev)
 static inline int netdev_lock_cmp_fn(const struct lockdep_map *a,
 				     const struct lockdep_map *b)
 {
-	/* Only lower devices currently grab the instance lock, so no
-	 * real ordering issues can occur. In the near future, only
-	 * hardware devices will grab instance lock which also does not
-	 * involve any ordering. Suppress lockdep ordering warnings
-	 * until (if) we start grabbing instance lock on pure SW
-	 * devices (bond/team/veth/etc).
-	 */
+#ifdef CONFIG_DEBUG_NET_SMALL_RTNL
+	const struct net_device *dev_a, *dev_b;
+
+	dev_a = container_of(a, struct net_device, lock.dep_map);
+	dev_b = container_of(b, struct net_device, lock.dep_map);
+#endif
+
 	if (a == b)
 		return 0;
-	return -1;
+
+	/* Locking multiple devices usually happens under rtnl_lock */
+	if (lockdep_rtnl_is_held())
+		return -1;
+
+#ifdef CONFIG_DEBUG_NET_SMALL_RTNL
+	/* It's okay to use per-netns rtnl_lock if devices share netns */
+	if (net_eq(dev_net(dev_a), dev_net(dev_b)) &&
+	    lockdep_rtnl_net_is_held(dev_net(dev_a)))
+		return -1;
+#endif
+
+	/* Otherwise taking two instance locks is not allowed */
+	return 1;
 }

 #define netdev_lockdep_set_classes(dev)				\
--
2.49.0
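
For readers following along, here is a minimal userspace sketch of the decision order in the patch above. Everything prefixed model_ is a hypothetical stand-in rather than a kernel definition: plain flags replace lockdep_rtnl_is_held() and lockdep_rtnl_net_is_held(), an int member replaces lock.dep_map, and the per-netns branch is compiled unconditionally instead of under CONFIG_DEBUG_NET_SMALL_RTNL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct model_net {
	bool rtnl_net_held;	/* models lockdep_rtnl_net_is_held(net) */
};

struct model_dev {
	struct model_net *net;	/* models dev_net(dev) */
	int dep_map;		/* models dev->lock.dep_map */
};

/* Models lockdep_rtnl_is_held() for the global rtnl_lock. */
static bool model_rtnl_held;

/* Same idea as the kernel's container_of(), kept const-correct. */
#define model_container_of(ptr, type, member) \
	((const type *)((const char *)(ptr) - offsetof(type, member)))

/* Mirrors the order of checks in the proposed netdev_lock_cmp_fn(). */
static int model_lock_cmp(const int *a, const int *b)
{
	const struct model_dev *dev_a, *dev_b;

	if (a == b)
		return 0;

	/* Nesting under the global rtnl_lock is accepted in any order. */
	if (model_rtnl_held)
		return -1;

	dev_a = model_container_of(a, struct model_dev, dep_map);
	dev_b = model_container_of(b, struct model_dev, dep_map);

	/* The per-netns lock only covers devices in the same netns. */
	if (dev_a->net == dev_b->net && dev_a->net->rtnl_net_held)
		return -1;

	/* Otherwise taking two instance locks is flagged. */
	return 1;
}
```

In this model, two devices in different namespaces are only accepted when the global flag is set; holding just one namespace's lock is not enough.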
* Re: [PATCH net-next] net: let lockdep compare instance locks
2025-05-16 1:24 [PATCH net-next] net: let lockdep compare instance locks Jakub Kicinski
@ 2025-05-16 1:49 ` Kuniyuki Iwashima
2025-05-16 2:36 ` Jakub Kicinski
0 siblings, 1 reply; 8+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-16 1:49 UTC (permalink / raw)
To: kuba; +Cc: andrew+netdev, davem, edumazet, horms, kuniyu, netdev, pabeni,
sdf
From: Jakub Kicinski <kuba@kernel.org>
Date: Thu, 15 May 2025 18:24:59 -0700
> AFAIU always returning -1 from lockdep's compare function
> basically disables checking of dependencies between given
> locks. Try to be a little more precise about what guarantees
> that instance locks won't deadlock.
>
> Right now we only nest them under protection of rtnl_lock.
> Mostly in unregister_netdevice_many() and dev_close_many().
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> include/net/netdev_lock.h | 29 +++++++++++++++++++++--------
> 1 file changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/include/net/netdev_lock.h b/include/net/netdev_lock.h
> index 2a753813f849..75a2da23100d 100644
> --- a/include/net/netdev_lock.h
> +++ b/include/net/netdev_lock.h
> @@ -99,16 +99,29 @@ static inline void netdev_unlock_ops_compat(struct net_device *dev)
> static inline int netdev_lock_cmp_fn(const struct lockdep_map *a,
> const struct lockdep_map *b)
> {
> - /* Only lower devices currently grab the instance lock, so no
> - * real ordering issues can occur. In the near future, only
> - * hardware devices will grab instance lock which also does not
> - * involve any ordering. Suppress lockdep ordering warnings
> - * until (if) we start grabbing instance lock on pure SW
> - * devices (bond/team/veth/etc).
> - */
> +#ifdef CONFIG_DEBUG_NET_SMALL_RTNL
> + const struct net_device *dev_a, *dev_b;
> +
> + dev_a = container_of(a, struct net_device, lock.dep_map);
> + dev_b = container_of(b, struct net_device, lock.dep_map);
> +#endif
> +
> if (a == b)
> return 0;
> - return -1;
> +
> + /* Locking multiple devices usually happens under rtnl_lock */
> + if (lockdep_rtnl_is_held())
> + return -1;
> +
> +#ifdef CONFIG_DEBUG_NET_SMALL_RTNL
> + /* It's okay to use per-netns rtnl_lock if devices share netns */
> + if (net_eq(dev_net(dev_a), dev_net(dev_b)) &&
> + lockdep_rtnl_net_is_held(dev_net(dev_a)))
Do we need
!from_cleanup_net()
before lockdep_rtnl_net_is_held() ?
__rtnl_net_lock() is not held in ops_exit_rtnl_list() and
default_device_exit_batch() when calling unregister_netdevice_many().
> + return -1;
> +#endif
> +
> + /* Otherwise taking two instance locks is not allowed */
> + return 1;
> }
>
> #define netdev_lockdep_set_classes(dev) \
> --
> 2.49.0
>
* Re: [PATCH net-next] net: let lockdep compare instance locks
2025-05-16 1:49 ` Kuniyuki Iwashima
@ 2025-05-16 2:36 ` Jakub Kicinski
2025-05-16 2:59 ` Kuniyuki Iwashima
0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2025-05-16 2:36 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: andrew+netdev, davem, edumazet, horms, netdev, pabeni, sdf
On Thu, 15 May 2025 18:49:07 -0700 Kuniyuki Iwashima wrote:
> > +#ifdef CONFIG_DEBUG_NET_SMALL_RTNL
> > + /* It's okay to use per-netns rtnl_lock if devices share netns */
> > + if (net_eq(dev_net(dev_a), dev_net(dev_b)) &&
> > + lockdep_rtnl_net_is_held(dev_net(dev_a)))
>
> Do we need
>
> !from_cleanup_net()
>
> before lockdep_rtnl_net_is_held() ?
>
> __rtnl_net_lock() is not held in ops_exit_rtnl_list() and
> default_device_exit_batch() when calling unregister_netdevice_many().
Or do we need
if (from_cleanup_net())
return -1;
?
Is the thinking that once the big rtnl lock disappears in cleanup_net
the devices are safe to destroy without any locking because there can't
be any live users trying to access them?
* Re: [PATCH net-next] net: let lockdep compare instance locks
2025-05-16 2:36 ` Jakub Kicinski
@ 2025-05-16 2:59 ` Kuniyuki Iwashima
2025-05-16 15:22 ` Jakub Kicinski
0 siblings, 1 reply; 8+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-16 2:59 UTC (permalink / raw)
To: kuba; +Cc: andrew+netdev, davem, edumazet, horms, kuniyu, netdev, pabeni,
sdf
From: Jakub Kicinski <kuba@kernel.org>
Date: Thu, 15 May 2025 19:36:09 -0700
> On Thu, 15 May 2025 18:49:07 -0700 Kuniyuki Iwashima wrote:
> > > +#ifdef CONFIG_DEBUG_NET_SMALL_RTNL
> > > + /* It's okay to use per-netns rtnl_lock if devices share netns */
> > > + if (net_eq(dev_net(dev_a), dev_net(dev_b)) &&
> > > + lockdep_rtnl_net_is_held(dev_net(dev_a)))
> >
> > Do we need
> >
> > !from_cleanup_net()
> >
> > before lockdep_rtnl_net_is_held() ?
> >
> > __rtnl_net_lock() is not held in ops_exit_rtnl_list() and
> > default_device_exit_batch() when calling unregister_netdevice_many().
>
> Or do we need
>
> if (from_cleanup_net())
> return -1;
>
> ?
Ah right, otherwise we'll return 1 for cleanup_net() :)
> Is the thinking that once the big rtnl lock disappears in cleanup_net
> the devices are safe to destroy without any locking because there can't
> be any live users trying to access them?
I hope yes, but removing VF via sysfs and removing netns might
race and need some locking ?
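
The two placements of the from_cleanup_net() check discussed above can be compared with a small userspace sketch. The struct and both function names are hypothetical; the flags stand in for state the kernel would get from lockdep and from the cleanup_net() context, and the a == b case is left out for brevity.

```c
#include <stdbool.h>

/* Hypothetical model state; plain flags instead of lockdep queries. */
struct model_ctx {
	bool rtnl_held;		/* models lockdep_rtnl_is_held() */
	bool rtnl_net_held;	/* models lockdep_rtnl_net_is_held() */
	bool same_netns;	/* models net_eq(dev_net(a), dev_net(b)) */
	bool from_cleanup_net;	/* models from_cleanup_net() */
};

/* Variant 1: guard the per-netns check with !from_cleanup_net(). */
static int model_cmp_guarded(const struct model_ctx *c)
{
	if (c->rtnl_held)
		return -1;
	if (c->same_netns && !c->from_cleanup_net && c->rtnl_net_held)
		return -1;
	return 1;
}

/* Variant 2: accept nesting outright when called from cleanup_net(). */
static int model_cmp_early_accept(const struct model_ctx *c)
{
	if (c->rtnl_held)
		return -1;
	if (c->from_cleanup_net)
		return -1;
	if (c->same_netns && c->rtnl_net_held)
		return -1;
	return 1;
}
```

With the global lock not held and from_cleanup_net set, the guarded variant falls through to 1 while the early-accept variant returns -1, which is the difference pointed out in the reply.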
* Re: [PATCH net-next] net: let lockdep compare instance locks
2025-05-16 2:59 ` Kuniyuki Iwashima
@ 2025-05-16 15:22 ` Jakub Kicinski
2025-05-16 17:14 ` Jakub Kicinski
0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2025-05-16 15:22 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: andrew+netdev, davem, edumazet, horms, netdev, pabeni, sdf
On Thu, 15 May 2025 19:59:41 -0700 Kuniyuki Iwashima wrote:
> > Is the thinking that once the big rtnl lock disappears in cleanup_net
> > the devices are safe to destroy without any locking because there can't
> > be any live users trying to access them?
>
> I hope yes, but removing VF via sysfs and removing netns might
> race and need some locking ?
I think we should take the small lock around default_device_exit_net()
and then we'd be safe? Either a given VF gets moved to init_net first
or the sysfs gets to it and unregisters it safely in the old netns.
* Re: [PATCH net-next] net: let lockdep compare instance locks
2025-05-16 15:22 ` Jakub Kicinski
@ 2025-05-16 17:14 ` Jakub Kicinski
2025-05-16 17:50 ` Kuniyuki Iwashima
2025-05-16 17:50 ` Stanislav Fomichev
0 siblings, 2 replies; 8+ messages in thread
From: Jakub Kicinski @ 2025-05-16 17:14 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: andrew+netdev, davem, edumazet, horms, netdev, pabeni, sdf
On Fri, 16 May 2025 08:22:43 -0700 Jakub Kicinski wrote:
> On Thu, 15 May 2025 19:59:41 -0700 Kuniyuki Iwashima wrote:
> > > Is the thinking that once the big rtnl lock disappears in cleanup_net
> > > the devices are safe to destroy without any locking because there can't
> > > be any live users trying to access them?
> >
> > I hope yes, but removing VF via sysfs and removing netns might
> > race and need some locking ?
>
> I think we should take the small lock around default_device_exit_net()
> and then we'd be safe? Either a given VF gets moved to init_net first
> or the sysfs gets to it and unregisters it safely in the old netns.
Thinking about it some more, we'll have to revisit this problem before
removing the big lock, anyway. I'm leaning towards doing this for now:
diff --git a/include/net/netdev_lock.h b/include/net/netdev_lock.h
index 2a753813f849..c345afecd4c5 100644
--- a/include/net/netdev_lock.h
+++ b/include/net/netdev_lock.h
@@ -99,16 +99,15 @@ static inline void netdev_unlock_ops_compat(struct net_device *dev)
 static inline int netdev_lock_cmp_fn(const struct lockdep_map *a,
 				     const struct lockdep_map *b)
 {
-	/* Only lower devices currently grab the instance lock, so no
-	 * real ordering issues can occur. In the near future, only
-	 * hardware devices will grab instance lock which also does not
-	 * involve any ordering. Suppress lockdep ordering warnings
-	 * until (if) we start grabbing instance lock on pure SW
-	 * devices (bond/team/veth/etc).
-	 */
 	if (a == b)
 		return 0;
-	return -1;
+
+	/* Allow locking multiple devices only under rtnl_lock,
+	 * the exact order doesn't matter.
+	 * Note that upper devices don't lock their ops, so nesting
+	 * mostly happens during batched device removal for now.
+	 */
+	return lockdep_rtnl_is_held() ? -1 : 1;
 }

 #define netdev_lockdep_set_classes(dev)				\
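
A rough userspace sketch of how the simplified compare function would be consumed. It assumes, as the original commit message suggests, that lockdep treats a negative result as an acceptable nesting and reports anything else; model_nesting_ok() and the other model_ names are hypothetical and only illustrate that reading.

```c
#include <stdbool.h>

/* Models lockdep_rtnl_is_held(); a plain flag in this sketch. */
static bool model_rtnl_held;

/* Hypothetical stand-in for a device's instance lock. */
struct model_lock {
	int id;
};

/* Mirrors the simplified netdev_lock_cmp_fn() from the diff above. */
static int model_cmp_fn(const struct model_lock *a,
			const struct model_lock *b)
{
	if (a == b)
		return 0;
	return model_rtnl_held ? -1 : 1;
}

/* Assumed reading of lockdep: only a negative result accepts
 * taking "next" while "held" is already locked.
 */
static bool model_nesting_ok(const struct model_lock *held,
			     const struct model_lock *next)
{
	return model_cmp_fn(held, next) < 0;
}
```

Under this reading, nesting two instance locks is accepted in either order while the rtnl flag is set, and flagged in both orders when it is not.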
* Re: [PATCH net-next] net: let lockdep compare instance locks
2025-05-16 17:14 ` Jakub Kicinski
@ 2025-05-16 17:50 ` Kuniyuki Iwashima
2025-05-16 17:50 ` Stanislav Fomichev
1 sibling, 0 replies; 8+ messages in thread
From: Kuniyuki Iwashima @ 2025-05-16 17:50 UTC (permalink / raw)
To: kuba; +Cc: andrew+netdev, davem, edumazet, horms, kuniyu, netdev, pabeni,
sdf
From: Jakub Kicinski <kuba@kernel.org>
Date: Fri, 16 May 2025 10:14:41 -0700
> On Fri, 16 May 2025 08:22:43 -0700 Jakub Kicinski wrote:
> > On Thu, 15 May 2025 19:59:41 -0700 Kuniyuki Iwashima wrote:
> > > > Is the thinking that once the big rtnl lock disappears in cleanup_net
> > > > the devices are safe to destroy without any locking because there can't
> > > > be any live users trying to access them?
> > >
> > > I hope yes, but removing VF via sysfs and removing netns might
> > > race and need some locking ?
> >
> > I think we should take the small lock around default_device_exit_net()
> > and then we'd be safe?
Agreed. Only the 'queuing dev for destruction' part will be racy.
> > Either a given VF gets moved to init_net first
> > or the sysfs gets to it and unregisters it safely in the old netns.
>
> Thinking about it some more, we'll have to revisit this problem before
> removing the big lock, anyway. I'm leaning towards doing this for now:
This looks good to me.
>
> diff --git a/include/net/netdev_lock.h b/include/net/netdev_lock.h
> index 2a753813f849..c345afecd4c5 100644
> --- a/include/net/netdev_lock.h
> +++ b/include/net/netdev_lock.h
> @@ -99,16 +99,15 @@ static inline void netdev_unlock_ops_compat(struct net_device *dev)
> static inline int netdev_lock_cmp_fn(const struct lockdep_map *a,
> const struct lockdep_map *b)
> {
> - /* Only lower devices currently grab the instance lock, so no
> - * real ordering issues can occur. In the near future, only
> - * hardware devices will grab instance lock which also does not
> - * involve any ordering. Suppress lockdep ordering warnings
> - * until (if) we start grabbing instance lock on pure SW
> - * devices (bond/team/veth/etc).
> - */
> if (a == b)
> return 0;
> - return -1;
> +
> + /* Allow locking multiple devices only under rtnl_lock,
> + * the exact order doesn't matter.
> + * Note that upper devices don't lock their ops, so nesting
> + * mostly happens during batched device removal for now.
> + */
> + return lockdep_rtnl_is_held() ? -1 : 1;
> }
>
> #define netdev_lockdep_set_classes(dev) \
* Re: [PATCH net-next] net: let lockdep compare instance locks
2025-05-16 17:14 ` Jakub Kicinski
2025-05-16 17:50 ` Kuniyuki Iwashima
@ 2025-05-16 17:50 ` Stanislav Fomichev
1 sibling, 0 replies; 8+ messages in thread
From: Stanislav Fomichev @ 2025-05-16 17:50 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Kuniyuki Iwashima, andrew+netdev, davem, edumazet, horms, netdev,
pabeni, sdf
On 05/16, Jakub Kicinski wrote:
> On Fri, 16 May 2025 08:22:43 -0700 Jakub Kicinski wrote:
> > On Thu, 15 May 2025 19:59:41 -0700 Kuniyuki Iwashima wrote:
> > > > Is the thinking that once the big rtnl lock disappears in cleanup_net
> > > > the devices are safe to destroy without any locking because there can't
> > > > be any live users trying to access them?
> > >
> > > I hope yes, but removing VF via sysfs and removing netns might
> > > race and need some locking ?
> >
> > I think we should take the small lock around default_device_exit_net()
> > and then we'd be safe? Either a given VF gets moved to init_net first
> > or the sysfs gets to it and unregisters it safely in the old netns.
>
> Thinking about it some more, we'll have to revisit this problem before
> removing the big lock, anyway. I'm leaning towards doing this for now:
+1