From: Stanislav Fomichev <stfomichev@gmail.com>
To: Cosmin Ratiu <cratiu@nvidia.com>
Cc: "pabeni@redhat.com" <pabeni@redhat.com>,
"edumazet@google.com" <edumazet@google.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"davem@davemloft.net" <davem@davemloft.net>,
"sdf@fomichev.me" <sdf@fomichev.me>,
"kuba@kernel.org" <kuba@kernel.org>
Subject: Re: [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER
Date: Wed, 26 Mar 2025 13:37:58 -0700
Message-ID: <Z-Rlpgp3vb-zsgSM@mini-arch>
In-Reply-To: <Z-Q-QYvFvQG2usfv@mini-arch>

On 03/26, Stanislav Fomichev wrote:
> On 03/26, Cosmin Ratiu wrote:
> > On Wed, 2025-03-26 at 08:23 -0700, Stanislav Fomichev wrote:
> > > @@ -2028,7 +2028,7 @@ int unregister_netdevice_notifier(struct
> > > notifier_block *nb)
> > >
> > > for_each_net(net) {
> > > __rtnl_net_lock(net);
> > > - call_netdevice_unregister_net_notifiers(nb, net,
> > > true);
> > > + call_netdevice_unregister_net_notifiers(nb, net,
> > > NULL);
> > > __rtnl_net_unlock(net);
> > > }
> >
> > I tested. The deadlock is back now, because dev != NULL and if the lock
> > is held (like in the below stack), the mutex_lock will be attempted
> > again:
>
> I think I'm missing something. In this case I'm not sure why the original
> "fix" worked.
>
> You, presumably, use mlx5? And you just move this single device into
> a new netns? Or there is a couple of other mlx5 devices still hanging in
> the root netns?
>
> I'll try to take a look more at register_netdevice_notifier_net under
> mlx5..

I have a feeling that it's a spurious warning: the two lock addresses
are different, so these are two separate dev->lock instances:

  ip/1766 is trying to acquire lock:
  ffff888110e18c80 (&dev->lock){+.+.}-{4:4}, at: call_netdevice_unregister_notifiers+0x7d/0x140

  but task is already holding lock:
  ffff888130ae0c80 (&dev->lock){+.+.}-{4:4}, at: do_setlink.isra.0+0x5b/0x1220

Can you try applying the following on top of the previous patch? At least
to confirm whether it matches my understanding. We might also stick
with that unless we find a better option.

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 3506024c2453..e3d8d6c9bf03 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -40,6 +40,7 @@
 #include <linux/if_bridge.h>
 #include <linux/filter.h>
 #include <net/netdev_queues.h>
+#include <net/netdev_lock.h>
 #include <net/page_pool/types.h>
 #include <net/pkt_sched.h>
 #include <net/xdp_sock_drv.h>
@@ -5454,6 +5455,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
 	netdev->netdev_ops = &mlx5e_netdev_ops;
 	netdev->xdp_metadata_ops = &mlx5e_xdp_metadata_ops;
 	netdev->xsk_tx_metadata_ops = &mlx5e_xsk_tx_metadata_ops;
+	netdev_lockdep_set_classes(netdev);
 	mlx5e_dcbnl_build_netdev(netdev);
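
For readers following the thread, here is a rough user-space sketch of why
a report can fire even though the two addresses differ. It assumes only the
general lockdep behaviour of checking lock classes rather than individual
lock instances. Every name in it (toy_class, toy_lock, toy_task,
toy_acquire) is hypothetical and exists only for illustration. It does not
model what netdev_lockdep_set_classes() does internally, and it is not
kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock class; not the kernel's lock_class_key. */
struct toy_class {
	const char *name;
};

/* A lock instance. Many instances may point at the same class. */
struct toy_lock {
	const struct toy_class *cls;
};

#define TOY_MAX_HELD 8

/* Per-task stack of currently held locks. */
struct toy_task {
	const struct toy_lock *held[TOY_MAX_HELD];
	size_t depth;
};

/*
 * Record 'l' as held by 't'. Returns true when a class-based checker
 * would complain: the task already holds a lock of the same class,
 * even if that lock is a different instance at a different address.
 */
static bool toy_acquire(struct toy_task *t, const struct toy_lock *l)
{
	bool warn = false;

	for (size_t i = 0; i < t->depth; i++) {
		if (t->held[i]->cls == l->cls) {
			warn = true;
			break;
		}
	}
	if (t->depth < TOY_MAX_HELD)
		t->held[t->depth++] = l;
	return warn;
}
```

In this toy model, two distinct locks that share one class trigger the
report on the second acquire, while a lock placed in a class of its own
does not. That would be consistent with the different addresses in the
splat above, but it is only an illustration of the suspicion, not a
confirmation of it.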

Thread overview: 21+ messages
2025-03-25 21:30 [PATCH net-next 0/9] net: hold instance lock during NETDEV_UP/REGISTER/UNREGISTER Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 1/9] net: switch to netif_disable_lro in inetdev_init Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 2/9] net: hold instance lock during NETDEV_REGISTER/UP/UNREGISTER Stanislav Fomichev
2025-03-26 15:03 ` Cosmin Ratiu
2025-03-26 15:23 ` Stanislav Fomichev
2025-03-26 15:37 ` Cosmin Ratiu
2025-03-26 17:49 ` Stanislav Fomichev
2025-03-26 20:37 ` Stanislav Fomichev [this message]
2025-03-26 20:57 ` Cosmin Ratiu
2025-03-26 21:18 ` Cosmin Ratiu
2025-03-26 22:02 ` Stanislav Fomichev
2025-03-26 15:24 ` Cosmin Ratiu
2025-03-26 17:43 ` Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 3/9] net: use netif_disable_lro in ipv6_add_dev Stanislav Fomichev
2025-03-26 7:33 ` kernel test robot
2025-03-25 21:30 ` [PATCH net-next 4/9] net: dummy: request ops lock Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 5/9] net: release instance lock during NETDEV_UNREGISTER for bond/team Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 6/9] docs: net: document netdev notifier expectations Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 7/9] net: designate XSK pool pointers in queues as "ops protected" Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 8/9] netdev: add "ops compat locking" helpers Stanislav Fomichev
2025-03-25 21:30 ` [PATCH net-next 9/9] netdev: don't hold rtnl_lock over nl queue info get when possible Stanislav Fomichev