From: Stanislav Fomichev <stfomichev@gmail.com>
To: Cosmin Ratiu <cratiu@nvidia.com>
Cc: "pabeni@redhat.com" <pabeni@redhat.com>,
	"kuba@kernel.org" <kuba@kernel.org>,
	"edumazet@google.com" <edumazet@google.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"saeed@kernel.org" <saeed@kernel.org>,
	"sdf@fomichev.me" <sdf@fomichev.me>,
	"davem@davemloft.net" <davem@davemloft.net>
Subject: Re: [PATCH net-next v10 08/14] net: hold netdev instance lock during sysfs operations
Date: Mon, 24 Mar 2025 11:18:54 -0700
Message-ID: <Z-GiDo7wWJ4zFEmt@mini-arch>
In-Reply-To: <8e5bf1dffe7c5ae2191e9082dcd0f72469b4fc0b.camel@nvidia.com>

On 03/24, Cosmin Ratiu wrote:
> On Mon, 2025-03-24 at 09:06 -0700, Stanislav Fomichev wrote:
> > On 03/24, Cosmin Ratiu wrote:
> > > Call Trace:
> > > dump_stack_lvl+0x62/0x90
> > > print_deadlock_bug+0x274/0x3b0
> > > __lock_acquire+0x1229/0x2470
> > > lock_acquire+0xb7/0x2b0
> > > __mutex_lock+0xa6/0xd20
> > > dev_disable_lro+0x20/0x80
> > > inetdev_init+0x12f/0x1f0
> > > inetdev_event+0x48b/0x870
> > > notifier_call_chain+0x38/0xf0
> > > netif_change_net_namespace+0x72e/0x9f0
> > > do_setlink.isra.0+0xd5/0x1220
> > > rtnl_newlink+0x7ea/0xb50
> > > rtnetlink_rcv_msg+0x459/0x5e0
> > > netlink_rcv_skb+0x54/0x100
> > > netlink_unicast+0x193/0x270
> > > netlink_sendmsg+0x204/0x450
> > 
> > I think something like the patch below should fix it? inetdev_init is
> > called for blackhole (sw device, we don't care about ops lock) and
> > from
> > REGISTER/UNREGISTER notifiers. We hold the lock during REGISTER,
> > and will soon hold the lock during UNREGISTER:
> > https://lore.kernel.org/netdev/20250312223507.805719-9-kuba@kernel.org/
> > 
> > (might also need to EXPORT_SYM netif_disable_lro)
> > 
> > diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> > index 754f60fb6e25..77e5705ac799 100644
> > --- a/net/ipv4/devinet.c
> > +++ b/net/ipv4/devinet.c
> > @@ -281,7 +281,7 @@ static struct in_device *inetdev_init(struct
> > net_device *dev)
> >  	if (!in_dev->arp_parms)
> >  		goto out_kfree;
> >  	if (IPV4_DEVCONF(in_dev->cnf, FORWARDING))
> > -		dev_disable_lro(dev);
> > +		netif_disable_lro(dev);
> >  	/* Reference in_dev->dev */
> >  	netdev_hold(dev, &in_dev->dev_tracker, GFP_KERNEL);
> >  	/* Account for reference dev->ip_ptr (below) */
> 
> Unfortunately, this seems to result, on another code path, in:
> WARNING: CPU: 10 PID: 1479 at ./include/net/netdev_lock.h:54
> __netdev_update_features+0x65f/0xca0
> __warn+0x81/0x180
> __netdev_update_features+0x65f/0xca0
> report_bug+0x156/0x180
> handle_bug+0x4f/0x90
> exc_invalid_op+0x13/0x60
> asm_exc_invalid_op+0x16/0x20
> __netdev_update_features+0x65f/0xca0
> netif_disable_lro+0x30/0x1d0
> inetdev_init+0x12f/0x1f0
> inetdev_event+0x48b/0x870
> notifier_call_chain+0x38/0xf0
> register_netdevice+0x741/0x8b0
> register_netdev+0x1f/0x40
> mlx5e_probe+0x4e3/0x8e0 [mlx5_core]
> auxiliary_bus_probe+0x3f/0x90
> really_probe+0xc3/0x3a0
> __driver_probe_device+0x80/0x150
> driver_probe_device+0x1f/0x90
> __device_attach_driver+0x7d/0x100
> bus_for_each_drv+0x80/0xd0
> __device_attach+0xb4/0x1c0
> bus_probe_device+0x91/0xa0
> device_add+0x657/0x870
> 
> I see register_netdevice briefly acquires the netdev lock in two
> separate blocks and has a __netdev_update_features call in one of the
> blocks, but the lock is not held for
> call_netdevice_notifiers(NETDEV_REGISTER, dev).

Ok, so we might need to also try to run NETDEV_REGISTER hooks
consistently under the instance lock. This might bring more surprises,
but I don't see any other easy option. Will test it out locally...

diff --git a/net/core/dev.c b/net/core/dev.c
index f29c1368c304..d672d521b92a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1815,7 +1815,9 @@ static int call_netdevice_register_notifiers(struct notifier_block *nb,
 {
 	int err;
 
+	netdev_lock_ops(dev);
 	err = call_netdevice_notifier(nb, NETDEV_REGISTER, dev);
+	netdev_unlock_ops(dev);
 	err = notifier_to_errno(err);
 	if (err)
 		return err;
@@ -11014,7 +11016,9 @@ int register_netdevice(struct net_device *dev)
 		memcpy(dev->perm_addr, dev->dev_addr, dev->addr_len);
 
 	/* Notify protocols, that a new device appeared. */
+	netdev_lock_ops(dev);
 	ret = call_netdevice_notifiers(NETDEV_REGISTER, dev);
+	netdev_unlock_ops(dev);
 	ret = notifier_to_errno(ret);
 	if (ret) {
 		/* Expect explicit free_netdev() on failure */
@@ -12036,6 +12040,7 @@ int netif_change_net_namespace(struct net_device *dev, struct net *net,
 	int err, new_nsid;
 
 	ASSERT_RTNL();
+	netdev_ops_assert_locked(dev);
 
 	/* Don't allow namespace local devices to be moved. */
 	err = -EINVAL;
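
To illustrate why the notifier call needs the lock, here is a minimal
user-space sketch of the invariant, not kernel code. All names in it
(model_dev, model_lock_ops, model_assert_locked, model_register_hook,
run_model) are hypothetical stand-ins: a plain flag models the instance
lock, and a counter models the WARNING raised when a hook reaches code
that expects the lock to be held.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct net_device; not a kernel type. */
struct model_dev {
	bool ops_locked;	/* models "instance lock is held" */
	int warnings;		/* counts failed lock assertions */
};

/* Model of taking the instance lock (single-threaded, so a flag is enough). */
static void model_lock_ops(struct model_dev *dev)
{
	dev->ops_locked = true;
}

/* Model of releasing the instance lock. */
static void model_unlock_ops(struct model_dev *dev)
{
	dev->ops_locked = false;
}

/* Model of a lock assertion: record a warning instead of splatting. */
static void model_assert_locked(struct model_dev *dev)
{
	if (!dev->ops_locked)
		dev->warnings++;
}

/*
 * Model of a REGISTER hook that ends up in code which asserts
 * that the lock is held.
 */
static int model_register_hook(struct model_dev *dev)
{
	model_assert_locked(dev);
	return 0;
}

/*
 * Run the hook with or without the lock held around the call and
 * return the number of warnings that were recorded.
 */
int run_model(bool hold_lock)
{
	struct model_dev dev = { .ops_locked = false, .warnings = 0 };

	if (hold_lock)
		model_lock_ops(&dev);
	model_register_hook(&dev);
	if (hold_lock)
		model_unlock_ops(&dev);

	return dev.warnings;
}
```

Calling run_model(false) models the unlocked notifier call and records
one warning. Calling run_model(true) models wrapping the call in
lock/unlock, as the diff above does, and records none.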
