From: Stanislav Fomichev <stfomichev@gmail.com>
To: Florian Westphal <fw@strlen.de>
Cc: netdev@vger.kernel.org, Paolo Abeni <pabeni@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	linux-kernel@vger.kernel.org, sdf@fomichev.me
Subject: Re: [PATCH net 2/2] net: core: split unregister_netdevice list into smaller chunks
Date: Fri, 10 Oct 2025 15:38:30 -0700
Message-ID: <aOmK5i5e_Oi93JiO@mini-arch>
In-Reply-To: <20251010135412.22602-3-fw@strlen.de>

On 10/10, Florian Westphal wrote:
> Since blamed commit, unregister_netdevice_many_notify() takes the netdev
> mutex if the device needs it.
> 
> This isn't a problem in itself, the problem is that the list can be
> very long, so it may lock a LOT of mutexes, but lockdep engine can only
> deal with MAX_LOCK_DEPTH held locks:
> 
> unshare -n bash -c 'for i in $(seq 1 100);do  ip link add foo$i type dummy;done'
> BUG: MAX_LOCK_DEPTH too low!
> turning off the locking correctness validator.
> depth: 48  max: 48!
> 48 locks held by kworker/u16:1/69:
>  #0: ffff8880010b7148 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work+0x7ed/0x1350
>  #1: ffffc900004a7d40 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work+0xcf3/0x1350
>  #2: ffffffff8bc6fbd0 (pernet_ops_rwsem){++++}-{4:4}, at: cleanup_net+0xab/0x7f0
>  #3: ffffffff8bc8daa8 (rtnl_mutex){+.+.}-{4:4}, at: default_device_exit_batch+0x7e/0x2e0
>  #4: ffff88800b5e9cb0 (&dev_instance_lock_key#3){+.+.}-{4:4}, at: unregister_netdevice_many_notify+0x1056/0x1b00
> [..]
> 
> Work around this limitation by chopping the list into smaller chunks
> and process them individually for LOCKDEP enabled kernels.
> 
> Fixes: 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations")
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  net/core/dev.c | 34 +++++++++++++++++++++++++++++++++-
>  1 file changed, 33 insertions(+), 1 deletion(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 9a09b48c9371..7e35aa4ebc74 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -12208,6 +12208,38 @@ static void unregister_netdevice_close_many(struct list_head *head)
>  	}
>  }
>  
> +static void unregister_netdevice_close_many_lockdep(struct list_head *head)
> +{
> +#ifdef CONFIG_LOCKDEP
> +	unsigned int lock_depth = lockdep_depth(current);
> +	unsigned int lock_count = lock_depth;
> +	struct net_device *dev, *tmp;
> +	LIST_HEAD(done_head);
> +
> +	list_for_each_entry_safe(dev, tmp, head, unreg_list) {
> +		if (netdev_need_ops_lock(dev))
> +			lock_count++;
> +
> +		/* we'll run out of lockdep keys, reduce size. */
> +		if (lock_count >= MAX_LOCK_DEPTH - 1) {
> +			LIST_HEAD(tmp_head);
> +
> +			list_cut_before(&tmp_head, head, &dev->unreg_list);
> +			unregister_netdevice_close_many(&tmp_head);
> +			lock_count = lock_depth;
> +			list_splice_tail(&tmp_head, &done_head);
> +		}
> +	}
> +
> +	unregister_netdevice_close_many(head);
> +
> +	list_for_each_entry_safe_reverse(dev, tmp, &done_head, unreg_list)
> +		list_move(&dev->unreg_list, head);
> +#else
> +	unregister_netdevice_close_many(head);
> +#endif


Any reason not to morph the original code to add this 'no more than 8 at a
time' constraint? Having a separate lockdep path with list juggling
seems a bit fragile.

1. add all ops-locked devs to the list
2. for each batch of up to MAX_LOCK_DEPTH devs (or 'infinity' in the
   non-lockdep case)
  2.1 lock N devs
  2.2 netif_close_many
  2.3 unlock N devs
3. ... do the non-ops-locked ones

This way the code won't diverge too much, I hope.
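
To make the steps above concrete, here is a rough userspace model of the
batching, not kernel code. Everything in it is hypothetical and only for
illustration: struct fake_dev stands in for struct net_device, the
need_ops_lock flag models netdev_need_ops_lock(), and setting the locked
and closed flags stands in for taking the instance lock and for
netif_close_many(). A batch_max of 0 models the 'infinity' (non-lockdep)
case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device (illustration only). */
struct fake_dev {
	bool need_ops_lock;	/* models netdev_need_ops_lock(dev) */
	bool locked;		/* models holding the instance lock */
	bool closed;		/* models the effect of netif_close_many() */
};

struct close_stats {
	size_t max_held;	/* peak number of instance locks held at once */
	size_t batches;		/* number of lock/close/unlock rounds */
	size_t closed;		/* total devices closed */
};

/*
 * Close all devices, never holding more than batch_max instance locks
 * at a time. batch_max == 0 means "no limit" (the non-lockdep case).
 */
static struct close_stats close_many_batched(struct fake_dev *devs, size_t n,
					     size_t batch_max)
{
	struct close_stats st = { 0, 0, 0 };
	size_t i = 0, j;

	assert(devs || n == 0);

	/* Step 2: walk the ops-locked devices in bounded batches. */
	while (i < n) {
		size_t start = i, held = 0;

		/* 2.1: lock up to batch_max ops-locked devices. */
		for (; i < n && (batch_max == 0 || held < batch_max); i++) {
			if (!devs[i].need_ops_lock)
				continue;
			devs[i].locked = true;
			held++;
		}
		if (held == 0)
			break;	/* only non-ops-locked devices were left */
		if (held > st.max_held)
			st.max_held = held;

		/* 2.2: close everything locked in this batch. */
		for (j = start; j < i; j++) {
			if (devs[j].locked) {
				devs[j].closed = true;
				st.closed++;
			}
		}

		/* 2.3: drop the locks before starting the next batch. */
		for (j = start; j < i; j++)
			devs[j].locked = false;
		st.batches++;
	}

	/* Step 3: devices that never needed the instance lock. */
	for (j = 0; j < n; j++) {
		if (!devs[j].need_ops_lock) {
			devs[j].closed = true;
			st.closed++;
		}
	}
	return st;
}
```

The point of the sketch is that the lockdep and non-lockdep cases share one
loop and differ only in the value of batch_max, so there is no separate
list-splitting path to keep in sync.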
