Re: [BUG] -next lockdep invalid wait context

From: Marco Elver <elver@google.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: paulmck@kernel.org, linux-next@vger.kernel.org,
	linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com,
	linux-mm@kvack.org, sfr@canb.auug.org.au, bigeasy@linutronix.de,
	longman@redhat.com, boqun.feng@gmail.com, cl@linux.com,
	penberg@kernel.org, rientjes@google.com, iamjoonsoo.kim@lge.com,
	akpm@linux-foundation.org
Subject: Re: [BUG] -next lockdep invalid wait context
Date: Wed, 30 Oct 2024 23:34:08 +0100
Message-ID: <ZyK0YPgtWExT4deh@elver.google.com>
In-Reply-To: <e06d69c9-f067-45c6-b604-fd340c3bd612@suse.cz>

On Wed, Oct 30, 2024 at 10:48PM +0100, Vlastimil Babka wrote:
> On 10/30/24 22:05, Paul E. McKenney wrote:
> > Hello!
> 
> Hi!
> 
> > The next-20241030 release gets the splat shown below when running
> > scftorture in a preemptible kernel.  This bisects to this commit:
> > 
> > 560af5dc839e ("lockdep: Enable PROVE_RAW_LOCK_NESTING with PROVE_LOCKING")
> > 
> > Except that all this is doing is enabling lockdep to find the problem.
> > 
> > The obvious way to fix this is to make the kmem_cache structure's
> > cpu_slab field's ->lock be a raw spinlock, but this might not be what
> > we want for real-time response.
> 
> But it's a local_lock, not spinlock and it's doing local_lock_irqsave(). I'm
> confused what's happening here, the code has been like this for years now.
> 
> > This can be reproduced deterministically as follows:
> > 
> > tools/testing/selftests/rcutorture/bin/kvm.sh --torture scf --allcpus --duration 2 --configs PREEMPT --kconfig CONFIG_NR_CPUS=64 --memory 7G --trust-make --kasan --bootargs "scftorture.nthreads=64 torture.disable_onoff_at_boot csdlock_debug=1"
> > 
> > I doubt that the number of CPUs or amount of memory makes any difference,
> > but that is what I used.
> > 
> > Thoughts?
> > 
> > 							Thanx, Paul
> > 
> > ------------------------------------------------------------------------
> > 
> > [   35.659746] =============================
> > [   35.659746] [ BUG: Invalid wait context ]
> > [   35.659746] 6.12.0-rc5-next-20241029 #57233 Not tainted
> > [   35.659746] -----------------------------
> > [   35.659746] swapper/37/0 is trying to lock:
> > [   35.659746] ffff8881ff4bf2f0 (&c->lock){....}-{3:3}, at: put_cpu_partial+0x49/0x1b0
> > [   35.659746] other info that might help us debug this:
> > [   35.659746] context-{2:2}
> > [   35.659746] no locks held by swapper/37/0.
> > [   35.659746] stack backtrace:
> > [   35.659746] CPU: 37 UID: 0 PID: 0 Comm: swapper/37 Not tainted 6.12.0-rc5-next-20241029 #57233
> > [   35.659746] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> > [   35.659746] Call Trace:
> > [   35.659746]  <IRQ>
> > [   35.659746]  dump_stack_lvl+0x68/0xa0
> > [   35.659746]  __lock_acquire+0x8fd/0x3b90
> > [   35.659746]  ? start_secondary+0x113/0x210
> > [   35.659746]  ? __pfx___lock_acquire+0x10/0x10
> > [   35.659746]  ? __pfx___lock_acquire+0x10/0x10
> > [   35.659746]  ? __pfx___lock_acquire+0x10/0x10
> > [   35.659746]  ? __pfx___lock_acquire+0x10/0x10
> > [   35.659746]  lock_acquire+0x19b/0x520
> > [   35.659746]  ? put_cpu_partial+0x49/0x1b0
> > [   35.659746]  ? __pfx_lock_acquire+0x10/0x10
> > [   35.659746]  ? __pfx_lock_release+0x10/0x10
> > [   35.659746]  ? lock_release+0x20f/0x6f0
> > [   35.659746]  ? __pfx_lock_release+0x10/0x10
> > [   35.659746]  ? lock_release+0x20f/0x6f0
> > [   35.659746]  ? kasan_save_track+0x14/0x30
> > [   35.659746]  put_cpu_partial+0x52/0x1b0
> > [   35.659746]  ? put_cpu_partial+0x49/0x1b0
> > [   35.659746]  ? __pfx_scf_handler_1+0x10/0x10
> > [   35.659746]  __flush_smp_call_function_queue+0x2d2/0x600
> 
> How did we even get to put_cpu_partial directly from flushing smp calls?
> SLUB doesn't use them, it uses queue_work_on() for flushing and that
> flushing doesn't involve put_cpu_partial() AFAIK.
>
> I think only slab allocation or free can lead to put_cpu_partial(), which
> would mean the backtrace is missing something. And that somebody does a slab
> alloc/free from an smp callback, which I'd then assume isn't allowed?

Tail-call optimization is hiding the caller. Compiling with
-fno-optimize-sibling-calls exposes the caller. This gives the full
picture:

[   40.321505] =============================
[   40.322711] [ BUG: Invalid wait context ]
[   40.323927] 6.12.0-rc5-next-20241030-dirty #4 Not tainted
[   40.325502] -----------------------------
[   40.326653] cpuhp/47/253 is trying to lock:
[   40.327869] ffff8881ff9bf2f0 (&c->lock){....}-{3:3}, at: put_cpu_partial+0x48/0x1a0
[   40.330081] other info that might help us debug this:
[   40.331540] context-{2:2}
[   40.332305] 3 locks held by cpuhp/47/253:
[   40.333468]  #0: ffffffffae6e6910 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0xe0/0x590
[   40.336048]  #1: ffffffffae6e9060 (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0xe0/0x590
[   40.338607]  #2: ffff8881002a6948 (&root->kernfs_rwsem){++++}-{4:4}, at: kernfs_remove_by_name_ns+0x78/0x100
[   40.341454] stack backtrace:
[   40.342291] CPU: 47 UID: 0 PID: 253 Comm: cpuhp/47 Not tainted 6.12.0-rc5-next-20241030-dirty #4
[   40.344807] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   40.347482] Call Trace:
[   40.348199]  <IRQ>
[   40.348827]  dump_stack_lvl+0x6b/0xa0
[   40.349899]  dump_stack+0x10/0x20
[   40.350850]  __lock_acquire+0x900/0x4010
[   40.360290]  lock_acquire+0x191/0x4f0
[   40.364850]  put_cpu_partial+0x51/0x1a0
[   40.368341]  scf_handler+0x1bd/0x290
[   40.370590]  scf_handler_1+0x4e/0xb0
[   40.371630]  __flush_smp_call_function_queue+0x2dd/0x600
[   40.373142]  generic_smp_call_function_single_interrupt+0xe/0x20
[   40.374801]  __sysvec_call_function_single+0x50/0x280
[   40.376214]  sysvec_call_function_single+0x6c/0x80
[   40.377543]  </IRQ>
[   40.378142]  <TASK>

And scf_handler does indeed tail-call kfree:

	static void scf_handler(void *scfc_in)
	{
	[...]
		} else {
			kfree(scfcp);
		}
	}
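
The effect can be reproduced outside the kernel. Below is a minimal
user-space sketch, not kernel code: handler() and leaf_release() are
hypothetical stand-ins for scf_handler() and kfree(), and the stack depth
is measured with glibc's backtrace(), so glibc is assumed. When the
compiler turns the call in handler() into a jump, handler()'s frame is
gone by the time the depth is recorded.

```c
/* User-space illustration only; the names here are made up for the example. */
#include <execinfo.h>
#include <stdlib.h>

#define MAX_FRAMES 32

/* Number of stack frames seen by the most recent leaf_release() call. */
static int last_depth;

/* Stand-in for kfree(): records the current stack depth, then frees. */
__attribute__((noinline)) static void leaf_release(void *p)
{
	void *frames[MAX_FRAMES];

	last_depth = backtrace(frames, MAX_FRAMES);
	free(p);
}

/*
 * Stand-in for scf_handler(): the call below is in tail position, so an
 * optimizing compiler may emit a jump instead of a call, which removes
 * this function's frame from the backtrace.
 */
__attribute__((noinline)) static void handler(void *p)
{
	leaf_release(p);
}

/* Depth observed when leaf_release() is reached through handler(). */
int handler_depth(void)
{
	handler(malloc(16));
	return last_depth;
}

/* Depth observed when leaf_release() is called directly. */
int direct_depth(void)
{
	leaf_release(malloc(16));
	return last_depth;
}
```

Calling both functions from a main() and comparing the results shows the
difference: built with something like "gcc -O2" the two depths would be
expected to match, because handler() has vanished from the stack, while
adding -fno-optimize-sibling-calls (or building with -O0) should make
handler_depth() one frame deeper. The exact outcome depends on the
compiler and its version.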
