public inbox for linux-ext4@vger.kernel.org
From: Thomas Gleixner <tglx@linutronix.de>
To: David Woodhouse <dwmw2@infradead.org>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Jason Wang <jasowang@redhat.com>
Cc: "x86@kernel.org" <x86@kernel.org>, hpa <hpa@zytor.com>,
	dyoung <dyoung@redhat.com>, kexec <kexec@lists.infradead.org>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Stefano Garzarella <sgarzare@redhat.com>,
	eperezma <eperezma@redhat.com>,
	Paolo Bonzini <bonzini@redhat.com>,
	ming.lei@redhat.com, Petr Mladek <pmladek@suse.com>,
	John Ogness <jogness@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: Lockdep warnings on kexec (virtio_blk, hrtimers)
Date: Thu, 12 Dec 2024 14:34:48 +0100
Message-ID: <87ldwl9g93.ffs@tglx>
In-Reply-To: <7717fe2ac0ce5f0a2c43fdab8b11f4483d54a2a4.camel@infradead.org>

CC+ printk folks

On Thu, Dec 12 2024 at 11:07, David Woodhouse wrote:
> On Wed, 2024-12-11 at 07:42 -0500, Stefan Hajnoczi wrote:
>> On Tue, Dec 10, 2024 at 09:56:43AM +0800, Jason Wang wrote:
>> > Adding more virtio-blk people here.
>> 
>> Please try Ming Lei's recent fix in Jens' tree:
>> 
>>   virtio-blk: don't keep queue frozen during system suspend
>>   commit: 7678abee0867e6b7fb89aa40f6e9f575f755fb37
>> 
>> https://git.kernel.dk/cgit/linux/commit/?h=block-6.13&id=7678abee0867e6b7fb89aa40f6e9f575f755fb37
>
> Thanks. That does make those warnings go away. I do still get this one
> occasionally though. It seems to go away without 'no_console_suspend'
> on the command line, but I'm not sure that makes it OK.

Not really.

> [   23.665790] Interrupts enabled after irqrouter_resume+0x0/0x50

The resume callback irqrouter_resume() returns with interrupts enabled,
but it's absolutely unclear where this happens. The lockdep tracking is
not really helpful:

> [   23.697043] hardirqs last  enabled at (15573): [<ffffffffa8281b8e>] __up_console_sem+0x7e/0x90
> [   23.697855] hardirqs last disabled at (15580): [<ffffffffa8281b73>] __up_console_sem+0x63/0x90

__up_console_sem()
{
	printk_safe_enter_irqsave(flags);	// Assuming this is __up_console_sem+0x63/0x90
						// Saves state in @flags and disables interrupts
	up(&console_sem);
	printk_safe_exit_irqrestore(flags);	// Assuming this is __up_console_sem+0x7e/0x90
						// Restores the interrupt state from @flags
}

Though the events are in reverse order:

    last enabled  at 15573
    last disabled at 15580

At event #15573 printk_safe_exit_irqrestore(flags) enabled interrupts,
which means the preceding printk_safe_enter_irqsave(flags) was invoked
with interrupts enabled. But that enable event wiped the real culprit,
which enabled interrupts before __up_console_sem() was invoked.

At event #15580 printk_safe_enter_irqsave(flags) disables interrupts
again, which is probably at the point where printk() dumps the bug, but
I might be misreading this.
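
For illustration only, the reasoning above can be replayed in a small
userspace model. This is a sketch under stated assumptions, not kernel
code: struct irqtrace_model and all model_*() helpers are hypothetical
names, and the model assumes that an event is recorded only on a real
state transition. The event numbers it produces are its own and are not
meant to match the numbers in the log.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; every name here is hypothetical, not a kernel API. */
struct irqtrace_model {
	bool irqs_enabled;
	unsigned int event;		/* running event counter */
	unsigned int last_enabled;	/* event number of the most recent enable */
	unsigned int last_disabled;	/* event number of the most recent disable */
};

/* Record an enable event, but only on a disabled -> enabled transition. */
static void model_irq_enable(struct irqtrace_model *t)
{
	if (t->irqs_enabled)
		return;
	t->irqs_enabled = true;
	t->last_enabled = ++t->event;
}

/* Record a disable event, but only on an enabled -> disabled transition. */
static void model_irq_disable(struct irqtrace_model *t)
{
	if (!t->irqs_enabled)
		return;
	t->irqs_enabled = false;
	t->last_disabled = ++t->event;
}

/* Modelled on irqsave semantics: return the previous state, then disable. */
static bool model_irq_save(struct irqtrace_model *t)
{
	bool flags = t->irqs_enabled;

	model_irq_disable(t);
	return flags;
}

/* Modelled on irqrestore semantics: enable only if the saved state was enabled. */
static void model_irq_restore(struct irqtrace_model *t, bool flags)
{
	if (flags)
		model_irq_enable(t);
}

/*
 * Replay the suspected sequence. Returns the event number at which the
 * unknown culprit enabled interrupts.
 */
static unsigned int model_replay(struct irqtrace_model *t)
{
	unsigned int culprit;
	bool flags;

	*t = (struct irqtrace_model){ 0 };	/* start with interrupts disabled */

	model_irq_enable(t);			/* the unknown culprit */
	culprit = t->last_enabled;

	flags = model_irq_save(t);		/* __up_console_sem(): enter */
	model_irq_restore(t, flags);		/* __up_console_sem(): exit */
	assert(t->last_enabled != culprit);	/* the culprit's record is gone */

	(void)model_irq_save(t);		/* printing the warning: enter */
	return culprit;
}
```

After model_replay() the last recorded enable is older than the last
recorded disable, and neither of them points at the culprit anymore,
which is the same shape as the "reverse order" in the log.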

Now David's observation that the problem "goes away" when he drops
"no_console_suspend" from the command line is definitely interesting, but
does not really help in figuring out the root cause.

> [   23.698673] softirqs last  enabled at (14798): [<ffffffffa81c6c12>] __irq_exit_rcu+0xe2/0x100
> [   23.699481] softirqs last disabled at (14777): [<ffffffffa81c6c12>] __irq_exit_rcu+0xe2/0x100
> [   23.700284] ---[ end trace 0000000000000000 ]---
> [   23.702460] ------------[ cut here ]------------
> [   23.702963] WARNING: CPU: 0 PID: 560 at kernel/time/hrtimer.c:995 hrtimers_resume_local+0x29/0x40

This one is just a consequence of the above.

David, can you retest with the debug patch below? That should pinpoint
the real culprit.

Thanks,

        tglx
---
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -621,6 +621,9 @@ do {									\
 
 extern void lockdep_assert_in_softirq_func(void);
 
+extern void lockdep_suspend_syscore_enter(void);
+extern void lockdep_suspend_syscore_exit(void);
+
 #else
 # define might_lock(lock) do { } while (0)
 # define might_lock_read(lock) do { } while (0)
@@ -635,6 +638,8 @@ extern void lockdep_assert_in_softirq_fu
 # define lockdep_assert_preemption_disabled() do { } while (0)
 # define lockdep_assert_in_softirq() do { } while (0)
 # define lockdep_assert_in_softirq_func() do { } while (0)
+static inline void lockdep_suspend_syscore_enter(void) { }
+static inline void lockdep_suspend_syscore_exit(void) { }
 #endif
 
 #ifdef CONFIG_PROVE_RAW_LOCK_NESTING
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1025,6 +1025,7 @@ int kernel_kexec(void)
 		if (error)
 			goto Enable_cpus;
 		local_irq_disable();
+		lockdep_suspend_syscore_enter();
 		error = syscore_suspend();
 		if (error)
 			goto Enable_irqs;
@@ -1054,6 +1055,7 @@ int kernel_kexec(void)
 	if (kexec_image->preserve_context) {
 		syscore_resume();
  Enable_irqs:
+		lockdep_suspend_syscore_exit();
 		local_irq_enable();
  Enable_cpus:
 		suspend_enable_secondary_cpus();
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4408,6 +4408,18 @@ void lockdep_hardirqs_on_prepare(void)
 }
 EXPORT_SYMBOL_GPL(lockdep_hardirqs_on_prepare);
 
+static bool suspend_syscore_active;
+
+void noinstr lockdep_suspend_syscore_enter(void)
+{
+	suspend_syscore_active = true;
+}
+
+void noinstr lockdep_suspend_syscore_exit(void)
+{
+	suspend_syscore_active = false;
+}
+
 void noinstr lockdep_hardirqs_on(unsigned long ip)
 {
 	struct irqtrace_events *trace = &current->irqtrace;
@@ -4456,6 +4468,8 @@ void noinstr lockdep_hardirqs_on(unsigne
 	if (DEBUG_LOCKS_WARN_ON(!irqs_disabled()))
 		return;
 
+	DEBUG_LOCKS_WARN_ON(suspend_syscore_active);
+
 	/*
 	 * Ensure the lock stack remained unchanged between
 	 * lockdep_hardirqs_on_prepare() and lockdep_hardirqs_on().
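
For illustration only, the idea of the debug patch can be modelled in
userspace: set a flag around the region which must run with interrupts
disabled and trap in the enable hook itself, instead of noticing the
damage only after a callback has returned. This is a sketch under stated
assumptions, not kernel code. All model_*() and cb_*() names are
hypothetical, and the model records a callback index where the real
patch relies on the stack trace of DEBUG_LOCKS_WARN_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; all names are hypothetical stand-ins. */
static bool model_irqs_enabled;
static bool model_syscore_active;	/* mirrors suspend_syscore_active */
static int model_current_cb = -1;	/* index of the callback being run */
static int model_offender = -1;		/* first callback caught enabling */

static void model_syscore_enter(void) { model_syscore_active = true; }
static void model_syscore_exit(void) { model_syscore_active = false; }

/* Stand-in for the enable hook: trap at the moment of the enable. */
static void model_hardirqs_on(void)
{
	if (model_syscore_active && model_offender < 0)
		model_offender = model_current_cb;
	model_irqs_enabled = true;
}

/* A well-behaved callback which leaves the interrupt state alone. */
static void cb_clean(void) { }

/* A misbehaving callback which returns with interrupts enabled. */
static void cb_leaky(void) { model_hardirqs_on(); }

/* Run all callbacks inside the guarded section, return the offender. */
static int model_run_callbacks(void)
{
	void (*const cbs[])(void) = { cb_clean, cb_leaky, cb_clean };
	size_t i;

	model_irqs_enabled = false;
	model_offender = -1;

	model_syscore_enter();
	for (i = 0; i < sizeof(cbs) / sizeof(cbs[0]); i++) {
		model_current_cb = (int)i;
		cbs[i]();
	}
	model_syscore_exit();

	model_current_cb = -1;
	model_hardirqs_on();	/* legitimate enable after the section */
	assert(model_irqs_enabled);
	return model_offender;
}
```

model_run_callbacks() returns the index of the first callback which
enabled interrupts inside the guarded section, or -1 if none did. The
enable after model_syscore_exit() is not flagged, which corresponds to
the patch calling lockdep_suspend_syscore_exit() right before
local_irq_enable().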


Thread overview: 34+ messages
2024-12-09 14:28 Lockdep warnings on kexec (virtio_blk, hrtimers) David Woodhouse
2024-12-10  1:56 ` Jason Wang
2024-12-11 12:42   ` Stefan Hajnoczi
2024-12-12 11:07     ` David Woodhouse
2024-12-12 13:34       ` Thomas Gleixner [this message]
2024-12-12 13:46         ` David Woodhouse
2024-12-12 18:04           ` Thomas Gleixner
2024-12-12 19:19             ` David Woodhouse
2024-12-13  0:14               ` Thomas Gleixner
2024-12-13  9:31                 ` David Woodhouse
2024-12-13  9:43                   ` David Woodhouse
2024-12-13 10:42                     ` Thomas Gleixner
2024-12-13 11:09                       ` Ming Lei
2024-12-13 11:31                         ` Thomas Gleixner
2024-12-13 11:48                           ` Ming Lei
2024-12-13 13:23                             ` Thomas Gleixner
2024-12-13 14:07                               ` David Woodhouse
2024-12-13 17:05                                 ` Thomas Gleixner
2024-12-13 17:17                                   ` David Woodhouse
2024-12-13 17:48                                     ` Rafael J. Wysocki
2024-12-13 17:32                                   ` Rafael J. Wysocki
2024-12-13 19:06                                     ` Rafael J. Wysocki
2024-12-13 20:16                                       ` David Woodhouse
2024-12-14  9:57                                         ` David Woodhouse
2024-12-16 12:14                                           ` Rafael J. Wysocki
2024-12-13 17:59                                   ` Rafael J. Wysocki
2024-12-13 13:17                           ` David Woodhouse
2024-12-13 11:12                       ` David Woodhouse
2024-12-13 11:33                         ` Ming Lei
2024-12-13 11:20                 ` Peter Zijlstra
2024-12-13 13:13                   ` Thomas Gleixner
2024-12-16 13:20                     ` [PATCH] sched: Prevent rescheduling when interrupts are disabled Thomas Gleixner
2024-12-16 17:41                       ` David Woodhouse
2024-12-12 11:12     ` Lockdep warnings on kexec (virtio_blk, hrtimers) Ming Lei
