From: Petr Mladek <pmladek@suse.com>
To: gregkh@linuxfoundation.org
Cc: songmuchun@bytedance.com, sergey.senozhatsky@gmail.com,
stable@vger.kernel.org
Subject: Re: FAILED: patch "[PATCH] printk: fix deadlock when kernel panic" failed to apply to 4.9-stable tree
Date: Tue, 2 Mar 2021 11:02:39 +0100 [thread overview]
Message-ID: <YD4NP2ffit64F/kB@alley> (raw)
In-Reply-To: <1614604832185174@kroah.com>
On Mon 2021-03-01 14:20:32, gregkh@linuxfoundation.org wrote:
>
> The patch below does not apply to the 4.9-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@vger.kernel.org>.
Please, find below the backport for stable-4.9:
From bfc246e5ed94300431bc8d74b67c74b6f66e75ff Mon Sep 17 00:00:00 2001
From: Muchun Song <songmuchun@bytedance.com>
Date: Wed, 10 Feb 2021 11:48:23 +0800
Subject: [PATCH] printk: fix deadlock when kernel panic
printk_nmi_flush_on_panic() caused the following deadlock on our
server:
CPU0: CPU1:
panic rcu_dump_cpu_stacks
kdump_nmi_shootdown_cpus nmi_trigger_cpumask_backtrace
register_nmi_handler(crash_nmi_callback) printk_nmi_flush
__printk_nmi_flush
raw_spin_lock_irqsave(&read_lock)
// send NMI to other processors
apic_send_IPI_allbutself(NMI_VECTOR)
// NMI interrupt, dead loop
crash_nmi_callback
printk_nmi_flush_on_panic
printk_nmi_flush
__printk_nmi_flush
// deadlock
raw_spin_lock_irqsave(&read_lock)
DEADLOCK: read_lock is taken on CPU1 and will never get released.
It happens when panic() stops a CPU by NMI while it has been in
the middle of printk_nmi_flush().
Handle the lock the same way as logbuf_lock. The printk_nmi buffers
are flushed only when both locks can be safely taken. It can avoid
the deadlock _in this particular case_ at expense of losing contents
of printk_nmi buffers.
Note: It would actually be safe to re-init the locks when all CPUs were
stopped by NMI. But it would require passing this information
from arch-specific code. It is not worth the complexity.
Especially because logbuf_lock and printk_nmi buffers have been
obsoleted by the lockless ring buffer.
This is a backport of the commit 8a8109f303e25a27f92c ("printk: fix
deadlock when kernel panic") for the stable 4.9 kernel.
Fixes: cf9b1106c81c ("printk/nmi: flush NMI messages on the system panic")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Cc: <stable@vger.kernel.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20210210034823.64867-1-songmuchun@bytedance.com
---
kernel/printk/nmi.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/kernel/printk/nmi.c b/kernel/printk/nmi.c
index 2c3e7f024c15..7a50b405ad28 100644
--- a/kernel/printk/nmi.c
+++ b/kernel/printk/nmi.c
@@ -52,6 +52,8 @@ struct nmi_seq_buf {
};
static DEFINE_PER_CPU(struct nmi_seq_buf, nmi_print_seq);
+static DEFINE_RAW_SPINLOCK(nmi_read_lock);
+
/*
* Safe printk() for NMI context. It uses a per-CPU buffer to
* store the message. NMIs are not nested, so there is always only
@@ -134,8 +136,6 @@ static void printk_nmi_flush_seq_line(struct nmi_seq_buf *s,
*/
static void __printk_nmi_flush(struct irq_work *work)
{
- static raw_spinlock_t read_lock =
- __RAW_SPIN_LOCK_INITIALIZER(read_lock);
struct nmi_seq_buf *s = container_of(work, struct nmi_seq_buf, work);
unsigned long flags;
size_t len, size;
@@ -148,7 +148,7 @@ static void __printk_nmi_flush(struct irq_work *work)
* different CPUs. This is especially important when printing
* a backtrace.
*/
- raw_spin_lock_irqsave(&read_lock, flags);
+ raw_spin_lock_irqsave(&nmi_read_lock, flags);
i = 0;
more:
@@ -197,7 +197,7 @@ static void __printk_nmi_flush(struct irq_work *work)
goto more;
out:
- raw_spin_unlock_irqrestore(&read_lock, flags);
+ raw_spin_unlock_irqrestore(&nmi_read_lock, flags);
}
/**
@@ -239,6 +239,14 @@ void printk_nmi_flush_on_panic(void)
raw_spin_lock_init(&logbuf_lock);
}
+ if (in_nmi() && raw_spin_is_locked(&nmi_read_lock)) {
+ if (num_online_cpus() > 1)
+ return;
+
+ debug_locks_off();
+ raw_spin_lock_init(&nmi_read_lock);
+ }
+
printk_nmi_flush();
}
--
2.16.4
next prev parent reply other threads:[~2021-03-03 0:19 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-01 13:20 FAILED: patch "[PATCH] printk: fix deadlock when kernel panic" failed to apply to 4.9-stable tree gregkh
2021-03-02 10:02 ` Petr Mladek [this message]
2021-03-04 13:45 ` Greg KH
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YD4NP2ffit64F/kB@alley \
--to=pmladek@suse.com \
--cc=gregkh@linuxfoundation.org \
--cc=sergey.senozhatsky@gmail.com \
--cc=songmuchun@bytedance.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.