public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Waiman Long <longman@redhat.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	"H. Peter Anvin" <hpa@zytor.com>,
	Waiman Long <longman@redhat.com>
Subject: [PATCH] x86/nmi: Use trylock in __register_nmi_handler() when in_nmi()
Date: Wed, 27 Nov 2024 18:34:55 -0500	[thread overview]
Message-ID: <20241127233455.76904-1-longman@redhat.com> (raw)

The __register_nmi_handler() function can be called in NMI context from
nmi_shootdown_cpus() leading to a lockdep splat like the following.

[ 1123.133573] ================================
[ 1123.137845] WARNING: inconsistent lock state
[ 1123.142118] 6.12.0-31.el10.x86_64+debug #1 Not tainted
[ 1123.147257] --------------------------------
[ 1123.151529] inconsistent {INITIAL USE} -> {IN-NMI} usage.
  :
[ 1123.261544]  Possible unsafe locking scenario:
[ 1123.261544]
[ 1123.267463]        CPU0
[ 1123.269915]        ----
[ 1123.272368]   lock(&nmi_desc[0].lock);
[ 1123.276122]   <Interrupt>
[ 1123.278746]     lock(&nmi_desc[0].lock);
[ 1123.282671]
[ 1123.282671]  *** DEADLOCK ***
  :
[ 1123.314088] Call Trace:
[ 1123.316542]  <NMI>
[ 1123.318562]  dump_stack_lvl+0x6f/0xb0
[ 1123.322230]  print_usage_bug.part.0+0x3d3/0x610
[ 1123.330618]  lock_acquire.part.0+0x2e6/0x360
[ 1123.357217]  _raw_spin_lock_irqsave+0x46/0x90
[ 1123.366193]  __register_nmi_handler+0x8f/0x3a0
[ 1123.374401]  nmi_shootdown_cpus+0x95/0x120
[ 1123.378509]  kdump_nmi_shootdown_cpus+0x15/0x20
[ 1123.383040]  native_machine_crash_shutdown+0x54/0x160
[ 1123.388095]  __crash_kexec+0x10f/0x1f0

In this particular case, a panic had just happened.

[ 1122.808188] Kernel panic - not syncing: Fatal hardware error!

It can be argued that a lockdep splat after a panic is not a big deal,
but it can still confuse users. Fix that by using trylock in NMI context
to avoid the lockdep splat. If trylock fails, we still need to acquire
the lock on the best effort basis to avoid potential NMI descriptor
list corruption.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 arch/x86/kernel/nmi.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
index ed163c8c8604..0b8ad64be117 100644
--- a/arch/x86/kernel/nmi.c
+++ b/arch/x86/kernel/nmi.c
@@ -171,8 +171,23 @@ int __register_nmi_handler(unsigned int type, struct nmiaction *action)
 	if (WARN_ON_ONCE(!action->handler || !list_empty(&action->list)))
 		return -EINVAL;
 
+	if (in_nmi()) {
+		/*
+		 * register_nmi_handler() can be called in NMI context from
+		 * nmi_shootdown_cpus(). In this case, we use trylock to
+		 * acquire the NMI descriptor lock to avoid potential lockdep
+		 * splat. If that fails, we still acquire the lock on the best
+		 * effort basis, but a real deadlock may happen if the CPU is
+		 * acquiring that lock when NMI happens.
+		 */
+		if (raw_spin_trylock_irqsave(&desc->lock, flags))
+			goto locked;
+		pr_err("%s: trylock of NMI desc lock failed in NMI context, may deadlock!\n",
+			__func__);
+	}
 	raw_spin_lock_irqsave(&desc->lock, flags);
 
+locked:
 	/*
 	 * Indicate if there are multiple registrations on the
 	 * internal NMI handler call chains (SERR and IO_CHECK).
-- 
2.47.0


             reply	other threads:[~2024-11-27 23:35 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-27 23:34 Waiman Long [this message]
2024-11-28  9:28 ` [PATCH] x86/nmi: Use trylock in __register_nmi_handler() when in_nmi() Peter Zijlstra
2024-11-29  1:06   ` Waiman Long
2024-11-29  1:55     ` Waiman Long
2024-11-29 16:57       ` Thomas Gleixner
2024-11-30  3:10         ` Waiman Long

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241127233455.76904-1-longman@redhat.com \
    --to=longman@redhat.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox