From: Thomas Gleixner <tglx@linutronix.de>
To: Yipeng Zou <zouyipeng@huawei.com>,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	x86@kernel.org, hpa@zytor.com, peterz@infradead.org,
	sohil.mehta@intel.com, rui.zhang@intel.com, arnd@arndb.de,
	yuntao.wang@linux.dev, linux-kernel@vger.kernel.org
Subject: Re: [BUG REPORT] x86/apic: CPU Hang in x86 VM During Kdump
Date: Sun, 27 Jul 2025 14:39:58 +0200
Message-ID: <87h5yxq02p.ffs@tglx>
In-Reply-To: <b31a5b91-bc94-46ce-8191-c6576c04f05b@huawei.com>

On Sat, Jul 26 2025 at 17:50, Yipeng Zou wrote:

Please do not top-post and trim your replies.

>      I skipped sending the NMI in native_stop_other_cpus(), and the test 
> passed.

I don't see how that would result in anything meaningful. The reboot vector
IRR bit on that second CPU will still be set.

>      Given this, is there an alternative way to resolve the issue, or 
> can we simply mask the IPI directly at that point?

Good luck finding a mask register in the local APIC.

Even if there were a mask register, the IRR bit would still be set and
the interrupt would be delivered on unmask. There is no way to clear IRR
bits other than a full reset (power on or INIT/SIPI sequence) of the
local APIC.
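
To make the point about pending IRR bits concrete, here is a toy model in
C. It is a sketch under loud assumptions and not a description of real
APIC hardware: struct toy_apic and all toy_*() helpers are hypothetical,
and the only behaviour modelled is that a latched vector survives a soft
disable and goes away only by being delivered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical, not real hardware): one pending bit per vector. */
struct toy_apic {
	uint32_t irr[8];	/* 256 vectors, 32 per word */
	bool soft_enabled;
};

/* An IPI arrives: latch the vector in the modelled IRR. */
static void toy_accept(struct toy_apic *a, unsigned int vec)
{
	a->irr[vec / 32] |= UINT32_C(1) << (vec % 32);
}

/* Check whether a vector is still latched. */
static bool toy_pending(const struct toy_apic *a, unsigned int vec)
{
	return a->irr[vec / 32] & (UINT32_C(1) << (vec % 32));
}

/* Assumption of the model: soft disable/enable does not touch the IRR. */
static void toy_soft_disable(struct toy_apic *a) { a->soft_enabled = false; }
static void toy_soft_enable(struct toy_apic *a)  { a->soft_enabled = true; }

/*
 * Delivery is the only path that clears the bit in this model.
 * Returns true if the vector was handed to the CPU.
 */
static bool toy_deliver(struct toy_apic *a, unsigned int vec, bool irqs_on)
{
	if (!a->soft_enabled || !irqs_on || !toy_pending(a, vec))
		return false;
	a->irr[vec / 32] &= ~(UINT32_C(1) << (vec % 32));
	return true;
}
```

In this model a vector accepted while interrupts are off stays latched
across toy_soft_disable()/toy_soft_enable() and is delivered by the first
toy_deliver() call with irqs_on set, which mirrors the crash kernel case
discussed here.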

In theory the APIC can be reset by clearing the enable bit in the
APIC_BASE MSR, but that's a can of worms in itself.

The Intel SDM is very blurry about the behaviour:

  When IA32_APIC_BASE[11] is set to 0, prior initialization to the APIC
  may be lost and the APIC may return to the state described in Section
  11.4.7.1, “Local APIC State After Power-Up or Reset.”

"may" means there is no guarantee.

Aside from that, this cannot be done for the original 3-wire APIC bus
based APICs (32-bit museum pieces). Not that I care much about them, but
that's just going to add more complexity to the existing horrors.

The other problem is that with the bit disabled, the APIC might not
respond to INIT/SIPI anymore, but that's equally unclear from the
documentation; both Intel and AMD manuals are pretty useless when it
comes to the gory details of the APIC and from past experience I know
that there are quite some subtle differences in the APIC behaviour
across CPU generations...

The stale reboot vector IRR problem is pretty straightforward to
mitigate. See patch below.

That needs a full audit of the various vectors, though at a quick
inspection most of them should be fine.

Aside from that, there is quite some bogosity in the APIC setup path,
which I need to look deeper into.

Thanks,

	tglx
---
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -136,6 +136,28 @@ static int smp_stop_nmi_callback(unsigne
 DEFINE_IDTENTRY_SYSVEC(sysvec_reboot)
 {
 	apic_eoi();
+
+	/*
+	 * Handle the case where a reboot IPI is stale in the IRR. This
+	 * happens when:
+	 *
+	 *   a CPU crashes with interrupts disabled before handling the
+	 *   reboot IPI and jumps into a crash kernel. The reboot IPI
+	 *   vector is kept set in the APIC IRR across the APIC soft
+	 *   disabled phase and as there is no way to clear a pending IRR
+	 *   bit, it is delivered to the crash kernel immediately when
+	 *   interrupts are enabled.
+	 *
+	 * As the reboot IPI can only be sent after acquiring @stopping_cpu
+	 * by storing the CPU number, this case can be detected when
+	 * @stopping_cpu contains the bootup value -1. Just return and
+	 * ignore it.
+	 */
+	if (atomic_read(&stopping_cpu) == -1) {
+		pr_info("Ignoring stale reboot IPI\n");
+		return;
+	}
+
 	cpu_emergency_disable_virtualization();
 	stop_this_cpu(NULL);
 }
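
The check in the patch above can be modelled in userspace roughly as
follows. This is a minimal sketch, assuming C11 atomics as a stand-in for
the kernel's atomic_t; model_stopping_cpu, model_claim_stop() and
model_reboot_ipi() are hypothetical names for illustration only, and the
compare-and-exchange on the sender side is an assumption about how
@stopping_cpu is acquired, not a copy of the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for stopping_cpu; -1 means no shutdown is in progress. */
static atomic_int model_stopping_cpu = -1;

/*
 * Sender side (assumed): claim the stop operation by storing the CPU
 * number. Returns false if another CPU already owns it.
 */
static bool model_claim_stop(int this_cpu)
{
	int expected = -1;

	return atomic_compare_exchange_strong(&model_stopping_cpu,
					      &expected, this_cpu);
}

/*
 * Receiver side: returns true if the CPU would go on to stop itself,
 * false if the IPI is treated as stale and ignored.
 */
static bool model_reboot_ipi(void)
{
	/* Nobody claimed the stop, so this IPI predates the current kernel. */
	if (atomic_load(&model_stopping_cpu) == -1)
		return false;
	return true;
}
```

A freshly booted crash kernel starts with the value -1, so a reboot IPI
left over from the crashed kernel hits the early return, while an IPI sent
after model_claim_stop() succeeded is acted upon.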
