From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: x86@kernel.org, Mario Limonciello <mario.limonciello@amd.com>,
Tom Lendacky <thomas.lendacky@amd.com>,
Tony Battersby <tonyb@cybernetics.com>,
Ashok Raj <ashok.raj@linux.intel.com>,
Tony Luck <tony.luck@intel.com>,
Arjan van de Veen <arjan@linux.intel.com>,
Eric Biederman <ebiederm@xmission.com>
Subject: [patch v3 0/7] x86/smp: Cure stop_other_cpus() and kexec() troubles
Date: Thu, 15 Jun 2023 22:33:49 +0200 (CEST)
Message-ID: <20230615190036.898273129@linutronix.de>
This is the third version of the stop_other_cpus() / kexec()
vs. mwait_play_dead() series. Version 2 can be found here:
https://lore.kernel.org/r/20230613115353.599087484@linutronix.de
The two issues addressed are:
1) stop_other_cpus() continues after observing num_online_cpus() == 1.
This is problematic because the to-be-stopped CPUs clear their online
bit first and then eventually invoke WBINVD, which can take a long
time. There seems to be an interaction between the WBINVD and the
reboot mechanics as this intermittently results in hangs.
2) The kexec() kernel can overwrite the memory locations which "offline" CPUs
are monitoring. This write brings them out of MWAIT and they resume
execution on overwritten text, page tables, data and stacks resulting
in triple faults.
Cure them by:
#1 Synchronizing stop_other_cpus() with a CPU mask which is updated in
stop_this_cpu() _after_ WBINVD completes.
#2 Bringing offline CPUs out of MWAIT and moving them into HLT before
starting the kexec() kernel. Optionally send them an INIT IPI so they
go back into the wait-for-startup state.
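
The ordering problem behind #1 can be illustrated with a small user-space
model. This is only a sketch under stated assumptions, not the code from the
series: the 64-bit masks and all model_*() names are hypothetical stand-ins
for the kernel's CPU masks, and the slow WBINVD is represented merely by the
gap between the two per-CPU steps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
static _Atomic uint64_t model_online_mask; /* stands in for the online mask */
static _Atomic uint64_t model_stop_mask;   /* CPUs which have not finished stopping */

/* The stopping CPU arms the stop mask with every CPU except itself. */
static void model_prepare_stop(uint64_t all_cpus, unsigned int this_cpu)
{
	assert(this_cpu < 64);
	atomic_store(&model_online_mask, all_cpus);
	atomic_store(&model_stop_mask, all_cpus & ~(UINT64_C(1) << this_cpu));
}

/* Step 1 on a to-be-stopped CPU: the online bit is cleared first. */
static void model_clear_online(unsigned int cpu)
{
	atomic_fetch_and(&model_online_mask, ~(UINT64_C(1) << cpu));
}

/* Step 2: reported only _after_ the (possibly slow) WBINVD completed. */
static void model_report_stopped(unsigned int cpu)
{
	atomic_fetch_and(&model_stop_mask, ~(UINT64_C(1) << cpu));
}

/* Old exit condition: exactly one CPU is left in the online mask. */
static bool model_old_done(void)
{
	uint64_t m = atomic_load(&model_online_mask);

	return m && !(m & (m - 1));
}

/* New exit condition: every other CPU reported that it is done. */
static bool model_new_done(void)
{
	return atomic_load(&model_stop_mask) == 0;
}
```

In this model the old condition becomes true as soon as all other CPUs have
run step 1, while the new condition stays false until each of them has also
run step 2, which is the window in which the WBINVD would still be in flight.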
Changes vs. V2:
- Use a CPU mask instead of an atomic counter and send the NMI only to
CPUs which did not report that they reached HLT. That's still not race
free vs. a late handling of the reboot vector, but that's not fixable.
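
Why a mask helps where a counter does not can be shown with a minimal sketch.
It assumes a plain 64-bit word instead of a kernel cpumask, and
model_nmi_targets() is a hypothetical helper, not a function from the series.
A counter only tells how many CPUs are still missing; the mask also tells
which ones, so the NMI can be restricted to exactly those.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: 'pending' has one bit set for every CPU which did
 * not report that it reached HLT. Collect those CPU numbers into out[]
 * (at most 'cap' entries) and return how many were stored.
 */
static size_t model_nmi_targets(uint64_t pending, unsigned int *out, size_t cap)
{
	size_t n = 0;

	assert(out != NULL);

	for (unsigned int cpu = 0; cpu < 64 && n < cap; cpu++) {
		if (pending & (UINT64_C(1) << cpu))
			out[n++] = cpu;	/* only these would receive the NMI */
	}
	return n;
}
```

With pending == 0xA (bits 1 and 3 set) the helper stores CPUs 1 and 3. As
noted above, a CPU which handles the reboot vector late can still race with
this selection; the sketch does not change that.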
Interestingly enough, testing the NMI mechanics unearthed that after soft
disabling the local APIC the CPU is _not_ handling the NMI, despite the
SDM claiming:
"The operation and response of a local APIC while in this software-disabled
state is as follows:
* The local APIC will respond normally to INIT, NMI, SMI, and SIPI messages."
I validated that even without handling the NMI, the CPU is kicked out of
HLT reliably.
It's unclear whether that's X2APIC specific, and I have not verified that
behaviour on AMD either. Nor is it clear what "respond normally" actually means.
The AMD APM is not helpful either:
"SMI, NMI, INIT, Startup, and Remote Read interrupts may be accepted"
Oh well.
The series is also available from git:
git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git x86/kexec
Thanks,
tglx
---
include/asm/cpu.h | 2
include/asm/smp.h | 4 +
kernel/process.c | 25 +++++++--
kernel/smp.c | 111 +++++++++++++++++++++++++++++-----------
kernel/smpboot.c | 149 ++++++++++++++++++++++++++++++++++++++++--------------
5 files changed, 220 insertions(+), 71 deletions(-)
Thread overview: 37+ messages
2023-06-15 20:33 Thomas Gleixner [this message]
2023-06-15 20:33 ` [patch v3 1/7] x86/smp: Make stop_other_cpus() more robust Thomas Gleixner
2023-06-16 1:58 ` Ashok Raj
2023-06-16 7:53 ` Thomas Gleixner
2023-06-16 14:13 ` Ashok Raj
2023-06-16 18:01 ` Thomas Gleixner
2023-06-16 20:57 ` Ashok Raj
2023-06-19 17:51 ` Ashok Raj
2023-06-20 8:09 ` Borislav Petkov
2023-06-16 16:36 ` Tony Battersby
2023-06-15 20:33 ` [patch v3 2/7] x86/smp: Dont access non-existing CPUID leaf Thomas Gleixner
2023-06-19 17:02 ` Limonciello, Mario
2023-06-19 17:15 ` Thomas Gleixner
2023-06-20 8:20 ` Borislav Petkov
2023-06-15 20:33 ` [patch v3 3/7] x86/smp: Remove pointless wmb()s from native_stop_other_cpus() Thomas Gleixner
2023-06-20 8:47 ` Borislav Petkov
2023-06-20 13:00 ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-06-15 20:33 ` [patch v3 4/7] x86/smp: Use dedicated cache-line for mwait_play_dead() Thomas Gleixner
2023-06-20 9:01 ` Borislav Petkov
2023-06-20 13:00 ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-06-15 20:33 ` [patch v3 5/7] x86/smp: Cure kexec() vs. mwait_play_dead() breakage Thomas Gleixner
2023-06-20 9:23 ` Borislav Petkov
2023-06-20 12:25 ` Thomas Gleixner
2023-06-20 13:00 ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-06-15 20:33 ` [patch v3 6/7] x86/smp: Split sending INIT IPI out into a helper function Thomas Gleixner
2023-06-20 9:29 ` Borislav Petkov
2023-06-20 12:30 ` Thomas Gleixner
2023-06-20 13:00 ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-06-15 20:34 ` [patch v3 7/7] x86/smp: Put CPUs into INIT on shutdown if possible Thomas Gleixner
2023-06-20 10:27 ` Borislav Petkov
2023-06-20 13:00 ` [tip: x86/core] " tip-bot2 for Thomas Gleixner
2023-07-03 3:44 ` [BUG REPORT] Triggering a panic in an x86 virtual machine does not wait Baokun Li
2023-07-05 8:59 ` Thomas Gleixner
2023-07-06 6:44 ` Baokun Li
2023-07-07 10:18 ` Thomas Gleixner
2023-07-07 12:40 ` Baokun Li
2023-07-07 13:49 ` [tip: x86/core] x86/smp: Don't send INIT to boot CPU tip-bot2 for Thomas Gleixner