stable.vger.kernel.org archive mirror
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Grzegorz Halat <ghalat@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Don Zickus <dzickus@redhat.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 4.9 11/60] x86/reboot: Always use NMI fallback when shutdown via reboot vector IPI fails
Date: Sun, 22 Sep 2019 14:58:44 -0400
Message-ID: <20190922185934.4305-11-sashal@kernel.org>
In-Reply-To: <20190922185934.4305-1-sashal@kernel.org>

From: Grzegorz Halat <ghalat@redhat.com>

[ Upstream commit 747d5a1bf293dcb33af755a6d285d41b8c1ea010 ]

A reboot request sends an IPI via the reboot vector and waits for all other
CPUs to stop. If one or more CPUs are in critical regions with interrupts
disabled then the IPI is not handled on those CPUs and the shutdown hangs
if native_stop_other_cpus() is called with the wait argument set.

Such a situation can happen when one CPU was stopped within a lock held
section and another CPU is trying to acquire that lock with interrupts
disabled. There are other scenarios which can cause such a lockup as well.

In theory the shutdown should be attempted by an NMI IPI after the timeout
period has elapsed, but the wait loop after sending the reboot vector IPI
prevents this. It checks both the wait request argument and the timeout. If
wait is set, which is true for sys_reboot(), the loop never falls through to
the NMI shutdown method after the timeout period has finished.
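
As an illustration only, the pre-patch loop condition can be modelled in
userspace. This is a sketch under assumptions, not kernel code: the function
name, MODEL_TIMEOUT and MODEL_BUDGET are hypothetical, and the iteration
budget exists only so that the model terminates where the kernel would spin
forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model constants; the kernel uses USEC_PER_SEC and udelay(1). */
enum { MODEL_TIMEOUT = 1000, MODEL_BUDGET = 5000 };

/*
 * Userspace model of the pre-patch wait loop. online_cpus never changes,
 * which models a CPU that cannot take the reboot vector IPI.
 * Returns true if the loop ends by itself, false if it is still spinning
 * when the model's iteration budget runs out (a hang in the real kernel).
 */
static bool old_wait_loop_falls_through(int online_cpus, int wait)
{
	long timeout = MODEL_TIMEOUT;
	long budget = MODEL_BUDGET;

	assert(online_cpus >= 1);

	/* Same shape as the old condition: 'wait' short-circuits the timeout. */
	while (online_cpus > 1 && (wait || timeout--)) {
		if (--budget == 0)
			return false;
	}
	return true;
}
```

With wait == 0 the timeout counts down and the loop exits, so the NMI path
is reachable. With wait != 0 the timeout is never evaluated and the loop can
only end when the other CPUs go offline.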

This was an oversight when the NMI shutdown mechanism was added to handle
the 'reboot IPI is not working' situation. The mechanism was added to deal
with stuck panic shutdowns, which do not have the wait request set, so the
'wait request' case was probably not considered.

Remove the wait check from the post reboot vector IPI wait loop and enforce
that the wait loop in the NMI fallback path is invoked even if NMI IPIs are
disabled or the registration of the NMI handler fails. That second wait
loop will then hang if not all CPUs shut down and the wait argument is set.
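
The resulting control flow can be sketched as a small userspace state model.
Everything here is an assumption made for illustration: struct stop_model,
enum stop_result and model_stop_other_cpus() are hypothetical names, and
the boolean fields stand in for hardware behaviour that the real code only
observes through num_online_cpus().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; none of these names exist in the kernel. */
struct stop_model {
	int  online_cpus;      /* CPUs still online, including the caller */
	bool irq_ipi_works;    /* others handle the reboot vector IPI */
	bool nmi_ipi_works;    /* others handle the NMI IPI */
	bool nmi_disabled;     /* stands in for smp_no_nmi_ipi */
	bool register_fails;   /* stop handler registration fails */
	bool nmi_sent;         /* set when the model "sends" the NMI */
};

enum stop_result { STOPPED_AFTER_IRQ, STOPPED_AFTER_NMI, GAVE_UP, WOULD_HANG };

static enum stop_result model_stop_other_cpus(struct stop_model *m, bool wait)
{
	long timeout = 1000;

	assert(m->online_cpus >= 1);

	/* Phase 1: reboot vector IPI, bounded wait regardless of 'wait'. */
	if (m->irq_ipi_works)
		m->online_cpus = 1;
	while (m->online_cpus > 1 && timeout--) {
		/* the kernel calls udelay(1) here */
	}
	if (m->online_cpus == 1)
		return STOPPED_AFTER_IRQ;

	/* Phase 2: send the NMI only if enabled and the handler registered. */
	if (!m->nmi_disabled && !m->register_fails) {
		m->nmi_sent = true;
		if (m->nmi_ipi_works)
			m->online_cpus = 1;
	}
	if (m->online_cpus == 1)
		return STOPPED_AFTER_NMI;

	/* The second wait loop always runs; with 'wait' set it never ends. */
	return wait ? WOULD_HANG : GAVE_UP;
}
```

In this model a wait request no longer blocks the NMI attempt. It only
decides whether the final loop gives up or keeps spinning.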

[ tglx: Avoid the hard to parse line break in the NMI fallback path,
  	add comments and massage the changelog ]

Fixes: 7d007d21e539 ("x86/reboot: Use NMI to assist in shutting down if IRQ fails")
Signed-off-by: Grzegorz Halat <ghalat@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Don Zickus <dzickus@redhat.com>
Link: https://lkml.kernel.org/r/20190628122813.15500-1-ghalat@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/kernel/smp.c | 46 +++++++++++++++++++++++++------------------
 1 file changed, 27 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
index 2863ad3066921..33ba47c44816b 100644
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -181,6 +181,12 @@ asmlinkage __visible void smp_reboot_interrupt(void)
 	irq_exit();
 }
 
+static int register_stop_handler(void)
+{
+	return register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback,
+				    NMI_FLAG_FIRST, "smp_stop");
+}
+
 static void native_stop_other_cpus(int wait)
 {
 	unsigned long flags;
@@ -214,39 +220,41 @@ static void native_stop_other_cpus(int wait)
 		apic->send_IPI_allbutself(REBOOT_VECTOR);
 
 		/*
-		 * Don't wait longer than a second if the caller
-		 * didn't ask us to wait.
+		 * Don't wait longer than a second for IPI completion. The
+		 * wait request is not checked here because that would
+		 * prevent an NMI shutdown attempt in case that not all
+		 * CPUs reach shutdown state.
 		 */
 		timeout = USEC_PER_SEC;
-		while (num_online_cpus() > 1 && (wait || timeout--))
+		while (num_online_cpus() > 1 && timeout--)
 			udelay(1);
 	}
-	
-	/* if the REBOOT_VECTOR didn't work, try with the NMI */
-	if ((num_online_cpus() > 1) && (!smp_no_nmi_ipi))  {
-		if (register_nmi_handler(NMI_LOCAL, smp_stop_nmi_callback,
-					 NMI_FLAG_FIRST, "smp_stop"))
-			/* Note: we ignore failures here */
-			/* Hope the REBOOT_IRQ is good enough */
-			goto finish;
-
-		/* sync above data before sending IRQ */
-		wmb();
 
-		pr_emerg("Shutting down cpus with NMI\n");
+	/* if the REBOOT_VECTOR didn't work, try with the NMI */
+	if (num_online_cpus() > 1) {
+		/*
+		 * If NMI IPI is enabled, try to register the stop handler
+		 * and send the IPI. In any case try to wait for the other
+		 * CPUs to stop.
+		 */
+		if (!smp_no_nmi_ipi && !register_stop_handler()) {
+			/* Sync above data before sending IRQ */
+			wmb();
 
-		apic->send_IPI_allbutself(NMI_VECTOR);
+			pr_emerg("Shutting down cpus with NMI\n");
 
+			apic->send_IPI_allbutself(NMI_VECTOR);
+		}
 		/*
-		 * Don't wait longer than a 10 ms if the caller
-		 * didn't ask us to wait.
+		 * Don't wait longer than 10 ms if the caller didn't
+		 * request it. If wait is true, the machine hangs here if
+		 * one or more CPUs do not reach shutdown state.
 		 */
 		timeout = USEC_PER_MSEC * 10;
 		while (num_online_cpus() > 1 && (wait || timeout--))
 			udelay(1);
 	}
 
-finish:
 	local_irq_save(flags);
 	disable_local_APIC();
 	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
-- 
2.20.1


