All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Mark Rutland <mark.rutland@arm.com>,
	Sasha Levin <sashal@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	Will Deacon <will@kernel.org>, Dave Martin <Dave.Martin@arm.com>,
	Cristian Marussi <cristian.marussi@arm.com>
Subject: [PATCH AUTOSEL 5.5 14/28] arm64: smp: fix smp_send_stop() behaviour
Date: Thu, 26 Mar 2020 19:23:43 -0400	[thread overview]
Message-ID: <20200326232357.7516-14-sashal@kernel.org> (raw)
In-Reply-To: <20200326232357.7516-1-sashal@kernel.org>

From: Cristian Marussi <cristian.marussi@arm.com>

[ Upstream commit d0bab0c39e32d39a8c5cddca72e5b4a3059fe050 ]

On a system with only one CPU online, when another one CPU panics while
starting-up, smp_send_stop() will fail to send any STOP message to the
other already online core, resulting in a system still responsive and
alive at the end of the panic procedure.

[  186.700083] CPU3: shutdown
[  187.075462] CPU2: shutdown
[  187.162869] CPU1: shutdown
[  188.689998] ------------[ cut here ]------------
[  188.691645] kernel BUG at arch/arm64/kernel/cpufeature.c:886!
[  188.692079] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[  188.692444] Modules linked in:
[  188.693031] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.6.0-rc4-00001-g338d25c35a98 #104
[  188.693175] Hardware name: Foundation-v8A (DT)
[  188.693492] pstate: 200001c5 (nzCv dAIF -PAN -UAO)
[  188.694183] pc : has_cpuid_feature+0xf0/0x348
[  188.694311] lr : verify_local_elf_hwcaps+0x84/0xe8
[  188.694410] sp : ffff800011b1bf60
[  188.694536] x29: ffff800011b1bf60 x28: 0000000000000000
[  188.694707] x27: 0000000000000000 x26: 0000000000000000
[  188.694801] x25: 0000000000000000 x24: ffff80001189a25c
[  188.694905] x23: 0000000000000000 x22: 0000000000000000
[  188.694996] x21: ffff8000114aa018 x20: ffff800011156a38
[  188.695089] x19: ffff800010c944a0 x18: 0000000000000004
[  188.695187] x17: 0000000000000000 x16: 0000000000000000
[  188.695280] x15: 0000249dbde5431e x14: 0262cbe497efa1fa
[  188.695371] x13: 0000000000000002 x12: 0000000000002592
[  188.695472] x11: 0000000000000080 x10: 00400032b5503510
[  188.695572] x9 : 0000000000000000 x8 : ffff800010c80204
[  188.695659] x7 : 00000000410fd0f0 x6 : 0000000000000001
[  188.695750] x5 : 00000000410fd0f0 x4 : 0000000000000000
[  188.695836] x3 : 0000000000000000 x2 : ffff8000100939d8
[  188.695919] x1 : 0000000000180420 x0 : 0000000000180480
[  188.696253] Call trace:
[  188.696410]  has_cpuid_feature+0xf0/0x348
[  188.696504]  verify_local_elf_hwcaps+0x84/0xe8
[  188.696591]  check_local_cpu_capabilities+0x44/0x128
[  188.696666]  secondary_start_kernel+0xf4/0x188
[  188.697150] Code: 52805001 72a00301 6b01001f 54000ec0 (d4210000)
[  188.698639] ---[ end trace 3f12ca47652f7b72 ]---
[  188.699160] Kernel panic - not syncing: Attempted to kill the idle task!
[  188.699546] Kernel Offset: disabled
[  188.699828] CPU features: 0x00004,20c02008
[  188.700012] Memory Limit: none
[  188.700538] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

[root@arch ~]# echo Helo
Helo
[root@arch ~]# cat /proc/cpuinfo | grep proce
processor	: 0

Make smp_send_stop() account also for the online status of the calling CPU
while evaluating how many CPUs are effectively online: this way, the right
number of STOPs is sent, so enforcing a proper freeze of the system at the
end of panic even under the above conditions.

Fixes: 08e875c16a16c ("arm64: SMP support")
Reported-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/arm64/kernel/smp.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index d4ed9a19d8fe7..e4dc241c5a8e8 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -958,11 +958,22 @@ void tick_broadcast(const struct cpumask *mask)
 }
 #endif
 
+/*
+ * The number of CPUs online, not counting this CPU (which may not be
+ * fully online and so not counted in num_online_cpus()).
+ */
+static inline unsigned int num_other_online_cpus(void)
+{
+	unsigned int this_cpu_online = cpu_online(smp_processor_id());
+
+	return num_online_cpus() - this_cpu_online;
+}
+
 void smp_send_stop(void)
 {
 	unsigned long timeout;
 
-	if (num_online_cpus() > 1) {
+	if (num_other_online_cpus()) {
 		cpumask_t mask;
 
 		cpumask_copy(&mask, cpu_online_mask);
@@ -975,10 +986,10 @@ void smp_send_stop(void)
 
 	/* Wait up to one second for other CPUs to stop */
 	timeout = USEC_PER_SEC;
-	while (num_online_cpus() > 1 && timeout--)
+	while (num_other_online_cpus() && timeout--)
 		udelay(1);
 
-	if (num_online_cpus() > 1)
+	if (num_other_online_cpus())
 		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
 			cpumask_pr_args(cpu_online_mask));
 
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Cristian Marussi <cristian.marussi@arm.com>,
	Dave Martin <Dave.Martin@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Will Deacon <will@kernel.org>, Sasha Levin <sashal@kernel.org>,
	linux-arm-kernel@lists.infradead.org
Subject: [PATCH AUTOSEL 5.5 14/28] arm64: smp: fix smp_send_stop() behaviour
Date: Thu, 26 Mar 2020 19:23:43 -0400	[thread overview]
Message-ID: <20200326232357.7516-14-sashal@kernel.org> (raw)
In-Reply-To: <20200326232357.7516-1-sashal@kernel.org>

From: Cristian Marussi <cristian.marussi@arm.com>

[ Upstream commit d0bab0c39e32d39a8c5cddca72e5b4a3059fe050 ]

On a system with only one CPU online, when another one CPU panics while
starting-up, smp_send_stop() will fail to send any STOP message to the
other already online core, resulting in a system still responsive and
alive at the end of the panic procedure.

[  186.700083] CPU3: shutdown
[  187.075462] CPU2: shutdown
[  187.162869] CPU1: shutdown
[  188.689998] ------------[ cut here ]------------
[  188.691645] kernel BUG at arch/arm64/kernel/cpufeature.c:886!
[  188.692079] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[  188.692444] Modules linked in:
[  188.693031] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 5.6.0-rc4-00001-g338d25c35a98 #104
[  188.693175] Hardware name: Foundation-v8A (DT)
[  188.693492] pstate: 200001c5 (nzCv dAIF -PAN -UAO)
[  188.694183] pc : has_cpuid_feature+0xf0/0x348
[  188.694311] lr : verify_local_elf_hwcaps+0x84/0xe8
[  188.694410] sp : ffff800011b1bf60
[  188.694536] x29: ffff800011b1bf60 x28: 0000000000000000
[  188.694707] x27: 0000000000000000 x26: 0000000000000000
[  188.694801] x25: 0000000000000000 x24: ffff80001189a25c
[  188.694905] x23: 0000000000000000 x22: 0000000000000000
[  188.694996] x21: ffff8000114aa018 x20: ffff800011156a38
[  188.695089] x19: ffff800010c944a0 x18: 0000000000000004
[  188.695187] x17: 0000000000000000 x16: 0000000000000000
[  188.695280] x15: 0000249dbde5431e x14: 0262cbe497efa1fa
[  188.695371] x13: 0000000000000002 x12: 0000000000002592
[  188.695472] x11: 0000000000000080 x10: 00400032b5503510
[  188.695572] x9 : 0000000000000000 x8 : ffff800010c80204
[  188.695659] x7 : 00000000410fd0f0 x6 : 0000000000000001
[  188.695750] x5 : 00000000410fd0f0 x4 : 0000000000000000
[  188.695836] x3 : 0000000000000000 x2 : ffff8000100939d8
[  188.695919] x1 : 0000000000180420 x0 : 0000000000180480
[  188.696253] Call trace:
[  188.696410]  has_cpuid_feature+0xf0/0x348
[  188.696504]  verify_local_elf_hwcaps+0x84/0xe8
[  188.696591]  check_local_cpu_capabilities+0x44/0x128
[  188.696666]  secondary_start_kernel+0xf4/0x188
[  188.697150] Code: 52805001 72a00301 6b01001f 54000ec0 (d4210000)
[  188.698639] ---[ end trace 3f12ca47652f7b72 ]---
[  188.699160] Kernel panic - not syncing: Attempted to kill the idle task!
[  188.699546] Kernel Offset: disabled
[  188.699828] CPU features: 0x00004,20c02008
[  188.700012] Memory Limit: none
[  188.700538] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

[root@arch ~]# echo Helo
Helo
[root@arch ~]# cat /proc/cpuinfo | grep proce
processor	: 0

Make smp_send_stop() account also for the online status of the calling CPU
while evaluating how many CPUs are effectively online: this way, the right
number of STOPs is sent, so enforcing a proper freeze of the system at the
end of panic even under the above conditions.

Fixes: 08e875c16a16c ("arm64: SMP support")
Reported-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/arm64/kernel/smp.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index d4ed9a19d8fe7..e4dc241c5a8e8 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -958,11 +958,22 @@ void tick_broadcast(const struct cpumask *mask)
 }
 #endif
 
+/*
+ * The number of CPUs online, not counting this CPU (which may not be
+ * fully online and so not counted in num_online_cpus()).
+ */
+static inline unsigned int num_other_online_cpus(void)
+{
+	unsigned int this_cpu_online = cpu_online(smp_processor_id());
+
+	return num_online_cpus() - this_cpu_online;
+}
+
 void smp_send_stop(void)
 {
 	unsigned long timeout;
 
-	if (num_online_cpus() > 1) {
+	if (num_other_online_cpus()) {
 		cpumask_t mask;
 
 		cpumask_copy(&mask, cpu_online_mask);
@@ -975,10 +986,10 @@ void smp_send_stop(void)
 
 	/* Wait up to one second for other CPUs to stop */
 	timeout = USEC_PER_SEC;
-	while (num_online_cpus() > 1 && timeout--)
+	while (num_other_online_cpus() && timeout--)
 		udelay(1);
 
-	if (num_online_cpus() > 1)
+	if (num_other_online_cpus())
 		pr_warn("SMP: failed to stop secondary CPUs %*pbl\n",
 			cpumask_pr_args(cpu_online_mask));
 
-- 
2.20.1


  parent reply	other threads:[~2020-03-26 23:24 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-26 23:23 [PATCH AUTOSEL 5.5 01/28] thunderbolt: Fix error code in tb_port_is_width_supported() Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 02/28] drm/bridge: dw-hdmi: fix AVI frame colorimetry Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 03/28] nvme-rdma: Avoid double freeing of async event data Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 04/28] ALSA: hda/realtek: Fix pop noise on ALC225 Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 05/28] staging: wfx: fix warning about freeing in-use mutex during device unregister Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 06/28] kconfig: introduce m32-flag and m64-flag Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 07/28] int128: fix __uint128_t compiler test in Kconfig Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 08/28] drm/amdgpu: add fbdev suspend/resume on gpu reset Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 09/28] drm/amd/display: Add link_rate quirk for Apple 15" MBP 2017 Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 10/28] drm/bochs: downgrade pci_request_region failure from error to warning Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 11/28] initramfs: restore default compression behavior Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 12/28] staging: greybus: loopback_test: fix potential path truncation Sasha Levin
2020-03-27  6:28   ` Greg KH
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 13/28] staging: greybus: loopback_test: fix potential path truncations Sasha Levin
2020-03-27  6:28   ` Greg Kroah-Hartman
2020-03-26 23:23 ` Sasha Levin [this message]
2020-03-26 23:23   ` [PATCH AUTOSEL 5.5 14/28] arm64: smp: fix smp_send_stop() behaviour Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 15/28] arm64: smp: fix crash_smp_send_stop() behaviour Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 16/28] modpost: Get proper section index by get_secindex() instead of st_shndx Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 17/28] drm/amdgpu: fix typo for vcn1 idle check Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 18/28] tools/power turbostat: Fix gcc build warnings Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 19/28] tools/power turbostat: Fix missing SYS_LPI counter on some Chromebooks Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 20/28] tools/power turbostat: Fix 32-bit capabilities warning Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 21/28] nvmet-tcp: set MSG_MORE only if we actually have more to send Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 22/28] btrfs: fix removal of raid[56|1c34} incompat flags after removing block group Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 23/28] kconfig: Add yes2modconfig and mod2yesconfig targets Sasha Levin
2020-03-26 23:41   ` Tetsuo Handa
2020-03-27  6:26   ` Greg KH
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 24/28] ALSA: pcm: oss: Avoid plugin buffer overflow Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 25/28] ALSA: line6: Fix endless MIDI read loop Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 26/28] ALSA: pcm: oss: Remove WARNING from snd_pcm_plug_alloc() checks Sasha Levin
2020-03-26 23:23   ` Sasha Levin
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 27/28] tty: fix compat TIOCGSERIAL leaking uninitialized memory Sasha Levin
2020-03-27  6:27   ` Greg KH
2020-03-26 23:23 ` [PATCH AUTOSEL 5.5 28/28] drm/lease: fix WARNING in idr_destroy Sasha Levin
2020-03-26 23:23   ` Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200326232357.7516-14-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=Dave.Martin@arm.com \
    --cc=cristian.marussi@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=stable@vger.kernel.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.