From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Sasha Levin <sashal@kernel.org>,
pmladek@suse.com, john.ogness@linutronix.de,
Nicholas Piggin <npiggin@gmail.com>,
clg@kaod.org, sudeep.holla@arm.com,
Laurent Dufour <ldufour@linux.ibm.com>,
linuxppc-dev@lists.ozlabs.org
Subject: [PATCH AUTOSEL 5.16 07/52] powerpc/watchdog: Fix missed watchdog reset due to memory ordering race
Date: Mon, 17 Jan 2022 11:58:08 -0500
Message-ID: <20220117165853.1470420-7-sashal@kernel.org>
In-Reply-To: <20220117165853.1470420-1-sashal@kernel.org>

From: Nicholas Piggin <npiggin@gmail.com>

[ Upstream commit 5dad4ba68a2483fc80d70b9dc90bbe16e1f27263 ]

It is possible for all CPUs to miss the pending cpumask becoming clear,
so that nobody resets it, which causes the lockup detector to stop
working. The SMP watchdog timeout will still eventually expire, but
watchdog_smp_panic() avoids doing anything if the pending mask is clear,
so the mask will never be reset.

Order the cpumask clear vs the subsequent test to close this race.

Add an extra check for an empty pending mask when the watchdog fires and
finds its bit still clear, to try to catch any other possible races or
bugs here and keep the watchdog working. The extra test in
arch_touch_nmi_watchdog() is required to prevent the new warning from
firing.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Debugged-by: Laurent Dufour <ldufour@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20211110025056.2084347-2-npiggin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/powerpc/kernel/watchdog.c | 41 +++++++++++++++++++++++++++++++++-
1 file changed, 40 insertions(+), 1 deletion(-)
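
The race described in the commit message is an instance of the
store-buffering pattern: each CPU stores (clears its own pending bit) and
then loads (checks whether the whole mask is empty). A minimal userspace
sketch of the fixed sequence follows, assuming C11 atomics and pthreads as
stand-ins for the kernel primitives. Every model_* name is hypothetical and
does not exist in arch/powerpc/kernel/watchdog.c, and
atomic_thread_fence(memory_order_seq_cst) is only a rough analogue of
smp_mb(), not the kernel implementation.

```c
/* Userspace model only; not kernel code. All model_* names are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_WORDS	2			/* pretend the mask spans two words */
#define MODEL_MAX_CPUS	(MODEL_WORDS * 64)

static _Atomic uint64_t model_pending[MODEL_WORDS];
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;
static int model_resets;			/* protected by model_lock */
static int model_ncpus;

/* Lockless check: one relaxed load per word of the mask. */
static int model_pending_empty(void)
{
	for (int w = 0; w < MODEL_WORDS; w++)
		if (atomic_load_explicit(&model_pending[w], memory_order_relaxed))
			return 0;
	return 1;
}

/* Set the pending bit of every modelled CPU (the "reset" of the mask). */
static void model_fill_pending(int ncpus)
{
	for (int cpu = 0; cpu < ncpus; cpu++)
		atomic_fetch_or_explicit(&model_pending[cpu / 64],
					 UINT64_C(1) << (cpu % 64),
					 memory_order_relaxed);
}

static void model_clear_cpu_pending(int cpu)
{
	/* Store side: clear our own pending bit. */
	atomic_fetch_and_explicit(&model_pending[cpu / 64],
				  ~(UINT64_C(1) << (cpu % 64)),
				  memory_order_relaxed);

	/* Stand-in for smp_mb(): order the clear before the loads below. */
	atomic_thread_fence(memory_order_seq_cst);

	if (model_pending_empty()) {
		/* Re-check under the lock: several threads may get here. */
		pthread_mutex_lock(&model_lock);
		if (model_pending_empty()) {
			model_resets++;
			model_fill_pending(model_ncpus);
		}
		pthread_mutex_unlock(&model_lock);
	}
}

static void *model_cpu_thread(void *arg)
{
	model_clear_cpu_pending((int)(intptr_t)arg);
	return NULL;
}

/* Run one watchdog period with ncpus threads; return the number of resets. */
int model_run_round(int ncpus)
{
	pthread_t threads[MODEL_MAX_CPUS];

	assert(ncpus > 0 && ncpus <= MODEL_MAX_CPUS);
	model_ncpus = ncpus;
	model_resets = 0;
	for (int w = 0; w < MODEL_WORDS; w++)
		atomic_store(&model_pending[w], 0);
	model_fill_pending(ncpus);

	for (int i = 0; i < ncpus; i++) {
		int rc = pthread_create(&threads[i], NULL, model_cpu_thread,
					(void *)(intptr_t)i);
		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < ncpus; i++)
		pthread_join(threads[i], NULL);
	return model_resets;
}
```

With the fence in place, model_run_round() should return 1 for any thread
count: at least one thread observes the empty mask, and the re-check under
the lock keeps a second thread from resetting again. Without the fence, the
C11 memory model would permit a round in which no thread sees the mask
empty, which corresponds to the missed reset described in the commit
message.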
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index 3fa6d240bade2..ad94a2c6b7337 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -135,6 +135,10 @@ static void set_cpumask_stuck(const struct cpumask *cpumask, u64 tb)
 {
 	cpumask_or(&wd_smp_cpus_stuck, &wd_smp_cpus_stuck, cpumask);
 	cpumask_andnot(&wd_smp_cpus_pending, &wd_smp_cpus_pending, cpumask);
+	/*
+	 * See wd_smp_clear_cpu_pending()
+	 */
+	smp_mb();
 	if (cpumask_empty(&wd_smp_cpus_pending)) {
 		wd_smp_last_reset_tb = tb;
 		cpumask_andnot(&wd_smp_cpus_pending,
@@ -221,13 +225,44 @@ static void wd_smp_clear_cpu_pending(int cpu, u64 tb)
 
 			cpumask_clear_cpu(cpu, &wd_smp_cpus_stuck);
 			wd_smp_unlock(&flags);
+		} else {
+			/*
+			 * The last CPU to clear pending should have reset the
+			 * watchdog so we generally should not find it empty
+			 * here if our CPU was clear. However it could happen
+			 * due to a rare race with another CPU taking the
+			 * last CPU out of the mask concurrently.
+			 *
+			 * We can't add a warning for it. But just in case
+			 * there is a problem with the watchdog that is causing
+			 * the mask to not be reset, try to kick it along here.
+			 */
+			if (unlikely(cpumask_empty(&wd_smp_cpus_pending)))
+				goto none_pending;
 		}
 		return;
 	}
+
 	cpumask_clear_cpu(cpu, &wd_smp_cpus_pending);
+
+	/*
+	 * Order the store to clear pending with the load(s) to check all
+	 * words in the pending mask to check they are all empty. This orders
+	 * with the same barrier on another CPU. This prevents two CPUs
+	 * clearing the last 2 pending bits, but neither seeing the other's
+	 * store when checking if the mask is empty, and missing an empty
+	 * mask, which ends with a false positive.
+	 */
+	smp_mb();
 	if (cpumask_empty(&wd_smp_cpus_pending)) {
 		unsigned long flags;
 
+none_pending:
+		/*
+		 * Double check under lock because more than one CPU could see
+		 * a clear mask with the lockless check after clearing their
+		 * pending bits.
+		 */
 		wd_smp_lock(&flags);
 		if (cpumask_empty(&wd_smp_cpus_pending)) {
 			wd_smp_last_reset_tb = tb;
@@ -318,8 +353,12 @@ void arch_touch_nmi_watchdog(void)
 {
 	unsigned long ticks = tb_ticks_per_usec * wd_timer_period_ms * 1000;
 	int cpu = smp_processor_id();
-	u64 tb = get_tb();
+	u64 tb;
 
+	if (!cpumask_test_cpu(cpu, &watchdog_cpumask))
+		return;
+
+	tb = get_tb();
 	if (tb - per_cpu(wd_timer_tb, cpu) >= ticks) {
 		per_cpu(wd_timer_tb, cpu) = tb;
 		wd_smp_clear_cpu_pending(cpu, tb);
--
2.34.1