From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@kernel.org>, "H. Peter Anvin" <hpa@zytor.com>
Cc: Andi Kleen <andi.kleen@intel.com>,
Ashok Raj <ashok.raj@intel.com>, Borislav Petkov <bp@suse.de>,
Tony Luck <tony.luck@intel.com>,
"Ravi V. Shankar" <ravi.v.shankar@intel.com>,
x86@kernel.org, sparclinux@vger.kernel.org,
linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
Ricardo Neri <ricardo.neri-calderon@linux.intel.com>,
Jacob Pan <jacob.jun.pan@intel.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
Don Zickus <dzickus@redhat.com>,
Nicholas Piggin <npiggin@gmail.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Frederic Weisbecker <frederic@kernel.org>,
Alexei Starovoitov <ast@kernel.org>,
Babu Moger <babu.moger@oracle.com>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Philippe Ombredanne <pombredanne@nexb.com>,
Colin Ian King <colin.king@canonical.com>,
Byungchul Park <byungchul.park@lge.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
"Luis R. Rodriguez" <mcgrof@kernel.org>,
Waiman Long <longman@redhat.com>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Randy Dunlap <rdunlap@infradead.org>,
Davidlohr Bueso <dave@stgolabs.net>,
Christoffer Dall <cdall@linaro.org>,
Marc Zyngier <marc.zyngier@arm.com>,
Kai-Heng Feng <kai.heng.feng@canonical.com>,
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
David Rientjes <rientjes@google.com>,
iommu@lists.linux-foundation.org
Subject: [RFC PATCH 20/23] watchdog/hardlockup/hpet: Rotate interrupt among all monitored CPUs
Date: Tue, 12 Jun 2018 17:57:40 -0700
Message-ID: <1528851463-21140-21-git-send-email-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <1528851463-21140-1-git-send-email-ricardo.neri-calderon@linux.intel.com>
In order to detect hardlockups on all the monitored CPUs, move the
interrupt to the next monitored CPU when handling the NMI; wrap around
when reaching the highest CPU in the mask. This rotation is achieved by
setting the affinity mask to contain only the next CPU to monitor.

To prevent the interrupt from being reassigned to another CPU, flag it
as IRQF_NOBALANCING.

The cpumask monitored_mask keeps track of the CPUs that the watchdog
should monitor. This structure is updated when the NMI watchdog is
enabled or disabled on a specific CPU. Because this mask can change
concurrently as CPUs go online or offline and the watchdog is enabled
or disabled, a lock is required to protect it.

Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jacob Pan <jacob.jun.pan@intel.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Babu Moger <babu.moger@oracle.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Philippe Ombredanne <pombredanne@nexb.com>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Christoffer Dall <cdall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: "Ravi V. Shankar" <ravi.v.shankar@intel.com>
Cc: x86@kernel.org
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
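A minimal user-space sketch of the rotation described in the changelog,
assuming at most 32 CPUs tracked in a single word. NR_CPUS,
next_cpu_wrap() and next_monitored_cpu() are hypothetical names used
only for illustration; the patch itself relies on for_each_cpu_wrap()
and irq_set_affinity().

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 32	/* hypothetical limit; the mask fits in one 32-bit word */

/*
 * Return the first CPU set in @mask, scanning upward from @start and
 * wrapping around to CPU 0. Returns -1 if @mask is empty.
 */
static int next_cpu_wrap(uint32_t mask, unsigned int start)
{
	unsigned int i;

	for (i = 0; i < NR_CPUS; i++) {
		unsigned int cpu = (start + i) % NR_CPUS;

		if (mask & (UINT32_C(1) << cpu))
			return (int)cpu;
	}
	return -1;
}

/* CPU that should receive the next NMI after @this_cpu has handled one. */
static int next_monitored_cpu(uint32_t monitored_mask, unsigned int this_cpu)
{
	assert(this_cpu < NR_CPUS);
	/* Start at the CPU after the current one, as the handler does. */
	return next_cpu_wrap(monitored_mask, this_cpu + 1);
}
```

With CPUs 0, 2 and 4 monitored (mask 0x15), the interrupt moves
0 -> 2 -> 4 -> 0. With a single monitored CPU the scan wraps all the
way around and the interrupt stays on that CPU.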
kernel/watchdog_hld_hpet.c | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/kernel/watchdog_hld_hpet.c b/kernel/watchdog_hld_hpet.c
index 857e051..c40acfd 100644
--- a/kernel/watchdog_hld_hpet.c
+++ b/kernel/watchdog_hld_hpet.c
@@ -10,6 +10,7 @@
#include <linux/nmi.h>
#include <linux/hpet.h>
#include <asm/hpet.h>
+#include <asm/cpumask.h>
#include <asm/irq_remapping.h>
#undef pr_fmt
@@ -199,8 +200,8 @@ static irqreturn_t hardlockup_detector_irq_handler(int irq, void *data)
* @regs: Register values as seen when the NMI was asserted
*
* When an NMI is issued, look for hardlockups. If the timer is not periodic,
- * kick it. The interrupt is always handled when if delivered via the
- * Front-Side Bus.
+ * kick it. Move the interrupt to the next monitored CPU. The interrupt is
+ * always handled if delivered via the Front-Side Bus.
*
* Returns:
*
@@ -211,7 +212,7 @@ static int hardlockup_detector_nmi_handler(unsigned int val,
struct pt_regs *regs)
{
struct hpet_hld_data *hdata = hld_data;
- unsigned int use_fsb;
+ unsigned int use_fsb, cpu;
/*
* If FSB delivery mode is used, the timer interrupt is programmed as
@@ -222,8 +223,27 @@ static int hardlockup_detector_nmi_handler(unsigned int val,
if (!use_fsb && !is_hpet_wdt_interrupt(hdata))
return NMI_DONE;
+ /* There are no CPUs to monitor. */
+ if (!cpumask_weight(&hdata->monitored_mask))
+ return NMI_HANDLED;
+
inspect_for_hardlockups(regs);
+ /*
+ * Target a new CPU. Keep trying until we find a monitored CPU. CPUs
+ * are added to and removed from this mask at cpu_up() and cpu_down(),
+ * respectively. Thus, the interrupt should be able to be moved to
+ * the next monitored CPU.
+ */
+ spin_lock(&hld_data->lock);
+ for_each_cpu_wrap(cpu, &hdata->monitored_mask, smp_processor_id() + 1) {
+ if (!irq_set_affinity(hld_data->irq, cpumask_of(cpu)))
+ break;
+ pr_err("Could not assign interrupt to CPU %d. Trying with next monitored CPU.\n",
+ cpu);
+ }
+ spin_unlock(&hld_data->lock);
+
if (!(hdata->flags & HPET_DEV_PERI_CAP))
kick_timer(hdata);
@@ -336,7 +356,7 @@ static int setup_hpet_irq(struct hpet_hld_data *hdata)
* Request an interrupt to activate the irq in all the needed domains.
*/
ret = request_irq(hwirq, hardlockup_detector_irq_handler,
- IRQF_TIMER | IRQF_DELIVER_AS_NMI,
+ IRQF_TIMER | IRQF_DELIVER_AS_NMI | IRQF_NOBALANCING,
"hpet_hld", hdata);
if (ret)
unregister_nmi_handler(NMI_LOCAL, "hpet_hld");
--
2.7.4
Thread overview: 70+ messages
2018-06-13 0:57 [RFC PATCH 00/23] Implement an HPET-based hardlockup detector Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 01/23] x86/apic: Add a parameter for the APIC delivery mode Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 02/23] genirq: Introduce IRQD_DELIVER_AS_NMI Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 03/23] genirq: Introduce IRQF_DELIVER_AS_NMI Ricardo Neri
2018-06-13 8:34 ` Peter Zijlstra
2018-06-13 8:59 ` Julien Thierry
2018-06-13 9:20 ` Thomas Gleixner
2018-06-13 9:36 ` Julien Thierry
2018-06-13 9:49 ` Julien Thierry
2018-06-13 9:57 ` Thomas Gleixner
2018-06-13 10:25 ` Julien Thierry
2018-06-13 10:06 ` Marc Zyngier
2018-06-15 2:12 ` Ricardo Neri
2018-06-15 8:01 ` Julien Thierry
2018-06-16 0:39 ` Ricardo Neri
2018-06-16 13:36 ` Thomas Gleixner
2018-06-13 0:57 ` [RFC PATCH 04/23] iommu/vt-d/irq_remapping: Add support for IRQCHIP_CAN_DELIVER_AS_NMI Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 05/23] x86/msi: " Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 06/23] x86/ioapic: Add support for IRQCHIP_CAN_DELIVER_AS_NMI with interrupt remapping Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 07/23] x86/hpet: Expose more functions to read and write registers Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 08/23] x86/hpet: Calculate ticks-per-second in a separate function Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 09/23] x86/hpet: Reserve timer for the HPET hardlockup detector Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 10/23] x86/hpet: Relocate flag definitions to a header file Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 11/23] x86/hpet: Configure the timer used by the hardlockup detector Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 12/23] kernel/watchdog: Introduce a struct for NMI watchdog operations Ricardo Neri
2018-06-13 7:41 ` Nicholas Piggin
2018-06-13 8:42 ` Peter Zijlstra
2018-06-13 9:26 ` Thomas Gleixner
2018-06-13 11:52 ` Nicholas Piggin
2018-06-14 1:31 ` Ricardo Neri
2018-06-14 2:32 ` Nicholas Piggin
2018-06-14 8:32 ` Thomas Gleixner
2018-06-15 2:21 ` Ricardo Neri
2018-06-14 1:26 ` Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 13/23] watchdog/hardlockup: Define a generic function to detect hardlockups Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 14/23] watchdog/hardlockup: Decouple the hardlockup detector from perf Ricardo Neri
2018-06-13 8:43 ` Peter Zijlstra
2018-06-14 1:19 ` Ricardo Neri
2018-06-14 1:41 ` Nicholas Piggin
2018-06-15 2:23 ` Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 15/23] kernel/watchdog: Add a function to obtain the watchdog_allowed_mask Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 16/23] watchdog/hardlockup: Add an HPET-based hardlockup detector Ricardo Neri
2018-06-13 5:23 ` Randy Dunlap
2018-06-14 1:00 ` Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 17/23] watchdog/hardlockup/hpet: Convert the timer's interrupt to NMI Ricardo Neri
2018-06-13 9:07 ` Peter Zijlstra
2018-06-15 2:07 ` Ricardo Neri
2018-06-13 9:40 ` Thomas Gleixner
2018-06-15 2:03 ` Ricardo Neri
2018-06-15 9:19 ` Thomas Gleixner
2018-06-16 0:51 ` Ricardo Neri
2018-06-16 13:24 ` Thomas Gleixner
2018-06-20 0:15 ` Ricardo Neri
2018-06-20 0:25 ` Randy Dunlap
2018-06-21 0:25 ` Ricardo Neri
2018-06-20 7:47 ` Thomas Gleixner
2018-06-13 0:57 ` [RFC PATCH 18/23] watchdog/hardlockup/hpet: Add the NMI watchdog operations Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 19/23] watchdog/hardlockup: Make arch_touch_nmi_watchdog() to hpet-based implementation Ricardo Neri
2018-06-13 0:57 ` Ricardo Neri [this message]
2018-06-13 9:48 ` [RFC PATCH 20/23] watchdog/hardlockup/hpet: Rotate interrupt among all monitored CPUs Thomas Gleixner
2018-06-15 2:16 ` Ricardo Neri
2018-06-15 10:29 ` Thomas Gleixner
2018-06-16 0:46 ` Ricardo Neri
2018-06-16 13:27 ` Thomas Gleixner
2018-06-13 0:57 ` [RFC PATCH 21/23] watchdog/hardlockup/hpet: Adjust timer expiration on the number of " Ricardo Neri
2018-06-13 0:57 ` [RFC PATCH 22/23] watchdog/hardlockup/hpet: Only enable the HPET watchdog via a boot parameter Ricardo Neri
2018-06-13 5:26 ` Randy Dunlap
2018-06-14 0:58 ` Ricardo Neri
2018-06-14 3:30 ` Randy Dunlap
2018-06-13 0:57 ` [RFC PATCH 23/23] watchdog/hardlockup: Activate the HPET-based lockup detector Ricardo Neri