From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AEBC0C7EE30 for ; Wed, 1 Mar 2023 23:54:07 +0000 (UTC) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4PRrg22XVTz3fbx for ; Thu, 2 Mar 2023 10:54:06 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=e98mV4j8; dkim-atps=neutral Authentication-Results: lists.ozlabs.org; spf=none (no SPF record) smtp.mailfrom=linux.intel.com (client-ip=192.55.52.43; helo=mga05.intel.com; envelope-from=ricardo.neri-calderon@linux.intel.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=e98mV4j8; dkim-atps=neutral Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4PRrJc2Tp0z3cMh for ; Thu, 2 Mar 2023 10:38:08 +1100 (AEDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1677713888; x=1709249888; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=y8eExCkaoLH5zqN5BzT0UUYALRj8RDWacqQHg/gnrqc=; b=e98mV4j834Yv364KRUjV++945aypqHTLErWVEKt/Zlz2cRhEY3ScgZqy I/xFB+vNF9C+xl6OOKcRG4TlEU344IJ+HWYU2qT+EoZRCFm5Zc/9HwJny iNde+F1zGuHNk/WtziZwnK+bhAy0XHTdK1VT0+6YnrS33+yVPvGrt46SA GHikiazd2ZoJldM4vDL1HQXl7KBfuXqsKZWpF9CbTFXZH4bdtimWiU8gw 3wfy0qWn50BfktoHhr1RXUodwplRzIUCez1miDyto9/3LbHNSXl/tr52F ae3QRYpf3cbsgmm3k6YvSUdsIrx0nVMMVtx2V1oC0C8qBhWdE3OZeSfH1 w==; X-IronPort-AV: E=McAfee;i="6500,9779,10636"; a="420818772" X-IronPort-AV: E=Sophos;i="5.98,225,1673942400"; d="scan'208";a="420818772" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Mar 2023 15:37:59 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10636"; a="738826853" X-IronPort-AV: E=Sophos;i="5.98,225,1673942400"; d="scan'208";a="738826853" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga008.fm.intel.com with ESMTP; 01 Mar 2023 15:37:59 -0800 From: Ricardo Neri To: Tony Luck , Dave Hansen , "Rafael J. Wysocki" , Reinette Chatre , Dan Williams , Len Brown Subject: [PATCH v7 17/24] watchdog/hardlockup: Decouple the hardlockup detector from perf Date: Wed, 1 Mar 2023 15:47:46 -0800 Message-Id: <20230301234753.28582-18-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230301234753.28582-1-ricardo.neri-calderon@linux.intel.com> References: <20230301234753.28582-1-ricardo.neri-calderon@linux.intel.com> X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Ravi V. Shankar" , Andi Kleen , Ricardo Neri , Ricardo Neri , Stephane Eranian , linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org, Nicholas Piggin , Andrew Morton , linuxppc-dev@lists.ozlabs.org Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" The current default implementation of the hardlockup detector assumes that it is implemented using perf events. However, the hardlockup detector can be driven by other sources of non-maskable interrupts (e.g., a properly configured timer). Group and wrap in #ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF all the code specific to perf: create and manage perf events, stop and start the perf- based detector. The generic portion of the detector (monitor the timers' thresholds, check timestamps and detect hardlockups as well as the implementation of arch_touch_nmi_watchdog()) is now selected with the new intermediate config symbol CONFIG_HARDLOCKUP_DETECTOR_CORE. The perf-based implementation of the detector selects the new intermediate symbol. Other implementations should do the same. Cc: Andi Kleen Cc: Nicholas Piggin Cc: Andrew Morton Cc: Stephane Eranian Cc: "Ravi V. Shankar" Cc: iommu@lists.linux-foundation.org Cc: linuxppc-dev@lists.ozlabs.org Reviewed-by: Tony Luck Signed-off-by: Ricardo Neri --- Changes since v6: * None Changes since v5: * None Changes since v4: * None Changes since v3: * Squashed into this patch a previous patch to make arch_touch_nmi_watchdog() part of the core detector code. Changes since v2: * Undid split of the generic hardlockup detector into a separate file. (Thomas Gleixner) * Added a new intermediate symbol CONFIG_HARDLOCKUP_DETECTOR_CORE to select generic parts of the detector (Paul E. McKenney, Thomas Gleixner). Changes since v1: * Make the generic detector code with CONFIG_HARDLOCKUP_DETECTOR. --- include/linux/nmi.h | 5 ++++- kernel/Makefile | 2 +- kernel/watchdog_hld.c | 32 ++++++++++++++++++++------------ lib/Kconfig.debug | 4 ++++ 4 files changed, 29 insertions(+), 14 deletions(-) diff --git a/include/linux/nmi.h b/include/linux/nmi.h index 75038cb2710e..a38c4509f9eb 100644 --- a/include/linux/nmi.h +++ b/include/linux/nmi.h @@ -94,8 +94,11 @@ static inline void hardlockup_detector_disable(void) {} # define NMI_WATCHDOG_SYSCTL_PERM 0444 #endif -#if defined(CONFIG_HARDLOCKUP_DETECTOR_PERF) +#if defined(CONFIG_HARDLOCKUP_DETECTOR_CORE) extern void arch_touch_nmi_watchdog(void); +#endif + +#if defined(CONFIG_HARDLOCKUP_DETECTOR_PERF) extern void hardlockup_detector_perf_stop(void); extern void hardlockup_detector_perf_restart(void); extern void hardlockup_detector_perf_disable(void); diff --git a/kernel/Makefile b/kernel/Makefile index 10ef068f598d..f35fad36cf81 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -91,7 +91,7 @@ obj-$(CONFIG_FAIL_FUNCTION) += fail_function.o obj-$(CONFIG_KGDB) += debug/ obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o obj-$(CONFIG_LOCKUP_DETECTOR) += watchdog.o -obj-$(CONFIG_HARDLOCKUP_DETECTOR_PERF) += watchdog_hld.o +obj-$(CONFIG_HARDLOCKUP_DETECTOR_CORE) += watchdog_hld.o obj-$(CONFIG_SECCOMP) += seccomp.o obj-$(CONFIG_RELAY) += relay.o obj-$(CONFIG_SYSCTL) += utsname_sysctl.o diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c index b352e507b17f..bb6435978c46 100644 --- a/kernel/watchdog_hld.c +++ b/kernel/watchdog_hld.c @@ -22,12 +22,8 @@ static DEFINE_PER_CPU(bool, hard_watchdog_warn); static DEFINE_PER_CPU(bool, watchdog_nmi_touch); -static DEFINE_PER_CPU(struct perf_event *, watchdog_ev); -static DEFINE_PER_CPU(struct perf_event *, dead_event); -static struct cpumask dead_events_mask; static unsigned long hardlockup_allcpu_dumped; -static atomic_t watchdog_cpus = ATOMIC_INIT(0); notrace void arch_touch_nmi_watchdog(void) { @@ -98,14 +94,6 @@ static inline bool watchdog_check_timestamp(void) } #endif -static struct perf_event_attr wd_hw_attr = { - .type = PERF_TYPE_HARDWARE, - .config = PERF_COUNT_HW_CPU_CYCLES, - .size = sizeof(struct perf_event_attr), - .pinned = 1, - .disabled = 1, -}; - void inspect_for_hardlockups(struct pt_regs *regs) { if (__this_cpu_read(watchdog_nmi_touch) == true) { @@ -157,6 +145,24 @@ void inspect_for_hardlockups(struct pt_regs *regs) return; } +#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF +#undef pr_fmt +#define pr_fmt(fmt) "NMI perf watchdog: " fmt + +static DEFINE_PER_CPU(struct perf_event *, watchdog_ev); +static DEFINE_PER_CPU(struct perf_event *, dead_event); +static struct cpumask dead_events_mask; + +static atomic_t watchdog_cpus = ATOMIC_INIT(0); + +static struct perf_event_attr wd_hw_attr = { + .type = PERF_TYPE_HARDWARE, + .config = PERF_COUNT_HW_CPU_CYCLES, + .size = sizeof(struct perf_event_attr), + .pinned = 1, + .disabled = 1, +}; + /* Callback function for perf event subsystem */ static void watchdog_overflow_callback(struct perf_event *event, struct perf_sample_data *data, @@ -298,3 +304,5 @@ int __init hardlockup_detector_perf_init(void) } return ret; } + +#endif /* CONFIG_HARDLOCKUP_DETECTOR_PERF */ diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug index c8b379e2e9ad..1ff53c5995b1 100644 --- a/lib/Kconfig.debug +++ b/lib/Kconfig.debug @@ -1025,9 +1025,13 @@ config BOOTPARAM_SOFTLOCKUP_PANIC Say N if unsure. +config HARDLOCKUP_DETECTOR_CORE + bool + config HARDLOCKUP_DETECTOR_PERF bool select SOFTLOCKUP_DETECTOR + select HARDLOCKUP_DETECTOR_CORE # # Enables a timestamp based low pass filter to compensate for perf based -- 2.25.1