From: Ricardo Neri
To: Tony Luck, Dave Hansen, "Rafael J. Wysocki", Reinette Chatre, Dan Williams, Len Brown
Cc: "Ravi V. Shankar", Andi Kleen, Ricardo Neri, Ricardo Neri, Stephane Eranian, linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org, linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v7 23/24] watchdog: Introduce hardlockup_detector_mark_unavailable()
Date: Wed, 1 Mar 2023 15:47:52 -0800
Message-Id: <20230301234753.28582-24-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <20230301234753.28582-1-ricardo.neri-calderon@linux.intel.com>
References: <20230301234753.28582-1-ricardo.neri-calderon@linux.intel.com>

The NMI watchdog may become unreliable during runtime. This is the case in
x86 if, for instance, the HPET-based hardlockup detector is in use and the
TSC counter becomes unstable. Introduce a new interface to mark the
hardlockup detector as unavailable in such cases.
When doing this, update the state of /proc/sys/kernel/nmi_watchdog to keep
it consistent.

Cc: Andi Kleen
Cc: Stephane Eranian
Cc: "Ravi V. Shankar"
Cc: iommu@lists.linux-foundation.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Ricardo Neri
---
Changes since v6:
 * Introduced this patch

Changes since v5:
 * N/A

Changes since v4
 * N/A

Changes since v3
 * N/A

Changes since v2:
 * N/A

Changes since v1:
 * N/A
---
 include/linux/nmi.h |  2 ++
 kernel/watchdog.c   | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index a38c4509f9eb..40a97139ec65 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -83,9 +83,11 @@ static inline void reset_hung_task_detector(void) { }

 #if defined(CONFIG_HARDLOCKUP_DETECTOR)
 extern void hardlockup_detector_disable(void);
+extern void hardlockup_detector_mark_unavailable(void);
 extern unsigned int hardlockup_panic;
 #else
 static inline void hardlockup_detector_disable(void) {}
+static inline void hardlockup_detector_mark_unavailable(void) {}
 #endif

 #if defined(CONFIG_HAVE_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 8e61f21e7e33..0e4fed6d95b9 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -47,6 +47,8 @@ static int __read_mostly nmi_watchdog_available;
 struct cpumask watchdog_cpumask __read_mostly;
 unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask);

+static void __lockup_detector_reconfigure(void);
+
 #ifdef CONFIG_HARDLOCKUP_DETECTOR

 # ifdef CONFIG_SMP
@@ -85,6 +87,24 @@ static int __init hardlockup_panic_setup(char *str)
 }
 __setup("nmi_watchdog=", hardlockup_panic_setup);

+/**
+ * hardlockup_detector_mark_unavailable - Mark the NMI watchdog as unavailable
+ *
+ * Indicate that the hardlockup detector has become unavailable. This may
+ * happen if the hardware resources that the detector uses have become
+ * unreliable.
+ */
+void hardlockup_detector_mark_unavailable(void)
+{
+	mutex_lock(&watchdog_mutex);
+
+	/* These variables can be updated without stopping the detector. */
+	nmi_watchdog_user_enabled = 0;
+	nmi_watchdog_available = false;
+
+	__lockup_detector_reconfigure();
+	mutex_unlock(&watchdog_mutex);
+}
 #endif /* CONFIG_HARDLOCKUP_DETECTOR */

 /*
-- 
2.25.1
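
For readers who want to try the state change outside the kernel, here is a
minimal userspace sketch of what the new helper does: take the lock, clear
both flags, recompute the run state. It is a model and not the kernel code.
Every model_* name is hypothetical, a pthread mutex stands in for
watchdog_mutex, and model_reconfigure() is an assumed simplification of
__lockup_detector_reconfigure() that only recomputes an enable bit.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the kernel state the patch touches. */
#define MODEL_NMI_WATCHDOG_ENABLED (1UL << 0)

static pthread_mutex_t model_watchdog_mutex = PTHREAD_MUTEX_INITIALIZER;
static int model_nmi_watchdog_user_enabled = 1; /* models /proc/sys/kernel/nmi_watchdog */
static bool model_nmi_watchdog_available = true;
static unsigned long model_watchdog_enabled = MODEL_NMI_WATCHDOG_ENABLED;

/*
 * Assumed simplification of the reconfigure step: derive the run state
 * from the two flags. The caller must hold model_watchdog_mutex.
 */
static void model_reconfigure(void)
{
	model_watchdog_enabled = 0;
	if (model_nmi_watchdog_available && model_nmi_watchdog_user_enabled)
		model_watchdog_enabled |= MODEL_NMI_WATCHDOG_ENABLED;
}

/* Mirrors the shape of hardlockup_detector_mark_unavailable() in the patch. */
static void model_mark_unavailable(void)
{
	pthread_mutex_lock(&model_watchdog_mutex);

	/* Clear the user-visible knob and the availability flag together. */
	model_nmi_watchdog_user_enabled = 0;
	model_nmi_watchdog_available = false;

	model_reconfigure();
	pthread_mutex_unlock(&model_watchdog_mutex);
}

/* Usage example: the detector is on, then its hardware becomes unreliable. */
static void model_self_check(void)
{
	model_mark_unavailable();
	assert(model_nmi_watchdog_user_enabled == 0);
	assert(!model_nmi_watchdog_available);
	assert((model_watchdog_enabled & MODEL_NMI_WATCHDOG_ENABLED) == 0);
}
```

Clearing the user-enabled flag along with the availability flag is what keeps
the modeled sysctl value consistent with the detector state, as the commit
message describes.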