From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754792AbbCBTGj (ORCPT ); Mon, 2 Mar 2015 14:06:39 -0500
Received: from mx1.redhat.com ([209.132.183.28]:41060 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754322AbbCBTGi (ORCPT ); Mon, 2 Mar 2015 14:06:38 -0500
Date: Mon, 2 Mar 2015 14:06:33 -0500
From: Don Zickus
To: Andrew Morton
Cc: LKML, Ulrich Obergfell, Ingo Molnar
Subject: Re: [PATCH 6/9] watchdog: implement error handling for failure to set up hardware perf events
Message-ID: <20150302190633.GE105625@redhat.com>
References: <1423168825-156238-1-git-send-email-dzickus@redhat.com>
	<1423168825-156238-7-git-send-email-dzickus@redhat.com>
	<20150223131734.61ee63b5f4064e656f0da762@linux-foundation.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150223131734.61ee63b5f4064e656f0da762@linux-foundation.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Andrew,

Here is another patch that can be folded into:
watchdog-implement-error-handling-for-failure-to-set-up-hardware-perf-events.patch

Let me know if you want me to fold this into the original and repost or not.

Cheers,
Don

----------------8<-------------------
From: Don Zickus
Date: Mon, 2 Mar 2015 13:56:23 -0500
Subject: [PATCH 2/2] watchdog: Update comments to explain some code

This patch was written to update some comments and concerns from Andrew and
is expected to be folded into patch
watchdog-implement-error-handling-for-failure-to-set-up-hardware-perf-events.patch

Resolves:
- comments around barriers
- comments around blind shutdown of hardware lockup detector
- printk letting someone know hardlockup detector is being shut down

Suggested-by: Ulrich Obergfell
Signed-off-by: Don Zickus
---
 kernel/watchdog.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 55c2a4f..f2be11a 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -495,6 +495,12 @@ static void watchdog(unsigned int cpu)
 	 * failure path. Check for failures that can occur asynchronously -
 	 * for example, when CPUs are on-lined - and shut down the hardware
 	 * perf event on each CPU accordingly.
+	 *
+	 * The only non-obvious place this bit can be cleared is through
+	 * watchdog_nmi_enable(), so a pr_info() is placed there. Placing a
+	 * pr_info here would be too noisy as it would result in a message
+	 * every few seconds if the hardlockup was disabled but the softlockup
+	 * enabled.
 	 */
 	if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
 		watchdog_nmi_disable(cpu);
@@ -546,6 +552,9 @@ static int watchdog_nmi_enable(unsigned int cpu)
 		 * Disable the hard lockup detector if _any_ CPU fails to set up
 		 * set up the hardware perf event. The watchdog() function checks
 		 * the NMI_WATCHDOG_ENABLED bit periodically.
+		 *
+		 * The barriers are for syncing up watchdog_enabled across all the
+		 * cpus, as clear_bit() does not use barriers.
 		 */
 		smp_mb__before_atomic();
 		clear_bit(NMI_WATCHDOG_ENABLED_BIT, &watchdog_enabled);
@@ -564,6 +573,9 @@ static int watchdog_nmi_enable(unsigned int cpu)
 	else
 		pr_err("disabled (cpu%i): unable to create perf event: %ld\n",
 			cpu, PTR_ERR(event));
+
+	pr_info("Shutting down hard lockup detector on all cpus\n");
+
 	return PTR_ERR(event);
 
 	/* success path */
-- 
1.7.1
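
For anyone reading the comments above without the kernel tree at hand, here is a rough userspace sketch of the same pattern in C11. It is an analogy only, not the kernel code: atomic_thread_fence() stands in for smp_mb__before_atomic(), a relaxed atomic_fetch_and_explicit() stands in for clear_bit(), and printf() stands in for pr_info(). All sketch_* names are hypothetical. The trailing fence is an assumption, since the hunk above only shows the barrier before clear_bit() while the new comment speaks of "barriers" in the plural.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for watchdog_enabled and its bits. */
#define SKETCH_NMI_ENABLED_BIT	0
#define SKETCH_NMI_ENABLED	(1UL << SKETCH_NMI_ENABLED_BIT)
#define SKETCH_SOFT_ENABLED	(1UL << 1)

static atomic_ulong sketch_enabled = SKETCH_NMI_ENABLED | SKETCH_SOFT_ENABLED;

/* Counts how often the shutdown message was emitted. */
static int sketch_shutdown_messages;

/*
 * Failure path, analogous to the error branch of watchdog_nmi_enable().
 * The relaxed read-modify-write implies no ordering by itself, so it is
 * bracketed by full fences. The message is printed here, once, because
 * this is the non-obvious place where the bit gets cleared.
 */
static void sketch_disable_hardlockup(int cpu)
{
	assert(cpu >= 0);

	atomic_thread_fence(memory_order_seq_cst);	/* ~ smp_mb__before_atomic() */
	atomic_fetch_and_explicit(&sketch_enabled, ~SKETCH_NMI_ENABLED,
				  memory_order_relaxed);	/* ~ clear_bit() */
	atomic_thread_fence(memory_order_seq_cst);	/* assumed trailing barrier */

	printf("Shutting down hard lockup detector on all cpus (cpu%d)\n", cpu);
	sketch_shutdown_messages++;
}

/*
 * Periodic path, analogous to the check in watchdog(). It runs every few
 * seconds, so it stays silent; printing here would repeat the message for
 * as long as the soft lockup detector remains enabled.
 * Returns true when the hardware event should be shut down on this CPU.
 */
static bool sketch_periodic_check(void)
{
	unsigned long v = atomic_load_explicit(&sketch_enabled,
					       memory_order_relaxed);

	return !(v & SKETCH_NMI_ENABLED);
}
```

Calling sketch_disable_hardlockup() once and then sketch_periodic_check() any number of times yields a single message, which is the behaviour the new comment in watchdog() argues for.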