From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <551AE7D4.3020608@ezchip.com>
Date: Tue, 31 Mar 2015 14:30:44 -0400
From: Chris Metcalf
To: Ingo Molnar
CC: Andrew Morton, Don Zickus, Andrew Jones, chai wen,
 Ulrich Obergfell, Fabian Frederick, Aaron Tomlin, Ben Zhang,
 Christoph Lameter, Frederic Weisbecker, Gilad Ben-Yossef,
 Steven Rostedt, open list
Subject: Re: [PATCH] watchdog: nohz: don't run watchdog on nohz_full cores
References: <1427741465-15747-1-git-send-email-cmetcalf@ezchip.com>
 <20150331072502.GA16754@gmail.com>
In-Reply-To: <20150331072502.GA16754@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/31/2015 03:25 AM, Ingo Molnar wrote:
> * cmetcalf@ezchip.com wrote:
>
>> From: Chris Metcalf
>>
>> Running watchdog can be a helpful debugging feature on regular
>> cores, but it's incompatible with nohz_full, since it forces
>> regular scheduling events. Accordingly, just exit out immediately
>> from any nohz_full core.
>>
>> An alternate approach would be to add a flags field or function to
>> smp_hotplug_thread to control on which cores the percpu threads
>> are created, but it wasn't clear that much mechanism was useful.
>>
>> [...]
> So what happens if someone wants to enable the lockup detector, with a
> long timeout, even on nohz-full CPUs? This patch makes that
> impossible.
>
> A better solution would be to tweak the defaults:
>
>   - to default the watchdog(s) to disabled when nohz-full is
>     enabled, even if HARDLOCKUP_DETECTOR=y or DETECT_HUNG_TASK=y, and
>     allow it to be re-enabled via its sysctl.

That's certainly a reasonable thing to do; it looks like just an #ifdef
at the top of watchdog.c would suffice.  Does this look right?

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 8a46d9d8a66f..c8555c211e65 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -25,7 +25,11 @@
 #include
 #include
 
+#ifdef CONFIG_NO_HZ_FULL
+int watchdog_user_enabled = 0;
+#else
 int watchdog_user_enabled = 1;
+#endif
 int __read_mostly watchdog_thresh = 10;
 #ifdef CONFIG_SMP
 int __read_mostly sysctl_softlockup_all_cpu_backtrace;

It doesn't look like I need to do anything else special to disable
HARDLOCKUP_DETECTOR, and khungtaskd can happily run on a non-nohz
core, so that should be OK.

What I was trying to achieve with my proposed patch was kind of
orthogonal: to allow the watchdog to run on standard cores, but not
run on nohz cores, so we could benefit from it on the cores where
it was safe for it to run.  Do you see value in this, or is it better
to just enable/disable all watchdog threads collectively?

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
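
For illustration, the per-core idea above (watchdog on standard cores,
none on nohz_full cores) can be modeled in plain userspace C.  This is
only a sketch under stated assumptions, not kernel code and not the
posted patch: cpu_mask_t, parse_cpu_list(), watchdog_cpus() and
watchdog_should_run() are hypothetical names, and the model is limited
to 64 CPUs.  It assumes the nohz_full set is given as a CPU list such
as "1-3,5", the format the nohz_full= boot parameter accepts.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpu_mask_t;

/*
 * Parse a CPU list such as "1-3,5" into a bitmask.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_cpu_list(const char *s, cpu_mask_t *out)
{
	cpu_mask_t mask = 0;

	while (*s) {
		char *end;
		long lo = strtol(s, &end, 10);
		long hi;

		if (end == s || lo < 0 || lo > 63)
			return -1;
		hi = lo;
		if (*end == '-') {
			/* Range "lo-hi": parse the upper bound. */
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo || hi > 63)
				return -1;
		}
		for (long cpu = lo; cpu <= hi; cpu++)
			mask |= (cpu_mask_t)1 << cpu;
		if (*end == ',')
			end++;
		else if (*end != '\0')
			return -1;
		s = end;
	}
	*out = mask;
	return 0;
}

/* CPUs that would run a watchdog thread: online minus nohz_full. */
static cpu_mask_t watchdog_cpus(cpu_mask_t online, cpu_mask_t nohz_full)
{
	return online & ~nohz_full;
}

/* Per-CPU check, analogous to bailing out early on a nohz_full core. */
static int watchdog_should_run(cpu_mask_t online, cpu_mask_t nohz_full,
			       int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	return (int)((watchdog_cpus(online, nohz_full) >> cpu) & 1);
}
```

With eight online CPUs (mask 0xFF) and a nohz_full list of "1-3,5",
watchdog_cpus() leaves CPUs 0, 4, 6 and 7, so the watchdog would still
cover the housekeeping cores.  A global default-off switch, by
contrast, would disable it on all eight until re-enabled via sysctl.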