From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <551AE9D0.5090103@ezchip.com>
Date: Tue, 31 Mar 2015 14:39:12 -0400
From: Chris Metcalf
To: Christoph Lameter
CC: Andrew Morton, Don Zickus, Andrew Jones, chai wen, Ingo Molnar,
 Ulrich Obergfell, Fabian Frederick, Aaron Tomlin, Ben Zhang,
 Frederic Weisbecker, Gilad Ben-Yossef, Steven Rostedt, open list
Subject: Re: [PATCH] watchdog: nohz: don't run watchdog on nohz_full cores
References: <1427741465-15747-1-git-send-email-cmetcalf@ezchip.com>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/31/2015 06:17 AM, Christoph Lameter wrote:
> On Mon, 30 Mar 2015, cmetcalf@ezchip.com wrote:
>
>> Running watchdog can be a helpful debugging feature on regular
>> cores, but it's incompatible with nohz_full, since it forces
>> regular scheduling events. Accordingly, just exit out immediately
>> from any nohz_full core.
> At this point we still have a timer tick every second. So just change the
> way the checking occurs that it can be done during the once per second
> tick for now? If the tick idle period is expanded later maybe only run the
> watchdog activity during those inevitable ticks?

Someone recently suggested disabling the forced once-per-second tick :)

https://lkml.org/lkml/2014/10/31/364

I am hopeful that we can continue to drive toward that goal, and
reluctant to suggest that we pile anything else onto the existing
scheduler_tick_max_deferment() assumptions...

> It may be best if the watchdog could be configured as to which processors
> it should run on?

I mentioned this in my reply to Ingo. My naive code was simply to
force the cpuset of watchdog-enabled cores to be the complement of
the nohz_full cpuset. However, you could also imagine coding up support
for a generic cpuset (defaulting in the obvious ways) that could
still be overridden.

This may come back to a question of just why one believes that nohz_full
is a good thing in the first place.
For folks that are doing it just
to improve performance, power, etc, generally, it may not matter much
whether the watchdog ticks occasionally. But for folks who are doing it to
establish cores that are run completely tick-free for days on end so
they can help process 100 Gb packet streams and never drop a packet,
the calculus is a little different.

My bias is to say that once you've tagged a core as nohz_full, you never
want to run the watchdog on it. But it's worth supporting multiple
uses of nohz_full, certainly.

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com
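
The "complement of the nohz_full cpuset" idea described in the message above can be sketched in userspace C. This is an illustration only, not code from the patch under discussion: the names cpu_mask_t, NR_CPUS_SKETCH, possible_mask(), watchdog_mask_from_nohz_full() and watchdog_allowed_on() are hypothetical, and a plain 64-bit bitmask stands in for the kernel's cpumask type.

```c
#include <stdint.h>

/* Hypothetical stand-in for a kernel cpumask: bit N set means CPU N is in the set. */
typedef uint64_t cpu_mask_t;

/* Assumed upper bound on CPUs for this sketch (limited by the 64-bit mask). */
#define NR_CPUS_SKETCH 64

/* Build a mask with one bit set for each of the first nr_cpus CPUs. */
cpu_mask_t possible_mask(unsigned int nr_cpus)
{
    if (nr_cpus >= NR_CPUS_SKETCH)
        return ~(cpu_mask_t)0;          /* avoid shifting by the full width */
    return ((cpu_mask_t)1 << nr_cpus) - 1;
}

/*
 * Default watchdog set: every possible CPU that is NOT nohz_full,
 * i.e. the complement of nohz_full restricted to the possible CPUs.
 */
cpu_mask_t watchdog_mask_from_nohz_full(cpu_mask_t nohz_full,
                                        unsigned int nr_cpus)
{
    return possible_mask(nr_cpus) & ~nohz_full;
}

/* Return 1 if the watchdog should run on the given CPU, 0 otherwise. */
int watchdog_allowed_on(cpu_mask_t watchdog_mask, unsigned int cpu)
{
    if (cpu >= NR_CPUS_SKETCH)
        return 0;
    return (int)((watchdog_mask >> cpu) & 1);
}
```

For example, with 4 CPUs and nohz_full covering CPUs 1-3 (mask 0x0E), watchdog_mask_from_nohz_full(0x0E, 4) yields 0x01, so only CPU 0 would run the watchdog. A generic, overridable cpuset of the kind the message mentions could be modeled by letting a caller-supplied mask replace this computed default.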