All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Metcalf <cmetcalf@ezchip.com>
To: Don Zickus <dzickus@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Andrew Jones <drjones@redhat.com>,
	chai wen <chaiw.fnst@cn.fujitsu.com>,
	Ingo Molnar <mingo@kernel.org>,
	Ulrich Obergfell <uobergfe@redhat.com>,
	Fabian Frederick <fabf@skynet.be>,
	Aaron Tomlin <atomlin@redhat.com>, Ben Zhang <benzh@chromium.org>,
	Christoph Lameter <cl@linux.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Gilad Ben-Yossef <gilad@benyossef.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: [PATCH v2] watchdog: nohz: don't run watchdog on nohz_full cores
Date: Mon, 30 Mar 2015 15:32:55 -0400	[thread overview]
Message-ID: <5519A4E7.8040708@ezchip.com> (raw)
In-Reply-To: <20150330191245.GO162412@redhat.com>

On 03/30/2015 03:12 PM, Don Zickus wrote:
> On Mon, Mar 30, 2015 at 02:51:05PM -0400, cmetcalf@ezchip.com wrote:
>> From: Chris Metcalf <cmetcalf@ezchip.com>
>>
>> Running watchdog can be a helpful debugging feature on regular
>> cores, but it's incompatible with nohz_full, since it forces
>> regular scheduling events.  Accordingly, just exit out immediately
>> from any nohz_full core.
>>
>> An alternate approach would be to add a flags field or function to
>> smp_hotplug_thread to control on which cores the percpu threads
>> are created, but it wasn't clear that much mechanism was useful.
> Hi Chris,
>
> It seems like the correct solution would be to hook into the idle_loop
> somehow.  If the cpu is idle, then it seems unlikely that a lockup could
> occur.

With nohz_full, though, the cpu might be running userspace code
with the intention of keeping kernel ticks disabled.  Even returning
to kernel mode to try to figure out if we "should" be running the
watchdog on a given core will induce exactly the kind of interrupts
that nohz_full is designed to prevent.

My assumption is generally that nohz_full cores don't spend a lot of
time in the kernel anyway, as they are optimized for user space.

I guess you could imagine doing something per-cpu on the nohz_full
cores where we effectively call watchdog_disable() whenever a
nohz_full core enters userspace, and watchdog_enable() whenever it
enters the kernel.  We could add some per-cpu state in the watchdog
code to track whether that core was currently enabled or disabled
to avoid double-enabling or double-disabling.  I would think
context_tracking_user_exit()/_enter() would be the place to do this.

This feels like a lot of overhead, potentially.  Thoughts?

> My fear with this apporach is a lockup would occur on the nohz cpu and it
> would go undetected because that cpu is disabled.  Further no printk is
> thrown out to even indicate a cpu is disabled making it more difficult to
> debug.

Assuming we stick with my simple "exit the watchdog thread" model,
we probably likely wouldn't want to warn for every nohz_full core, but a
one-shot message makes sense, e.g. just "watchdog: disabled on
nohz_full cores".

>> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
>> ---
>>   kernel/watchdog.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>> index 3174bf8e3538..8a46d9d8a66f 100644
>> --- a/kernel/watchdog.c
>> +++ b/kernel/watchdog.c
>> @@ -19,6 +19,7 @@
>>   #include <linux/sysctl.h>
>>   #include <linux/smpboot.h>
>>   #include <linux/sched/rt.h>
>> +#include <linux/tick.h>
>>   
>>   #include <asm/irq_regs.h>
>>   #include <linux/kvm_para.h>
>> @@ -431,6 +432,10 @@ static void watchdog_enable(unsigned int cpu)
>>   	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>>   	hrtimer->function = watchdog_timer_fn;
>>   
>> +	/* nohz_full cpus do not do watchdog checking. */
>> +	if (tick_nohz_full_cpu(cpu))
>> +		do_exit(0);
>> +
>>   	/* Enable the perf event */
>>   	watchdog_nmi_enable(cpu);
>>   
>> -- 
>> 2.1.2
>>

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


  reply	other threads:[~2015-03-30 19:33 UTC|newest]

Thread overview: 106+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-30 18:51 [PATCH] watchdog: nohz: don't run watchdog on nohz_full cores cmetcalf
2015-03-30 19:12 ` Don Zickus
2015-03-30 19:32   ` Chris Metcalf [this message]
2015-03-30 20:02     ` [PATCH v2] " Don Zickus
2015-04-02 15:19       ` Frederic Weisbecker
2015-03-31  2:04   ` [PATCH] " Mike Galbraith
2015-03-31  6:34     ` Mike Galbraith
2015-03-31 18:32     ` Chris Metcalf
2015-03-31  7:25 ` Ingo Molnar
2015-03-31 18:30   ` Chris Metcalf
2015-04-02 13:35     ` Don Zickus
2015-04-02 13:49       ` Chris Metcalf
2015-04-02 14:15         ` Don Zickus
2015-04-02 15:38           ` Frederic Weisbecker
2015-04-02 15:42             ` Chris Metcalf
2015-04-02 16:08               ` Don Zickus
2015-04-02 16:48               ` Frederic Weisbecker
2015-04-02 17:39                 ` [PATCH v3] watchdog: add watchdog_cpumask sysctl to assist nohz cmetcalf
2015-04-02 18:06                   ` Peter Zijlstra
2015-04-02 18:16                     ` Chris Metcalf
2015-04-02 18:33                       ` Peter Zijlstra
2015-04-02 18:49                         ` Chris Metcalf
2015-04-02 18:45                   ` Don Zickus
2015-04-03 16:08                     ` [PATCH v4 1/2] smpboot: allow excluding cpus from the smpboot threads cmetcalf
2015-04-03 16:08                       ` [PATCH v4 2/2] watchdog: add watchdog_exclude sysctl to assist nohz cmetcalf
2015-04-05 16:46                         ` Ulrich Obergfell
2015-04-06 19:45                           ` [PATCH v5 0/2] nohz/watchdog/smp_hotplug_thread changes cmetcalf
2015-04-06 19:45                             ` [PATCH v5 1/2] smpboot: allow excluding cpus from the smpboot threads cmetcalf
2015-04-08 13:28                               ` Frederic Weisbecker
2015-04-08 14:06                                 ` Chris Metcalf
2015-04-08 17:29                                   ` Frederic Weisbecker
2015-04-06 19:45                             ` [PATCH v5 2/2] watchdog: add watchdog_exclude sysctl to assist nohz cmetcalf
2015-04-07 15:44                               ` Don Zickus
2015-04-07 15:56                               ` Sasha Levin
2015-04-07 17:49                                 ` Chris Metcalf
2015-04-08 14:01                               ` Frederic Weisbecker
2015-04-08 19:11                                 ` [PATCH v6 1/2] smpboot: allow excluding cpus from the smpboot threads cmetcalf
2015-04-08 19:11                                   ` [PATCH v6 2/2] watchdog: add watchdog_cpumask sysctl to assist nohz cmetcalf
2015-04-08 20:37                                   ` [PATCH v6 1/2] smpboot: allow excluding cpus from the smpboot threads Thomas Gleixner
2015-04-09 20:29                                     ` [PATCH] " Chris Metcalf
2015-04-10  1:58                                       ` Frederic Weisbecker
2015-04-10 16:33                                         ` Chris Metcalf
2015-04-12 19:14                                           ` Frederic Weisbecker
2015-04-13 16:06                                             ` Chris Metcalf
2015-04-13 21:54                                               ` Frederic Weisbecker
2015-04-14 19:37                                                 ` [PATCH v8 1/3] " Chris Metcalf
2015-04-14 19:37                                                   ` [PATCH v8 2/3] watchdog: add watchdog_cpumask sysctl to assist nohz Chris Metcalf
2015-04-16 10:46                                                     ` Ulrich Obergfell
2015-04-17 15:41                                                       ` Chris Metcalf
2015-04-22  8:20                                                         ` Ulrich Obergfell
2015-04-28 17:52                                                           ` Chris Metcalf
2015-04-29  8:48                                                             ` Ulrich Obergfell
2015-04-17  1:31                                                     ` Chai Wen
2015-04-17 16:10                                                       ` Chris Metcalf
2015-04-14 19:37                                                   ` [PATCH v8 3/3] procfs: treat parked tasks as sleeping for task state Chris Metcalf
2015-04-16 15:28                                                   ` [PATCH v8 1/3] smpboot: allow excluding cpus from the smpboot threads Frederic Weisbecker
2015-04-16 15:50                                                     ` Chris Metcalf
2015-04-16 16:48                                                       ` Frederic Weisbecker
2015-04-17 16:17                                                     ` Chris Metcalf
2015-04-17 18:37                                                     ` [PATCH v9 " Chris Metcalf
2015-04-17 18:37                                                       ` [PATCH v9 2/3] watchdog: add watchdog_cpumask sysctl to assist nohz Chris Metcalf
2015-04-21 12:32                                                         ` Ulrich Obergfell
2015-04-28 18:07                                                           ` Chris Metcalf
2015-04-29  9:49                                                             ` Ulrich Obergfell
2015-04-29 13:10                                                             ` Don Zickus
2015-04-21 14:07                                                         ` Ulrich Obergfell
2015-04-22 15:18                                                           ` Don Zickus
2015-04-25 15:42                                                             ` Ulrich Obergfell
2015-04-22 11:02                                                         ` Ulrich Obergfell
2015-04-22 15:21                                                           ` Don Zickus
2015-04-27 20:27                                                             ` Chris Metcalf
2015-04-28 15:17                                                               ` Don Zickus
2015-04-28 19:42                                                                 ` Andrew Morton
2015-04-30 19:39                                                                 ` [PATCH v10 0/3] add watchdog_cpumask to help nohz_full Chris Metcalf
2015-04-30 19:39                                                                   ` [PATCH v10 1/3] smpboot: allow excluding cpus from the smpboot threads Chris Metcalf
2015-05-01  8:53                                                                     ` Frederic Weisbecker
2015-05-01 19:57                                                                       ` Chris Metcalf
2015-05-01 21:23                                                                         ` Frederic Weisbecker
2015-05-04 22:06                                                                           ` Chris Metcalf
2015-06-03  2:34                                                                             ` Don Zickus
2015-06-04 17:25                                                                           ` Chris Metcalf
2015-05-01 20:00                                                                       ` [PATCH] smpboot: dynamically allocate the cpumask Chris Metcalf
2015-04-30 19:39                                                                   ` [PATCH v10 2/3] watchdog: add watchdog_cpumask sysctl to assist nohz Chris Metcalf
2015-04-30 20:00                                                                     ` Don Zickus
2015-04-30 20:09                                                                       ` Chris Metcalf
2015-05-01 13:46                                                                         ` Don Zickus
2015-04-30 19:39                                                                   ` [PATCH v10 3/3] procfs: treat parked tasks as sleeping for task state Chris Metcalf
2015-04-29 22:26                                                         ` [PATCH v9 2/3] watchdog: add watchdog_cpumask sysctl to assist nohz Andrew Morton
2015-04-29 22:26                                                         ` Andrew Morton
2015-04-17 18:37                                                       ` [PATCH v9 3/3] procfs: treat parked tasks as sleeping for task state Chris Metcalf
2015-04-29 22:26                                                         ` Andrew Morton
2015-04-29 22:26                                                       ` [PATCH v9 1/3] smpboot: allow excluding cpus from the smpboot threads Andrew Morton
2015-04-30 16:07                                                         ` Chris Metcalf
2015-04-14 15:23                                               ` [PATCH] " Frederic Weisbecker
2015-04-14 15:39                                                 ` Chris Metcalf
2015-04-14 17:57                                                   ` Thomas Gleixner
2015-04-10 20:48                                         ` [PATCH v7 1/3] " Chris Metcalf
2015-04-10 20:48                                           ` [PATCH v7 2/3] watchdog: add watchdog_cpumask sysctl to assist nohz Chris Metcalf
2015-04-10 20:48                                           ` [PATCH v7 3/3] procfs: treat parked tasks as sleeping for task state Chris Metcalf
2015-04-10 21:11                                           ` [PATCH v7 1/3] smpboot: allow excluding cpus from the smpboot threads Andrew Morton
2015-04-13 15:48                                             ` Chris Metcalf
2015-04-08 19:21                                 ` [PATCH v5 2/2] watchdog: add watchdog_exclude sysctl to assist nohz Chris Metcalf
2015-04-08 22:31                                   ` Frederic Weisbecker
2015-03-31 10:17 ` [PATCH] watchdog: nohz: don't run watchdog on nohz_full cores Christoph Lameter
2015-03-31 18:39   ` Chris Metcalf
2015-04-02 14:13     ` Christoph Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5519A4E7.8040708@ezchip.com \
    --to=cmetcalf@ezchip.com \
    --cc=akpm@linux-foundation.org \
    --cc=atomlin@redhat.com \
    --cc=benzh@chromium.org \
    --cc=chaiw.fnst@cn.fujitsu.com \
    --cc=cl@linux.com \
    --cc=drjones@redhat.com \
    --cc=dzickus@redhat.com \
    --cc=fabf@skynet.be \
    --cc=fweisbec@gmail.com \
    --cc=gilad@benyossef.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=uobergfe@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.