From: Chris Metcalf <cmetcalf@ezchip.com>
To: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Andrew Jones <drjones@redhat.com>,
chai wen <chaiw.fnst@cn.fujitsu.com>,
Ulrich Obergfell <uobergfe@redhat.com>,
Fabian Frederick <fabf@skynet.be>,
Aaron Tomlin <atomlin@redhat.com>, Ben Zhang <benzh@chromium.org>,
"Christoph Lameter" <cl@linux.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
"Gilad Ben-Yossef" <gilad@benyossef.com>,
Steven Rostedt <rostedt@goodmis.org>,
"open list" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] watchdog: nohz: don't run watchdog on nohz_full cores
Date: Thu, 2 Apr 2015 09:49:45 -0400
Message-ID: <551D48F9.6090101@ezchip.com>
In-Reply-To: <20150402133502.GA175361@redhat.com>

On 4/2/2015 9:35 AM, Don Zickus wrote:
> On Tue, Mar 31, 2015 at 02:30:44PM -0400, Chris Metcalf wrote:
>> On 03/31/2015 03:25 AM, Ingo Molnar wrote:
>>> * cmetcalf@ezchip.com <cmetcalf@ezchip.com> wrote:
>>>
>>>> From: Chris Metcalf <cmetcalf@ezchip.com>
>>>>
>>>> Running watchdog can be a helpful debugging feature on regular
>>>> cores, but it's incompatible with nohz_full, since it forces
>>>> regular scheduling events. Accordingly, just exit out immediately
>>>> from any nohz_full core.
>>>>
>>>> An alternate approach would be to add a flags field or function to
>>>> smp_hotplug_thread to control on which cores the percpu threads
>>>> are created, but it wasn't clear that much mechanism was useful.
>>>>
>>>> [...]
>>> So what happens if someone wants to enable the lockup detector, with a
>>> long timeout, even on nohz-full CPUs? This patch makes that
>>> impossible.
>>>
>>> A better solution would be to tweak the defaults:
>>>
>>> - to default the watchdog(s) to disabled when nohz-full is
>>> enabled, even if HARDLOCKUP_DETECTOR=y or DETECT_HUNG_TASK=y, and
>>> allow it to be re-enabled via its sysctl.
>> That's certainly a reasonable thing to do; it looks like just an #ifdef
>> at the top of watchdog.c would suffice. Does this look right?
>>
>> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
>> index 8a46d9d8a66f..c8555c211e65 100644
>> --- a/kernel/watchdog.c
>> +++ b/kernel/watchdog.c
>> @@ -25,7 +25,11 @@
>>  #include <linux/kvm_para.h>
>>  #include <linux/perf_event.h>
>>  
>> +#ifdef CONFIG_NO_HZ_FULL
>> +int watchdog_user_enabled = 0;
>> +#else
>>  int watchdog_user_enabled = 1;
>> +#endif
>>  int __read_mostly watchdog_thresh = 10;
>>  #ifdef CONFIG_SMP
>>  int __read_mostly sysctl_softlockup_all_cpu_backtrace;
>>
>> It doesn't look like I need to do anything else special to disable
>> HARDLOCKUP_DETECTOR, and khungtaskd can happily run on
>> a non-nohz core, so that should be OK.
>>
>> What I was trying to achieve with my proposed patch was kind
>> of orthogonal: to allow the watchdog to run on standard cores,
>> but not run on nohz cores, so we could benefit from it on the
>> cores where it was safe for it to run. Do you see value in this,
>> or better to just enable/disable all watchdog threads collectively?
>
> Hmm, I am not sure I am a big fan of this approach. I know RHEL keeps the
> watchdogs enabled for customers and it would be a regression if we disabled
> it. And at the same time, I could see RHEL leaning towards enabling
> CONFIG_NO_HZ_FULL, which would just delay this problem a number of years
> until RHEL-8 gets around to ramping up.
>
> So I guess I would prefer to figure out a better co-existing solution now.
>
> Can I ask how the NO_HZ_FULL technology works from userspace? Is there a
> system command that has to be sent? How does the kernel know to turn off
> ticks and trust userspace to do the right thing?

The NO_HZ_FULL option, when configured into the kernel, lets
you boot with "nohz_full=1-15" (or whatever cpumask you like),
typically in conjunction with "isolcpus=1-15". At that point no tasks
will run on those cores until explicitly placed there by affinity, and
once a task is there and running in userspace, the kernel will
automatically get out of its way and not interrupt it at all. This lets
such tasks run with 100.000% of the cpu, which is a requirement for many
user-space device drivers running high-throughput devices.
(This is typically the use case for the tile architecture customers.)

So, other than a boot flag, there are no system commands or
other APIs to deal with.
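
To make the "1-15" notation concrete, here is a rough userspace sketch
of turning such a cpu list into a bitmask. parse_cpulist() is a
hypothetical helper written for this illustration, limited to cpus
0..63; it is not the kernel's own parser and only assumes the
comma-and-range syntax shown above.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a cpu list such as "1-15" or "0,2-3" into a bitmask, one bit per
 * cpu. Hypothetical userspace helper limited to cpus 0..63; it is not the
 * kernel's own parser. Returns 0 on success, -1 on malformed input.
 */
int parse_cpulist(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	if (*s == '\0')
		return -1;
	for (;;) {
		char *end;
		long lo, hi;

		/* Every entry must start with a digit. */
		if (*s < '0' || *s > '9')
			return -1;
		lo = strtol(s, &end, 10);
		hi = lo;
		if (*end == '-') {
			/* Range entry "lo-hi". */
			s = end + 1;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtol(s, &end, 10);
		}
		if (lo > hi || hi > 63)
			return -1;
		for (long c = lo; c <= hi; c++)
			m |= UINT64_C(1) << c;
		if (*end == '\0')
			break;
		if (*end != ',')
			return -1;
		s = end + 1;
	}
	*mask = m;
	return 0;
}
```

With this sketch, parse_cpulist("1-15", &m) leaves bits 1 through 15
set in m and bit 0 clear, i.e. cpu 0 is the only core left for
ordinary kernel housekeeping on a 16-core system.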

Part of the requirement, though, is that there can be only one task
bound and runnable on that cpu; otherwise the kernel has to get
involved to do the context switching off of the scheduler tick.
This is why having the standard per-cpu watchdog kernel thread doesn't
work in this context.

I continue to suspect that the right model here is to disable the
watchdog specifically on the cores that the user has tagged with
the nohz_full boot argument. I agree that there might be a case
to be made for leaving the watchdog conditionally enabled (as suggested
by Ingo), but it should be possible to have the watchdogs on
the nohz_full cores be turned off completely if desired.
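
As a rough illustration of that model (not kernel code, and not the
patch itself), the set of cores that would still run a watchdog is
just the online cores minus the nohz_full ones. The helper names and
the 64-bit mask representation below are assumptions made for this
sketch.

```c
#include <stdint.h>

/*
 * Given the online cpus and the cpus tagged nohz_full (one bit per cpu),
 * return the cpus on which a per-cpu watchdog would still run under the
 * model described above. Hypothetical helper, not kernel code.
 */
uint64_t watchdog_cpus(uint64_t online, uint64_t nohz_full)
{
	return online & ~nohz_full;
}

/* Return 1 if the watchdog would run on 'cpu' (0..63), else 0. */
int watchdog_runs_on(uint64_t online, uint64_t nohz_full, unsigned int cpu)
{
	if (cpu > 63)
		return 0;
	return (int)((watchdog_cpus(online, nohz_full) >> cpu) & 1);
}
```

For a 16-core system booted with "nohz_full=1-15", online is 0xFFFF
and nohz_full is 0xFFFE, so only cpu 0 keeps its watchdog.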
--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com