public inbox for linux-kernel@vger.kernel.org
From: Frederic Weisbecker <fweisbec@gmail.com>
To: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Don Zickus <dzickus@redhat.com>, Ingo Molnar <mingo@kernel.org>,
	Andrew Jones <drjones@redhat.com>,
	Ulrich Obergfell <uobergfe@redhat.com>,
	Fabian Frederick <fabf@skynet.be>,
	Aaron Tomlin <atomlin@redhat.com>, Ben Zhang <benzh@chromium.org>,
	Christoph Lameter <cl@linux.com>,
	Gilad Ben-Yossef <gilad@benyossef.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	linux-kernel@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>,
	linux-doc@vger.kernel.org, Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH v10 1/3] smpboot: allow excluding cpus from the smpboot threads
Date: Fri, 1 May 2015 23:23:30 +0200
Message-ID: <20150501212329.GA4179@lerouge>
In-Reply-To: <5543DABF.4060303@ezchip.com>

On Fri, May 01, 2015 at 03:57:51PM -0400, Chris Metcalf wrote:
> On 05/01/2015 04:53 AM, Frederic Weisbecker wrote:
> >>+	/* Unpark any threads that were voluntarily parked. */
> >>+	for_each_cpu_not(cpu, &ht->cpumask) {
> >>+		if (cpu_online(cpu)) {
> >>+			struct task_struct *tsk = *per_cpu_ptr(ht->store, cpu);
> >>+			if (tsk)
> >>+				kthread_unpark(tsk);
> >I'm still not clear why we are doing that. kthread_stop() should be able
> >to handle parked kthreads, otherwise it needs to be fixed.
> 
> Checking without the unpark, it's actually only a problem with nohz_full.
> In a system without nohz_full, the kthreads are able to stop even when
> they are parked; it's only in the nohz_full case that things wedge.

Ok. So this isn't a proper fix but a workaround for a bug that we don't
understand yet. In that case I'd much prefer that you remove the workaround
(I mean this unpark loop), because it hides the issue. And hiding
the bug is the last thing we want if we plan to fix it properly.

> 
> For example, booting with only cpu 0 as a housekeeping core (and
> therefore all watchdogs 1-35 on my 36-core tilegx are parked), and
> immediately doing "echo 0 > /proc/sys/kernel/watchdog", I see
> (via SysRq ^O-l) the first parked watchdog, on cpu 1, hung with:
> 
>   frame 0: 0xfffffff7000f2928 lock_hrtimer_base+0xb8/0xc0
>   frame 1: 0xfffffff7000f2a28 hrtimer_try_to_cancel+0x40/0x170
>   frame 2: 0xfffffff7000f2a28 hrtimer_try_to_cancel+0x40/0x170
>   frame 3: 0xfffffff7000f2b98 hrtimer_cancel+0x40/0x68
>   frame 4: 0xfffffff70014cce0 watchdog_disable+0x50/0x70
>   frame 5: 0xfffffff70008c2d0 smpboot_thread_fn+0x350/0x438
>   frame 6: 0xfffffff700084b28 kthread+0x160/0x178

Have you tried the same thing on a kernel without your patchset applied?

> The other cores are all idle.
> 
> I have no idea why lock_hrtimer_base() is hanging; perhaps the
> hrtimer_cpu_base lock is taken by some other task that is now
> scheduled out.

No, it's a spinlock; tasks can't sleep while holding it. But it does look
like a deadlock.
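
To illustrate the point about spinlocks, here is a minimal userspace sketch,
assuming C11 atomics. The toy_spinlock type and its helpers are hypothetical
names made up for this example, not kernel API. A waiter only makes progress
once the holder releases the lock, and since a holder can't sleep, a waiter
that spins forever means the lock is never released, not that its owner got
scheduled out. The bounded variant below gives up after max_spins attempts
only so that the demo terminates; a real spinlock would keep spinning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a kernel spinlock; not kernel API. */
struct toy_spinlock {
	atomic_flag held;
};

#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/*
 * Try to take the lock, giving up after max_spins failed attempts.
 * Returns true if the lock was acquired, false if it stayed held
 * for the whole time (which is what a deadlock looks like to a waiter).
 */
static bool toy_spin_lock_bounded(struct toy_spinlock *lock,
				  unsigned long max_spins)
{
	assert(lock != NULL);
	for (unsigned long i = 0; i < max_spins; i++) {
		/* test_and_set returns the old value: false means we own it now. */
		if (!atomic_flag_test_and_set_explicit(&lock->held,
						       memory_order_acquire))
			return true;
	}
	return false;
}

/* Release the lock so that a spinning waiter can proceed. */
static void toy_spin_unlock(struct toy_spinlock *lock)
{
	assert(lock != NULL);
	atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

Taking the lock once succeeds; a second attempt without an intervening
toy_spin_unlock() returns false no matter how large max_spins is, which is
the userspace analogue of a waiter wedged in a lock function.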

> 
> The config does not have NO_HZ_FULL_ALL or NO_HZ_FULL_SYSIDLE
> set, and does have RCU_FAST_NO_HZ and RCU_NOCB_CPU_ALL.
> 
> I don't really know how to start debugging this, but I do know that
> unparking the threads first avoids the issue :-)

Do you have CONFIG_PROVE_LOCKING=y?

I can't check that myself until the middle of next week.

> 
> -- 
> Chris Metcalf, EZChip Semiconductor
> http://www.ezchip.com
> 
