From: Oleg Nesterov <oleg@redhat.com>
To: Johannes Berg <johannes@sipsolutions.net>
Cc: Ingo Molnar <mingo@elte.hu>,
	Zdenek Kabelac <zdenek.kabelac@gmail.com>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: INFO: possible circular locking dependency at cleanup_workqueue_thread
Date: Tue, 19 May 2009 18:09:36 +0200
Message-ID: <20090519160936.GA25720@redhat.com>
In-Reply-To: <1242747203.4797.39.camel@johannes.local>

On 05/19, Johannes Berg wrote:
>
> On Tue, 2009-05-19 at 14:00 +0200, Oleg Nesterov wrote:
>
> > > I'm not familiar enough with the code -- but what are we really trying
> > > to do in CPU_POST_DEAD? It seems to me that at that time things must
> > > already be off the CPU, so ...?
> >
> > Yes, this cpu is dead, we should do cleanup_workqueue_thread() to kill
> > cwq->thread.
> >
> > > On the other hand that calls
> > > flush_cpu_workqueue() so it seems it would actually wait for the work to
> > > be executed on some other CPU, within the CPU_POST_DEAD notification?
> >
> > Yes. Because we can't just kill cwq->thread, we can have the pending
> > work_structs so we have to flush.
> >
> > Why can't we move these works to another CPU? We can, but this doesn't
> > really help. Because in any case we should at least wait for
> > cwq->current_work to complete.
> >
> > Why do we use CPU_POST_DEAD, and not (say) CPU_DEAD to flush/kill ?
> > Because work->func() can sleep in get_online_cpus(), we can't flush
> > until we drop cpu_hotplug.lock.
>
> Right. But exactly this happens in the hibernate case --

Not sure I understand your "exactly this" ;)

But your explanation of the deadlock below looks great!

> the hibernate
> code calls kernel/cpu.c:disable_nonboot_cpus() which calls _cpu_down()
> which calls raw_notifier_call_chain(&cpu_chain, CPU_POST_DEAD... Sadly,
> it does so while holding the cpu_add_remove_lock, which happens to
> have the dependencies outlined in the original email...
>
> The same happens in cpu_down() (without leading _) which you can trigger
> from sysfs by manually removing the CPU, so it's not hibernate specific.

Except I don't understand how cpu_add_remove_lock makes a difference...
And thus I can't understand why cpu_down() (called without any locks held)
has the same problem. Please see below.

> Anyway, you can have a deadlock like this:
>
> CPU 3			CPU 2				CPU 1
> 							suspend/hibernate
> 			something:
> 			rtnl_lock()			device_pm_lock()
> 							-> mutex_lock(&dpm_list_mtx)
>
> 			mutex_lock(&dpm_list_mtx)
>
> linkwatch_work
>  -> rtnl_lock()
> 							disable_nonboot_cpus()

let's suppose disable_nonboot_cpus() does not take cpu_add_remove_lock,

> 							-> flush CPU 3 workqueue

in this case the deadlock is still here?

We can't flush because we hold a lock (dpm_list_mtx) which depends on
another lock (rtnl) taken by work->func(). This is the "classical" bug
with flush.

No?
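
To make the argument concrete, here is a minimal userspace sketch (not
kernel code) that models the scenario above as a wait-for graph and checks
it for a cycle. Everything in it is an assumption made for illustration:
the enum names, the dep_* helpers, and in particular the CWQ_FLUSH node,
which treats "waiting for the CPU 3 workqueue to drain" as a pseudo-lock.
The two edges through cpu_add_remove_lock are only a guess at how it could
be modelled. The point is that the cycle exists with or without them.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical node names. CWQ_FLUSH is a pseudo-lock standing for
 * "waiting until the CPU 3 workqueue has run all of its works". */
enum { RTNL, DPM_LIST_MTX, CWQ_FLUSH, CPU_ADD_REMOVE_LOCK, NR_NODES };

struct dep_graph {
	/* edge[a][b]: somebody holds a while waiting for b */
	bool edge[NR_NODES][NR_NODES];
};

static void dep_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

static void dep_add(struct dep_graph *g, int held, int wanted)
{
	g->edge[held][wanted] = true;
}

/* Depth-first search: is 'target' reachable from 'cur' via >= 1 edge? */
static bool dep_reaches(const struct dep_graph *g, int cur, int target,
			bool *seen)
{
	seen[cur] = true;
	for (int next = 0; next < NR_NODES; next++) {
		if (!g->edge[cur][next])
			continue;
		if (next == target)
			return true;
		if (!seen[next] && dep_reaches(g, next, target, seen))
			return true;
	}
	return false;
}

/* A cycle exists if some node can reach itself. */
static bool dep_has_cycle(const struct dep_graph *g)
{
	for (int n = 0; n < NR_NODES; n++) {
		bool seen[NR_NODES] = { false };

		if (dep_reaches(g, n, n, seen))
			return true;
	}
	return false;
}

/* Build the three-CPU scenario from the diagram above. */
static void build_scenario(struct dep_graph *g, bool with_cpu_add_remove_lock)
{
	dep_init(g);
	/* CPU 2: rtnl_lock(), then mutex_lock(&dpm_list_mtx) */
	dep_add(g, RTNL, DPM_LIST_MTX);
	/* CPU 1: holds dpm_list_mtx, flushes the CPU 3 workqueue */
	dep_add(g, DPM_LIST_MTX, CWQ_FLUSH);
	/* CPU 3: linkwatch_work needs rtnl_lock() before it can finish */
	dep_add(g, CWQ_FLUSH, RTNL);

	if (with_cpu_add_remove_lock) {
		/* Assumed modelling: taken under dpm_list_mtx, held over flush */
		dep_add(g, DPM_LIST_MTX, CPU_ADD_REMOVE_LOCK);
		dep_add(g, CPU_ADD_REMOVE_LOCK, CWQ_FLUSH);
	}
}
```

Calling build_scenario() with either value of with_cpu_add_remove_lock and
then dep_has_cycle() reports a cycle in both cases, because the path
rtnl -> dpm_list_mtx -> flush -> rtnl never touches cpu_add_remove_lock.
Dropping the CWQ_FLUSH -> RTNL edge (a work that does not take rtnl) is one
way to make the cycle disappear in this model.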

Oleg.

