public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>, Alex Chiang <achiang@hp.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Johannes Berg <johannes@sipsolutions.net>,
	jbarnes@virtuousgeek.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, kaneshige.kenji@jp.fujitsu.com,
	Lai Jiangshan <laijs@cn.fujitsu.com>
Subject: Re: [PATCH v5 09/13] PCI: Introduce /sys/bus/pci/devices/.../remove
Date: Tue, 24 Mar 2009 12:17:07 +0100	[thread overview]
Message-ID: <1237893427.24918.174.camel@twins> (raw)
In-Reply-To: <20090324034659.9e1f97dc.akpm@linux-foundation.org>

On Tue, 2009-03-24 at 03:46 -0700, Andrew Morton wrote:
> 
> Thing is, we've always supported kevetnd-calls-flush_work().  That's what
> "morton gets to eat his hat" in run_workqueue() is all about.

Supported as in not complained about it, but its always presented a
deadlock scenario.

> Now, -mm's workqueue-avoid-recursion-in-run_workqueue.patch changes all of
> that.

See the discussions around that patch, Lai Jiangshan discovered that it
had more deadlock potential than we even suspected.

To quote:

---
On 02/06, Lai Jiangshan wrote:
>
> Oleg Nesterov wrote:
> > On 02/05, Lai Jiangshan wrote:
> >> DEADLOCK EXAMPLE for explain my above option:
> >>
> >> (work_func0() and work_func1() are work callback, and they
> >> calls flush_workqueue())
> >>
> >> CPU#0                                      CPU#1
> >> run_workqueue()                         run_workqueue()
> >>   work_func0()                            work_func1()
> >>     flush_workqueue()                       flush_workqueue()
> >>       flush_cpu_workqueue(0)                  .
> >>       flush_cpu_workqueue(cpu#1)              flush_cpu_workqueue(cpu#0)
> >>         waiting work_func1() in cpu#1           waiting work_func0 in cpu#0
> >>
> >> DEADLOCK!
> >
> > I am not sure. Note that when work_func0() calls run_workqueue(),
> > it will clear cwq->current_work, so another flush_ on CPU#1 will
> > not wait for work_func0, no?
>
> cwq->current_work is changed only when
> !list_empty(&cwq->worklist)
> in run_workqueue().
>
> so cwq->current_work may not be changed.

Ah, indeed.

Thanks for correcting me!
---

>   And that patch recently triggered a warning due to some games which
> USB was playing.  I was told this is because USB is being bad.
> 
> But I don't think we've seen a coherent description of what's actually
> _wrong_ with the current code.  flush_cpu_workqueue() has been handling
> this case for many years with no problems reported as far as I know.

Might be sheer luck, but afaik we did have some actual deadlocks due to
workqueue flushing -- a particular one I can remember was cpu-hotplug vs
cpufreq.

> So what has caused this sudden flurry of reports?  Did something change in
> lockdep?  What is this
> 
> [  537.380128]  (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
> [  537.380128]
> [  537.380128] but task is already holding lock:
> [  537.380128]  (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> 
> supposed to mean?  "events" isn't a lock - it's the name of a kernel
> thread, isn't it?

No workqueue lockdep support has been in there for a while now. /me
pokes at git for a bit..

4e6045f134784f4b158b3c0f7a282b04bd816887 -- Oct 2007, ca. 2.6.24-rc1

What it does it gives the workqueue a lock-object and each worklet. It
then validates that you only get:

 workqueue
   worklet

nestings, eg. calling flush_workqueue() from a worklet will generate

 workqueue    <-.
   worklet      |
     workqueue -'

recursion, IOW the above splat.

Another thing it does is connect the lockchains of workqueue callers
with those of the worklet. eg.

---
    code path 1:
      my_function() -> lock(L1); ...; flush_workqueue(); ...
    
    code path 2:
      run_workqueue() -> my_work() -> ...; lock(L1); ...
    
    you can get a deadlock when my_work() is queued or running
    but my_function() has acquired L1 already.
---

>   If this is supposed to be deadlockable then how?
> 
> Because I don't immediately see what's wrong with e1000_remove() calling
> flush_work().  It's undesirable, and we can perhaps improve it via some
> means, but where is the bug?

I hope the above answers why flushing a workqueue from within that same
workqueue is a very bad thing.


  reply	other threads:[~2009-03-24 11:18 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-20 20:55 [PATCH v5 00/13] PCI core learns hotplug Alex Chiang
2009-03-20 20:55 ` [PATCH v5 01/13] PCI: pci_is_root_bus helper Alex Chiang
2009-03-20 22:00   ` Jesse Barnes
2009-03-20 20:56 ` [PATCH v5 02/13] PCI: don't scan existing devices Alex Chiang
2009-03-20 20:56 ` [PATCH v5 03/13] PCI: pci_scan_slot() returns newly found devices Alex Chiang
2009-03-20 20:56 ` [PATCH v5 04/13] PCI: always scan child buses Alex Chiang
2009-03-20 20:56 ` [PATCH v5 05/13] PCI: do not initialize bridges more than once Alex Chiang
2009-03-20 20:56 ` [PATCH v5 06/13] PCI: do not enable " Alex Chiang
2009-03-20 20:56 ` [PATCH v5 07/13] PCI: Introduce pci_rescan_bus() Alex Chiang
2009-03-20 20:56 ` [PATCH v5 08/13] PCI: Introduce /sys/bus/pci/rescan Alex Chiang
2009-03-20 20:56 ` [PATCH v5 09/13] PCI: Introduce /sys/bus/pci/devices/.../remove Alex Chiang
2009-03-23  9:01   ` Kenji Kaneshige
2009-03-24  3:23     ` Alex Chiang
2009-03-24  9:25       ` Ingo Molnar
2009-03-24 10:46         ` Andrew Morton
2009-03-24 11:17           ` Peter Zijlstra [this message]
2009-03-24 13:21             ` Johannes Berg
2009-03-24 12:32           ` Johannes Berg
2009-03-24 17:23             ` Alex Chiang
2009-03-24 20:22               ` Johannes Berg
2009-03-24 16:12         ` Oleg Nesterov
2009-03-24 17:32           ` Alex Chiang
2009-03-24 19:29     ` Alex Chiang
2009-03-25  5:06       ` Kenji Kaneshige
2009-03-25  5:20         ` Alex Chiang
2009-03-25  5:39           ` Kenji Kaneshige
2009-03-20 20:56 ` [PATCH v5 10/13] PCI: Introduce /sys/bus/pci/devices/.../rescan Alex Chiang
2009-03-20 20:56 ` [PATCH v5 11/13] PCI Hotplug: restore fakephp interface with complete reimplementation Alex Chiang
2009-03-20 20:56 ` [PATCH v5 12/13] PCI Hotplug: rename legacy_fakephp to fakephp Alex Chiang
2009-03-20 20:56 ` [PATCH v5 13/13] PCI Hotplug: schedule fakephp for feature removal Alex Chiang
2012-03-10 21:20   ` Yinghai Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1237893427.24918.174.camel@twins \
    --to=a.p.zijlstra@chello.nl \
    --cc=achiang@hp.com \
    --cc=akpm@linux-foundation.org \
    --cc=jbarnes@virtuousgeek.org \
    --cc=johannes@sipsolutions.net \
    --cc=kaneshige.kenji@jp.fujitsu.com \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=oleg@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox