All of lore.kernel.org
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>, Alex Chiang <achiang@hp.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Johannes Berg <johannes@sipsolutions.net>,
	jbarnes@virtuousgeek.org, linux-pci@vger.kernel.org,
	linux-kernel@vger.kernel.org, kaneshige.kenji@jp.fujitsu.com,
	Lai Jiangshan <laijs@cn.fujitsu.com>
Subject: Re: [PATCH v5 09/13] PCI: Introduce /sys/bus/pci/devices/.../remove
Date: Tue, 24 Mar 2009 12:17:07 +0100	[thread overview]
Message-ID: <1237893427.24918.174.camel@twins> (raw)
In-Reply-To: <20090324034659.9e1f97dc.akpm@linux-foundation.org>

On Tue, 2009-03-24 at 03:46 -0700, Andrew Morton wrote:
> 
> Thing is, we've always supported kevetnd-calls-flush_work().  That's what
> "morton gets to eat his hat" in run_workqueue() is all about.

Supported as in not complained about it, but its always presented a
deadlock scenario.

> Now, -mm's workqueue-avoid-recursion-in-run_workqueue.patch changes all of
> that.

See the discussions around that patch, Lai Jiangshan discovered that it
had more deadlock potential than we even suspected.

To quote:

---
On 02/06, Lai Jiangshan wrote:
>
> Oleg Nesterov wrote:
> > On 02/05, Lai Jiangshan wrote:
> >> DEADLOCK EXAMPLE for explain my above option:
> >>
> >> (work_func0() and work_func1() are work callback, and they
> >> calls flush_workqueue())
> >>
> >> CPU#0                                      CPU#1
> >> run_workqueue()                         run_workqueue()
> >>   work_func0()                            work_func1()
> >>     flush_workqueue()                       flush_workqueue()
> >>       flush_cpu_workqueue(0)                  .
> >>       flush_cpu_workqueue(cpu#1)              flush_cpu_workqueue(cpu#0)
> >>         waiting work_func1() in cpu#1           waiting work_func0 in cpu#0
> >>
> >> DEADLOCK!
> >
> > I am not sure. Note that when work_func0() calls run_workqueue(),
> > it will clear cwq->current_work, so another flush_ on CPU#1 will
> > not wait for work_func0, no?
>
> cwq->current_work is changed only when
> !list_empty(&cwq->worklist)
> in run_workqueue().
>
> so cwq->current_work may not be changed.

Ah, indeed.

Thanks for correcting me!
---

>   And that patch recently triggered a warning due to some games which
> USB was playing.  I was told this is because USB is being bad.
> 
> But I don't think we've seen a coherent description of what's actually
> _wrong_ with the current code.  flush_cpu_workqueue() has been handling
> this case for many years with no problems reported as far as I know.

Might be sheer luck, but afaik we did have some actual deadlocks due to
workqueue flushing -- a particular one I can remember was cpu-hotplug vs
cpufreq.

> So what has caused this sudden flurry of reports?  Did something change in
> lockdep?  What is this
> 
> [  537.380128]  (events){--..}, at: [<ffffffff80257fc0>] flush_workqueue+0x0/0xa0
> [  537.380128]
> [  537.380128] but task is already holding lock:
> [  537.380128]  (events){--..}, at: [<ffffffff80257648>] run_workqueue+0x108/0x230
> 
> supposed to mean?  "events" isn't a lock - it's the name of a kernel
> thread, isn't it?

No workqueue lockdep support has been in there for a while now. /me
pokes at git for a bit..

4e6045f134784f4b158b3c0f7a282b04bd816887 -- Oct 2007, ca. 2.6.24-rc1

What it does it gives the workqueue a lock-object and each worklet. It
then validates that you only get:

 workqueue
   worklet

nestings, eg. calling flush_workqueue() from a worklet will generate

 workqueue    <-.
   worklet      |
     workqueue -'

recursion, IOW the above splat.

Another thing it does is connect the lockchains of workqueue callers
with those of the worklet. eg.

---
    code path 1:
      my_function() -> lock(L1); ...; flush_workqueue(); ...
    
    code path 2:
      run_workqueue() -> my_work() -> ...; lock(L1); ...
    
    you can get a deadlock when my_work() is queued or running
    but my_function() has acquired L1 already.
---

>   If this is supposed to be deadlockable then how?
> 
> Because I don't immediately see what's wrong with e1000_remove() calling
> flush_work().  It's undesirable, and we can perhaps improve it via some
> means, but where is the bug?

I hope the above answers why flushing a workqueue from within that same
workqueue is a very bad thing.


  reply	other threads:[~2009-03-24 11:18 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-20 20:55 [PATCH v5 00/13] PCI core learns hotplug Alex Chiang
2009-03-20 20:55 ` [PATCH v5 01/13] PCI: pci_is_root_bus helper Alex Chiang
2009-03-20 22:00   ` Jesse Barnes
2009-03-20 20:56 ` [PATCH v5 02/13] PCI: don't scan existing devices Alex Chiang
2009-03-20 20:56 ` [PATCH v5 03/13] PCI: pci_scan_slot() returns newly found devices Alex Chiang
2009-03-20 20:56 ` [PATCH v5 04/13] PCI: always scan child buses Alex Chiang
2009-03-20 20:56 ` [PATCH v5 05/13] PCI: do not initialize bridges more than once Alex Chiang
2009-03-20 20:56 ` [PATCH v5 06/13] PCI: do not enable " Alex Chiang
2009-03-20 20:56 ` [PATCH v5 07/13] PCI: Introduce pci_rescan_bus() Alex Chiang
2009-03-20 20:56 ` [PATCH v5 08/13] PCI: Introduce /sys/bus/pci/rescan Alex Chiang
2009-03-20 20:56 ` [PATCH v5 09/13] PCI: Introduce /sys/bus/pci/devices/.../remove Alex Chiang
2009-03-23  9:01   ` Kenji Kaneshige
2009-03-24  3:23     ` Alex Chiang
2009-03-24  9:25       ` Ingo Molnar
2009-03-24 10:46         ` Andrew Morton
2009-03-24 11:17           ` Peter Zijlstra [this message]
2009-03-24 13:21             ` Johannes Berg
2009-03-24 12:32           ` Johannes Berg
2009-03-24 17:23             ` Alex Chiang
2009-03-24 20:22               ` Johannes Berg
2009-03-24 16:12         ` Oleg Nesterov
2009-03-24 17:32           ` Alex Chiang
2009-03-24 19:29     ` Alex Chiang
2009-03-25  5:06       ` Kenji Kaneshige
2009-03-25  5:20         ` Alex Chiang
2009-03-25  5:39           ` Kenji Kaneshige
2009-03-20 20:56 ` [PATCH v5 10/13] PCI: Introduce /sys/bus/pci/devices/.../rescan Alex Chiang
2009-03-20 20:56 ` [PATCH v5 11/13] PCI Hotplug: restore fakephp interface with complete reimplementation Alex Chiang
2009-03-20 20:56 ` [PATCH v5 12/13] PCI Hotplug: rename legacy_fakephp to fakephp Alex Chiang
2009-03-20 20:56 ` [PATCH v5 13/13] PCI Hotplug: schedule fakephp for feature removal Alex Chiang
2012-03-10 21:20   ` Yinghai Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1237893427.24918.174.camel@twins \
    --to=a.p.zijlstra@chello.nl \
    --cc=achiang@hp.com \
    --cc=akpm@linux-foundation.org \
    --cc=jbarnes@virtuousgeek.org \
    --cc=johannes@sipsolutions.net \
    --cc=kaneshige.kenji@jp.fujitsu.com \
    --cc=laijs@cn.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=oleg@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.