From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Tejun Heo <tj@kernel.org>
Cc: Thilo-Alexander Ginkel <thilo@ginkel.com>,
Arnd Bergmann <arnd@arndb.de>,
linux-kernel@vger.kernel.org, dm-devel@redhat.com
Subject: Re: [PATCH] workqueue: fix deadlock in worker_maybe_bind_and_lock()
Date: Fri, 29 Apr 2011 22:40:52 +0200
Message-ID: <201104292240.52825.rjw@sisk.pl>
In-Reply-To: <20110429161824.GQ16552@htj.dyndns.org>

On Friday, April 29, 2011, Tejun Heo wrote:
> From 5035b20fa5cd146b66f5f89619c20a4177fb736d Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj@kernel.org>
> Date: Fri, 29 Apr 2011 18:08:37 +0200
>
> If a rescuer and stop_machine() bringing down a CPU race with each
> other, they may deadlock on non-preemptive kernel. The CPU won't
> accept a new task, so the rescuer can't migrate to the target CPU,
> while stop_machine() can't proceed because the rescuer is holding one
> of the CPU retrying migration. GCWQ_DISASSOCIATED is never cleared
> and worker_maybe_bind_and_lock() retries indefinitely.
>
> This problem can be reproduced semi reliably while the system is
> entering suspend.
>
> http://thread.gmane.org/gmane.linux.kernel/1122051
>
> A lot of kudos to Thilo-Alexander for reporting this tricky issue and
> painstaking testing.
>
> stable: This affects all kernels with cmwq, so all kernels since and
> including v2.6.36 need this fix.

Well, _that_ explains quite a number of mysterious reports where
suspend or poweroff hangs randomly.

Thanks a lot for fixing it!

Rafael

> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Thilo-Alexander Ginkel <thilo@ginkel.com>
> Tested-by: Thilo-Alexander Ginkel <thilo@ginkel.com>
> Cc: stable@kernel.org
> ---
> Will soon send pull request to Linus. Thank you very much.
>
> kernel/workqueue.c | 8 +++++++-
> 1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 04ef830..e3378e8 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1291,8 +1291,14 @@ __acquires(&gcwq->lock)
>  			return true;
>  		spin_unlock_irq(&gcwq->lock);
>
> -		/* CPU has come up inbetween, retry migration */
> +		/*
> +		 * We've raced with CPU hot[un]plug. Give it a breather
> +		 * and retry migration. cond_resched() is required here;
> +		 * otherwise, we might deadlock against cpu_stop trying to
> +		 * bring down the CPU on non-preemptive kernel.
> +		 */
>  		cpu_relax();
> +		cond_resched();
>  	}
>  }
>
>
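
The idea behind the one-line fix can be sketched in user space. This is
only an illustration, not the kernel code: retry_with_yield() and
countdown_done() are hypothetical names, and sched_yield() stands in for
cond_resched() on the assumption that the party being waited for needs
the CPU in order to make progress. The loop gives up the CPU between
attempts instead of spinning on it.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns true once the operation succeeded. */
typedef bool (*try_fn)(void *arg);

/*
 * Retry try_once() until it succeeds or max_attempts is used up.
 * Returns the number of attempts made on success, 0 on failure.
 */
unsigned retry_with_yield(try_fn try_once, void *arg, unsigned max_attempts)
{
	unsigned attempt;

	assert(try_once != NULL);

	for (attempt = 1; attempt <= max_attempts; attempt++) {
		if (try_once(arg))
			return attempt;
		/*
		 * User-space stand-in for cond_resched(): let other
		 * runnable tasks use this CPU before trying again.
		 */
		sched_yield();
	}
	return 0;
}

/* Example callback: succeeds once the countdown has reached zero. */
bool countdown_done(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

Without the yield, a loop like this on a non-preemptive system keeps the
CPU for itself, which is the situation the commit message describes.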