From: Ben Greear <greearb@candelatech.com>
To: Johannes Berg <johannes@sipsolutions.net>
Cc: linux-wireless@vger.kernel.org, Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH] mac80211: Fix deadlock in ieee80211_do_stop.
Date: Fri, 12 Nov 2010 12:57:31 -0800 [thread overview]
Message-ID: <4CDDAA3B.9090007@candelatech.com> (raw)
In-Reply-To: <1289594998.3736.11.camel@jlt3.sipsolutions.net>
On 11/12/2010 12:49 PM, Johannes Berg wrote:
> On Fri, 2010-11-12 at 12:07 -0800, greearb@candelatech.com wrote:
>
>> - flush_work(&sdata->work);
>> + /* Cannot call flush_work here because we are holding
>> + * RTNL and the worker thread(s) that will be called upon to
>> + * do the flushing might already be running a piece of work
>> + * that is blocking on RTNL. That leads to deadlock and/or
>> + * OOM.
>> + */
>> + cancel_work_sync(&sdata->work);
>
> I don't think that comment really belongs into the sources at all ... if
> we did that all the time we'd have no sources left between the
> comments :-)
I didn't want anyone to ever revert this un-knowingly, but
assuming the patch is otherwise agreed to be correct, I'll
remove the comment if you wish. :)
> However, I also don't think it should be necessary to do this.
> sdata->work is always queued on local->workqueue, which is created using
> alloc_ordered_workqueue(), and there is no work on this workqueue that
> uses the RTNL. Therefore, even flushing the entire workqueue must work,
> unless alloc_ordered_workqueue() has no such guarantee any more -- which
> I would consider to be a bug in the new workqueue framework.
The problem appears (to me) to be that the flush_work() attempts
to wait for the worker to complete it's current task. The worker can
be doing a completely separate task (ie, wireless_nlevent_process),
but that task can never complete because do_stop() holds rtnl
and the task-in-progress may block on acquiring rtnl.
So, flush_work() cannot make any progress.
The stack-traces for hung programs I originally posted seem
to agree with this analysis.
So far, I reproduced the bug around 20 times in a row witout the patch,
and since I added this patch, I have two good runs in a row, so it definitely
has an affect.
If my assumptions are correct, it would seem to unsafe to EVER
call flush_work() while holding rtnl (or indeed, any other lock
that any other work could possibly require).
Thanks,
Ben
>
> johannes
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
next prev parent reply other threads:[~2010-11-12 20:57 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-11-12 20:07 [PATCH] mac80211: Fix deadlock in ieee80211_do_stop greearb
2010-11-12 20:08 ` Luis R. Rodriguez
2010-11-12 20:16 ` Ben Greear
2010-11-12 20:49 ` Johannes Berg
2010-11-12 20:57 ` Ben Greear [this message]
2010-11-12 21:08 ` Johannes Berg
2010-11-12 21:51 ` Ben Greear
2010-11-13 10:34 ` Tejun Heo
2010-11-15 21:16 ` Ben Greear
2010-11-16 14:19 ` Tejun Heo
2010-11-16 16:51 ` Ben Greear
2010-11-17 8:55 ` Tejun Heo
2010-11-17 17:37 ` Ben Greear
2010-11-16 17:40 ` Johannes Berg
2010-11-17 8:47 ` Tejun Heo
2010-11-17 18:53 ` Johannes Berg
2010-11-17 18:59 ` Ben Greear
2010-11-17 19:03 ` Johannes Berg
2010-11-18 6:34 ` Tejun Heo
2010-11-18 7:07 ` Johannes Berg
2010-11-18 7:22 ` Tejun Heo
2010-11-18 16:59 ` Johannes Berg
2010-11-19 14:34 ` Tejun Heo
2010-11-19 17:57 ` Johannes Berg
2010-11-19 20:55 ` Ben Greear
2010-11-19 22:27 ` Luis R. Rodriguez
2010-12-08 17:36 ` Ben Greear
2010-12-08 18:19 ` Ben Greear
2010-12-08 18:28 ` Ben Greear
2010-12-09 14:34 ` Tejun Heo
2010-12-09 14:42 ` Johannes Berg
2010-12-09 14:46 ` Tejun Heo
2010-12-09 16:17 ` Tejun Heo
[not found] ` <4D0156F6.4000306@candelate ch.com>
2010-12-09 17:27 ` Ben Greear
2010-12-09 22:23 ` Ben Greear
2010-12-10 15:11 ` Tejun Heo
2010-12-10 16:35 ` Ben Greear
2010-11-18 17:55 ` Ben Greear
2010-11-18 18:04 ` Tejun Heo
2010-11-18 18:11 ` Ben Greear
2010-11-17 20:13 ` Ben Greear
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4CDDAA3B.9090007@candelatech.com \
--to=greearb@candelatech.com \
--cc=johannes@sipsolutions.net \
--cc=linux-wireless@vger.kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.