linux-wireless.vger.kernel.org archive mirror
From: Tejun Heo <tj@kernel.org>
To: Johannes Berg <johannes@sipsolutions.net>
Cc: Ben Greear <greearb@candelatech.com>,
	"Luis R. Rodriguez" <mcgrof@gmail.com>,
	linux-wireless@vger.kernel.org
Subject: Re: [PATCH] mac80211: Fix deadlock in ieee80211_do_stop.
Date: Thu, 09 Dec 2010 15:46:48 +0100
Message-ID: <4D00EBD8.4090805@kernel.org>
In-Reply-To: <1291905750.3540.14.camel@jlt3.sipsolutions.net>

Hello, Johannes.

On 12/09/2010 03:42 PM, Johannes Berg wrote:
> On Thu, 2010-12-09 at 15:34 +0100, Tejun Heo wrote:
> 
>>    [<78447ce1>] flush_work+0x23/0x27
>>    [<f91a2646>] ieee80211_do_stop+0x25c/0x403 [mac80211]
> 
>>    [<787001fe>] rtnetlink_rcv+0x1b/0x22		<- rtnl lock
> 
> Right, so we're flushing here under RTNL ... I believe this is the one
> that Ben hacked up to not flush or so?

He changed it to cancel the work instead of flushing it.

>>    [<7878cdab>] _cond_resched+0x2b/0x44
>>    [<7878d84f>] mutex_lock_nested+0x22/0x3b
>>    [<f919fddc>] ieee80211_sta_rx_queued_mgmt+0x2d/0x3a6 [mac80211]
>>    [<f91a2f53>] ieee80211_iface_work+0x1ff/0x282 [mac80211]
> 
>> But, sdata->work is busy running ieee80211_iface_work().  I _suspect_
>> for some reason ieee80211_iface_work() isn't finishing.
> 
> It's trying to acquire a mutex here, which must be &ifmgd->mtx or
> &local->mtx, but neither of them ever nests around the RTNL.

Yeah, but the task state is 'R' not 'D' and no one else is holding the
lock.  It seems more like ieee80211_iface_work() is looping
constantly.
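
To make the 'R' versus 'D' distinction concrete: a task sleeping
uninterruptibly in mutex_lock() shows up as 'D', while a task that is
runnable or spinning shows up as 'R'.  Below is a minimal userspace
sketch, not part of any patch in this thread, that pulls the state
letter out of a line in the /proc/<pid>/stat format.  The helper names
task_state_from_stat() and task_is_blocked() are made up for
illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the one-letter scheduler state from a /proc/<pid>/stat line,
 * or '\0' if the line is malformed.  The comm field is wrapped in
 * parentheses and may itself contain spaces or ')' characters, so we
 * look for the LAST ')' instead of splitting on whitespace.
 * (Hypothetical helper, for illustration only.)
 */
static char task_state_from_stat(const char *stat_line)
{
	const char *close;

	if (stat_line == NULL)
		return '\0';

	close = strrchr(stat_line, ')');
	if (close == NULL || close[1] != ' ' || close[2] == '\0')
		return '\0';

	/* The state letter is the first field after "(comm) ". */
	return close[2];
}

/*
 * Nonzero if the state letter means uninterruptible sleep, which is
 * what a task blocked on a contended mutex would normally report.
 * (Hypothetical helper, for illustration only.)
 */
static int task_is_blocked(char state)
{
	return state == 'D';
}
```

Fed the stat line of the worker running sdata->work, a result of 'R'
would be consistent with the work item looping rather than sleeping on
the mutex.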

>> That, or, the new flush_work() implementation is broken and it's
>> failing to flush when a work is being executed back to back.  I'll
>> prep a debug patch to determine what's going on.
> 
> Thanks.
> 
> I wonder if Ben can attempt to reproduce this using compat-wireless
> against a kernel that doesn't have the workqueue changes, was the last
> one without those 2.6.34? 2.6.35?

As I think we're now pretty close to where the problem is, I'd like to
try a few things before going down that path.

>> The rest of the system going down the toilet after this is mostly
>> caused by the held rtnl_lock above.
> 
> Indeed, the rtnl is pretty important :-)

Heh, yeah, it's one of the most widely used mutexes.  It's scary.  :-)

-- 
tejun
