From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Buesch
Subject: Re: [PATCH] d80211: make sleeping in hw->config possible #2
Date: Tue, 11 Jul 2006 11:11:27 +0200
Message-ID: <200607111111.28040.mb@bu3sch.de>
References: <200607110054.36520.mb@bu3sch.de> <20060710212536.5a223977.akpm@osdl.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Cc: linville@tuxdriver.com, jbenc@suse.cz, netdev@vger.kernel.org, bcm43xx-dev@lists.berlios.de
Return-path:
Received: from static-ip-62-75-166-246.inaddr.intergenia.de ([62.75.166.246]:40143 "EHLO bu3sch.de") by vger.kernel.org with ESMTP id S1750810AbWGKJJn (ORCPT ); Tue, 11 Jul 2006 05:09:43 -0400
To: Andrew Morton
In-Reply-To: <20060710212536.5a223977.akpm@osdl.org>
Content-Disposition: inline
Sender: netdev-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

On Tuesday 11 July 2006 06:25, you wrote:
> On Tue, 11 Jul 2006 00:54:33 +0200
> Michael Buesch wrote:
>
> > Please apply this to wireless-dev.
> > Note that this is the second try to submit this patch.
> > The first try contained a little bug. I'm sorry for that.
> > If you already applied the first one, I can provide an incremental patch.
> >
> > Note2 that this patch depends on the
> > [PATCH] cancel_rearming_delayed_work infinite loop fix
> > I just sent out to the lists and akpm.
>
> Am still scratching my head over that.  I wouldn't really call it a "fix".
> More an enhancement to cover unanticipated (and arguably strange) usage.
>
> It's odd to call cancel_rearming_delayed_work() against a rearming
> workqueue which isn't actually running.  It tends to indicate that the
> caller has lost track of what it's up to.

No, I don't think so. Let's say we have the following scenario:

A wq reschedules itself x times after it was scheduled once from outside.
After these x times, it does not reschedule anymore.
That's what happens in d80211.
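
To make the scenario concrete, here is a toy userspace model in plain C.
It is not kernel code. It assumes the cancel helper boils down to "retry
the cancel, flushing in between, until a pending timer was removed".
Every toy_* name and the max_spins limit are made up for illustration;
the limit exists only so the model terminates where the real loop
would not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a self-rearming delayed work item (hypothetical names). */
struct toy_work {
	bool timer_pending;	/* waiting on its delay timer */
	bool queued;		/* timer fired, handler not yet run */
	int rearms_left;	/* how often the handler still re-arms itself */
};

/* The delay expires: the item moves from "timer pending" to "queued". */
static void toy_timer_fire(struct toy_work *w)
{
	if (w->timer_pending) {
		w->timer_pending = false;
		w->queued = true;
	}
}

/* Flush: run the handler if queued. It re-arms while rearms_left > 0. */
static void toy_flush(struct toy_work *w)
{
	if (!w->queued)
		return;
	w->queued = false;
	if (w->rearms_left > 0) {
		w->rearms_left--;
		w->timer_pending = true;
	}
}

/* Cancel: succeeds only if there was a pending timer to remove. */
static bool toy_cancel_delayed(struct toy_work *w)
{
	bool was_pending = w->timer_pending;

	w->timer_pending = false;
	return was_pending;
}

/*
 * The assumed cancel loop. Returns the number of flushes needed,
 * or -1 if it gave up after max_spins (the "blocks forever" case).
 */
static int toy_cancel_rearming(struct toy_work *w, int max_spins)
{
	int spins = 0;

	assert(max_spins > 0);
	while (!toy_cancel_delayed(w)) {
		if (spins++ >= max_spins)
			return -1;
		toy_flush(w);
	}
	return spins;
}
```

In this model, an item that already finished its last run (nothing
pending, nothing queued, rearms_left == 0) makes
toy_cancel_rearming(&w, 1000) return -1, because the loop never finds a
timer to cancel. An item whose timer is still pending returns 0 right away.
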
And I don't see a solution to sync this, other than modifying the
function, because we may call it after the x reschedule times.
Or do you think we should really do a state machine to work around it?

I am not the first one to hit this (I call it) bug. It is _very_
confusing to see this sync function blocking forever. I saw several
people complaining about it. Also on #kernelnewbies.

> But as a convenience thing I guess it's an OK thing to do.  I need to stare
> at the implementation for a bit longer - that stuff's tricky.

Actually, I think there's still a little race. I will send a more
complex fix for this, if you agree to change the function.
If you say "no, we don't fix this. Insert a state machine or something
in your code instead", I can use the time for better things. :)

But I think the following is also broken in the old code:
A wq is not pending anymore, but just executing (before it
reschedules itself). I think that would also loop forever.
I don't think that's what we want. Because we can't really keep
track of _this_.

-- 
Greetings Michael.