From mboxrd@z Thu Jan 1 00:00:00 1970 From: Andrew Morton Subject: Re: [patch 3/6] 2.6.18: sb1250-mac: Phylib IRQ handling fixes Date: Mon, 16 Oct 2006 12:06:53 -0700 Message-ID: <20061016120653.9a70135d.akpm@osdl.org> References: <20061005234310.6c8042b5.akpm@osdl.org> <20061006080323.185f2b58.akpm@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: Jeff Garzik , Ralf Baechle , Andy Fleming , netdev@vger.kernel.org, linux-mips@linux-mips.org Return-path: To: "Maciej W. Rozycki" In-Reply-To: Sender: linux-mips-bounce@linux-mips.org Errors-to: linux-mips-bounce@linux-mips.org List-Id: netdev.vger.kernel.org On Mon, 16 Oct 2006 15:50:55 +0100 (BST) "Maciej W. Rozycki" wrote: > Andrew, > > > I don't get it. If some code does > > > > rtnl_lock(); > > flush_scheduled_work(); > > > > and there's some work scheduled which does rtnl_lock() then it'll deadlock. > > > > But it'll deadlock whether or not the caller of flush_scheduled_work() is > > keventd. > > > > Calling flush_scheduled_work() under locks is generally a bad idea. > > Indeed -- this is why I avoid it. > > > I'd have thought that was still deadlockable. Could you describe the > > deadlock more completely please? > > The simplest sequence of calls that prevents races here is as follows: > > unregister_netdev(); > rtnl_lock(); > unregister_netdevice(); > dev_close(); > sbmac_close(); > phy_stop(); > phy_disconnect(); > phy_stop_interrupts(); > phy_disable_interrupts(); > flush_scheduled_work(); > free_irq(); > phy_detach(); > mdiobus_unregister(); > rtnl_unlock(); > > We want to call flush_scheduled_work() from phy_stop_interrupts(), because > there may still be calls to phy_change() waiting for the keventd to > process and mdiobus_unregister() frees structures needed by phy_change(). > This may deadlock because of the call to rtnl_lock() though. > > So the modified sequence I have implemented is as follows: > > unregister_netdev(); > rtnl_lock(); > unregister_netdevice(); > dev_close(); > sbmac_close(); > phy_stop(); > schedule_work(); [sbmac_phy_disconnect()] > rtnl_unlock(); > > and then: > > sbmac_phy_disconnect(); > phy_disconnect(); > phy_stop_interrupts(); > phy_disable_interrupts(); > free_irq(); > phy_detach(); > mdiobus_unregister(); > > This guarantees calls to phy_change() for this PHY unit will not be run > after mdiobus_unregister(), because any such calls have been placed in the > queue before sbmac_phy_disconnect() (phy_stop() prevents the interrupt > handler from issuing new calls to phy_change()). > > We still need flush_scheduled_work() to be called from > phy_stop_interrupts() though, in case some other MAC driver calls > phy_disconnect() (or phy_stop_interrupts(), depending on the abstraction > layer of phylib used) directly rather than using keventd. This is > possible if phy_disconnect() is called from the driver's module_exit() > call, which may make sense for devices that are known not to have their > MII interface available as an external connector. Hence the: > > if (!current_is_keventd()) > flush_scheduled_work(); > > sequence in phy_stop_interrupts(). Of course, we can force all drivers > using phylib (in the interrupt mode) to call phy_disconnect() through > keventd instead. > > Does it sound clearer? > Vaguely. Why doesn't it deadlock if !current_is_keventd()? I mean, whether or not the caller is keventd, the flush_scheduled_work() caller will still be dependent upon rtnl_lock() being acquirable.