From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Date: Tue, 12 Feb 2013 12:50:38 -0800
From: Tejun Heo 
To: "Rafael J. Wysocki" 
Cc: Daniel J Blueman , Bjorn Helgaas , Linux Kernel ,
	Linux PCI , Yijing Wang 
Subject: Re: [3.8-rc7] PCI hotplug wakeup oops
Message-ID: <20130212205038.GA9057@htj.dyndns.org>
References: <1843565.7Y1sC8j6FG@vostro.rjw.lan>
 <2437657.3PbvdpUqxu@vostro.rjw.lan>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
In-Reply-To: <2437657.3PbvdpUqxu@vostro.rjw.lan>
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 

Hey, Rafael.

On Tue, Feb 12, 2013 at 09:53:08PM +0100, Rafael J. Wysocki wrote:
> This looks fishy, but I wonder if Tejun has any ideas.
>
> Tejun, can you please have a look at the call trace below?  It looks like
> the workqueues code is involved heavily.
>
> > kworker/0:0/4 is trying to acquire lock:
> >  (name){++++.+}, at: [] flush_workqueue+0x0/0x4d0
> >
> > but task is already holding lock:
> >  (name){++++.+}, at: [] process_one_work+0x160/0x4e0

It's basically saying that a work item is trying to flush the
workqueue it's currently executing on, at least in lockdep's eyes.

> > stack backtrace:
> > Pid: 4, comm: kworker/0:0 Not tainted 3.8.0-rc7-ninja+ #21
> > Call Trace:
> >  [] validate_chain.isra.33+0xda3/0x1240
> >  [] __lock_acquire+0x3ac/0xb30
> >  [] lock_acquire+0x5a/0x70
> >  [] flush_workqueue+0xe8/0x4d0
> >  [] drain_workqueue+0x68/0x1f0
> >  [] destroy_workqueue+0x13/0x160

And the flush is from workqueue destruction

> >  [] pciehp_release_ctrl+0x3a/0x90
> >  [] pciehp_remove+0x25/0x30
> >  [] pcie_port_remove_service+0x52/0x70
> >  [] __device_release_driver+0x77/0xe0
> >  [] device_release_driver+0x29/0x40
> >  [] bus_remove_device+0xf1/0x140
> >  [] device_del+0x127/0x1c0
> >  [] device_unregister+0x11/0x20
> >  [] remove_iter+0x35/0x40
> >  [] device_for_each_child+0x36/0x70
> >  [] pcie_port_device_remove+0x21/0x40
> >  [] pcie_portdrv_remove+0x28/0x50
> >  [] pci_device_remove+0x41/0xc0
> >  [] __device_release_driver+0x77/0xe0
> >  [] device_release_driver+0x29/0x40
> >  [] bus_remove_device+0xf1/0x140
> >  [] device_del+0x127/0x1c0
> >  [] device_unregister+0x11/0x20
> >  [] pci_stop_bus_device+0xb4/0xc0
> >  [] pci_stop_bus_device+0x35/0xc0
> >  [] pci_stop_and_remove_bus_device+0x11/0x20
> >  [] pciehp_unconfigure_device+0x91/0x190
> >  [] pciehp_disable_slot+0x71/0x220
> >  [] pciehp_power_thread+0xe6/0x110
> >  [] process_one_work+0x1ca/0x4e0

running from a workqueue which probably is at least transitively
related to the workqueue being destroyed.

Does this lead to an actual deadlock?

Thanks.

-- 
tejun
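
A minimal sketch of the shape lockdep is flagging above, with
hypothetical names -- this is not the pciehp code, just the direct
form of the pattern: a work item calling destroy_workqueue() on the
workqueue it is executing on, so the drain_workqueue() ->
flush_workqueue() inside destroy_workqueue() ends up waiting on the
very work item doing the destroying.

#include <linux/module.h>
#include <linux/workqueue.h>

static struct workqueue_struct *demo_wq;	/* hypothetical workqueue */

static void demo_teardown_fn(struct work_struct *work)
{
	/*
	 * We are inside process_one_work() for demo_wq here, so
	 * lockdep already holds demo_wq's lockdep map.
	 * destroy_workqueue() calls drain_workqueue() ->
	 * flush_workqueue() on demo_wq, which acquires the same map:
	 * lockdep splats, and since the flush must wait for *this*
	 * work item to finish, it never does -- a real deadlock.
	 */
	destroy_workqueue(demo_wq);
}

static DECLARE_WORK(demo_teardown, demo_teardown_fn);

static int __init demo_init(void)
{
	demo_wq = alloc_workqueue("demo_wq", 0, 0);
	if (!demo_wq)
		return -ENOMEM;
	/* queue a work item that destroys its own workqueue */
	queue_work(demo_wq, &demo_teardown);
	return 0;
}
module_init(demo_init);
MODULE_LICENSE("GPL");

In the trace above the relationship is likely indirect rather than
this literal: pciehp_power_thread() runs on one workqueue and tears
down a child pciehp controller, destroying that controller's own
workqueue.  Workqueues allocated at the same alloc_workqueue()
callsite share a lockdep class, so lockdep can flag this even when
the two workqueues are distinct instances -- which is exactly why the
mail ends by asking whether the report corresponds to an actual
deadlock or only a lockdep-visible one.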