From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: INFO: possible circular locking dependency at cleanup_workqueue_thread
From: Peter Zijlstra
To: Ingo Molnar
Cc: Zdenek Kabelac, "Rafael J. Wysocki", Oleg Nesterov,
	Linux Kernel Mailing List
In-Reply-To: <20090517071834.GA8507@elte.hu>
References: <20090517071834.GA8507@elte.hu>
Content-Type: text/plain
Date: Wed, 20 May 2009 14:18:58 +0200
Message-Id: <1242821938.26820.586.camel@twins>
X-Mailing-List: linux-kernel@vger.kernel.org

> > =======================================================
> > [ INFO: possible circular locking dependency detected ]
> > 2.6.30-rc5-00097-gd665355 #59
> > -------------------------------------------------------
> > pm-suspend/12129 is trying to acquire lock:
> >  (events){+.+.+.}, at: [] cleanup_workqueue_thread+0x26/0xd0
> >
> > but task is already holding lock:
> >  (cpu_add_remove_lock){+.+.+.}, at: [] cpu_maps_update_begin+0x17/0x20
> >
> > which lock already depends on the new lock.
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #5 (cpu_add_remove_lock){+.+.+.}:
> >        [] __lock_acquire+0xc64/0x10a0
> >        [] lock_acquire+0x98/0x140
> >        [] __mutex_lock_common+0x4c/0x3b0
> >        [] mutex_lock_nested+0x46/0x60
> >        [] cpu_maps_update_begin+0x17/0x20
> >        [] __create_workqueue_key+0xc3/0x250
> >        [] stop_machine_create+0x40/0xb0
> >        [] sys_delete_module+0x84/0x270
> >        [] system_call_fastpath+0x16/0x1b
> >        [] 0xffffffffffffffff

Oleg, why does __create_workqueue_key() require cpu_maps_update_begin()?
Wouldn't get_online_cpus() be enough to freeze the online cpus?

Breaking the setup_lock -> cpu_add_remove_lock dependency seems
sufficient.

> > -> #4 (setup_lock){+.+.+.}:
> >        [] __lock_acquire+0xc64/0x10a0
> >        [] lock_acquire+0x98/0x140
> >        [] __mutex_lock_common+0x4c/0x3b0
> >        [] mutex_lock_nested+0x46/0x60
> >        [] stop_machine_create+0x17/0xb0
> >        [] disable_nonboot_cpus+0x26/0x130
> >        [] suspend_devices_and_enter+0xbd/0x1b0
> >        [] enter_state+0x107/0x170
> >        [] state_store+0x99/0x100
> >        [] kobj_attr_store+0x17/0x20
> >        [] sysfs_write_file+0xd9/0x160
> >        [] vfs_write+0xb8/0x180
> >        [] sys_write+0x51/0x90
> >        [] system_call_fastpath+0x16/0x1b
> >        [] 0xffffffffffffffff
> >
> > -> #3 (dpm_list_mtx){+.+.+.}:
> >        [] __lock_acquire+0xc64/0x10a0
> >        [] lock_acquire+0x98/0x140
> >        [] __mutex_lock_common+0x4c/0x3b0
> >        [] mutex_lock_nested+0x46/0x60
> >        [] device_pm_add+0x1f/0xe0
> >        [] device_add+0x45f/0x570
> >        [] wiphy_register+0x158/0x280 [cfg80211]
> >        [] ieee80211_register_hw+0xbc/0x410 [mac80211]
> >        [] iwl3945_pci_probe+0xa1c/0x1080 [iwl3945]
> >        [] local_pci_probe+0x17/0x20
> >        [] pci_device_probe+0x88/0xb0
> >        [] driver_probe_device+0x89/0x180
> >        [] __driver_attach+0x9b/0xa0
> >        [] bus_for_each_dev+0x6c/0xa0
> >        [] driver_attach+0x1e/0x20
> >        [] bus_add_driver+0xd5/0x290
> >        [] driver_register+0x78/0x140
> >        [] __pci_register_driver+0x66/0xe0
> >        [] 0xffffffffa00bd05c
> >        [] do_one_initcall+0x3f/0x1c0
> >        [] sys_init_module+0xb1/0x200
> >        [] system_call_fastpath+0x16/0x1b
> >        [] 0xffffffffffffffff
> >
> > -> #2 (cfg80211_mutex){+.+.+.}:
> >        [] __lock_acquire+0xc64/0x10a0
> >        [] lock_acquire+0x98/0x140
> >        [] __mutex_lock_common+0x4c/0x3b0
> >        [] mutex_lock_nested+0x46/0x60
> >        [] reg_todo+0x19a/0x590 [cfg80211]
> >        [] worker_thread+0x1e8/0x3a0
> >        [] kthread+0x5a/0xa0
> >        [] child_rip+0xa/0x20
> >        [] 0xffffffffffffffff
> >
> > -> #1 (reg_work){+.+.+.}:
> >        [] __lock_acquire+0xc64/0x10a0
> >        [] lock_acquire+0x98/0x140
> >        [] worker_thread+0x1e2/0x3a0
> >        [] kthread+0x5a/0xa0
> >        [] child_rip+0xa/0x20
> >        [] 0xffffffffffffffff
> >
> > -> #0 (events){+.+.+.}:
> >        [] __lock_acquire+0xd36/0x10a0
> >        [] lock_acquire+0x98/0x140
> >        [] cleanup_workqueue_thread+0x4d/0xd0
> >        [] workqueue_cpu_callback+0xc9/0x10f
> >        [] notifier_call_chain+0x68/0xa0
> >        [] raw_notifier_call_chain+0x16/0x20
> >        [] _cpu_down+0x1cc/0x2d0
> >        [] disable_nonboot_cpus+0xb0/0x130
> >        [] suspend_devices_and_enter+0xbd/0x1b0
> >        [] enter_state+0x107/0x170
> >        [] state_store+0x99/0x100
> >        [] kobj_attr_store+0x17/0x20
> >        [] sysfs_write_file+0xd9/0x160
> >        [] vfs_write+0xb8/0x180
> >        [] sys_write+0x51/0x90
> >        [] system_call_fastpath+0x16/0x1b
> >        [] 0xffffffffffffffff
> >
> > other info that might help us debug this:
> >
> > 4 locks held by pm-suspend/12129:
> >  #0:  (&buffer->mutex){+.+.+.}, at: [] sysfs_write_file+0x44/0x160
> >  #1:  (pm_mutex){+.+.+.}, at: [] enter_state+0x54/0x170
> >  #2:  (dpm_list_mtx){+.+.+.}, at: [] device_pm_lock+0x17/0x20
> >  #3:  (cpu_add_remove_lock){+.+.+.}, at: [] cpu_maps_update_begin+0x17/0x20
> >
> > stack backtrace:
> > Pid: 12129, comm: pm-suspend Not tainted 2.6.30-rc5-00097-gd665355 #59
> > Call Trace:
> >  [] print_circular_bug_tail+0x9d/0xe0
> >  [] __lock_acquire+0xd36/0x10a0
> >  [] ? mark_lock+0x3e0/0x400
> >  [] lock_acquire+0x98/0x140
> >  [] ? cleanup_workqueue_thread+0x26/0xd0
> >  [] cleanup_workqueue_thread+0x4d/0xd0
> >  [] ? cleanup_workqueue_thread+0x26/0xd0
> >  [] ? trace_hardirqs_on+0xd/0x10
> >  [] workqueue_cpu_callback+0xc9/0x10f
> >  [] ? cpu_callback+0x12/0x280
> >  [] notifier_call_chain+0x68/0xa0
> >  [] raw_notifier_call_chain+0x16/0x20
> >  [] _cpu_down+0x1cc/0x2d0
> >  [] disable_nonboot_cpus+0xb0/0x130
> >  [] suspend_devices_and_enter+0xbd/0x1b0
> >  [] enter_state+0x107/0x170
> >  [] state_store+0x99/0x100
> >  [] kobj_attr_store+0x17/0x20
> >  [] sysfs_write_file+0xd9/0x160
> >  [] vfs_write+0xb8/0x180
> >  [] ? audit_syscall_entry+0x21c/0x240
> >  [] sys_write+0x51/0x90
> >  [] system_call_fastpath+0x16/0x1b