From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754606Ab0CYL37 (ORCPT ); Thu, 25 Mar 2010 07:29:59 -0400
Received: from mx1.redhat.com ([209.132.183.28]:28481 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1754409Ab0CYL35 (ORCPT ); Thu, 25 Mar 2010 07:29:57 -0400
Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd,
	Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE,
	United Kingdom.
	Registered in England and Wales under Company Registration No. 3798903
From: David Howells
In-Reply-To: <1264740107.20211.53.camel@pasglop>
References: <1264740107.20211.53.camel@pasglop>
To: Benjamin Herrenschmidt
Cc: dhowells@redhat.com, "linux-kernel@vger.kernel.org", Ingo Molnar, Peter Zijlstra, Johannes Berg
Subject: Re: [2.6.33-rc5] Weird deadlock when shutting down
Date: Thu, 25 Mar 2010 11:28:59 +0000
Message-ID: <6397.1269516539@redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Benjamin Herrenschmidt wrote:

> Johannes and I see this on our quad G5s... it -could- be similar to
> one reported a short while ago by Xiaotian Feng
> under the subject [2.6.33-rc4] sysfs lockdep warnings on cpu hotplug.
>
> Basically, the machine deadlocks right after printing the following
> when doing a shutdown:
>
> halt/4071 is trying to acquire lock:
>  (s_active){++++.+}, at: [] .sysfs_addrm_finish+0x58/0xc0
>
> but task is already holding lock:
>  (&per_cpu(cpu_policy_rwsem, cpu)){+.+.+.}, at: [] .lock_policy_rwsem_write+0x84/0xf4
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>

I see this now, with full backtrace:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.34-rc2-cachefs #115
-------------------------------------------------------
halt/2291 is trying to acquire lock:
 (s_active#31){++++.+}, at: [] sysfs_addrm_finish+0x31/0x5a

but task is already holding lock:
 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [] lock_policy_rwsem_write+0x4a/0x7b

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}:
       [] __lock_acquire+0x1343/0x16cd
       [] lock_acquire+0x57/0x6d
       [] down_write+0x3f/0x62
       [] lock_policy_rwsem_write+0x4a/0x7b
       [] store+0x39/0x79
       [] sysfs_write_file+0x103/0x13f
       [] vfs_write+0xad/0x172
       [] sys_write+0x45/0x6c
       [] system_call_fastpath+0x16/0x1b

-> #0 (s_active#31){++++.+}:
       [] __lock_acquire+0xffa/0x16cd
       [] lock_acquire+0x57/0x6d
       [] sysfs_deactivate+0x8c/0xc9
       [] sysfs_addrm_finish+0x31/0x5a
       [] sysfs_remove_dir+0x75/0x88
       [] kobject_del+0x16/0x37
       [] kobject_release+0x3e/0x66
       [] kref_put+0x43/0x4d
       [] kobject_put+0x47/0x4b
       [] __cpufreq_remove_dev+0x1da/0x236
       [] cpufreq_cpu_callback+0x62/0x7a
       [] notifier_call_chain+0x32/0x5e
       [] __raw_notifier_call_chain+0x9/0xb
       [] _cpu_down+0x90/0x29e
       [] disable_nonboot_cpus+0x6f/0x105
       [] kernel_power_off+0x21/0x3b
       [] sys_reboot+0x103/0x16a
       [] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

4 locks held by halt/2291:
 #0:  (reboot_mutex){+.+.+.}, at: [] sys_reboot+0x91/0x16a
 #1:  (cpu_add_remove_lock){+.+.+.}, at: [] cpu_maps_update_begin+0x12/0x14
 #2:  (cpu_hotplug.lock){+.+.+.}, at: [] cpu_hotplug_begin+0x27/0x4e
 #3:  (&per_cpu(cpu_policy_rwsem, cpu)){+++++.}, at: [] lock_policy_rwsem_write+0x4a/0x7b

stack backtrace:
Pid: 2291, comm: halt Not tainted 2.6.34-rc2-cachefs #115
Call Trace:
 [] print_circular_bug+0xae/0xbd
 [] __lock_acquire+0xffa/0x16cd
 [] lock_acquire+0x57/0x6d
 [] ? sysfs_addrm_finish+0x31/0x5a
 [] ? __init_waitqueue_head+0x35/0x46
 [] sysfs_deactivate+0x8c/0xc9
 [] ? sysfs_addrm_finish+0x31/0x5a
 [] ? release_sysfs_dirent+0x9e/0xbe
 [] sysfs_addrm_finish+0x31/0x5a
 [] sysfs_remove_dir+0x75/0x88
 [] kobject_del+0x16/0x37
 [] kobject_release+0x3e/0x66
 [] ? kobject_release+0x0/0x66
 [] kref_put+0x43/0x4d
 [] kobject_put+0x47/0x4b
 [] __cpufreq_remove_dev+0x1da/0x236
 [] cpufreq_cpu_callback+0x62/0x7a
 [] notifier_call_chain+0x32/0x5e
 [] __raw_notifier_call_chain+0x9/0xb
 [] _cpu_down+0x90/0x29e
 [] disable_nonboot_cpus+0x6f/0x105
 [] kernel_power_off+0x21/0x3b
 [] sys_reboot+0x103/0x16a
 [] ? do_nanosleep+0x78/0xb2
 [] ? hrtimer_nanosleep+0xab/0x118
 [] ? hrtimer_wakeup+0x0/0x21
 [] ? retint_swapgs+0xe/0x13
 [] ? trace_hardirqs_on_caller+0x10c/0x130
 [] ? audit_syscall_entry+0x17d/0x1b0
 [] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [] system_call_fastpath+0x16/0x1b
Broke affinity for irq 4
lockdep: fixing up alternatives.
SMP alternatives: switching to UP code
Power down.
acpi_power_off called

David
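
The report above describes an ordering inversion between two lock classes. Chain entry #1 takes cpu_policy_rwsem from a sysfs store (where, as the report implies, s_active is already held), while chain entry #0 takes s_active during CPU offline with cpu_policy_rwsem held. What follows is a minimal sketch of that kind of check, as a toy model only. It is not how lockdep is implemented, and LockOrderGraph and its methods are hypothetical names introduced for illustration.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker (hypothetical; not the kernel's lockdep code)."""

    def __init__(self):
        # Maps a held lock class to the lock classes taken while holding it.
        self.edges = defaultdict(set)

    def _reachable(self, src, dst):
        """Return True if dst can be reached from src along recorded edges."""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, held, new):
        """Record taking `new` while `held` are held.

        Returns True if some held lock already depends on `new`,
        i.e. the new edge would close a cycle in the order graph.
        """
        cycle = any(self._reachable(new, h) for h in held)
        for h in held:
            self.edges[h].add(new)
        return cycle


graph = LockOrderGraph()

# Chain entry #1: sysfs write path. Assumes s_active is held when
# store() takes cpu_policy_rwsem, as the report implies.
write_path = graph.acquire(["s_active"], "cpu_policy_rwsem")

# Chain entry #0: halt path. The four held locks are the ones listed
# in the report, and s_active is the lock being acquired.
halt_path = graph.acquire(
    ["reboot_mutex", "cpu_add_remove_lock", "cpu_hotplug.lock",
     "cpu_policy_rwsem"],
    "s_active",
)

print("write path closes a cycle:", write_path)
print("halt path closes a cycle:", halt_path)
```

In this model the first acquisition only records the edge s_active -> cpu_policy_rwsem. The second one is flagged because cpu_policy_rwsem is already reachable from s_active, which mirrors the "which lock already depends on the new lock" line in the report.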