From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757178Ab2IENt5 (ORCPT ); Wed, 5 Sep 2012 09:49:57 -0400 Received: from mx1.redhat.com ([209.132.183.28]:19909 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752725Ab2IENtz (ORCPT ); Wed, 5 Sep 2012 09:49:55 -0400 Date: Tue, 4 Sep 2012 11:44:12 -0400 From: Dave Jones To: Linus Torvalds Cc: Linux Kernel Mailing List , Thomas Gleixner Subject: Re: Linux 3.6-rc4 Message-ID: <20120904154412.GA14321@redhat.com> Mail-Followup-To: Dave Jones , Linus Torvalds , Linux Kernel Mailing List , Thomas Gleixner References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Sep 01, 2012 at 03:10:58PM -0700, Linus Torvalds wrote: > The kernel summit is over, and most people have either returned or are > returning from San Diego. Still seeing this, that I started seeing just before leaving for San Diego.. Dave ====================================================== [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ] 3.6.0-rc4+ #36 Not tainted ------------------------------------------------------ trinity-child6/2057 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: blocked: (tasklist_lock){.+.+..}, instance: ffffffff81c05098, at: [] posix_cpu_timer_del+0x2b/0xe0 and this task is already holding: held: (&(&new_timer->it_lock)->rlock){..-...}, instance: ffff8801207a4d78, at: [] exit_itimers+0x5e/0xe0 which would create a new lock dependency: (&(&new_timer->it_lock)->rlock){..-...} -> (tasklist_lock){.+.+..} but this new dependency connects a SOFTIRQ-irq-safe lock: (&(&new_timer->it_lock)->rlock){..-...} ... which became SOFTIRQ-irq-safe at: [] __lock_acquire+0x587/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_spin_lock_irqsave+0x7d/0xd0 [] posix_timer_fn+0x37/0xe0 [] __run_hrtimer+0x94/0x4c0 [] hrtimer_interrupt+0xf7/0x230 [] __hrtimer_peek_ahead_timers.part.27+0x2b/0x30 [] hrtimer_peek_ahead_timers+0x49/0xa0 [] run_hrtimer_softirq+0x31/0x40 [] __do_softirq+0x100/0x420 [] run_ksoftirqd+0x13c/0x230 [] kthread+0xb7/0xc0 [] kernel_thread_helper+0x4/0x10 to a SOFTIRQ-irq-unsafe lock: (&(&p->alloc_lock)->rlock){+.+...} ... which became SOFTIRQ-irq-unsafe at: ... [] __lock_acquire+0x5e0/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_spin_lock+0x46/0x80 [] set_task_comm+0x32/0x180 [] kthreadd+0x38/0x2e0 [] kernel_thread_helper+0x4/0x10 other info that might help us debug this: Chain exists of: &(&new_timer->it_lock)->rlock --> tasklist_lock --> &(&p->alloc_lock)->rlock Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&p->alloc_lock)->rlock); local_irq_disable(); lock(&(&new_timer->it_lock)->rlock); lock(tasklist_lock); lock(&(&new_timer->it_lock)->rlock); *** DEADLOCK *** 1 lock on stack by trinity-child6/2057: #0: held: (&(&new_timer->it_lock)->rlock){..-...}, instance: ffff8801207a4d78, at: [] exit_itimers+0x5e/0xe0 the dependencies between SOFTIRQ-irq-safe lock and the holding lock: -> (&(&new_timer->it_lock)->rlock){..-...} ops: 8815 { IN-SOFTIRQ-W at: [] __lock_acquire+0x587/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_spin_lock_irqsave+0x7d/0xd0 [] posix_timer_fn+0x37/0xe0 [] __run_hrtimer+0x94/0x4c0 [] hrtimer_interrupt+0xf7/0x230 [] __hrtimer_peek_ahead_timers.part.27+0x2b/0x30 [] hrtimer_peek_ahead_timers+0x49/0xa0 [] run_hrtimer_softirq+0x31/0x40 [] __do_softirq+0x100/0x420 [] run_ksoftirqd+0x13c/0x230 [] kthread+0xb7/0xc0 [] kernel_thread_helper+0x4/0x10 INITIAL USE at: [] __lock_acquire+0x2ff/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_spin_lock_irqsave+0x7d/0xd0 [] __lock_timer+0x89/0x1f0 [] sys_timer_getoverrun+0x17/0x50 [] system_call_fastpath+0x1a/0x1f } ... key at: [] __key.30008+0x0/0x8 ... acquired at: [] check_irq_usage+0x5b/0xe0 [] __lock_acquire+0xd4d/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_read_lock+0x49/0x80 [] posix_cpu_timer_del+0x2b/0xe0 [] exit_itimers+0x46/0xe0 [] do_exit+0x7a6/0xb70 [] do_group_exit+0x4c/0xc0 [] sys_exit_group+0x17/0x20 [] system_call_fastpath+0x1a/0x1f the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock: -> (&(&p->alloc_lock)->rlock){+.+...} ops: 21207588 { HARDIRQ-ON-W at: [] __lock_acquire+0x5b2/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_spin_lock+0x46/0x80 [] set_task_comm+0x32/0x180 [] kthreadd+0x38/0x2e0 [] kernel_thread_helper+0x4/0x10 SOFTIRQ-ON-W at: [] __lock_acquire+0x5e0/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_spin_lock+0x46/0x80 [] set_task_comm+0x32/0x180 [] kthreadd+0x38/0x2e0 [] kernel_thread_helper+0x4/0x10 INITIAL USE at: [] __lock_acquire+0x2ff/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_spin_lock+0x46/0x80 [] set_task_comm+0x32/0x180 [] kthreadd+0x38/0x2e0 [] kernel_thread_helper+0x4/0x10 } ... key at: [] __key.46264+0x0/0x8 ... acquired at: [] lock_acquire+0xad/0x220 [] _raw_spin_lock+0x46/0x80 [] keyctl_session_to_parent+0xde/0x490 [] sys_keyctl+0x6d/0x1a0 [] system_call_fastpath+0x1a/0x1f -> (tasklist_lock){.+.+..} ops: 288395 { HARDIRQ-ON-R at: [] __lock_acquire+0x4bf/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_read_lock+0x49/0x80 [] do_wait+0xb2/0x360 [] sys_wait4+0x75/0xf0 [] wait_for_helper+0x82/0xb0 [] kernel_thread_helper+0x4/0x10 SOFTIRQ-ON-R at: [] __lock_acquire+0x5e0/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_read_lock+0x49/0x80 [] do_wait+0xb2/0x360 [] sys_wait4+0x75/0xf0 [] wait_for_helper+0x82/0xb0 [] kernel_thread_helper+0x4/0x10 INITIAL USE at: [] __lock_acquire+0x2ff/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_write_lock_irq+0x5c/0xa0 [] copy_process.part.22+0x1041/0x1740 [] do_fork+0x140/0x4e0 [] kernel_thread+0x76/0x80 [] rest_init+0x26/0x154 [] start_kernel+0x3fa/0x407 [] x86_64_start_reservations+0x131/0x135 [] x86_64_start_kernel+0x148/0x157 } ... key at: [] tasklist_lock+0x18/0x80 ... acquired at: [] check_irq_usage+0x5b/0xe0 [] __lock_acquire+0xd4d/0x1ae0 [] lock_acquire+0xad/0x220 [] _raw_read_lock+0x49/0x80 [] posix_cpu_timer_del+0x2b/0xe0 [] exit_itimers+0x46/0xe0 [] do_exit+0x7a6/0xb70 [] do_group_exit+0x4c/0xc0 [] sys_exit_group+0x17/0x20 [] system_call_fastpath+0x1a/0x1f stack backtrace: Pid: 2057, comm: trinity-child6 Not tainted 3.6.0-rc4+ #36 Call Trace: [] check_usage+0x4e4/0x500 [] ? sched_clock+0x13/0x20 [] ? native_sched_clock+0x19/0x80 [] ? sched_clock+0x13/0x20 [] ? native_sched_clock+0x19/0x80 [] check_irq_usage+0x5b/0xe0 [] __lock_acquire+0xd4d/0x1ae0 [] ? trace_hardirqs_off+0xd/0x10 [] ? local_clock+0x99/0xc0 [] ? native_sched_clock+0x19/0x80 [] lock_acquire+0xad/0x220 [] ? posix_cpu_timer_del+0x2b/0xe0 [] ? exit_itimers+0x5e/0xe0 [] _raw_read_lock+0x49/0x80 [] ? posix_cpu_timer_del+0x2b/0xe0 [] ? _raw_spin_lock_irqsave+0xa9/0xd0 [] posix_cpu_timer_del+0x2b/0xe0 [] exit_itimers+0x46/0xe0 [] do_exit+0x7a6/0xb70 [] ? retint_swapgs+0x13/0x1b [] do_group_exit+0x4c/0xc0 [] sys_exit_group+0x17/0x20 [] system_call_fastpath+0x1a/0x1f