From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ranko Zivojnovic
Subject: Re: + gen_estimator-fix-locking-and-timer-related-bugs.patch added to -mm tree
Date: Sat, 07 Jul 2007 10:55:18 +0300
Message-ID: <1183794918.30237.69.camel@localhost.localdomain>
References: <200706271921.l5RJLgCC003910@imap1.linux-foundation.org>
 <1183642800.3789.11.camel@ranko-fc2.spidernet.net>
 <20070705142135.GG4759@ff.dom.local>
 <1183646029.4069.11.camel@ranko-fc2.spidernet.net>
 <1183651165.4069.26.camel@ranko-fc2.spidernet.net>
 <20070706061420.GA1846@ff.dom.local>
 <20070706062629.GC1846@ff.dom.local>
 <20070706064523.GA2144@ff.dom.local>
 <1183727654.6389.3.camel@ranko-fc2.spidernet.net>
 <468E4FD9.4060702@trash.net>
 <1183733732.6389.26.camel@ranko-fc2.spidernet.net>
Mime-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
Cc: Jarek Poplawski , akpm@linux-foundation.org, netdev@vger.kernel.org
To: Patrick McHardy
Return-path:
Received: from MPP-Goya.spidernet.net ([194.154.128.52]:11674 "EHLO
 mail0pop.spidernet.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1753951AbXGGH4h (ORCPT );
 Sat, 7 Jul 2007 03:56:37 -0400
In-Reply-To: <1183733732.6389.26.camel@ranko-fc2.spidernet.net>
Sender: netdev-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

On Fri, 2007-07-06 at 17:55 +0300, Ranko Zivojnovic wrote:
> On Fri, 2007-07-06 at 16:21 +0200, Patrick McHardy wrote:
> > Ranko Zivojnovic wrote:
> > > BUG: spinlock lockup on CPU#0, swapper/0, c03eff80
> > > [] _raw_spin_lock+0x108/0x13c
> > > [] __qdisc_run+0x97/0x1b0
> > > [] qdisc_watchdog+0x19/0x58
> > > [] __lock_text_start+0x37/0x43
> > > [] qdisc_watchdog+0x56/0x58
> > > [] qdisc_watchdog+0x0/0x58
> > > [] run_hrtimer_softirq+0x58/0xb5
> > > [...]
> >
> > > BUG: spinlock lockup on CPU#1, swapper/0, c03eff80
> > > [] _raw_spin_lock+0x108/0x13c
> > > [] est_timer+0x53/0x148
> > > [] run_timer_softirq+0x30/0x184
> > > [] run_timer_softirq+0x121/0x184
> > > [] __do_softirq+0x66/0xf3
> > > [] est_timer+0x0/0x148
> > > [...]
> >
> >
> > There is at least one ABBA deadlock, est_timer does:
> >
> > read_lock(&est_lock)
> > spin_lock(e->stats_lock) (which is dev->queue_lock)
> >
> > and qdisc_destroy calls htb_destroy under dev->queue_lock, which
> > calls htb_destroy_class, then gen_kill_estimator and this
> > write_locks est_lock.
> >
> > I can't see the problem above though, the qdisc_run path only takes
> > dev->queue_lock. Please enable lockdep and post the output if any.
>

I've got both code paths this time. They show exactly the ABBA deadlock
you describe above. The details are below.

Maybe the appropriate way to fix this would be to call
gen_kill_estimator, with the appropriate lock order, before the call to
qdisc_destroy, so that when dev->queue_lock is taken for qdisc_destroy,
the structure is already off the list.
-------------LOG------------
BUG: spinlock lockup on CPU#2, ping/27868, c03eff80
 [] _raw_spin_lock+0x108/0x13c
 [] est_timer+0x53/0x148
 [] run_timer_softirq+0x121/0x184
 [] __do_softirq+0x66/0xf3
 [] est_timer+0x0/0x148
 [] __do_softirq+0x7e/0xf3
 [] do_softirq+0x56/0x58
 [] smp_apic_timer_interrupt+0x5a/0x85
 [] apic_timer_interrupt+0x29/0x38
 [] apic_timer_interrupt+0x33/0x38
 [] local_bh_enable+0x94/0x13b
 [] dev_queue_xmit+0x95/0x2d5
 [] ip_output+0x193/0x32a
 [] ip_finish_output+0x0/0x29e
 [] ip_push_pending_frames+0x27f/0x46b
 [] dst_output+0x0/0x7
 [] raw_sendmsg+0x70b/0x7f2
 [] inet_sendmsg+0x2b/0x49
 [] sock_sendmsg+0xe2/0xfd
 [] autoremove_wake_function+0x0/0x37
 [] autoremove_wake_function+0x0/0x37
 [] enqueue_entity+0x139/0x4f8
 [] copy_from_user+0x2d/0x59
 [] sys_sendmsg+0x12d/0x243
 [] __lock_acquire+0x825/0x1002
 [] __lock_acquire+0x825/0x1002
 [] scheduler_tick+0x1a7/0x20e
 [] _spin_unlock_irq+0x20/0x23
 [] trace_hardirqs_on+0x73/0x147
 [] run_timer_softirq+0x30/0x184
 [] _spin_unlock_irq+0x20/0x23
 [] sys_socketcall+0x24f/0x271
 [] trace_hardirqs_on+0xab/0x147
 [] copy_to_user+0x2f/0x49
 [] sysenter_past_esp+0x8f/0x99
 [] sysenter_past_esp+0x5f/0x99
 =======================

And here is the ABBA deadlock:
---cut---
SysRq : Show Locks Held

Showing all locks held in the system:
****snip****
3 locks held by ping/27868:
 #0: (sk_lock-AF_INET){--..}, at: [] raw_sendmsg+0x676/0x7f2
 #1: (est_lock#2){-.-+}, at: [] est_timer+0x15/0x148
 #2: (&dev->queue_lock){-+..}, at: [] est_timer+0x53/0x148
****snip****
8 locks held by tc/2278:
 #0: (rtnl_mutex){--..}, at: [] rtnetlink_rcv+0x18/0x42
 #1: (&dev->queue_lock){-+..}, at: [] qdisc_lock_tree+0xe/0x1c
 #2: (&dev->ingress_lock){-...}, at: [] tc_get_qdisc+0x192/0x1e9
 #3: (est_lock#2){-.-+}, at: [] gen_kill_estimator+0x58/0x6f
 #4: (&irq_lists[i].lock){++..}, at: [] serial8250_interrupt+0x14/0x132
 #5: (&port_lock_key){++..}, at: [] serial8250_interrupt+0x62/0x132
 #6: (sysrq_key_table_lock){+...}, at: [] __handle_sysrq+0x17/0x115
 #7: (tasklist_lock){..-?}, at: [] debug_show_all_locks+0x2e/0x15e
****snip****
---cut---

As well as 'tc' stack:
---cut---
SysRq : Show Regs

Pid: 2278, comm: tc
EIP: 0060:[] CPU: 0
EIP is at __write_lock_failed+0xf/0x1c
 EFLAGS: 00000287 Not tainted (2.6.22-rc6-mm1.SNET.Thors.htbpatch.2.lockdebug #1)
EAX: c03f5968 EBX: c03f5968 ECX: 00000000 EDX: 00000004
ESI: c9852840 EDI: c85eae24 EBP: c06aaa60 DS: 007b ES: 007b FS: 00d8
CR0: 8005003b CR2: 008ba828 CR3: 11841000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
 [] _raw_write_lock+0x32/0x6e
 [] gen_kill_estimator+0x58/0x6f
 [] htb_destroy_class+0x27/0x12f [sch_htb]
 [] htb_destroy+0x34/0x70 [sch_htb]
 [] qdisc_destroy+0x52/0x8d
 [] trace_hardirqs_on+0x73/0x147
 [] htb_destroy_class+0xd2/0x12f [sch_htb]
 [] htb_destroy+0x34/0x70 [sch_htb]
 [] qdisc_destroy+0x52/0x8d
 [] tc_get_qdisc+0x19b/0x1e9
 [] tc_get_qdisc+0x0/0x1e9
 [] rtnetlink_rcv_msg+0x1c2/0x1f5
 [] netlink_run_queue+0x96/0xfd
 [] rtnetlink_rcv_msg+0x0/0x1f5
 [] rtnetlink_rcv+0x26/0x42
 [] netlink_data_ready+0x12/0x54
 [] netlink_sendskb+0x1f/0x53
 [] netlink_sendmsg+0x1f5/0x2d4
 [] sock_sendmsg+0xe2/0xfd
 [] autoremove_wake_function+0x0/0x37
 [] __lock_acquire+0x825/0x1002
 [] sock_sendmsg+0xe2/0xfd
 [] copy_from_user+0x2d/0x59
 [] sys_sendmsg+0x12d/0x243
 [] __do_fault+0x12b/0x38b
 [] __do_fault+0x198/0x38b
 [] __lock_acquire+0x118/0x1002
 [] filemap_fault+0x0/0x42f
 [] __handle_mm_fault+0x11e/0x68d
 [] sys_socketcall+0x24f/0x271
 [] trace_hardirqs_on+0xab/0x147
 [] restore_nocheck+0x12/0x15
 [] sysenter_past_esp+0x5f/0x99
 =======================
---cut---

Best regards,

Ranko