From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andi Kleen
Subject: Re: reiserfs unstable on large systems in 2.6.13-git9
Date: Mon, 12 Sep 2005 10:48:13 +0200
Message-ID: <200509121048.13917.ak@suse.de>
References: <20050911223045.GA11071@wotan.suse.de> <20050911192100.21e1d960@watt.suse.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Cc: reiserfs-dev@namesys.com, mason@suse.de, jeffm@suse.de, linux-fsdevel@vger.kernel.org
Return-path:
Received: from cantor2.suse.de ([195.135.220.15]:44197 "EHLO mx2.suse.de") by vger.kernel.org with ESMTP id S1751244AbVILIsT (ORCPT ); Mon, 12 Sep 2005 04:48:19 -0400
To: Chris Mason
In-Reply-To: <20050911192100.21e1d960@watt.suse.com>
Content-Disposition: inline
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

On Monday 12 September 2005 01:21, Chris Mason wrote:
> On Mon, 12 Sep 2005 00:30:46 +0200
>
> Andi Kleen wrote:
> > When I run even relatively minor stress on 8 or 16 core
> > Opterons I get deadlocks like this:
>
> I'm assuming this goes away when acls are turned off?

No, it doesn't although the oopses look different now.

BTW the second oops in https://bugzilla.novell.com/show_bug.cgi?id=105377
(for 2.6.13) looks similar too.

-Andi

(on 16 core system, but i've seen it on other smaller systems too)

NMI Watchdog detected LOCKUP on CPU 11
CPU 11
Modules linked in:
Pid: 20408, comm: ls Not tainted 2.6.13-git9 #4
RIP: 0010:[] {_spin_lock_irqsave+9}
RSP: 0018:ffff81043e259e60 EFLAGS: 00000002
RAX: 0000000000000000 RBX: ffffffff804c2ba0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 00007fffff959fc0 RDI: ffffffff804c2ba8
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffffffffffff R11: 0000000000000246 R12: ffffffff804c2ba8
R13: ffff8102bd6eb540 R14: ffff81043e259e70 R15: 00007fffff95a000
FS: 00002aaaab31d6e0(0000) GS:ffffffff80603d80(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaaf9c450 CR3: 000000033bfca000 CR4: 00000000000006a0
Process ls (pid: 20408, threadinfo ffff81043e258000, task ffff8102bd6eb540)
Stack: 0000000000000286 ffffffff80418a7b 0000000000000000 ffff8102bd6eb540
       ffffffff801313b0 0000000000000000 0000000000000000 ffff810009d6ba38
       0000000000000000 0000000000000000
Call Trace:{__down+75} {default_wake_function+0}
       {__down_failed+53} {.text.lock.kernel_lock+25}
       {sys_sysctl+38} {system_call+126}

Code: f0 fe 0f 0f 88 5b 02 00 00 48 8b 04 24 48 83 c4 08 c3 66 66
console shuts up ...
NMI Watchdog detected LOCKUP on CPU 10
Kernel panic - not syncing: Aiee, killing interrupt handler!
CPU 10
Modules linked in:
Pid: 25726, comm: reaim Not tainted 2.6.13-git9 #4
RIP: 0010:[] {.text.lock.spinlock+22}
RSP: 0018:ffff810133aebc40 EFLAGS: 00000086
RAX: 0000000000000000 RBX: ffffffff804c2ba0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff810133aebdd8 RDI: ffffffff804c2ba8
RBP: ffff8103bfa26c78 R08: ffff81043ff72000 R09: 0000000000000000
R10: ffff81007f70c5e0 R11: 0000000000000246 R12: ffffffff804c2ba8
R13: ffff8101bc67c880 R14: ffff810133aebc50 R15: 00000000000001ff
FS: 00002aaaaaf3b0a0(0000) GS:ffffffff80603d00(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaae1641e CR3: 000000013fedc000 CR4: 00000000000006a0
Process reaim (pid: 25726, threadinfo ffff810133aea000, task ffff8101bc67c880)
Stack: 0000000000000282 ffffffff80418a7b 0000000000000000 ffff8101bc67c880
       ffffffff801313b0 0000000000000000 0000000000000000 ffff8101bc67c880
       ffff810133aebdd8 ffff8103bfa26c78
Call Trace:{__down+75} {default_wake_function+0}
       {__down_failed+53} {.text.lock.kernel_lock+25}
       {reiserfs_setattr+44} {__down_write+51}
       {notify_change+340} {do_truncate+65}
       {may_open+468} {open_namei+734}
       {thread_return+0} {filp_open+39}
       {get_unused_fd+219} {do_sys_open+81}
       {system_call+126}
Code: 80 3f 00 7e f9 e9 90 fd ff ff f3 90 80 3f 00 7e f9 e9 9c fd
console shuts up ...
Badness in do_unblank_screen at drivers/char/vt.c:2831
Call Trace: {do_unblank_screen+75} {bust_spinlocks+28}
       {oops_end+21} {die_nmi+113}
       {nmi_watchdog_tick+245} {default_do_nmi+130}
       {do_nmi+69} {nmi+127}
       {.text.lock.spinlock+22} {__down+75}
       {default_wake_function+0} {__down_failed+53}
       {.text.lock.kernel_lock+25} {reiserfs_setattr+44}
       {__down_write+51} {notify_change+340}
       {do_truncate+65} {may_open+468}
       {open_namei+734} {thread_return+0}
       {filp_open+39} {get_unused_fd+219}
       {do_sys_open+81} {system_call+126}

<0>Rebooting in 30 seconds..