From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754464AbYA2UL4 (ORCPT );
	Tue, 29 Jan 2008 15:11:56 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1764631AbYA2ULn (ORCPT );
	Tue, 29 Jan 2008 15:11:43 -0500
Received: from brick.kernel.dk ([87.55.233.238]:7717 "EHLO kernel.dk"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755106AbYA2ULl (ORCPT );
	Tue, 29 Jan 2008 15:11:41 -0500
Date: Tue, 29 Jan 2008 21:11:36 +0100
From: Jens Axboe
To: "Luck, Tony"
Cc: LKML , linux-ia64@vger.kernel.org
Subject: Re: system hang on latest git
Message-ID: <20080129201136.GX15220@kernel.dk>
References: <1FE6DD409037234FAB833C420AA843EC75736A@orsmsx424.amr.corp.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1FE6DD409037234FAB833C420AA843EC75736A@orsmsx424.amr.corp.intel.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 29 2008, Luck, Tony wrote:
> I pulled Linus' tree this morning (git head = 0ba6c33bcddc64a54b5f1c25a696c4767dc76292)
> and built for ia64 (using arch/ia64/configs/tiger_defconfig). System booted
> OK, but when I stressed it a little (building another kernel with "make -j32")
> it hung.
>
> The console has a bunch (98) of warnings about tasks blocked for more than 120
> seconds like this:
> INFO: task grep:9168 blocked for more than 120 seconds.
> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
> Call Trace:
> [] schedule+0x11c0/0x1340
>     sp=e0000001ed8afbf0 bsp=e0000001ed8a1280
> [] do_get_write_access+0x660/0xbe0
>     sp=e0000001ed8afc20 bsp=e0000001ed8a1208
> [] journal_get_write_access+0x40/0x80
>     sp=e0000001ed8afca0 bsp=e0000001ed8a11c8
> [] __ext3_journal_get_write_access+0x30/0xa0
>     sp=e0000001ed8afca0 bsp=e0000001ed8a1190
> [] ext3_reserve_inode_write+0x80/0x120
>     sp=e0000001ed8afca0 bsp=e0000001ed8a1158
> [] ext3_mark_inode_dirty+0x30/0x80
>     sp=e0000001ed8afca0 bsp=e0000001ed8a1130
> [] ext3_dirty_inode+0xd0/0x120
>     sp=e0000001ed8afcc0 bsp=e0000001ed8a1100
> [] __mark_inode_dirty+0xa0/0x3e0
>     sp=e0000001ed8afcc0 bsp=e0000001ed8a10b0
> [] touch_atime+0x310/0x340
>     sp=e0000001ed8afcc0 bsp=e0000001ed8a1088
> [] do_generic_mapping_read+0x780/0x7a0
>     sp=e0000001ed8afce0 bsp=e0000001ed8a0fe0
> [] generic_file_aio_read+0x290/0x340
>     sp=e0000001ed8afce0 bsp=e0000001ed8a0f80
> [] do_sync_read+0x170/0x200
>     sp=e0000001ed8afd10 bsp=e0000001ed8a0f40
> [] vfs_read+0x1b0/0x2e0
>     sp=e0000001ed8afe20 bsp=e0000001ed8a0ef0
> [] sys_read+0x70/0xe0
>     sp=e0000001ed8afe20 bsp=e0000001ed8a0e78
> [] ia64_ret_from_syscall+0x0/0x20
>     sp=e0000001ed8afe30 bsp=e0000001ed8a0e78
>
>
> [The stack trace has several variations ... some from sys_read(), some from
> sys_open(), some from sys_execve(), some from sys_mmap() etc. 84/98 stack
> traces pass through the touch_atime->__mark_inode_dirty path ... all 98
> are attached]
>
> A quick dig into processor state shows 8 cpus are idle. 7 are spinning
> in __spin_lock_irq() from __make_request() and one is in spin_lock() from
> as_merged_requests().

Looks like a deadlock on queue lock and ioc lock, but I don't see
immediately what the problem is. I can't stick around for longer
tonight, but I'll get to the bottom of this tomorrow.

-- 
Jens Axboe