From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: from cuda.sgi.com (cuda2.sgi.com [192.48.176.25]) by oss.sgi.com
	(8.14.3/8.14.3/SuSE Linux 0.8) with ESMTP id n9EF8LsK198638
	for ; Wed, 14 Oct 2009 10:08:22 -0500
Received: from bombadil.infradead.org (localhost [127.0.0.1]) by cuda.sgi.com
	(Spam Firewall) with ESMTP id 07DD45DFC
	for ; Wed, 14 Oct 2009 08:09:51 -0700 (PDT)
Received: from bombadil.infradead.org (bombadil.infradead.org [18.85.46.34])
	by cuda.sgi.com with ESMTP id lPzoK1RLne1FrmFw
	for ; Wed, 14 Oct 2009 08:09:51 -0700 (PDT)
Date: Wed, 14 Oct 2009 11:09:51 -0400
From: Christoph Hellwig
Subject: Re: deadlocks with fallocate
Message-ID: <20091014150951.GA13123@infradead.org>
MIME-Version: 1.0
Content-Disposition: inline
List-Id: XFS Filesystem from SGI
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: xfs-bounces@oss.sgi.com
Errors-To: xfs-bounces@oss.sgi.com
To: Thomas Neumann
Cc: linux-xfs@oss.sgi.com

On Thu, Oct 08, 2009 at 08:59:45AM +0200, Thomas Neumann wrote:
> I am willing to help to debug the problem, although it is probably a race
> condition, as it does not occur all of the time. Is there anything I should
> do to pinpoint the problem?
> It always seems to occur when the user space calls fallocate (100% of my log
> entries contained this function call), but otherwise I am not sure what
> triggers it.

I think we're deadlocking here because we have one process waiting for
I/O completions from another task, but the waiting task holds a lock
that the I/O completion needs.

> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> xfsconvertd/0 D 0000000000000000 0 411 2 0x00000000
>  ffff88007b21d3e0 0000000000000046 ffff88007d4e8c40 ffff88007b21dfd8
>  ffff88007adfdb40 0000000000015980 0000000000015980 ffff88007b21dfd8
>  0000000000015980 ffff88007b21dfd8 0000000000015980 ffff88007adfdf00
> Call Trace:
>  [] io_schedule+0x42/0x60
>  [] sync_page+0x35/0x50
>  [] __wait_on_bit+0x55/0x80
>  [] wait_on_page_bit+0x70/0x80
>  [] shrink_page_list+0x3d8/0x550
>  [] shrink_inactive_list+0x6b6/0x700
>  [] shrink_list+0x51/0xb0
>  [] shrink_zone+0x1ea/0x200
>  [] shrink_zones+0x63/0xf0
>  [] do_try_to_free_pages+0x70/0x280
>  [] try_to_free_pages+0x9c/0xc0
>  [] __alloc_pages_slowpath+0x232/0x520
>  [] __alloc_pages_nodemask+0x146/0x180
>  [] alloc_pages_current+0x87/0xd0
>  [] allocate_slab+0x11c/0x1b0
>  [] new_slab+0x2b/0x190
>  [] __slab_alloc+0x121/0x230
>  [] kmem_cache_alloc+0xf0/0x130
>  [] kmem_zone_alloc+0x5d/0xd0 [xfs]
>  [] kmem_zone_zalloc+0x19/0x50 [xfs]
>  [] _xfs_trans_alloc+0x2f/0x70 [xfs]
>  [] xfs_trans_alloc+0x92/0xa0 [xfs]
>  [] xfs_iomap_write_unwritten+0x71/0x200 [xfs]
>  [] xfs_end_bio_unwritten+0x65/0x80 [xfs]
>  [] run_workqueue+0xb7/0x190
>  [] worker_thread+0x96/0xf0
>  [] child_rip+0xa/0x20

This thread is completing unwritten I/O; unwritten extents are the XFS
representation of preallocated space, such as that created by
posix_fallocate since the system call was wired up.  It needs to
allocate memory for the transaction structure, and we go all the way
down into the page allocator, where we get stuck, probably waiting for
I/O on the same inode.

> [] xfs_ioend_wait+0x85/0xc0 [xfs]
> [] xfs_setattr+0x85d/0xb20 [xfs]
> [] xfs_vn_fallocate+0xed/0x100 [xfs]
> [] do_fallocate+0xfd/0x110
> [] sys_fallocate+0x49/0x70
> [] system_call_fastpath+0x16/0x1b

And this thread is inside the fallocate handler, waiting in
xfs_ioend_wait for outstanding I/O on the inode to complete.  Can you
check whether the hack below makes the problem go away?
Index: linux-2.6/fs/xfs/linux-2.6/xfs_aops.c
===================================================================
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_aops.c	2009-10-14 17:06:50.489254023 +0200
+++ linux-2.6/fs/xfs/linux-2.6/xfs_aops.c	2009-10-14 17:07:54.055006445 +0200
@@ -278,6 +278,8 @@ xfs_end_bio_unwritten(
 	xfs_off_t		offset = ioend->io_offset;
 	size_t			size = ioend->io_size;
 
+	current->flags |= PF_FSTRANS;
+
 	if (likely(!ioend->io_error)) {
 		if (!XFS_FORCED_SHUTDOWN(ip->i_mount)) {
 			int error;

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs