From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dave Chinner
Subject: Re: task hung over xfs
Date: Tue, 5 Jun 2012 16:28:58 +1000
Message-ID: <20120605062858.GH4347@dastard>
To: Raz
Cc: linux-xfs@oss.sgi.com, linux-fsdevel@vger.kernel.org
Sender: linux-fsdevel-owner@vger.kernel.org

On Wed, May 30, 2012 at 09:44:45PM +0300, Raz wrote:
> Hello
> We using 2.6.32 gentoo 64bit. and we're getting task_hung timeout stack.
>
> Our server uses direct IO. It reads files contents to buffers in
> memory and sends them by TCP. In addition, data is received
> by TCP and stored in files on disk.
> Most of the IO is reading data and sending it by TCP sockets.
>
> There are 4 threads reading data from disk into memory buffers. One
> thread per partition.
> There are about 20 threads reading data from the network and saving it
> to disk.
>
> In addition, there is an operation that is done on every file once it is
> downloaded. This operation maps data from file to memory. It is done
> in Java. I assume it is mmap. The mapping is very short.
>
> The bellow is the stack. Is this xfs bug ? root file system is xfs as
> well the data partition.
> Was a fix made in this area ? when was it ?
> thank you
> raz
>
>
INFO: task java:10449 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
java          D 0007ffffffffe708     0 10449  10408 0x00000000
  ffff88042acd1c28 0000000000000086 0000000000cd1b88 0000000000000000
  0000000000000000 ffff88042acd1d7c 000000002c95c410 0000000000000000
  0000000000015840 000000000000f9c8 ffff88042c95c410 ffff88042e4d96b0
Call Trace:
  [] ? kmem_zone_alloc+0xaa/0x110 [xfs]
  [] __down_write_nested+0xa5/0x100
  [] __down_write+0x1e/0x40
  [] down_write+0x1c/0x40
  [] xfs_ilock+0x9c/0xb0 [xfs]
  [] xfs_free_eofblocks+0x256/0x290 [xfs]
  [] xfs_release+0x14d/0x210 [xfs]
  [] xfs_file_release+0x23/0x40 [xfs]
  [] __fput+0xe9/0x210
  [] fput+0x2b/0x50
  [] remove_vma+0x49/0xb0
  [] do_munmap+0x36a/0x3d0
  [] ? __down_write+0x1e/0x40
  [] sys_munmap+0x5c/0xa0
  [] system_call_fastpath+0x16/0x1b

Holding the mmap_sem, blocked on the iolock in exclusive mode waiting
for IO to complete.

java          D ffff8803bc495b48     0 11768  10408 0x00000000
  ffff8803bc495a58 0000000000000086 ffff8803bc4959a8 ffffffff815f38bc
  ffff8803bc4959e8 000000005c14da46 ffff8803bc4959d8 ffffffff811300a2
  0000000000015840 000000000000f9c8 ffff8803bc468000 ffff88042e4c2d60
Call Trace:
  [] ? _spin_lock+0x1c/0x40
  [] ? swap_info_get+0x82/0x120
  [] ? mem_cgroup_commit_charge_swapin+0x21/0x40
  [] __down_read+0xad/0xfa
  [] down_read+0x1c/0x40
  [] do_page_fault+0x379/0x3a0
  [] page_fault+0x25/0x30
  [] ? file_read_actor+0x6c/0x180
  [] ? file_read_actor+0x107/0x180
  [] generic_file_aio_read+0x492/0x6b0
  [] xfs_read+0x138/0x2c0 [xfs]
  [] xfs_file_aio_read+0x6e/0x90 [xfs]
  [] do_sync_read+0x101/0x160
  [] ? autoremove_wake_function+0x0/0x60
  [] ? security_file_permission+0x24/0x40
  [] vfs_read+0xe4/0x1c0
  [] sys_read+0x5f/0xc0
  [] system_call_fastpath+0x16/0x1b

Holding the iolock in shared mode, taken a page fault during the
read() call and blocked on the mmap_sem.
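
To make the lock ordering in the two traces explicit, here is a small
illustrative sketch. It is not kernel code. It models each hung task as a
(held lock, wanted lock) pair and walks the resulting wait-for chain looking
for a cycle. The function name find_deadlock and the task labels are made up
for this example, and the model assumes one holder per lock, which glosses
over the fact that the iolock is held in shared mode here.

```python
def find_deadlock(tasks):
    """Return the task names that form a wait cycle, or [] if there is none.

    tasks maps a task name to (held_lock, wanted_lock). Simplifying
    assumption: each lock has at most one holder.
    """
    # Map each held lock back to the task that holds it.
    holder = {held: name for name, (held, _) in tasks.items() if held is not None}
    for start in tasks:
        chain = []
        cur = start
        # Follow "task wants lock -> lock is held by task" until the chain
        # ends or loops back onto a task already visited.
        while cur is not None and cur not in chain:
            chain.append(cur)
            wanted = tasks[cur][1]
            cur = holder.get(wanted) if wanted is not None else None
        if cur is not None:
            return chain[chain.index(cur):]
    return []


# The two tasks from the traces above (labels are hypothetical).
tasks = {
    "java:10449 munmap": ("mmap_sem", "iolock"),  # fput() under mmap_sem
    "java:11768 read": ("iolock", "mmap_sem"),    # page fault under iolock
}
cycle = find_deadlock(tasks)
print(cycle)
```

Each task holds the lock the other one wants, so both end up in the
reported cycle. If either path took the two locks in the same order as the
other, the chain would terminate and no cycle would be found.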

IOWs, you're doing read() IO into an mmap()d buffer, and there's a
concurrent munmap() of another region of the same file that is open
under a different file descriptor. ABBA deadlock, and it's been
there for about 10 years.

The problem is the munmap() call calling fput() with the mmap_sem
held. Here's the latest discussion thread about solving it:

https://lkml.org/lkml/2012/4/19/635

Right now your only option for avoiding the deadlock is "don't do
that". Soon it might be "upgrade to 3.x", but don't hold your
breath...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com