From mboxrd@z Thu Jan 1 00:00:00 1970
From: Wu Fengguang
Subject: Re: [Bug 18632] "INFO: task" dpkg "blocked for more than 120 seconds.
Date: Thu, 9 Jun 2011 17:09:06 +0800
Message-ID: <20110609090906.GA19186@localhost>
References: <201106082138.p58Lchgj002615@demeter2.kernel.org> <20110608150241.8412a63d.akpm@linux-foundation.org> <20110609033217.GA10741@localhost> <20110609035426.GA12061@localhost> <20110609082718.GA10335@infradead.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Dave Chinner, Andrew Morton, Jan Kara, "linux-fsdevel@vger.kernel.org", "bugzilla-daemon@bugzilla.kernel.org", "daaugusto@gmail.com", "kernel-bugzilla@cygnusx-1.org", "listposter@gmail.com", "justincase@yopmail.com", "clopez@igalia.com"
To: Christoph Hellwig
Return-path:
Received: from mga02.intel.com ([134.134.136.20]:58440 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752827Ab1FIJJJ (ORCPT ); Thu, 9 Jun 2011 05:09:09 -0400
Content-Disposition: inline
In-Reply-To: <20110609082718.GA10335@infradead.org>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Thu, Jun 09, 2011 at 04:27:18PM +0800, Christoph Hellwig wrote:
> On Thu, Jun 09, 2011 at 11:54:26AM +0800, Wu Fengguang wrote:
> > Oh it's not sync(1) in Carlos' case, but random commands like man/dd
> > as well as
> >
> >         task xfssyncd:451 blocked for more than 120 seconds.
> >
> > It looks more like XFS related livelock, because I ran into similar
> > problem in a very simple XFS setup on some plain disk partition.
>
> Care to explain what issues you saw, and with what setup and kernel
> version? Also usually the task blocked more than 120 seconds display
> should come with a stacktrace, how does it look like?

I have a sync livelock test script and it sometimes livelocked on XFS
even with the livelock fix patches. Ext4 is always OK.
[ 3581.181253] sync            D ffff8800b6ca15d8  4560  4403   4392 0x00000000
[ 3581.181734]  ffff88006f775bc8 0000000000000046 ffff8800b6ca12b8 00000001b6ca1938
[ 3581.182411]  ffff88006f774000 00000000001d2e40 00000000001d2e40 ffff8800b6ca1280
[ 3581.183088]  00000000001d2e40 ffff88006f775fd8 00000340af111ef2 00000000001d2e40
[ 3581.183765] Call Trace:
[ 3581.184008]  [] ? lock_release_holdtime+0xa3/0xab
[ 3581.184392]  [] ? prepare_to_wait+0x6c/0x79
[ 3581.184756]  [] ? prepare_to_wait+0x6c/0x79
[ 3581.185120]  [] xfs_ioend_wait+0x87/0x9f
[ 3581.185474]  [] ? wake_up_bit+0x2a/0x2a
[ 3581.185827]  [] xfs_sync_inode_data+0x92/0x9d
[ 3581.186198]  [] xfs_inode_ag_walk+0x1a5/0x287
[ 3581.186569]  [] ? xfs_inode_ag_walk+0x25e/0x287
[ 3581.186946]  [] ? xfs_sync_worker+0x69/0x69
[ 3581.187311]  [] ? xfs_perag_get+0x68/0xd0
[ 3581.187669]  [] ? local_clock+0x41/0x5a
[ 3581.188020]  [] ? lock_release_holdtime+0xa3/0xab
[ 3581.188403]  [] ? xfs_check_sizes+0x160/0x160
[ 3581.188773]  [] ? xfs_perag_get+0x68/0xd0
[ 3581.189130]  [] ? xfs_perag_get+0x80/0xd0
[ 3581.189488]  [] ? xfs_check_sizes+0x160/0x160
[ 3581.189858]  [] ? xfs_inode_ag_iterator+0x6d/0x8f
[ 3581.190241]  [] ? xfs_sync_worker+0x69/0x69
[ 3581.190606]  [] xfs_inode_ag_iterator+0x47/0x8f
[ 3581.190982]  [] ? __sync_filesystem+0x7a/0x7a
[ 3581.191352]  [] xfs_sync_data+0x24/0x43
[ 3581.191703]  [] xfs_quiesce_data+0x2c/0x88
[ 3581.192065]  [] xfs_fs_sync_fs+0x21/0x48
[ 3581.192419]  [] __sync_filesystem+0x66/0x7a
[ 3581.192783]  [] sync_one_sb+0x16/0x18
[ 3581.193128]  [] iterate_supers+0x72/0xce
[ 3581.193482]  [] sync_filesystems+0x20/0x22
[ 3581.193842]  [] sys_sync+0x21/0x33
[ 3581.194177]  [] system_call_fastpath+0x16/0x1b

I just reconfirmed the problem on 3.0-rc2 with and without the livelock
fix patches ("writeback fixes and cleanups v5" at
http://lkml.org/lkml/2011/6/7/569), and found that the situation has
improved for XFS. Sync still takes much longer than on ext4, but it no
longer stays stuck until dd exits.
root@fat /home/wfg# ./sync-livelock.sh

3.0-rc2, xfs:

        sync time: 20
        sync time: 26
        sync time: 27

3.0-rc2, ext4:

        sync time: 4
        sync time: 4
        sync time: 3

3.0-rc2 with livelock fix patches, xfs:

        sync time: 18
        sync time: 21
        sync time: 14
        sync time: 20
        sync time: 21

Thanks,
Fengguang
---
$ cat ./sync-livelock.sh
#!/bin/sh

# Recreate the test filesystem; swap the mkfs line to test ext4 or btrfs.
umount /dev/sda7
mkfs.xfs -f /dev/sda7
# mkfs.ext4 /dev/sda7
# mkfs.btrfs /dev/sda7
mount /dev/sda7 /fs

# Limit dirty memory to 50MB (50<<20 bytes).
echo $((50<<20)) > /proc/sys/vm/dirty_bytes

# Start 10 concurrent writers and remember their pids.
pid=
for i in $(seq 10)
do
	dd if=/dev/zero of=/fs/zero-$i bs=1M count=1000 &
	pid="$pid $!"
done

sleep 1

# Time a sync(1) issued while the writers are still dirtying pages.
tic=$(date +'%s')
sync
tac=$(date +'%s')

echo
echo sync time: $((tac-tic))
grep -E '(Dirty|Writeback|NFS_Unstable)' /proc/meminfo

# If dd is still running, sync returned before the writers finished,
# i.e. it was not livelocked.
pidof dd > /dev/null && { kill -9 $pid; echo sync NOT livelocked; }
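
To compare runs, the "sync time" lines printed by the script can be averaged with a small helper. This is only a sketch and not part of the original script: the function name avg_sync_time is hypothetical, and it assumes the output format shown above ("sync time: N", optionally indented).

```shell
#!/bin/sh

# Hypothetical helper (not in the original script): read the script's
# output on stdin and report the number of runs and the mean sync time.
avg_sync_time() {
	LC_ALL=C awk '
		# Match "sync time: N" lines, with or without leading whitespace.
		/^[ \t]*sync time:/ { sum += $3; n++ }
		END { if (n) printf "%d runs, mean %.1f s\n", n, sum / n }
	'
}
```

For example, feeding it the three unpatched XFS results:

    printf 'sync time: 20\nsync time: 26\nsync time: 27\n' | avg_sync_time

prints "3 runs, mean 24.3 s". The same calculation gives about 3.7 s for the ext4 runs and 18.8 s for XFS with the livelock fix patches.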