Date: Mon, 11 Jul 2011 11:03:57 +1000
From: Dave Chinner
Subject: [regression, 3.0-rc] xfs: freeze hang in 068
Message-ID: <20110711010357.GD23038@dastard>
To: hch@lst.de
Cc: xfs@oss.sgi.com

Christoph,

The recent changes to the active transaction accounting to close a
race on freeze can hang the freeze process and hence the filesystem.

SysRq : Show Blocked State
  task                        PC stack   pid father
xfs_io          D ffff88005b7a0f00  5592 32539  32535 0x00020000
 ffff88005bc9dd88 0000000000000082 ffff88005bc9dd28 0000000000000296
 ffff88005bc9c010 ffff88005bff4fe0 0000000000010f80 ffff88005bc9dfd8
 ffff88005bc9dfd8 0000000000010f80 ffff88005cd1dfc0 ffff88005bff4fe0
Call Trace:
 [] schedule_timeout+0x97/0xbb
 [] ? lock_timer_base+0x4d/0x4d
 [] schedule_timeout_uninterruptible+0x19/0x1b
 [] xfs_quiesce_attr+0x1d/0x7f
 [] xfs_fs_freeze+0x20/0x2e
 [] freeze_super+0x8b/0xca
 [] do_vfs_ioctl+0x1d0/0x45c
 [] ? do_raw_spin_unlock+0x8f/0x98
 [] ? virt_to_head_page+0x9/0x2c
 [] compat_sys_ioctl+0x33c/0x368
 [] ? do_sys_open+0xee/0x100
 [] sysenter_dispatch+0x7/0x2e

This is waiting for mp->m_active_trans to reach zero.

fsstress        D 0000000000000000  5376 32541  32540 0x00020000
 ffff88005b68dd48 0000000000000086 ffff88005b68dce8 ffffffff812a97b7
 ffff88005b68c010 ffff88005cf8d7d0 0000000000010f80 ffff88005b68dfd8
 ffff88005b68dfd8 0000000000010f80 ffffffff81a0b020 ffff88005cf8d7d0
Call Trace:
 [] ? do_raw_spin_unlock+0x8f/0x98
 [] ? prepare_to_wait+0x71/0x7c
 [] xfs_file_aio_write+0x10a/0x245
 [] ? wake_up_bit+0x25/0x25
 [] do_sync_write+0xc6/0x103
 [] ? handle_mm_fault+0xff/0x111
 [] vfs_write+0xa9/0x105
 [] ? vfs_llseek+0x2e/0x30
 [] sys_write+0x45/0x6c
 [] sysenter_dispatch+0x7/0x2e

This is waiting for the filesystem to unfreeze.

fsstress        D 0000000000000000  5040 32542  32540 0x00020000
 ffff88005b62fc78 0000000000000082 ffff88005b62fc18 ffffffff812a97b7
 ffff88005b62e010 ffff88005be9c000 0000000000010f80 ffff88005b62ffd8
 ffff88005b62ffd8 0000000000010f80 ffff88005cf8efa0 ffff88005be9c000
Call Trace:
 [] ? do_raw_spin_unlock+0x8f/0x98
 [] ? prepare_to_wait+0x71/0x7c
 [] _xfs_trans_alloc+0x89/0xee
 [] ? wake_up_bit+0x25/0x25
 [] xfs_trans_alloc+0x13/0x15
 [] xfs_change_file_space+0x1f9/0x2f0
 [] ? mntput+0x21/0x23
 [] ? path_put+0x1d/0x21
 [] xfs_ioc_space+0xc2/0xd3
 [] xfs_file_compat_ioctl+0x2e1/0x49b
 [] ? mntput_no_expire+0x50/0x141
 [] ? mntput+0x21/0x23
 [] ? vfs_fstat+0x3b/0x45
 [] compat_sys_ioctl+0x1a1/0x368
 [] sysenter_dispatch+0x7/0x2e

This has an active transaction reference (i.e. keeping
mp->m_active_trans > 0) and is waiting for the freeze to complete.

Basically the problem is this:

	thread 1				freeze
						SB_FREEZE_WRITE
						sync_filesystem()
						SB_FREEZE_TRANS
						->freeze
	xfs_trans_alloc
	  atomic_inc(mp->m_active_trans)
	  wait on (SB_FREEZE_TRANS)
						  xfs_quiesce_attr()
						    while (mp->m_active_trans > 0)
						      delay(1);

So effectively we cannot sleep waiting for SB_FREEZE_TRANS to go away
while holding an active transaction reference, because the freeze
process does not set and check SB_FREEZE_TRANS/mp->m_active_trans
atomically.

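To make the ordering problem concrete, here is a toy, single-threaded
userspace replay of that interleaving. It is only a sketch and not
kernel code: struct model, enum order, freeze_completes() and the
max_polls bound are all made up for illustration, and the quiesce loop
is capped so the "hang" shows up as a false return instead of a real
spin.

```c
#include <stdbool.h>

/* Freeze levels, named after the states in the mail above. */
enum frozen_state { SB_UNFROZEN, SB_FREEZE_WRITE, SB_FREEZE_TRANS };

/* Hypothetical stand-in for the mount state; not struct xfs_mount. */
struct model {
	enum frozen_state s_frozen;
	int m_active_trans;
};

/* Which step the transaction thread performs first (illustrative). */
enum order { INC_THEN_WAIT, WAIT_THEN_INC };

/*
 * Replay one fixed interleaving: the freezer has already reached
 * SB_FREEZE_TRANS when thread 1 enters transaction allocation.
 * Returns true if the quiesce loop sees a zero count within
 * max_polls iterations, false if it would keep spinning.
 */
bool freeze_completes(enum order order, int max_polls)
{
	struct model m = { SB_UNFROZEN, 0 };

	/* freeze side: raise the freeze level */
	m.s_frozen = SB_FREEZE_WRITE;
	m.s_frozen = SB_FREEZE_TRANS;

	/* thread 1: take the reference first, if that is the ordering */
	if (order == INC_THEN_WAIT)
		m.m_active_trans++;

	/* thread 1: now it has to sleep until the filesystem unfreezes */
	bool sleeping = (m.s_frozen >= SB_FREEZE_TRANS);

	/* with the other ordering the reference is only taken once awake */
	if (order == WAIT_THEN_INC && !sleeping)
		m.m_active_trans++;

	/*
	 * freeze side: bounded version of the quiesce loop. The sleeping
	 * thread cannot drop its reference, because unfreeze can only
	 * happen after this loop has finished.
	 */
	for (int i = 0; i < max_polls; i++) {
		if (m.m_active_trans == 0)
			return true;
	}
	return m.m_active_trans == 0;
}
```

In this replay freeze_completes(INC_THEN_WAIT, n) returns false for any
n, which is the hang, while freeze_completes(WAIT_THEN_INC, n) returns
true. The second ordering only avoids this one interleaving; the model
does not replay the race that 315fdfa was trying to close.
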
I haven't put any thought into how to solve this problem yet, so I'd
suggest that at this late stage we need to revert 315fdfa (xfs: fix
filesystsem freeze race in xfs_trans_alloc) because the race it fixes
is far less critical (i.e. doesn't hang the filesystem) and harder to
hit than the regression introduced here.

I've reproduced this a couple of times now on a 1p/1.5GB x86_64
kernel/i686 userspace VM.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com