From: Dave Chinner <david@fromorbit.com>
To: hch@lst.de
Cc: xfs@oss.sgi.com
Subject: [regression, 3.0-rc] xfs: freeze hang in 068
Date: Mon, 11 Jul 2011 11:03:57 +1000
Message-ID: <20110711010357.GD23038@dastard>
Christoph,
The recent changes to the active transaction accounting to close a
race on freeze can hang the freeze process and hence the filesystem.
SysRq : Show Blocked State
task PC stack pid father
xfs_io D ffff88005b7a0f00 5592 32539 32535 0x00020000
ffff88005bc9dd88 0000000000000082 ffff88005bc9dd28 0000000000000296
ffff88005bc9c010 ffff88005bff4fe0 0000000000010f80 ffff88005bc9dfd8
ffff88005bc9dfd8 0000000000010f80 ffff88005cd1dfc0 ffff88005bff4fe0
Call Trace:
[<ffffffff8150c184>] schedule_timeout+0x97/0xbb
[<ffffffff81072766>] ? lock_timer_base+0x4d/0x4d
[<ffffffff8150c1c1>] schedule_timeout_uninterruptible+0x19/0x1b
[<ffffffff81269786>] xfs_quiesce_attr+0x1d/0x7f
[<ffffffff81266bb2>] xfs_fs_freeze+0x20/0x2e
[<ffffffff8110db00>] freeze_super+0x8b/0xca
[<ffffffff81118abc>] do_vfs_ioctl+0x1d0/0x45c
[<ffffffff812a97b7>] ? do_raw_spin_unlock+0x8f/0x98
[<ffffffff81102846>] ? virt_to_head_page+0x9/0x2c
[<ffffffff81143cb0>] compat_sys_ioctl+0x33c/0x368
[<ffffffff8110a0f3>] ? do_sys_open+0xee/0x100
[<ffffffff81514960>] sysenter_dispatch+0x7/0x2e
This is waiting for mp->m_active_trans to reach zero.
fsstress D 0000000000000000 5376 32541 32540 0x00020000
ffff88005b68dd48 0000000000000086 ffff88005b68dce8 ffffffff812a97b7
ffff88005b68c010 ffff88005cf8d7d0 0000000000010f80 ffff88005b68dfd8
ffff88005b68dfd8 0000000000010f80 ffffffff81a0b020 ffff88005cf8d7d0
Call Trace:
[<ffffffff812a97b7>] ? do_raw_spin_unlock+0x8f/0x98
[<ffffffff81080c73>] ? prepare_to_wait+0x71/0x7c
[<ffffffff81262d5d>] xfs_file_aio_write+0x10a/0x245
[<ffffffff81080a31>] ? wake_up_bit+0x25/0x25
[<ffffffff8110b52b>] do_sync_write+0xc6/0x103
[<ffffffff810eec9e>] ? handle_mm_fault+0xff/0x111
[<ffffffff8110be64>] vfs_write+0xa9/0x105
[<ffffffff8110b055>] ? vfs_llseek+0x2e/0x30
[<ffffffff8110bf79>] sys_write+0x45/0x6c
[<ffffffff81514960>] sysenter_dispatch+0x7/0x2e
This is waiting for the filesystem to unfreeze.
fsstress D 0000000000000000 5040 32542 32540 0x00020000
ffff88005b62fc78 0000000000000082 ffff88005b62fc18 ffffffff812a97b7
ffff88005b62e010 ffff88005be9c000 0000000000010f80 ffff88005b62ffd8
ffff88005b62ffd8 0000000000010f80 ffff88005cf8efa0 ffff88005be9c000
Call Trace:
[<ffffffff812a97b7>] ? do_raw_spin_unlock+0x8f/0x98
[<ffffffff81080c73>] ? prepare_to_wait+0x71/0x7c
[<ffffffff81255843>] _xfs_trans_alloc+0x89/0xee
[<ffffffff81080a31>] ? wake_up_bit+0x25/0x25
[<ffffffff8125875b>] xfs_trans_alloc+0x13/0x15
[<ffffffff8125abcb>] xfs_change_file_space+0x1f9/0x2f0
[<ffffffff81122dda>] ? mntput+0x21/0x23
[<ffffffff81113e7c>] ? path_put+0x1d/0x21
[<ffffffff81263301>] xfs_ioc_space+0xc2/0xd3
[<ffffffff81208f92>] xfs_file_compat_ioctl+0x2e1/0x49b
[<ffffffff81122cc8>] ? mntput_no_expire+0x50/0x141
[<ffffffff81122dda>] ? mntput+0x21/0x23
[<ffffffff8110efd8>] ? vfs_fstat+0x3b/0x45
[<ffffffff81143b15>] compat_sys_ioctl+0x1a1/0x368
[<ffffffff81514960>] sysenter_dispatch+0x7/0x2e
This has an active transaction reference (i.e. keeping
mp->m_active_trans > 0) and is waiting for the freeze to complete.
Basically the problem is this:
	thread 1				freeze
						SB_FREEZE_WRITE
						sync_filesystem()
						SB_FREEZE_TRANS
						->freeze
	xfs_trans_alloc
	  atomic_inc(mp->m_active_trans)
	  wait on (SB_FREEZE_TRANS)
						xfs_quiesce_attr()
						  while (mp->m_active_trans > 0)
						    delay(1);
So effectively we cannot sleep waiting for SB_FREEZE_TRANS to go away
while holding an active transaction reference, because the freeze
process does not set and check SB_FREEZE_TRANS/mp->m_active_trans
atomically.
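To make the ordering concrete, here is a minimal user-space sketch of
the interleaving above. It is not kernel code: struct mount_model and
every function name in it are hypothetical stand-ins, and the only
things modelled are the counter and the freeze level. It just shows
that once the reference is taken after SB_FREEZE_TRANS is set, neither
side has a condition it can make progress on.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the two pieces of state involved.
 * Field names echo the email; nothing here is the real XFS code.
 */
struct mount_model {
	int  active_trans;	/* stands in for mp->m_active_trans */
	bool freeze_trans;	/* true once the freeze reached SB_FREEZE_TRANS */
};

/* Freeze side, step 1: raise the freeze level. Not atomic with the
 * later check of the counter. */
void freeze_set_trans_level(struct mount_model *mp)
{
	mp->freeze_trans = true;
}

/* Thread 1: take the reference first, then look at the freeze level.
 * Returns true if the caller may proceed, false if it would go to
 * sleep while still holding its reference. */
bool trans_alloc_model(struct mount_model *mp)
{
	mp->active_trans++;
	return !mp->freeze_trans;
}

/* Freeze side, step 2: the quiesce loop can only exit at zero. */
bool quiesce_can_finish(const struct mount_model *mp)
{
	return mp->active_trans == 0;
}

/* Replay the reported interleaving; true means both sides are stuck. */
bool replay_hang(void)
{
	struct mount_model mp = { 0, false };

	freeze_set_trans_level(&mp);			/* SB_FREEZE_TRANS */
	bool thread1_runs = trans_alloc_model(&mp);	/* inc, then wait */
	bool freeze_runs = quiesce_can_finish(&mp);	/* counter is 1 */

	return !thread1_runs && !freeze_runs;
}
```

In this model replay_hang() returns true: thread 1 is waiting for the
freeze level to drop and the freeze is waiting for the counter to
drop. The sketch says nothing about how the real code should be fixed.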
I haven't put any thought into how to solve this problem yet, so I'd
suggest that at this late stage we need to revert 315fdfa (xfs: fix
filesystsem freeze race in xfs_trans_alloc) because the race it
fixes is far less critical (i.e. doesn't hang the filesystem) and
harder to hit than the regression introduced here.
I've reproduced this a couple of times now on a 1p/1.5GB x86_64
kernel/i686 userspace VM.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com