linux-xfs.vger.kernel.org archive mirror
* [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
@ 2013-12-24 12:48 Jeff Liu
  2013-12-30 15:20 ` Mark Tinguely
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Liu @ 2013-12-24 12:48 UTC (permalink / raw)
  To: xfs@oss.sgi.com

From: Jie Liu <jeff.liu@oracle.com>

I can easily hit a hang while running fsstress and shutting down
XFS on an SSD with the test below:

for ((i=0;i<10;i++))
do
    echo "[$i] Fire up..."
    mount /dev/sda7 /xfs
    fsstress -d /xfs -n 1000 -p 100 >/dev/null 2>&1 &
    sleep 10
    godown /xfs
    wait
    killall -q fsstress
    umount /xfs
    echo "[$i] Done...."
    echo
done

which yields the following backtrace:

[  246.268987] INFO: task fsstress:3347 blocked for more than 120 seconds.
[  246.268992]       Tainted: PF          O 3.13.0-rc2+ #4
[  246.268994] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.268996] fsstress        D ffff88026f254440     0  3347   3284
<snip>
[  246.269013] Call Trace:
[  246.269022]  [<ffffffff816f3829>] schedule+0x29/0x70
[  246.269054]  [<ffffffffa0c4546b>] xlog_cil_force_lsn+0x1cb/0x220 [xfs]
[  246.269059]  [<ffffffff81097210>] ? wake_up_state+0x20/0x20
[  246.269064]  [<ffffffff811e9110>] ? do_fsync+0x80/0x80
[  246.269087]  [<ffffffffa0c43881>] _xfs_log_force+0x61/0x270 [xfs]
[  246.269091]  [<ffffffff8128b490>] ? jbd2_log_wait_commit+0x110/0x180
[  246.269095]  [<ffffffff810a83f0>] ? prepare_to_wait_event+0x100/0x100
[  246.269098]  [<ffffffff811e9110>] ? do_fsync+0x80/0x80
[  246.269120]  [<ffffffffa0c43ab6>] xfs_log_force+0x26/0x80 [xfs]
[  246.269139]  [<ffffffffa0bea31d>] xfs_fs_sync_fs+0x2d/0x50 [xfs]
[  246.269143]  [<ffffffff811e9130>] sync_fs_one_sb+0x20/0x30
[  246.269147]  [<ffffffff811bd5d2>] iterate_supers+0xb2/0x110
[  246.269150]  [<ffffffff811e9262>] sys_sync+0x62/0xa0
[  246.269156]  [<ffffffff816ffd6d>] system_call_fastpath+0x1a/0x1f
[  266.335154] XFS (sda7): xfs_log_force: error 5 returned.
[  296.400515] XFS (sda7): xfs_log_force: error 5 returned.

In xlog_cil_force_lsn(), if the task finds a previous sequence still
committing, it needs to wait until all of those previous sequence commits
complete, i.e., it blocks on the cil->xc_commit_wait wait queue.  In normal
situations, the ctx with a previous sequence will eventually commit and
wake up the tasks on cil->xc_commit_wait after getting a valid commit_lsn
(see xlog_cil_push()).  However, if something goes wrong during the commit,
e.g., XLOG_STATE_IOERROR is detected, the commit is aborted and the ctx is
simply removed from the cil->xc_committing list without waking up the
waiting tasks.  Hence, the following race can occur:

	Task1                    Task2

                	list_add(&ctx->committing, &cil->xc_committing);

xlog_wait(&cil->xc_commit_wait..)
schedule()...

                	Aborting!! list_del(&ctx->committing);
                	wake_up_all(&cil->xc_commit_wait); <-- MISSING!

Therefore, wake up the waiters in xlog_cil_committed() when the commit is
aborted.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
---
 fs/xfs/xfs_log_cil.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index 5eb51fc..8c7e9c7 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -406,6 +406,8 @@ xlog_cil_committed(
 
 	spin_lock(&ctx->cil->xc_push_lock);
 	list_del(&ctx->committing);
+	if (abort)
+		wake_up_all(&ctx->cil->xc_commit_wait);
 	spin_unlock(&ctx->cil->xc_push_lock);
 
 	xlog_cil_free_logvec(ctx->lv_chain);
-- 
1.8.3.2



Thread overview: 7+ messages
2013-12-24 12:48 [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing Jeff Liu
2013-12-30 15:20 ` Mark Tinguely
2014-01-01 14:38   ` Jeff Liu
2014-01-02  0:45     ` Dave Chinner
2014-01-03 10:25       ` Jeff Liu
2014-01-03 13:17         ` Jeff Liu
2014-01-03 15:30           ` Mark Tinguely
