From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <52C18F56.70709@sgi.com>
Date: Mon, 30 Dec 2013 09:20:54 -0600
From: Mark Tinguely
Subject: Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
References: <52B98292.5040002@oracle.com>
In-Reply-To: <52B98292.5040002@oracle.com>
To: Jeff Liu
Cc: "xfs@oss.sgi.com"

On 12/24/13 06:48, Jeff Liu wrote:
> From: Jie Liu
>
> I can easily hit a hang while running fsstress and shutting down
> XFS on an SSD via the test below:
>
> for ((i=0;i<10;i++))
> do
>         echo "[$i] Fire up..."
>         mount /dev/sda7 /xfs
>         fsstress -d /xfs -n 1000 -p 100 >/dev/null 2>&1 &
>         sleep 10
>         godown /xfs
>         wait
>         killall -q fsstress
>         umount /xfs
>         echo "[$i] Done...."
>         echo
> done
>
> which yields the backtrace below:
>
> [ 246.268987] INFO: task fsstress:3347 blocked for more than 120 seconds.
> [ 246.268992] Tainted: PF O 3.13.0-rc2+ #4
> [ 246.268994] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 246.268996] fsstress D ffff88026f254440 0 3347 3284
> [ 246.269013] Call Trace:
> [ 246.269022] [] schedule+0x29/0x70
> [ 246.269054] [] xlog_cil_force_lsn+0x1cb/0x220 [xfs]
> [ 246.269059] [] ? wake_up_state+0x20/0x20
> [ 246.269064] [] ? do_fsync+0x80/0x80
> [ 246.269087] [] _xfs_log_force+0x61/0x270 [xfs]
> [ 246.269091] [] ? jbd2_log_wait_commit+0x110/0x180
> [ 246.269095] [] ? prepare_to_wait_event+0x100/0x100
> [ 246.269098] [] ? do_fsync+0x80/0x80
> [ 246.269120] [] xfs_log_force+0x26/0x80 [xfs]
> [ 246.269139] [] xfs_fs_sync_fs+0x2d/0x50 [xfs]
> [ 246.269143] [] sync_fs_one_sb+0x20/0x30
> [ 246.269147] [] iterate_supers+0xb2/0x110
> [ 246.269150] [] sys_sync+0x62/0xa0
> [ 246.269156] [] system_call_fastpath+0x1a/0x1f
> [ 266.335154] XFS (sda7): xfs_log_force: error 5 returned.
> [ 296.400515] XFS (sda7): xfs_log_force: error 5 returned.
>
> In xlog_cil_force_lsn(), if the task finds a previous sequence still
> committing, it needs to wait until all those previous sequence commits
> complete, i.e., it blocks on the cil->xc_commit_wait wait queue. In normal
> situations, the ctx with a previous sequence will eventually commit and
> wake up tasks on cil->xc_commit_wait after getting a valid commit_lsn
> (see xlog_cil_push()). However, if something goes wrong during the commit,
> e.g., XLOG_STATE_IOERROR is detected, the commit is aborted and the ctx is
> just removed from the cil->xc_committing list, but the waiting tasks are
> not woken up in this case. Hence, the following race can happen:
>
>       Task1                           Task2
>
>                                       list_add(&ctx->committing, &cil->xc_committing);
>
>       xlog_wait(&cil->xc_commit_wait..)
>       schedule()...
>
>                                       Aborting!! list_del(&ctx->committing);
>                                       wake_up_all(&cil->xc_commit_wait); <-- MISSING!
>
> As a result, we should handle this situation in xlog_cil_committed().
>
> Signed-off-by: Jie Liu
> ---
>  fs/xfs/xfs_log_cil.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index 5eb51fc..8c7e9c7 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -406,6 +406,8 @@ xlog_cil_committed(
>
>  	spin_lock(&ctx->cil->xc_push_lock);
>  	list_del(&ctx->committing);
> +	if (abort)
> +		wake_up_all(&ctx->cil->xc_commit_wait);
>  	spin_unlock(&ctx->cil->xc_push_lock);
>
>  	xlog_cil_free_logvec(ctx->lv_chain);

Hi Jeff, I hope you had a good break.

So you are saying the wakeup in the CIL push error path is missing?
I agree with that. But I don't like adding a new wakeup to
xlog_cil_committed(), which runs after the log buffer is written.

Thanks.

--Mark.
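
For anyone following the thread, the lost wakeup in the quoted diagram can be walked through with a small single-threaded toy model in user-space C. This is only a sketch, not XFS code: struct toy_cil and every toy_* function are hypothetical names invented for illustration, and plain counters stand in for the cil->xc_committing list and the cil->xc_commit_wait wait queue. It takes no position on where in the push path the wakeup should be issued.

```c
#include <stdbool.h>

/* Hypothetical toy model, not XFS code: counters stand in for the
 * cil->xc_committing list and the cil->xc_commit_wait wait queue. */
struct toy_cil {
    int committing;   /* contexts still on the committing list */
    int sleepers;     /* tasks asleep on the commit wait queue */
};

/* Waiter side, loosely modelled on xlog_cil_force_lsn(): sleep only
 * if an earlier context is still committing. Returns true if it slept. */
bool toy_force_lsn(struct toy_cil *cil)
{
    if (cil->committing == 0)
        return false;
    cil->sleepers++;
    return true;
}

/* Stand-in for wake_up_all(): every sleeper becomes runnable. */
void toy_wake_up_all(struct toy_cil *cil)
{
    cil->sleepers = 0;
}

/* Pusher side: take one context off the committing list, as
 * list_del() does, and wake the waiters only if asked to. */
void toy_remove_ctx(struct toy_cil *cil, bool wake)
{
    if (cil->committing > 0)
        cil->committing--;
    if (wake)
        toy_wake_up_all(cil);
}

/* Replay the interleaving from the diagram and return how many
 * tasks are still asleep once the context has been removed. */
int toy_stranded_sleepers(bool wake_on_abort)
{
    struct toy_cil cil = { .committing = 0, .sleepers = 0 };

    cil.committing++;                      /* Task2: list_add() */
    toy_force_lsn(&cil);                   /* Task1: xlog_wait(), schedule() */
    toy_remove_ctx(&cil, wake_on_abort);   /* Task2: abort, list_del() */
    return cil.sleepers;
}
```

In this model, toy_stranded_sleepers(false) leaves one task asleep even though the committing list is empty, which corresponds to the hung fsstress task in the trace; toy_stranded_sleepers(true) leaves none.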