* [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
@ 2013-12-24 12:48 Jeff Liu
From: Jeff Liu @ 2013-12-24 12:48 UTC
To: xfs@oss.sgi.com
From: Jie Liu <jeff.liu@oracle.com>
I can easily hit a hang while running fsstress and shutting down
XFS on an SSD with the test below:
for ((i=0;i<10;i++))
do
	echo "[$i] Fire up..."
	mount /dev/sda7 /xfs
	fsstress -d /xfs -n 1000 -p 100 >/dev/null 2>&1 &
	sleep 10
	godown /xfs
	wait
	killall -q fsstress
	umount /xfs
	echo "[$i] Done...."
	echo
done
which yields the backtrace below:
[ 246.268987] INFO: task fsstress:3347 blocked for more than 120 seconds.
[ 246.268992] Tainted: PF O 3.13.0-rc2+ #4
[ 246.268994] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.268996] fsstress D ffff88026f254440 0 3347 3284
<snip>
[ 246.269013] Call Trace:
[ 246.269022] [<ffffffff816f3829>] schedule+0x29/0x70
[ 246.269054] [<ffffffffa0c4546b>] xlog_cil_force_lsn+0x1cb/0x220 [xfs]
[ 246.269059] [<ffffffff81097210>] ? wake_up_state+0x20/0x20
[ 246.269064] [<ffffffff811e9110>] ? do_fsync+0x80/0x80
[ 246.269087] [<ffffffffa0c43881>] _xfs_log_force+0x61/0x270 [xfs]
[ 246.269091] [<ffffffff8128b490>] ? jbd2_log_wait_commit+0x110/0x180
[ 246.269095] [<ffffffff810a83f0>] ? prepare_to_wait_event+0x100/0x100
[ 246.269098] [<ffffffff811e9110>] ? do_fsync+0x80/0x80
[ 246.269120] [<ffffffffa0c43ab6>] xfs_log_force+0x26/0x80 [xfs]
[ 246.269139] [<ffffffffa0bea31d>] xfs_fs_sync_fs+0x2d/0x50 [xfs]
[ 246.269143] [<ffffffff811e9130>] sync_fs_one_sb+0x20/0x30
[ 246.269147] [<ffffffff811bd5d2>] iterate_supers+0xb2/0x110
[ 246.269150] [<ffffffff811e9262>] sys_sync+0x62/0xa0
[ 246.269156] [<ffffffff816ffd6d>] system_call_fastpath+0x1a/0x1f
[ 266.335154] XFS (sda7): xfs_log_force: error 5 returned.
[ 296.400515] XFS (sda7): xfs_log_force: error 5 returned.
In xlog_cil_force_lsn(), if the task finds a previous sequence that is still
committing, it needs to wait until all of those previous sequences have
committed, i.e., it blocks on the cil->xc_commit_wait wait queue. In normal
situations, the ctx with a previous sequence will eventually commit and
wake up the tasks on cil->xc_commit_wait after getting a valid commit_lsn
(see xlog_cil_push()). However, if something goes wrong during the commit,
e.g., XLOG_STATE_IOERROR is detected, the commit is aborted and the ctx is
simply removed from the cil->xc_committing list without waking up the
waiting tasks. Hence the following race can occur:
	Task1					Task2

	list_add(&ctx->committing, &cil->xc_committing);

						xlog_wait(&cil->xc_commit_wait..)
						schedule()...

	Aborting!! list_del(&ctx->committing);
	wake_up_all(&cil->xc_commit_wait); <-- MISSING!
Therefore, handle this case in xlog_cil_committed() by waking up the
waiters when the commit is aborted.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
---
fs/xfs/xfs_log_cil.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index 5eb51fc..8c7e9c7 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -406,6 +406,8 @@ xlog_cil_committed(
 
 	spin_lock(&ctx->cil->xc_push_lock);
 	list_del(&ctx->committing);
+	if (abort)
+		wake_up_all(&ctx->cil->xc_commit_wait);
 	spin_unlock(&ctx->cil->xc_push_lock);
 
 	xlog_cil_free_logvec(ctx->lv_chain);
--
1.8.3.2
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
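
The lost wakeup described in the patch can be sketched in userspace C. This is a simplified, single-threaded model under stated assumptions, not the XFS code: every name (model_cil, model_ctx, model_force(), model_committed(), model_run()) is hypothetical, the wait queue is reduced to a counter of sleeping tasks, and wake_up_all() is modelled by resetting that counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the CIL structures. */
struct model_ctx {
	long sequence;
	long commit_lsn;	/* 0 until the commit record is written */
};

struct model_cil {
	struct model_ctx *committing[8];	/* models cil->xc_committing */
	size_t nr;
	int sleepers;		/* tasks asleep on the commit wait queue */
};

/* Must a force of 'sequence' sleep because an earlier ctx is uncommitted? */
static bool model_must_wait(const struct model_cil *cil, long sequence)
{
	for (size_t i = 0; i < cil->nr; i++) {
		const struct model_ctx *ctx = cil->committing[i];

		if (ctx->sequence <= sequence && ctx->commit_lsn == 0)
			return true;
	}
	return false;
}

static void model_add(struct model_cil *cil, struct model_ctx *ctx)
{
	assert(cil->nr < 8);
	cil->committing[cil->nr++] = ctx;
}

/* Waiter side: go to sleep if a relevant sequence is still committing. */
static void model_force(struct model_cil *cil, long sequence)
{
	if (model_must_wait(cil, sequence))
		cil->sleepers++;
}

/* Committer side: unlink ctx; 'wake_on_abort' selects the patched behaviour. */
static void model_committed(struct model_cil *cil, struct model_ctx *ctx,
			    bool abort, bool wake_on_abort)
{
	for (size_t i = 0; i < cil->nr; i++) {
		if (cil->committing[i] == ctx) {
			cil->committing[i] = cil->committing[--cil->nr];
			break;
		}
	}
	if (abort && wake_on_abort)
		cil->sleepers = 0;	/* models wake_up_all() */
}

/* Returns the number of tasks left asleep after an aborted commit. */
int model_run(bool wake_on_abort)
{
	struct model_cil cil = { .nr = 0, .sleepers = 0 };
	struct model_ctx ctx = { .sequence = 1, .commit_lsn = 0 };

	model_add(&cil, &ctx);				/* Task1: list_add() */
	model_force(&cil, 1);				/* Task2: xlog_wait() */
	model_committed(&cil, &ctx, true, wake_on_abort); /* Task1: abort */
	return cil.sleepers;
}
```

In this model, model_run(false) leaves one task asleep with nothing left on the list to wake it, while model_run(true) leaves none; a woken task would rescan the list, find it empty, and proceed.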
* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
@ 2013-12-30 15:20 Mark Tinguely
From: Mark Tinguely @ 2013-12-30 15:20 UTC
To: Jeff Liu; +Cc: xfs@oss.sgi.com

On 12/24/13 06:48, Jeff Liu wrote:
<snip>
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index 5eb51fc..8c7e9c7 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -406,6 +406,8 @@ xlog_cil_committed(
>
>  	spin_lock(&ctx->cil->xc_push_lock);
>  	list_del(&ctx->committing);
> +	if (abort)
> +		wake_up_all(&ctx->cil->xc_commit_wait);
>  	spin_unlock(&ctx->cil->xc_push_lock);
>
>  	xlog_cil_free_logvec(ctx->lv_chain);

Hi Jeff, I hope you had a good break,

So you are saying the wakeup in the CIL push error path is missing?
I agree with that. But I don't like adding a new wake up to
xlog_cil_committed(), which is after the log buffer is written.

Thanks.

--Mark.
* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
@ 2014-01-01 14:38 Jeff Liu
From: Jeff Liu @ 2014-01-01 14:38 UTC
To: Mark Tinguely; +Cc: xfs@oss.sgi.com

On 12/30 2013 23:20 PM, Mark Tinguely wrote:
> On 12/24/13 06:48, Jeff Liu wrote:
<snip>
>
> Hi Jeff, I hope you had a good break,

Thanks :)

> So you are saying the wakeup in the CIL push error path is missing?

Yes.

> I agree with that. But I don't like adding a new wake up to
> xlog_cil_committed(), which is after the log buffer is written.

IMO this callback would also be called if a problem happens before
the log buffer is written, e.g.,

	xlog_cil_push()->xfs_log_notify() <-- failed
		|
		|->xlog_cil_committed()

Besides, the ctx will be removed from the committing list here, but
there might still be waiters sleeping on it.

Thanks,
-Jeff
* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
@ 2014-01-02 0:45 Dave Chinner
From: Dave Chinner @ 2014-01-02 0:45 UTC
To: Jeff Liu; +Cc: Mark Tinguely, xfs@oss.sgi.com

On Wed, Jan 01, 2014 at 10:38:36PM +0800, Jeff Liu wrote:
> On 12/30 2013 23:20 PM, Mark Tinguely wrote:
<snip>
> > I agree with that. But I don't like adding a new wake up to
> > xlog_cil_committed(), which is after the log buffer is written.

Hi Mark, any particular reason why you don't like this? It would be
great if you could explain why you don't like something up front so
we don't have to guess at your reasons or wait for another round
trip in the conversation to find them out....

> IMO this callback would also be called if a problem happens before
> the log buffer is written, e.g.,
> 	xlog_cil_push()->xfs_log_notify() <-- failed
> 		|
> 		|->xlog_cil_committed()

Right, it's the generic CIL commit handler and it can be called
directly or from log IO completion.

The question is this: is it safe to wake up waiters from log IO
completion if that is where an abort is first triggered from (i.e.
on log IO error)? From what I can see, it is safe to do the wakeup
on abort because the iclog we attach the IO completion callback to
in xlog_cil_push() cannot be put under IO until we release the
reference gained in xfs_log_done().

But this raises an interesting question - the wakeup in
xlog_cil_push() is done before the log IO for the checkpoint is
complete, so the wakeup is occurring on checkpoint processing
completion, not iclog IO completion. i.e. the actual log force
sleeping still needs to wait for log IO completion to occur after
the CIL has been pushed. This occurs in the _xfs_log_force{_lsn}()
wrappers, where iclog state changes are waited for.

Why is this important? The iclog write/flush wakeups are all done
from IO completion context, except for the force shutdown case,
which calls xlog_state_do_callback(log, XFS_LI_ABORTED, NULL); to
trigger wakeups and aborts via the log IO completion callbacks on
all the outstanding iclogs.

IOWs, we've already got a design pattern that says:

	- run log force wakeups from IO completions
	- on shutdown, run IO completions directly to abort pending
	  log operations

So, really, issuing wakeups from iclog IO completion on log aborts
or errors is exactly what we currently do to ensure that shutdowns
don't leave processes waiting on log force completion behind. So
from that perspective, adding the wakeup on abort to
xlog_cil_committed() seems like the right approach to take.

Actually, there are more issues here: xlog_cil_push() leaks a
reference to the iclog when it triggers the error path via
xfs_log_notify() failure. At this point we always need to release
the iclog. Hence if xfs_log_notify() were to always add the IO
completion to the iclog and xlog_cil_committed() issued wakeups on
abort errors, then we could completely ignore the log state in
xfs_log_notify() and have xfs_log_release_iclog() capture the IO
error and the subsequent shutdown would handle the aborts and
wakeups....

Hmmm, then xfs_log_notify could go away, and the callback list could
be made a lockless list and the ic_callback_lock could go away,
too....

> Besides, the ctx will be removed from the committing list here, but
> there might still be waiters sleeping on it.

Right, if we get an abort from log IO completion, then we may not
have any other wakeup vector that can be triggered. Triggering them
from IO completion ensures that even forced shutdowns have a trigger
for wakeups...

So, for the hang issue, I think the minimal patch is OK, but we
should look to clean up the logic and fix the leaked iclog reference
on xfs_log_notify() failure as well.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
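
The design pattern Dave describes (wakeups run from IO completion callbacks, and a shutdown runs those same callbacks directly with an abort flag) might be sketched as below. This is a userspace illustration only: model_iclog, model_waitq, attach_cb(), run_callbacks(), commit_cb() and shutdown_leaves_sleepers() are hypothetical names, and the real iclog callback list, its locking and its state machine are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

#define CB_MAX 4

/* Hypothetical callback record, loosely modelled on an iclog callback. */
struct cb {
	void (*func)(void *arg, int aborted);
	void *arg;
};

struct model_iclog {
	struct cb callbacks[CB_MAX];
	size_t nr;
};

/* Wait queue reduced to counters so the effect is easy to observe. */
struct model_waitq {
	int sleepers;
	int wakeups;
};

static void attach_cb(struct model_iclog *ic,
		      void (*func)(void *, int), void *arg)
{
	assert(ic->nr < CB_MAX);
	ic->callbacks[ic->nr].func = func;
	ic->callbacks[ic->nr].arg = arg;
	ic->nr++;
}

/*
 * Called from IO completion (aborted reflects the IO result) or
 * directly on forced shutdown (aborted = 1).
 */
static void run_callbacks(struct model_iclog *ic, int aborted)
{
	for (size_t i = 0; i < ic->nr; i++)
		ic->callbacks[i].func(ic->callbacks[i].arg, aborted);
	ic->nr = 0;
}

/* Commit callback: on abort, no other path will wake the waiters. */
static void commit_cb(void *arg, int aborted)
{
	struct model_waitq *wq = arg;

	if (aborted) {
		wq->sleepers = 0;	/* models wake_up_all() */
		wq->wakeups++;
	}
}

/* Returns the number of tasks still asleep after a forced shutdown. */
int shutdown_leaves_sleepers(void)
{
	struct model_iclog ic = { .nr = 0 };
	struct model_waitq wq = { .sleepers = 3, .wakeups = 0 };

	attach_cb(&ic, commit_cb, &wq);
	run_callbacks(&ic, 1);	/* shutdown: run completions directly */
	return wq.sleepers;
}
```

The point of the sketch is that one callback serves both paths, so the abort wakeup needs to exist in only one place.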
* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
@ 2014-01-03 10:25 Jeff Liu
From: Jeff Liu @ 2014-01-03 10:25 UTC
To: Dave Chinner; +Cc: Mark Tinguely, xfs@oss.sgi.com

On 01/02 2014 08:45, Dave Chinner wrote:
<snip>
> Actually, there are more issues here: xlog_cil_push() leaks a
> reference to the iclog when it triggers the error path via
> xfs_log_notify() failure. At this point we always need to release
> the iclog. Hence if xfs_log_notify() were to always add the IO
> completion to the iclog and xlog_cil_committed() issued wakeups on
> abort errors, then we could completely ignore the log state in
> xfs_log_notify() and have xfs_log_release_iclog() capture the IO
> error and the subsequent shutdown would handle the aborts and
> wakeups....

After digging into the code, there is indeed an iclog reference leak.

> Hmmm, then xfs_log_notify could go away, and the callback list could
> be made a lockless list and the ic_callback_lock could go away,
> too....

Hence we can fold xfs_log_notify() into xlog_cil_push() directly, but I am
not sure I get the reason why we could make the callback list lockless:
when attaching the IO completion callback to the iclog, we assert that the
iclog state is XLOG_STATE_ACTIVE or XLOG_STATE_WANT_SYNC, but in the other
place where we also take the ic_callback_lock, i.e.,
xlog_state_do_callback(), we only perform callbacks for iclogs that are in
XLOG_STATE_DONE_SYNC or XLOG_STATE_DO_CALLBACK, so they are already
protected from potential races. Do I understand correctly?

Also, it seems like iclog->ic_callback_tail can go away as well,
since it only serves as a left value.

Thanks,
-Jeff
* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
@ 2014-01-03 13:17 Jeff Liu
From: Jeff Liu @ 2014-01-03 13:17 UTC
To: Dave Chinner; +Cc: Mark Tinguely, xfs@oss.sgi.com

On 01/03 2014 18:25 PM, Jeff Liu wrote:
> On 01/02 2014 08:45, Dave Chinner wrote:
<snip>
> Also, it seems like iclog->ic_callback_tail can go away as well,
> since it only serves as a left value.

Oh, no! I got ic_callback_tail wrong... It is used to attach func to the
tail of the callback list. But IMHO, since it seems the current code only
attaches one callback to the iclog (xlog_cil_committed()),
iclog->ic_callback alone could handle it if no more callbacks are added
in the future...

Thanks,
-Jeff
* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
@ 2014-01-03 15:30 Mark Tinguely
From: Mark Tinguely @ 2014-01-03 15:30 UTC
To: Jeff Liu; +Cc: xfs@oss.sgi.com

On 1/3/2014 7:17 AM, Jeff Liu wrote:
> On 01/03 2014 18:25 PM, Jeff Liu wrote:
>> On 01/02 2014 08:45, Dave Chinner wrote:
<snip>
>>> Hi Mark, any particular reason why you don't like this? It would be
>>> great if you could explain why you don't like something up front so
>>> we don't have to guess at your reasons or wait for another round

My concern is consistency: with the patch there will be two paths that
could do the wakeup.

Originally, the wakeup happened before the iclog write. With the patch,
if the cil push sequence successfully wrote its ticket, woke up the
waiters, wrote back the iclog, and then had an error writing the iclog,
it would wake up xc_commit_wait a second time. Not too drastic of a
problem, because the zeroed commit_lsn will prevent a premature write of
the next cil push.

I just prefer to handle the error in the cil push routine and avoid a
second form of wakeup.

--Mark.
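
Mark's observation that a second wakeup is mostly harmless rests on waiters rechecking state after every wakeup instead of trusting the wakeup itself. A minimal sketch of such a recheck follows, assuming a waiter that scans for earlier sequences whose commit_lsn is still zero. The names push_ctx and earlier_push_pending() are hypothetical, and the scan is a simplification rather than a copy of the loop in xlog_cil_push().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a checkpoint context. */
struct push_ctx {
	long sequence;
	long commit_lsn;	/* 0 while the commit record is unwritten */
};

/*
 * Recheck performed after every wakeup: true if a push of 'my_seq'
 * must go back to sleep because an earlier sequence has not committed.
 * Equal and higher sequences are skipped.
 */
bool earlier_push_pending(const struct push_ctx *ctxs, size_t n, long my_seq)
{
	assert(ctxs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (ctxs[i].sequence >= my_seq)
			continue;
		if (ctxs[i].commit_lsn == 0)
			return true;
	}
	return false;
}
```

However many times the wait queue is woken, the answer changes only when the state of the list changes, so in this model an extra wakeup costs one more scan and nothing else.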