linux-xfs.vger.kernel.org archive mirror
* [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
@ 2013-12-24 12:48 Jeff Liu
  2013-12-30 15:20 ` Mark Tinguely
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Liu @ 2013-12-24 12:48 UTC (permalink / raw)
  To: xfs@oss.sgi.com

From: Jie Liu <jeff.liu@oracle.com>

I can easily hit a hang while running fsstress and shutting down
XFS on an SSD with the test below:

for ((i=0;i<10;i++))
do
    echo "[$i] Fire up..."
    mount /dev/sda7 /xfs
    fsstress -d /xfs -n 1000 -p 100 >/dev/null 2>&1 &
    sleep 10
    godown /xfs
    wait
    killall -q fsstress
    umount /xfs
    echo "[$i] Done...."
    echo
done

which yields the following backtrace:

[  246.268987] INFO: task fsstress:3347 blocked for more than 120 seconds.
[  246.268992]       Tainted: PF          O 3.13.0-rc2+ #4
[  246.268994] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.268996] fsstress        D ffff88026f254440     0  3347   3284
<snip>
[  246.269013] Call Trace:
[  246.269022]  [<ffffffff816f3829>] schedule+0x29/0x70
[  246.269054]  [<ffffffffa0c4546b>] xlog_cil_force_lsn+0x1cb/0x220 [xfs]
[  246.269059]  [<ffffffff81097210>] ? wake_up_state+0x20/0x20
[  246.269064]  [<ffffffff811e9110>] ? do_fsync+0x80/0x80
[  246.269087]  [<ffffffffa0c43881>] _xfs_log_force+0x61/0x270 [xfs]
[  246.269091]  [<ffffffff8128b490>] ? jbd2_log_wait_commit+0x110/0x180
[  246.269095]  [<ffffffff810a83f0>] ? prepare_to_wait_event+0x100/0x100
[  246.269098]  [<ffffffff811e9110>] ? do_fsync+0x80/0x80
[  246.269120]  [<ffffffffa0c43ab6>] xfs_log_force+0x26/0x80 [xfs]
[  246.269139]  [<ffffffffa0bea31d>] xfs_fs_sync_fs+0x2d/0x50 [xfs]
[  246.269143]  [<ffffffff811e9130>] sync_fs_one_sb+0x20/0x30
[  246.269147]  [<ffffffff811bd5d2>] iterate_supers+0xb2/0x110
[  246.269150]  [<ffffffff811e9262>] sys_sync+0x62/0xa0
[  246.269156]  [<ffffffff816ffd6d>] system_call_fastpath+0x1a/0x1f
[  266.335154] XFS (sda7): xfs_log_force: error 5 returned.
[  296.400515] XFS (sda7): xfs_log_force: error 5 returned.

In xlog_cil_force_lsn(), if the task finds a previous sequence still
committing, it needs to wait until all of those previous sequence commits
complete, i.e., it blocks on the cil->xc_commit_wait wait queue.  In normal
situations, the ctx with a previous sequence will eventually commit and
wake up the tasks on cil->xc_commit_wait after getting a valid commit_lsn
(see xlog_cil_push()).  However, if something goes wrong during the commit,
e.g., XLOG_STATE_IOERROR is detected, the commit is aborted and the ctx is
simply removed from the cil->xc_committing list without waking up the
waiting tasks.  Hence the following race can happen:

Task1                               Task2

                                    list_add(&ctx->committing, &cil->xc_committing);

xlog_wait(&cil->xc_commit_wait..)
schedule()...

                                    Aborting!! list_del(&ctx->committing);
                                    wake_up_all(&cil->xc_commit_wait); <-- MISSING!

To fix this, wake up the waiters in xlog_cil_committed() when the commit
is aborted.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
---
 fs/xfs/xfs_log_cil.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index 5eb51fc..8c7e9c7 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -406,6 +406,8 @@ xlog_cil_committed(
 
 	spin_lock(&ctx->cil->xc_push_lock);
 	list_del(&ctx->committing);
+	if (abort)
+		wake_up_all(&ctx->cil->xc_commit_wait);
 	spin_unlock(&ctx->cil->xc_push_lock);
 
 	xlog_cil_free_logvec(ctx->lv_chain);
-- 
1.8.3.2
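
For readers who want to try the lost-wakeup pattern outside the kernel, here
is a minimal userspace sketch. It assumes a pthread mutex and condition
variable as stand-ins for xc_push_lock and xc_commit_wait; struct toy_cil and
all toy_* names are hypothetical and are not the XFS code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the CIL; not the kernel structure. */
struct toy_cil {
	pthread_mutex_t push_lock;	/* plays the role of xc_push_lock */
	pthread_cond_t commit_wait;	/* plays the role of xc_commit_wait */
	int committing;			/* contexts still on the "committing" list */
};

/* Waiter side: sleep until no context is still committing. */
static void *toy_force(void *arg)
{
	struct toy_cil *cil = arg;

	pthread_mutex_lock(&cil->push_lock);
	while (cil->committing > 0)
		pthread_cond_wait(&cil->commit_wait, &cil->push_lock);
	pthread_mutex_unlock(&cil->push_lock);
	return NULL;
}

/* Completion side: drop the context and, on abort, wake every waiter. */
static void toy_committed(struct toy_cil *cil, bool aborted)
{
	pthread_mutex_lock(&cil->push_lock);
	cil->committing--;
	if (aborted)
		pthread_cond_broadcast(&cil->commit_wait);
	pthread_mutex_unlock(&cil->push_lock);
}

/* Returns the contexts left after the waiter is released, or -1 on error. */
static int toy_lost_wakeup_demo(void)
{
	struct toy_cil cil;
	pthread_t waiter;

	pthread_mutex_init(&cil.push_lock, NULL);
	pthread_cond_init(&cil.commit_wait, NULL);
	cil.committing = 1;

	if (pthread_create(&waiter, NULL, toy_force, &cil) != 0)
		return -1;
	toy_committed(&cil, true);	/* the aborted commit */
	if (pthread_join(waiter, NULL) != 0)
		return -1;
	assert(cil.committing == 0);
	return cil.committing;
}
```

Build with -pthread and call toy_lost_wakeup_demo() from main(). If the
broadcast is dropped, a waiter that went to sleep before toy_committed() ran
never returns from pthread_cond_wait(), which is the userspace analogue of
the hung fsstress task in the backtrace.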

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
  2013-12-24 12:48 [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing Jeff Liu
@ 2013-12-30 15:20 ` Mark Tinguely
  2014-01-01 14:38   ` Jeff Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Tinguely @ 2013-12-30 15:20 UTC (permalink / raw)
  To: Jeff Liu; +Cc: xfs@oss.sgi.com

On 12/24/13 06:48, Jeff Liu wrote:
> <snip>
>
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index 5eb51fc..8c7e9c7 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -406,6 +406,8 @@ xlog_cil_committed(
>
>   	spin_lock(&ctx->cil->xc_push_lock);
>   	list_del(&ctx->committing);
> +	if (abort)
> +		wake_up_all(&ctx->cil->xc_commit_wait);
>   	spin_unlock(&ctx->cil->xc_push_lock);
>
>   	xlog_cil_free_logvec(ctx->lv_chain);

Hi Jeff, I hope you had a good break,

So you are saying the wakeup in the CIL push error path is missing?
I agree with that. But I don't like adding a new wake up to
xlog_cil_committed(), which is after the log buffer is written.

Thanks.

--Mark.

* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
  2013-12-30 15:20 ` Mark Tinguely
@ 2014-01-01 14:38   ` Jeff Liu
  2014-01-02  0:45     ` Dave Chinner
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Liu @ 2014-01-01 14:38 UTC (permalink / raw)
  To: Mark Tinguely; +Cc: xfs@oss.sgi.com

On 12/30 2013 23:20 PM, Mark Tinguely wrote:
> On 12/24/13 06:48, Jeff Liu wrote:
>> From: Jie Liu<jeff.liu@oracle.com>
>>
>> I can easily to hit a hang up while running fsstress and shutting down
>> XFS on SSD via the tests below:
<snip>
>>
>>     Task1                    Task2
>>
>>                      list_add(&ctx->committing,&cil->xc_committing);
>>
>> xlog_wait(&cil->xc_commit_wait..)
>> schedule()...
>>
>>                      Aborting!! list_del(&ctx->committing);
>>                      wake_up_all(&cil->xc_commit_wait);<-- MISSING!
>>
>> As a result, we should handle this situation in xlog_cil_committed().
>>
>> Signed-off-by: Jie Liu<jeff.liu@oracle.com>
>> ---
>>   fs/xfs/xfs_log_cil.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
>> index 5eb51fc..8c7e9c7 100644
>> --- a/fs/xfs/xfs_log_cil.c
>> +++ b/fs/xfs/xfs_log_cil.c
>> @@ -406,6 +406,8 @@ xlog_cil_committed(
>>
>>       spin_lock(&ctx->cil->xc_push_lock);
>>       list_del(&ctx->committing);
>> +    if (abort)
>> +        wake_up_all(&ctx->cil->xc_commit_wait);
>>       spin_unlock(&ctx->cil->xc_push_lock);
>>
>>       xlog_cil_free_logvec(ctx->lv_chain);
> 
> Hi Jeff, I hope you had a good break,
Thanks :)
> 
> So you are saying the wakeup in the CIL push error path missing?
Yes.

> I agree with that. But I don't like adding a new wake up to
> xlog_cil_committed(), which is after the log buffer is written.
IMO this callback is also called if a problem happens before
the log buffer is written, e.g.:
xlog_cil_push()->xfs_log_notify() <-- failed
			|
        		|->xlog_cil_committed()

Besides, the CTX is removed from the committing list here, but
there might still be waiters sleeping on it.

Thanks,
-Jeff

* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
  2014-01-01 14:38   ` Jeff Liu
@ 2014-01-02  0:45     ` Dave Chinner
  2014-01-03 10:25       ` Jeff Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2014-01-02  0:45 UTC (permalink / raw)
  To: Jeff Liu; +Cc: Mark Tinguely, xfs@oss.sgi.com

On Wed, Jan 01, 2014 at 10:38:36PM +0800, Jeff Liu wrote:
> On 12/30 2013 23:20 PM, Mark Tinguely wrote:
> > On 12/24/13 06:48, Jeff Liu wrote:
> >> <snip>
> > 
> > Hi Jeff, I hope you had a good break,
> Thanks :)
> > 
> > So you are saying the wakeup in the CIL push error path missing?
> Yes.
> 
> > I agree with that. But I don't like adding a new wake up to
> > xlog_cil_committed(), which is after the log buffer is written.

Hi Mark, any particular reason why you don't like this? It would be
great if you could explain why you don't like something up front so
we don't have to guess at your reasons or wait for another round
trip in the conversation to find them out....

> IMO this callback would be called if any problem is happened before
> the log buffer is written as well, e.g, 
> xlog_cil_push()->xfs_log_notify() <-- failed
> 			| 
>         		|->xlog_cil_committed()

Right, it's the generic CIL commit handler and it can be called
directly or from log IO completion.

The question is this: is it safe to wake up waiters from log IO
completion if that is where an abort is first triggered from (i.e.
on log IO error)? From what I can see, it is safe to do the wakeup
on abort because the iclog we attach the IO completion callback to
in xlog_cil_push() cannot be put under IO until we release the
reference gained in xfs_log_done().

But this raises an interesting question - the wakeup in
xlog_cil_push() is done before the log IO for the checkpoint is
complete, so the wakeup is occurring on checkpoint processing
completion, not iclog IO completion. i.e. the actual log force
sleeping still needs to wait for log IO completion to occur after
the CIL has been pushed. This occurs in the _xfs_log_force{_lsn}()
wrappers, where iclog state changes are waited for.

Why is this important? The iclog write/flush wakeups are all done
from IO completion context, except for the force shutdown case,
which calls xlog_state_do_callback(log, XFS_LI_ABORTED, NULL); to
trigger wakeups and aborts via the log IO completion callbacks on
all the outstanding iclogs.

IOWs, we've already got a design pattern that says:

	- run log force wakeups from IO completions
	- on shutdown, run IO completions directly to abort pending
	  log operations

So, really, issuing wakeups from iclog IO completion on log aborts
or errors is exactly what we currently do to ensure that shutdowns
don't leave processes waiting on log force completion behind. So
from that perspective, adding the wakeup on abort to
xlog_cil_committed() seems like the right approach to take.
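
The design pattern described above can be sketched in userspace as follows.
This is only an illustration: the toy_* types, the TOY_ABORTED flag and the
counters are hypothetical and do not mirror the real iclog structures. It
shows one callback list being run either from IO completion or directly on
shutdown, with the abort state passed through to the handler.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ABORTED 1	/* hypothetical flag, in the spirit of XFS_LI_ABORTED */

struct toy_callback {
	struct toy_callback *next;
	void (*func)(void *arg, int aborted);
	void *arg;
};

struct toy_iclog {
	struct toy_callback *callbacks;
};

struct toy_stats {
	int completed;		/* times the handler ran */
	int wakeups_on_abort;	/* times it would have woken waiters */
};

/* Stand-in for a commit handler: wake waiters only when aborted. */
static void toy_cb_committed(void *arg, int aborted)
{
	struct toy_stats *stats = arg;

	stats->completed++;
	if (aborted)
		stats->wakeups_on_abort++;
}

/* Detach and run every callback, passing the abort state through. */
static void toy_run_callbacks(struct toy_iclog *iclog, int aborted)
{
	struct toy_callback *cb = iclog->callbacks;

	iclog->callbacks = NULL;
	while (cb) {
		struct toy_callback *next = cb->next;

		cb->func(cb->arg, aborted);
		cb = next;
	}
}

/* Normal path: callbacks run from IO completion. */
static void toy_io_done(struct toy_iclog *iclog, int io_error)
{
	toy_run_callbacks(iclog, io_error ? TOY_ABORTED : 0);
}

/* Shutdown path: run the same callbacks directly, as aborted. */
static void toy_force_shutdown(struct toy_iclog *iclog)
{
	toy_run_callbacks(iclog, TOY_ABORTED);
}

/* Shut down with one pending callback, then let the "IO" complete late. */
static struct toy_stats toy_shutdown_demo(void)
{
	struct toy_stats stats = { 0, 0 };
	struct toy_callback cb = { NULL, toy_cb_committed, &stats };
	struct toy_iclog iclog = { &cb };

	toy_force_shutdown(&iclog);
	assert(iclog.callbacks == NULL);
	toy_io_done(&iclog, 1);		/* nothing left to run */
	return stats;
}
```

Because the list is detached before it is run, the handler fires exactly once
whichever path gets there first, so the sleeping side needs only one wakeup
vector.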

Actually, there are more issues here: xlog_cil_push() leaks a
reference to the iclog when it triggers the error path via
xfs_log_notify() failure. At this point we always need to release
the iclog. Hence if xfs_log_notify() were to always add the IO
completion to the iclog and xlog_cil_committed() issued wakeups on
abort errors, then we could completely ignore the log state in
xfs_log_notify() and have xfs_log_release_iclog() capture the IO
error and the subsequent shutdown would handle the aborts and
wakeups....

Hmmm, then xfs_log_notify could go away, and the callback list could
be made a lockless list and the ic_callback_lock could go away,
too....

> Besides, the CTX will be removed from the committing list here but
> there might still have waiters sleeping on it.

Right, if we get an abort from log IO completion, then we may not
have any other wakeup vector that can be triggered. Triggering them
from IO completion ensures that even forced shutdowns have a trigger
for wakeups...

So, for the hang issue, I think the minimal patch is OK, but we
should look to clean up the logic and fix the leaked iclog reference
on xfs_log_notify() failure as well.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
  2014-01-02  0:45     ` Dave Chinner
@ 2014-01-03 10:25       ` Jeff Liu
  2014-01-03 13:17         ` Jeff Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Liu @ 2014-01-03 10:25 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Mark Tinguely, xfs@oss.sgi.com

On 01/02 2014 08:45, Dave Chinner wrote:
> On Wed, Jan 01, 2014 at 10:38:36PM +0800, Jeff Liu wrote:
>> On 12/30 2013 23:20 PM, Mark Tinguely wrote:
>>> On 12/24/13 06:48, Jeff Liu wrote:
>>>> <snip>
>>>
>>> Hi Jeff, I hope you had a good break,
>> Thanks :)
>>>
>>> So you are saying the wakeup in the CIL push error path missing?
>> Yes.
>>
>>> I agree with that. But I don't like adding a new wake up to
>>> xlog_cil_committed(), which is after the log buffer is written.
> 
> Hi Mark, any particular reason why you don't like this? It would be
> great if you could explain why you don't like something up front so
> we don't have to guess at your reasons or wait for another round
> trip in the conversation to find them out....
> 
>> IMO this callback would be called if any problem is happened before
>> the log buffer is written as well, e.g, 
>> xlog_cil_push()->xfs_log_notify() <-- failed
>> 			| 
>>         		|->xlog_cil_committed()
> 
> Right, it's the generic CIL commit handler and it can be called
> directly or from log IO completion.
> 
> The question is this: it is safe to wake up waiters from log IO
> completion if that is where an abort is first triggered from (i.e.
> on log IO error). From what I can see, it is safe to do the wakeup
> on abort because the iclog iwe attach the IO completion callback to
> in xlog_cil_push() cannot be put under IO until we release the
> reference gained in xfs_log_done().
> 
> But this raises an interesting question - the wakeup in
> xlog_cil_push() is done before the log IO for the checkpoint is
> complete, so the wakeup is occurring on checkpoint processing
> completion, not iclog IO completion. i.e. the actual log force
> sleeping still needs to wait for log IO completion to occur after
> then CIL has been pushed. This occurs in the _xfs_log_force{_lsn}()
> wrappers, where iclog state changes are waited for.
> 
> Why is this important? The iclog write/flush wakeups are all done
> from IO completion context, except for the force shutdown case,
> which calls xlog_state_do_callback(log, XFS_LI_ABORTED, NULL); to
> trigger wakeups and aborts via the log IO completion callbacks on
> all the outstanding iclogs.
> 
> IOWs, we've already got a design pattern that says:
> 
> 	- run log force wakeups from IO completions
> 	- on shutdown, run IO completions directly to abort pending
> 	  log operations
> 
> So, really, issuing wakeups from iclog IO completion on log aborts
> or errors is exactly what we currently do to ensure that shutdowns
> don't leave processes waiting on log force completion behind. So
> from that perspective, adding the wakeup on abort to
> xlog_cil_committed() seems like the right approach to take.
> 
> Actually, there's more issues here: xlog_cil_push() leaks a
> reference to the iclog when it triggers the error path via
> xfs_log_notify() failure. At this point we always need to release
> the iclog. Hence if xfs_log_notify() were to always add the IO
> completion to the iclog and xlog_cil_committed() issued wakeups on
> abort errors, then we could completely ignore the log state in
> xfs_log_notify() and have xfs_log_release_iclog() capture the IO
> error and the subsequent shutdown would handle the aborts and
> wakeups....

After digging into the code, there is indeed an iclog ref leak.
> 
> Hmmm, then xfs_log_notify could go away, and the callback list could
> be made a lockless list and the ic_callback_lock could go away,
> too....

Hence we can fold xfs_log_notify() into xlog_cil_push() directly, but I am
not sure I get the reason why we could make the callback list lockless.
When attaching the IO completion callback to the iclog, we assert that the
iclog state is XLOG_STATE_ACTIVE or XLOG_STATE_WANT_SYNC, but in the other
place where we also take the ic_callback_lock, i.e.,
xlog_state_do_callback(), we only perform callbacks for iclogs that are in
XLOG_STATE_DONE_SYNC or XLOG_STATE_DO_CALLBACK, so the two are already
protected from racing with each other. Do I understand that correctly?

Also, it seems like iclog->ic_callback_tail can go away as well,
since it only serves as an lvalue.

Thanks,
-Jeff

* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
  2014-01-03 10:25       ` Jeff Liu
@ 2014-01-03 13:17         ` Jeff Liu
  2014-01-03 15:30           ` Mark Tinguely
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff Liu @ 2014-01-03 13:17 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Mark Tinguely, xfs@oss.sgi.com

On 01/03 2014 18:25 PM, Jeff Liu wrote:
> On 01/02 2014 08:45, Dave Chinner wrote:
>> On Wed, Jan 01, 2014 at 10:38:36PM +0800, Jeff Liu wrote:
>>> On 12/30 2013 23:20 PM, Mark Tinguely wrote:
>>>> On 12/24/13 06:48, Jeff Liu wrote:
>>>>> <snip>
>>>>
>>>> Hi Jeff, I hope you had a good break,
>>> Thanks :)
>>>>
>>>> So you are saying the wakeup in the CIL push error path missing?
>>> Yes.
>>>
>>>> I agree with that. But I don't like adding a new wake up to
>>>> xlog_cil_committed(), which is after the log buffer is written.
>>
>> Hi Mark, any particular reason why you don't like this? It would be
>> great if you could explain why you don't like something up front so
>> we don't have to guess at your reasons or wait for another round
>> trip in the conversation to find them out....
>>
>>> IMO this callback would be called if any problem is happened before
>>> the log buffer is written as well, e.g, 
>>> xlog_cil_push()->xfs_log_notify() <-- failed
>>> 			| 
>>>         		|->xlog_cil_committed()
>>
>> Right, it's the generic CIL commit handler and it can be called
>> directly or from log IO completion.
>>
>> The question is this: it is safe to wake up waiters from log IO
>> completion if that is where an abort is first triggered from (i.e.
>> on log IO error). From what I can see, it is safe to do the wakeup
>> on abort because the iclog iwe attach the IO completion callback to
>> in xlog_cil_push() cannot be put under IO until we release the
>> reference gained in xfs_log_done().
>>
>> But this raises an interesting question - the wakeup in
>> xlog_cil_push() is done before the log IO for the checkpoint is
>> complete, so the wakeup is occurring on checkpoint processing
>> completion, not iclog IO completion. i.e. the actual log force
>> sleeping still needs to wait for log IO completion to occur after
>> then CIL has been pushed. This occurs in the _xfs_log_force{_lsn}()
>> wrappers, where iclog state changes are waited for.
>>
>> Why is this important? The iclog write/flush wakeups are all done
>> from IO completion context, except for the force shutdown case,
>> which calls xlog_state_do_callback(log, XFS_LI_ABORTED, NULL); to
>> trigger wakeups and aborts via the log IO completion callbacks on
>> all the outstanding iclogs.
>>
>> IOWs, we've already got a design pattern that says:
>>
>> 	- run log force wakeups from IO completions
>> 	- on shutdown, run IO completions directly to abort pending
>> 	  log operations
>>
>> So, really, issuing wakeups from iclog IO completion on log aborts
>> or errors is exactly what we currently do to ensure that shutdowns
>> don't leave processes waiting on log force completion behind. So
>> from that perspective, adding the wakeup on abort to
>> xlog_cil_committed() seems like the right approach to take.
>>
>> Actually, there's more issues here: xlog_cil_push() leaks a
>> reference to the iclog when it triggers the error path via
>> xfs_log_notify() failure. At this point we always need to release
>> the iclog. Hence if xfs_log_notify() were to always add the IO
>> completion to the iclog and xlog_cil_committed() issued wakeups on
>> abort errors, then we could completely ignore the log state in
>> xfs_log_notify() and have xfs_log_release_iclog() capture the IO
>> error and the subsequent shutdown would handle the aborts and
>> wakeups....
> 
> There is indeed an iclog ref leak after digging into the code.
>>
>> Hmmm, then xfs_log_notify could go away, and the callback list could
>> be made a lockless list and the ic_callback_lock could go away,
>> too....
> 
> Hence we can fold xfs_log_notify() into xlog_cil_push() directly, but am
> not sure I get the reason why we could make the callback list lockless:
> When attaching the IO completion callback to iclog, we assert the iclog
> state to be XLOG_STATE_ACTIVE or XLOG_STATE_WANT_SYNC, but in the other
> place where we also try to get the ic_callback_lock, i.e, 
> xlog_state_do_callback(),  we only perform callbacks for iclogs that in
> XLOG_STATE_DONE_SYNC or in XLOG_STATE_DO_CALLBACK, so they're already
> prevented from the potential race situations, am I understood correctly?
> 
> Also, it seems like the iclog->ic_callback_tail can go away as well,
> since it only serves as a left value.
> 

Oh, no! I got ic_callback_tail wrong... It's used to attach the func to the
tail of the callback list.

But IMHO, since the current code seems to attach only one callback to the
iclog (xlog_cil_committed()), "iclog->ic_callback" alone could handle it,
provided no more callbacks are added in the future...
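
For reference, the general tail-pointer idiom for a singly linked callback
list looks roughly like the sketch below. It is a userspace illustration with
hypothetical toy_* names, not a copy of the iclog code: the tail member holds
the address of the pointer to fill in next, so appending is O(1) and needs no
special case for an empty list.

```c
#include <assert.h>
#include <stddef.h>

struct toy_cb {
	struct toy_cb *next;
	int id;
};

struct toy_cb_list {
	struct toy_cb *head;	/* first callback, or NULL */
	struct toy_cb **tail;	/* address of the pointer to fill in next */
};

static void toy_cb_list_init(struct toy_cb_list *list)
{
	list->head = NULL;
	list->tail = &list->head;
}

/* Append in O(1): write through the tail pointer, then advance it. */
static void toy_cb_list_append(struct toy_cb_list *list, struct toy_cb *cb)
{
	assert(cb != NULL);
	cb->next = NULL;
	*list->tail = cb;
	list->tail = &cb->next;
}

/* Encodes the visiting order as decimal digits: ids 1,2,3 give 123. */
static int toy_cb_list_order(const struct toy_cb_list *list)
{
	int order = 0;

	for (const struct toy_cb *cb = list->head; cb; cb = cb->next)
		order = order * 10 + cb->id;
	return order;
}
```

With only a single callback ever attached, head alone would do; the tail
pointer only pays off once several callbacks have to run in attach order.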

Thanks,
-Jeff

* Re: [PATCH 1/4] xfs: wake up cil->xc_commit_wait while removing ctx from cil->xc_committing
  2014-01-03 13:17         ` Jeff Liu
@ 2014-01-03 15:30           ` Mark Tinguely
  0 siblings, 0 replies; 7+ messages in thread
From: Mark Tinguely @ 2014-01-03 15:30 UTC (permalink / raw)
  To: Jeff Liu; +Cc: xfs@oss.sgi.com

On 1/3/2014 7:17 AM, Jeff Liu wrote:
> On 01/03 2014 18:25 PM, Jeff Liu wrote:
>> On 01/02 2014 08:45, Dave Chinner wrote:
>>> On Wed, Jan 01, 2014 at 10:38:36PM +0800, Jeff Liu wrote:
>>>> On 12/30 2013 23:20 PM, Mark Tinguely wrote:
>>>>> On 12/24/13 06:48, Jeff Liu wrote:
>>>>>> <snip>
>>>>> Hi Jeff, I hope you had a good break,
>>>> Thanks :)
>>>>> So you are saying the wakeup in the CIL push error path missing?
>>>> Yes.
>>>>
>>>>> I agree with that. But I don't like adding a new wake up to
>>>>> xlog_cil_committed(), which is after the log buffer is written.
>>> Hi Mark, any particular reason why you don't like this? It would be
>>> great if you could explain why you don't like something up front so
>>> we don't have to guess at your reasons or wait for another round

My concern is consistency: with the patch there will be two paths that
could do the wake up.

Originally, the wakeup happened before the iclog write.  With the patch,
if the cil push sequence successfully wrote its ticket, woke up the
waiters, wrote back the iclog, and then had an error writing the iclog, it
would wake up the xc_commit_wait a second time.  Not too drastic of a
problem, because the zeroed commit_lsn will prevent a premature write of
the next cil push. I just prefer to handle the error in the cil push
routine and avoid a second form of wake up.
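
The reason a second wakeup is tolerable can be sketched as a predicate that
every woken task re-evaluates. This is an illustration under assumptions:
struct toy_ctx and toy_must_wait() are hypothetical, and the rule that a push
waits while any earlier sequence still has a zero commit_lsn is my reading of
the behaviour described above, not a quote of the kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct toy_ctx {
	long sequence;
	long commit_lsn;	/* 0 means "commit record not written yet" */
};

/*
 * Returns 1 if a push of 'sequence' must keep sleeping, i.e. some earlier
 * context on the committing list has no commit_lsn yet; otherwise 0.
 */
static int toy_must_wait(const struct toy_ctx *committing, size_t n,
			 long sequence)
{
	assert(committing != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (committing[i].sequence >= sequence)
			continue;	/* not an earlier checkpoint */
		if (committing[i].commit_lsn == 0)
			return 1;	/* still waiting on this one */
	}
	return 0;
}
```

A redundant wakeup only makes the woken task run this scan once more. As
long as the earlier context still has a zero commit_lsn, the answer is
unchanged and the task goes back to sleep.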

--Mark.

