From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Eric Sandeen <sandeen@redhat.com>,
Mark Tinguely <tinguely@sgi.com>, Boris Ranto <branto@redhat.com>,
xfs-oss <xfs@oss.sgi.com>
Subject: Re: [PATCH] xfs: test for shut down fs in xfs_dir_fsync()
Date: Tue, 29 Apr 2014 08:56:49 -0400 [thread overview]
Message-ID: <20140429125648.GA59046@bfoster.bfoster> (raw)
In-Reply-To: <20140428221849.GC18672@dastard>
On Tue, Apr 29, 2014 at 08:18:49AM +1000, Dave Chinner wrote:
> On Mon, Apr 28, 2014 at 04:39:50PM -0500, Mark Tinguely wrote:
> > On 04/28/14 15:54, Dave Chinner wrote:
> > >On Mon, Apr 28, 2014 at 11:35:16AM -0500, Eric Sandeen wrote:
...
>
> xfs: don't sleep in xlog_cil_force_lsn on shutdown
>
> From: Dave Chinner <dchinner@redhat.com>
>
> Reports of a shutdown hang when fsyncing a directory have surfaced,
> such as this:
>
> [ 3663.394472] Call Trace:
> [ 3663.397199] [<ffffffff815f1889>] schedule+0x29/0x70
> [ 3663.402743] [<ffffffffa01feda5>] xlog_cil_force_lsn+0x185/0x1a0 [xfs]
> [ 3663.416249] [<ffffffffa01fd3af>] _xfs_log_force_lsn+0x6f/0x2f0 [xfs]
> [ 3663.429271] [<ffffffffa01a339d>] xfs_dir_fsync+0x7d/0xe0 [xfs]
> [ 3663.435873] [<ffffffff811df8c5>] do_fsync+0x65/0xa0
> [ 3663.441408] [<ffffffff811dfbc0>] SyS_fsync+0x10/0x20
> [ 3663.447043] [<ffffffff815fc7d9>] system_call_fastpath+0x16/0x1b
>
> If we trigger a shutdown in xlog_cil_push() from xlog_write(), we
> will never wake waiters on the current push sequence number, so
> anything waiting in xlog_cil_force_lsn() for that push sequence
> number to come up will not get woken and hence stall the shutdown.
>
> Fix this by ensuring we call wake_up_all(&cil->xc_commit_wait) in
> the push abort handling, in the log shutdown code when waking all
> waiters, and adding a shutdown check in the sequence completion wait
> loops to ensure they abort when a wakeup due to a shutdown occurs.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_log.c | 7 +++++--
> fs/xfs/xfs_log_cil.c | 50 ++++++++++++++++++++++++++++++++++++++++++--------
> 2 files changed, 47 insertions(+), 10 deletions(-)
>
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index a5f8bd9..dbba2d7 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -3952,11 +3952,14 @@ xfs_log_force_umount(
> retval = xlog_state_ioerror(log);
> spin_unlock(&log->l_icloglock);
> }
> +
> /*
> - * Wake up everybody waiting on xfs_log_force.
> - * Callback all log item committed functions as if the
> + * Wake up everybody waiting on xfs_log_force. This needs to wake anyone
> + * waiting on a CIL push that is issued as part of a log force first
> + * before running the log item committed callback functions as if the
> * log writes were completed.
> */
> + wake_up_all(&log->l_cilp->xc_commit_wait);
> xlog_state_do_callback(log, XFS_LI_ABORTED, NULL);
>
Why is this necessary? It looks like xlog_state_do_callback() will hit
the xlog_cil_committed() callback with the aborted flag...
> #ifdef XFSERRORDEBUG
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index 7e54553..3a68ddf 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -385,7 +385,15 @@ xlog_cil_committed(
> xfs_extent_busy_clear(mp, &ctx->busy_extents,
> (mp->m_flags & XFS_MOUNT_DISCARD) && !abort);
>
> + /*
> + * If we are aborting the commit, wake up anyone waiting on the
> + * committing list. If we don't, then a shutdown we can leave processes
> + * waiting in xlog_cil_force_lsn() waiting on a sequence commit that
> + * will never happen because we aborted it.
> + */
> spin_lock(&ctx->cil->xc_push_lock);
> + if (abort)
> + wake_up_all(&cil->xc_commit_wait);
There's a compile error here. The parameter should be '&ctx->cil->...'
Brian
> list_del(&ctx->committing);
> spin_unlock(&ctx->cil->xc_push_lock);
>
> @@ -564,8 +572,18 @@ restart:
> spin_lock(&cil->xc_push_lock);
> list_for_each_entry(new_ctx, &cil->xc_committing, committing) {
> /*
> + * Avoid getting stuck in this loop because we were woken by the
> + * shutdown, but then went back to sleep once already in the
> + * shutdown state.
> + */
> + if (XLOG_FORCED_SHUTDOWN(log)) {
> + spin_unlock(&cil->xc_push_lock);
> + goto out_abort_free_ticket;
> + }
> +
> + /*
> * Higher sequences will wait for this one so skip them.
> - * Don't wait for own own sequence, either.
> + * Don't wait for our own sequence, either.
> */
> if (new_ctx->sequence >= ctx->sequence)
> continue;
> @@ -810,6 +828,13 @@ restart:
> */
> spin_lock(&cil->xc_push_lock);
> list_for_each_entry(ctx, &cil->xc_committing, committing) {
> + /*
> + * Avoid getting stuck in this loop because we were woken by the
> + * shutdown, but then went back to sleep once already in the
> + * shutdown state.
> + */
> + if (XLOG_FORCED_SHUTDOWN(log))
> + goto out_shutdown;
> if (ctx->sequence > sequence)
> continue;
> if (!ctx->commit_lsn) {
> @@ -833,14 +858,12 @@ restart:
> * push sequence after the above wait loop and the CIL still contains
> * dirty objects.
> *
> - * When the push occurs, it will empty the CIL and
> - * atomically increment the currect sequence past the push sequence and
> - * move it into the committing list. Of course, if the CIL is clean at
> - * the time of the push, it won't have pushed the CIL at all, so in that
> - * case we should try the push for this sequence again from the start
> - * just in case.
> + * When the push occurs, it will empty the CIL and atomically increment
> + * the currect sequence past the push sequence and move it into the
> + * committing list. Of course, if the CIL is clean at the time of the
> + * push, it won't have pushed the CIL at all, so in that case we should
> + * try the push for this sequence again from the start just in case.
> */
> -
> if (sequence == cil->xc_current_sequence &&
> !list_empty(&cil->xc_cil)) {
> spin_unlock(&cil->xc_push_lock);
> @@ -849,6 +872,17 @@ restart:
>
> spin_unlock(&cil->xc_push_lock);
> return commit_lsn;
> +
> + /*
> + * We detected a shutdown in progress. We need to trigger the log force
> + * to pass through it's iclog state machine error handling, even though
> + * we are already in a shutdown state. Hence we can't return
> + * NULLCOMMITLSN here as that has special meaning to log forces (i.e.
> + * LSN is already stable), so we return a zero LSN instead.
> + */
> +out_shutdown:
> + spin_unlock(&cil->xc_push_lock);
> + return 0;
> }
>
> /*
>
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2014-04-29 12:57 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-28 16:35 [PATCH] xfs: test for shut down fs in xfs_dir_fsync() Eric Sandeen
2014-04-28 16:47 ` Christoph Hellwig
2014-04-28 17:18 ` Eric Sandeen
2014-04-28 17:22 ` Mark Tinguely
2014-04-28 17:26 ` Eric Sandeen
2014-04-28 17:49 ` Mark Tinguely
2014-04-28 17:53 ` Eric Sandeen
2014-04-29 10:24 ` Boris Ranto
2014-04-28 20:54 ` Dave Chinner
2014-04-28 21:39 ` Mark Tinguely
2014-04-28 22:18 ` Dave Chinner
2014-04-28 23:00 ` Mark Tinguely
2014-04-29 12:56 ` Brian Foster [this message]
2014-04-29 20:12 ` Dave Chinner
2014-07-21 15:33 ` Eric Sandeen
2014-09-12 19:29 ` Eric Sandeen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140429125648.GA59046@bfoster.bfoster \
--to=bfoster@redhat.com \
--cc=branto@redhat.com \
--cc=david@fromorbit.com \
--cc=sandeen@redhat.com \
--cc=tinguely@sgi.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.