From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 4/6] xfs: log shutdown triggers should only shut down the log
Date: Mon, 28 Mar 2022 17:14:18 -0700 [thread overview]
Message-ID: <20220329001418.GC27713@magnolia> (raw)
In-Reply-To: <20220324002103.710477-5-david@fromorbit.com>
On Thu, Mar 24, 2022 at 11:21:01AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
>
> We've got a mess on our hands.
>
> 1. xfs_trans_commit() cannot cancel transactions because the mount is
> shut down - that causes dirty, aborted, unlogged log items to sit
> unpinned in memory and potentially get written to disk before the
> log is shut down. Hence xfs_trans_commit() can only abort
> transactions when xlog_is_shutdown() is true.
>
> 2. xfs_force_shutdown() is used in places to cause the current
> modification to be aborted via xfs_trans_commit() because it may be
> impractical or impossible to cancel the transaction directly, and
> hence xfs_trans_commit() must cancel transactions when
> xfs_is_shutdown() is true in this situation. But we can't do that
> because of #1.
>
> 3. Log IO errors cause log shutdowns by calling xfs_force_shutdown()
> to shut down the mount and then the log from log IO completion.
>
> 4. xfs_force_shutdown() can result in a log force being issued,
> which has to wait for log IO completion before it will mark the log
> as shut down. If #3 races with some other shutdown trigger that runs
> a log force, we rely on xfs_force_shutdown() silently ignoring #3
> and avoiding shutting down the log until the failed log force
> completes.
>
> 5. To ensure #2 always works, we have to ensure that
> xfs_force_shutdown() does not return until the the log is shut down.
> But in the case of #4, this will result in a deadlock because the
> log Io completion will block waiting for a log force to complete
> which is blocked waiting for log IO to complete....
>
> So the very first thing we have to do here to untangle this mess is
> dissociate log shutdown triggers from mount shutdowns. We already
> have xlog_forced_shutdown, which will atomically transistion to the
> log a shutdown state. Due to internal asserts it cannot be called
> multiple times, but was done simply because the only place that
> could call it was xfs_do_force_shutdown() (i.e. the mount shutdown!)
> and that could only call it once and once only. So the first thing
> we do is remove the asserts.
>
> We then convert all the internal log shutdown triggers to call
> xlog_force_shutdown() directly instead of xfs_force_shutdown(). This
> allows the log shutdown triggers to shut down the log without
> needing to care about mount based shutdown constraints. This means
> we shut down the log independently of the mount and the mount may
> not notice this until it's next attempt to read or modify metadata.
> At that point (e.g. xfs_trans_commit()) it will see that the log is
> shutdown, error out and shutdown the mount.
>
> To ensure that all the unmount behaviours and asserts track
> correctly as a result of a log shutdown, propagate the shutdown up
> to the mount if it is not already set. This keeps the mount and log
> state in sync, and saves a huge amount of hassle where code fails
> because of a log shutdown but only checks for mount shutdowns and
> hence ends up doing the wrong thing. Cleaning up that mess is
> an exercise for another day.
>
> This enables us to address the other problems noted above in
> followup patches.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
> fs/xfs/xfs_log.c | 32 +++++++++++++++++++++++---------
> fs/xfs/xfs_log_cil.c | 4 ++--
> fs/xfs/xfs_log_recover.c | 6 +++---
> fs/xfs/xfs_mount.c | 1 +
> fs/xfs/xfs_trans_ail.c | 8 ++++----
> 5 files changed, 33 insertions(+), 18 deletions(-)
>
> diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
> index 388d53df7620..6879d6d19d68 100644
> --- a/fs/xfs/xfs_log.c
> +++ b/fs/xfs/xfs_log.c
> @@ -1360,7 +1360,7 @@ xlog_ioend_work(
> */
> if (XFS_TEST_ERROR(error, log->l_mp, XFS_ERRTAG_IODONE_IOERR)) {
> xfs_alert(log->l_mp, "log I/O error %d", error);
> - xfs_force_shutdown(log->l_mp, SHUTDOWN_LOG_IO_ERROR);
> + xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> }
>
> xlog_state_done_syncing(iclog);
> @@ -1899,7 +1899,7 @@ xlog_write_iclog(
> iclog->ic_flags &= ~(XLOG_ICL_NEED_FLUSH | XLOG_ICL_NEED_FUA);
>
> if (xlog_map_iclog_data(&iclog->ic_bio, iclog->ic_data, count)) {
> - xfs_force_shutdown(log->l_mp, SHUTDOWN_LOG_IO_ERROR);
> + xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> return;
> }
> if (is_vmalloc_addr(iclog->ic_data))
> @@ -2474,7 +2474,7 @@ xlog_write(
> xfs_alert_tag(log->l_mp, XFS_PTAG_LOGRES,
> "ctx ticket reservation ran out. Need to up reservation");
> xlog_print_tic_res(log->l_mp, ticket);
> - xfs_force_shutdown(log->l_mp, SHUTDOWN_LOG_IO_ERROR);
> + xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> }
>
> len = xlog_write_calc_vec_length(ticket, log_vector, optype);
> @@ -3808,9 +3808,10 @@ xlog_verify_iclog(
> #endif
>
> /*
> - * Perform a forced shutdown on the log. This should be called once and once
> - * only by the high level filesystem shutdown code to shut the log subsystem
> - * down cleanly.
> + * Perform a forced shutdown on the log.
> + *
> + * This can be called from low level log code to trigger a shutdown, or from the
> + * high level mount shutdown code when the mount shuts down.
> *
> * Our main objectives here are to make sure that:
> * a. if the shutdown was not due to a log IO error, flush the logs to
> @@ -3819,6 +3820,8 @@ xlog_verify_iclog(
> * parties to find out. Nothing new gets queued after this is done.
> * c. Tasks sleeping on log reservations, pinned objects and
> * other resources get woken up.
> + * d. The mount is also marked as shut down so that log triggered shutdowns
> + * still behave the same as if they called xfs_forced_shutdown().
> *
> * Return true if the shutdown cause was a log IO error and we actually shut the
> * log down.
> @@ -3837,8 +3840,6 @@ xlog_force_shutdown(
> if (!log || xlog_in_recovery(log))
> return false;
>
> - ASSERT(!xlog_is_shutdown(log));
> -
> /*
> * Flush all the completed transactions to disk before marking the log
> * being shut down. We need to do this first as shutting down the log
> @@ -3865,11 +3866,24 @@ xlog_force_shutdown(
> spin_lock(&log->l_icloglock);
> if (test_and_set_bit(XLOG_IO_ERROR, &log->l_opstate)) {
> spin_unlock(&log->l_icloglock);
> - ASSERT(0);
> return false;
> }
> spin_unlock(&log->l_icloglock);
>
> + /*
> + * If this log shutdown also sets the mount shutdown state, issue a
> + * shutdown warning message.
> + */
> + if (!test_and_set_bit(XFS_OPSTATE_SHUTDOWN, &log->l_mp->m_opstate)) {
> + xfs_alert_tag(log->l_mp, XFS_PTAG_SHUTDOWN_LOGERROR,
> + "Filesystem has been shut down due to log error (0x%x).",
> + shutdown_flags);
> + xfs_alert(log->l_mp,
> + "Please unmount the filesystem and rectify the problem(s)");
Style nit: Can we make these strings have a different level of indent
than the enclosing function call?
Other than that, I think it makes a lot of sense that xfs_log* functions
should call the xlog shutdown procedure, not the one in the fs.
With the style nit fixed,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> + if (xfs_error_level >= XFS_ERRLEVEL_HIGH)
> + xfs_stack_trace();
> + }
> +
> /*
> * We don't want anybody waiting for log reservations after this. That
> * means we have to wake up everybody queued up on reserveq as well as
> diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
> index 43006f6f5ab8..ba57323bfdce 100644
> --- a/fs/xfs/xfs_log_cil.c
> +++ b/fs/xfs/xfs_log_cil.c
> @@ -540,7 +540,7 @@ xlog_cil_insert_items(
> spin_unlock(&cil->xc_cil_lock);
>
> if (tp->t_ticket->t_curr_res < 0)
> - xfs_force_shutdown(log->l_mp, SHUTDOWN_LOG_IO_ERROR);
> + xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> }
>
> static void
> @@ -864,7 +864,7 @@ xlog_cil_write_commit_record(
>
> error = xlog_write(log, ctx, &vec, ctx->ticket, XLOG_COMMIT_TRANS);
> if (error)
> - xfs_force_shutdown(log->l_mp, SHUTDOWN_LOG_IO_ERROR);
> + xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> return error;
> }
>
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 7758a6706b8c..c4ad4296c540 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2485,7 +2485,7 @@ xlog_finish_defer_ops(
> error = xfs_trans_alloc(mp, &resv, dfc->dfc_blkres,
> dfc->dfc_rtxres, XFS_TRANS_RESERVE, &tp);
> if (error) {
> - xfs_force_shutdown(mp, SHUTDOWN_LOG_IO_ERROR);
> + xlog_force_shutdown(mp->m_log, SHUTDOWN_LOG_IO_ERROR);
> return error;
> }
>
> @@ -3454,7 +3454,7 @@ xlog_recover_finish(
> */
> xlog_recover_cancel_intents(log);
> xfs_alert(log->l_mp, "Failed to recover intents");
> - xfs_force_shutdown(log->l_mp, SHUTDOWN_LOG_IO_ERROR);
> + xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> return error;
> }
>
> @@ -3501,7 +3501,7 @@ xlog_recover_finish(
> * end of intents processing can be pushed through the CIL
> * and AIL.
> */
> - xfs_force_shutdown(log->l_mp, SHUTDOWN_LOG_IO_ERROR);
> + xlog_force_shutdown(log, SHUTDOWN_LOG_IO_ERROR);
> }
>
> return 0;
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index bed73e8002a5..4e1aa4240b61 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -21,6 +21,7 @@
> #include "xfs_trans.h"
> #include "xfs_trans_priv.h"
> #include "xfs_log.h"
> +#include "xfs_log_priv.h"
> #include "xfs_error.h"
> #include "xfs_quota.h"
> #include "xfs_fsops.h"
> diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
> index c2ccb98c7bcd..d3a97a028560 100644
> --- a/fs/xfs/xfs_trans_ail.c
> +++ b/fs/xfs/xfs_trans_ail.c
> @@ -873,17 +873,17 @@ xfs_trans_ail_delete(
> int shutdown_type)
> {
> struct xfs_ail *ailp = lip->li_ailp;
> - struct xfs_mount *mp = ailp->ail_log->l_mp;
> + struct xlog *log = ailp->ail_log;
> xfs_lsn_t tail_lsn;
>
> spin_lock(&ailp->ail_lock);
> if (!test_bit(XFS_LI_IN_AIL, &lip->li_flags)) {
> spin_unlock(&ailp->ail_lock);
> - if (shutdown_type && !xlog_is_shutdown(ailp->ail_log)) {
> - xfs_alert_tag(mp, XFS_PTAG_AILDELETE,
> + if (shutdown_type && !xlog_is_shutdown(log)) {
> + xfs_alert_tag(log->l_mp, XFS_PTAG_AILDELETE,
> "%s: attempting to delete a log item that is not in the AIL",
> __func__);
> - xfs_force_shutdown(mp, shutdown_type);
> + xlog_force_shutdown(log, shutdown_type);
> }
> return;
> }
> --
> 2.35.1
>
next prev parent reply other threads:[~2022-03-29 0:14 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-24 0:20 [PATCH 0/6 v2] xfs: more shutdown/recovery fixes Dave Chinner
2022-03-24 0:20 ` [PATCH 1/6] xfs: aborting inodes on shutdown may need buffer lock Dave Chinner
2022-03-28 22:44 ` Darrick J. Wong
2022-03-28 23:11 ` Dave Chinner
2022-03-30 1:20 ` Darrick J. Wong
2022-03-24 0:20 ` [PATCH 2/6] xfs: shutdown in intent recovery has non-intent items in the AIL Dave Chinner
2022-03-28 22:46 ` Darrick J. Wong
2022-03-24 0:21 ` [PATCH 3/6] xfs: run callbacks before waking waiters in xlog_state_shutdown_callbacks Dave Chinner
2022-03-28 23:05 ` Darrick J. Wong
2022-03-28 23:13 ` Dave Chinner
2022-03-28 23:36 ` Darrick J. Wong
2022-03-24 0:21 ` [PATCH 4/6] xfs: log shutdown triggers should only shut down the log Dave Chinner
2022-03-29 0:14 ` Darrick J. Wong [this message]
2022-03-24 0:21 ` [PATCH 5/6] xfs: xfs_do_force_shutdown needs to block racing shutdowns Dave Chinner
2022-03-29 0:19 ` Darrick J. Wong
2022-03-24 0:21 ` [PATCH 6/6] xfs: xfs_trans_commit() path must check for log shutdown Dave Chinner
2022-03-29 0:36 ` Darrick J. Wong
2022-03-29 3:08 ` Dave Chinner
2022-03-27 22:55 ` [PATCH 7/6] xfs: xfs: shutdown during log recovery needs to mark the " Dave Chinner
2022-03-29 0:37 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220329001418.GC27713@magnolia \
--to=djwong@kernel.org \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).