From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Subject: [PATCH 2/8] xfs: don't wait on future iclogs when pushing the CIL
Date: Thu, 17 Jun 2021 18:26:11 +1000 [thread overview]
Message-ID: <20210617082617.971602-3-david@fromorbit.com> (raw)
In-Reply-To: <20210617082617.971602-1-david@fromorbit.com>
From: Dave Chinner <dchinner@redhat.com>
The iclogbuf ring attached to the struct xlog is circular, hence the
first and last iclogs in the ring can only be determined by
comparing them against the log->l_iclog pointer.
In xfs_cil_push_work(), we want to wait on previous iclogs that were
issued so that we can flush them to stable storage with the commit
record write, and it simply waits on the previous iclog in the ring.
This, however, leads to CIL push hangs in generic/019 like so:
task:kworker/u33:0 state:D stack:12680 pid: 7 ppid: 2 flags:0x00004000
Workqueue: xfs-cil/pmem1 xlog_cil_push_work
Call Trace:
__schedule+0x30b/0x9f0
schedule+0x68/0xe0
xlog_wait_on_iclog+0x121/0x190
? wake_up_q+0xa0/0xa0
xlog_cil_push_work+0x994/0xa10
? _raw_spin_lock+0x15/0x20
? xfs_swap_extents+0x920/0x920
process_one_work+0x1ab/0x390
worker_thread+0x56/0x3d0
? rescuer_thread+0x3c0/0x3c0
kthread+0x14d/0x170
? __kthread_bind_mask+0x70/0x70
ret_from_fork+0x1f/0x30
With other threads blocking in either xlog_state_get_iclog_space()
waiting for iclog space or xlog_grant_head_wait() waiting for log
reservation space.
The problem here is that the previous iclog on the ring might
actually be a future iclog. That is, if log->l_iclog points at
commit_iclog, commit_iclog is the first (oldest) iclog in the ring
and there are no previous iclogs pending as they have all completed
their IO and been activated again. IOWs, commit_iclog->ic_prev
points to an iclog that will be written in the future, not one that
has been written in the past.
Hence, in this case, waiting on the ->ic_prev iclog is incorrect
behaviour, and depending on the state of the future iclog, we can
end up with a circular ABA wait cycle and we hang.
The fix is made more complex by the fact that many iclogs states
cannot be used to determine if the iclog is a past or future iclog.
Hence we have to determine past iclogs by checking the LSN of the
iclog rather than their state. A past ACTIVE iclog will have a LSN
of zero, while a future ACTIVE iclog will have a LSN greater than
the current iclog. We don't wait on either of these cases.
Similarly, a future iclog that hasn't completed IO will have an LSN
greater than the current iclog and so we don't wait on them. A past
iclog that is still undergoing IO completion will have a LSN less
than the current iclog and those are the only iclogs that we need to
wait on.
Hence we can use the iclog LSN to determine what iclogs we need to
wait on here.
Fixes: 5fd9256ce156 ("xfs: separate CIL commit record IO")
Reported-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_log_cil.c | 51 ++++++++++++++++++++++++++++++++++++++------
1 file changed, 45 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c
index 705619e9dab4..2fb0ab02dda3 100644
--- a/fs/xfs/xfs_log_cil.c
+++ b/fs/xfs/xfs_log_cil.c
@@ -1075,15 +1075,54 @@ xlog_cil_push_work(
ticket = ctx->ticket;
/*
- * If the checkpoint spans multiple iclogs, wait for all previous
- * iclogs to complete before we submit the commit_iclog. In this case,
- * the commit_iclog write needs to issue a pre-flush so that the
- * ordering is correctly preserved down to stable storage.
+ * If the checkpoint spans multiple iclogs, wait for all previous iclogs
+ * to complete before we submit the commit_iclog. We can't use state
+ * checks for this - ACTIVE can be either a past completed iclog or a
+ * future iclog being filled, while WANT_SYNC through SYNC_DONE can be a
+ * past or future iclog awaiting IO or ordered IO completion to be run.
+ * In the latter case, if it's a future iclog and we wait on it, the we
+ * will hang because it won't get processed through to ic_force_wait
+ * wakeup until this commit_iclog is written to disk. Hence we use the
+ * iclog header lsn and compare it to the commit lsn to determine if we
+ * need to wait on iclogs or not.
*/
spin_lock(&log->l_icloglock);
if (ctx->start_lsn != commit_lsn) {
- xlog_wait_on_iclog(commit_iclog->ic_prev);
- spin_lock(&log->l_icloglock);
+ struct xlog_in_core *iclog;
+
+ for (iclog = commit_iclog->ic_prev;
+ iclog != commit_iclog;
+ iclog = iclog->ic_prev) {
+ xfs_lsn_t hlsn;
+
+ /*
+ * If the LSN of the iclog is zero or in the future it
+ * means it has passed through IO completion and
+ * activation and hence all previous iclogs have also
+ * done so. We do not need to wait at all in this case.
+ */
+ hlsn = be64_to_cpu(iclog->ic_header.h_lsn);
+ if (!hlsn || XFS_LSN_CMP(hlsn, commit_lsn) > 0)
+ break;
+
+ /*
+ * If the LSN of the iclog is older than the commit lsn,
+ * we have to wait on it. Waiting on this via the
+ * ic_force_wait should also order the completion of all
+ * older iclogs, too, but we leave checking that to the
+ * next loop iteration.
+ */
+ ASSERT(XFS_LSN_CMP(hlsn, commit_lsn) < 0);
+ xlog_wait_on_iclog(iclog);
+ spin_lock(&log->l_icloglock);
+ }
+
+ /*
+ * Regardless of whether we need to wait or not, the the
+ * commit_iclog write needs to issue a pre-flush so that the
+ * ordering for this checkpoint is correctly preserved down to
+ * stable storage.
+ */
commit_iclog->ic_flags |= XLOG_ICL_NEED_FLUSH;
}
--
2.31.1
next prev parent reply other threads:[~2021-06-17 8:26 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-17 8:26 [PATCH 0/8 V2] xfs: log fixes for for-next Dave Chinner
2021-06-17 8:26 ` [PATCH 1/8] xfs: add iclog state trace events Dave Chinner
2021-06-17 16:45 ` Darrick J. Wong
2021-06-18 14:09 ` Christoph Hellwig
2021-06-17 8:26 ` Dave Chinner [this message]
2021-06-17 17:49 ` [PATCH 2/8] xfs: don't wait on future iclogs when pushing the CIL Darrick J. Wong
2021-06-17 21:55 ` Dave Chinner
2021-06-17 8:26 ` [PATCH 3/8] xfs: move xlog_commit_record to xfs_log_cil.c Dave Chinner
2021-06-17 12:57 ` kernel test robot
2021-06-17 17:50 ` Darrick J. Wong
2021-06-17 21:56 ` Dave Chinner
2021-06-18 14:16 ` Christoph Hellwig
2021-06-17 8:26 ` [PATCH 4/8] xfs: pass a CIL context to xlog_write() Dave Chinner
2021-06-17 14:46 ` kernel test robot
2021-06-17 20:24 ` Darrick J. Wong
2021-06-17 22:03 ` Dave Chinner
2021-06-17 22:18 ` Darrick J. Wong
2021-06-18 14:23 ` Christoph Hellwig
2021-06-28 8:58 ` Dan Carpenter
2021-06-17 8:26 ` [PATCH 5/8] xfs: factor out log write ordering from xlog_cil_push_work() Dave Chinner
2021-06-17 19:59 ` Darrick J. Wong
2021-06-18 14:27 ` Christoph Hellwig
2021-06-18 22:34 ` Dave Chinner
2021-06-17 8:26 ` [PATCH 6/8] xfs: separate out setting CIL context LSNs from xlog_write Dave Chinner
2021-06-17 20:28 ` Darrick J. Wong
2021-06-17 22:10 ` Dave Chinner
2021-06-17 8:26 ` [PATCH 7/8] xfs: attached iclog callbacks in xlog_cil_set_ctx_write_state() Dave Chinner
2021-06-17 20:55 ` Darrick J. Wong
2021-06-17 22:20 ` Dave Chinner
2021-06-17 8:26 ` [PATCH 8/8] xfs: order CIL checkpoint start records Dave Chinner
2021-06-17 21:31 ` Darrick J. Wong
2021-06-17 22:49 ` Dave Chinner
2021-06-17 18:32 ` [PATCH 0/8 V2] xfs: log fixes for for-next Brian Foster
2021-06-17 19:05 ` Darrick J. Wong
2021-06-17 20:06 ` Brian Foster
2021-06-17 20:26 ` Darrick J. Wong
2021-06-17 23:31 ` Brian Foster
2021-06-17 23:43 ` Dave Chinner
2021-06-18 13:08 ` Brian Foster
2021-06-18 13:55 ` Christoph Hellwig
2021-06-18 14:02 ` Christoph Hellwig
2021-06-18 22:28 ` Dave Chinner
2021-06-18 22:15 ` Dave Chinner
2021-06-18 22:48 ` Dave Chinner
2021-06-19 20:22 ` Darrick J. Wong
2021-06-20 22:18 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210617082617.971602-3-david@fromorbit.com \
--to=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox