linux-xfs.vger.kernel.org archive mirror
* [PATCH 0/4] xfs: log recovery wrap and tail overwrite fixes
@ 2017-06-27 14:40 Brian Foster
From: Brian Foster @ 2017-06-27 14:40 UTC
  To: linux-xfs

Hi all,

Here's a first real stab at a fix for the log tail overwrite issue. The
general approach is similar to torn write detection: move the tail
forward when corruption is detected within the range of a possible tail
overwrite.
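
As a rough illustration of the approach described above, here is a minimal
userspace sketch. It is a toy model under stated assumptions, not the code
from this series: the log is an array of records in tail-to-head order,
each with a CRC flag, and `window` stands in for the range a tail overwrite
could have touched. The names toy_rec, toy_first_bad and toy_verify_tail
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy record: one entry per log record, tail to head. */
struct toy_rec {
	bool crc_ok;
};

/* Index of the first record at or after 'from' that fails CRC, or -1. */
static long
toy_first_bad(const struct toy_rec *recs, size_t nrecs, size_t from)
{
	for (size_t i = from; i < nrecs; i++)
		if (!recs[i].crc_ok)
			return (long)i;
	return -1;
}

/*
 * Move the tail forward past corruption found within 'window' records of
 * the original tail. Returns the new tail index, or -1 if a bad record
 * lies outside that range and so cannot be blamed on a tail overwrite.
 */
static long
toy_verify_tail(const struct toy_rec *recs, size_t nrecs, size_t tail,
		size_t window)
{
	size_t limit = tail + window;	/* end of possible overwrite range */
	long bad;

	assert(tail <= nrecs);
	while ((bad = toy_first_bad(recs, nrecs, tail)) >= 0) {
		if ((size_t)bad >= limit)
			return -1;	/* treated as real corruption */
		tail = (size_t)bad + 1;	/* skip past the damaged record */
	}
	return (long)tail;
}
```

In this model a clean log leaves the tail untouched, while a bad record
beyond the window is reported as an error rather than skipped.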

Patch 1 fixes an independent and spurious log recovery failure when a
log record header wraps around the end of the physical log. Patch 2 is a
semi-preparatory patch that unconditionally invokes log tail
verification rather than only after torn write detection at the head.
Patch 3 introduces the core fix to move the tail forward in the event of
corruption. Patch 4 introduces an error injection tag to force log item
pinning and facilitates the test that reliably reproduces the tail
overwrite problem.
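
For context on patch 1, the sketch below shows the general shape of a read
that wraps around the end of a circular log. It is a userspace
illustration only, assuming a byte-addressed buffer. toy_log_read is a
hypothetical helper and does not correspond to any function in
xfs_log_recover.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy 'len' bytes starting at offset 'off' out of a circular log of
 * 'log_size' bytes. A read that runs past the physical end continues
 * from offset 0.
 */
static void
toy_log_read(const unsigned char *log, size_t log_size, size_t off,
	     unsigned char *dst, size_t len)
{
	size_t first;

	assert(log_size > 0 && off < log_size && len <= log_size);

	/* Bytes available before the physical end of the log. */
	first = log_size - off;
	if (first > len)
		first = len;

	memcpy(dst, log + off, first);
	/* Remainder, if any, comes from the start of the log. */
	memcpy(dst + first, log, len - first);
}
```

A header that straddles the end of the log needs both copies; one that
does not wrap is served entirely by the first.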

This survives the latest variant of the xfstests test that reproduces
the tail overwrite condition and otherwise hasn't shown any regressions
in my testing so far (still ongoing). This also allows the metadump
images provided by Sweet Tea[1] to mount (though those images do still
show filesystem corruption after mount, so I suspect something more is
going on there).

One other slight change worth noting in log recovery behavior is that
tail overwrite detection causes earlier reporting of legitimate log CRC
or corruption errors. Before this series, a log corruption that is not
resolved by torn write/tail overwrite detection results in log recovery
failure after a partial recovery up to the point at which the corruption
is encountered. After this series, it is very likely that the corruption
is identified during tail verification and an error returned to
userspace before real recovery begins.
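
The difference between the two behaviors can be modeled with a small
userspace sketch. This is a toy under stated assumptions, not the recovery
code: the log is an array of per-record CRC results, and the policy names
TOY_FAIL_LATE (before this series) and TOY_FAIL_EARLY (after this series)
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_policy {
	TOY_FAIL_LATE,	/* old: replay until the bad record is hit */
	TOY_FAIL_EARLY,	/* new: verification pass fails before replay */
};

struct toy_result {
	size_t	replayed;	/* records applied before stopping */
	bool	failed;		/* recovery returned an error */
};

static struct toy_result
toy_recover(const bool *crc_ok, size_t nrecs, enum toy_policy policy)
{
	struct toy_result res = { 0, false };

	assert(crc_ok != NULL || nrecs == 0);

	if (policy == TOY_FAIL_EARLY) {
		/* Verification pass: any bad record aborts up front. */
		for (size_t i = 0; i < nrecs; i++) {
			if (!crc_ok[i]) {
				res.failed = true;
				return res;
			}
		}
	}

	/* Replay pass: apply records in order until one fails CRC. */
	for (size_t i = 0; i < nrecs; i++) {
		if (!crc_ok[i]) {
			res.failed = true;
			break;
		}
		res.replayed++;
	}
	return res;
}
```

Both policies fail on the same corrupt log. They differ only in how many
records have been replayed by the time the error is returned.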

An xfs_repair is necessary in either case, but I'm curious if there is a
preference towards the old or newly proposed behavior. An alternative
I've considered to preserve the old behavior, for example, is to use the
tail verification CRC pass for tail fixing only (and otherwise consider
errors at this point as nonfatal). This means that we would fix up the
tail if possible, otherwise leave errors to the real recovery sequence
such that a partial recovery can occur before the (imminent) failure.
Thoughts?

Brian

v1:
- Add patch to fix log recovery header wrapping problem.
- Replace transaction reservation rfc with log recovery based fix.
- Replace custom log pinning sysfs knob with error injection tag.
rfc: http://www.spinics.net/lists/linux-xfs/msg07623.html

[1] http://www.spinics.net/lists/linux-xfs/msg07667.html


Brian Foster (4):
  xfs: fix recovery failure when log record header wraps log end
  xfs: always verify the log tail during recovery
  xfs: fix log recovery corruption error due to tail overwrite
  xfs: add log item pinning error injection tag

 fs/xfs/xfs_error.c       |   3 +
 fs/xfs/xfs_error.h       |   4 +-
 fs/xfs/xfs_log_recover.c | 150 +++++++++++++++++++++++++++++------------------
 fs/xfs/xfs_trans_ail.c   |  17 +++++-
 4 files changed, 114 insertions(+), 60 deletions(-)

-- 
2.7.5


* [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail
@ 2017-06-16 16:44 Brian Foster
From: Brian Foster @ 2017-06-16 16:44 UTC
  To: linux-xfs

Create a debug mode only sysfs option to force pin the tail of the
log. This option can be used by test infrastructure to induce head
behind tail conditions. Specifically, this is intended to be used by
xfstests to reproduce log recovery problems after failed/corrupted
log writes overwrite the last good tail LSN in the log.

When enabled, AIL push attempts see every log item on the AIL in the
pinned state. This stalls metadata writeback and thus prevents the
current tail of the log from moving forward. When disabled,
subsequent AIL pushes observe the log items in their appropriate
state and filesystem operation continues as normal.
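
To show why reporting every item as pinned stalls the tail, here is a
minimal userspace model. It is a sketch under simplifying assumptions,
not the kernel code: the AIL is a small sorted array of LSNs, a push
pass writes back everything that is not pinned, and an empty AIL lets
the tail catch up to a caller-supplied head LSN. The names toy_ail,
toy_ail_push and toy_tail_lsn are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_AIL_MAX	8

struct toy_ail {
	unsigned long	lsn[TOY_AIL_MAX]; /* items in the AIL, ascending */
	size_t		count;
	bool		pin_tail;	  /* stands in for l_pin_tail */
};

/* One push pass. Returns the number of items written back. */
static size_t
toy_ail_push(struct toy_ail *ail)
{
	size_t pushed;

	assert(ail->count <= TOY_AIL_MAX);

	/* With pinning forced, every item looks pinned: nothing moves. */
	if (ail->pin_tail)
		return 0;

	pushed = ail->count;
	ail->count = 0;		/* written-back items leave the AIL */
	return pushed;
}

/* Tail is the lowest LSN still in the AIL, else the head LSN. */
static unsigned long
toy_tail_lsn(const struct toy_ail *ail, unsigned long head_lsn)
{
	return ail->count ? ail->lsn[0] : head_lsn;
}
```

While pin_tail is set, repeated pushes leave the tail at the oldest
item. Clearing it lets the next push drain the AIL and the tail advance.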

Signed-off-by: Brian Foster <bfoster@redhat.com>
---

Hi all,

This is a supporting patch for an xfstests test I'm about to post
that pins the tail of the log in order to reproduce the log recovery
failure that appears to be the root cause of the issue in this[1] thread.
That is the primary motivation for the patch, so it should probably be
reviewed in that context. IOW, if there's a better way to reproduce
the problem in the test without the need for kernel support, I'm happy
to drop this. Thoughts, reviews, flames appreciated.

Brian

[1] http://www.spinics.net/lists/linux-xfs/msg07499.html

 fs/xfs/xfs_log_priv.h  |  2 ++
 fs/xfs/xfs_sysfs.c     | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_trans_ail.c | 20 +++++++++++++++++++-
 3 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h
index c2604a5..bfbfde12 100644
--- a/fs/xfs/xfs_log_priv.h
+++ b/fs/xfs/xfs_log_priv.h
@@ -413,6 +413,8 @@ struct xlog {
 	void			*l_iclog_bak[XLOG_MAX_ICLOGS];
 	/* log record crc error injection factor */
 	uint32_t		l_badcrc_factor;
+	/* force pin the log tail */
+	bool			l_pin_tail;
 #endif
 	/* log recovery lsn tracking (for buffer submission */
 	xfs_lsn_t		l_recovery_lsn;
diff --git a/fs/xfs/xfs_sysfs.c b/fs/xfs/xfs_sysfs.c
index ec6e0e2..b86148a 100644
--- a/fs/xfs/xfs_sysfs.c
+++ b/fs/xfs/xfs_sysfs.c
@@ -378,6 +378,51 @@ log_badcrc_factor_show(
 }
 
 XFS_SYSFS_ATTR_RW(log_badcrc_factor);
+
+/*
+ * DEBUG mode flag to force pin the tail of the log. Used from test
+ * infrastructure to manufacture head-behind-tail conditions. DO NOT USE
+ * DIRECTLY. This will lock up the fs!
+ *
+ * When this option is enabled, all log items present in the AIL are emulated as
+ * being in the pinned state until the option is disabled. Once disabled, log
+ * items return to their natural state and fs operation continues as normal.
+ */
+STATIC ssize_t
+log_pin_tail_store(
+	struct kobject		*kobject,
+	const char		*buf,
+	size_t			count)
+{
+	struct xlog		*log = to_xlog(kobject);
+	int			ret;
+	int			val;
+
+	ret = kstrtoint(buf, 0, &val);
+	if (ret)
+		return ret;
+
+	if (val == 1)
+		log->l_pin_tail = true;
+	else if (val == 0)
+		log->l_pin_tail = false;
+	else
+		return -EINVAL;
+
+	return count;
+}
+
+STATIC ssize_t
+log_pin_tail_show(
+	struct kobject		*kobject,
+	char			*buf)
+{
+	struct xlog		*log = to_xlog(kobject);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", log->l_pin_tail ? 1 : 0);
+}
+XFS_SYSFS_ATTR_RW(log_pin_tail);
+
 #endif	/* DEBUG */
 
 static struct attribute *xfs_log_attrs[] = {
@@ -387,6 +432,7 @@ static struct attribute *xfs_log_attrs[] = {
 	ATTR_LIST(write_grant_head),
 #ifdef DEBUG
 	ATTR_LIST(log_badcrc_factor),
+	ATTR_LIST(log_pin_tail),
 #endif
 	NULL,
 };
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 9056c0f..c901e61 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -27,6 +27,7 @@
 #include "xfs_trace.h"
 #include "xfs_error.h"
 #include "xfs_log.h"
+#include "xfs_log_priv.h"
 
 #ifdef DEBUG
 /*
@@ -325,6 +326,23 @@ xfs_ail_delete(
 	xfs_trans_ail_cursor_clear(ailp, lip);
 }
 
+static inline uint
+xfsaild_push_item(
+	struct xfs_ail		*ailp,
+	struct xfs_log_item	*lip)
+{
+#ifdef DEBUG
+	/*
+	 * If tail pinning is enabled, skip the push and track all items as
+	 * pinned to force pin the log tail. This helps induce head-behind-tail
+	 * conditions.
+	 */
+	if (ailp->xa_mount->m_log->l_pin_tail)
+		return XFS_ITEM_PINNED;
+#endif
+	return lip->li_ops->iop_push(lip, &ailp->xa_buf_list);
+}
+
 static long
 xfsaild_push(
 	struct xfs_ail		*ailp)
@@ -382,7 +400,7 @@ xfsaild_push(
 		 * rely on the AIL cursor implementation to be able to deal with
 		 * the dropped lock.
 		 */
-		lock_result = lip->li_ops->iop_push(lip, &ailp->xa_buf_list);
+		lock_result = xfsaild_push_item(ailp, lip);
 		switch (lock_result) {
 		case XFS_ITEM_SUCCESS:
 			XFS_STATS_INC(mp, xs_push_ail_success);
-- 
2.7.5



Thread overview: 18+ messages
2017-06-27 14:40 [PATCH 0/4] xfs: log recovery wrap and tail overwrite fixes Brian Foster
2017-06-27 14:40 ` [PATCH 1/4] xfs: fix recovery failure when log record header wraps log end Brian Foster
2017-07-01  4:38   ` Darrick J. Wong
2017-07-03 12:11     ` Brian Foster
2017-06-27 14:40 ` [PATCH 2/4] xfs: always verify the log tail during recovery Brian Foster
2017-07-01  4:43   ` Darrick J. Wong
2017-07-03 12:11     ` Brian Foster
2017-06-27 14:40 ` [PATCH 3/4] xfs: fix log recovery corruption error due to tail overwrite Brian Foster
2017-07-01  5:06   ` Darrick J. Wong
2017-07-03 12:13     ` Brian Foster
2017-07-03 16:27       ` Brian Foster
2017-07-03 16:39       ` Darrick J. Wong
2017-06-27 14:40 ` [PATCH 4/4] xfs: add log item pinning error injection tag Brian Foster
2017-07-01  3:03   ` Darrick J. Wong
2017-06-27 14:50 ` [PATCH] tests/xfs: test for log recovery failure after tail overwrite Brian Foster
  -- strict thread matches above, loose matches on Subject: below --
2017-06-16 16:44 [PATCH] xfs: debug mode sysfs flag to force [un]pin the log tail Brian Foster
2017-06-16 16:46 ` [PATCH] tests/xfs: test for log recovery failure after tail overwrite Brian Foster
2017-06-30  3:44   ` Eryu Guan
2017-06-30  4:09     ` Darrick J. Wong
