[PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes
@ 2017-07-10 13:27 Brian Foster
  2017-07-10 13:27 ` [PATCH v2 1/6] xfs: fix recovery failure when log record header wraps log end Brian Foster
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Brian Foster @ 2017-07-10 13:27 UTC (permalink / raw)
  To: linux-xfs

Hi all,

This is v2 of the tail overwrite detection series. Patches 1-4 include
mostly minor changes to fix up some error handling issues and whatnot.
Patch 5 is new and updates both of the torn write and tail overwrite
detection mechanisms to handle corruption errors along with crc mismatch
errors. Patch 6 is also new and a trivial introduction of a tracepoint.

This has survived a decent amount of testing to this point with various
block sizes, logbsize values, and feature settings (debug on/off, crc
on/off, reflink on/off) on a couple different arches (x86-64, ppc64).
Thoughts, reviews, flames appreciated.

Brian

v2:
- Fix 64-bit division.
- Fix up error checking in xlog_verify_[head|tail]() to only move tail
  tail on corruption and otherwise fail mount immediately.
- Added patch 5 to handle -EFSCORRUPTED on head/tail verification.
- Added patch 6 to introduce log head/tail tracepoint.
v1: http://www.spinics.net/lists/linux-xfs/msg07974.html
- Add patch to fix log recovery header wrapping problem.
- Replace transaction reservation rfc with log recovery based fix.
- Replace custom log pinning sysfs knob with error injection tag.
rfc: http://www.spinics.net/lists/linux-xfs/msg07623.html

Brian Foster (6):
  xfs: fix recovery failure when log record header wraps log end
  xfs: always verify the log tail during recovery
  xfs: fix log recovery corruption error due to tail overwrite
  xfs: add log item pinning error injection tag
  xfs: handle -EFSCORRUPTED during head/tail verification
  xfs: add log recovery tracepoint for head/tail

 fs/xfs/xfs_error.c       |   3 +
 fs/xfs/xfs_error.h       |   4 +-
 fs/xfs/xfs_log_recover.c | 157 +++++++++++++++++++++++++++++------------------
 fs/xfs/xfs_trace.h       |  18 ++++++
 fs/xfs/xfs_trans_ail.c   |  17 ++++-
 5 files changed, 137 insertions(+), 62 deletions(-)

-- 
2.9.4


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/6] xfs: fix recovery failure when log record header wraps log end
  2017-07-10 13:27 [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes Brian Foster
@ 2017-07-10 13:27 ` Brian Foster
  2017-07-10 13:27 ` [PATCH v2 2/6] xfs: always verify the log tail during recovery Brian Foster
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Brian Foster @ 2017-07-10 13:27 UTC (permalink / raw)
  To: linux-xfs

The high-level log recovery algorithm consists of two loops that
walk the physical log and process log records from the tail to the
head. The first loop handles the case where the tail is beyond the
head and processes records up to the end of the physical log. The
subsequent loop processes records from the beginning of the physical
log to the head.

Because log records can wrap around the end of the physical log, the
first loop mentioned above must handle this case appropriately.
Records are processed from in-core buffers, which means that this
algorithm must split the reads of such records into two partial
I/Os: 1.) from the beginning of the record to the end of the log and
2.) from the beginning of the log to the end of the record. This is
further complicated by the fact that the log record header and log
record data are read into independent buffers.

The current handling of each buffer correctly splits the reads when
either the header or data starts before the end of the log and wraps
around the end. The data read does not correctly handle the case
where the prior header read wrapped or ends on the physical log end
boundary. blk_no is incremented to or beyond the log end after the
header read to point to the record data, but the split data read
logic triggers, attempts to read from an invalid log block and
ultimately causes log recovery to fail. This can be reproduced
fairly reliably via xfstests tests generic/047 and generic/388 with
large iclog sizes (256k) and small (10M) logs.

If the record header read has pushed beyond the end of the physical
log, the subsequent data read is actually contiguous. Update the
data read logic to detect the case where blk_no has wrapped, mod it
against the log size to read from the correct address and issue one
contiguous read for the log data buffer. The log record is processed
as normal from the buffer(s), the loop exits after the current
iteration and the subsequent loop picks up with the first new record
after the start of the log.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log_recover.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index b6a40bd..a388a84 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -5218,7 +5218,7 @@ xlog_do_recovery_pass(
 	xfs_daddr_t		*first_bad)	/* out: first bad log rec */
 {
 	xlog_rec_header_t	*rhead;
-	xfs_daddr_t		blk_no;
+	xfs_daddr_t		blk_no, rblk_no;
 	xfs_daddr_t		rhead_blk;
 	char			*offset;
 	xfs_buf_t		*hbp, *dbp;
@@ -5371,9 +5371,19 @@ xlog_do_recovery_pass(
 			bblks = (int)BTOBB(be32_to_cpu(rhead->h_len));
 			blk_no += hblks;

-			/* Read in data for log record */
-			if (blk_no + bblks <= log->l_logBBsize) {
-				error = xlog_bread(log, blk_no, bblks, dbp,
+			/*
+			 * Read the log record data in multiple reads if it
+			 * wraps around the end of the log. Note that if the
+			 * header already wrapped, blk_no could point past the
+			 * end of the log. The record data is contiguous in
+			 * that case.
+			 */
+			if (blk_no + bblks <= log->l_logBBsize ||
+			    blk_no >= log->l_logBBsize) {
+				/* mod blk_no in case the header wrapped and
+				 * pushed it beyond the end of the log */
+				rblk_no = do_mod(blk_no, log->l_logBBsize);
+				error = xlog_bread(log, rblk_no, bblks, dbp,
 						   &offset);
 				if (error)
 					goto bread_err2;
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/6] xfs: always verify the log tail during recovery
  2017-07-10 13:27 [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes Brian Foster
  2017-07-10 13:27 ` [PATCH v2 1/6] xfs: fix recovery failure when log record header wraps log end Brian Foster
@ 2017-07-10 13:27 ` Brian Foster
  2017-07-10 13:27 ` [PATCH v2 3/6] xfs: fix log recovery corruption error due to tail overwrite Brian Foster
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Brian Foster @ 2017-07-10 13:27 UTC (permalink / raw)
  To: linux-xfs

Log tail verification currently only occurs when torn writes are
detected at the head of the log. This was introduced because a
change in the head block due to torn writes can lead to a change in
the tail block (each log record header references the current tail)
and the tail block should be verified before log recovery proceeds.

Tail corruption is possible outside of torn write scenarios,
however. For example, partial log writes can be detected and cleared
during the initial head/tail block discovery process. If the partial
write coincides with a tail overwrite, the log tail is corrupted and
recovery fails.

To facilitate correct handling of log tail overwites, update log
recovery to always perform tail verification. This is necessary to
detect potential tail overwrite conditions when torn writes may not
have occurred. This changes normal (i.e., no torn writes) recovery
behavior slightly to detect and return CRC related errors near the
tail before actual recovery starts.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_log_recover.c | 26 +++-----------------------
 1 file changed, 3 insertions(+), 23 deletions(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index a388a84..72c12b1 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -1183,31 +1183,11 @@ xlog_verify_head(
 			ASSERT(0);
 			return 0;
 		}
-
-		/*
-		 * Now verify the tail based on the updated head. This is
-		 * required because the torn writes trimmed from the head could
-		 * have been written over the tail of a previous record. Return
-		 * any errors since recovery cannot proceed if the tail is
-		 * corrupt.
-		 *
-		 * XXX: This leaves a gap in truly robust protection from torn
-		 * writes in the log. If the head is behind the tail, the tail
-		 * pushes forward to create some space and then a crash occurs
-		 * causing the writes into the previous record's tail region to
-		 * tear, log recovery isn't able to recover.
-		 *
-		 * How likely is this to occur? If possible, can we do something
-		 * more intelligent here? Is it safe to push the tail forward if
-		 * we can determine that the tail is within the range of the
-		 * torn write (e.g., the kernel can only overwrite the tail if
-		 * it has actually been pushed forward)? Alternatively, could we
-		 * somehow prevent this condition at runtime?
-		 */
-		error = xlog_verify_tail(log, *head_blk, *tail_blk);
 	}
+	if (error)
+		return error;

-	return error;
+	return xlog_verify_tail(log, *head_blk, *tail_blk);
 }

 /*
-- 
2.9.4

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/6] xfs: fix log recovery corruption error due to tail overwrite
  2017-07-10 13:27 [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes Brian Foster
  2017-07-10 13:27 ` [PATCH v2 1/6] xfs: fix recovery failure when log record header wraps log end Brian Foster
  2017-07-10 13:27 ` [PATCH v2 2/6] xfs: always verify the log tail during recovery Brian Foster
@ 2017-07-10 13:27 ` Brian Foster
  2017-07-10 13:27 ` [PATCH v2 4/6] xfs: add log item pinning error injection tag Brian Foster
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Brian Foster @ 2017-07-10 13:27 UTC (permalink / raw)
  To: linux-xfs

If we consider the case where the tail (T) of the log is pinned long
enough for the head (H) to push and block behind the tail, we can
end up blocked in the following state without enough free space (f)
in the log to satisfy a transaction reservation:

	0	phys. log	N
	[-------HffT---H'--T'---]

The last good record in the log (before H) refers to T. The tail
eventually pushes forward (T') leaving more free space in the log
for writes to H. At this point, suppose space frees up in the log
for the maximum of 8 in-core log buffers to start flushing out to
the log. If this pushes the head from H to H', these next writes
overwrite the previous tail T. This is safe because the items logged
from T to T' have been written back and removed from the AIL.

If the next log writes (H -> H') happen to fail and result in
partial records in the log, the filesystem shuts down having
overwritten T with invalid data. Log recovery correctly locates H on
the subsequent mount, but H still refers to the now corrupted tail
T. This results in log corruption errors and recovery failure.

Since the tail overwrite results from otherwise correct runtime
behavior, it is up to log recovery to try and deal with this
situation. Update log recovery tail verification to run a CRC pass
from the first record past the tail to the head. This facilitates
error detection at T and moves the recovery tail to the first good
record past H' (similar to truncating the head on torn write
detection). If corruption is detected beyond the range possibly
affected by the max number of iclogs, the log is legitimately
corrupted and log recovery failure is expected.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log_recover.c | 108 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 77 insertions(+), 31 deletions(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 72c12b1..d3bef45 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -1029,61 +1029,106 @@ xlog_seek_logrec_hdr(
 }
 
 /*
- * Check the log tail for torn writes. This is required when torn writes are
- * detected at the head and the head had to be walked back to a previous record.
- * The tail of the previous record must now be verified to ensure the torn
- * writes didn't corrupt the previous tail.
+ * Calculate distance from head to tail (i.e., unused space in the log).
+ */
+static inline int
+xlog_tail_distance(
+	struct xlog	*log,
+	xfs_daddr_t	head_blk,
+	xfs_daddr_t	tail_blk)
+{
+	if (head_blk < tail_blk)
+		return tail_blk - head_blk;
+
+	return tail_blk + (log->l_logBBsize - head_blk);
+}
+
+/*
+ * Verify the log tail. This is particularly important when torn or incomplete
+ * writes have been detected near the front of the log and the head has been
+ * walked back accordingly.
+ *
+ * We also have to handle the case where the tail was pinned and the head
+ * blocked behind the tail right before a crash. If the tail had been pushed
+ * immediately prior to the crash and the subsequent checkpoint was only
+ * partially written, it's possible it overwrote the last referenced tail in the
+ * log with garbage. This is not a coherency problem because the tail must have
+ * been pushed before it can be overwritten, but appears as log corruption to
+ * recovery because we have no way to know the tail was updated if the
+ * subsequent checkpoint didn't write successfully.
  *
- * Return an error if CRC verification fails as recovery cannot proceed.
+ * Therefore, CRC check the log from tail to head. If a failure occurs and the
+ * offending record is within max iclog bufs from the head, walk the tail
+ * forward and retry until a valid tail is found or corruption is detected out
+ * of the range of a possible overwrite.
  */
 STATIC int
 xlog_verify_tail(
 	struct xlog		*log,
 	xfs_daddr_t		head_blk,
-	xfs_daddr_t		tail_blk)
+	xfs_daddr_t		*tail_blk,
+	int			hsize)
 {
 	struct xlog_rec_header	*thead;
 	struct xfs_buf		*bp;
 	xfs_daddr_t		first_bad;
-	int			count;
 	int			error = 0;
 	bool			wrapped;
-	xfs_daddr_t		tmp_head;
+	xfs_daddr_t		tmp_tail;
+	xfs_daddr_t		orig_tail = *tail_blk;
 
 	bp = xlog_get_bp(log, 1);
 	if (!bp)
 		return -ENOMEM;
 
 	/*
-	 * Seek XLOG_MAX_ICLOGS + 1 records past the current tail record to get
-	 * a temporary head block that points after the last possible
-	 * concurrently written record of the tail.
+	 * Make sure the tail points to a record (returns positive count on
+	 * success).
 	 */
-	count = xlog_seek_logrec_hdr(log, head_blk, tail_blk,
-				     XLOG_MAX_ICLOGS + 1, bp, &tmp_head, &thead,
-				     &wrapped);
-	if (count < 0) {
-		error = count;
+	error = xlog_seek_logrec_hdr(log, head_blk, *tail_blk, 1, bp,
+			&tmp_tail, &thead, &wrapped);
+	if (error < 0)
 		goto out;
-	}
+	if (*tail_blk != tmp_tail)
+		*tail_blk = tmp_tail;
 
 	/*
-	 * If the call above didn't find XLOG_MAX_ICLOGS + 1 records, we ran
-	 * into the actual log head. tmp_head points to the start of the record
-	 * so update it to the actual head block.
+	 * Run a CRC check from the tail to the head. We can't just check
+	 * MAX_ICLOGS records past the tail because the tail may point to stale
+	 * blocks cleared during the search for the head/tail. These blocks are
+	 * overwritten with zero-length records and thus record count is not a
+	 * reliable indicator of the iclog state before a crash.
 	 */
-	if (count < XLOG_MAX_ICLOGS + 1)
-		tmp_head = head_blk;
-
-	/*
-	 * We now have a tail and temporary head block that covers at least
-	 * XLOG_MAX_ICLOGS records from the tail. We need to verify that these
-	 * records were completely written. Run a CRC verification pass from
-	 * tail to head and return the result.
-	 */
-	error = xlog_do_recovery_pass(log, tmp_head, tail_blk,
+	first_bad = 0;
+	error = xlog_do_recovery_pass(log, head_blk, *tail_blk,
 				      XLOG_RECOVER_CRCPASS, &first_bad);
+	while (error == -EFSBADCRC && first_bad) {
+		int	tail_distance;
+
+		/*
+		 * Is corruption within range of the head? If so, retry from
+		 * the next record. Otherwise return an error.
+		 */
+		tail_distance = xlog_tail_distance(log, head_blk, first_bad);
+		if (tail_distance > BTOBB(XLOG_MAX_ICLOGS * hsize))
+			break;
 
+		/* skip to the next record; returns positive count on success */
+		error = xlog_seek_logrec_hdr(log, head_blk, first_bad, 2, bp,
+				&tmp_tail, &thead, &wrapped);
+		if (error < 0)
+			goto out;
+
+		*tail_blk = tmp_tail;
+		first_bad = 0;
+		error = xlog_do_recovery_pass(log, head_blk, *tail_blk,
+					      XLOG_RECOVER_CRCPASS, &first_bad);
+	}
+
+	if (!error && *tail_blk != orig_tail)
+		xfs_warn(log->l_mp,
+		"Tail block (0x%llx) overwrite detected. Updated to 0x%llx",
+			 orig_tail, *tail_blk);
 out:
 	xlog_put_bp(bp);
 	return error;
@@ -1187,7 +1232,8 @@ xlog_verify_head(
 	if (error)
 		return error;
 
-	return xlog_verify_tail(log, *head_blk, *tail_blk);
+	return xlog_verify_tail(log, *head_blk, tail_blk,
+				be32_to_cpu((*rhead)->h_size));
 }
 
 /*
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 4/6] xfs: add log item pinning error injection tag
  2017-07-10 13:27 [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes Brian Foster
                   ` (2 preceding siblings ...)
  2017-07-10 13:27 ` [PATCH v2 3/6] xfs: fix log recovery corruption error due to tail overwrite Brian Foster
@ 2017-07-10 13:27 ` Brian Foster
  2017-07-10 13:27 ` [PATCH v2 5/6] xfs: handle -EFSCORRUPTED during head/tail verification Brian Foster
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Brian Foster @ 2017-07-10 13:27 UTC (permalink / raw)
  To: linux-xfs

Add an error injection tag to force log items in the AIL to the
pinned state. This option can be used by test infrastructure to
induce head behind tail conditions. Specifically, this is intended
to be used by xfstests to reproduce log recovery problems after
failed/corrupted log writes overwrite the last good tail LSN in the
log.

When enabled, AIL push attempts see log items in the AIL in the
pinned state. This stalls metadata writeback and thus prevents the
current tail of the log from moving forward. When disabled,
subsequent AIL pushes observe the log items in their appropriate
state and filesystem operation continues as normal.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_error.c     |  3 +++
 fs/xfs/xfs_error.h     |  4 +++-
 fs/xfs/xfs_trans_ail.c | 17 ++++++++++++++++-
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
index 2f4feb9..bd786a9 100644
--- a/fs/xfs/xfs_error.c
+++ b/fs/xfs/xfs_error.c
@@ -57,6 +57,7 @@ static unsigned int xfs_errortag_random_default[] = {
 	XFS_RANDOM_AG_RESV_CRITICAL,
 	XFS_RANDOM_DROP_WRITES,
 	XFS_RANDOM_LOG_BAD_CRC,
+	XFS_RANDOM_LOG_ITEM_PIN,
 };
 
 struct xfs_errortag_attr {
@@ -161,6 +162,7 @@ XFS_ERRORTAG_ATTR_RW(bmap_finish_one,	XFS_ERRTAG_BMAP_FINISH_ONE);
 XFS_ERRORTAG_ATTR_RW(ag_resv_critical,	XFS_ERRTAG_AG_RESV_CRITICAL);
 XFS_ERRORTAG_ATTR_RW(drop_writes,	XFS_ERRTAG_DROP_WRITES);
 XFS_ERRORTAG_ATTR_RW(log_bad_crc,	XFS_ERRTAG_LOG_BAD_CRC);
+XFS_ERRORTAG_ATTR_RW(log_item_pin,	XFS_ERRTAG_LOG_ITEM_PIN);
 
 static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(noerror),
@@ -193,6 +195,7 @@ static struct attribute *xfs_errortag_attrs[] = {
 	XFS_ERRORTAG_ATTR_LIST(ag_resv_critical),
 	XFS_ERRORTAG_ATTR_LIST(drop_writes),
 	XFS_ERRORTAG_ATTR_LIST(log_bad_crc),
+	XFS_ERRORTAG_ATTR_LIST(log_item_pin),
 	NULL,
 };
 
diff --git a/fs/xfs/xfs_error.h b/fs/xfs/xfs_error.h
index 7577be5..7c4bef3 100644
--- a/fs/xfs/xfs_error.h
+++ b/fs/xfs/xfs_error.h
@@ -106,7 +106,8 @@ extern void xfs_verifier_error(struct xfs_buf *bp);
  */
 #define XFS_ERRTAG_DROP_WRITES				28
 #define XFS_ERRTAG_LOG_BAD_CRC				29
-#define XFS_ERRTAG_MAX					30
+#define XFS_ERRTAG_LOG_ITEM_PIN				30
+#define XFS_ERRTAG_MAX					31
 
 /*
  * Random factors for above tags, 1 means always, 2 means 1/2 time, etc.
@@ -141,6 +142,7 @@ extern void xfs_verifier_error(struct xfs_buf *bp);
 #define XFS_RANDOM_AG_RESV_CRITICAL			4
 #define XFS_RANDOM_DROP_WRITES				1
 #define XFS_RANDOM_LOG_BAD_CRC				1
+#define XFS_RANDOM_LOG_ITEM_PIN				1
 
 #ifdef DEBUG
 extern int xfs_errortag_init(struct xfs_mount *mp);
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 9056c0f..563b315 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -325,6 +325,21 @@ xfs_ail_delete(
 	xfs_trans_ail_cursor_clear(ailp, lip);
 }
 
+static inline uint
+xfsaild_push_item(
+	struct xfs_ail		*ailp,
+	struct xfs_log_item	*lip)
+{
+	/*
+	 * If log item pinning is enabled, skip the push and track the item as
+	 * pinned. This can help induce head-behind-tail conditions.
+	 */
+	if (XFS_TEST_ERROR(false, ailp->xa_mount, XFS_ERRTAG_LOG_ITEM_PIN))
+		return XFS_ITEM_PINNED;
+
+	return lip->li_ops->iop_push(lip, &ailp->xa_buf_list);
+}
+
 static long
 xfsaild_push(
 	struct xfs_ail		*ailp)
@@ -382,7 +397,7 @@ xfsaild_push(
 		 * rely on the AIL cursor implementation to be able to deal with
 		 * the dropped lock.
 		 */
-		lock_result = lip->li_ops->iop_push(lip, &ailp->xa_buf_list);
+		lock_result = xfsaild_push_item(ailp, lip);
 		switch (lock_result) {
 		case XFS_ITEM_SUCCESS:
 			XFS_STATS_INC(mp, xs_push_ail_success);
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 5/6] xfs: handle -EFSCORRUPTED during head/tail verification
  2017-07-10 13:27 [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes Brian Foster
                   ` (3 preceding siblings ...)
  2017-07-10 13:27 ` [PATCH v2 4/6] xfs: add log item pinning error injection tag Brian Foster
@ 2017-07-10 13:27 ` Brian Foster
  2017-07-10 13:28 ` [PATCH v2 6/6] xfs: add log recovery tracepoint for head/tail Brian Foster
  2017-07-13  5:21 ` [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes Darrick J. Wong
  6 siblings, 0 replies; 8+ messages in thread
From: Brian Foster @ 2017-07-10 13:27 UTC (permalink / raw)
  To: linux-xfs

Torn write and tail overwrite detection both trigger only on
-EFSBADCRC errors. While this is the most likely failure scenario
for each condition, -EFSCORRUPTED is still possible in certain cases
depending on what ends up on disk when a torn write or partial tail
overwrite occurs. For example, an invalid log record h_len can lead
to an -EFSCORRUPTED error when running the log recovery CRC pass.

Therefore, update log head and tail verification to trigger the
associated head/tail fixups in the event of -EFSCORRUPTED errors
along with -EFSBADCRC. Also, -EFSCORRUPTED can currently be returned
from xlog_do_recovery_pass() before rhead_blk is initialized if the
first record encountered happens to be corrupted. This leads to an
incorrect 'first_bad' return value. Initialize rhead_blk earlier in
the function to address that problem as well.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log_recover.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index d3bef45..2e34797 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -1102,7 +1102,7 @@ xlog_verify_tail(
 	first_bad = 0;
 	error = xlog_do_recovery_pass(log, head_blk, *tail_blk,
 				      XLOG_RECOVER_CRCPASS, &first_bad);
-	while (error == -EFSBADCRC && first_bad) {
+	while ((error == -EFSBADCRC || error == -EFSCORRUPTED) && first_bad) {
 		int	tail_distance;
 
 		/*
@@ -1188,7 +1188,7 @@ xlog_verify_head(
 	 */
 	error = xlog_do_recovery_pass(log, *head_blk, tmp_rhead_blk,
 				      XLOG_RECOVER_CRCPASS, &first_bad);
-	if (error == -EFSBADCRC) {
+	if ((error == -EFSBADCRC || error == -EFSCORRUPTED) && first_bad) {
 		/*
 		 * We've hit a potential torn write. Reset the error and warn
 		 * about it.
@@ -5257,7 +5257,7 @@ xlog_do_recovery_pass(
 	LIST_HEAD		(buffer_list);
 
 	ASSERT(head_blk != tail_blk);
-	rhead_blk = 0;
+	blk_no = rhead_blk = tail_blk;
 
 	for (i = 0; i < XLOG_RHASH_SIZE; i++)
 		INIT_HLIST_HEAD(&rhash[i]);
@@ -5335,7 +5335,6 @@ xlog_do_recovery_pass(
 	}
 
 	memset(rhash, 0, sizeof(rhash));
-	blk_no = rhead_blk = tail_blk;
 	if (tail_blk > head_blk) {
 		/*
 		 * Perform recovery around the end of the physical log.
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 6/6] xfs: add log recovery tracepoint for head/tail
  2017-07-10 13:27 [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes Brian Foster
                   ` (4 preceding siblings ...)
  2017-07-10 13:27 ` [PATCH v2 5/6] xfs: handle -EFSCORRUPTED during head/tail verification Brian Foster
@ 2017-07-10 13:28 ` Brian Foster
  2017-07-13  5:21 ` [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes Darrick J. Wong
  6 siblings, 0 replies; 8+ messages in thread
From: Brian Foster @ 2017-07-10 13:28 UTC (permalink / raw)
  To: linux-xfs

Torn write detection and tail overwrite detection can shift the log
head and tail respectively in the event of CRC mismatch or
corruption errors. Add a high-level log recovery tracepoint to dump
the final log head/tail and make those values easily attainable in
debug/diagnostic situations.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_log_recover.c |  2 ++
 fs/xfs/xfs_trace.h       | 18 ++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 2e34797..dd2ad2a 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -5598,6 +5598,8 @@ xlog_do_recover(
 	xfs_buf_t	*bp;
 	xfs_sb_t	*sbp;
 
+	trace_xfs_log_recover(log, head_blk, tail_blk);
+
 	/*
 	 * First replay the images in the log.
 	 */
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index bcc3cdf..6881047 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -1963,6 +1963,24 @@ DEFINE_EVENT(xfs_swap_extent_class, name, \
 DEFINE_SWAPEXT_EVENT(xfs_swap_extent_before);
 DEFINE_SWAPEXT_EVENT(xfs_swap_extent_after);
 
+TRACE_EVENT(xfs_log_recover,
+	TP_PROTO(struct xlog *log, xfs_daddr_t headblk, xfs_daddr_t tailblk),
+	TP_ARGS(log, headblk, tailblk),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(xfs_daddr_t, headblk)
+		__field(xfs_daddr_t, tailblk)
+	),
+	TP_fast_assign(
+		__entry->dev = log->l_mp->m_super->s_dev;
+		__entry->headblk = headblk;
+		__entry->tailblk = tailblk;
+	),
+	TP_printk("dev %d:%d headblk 0x%llx tailblk 0x%llx",
+		  MAJOR(__entry->dev), MINOR(__entry->dev), __entry->headblk,
+		  __entry->tailblk)
+)
+
 TRACE_EVENT(xfs_log_recover_record,
 	TP_PROTO(struct xlog *log, struct xlog_rec_header *rhead, int pass),
 	TP_ARGS(log, rhead, pass),
-- 
2.9.4


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes
  2017-07-10 13:27 [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes Brian Foster
                   ` (5 preceding siblings ...)
  2017-07-10 13:28 ` [PATCH v2 6/6] xfs: add log recovery tracepoint for head/tail Brian Foster
@ 2017-07-13  5:21 ` Darrick J. Wong
  6 siblings, 0 replies; 8+ messages in thread
From: Darrick J. Wong @ 2017-07-13  5:21 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Mon, Jul 10, 2017 at 09:27:54AM -0400, Brian Foster wrote:
> Hi all,
> 
> This is v2 of the tail overwrite detection series. Patches 1-4 include
> mostly minor changes to fix up some error handling issues and whatnot.
> Patch 5 is new and updates both of the torn write and tail overwrite
> detection mechanisms to handle corruption errors along with crc mismatch
> errors. Patch 6 is also new and a trivial introduction of a tracepoint.
> 
> This has survived a decent amount of testing to this point with various
> block sizes, logbsize values, and feature settings (debug on/off, crc
> on/off, reflink on/off) on a couple different arches (x86-64, ppc64).
> Thoughts, reviews, flames appreciated.

For the whole series,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> 
> Brian
> 
> v2:
> - Fix 64-bit division.
> - Fix up error checking in xlog_verify_[head|tail]() to only move tail
>   tail on corruption and otherwise fail mount immediately.
> - Added patch 5 to handle -EFSCORRUPTED on head/tail verification.
> - Added patch 6 to introduce log head/tail tracepoint.
> v1: http://www.spinics.net/lists/linux-xfs/msg07974.html
> - Add patch to fix log recovery header wrapping problem.
> - Replace transaction reservation rfc with log recovery based fix.
> - Replace custom log pinning sysfs knob with error injection tag.
> rfc: http://www.spinics.net/lists/linux-xfs/msg07623.html
> 
> Brian Foster (6):
>   xfs: fix recovery failure when log record header wraps log end
>   xfs: always verify the log tail during recovery
>   xfs: fix log recovery corruption error due to tail overwrite
>   xfs: add log item pinning error injection tag
>   xfs: handle -EFSCORRUPTED during head/tail verification
>   xfs: add log recovery tracepoint for head/tail
> 
>  fs/xfs/xfs_error.c       |   3 +
>  fs/xfs/xfs_error.h       |   4 +-
>  fs/xfs/xfs_log_recover.c | 157 +++++++++++++++++++++++++++++------------------
>  fs/xfs/xfs_trace.h       |  18 ++++++
>  fs/xfs/xfs_trans_ail.c   |  17 ++++-
>  5 files changed, 137 insertions(+), 62 deletions(-)
> 
> -- 
> 2.9.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-07-13  5:21 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-07-10 13:27 [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes Brian Foster
2017-07-10 13:27 ` [PATCH v2 1/6] xfs: fix recovery failure when log record header wraps log end Brian Foster
2017-07-10 13:27 ` [PATCH v2 2/6] xfs: always verify the log tail during recovery Brian Foster
2017-07-10 13:27 ` [PATCH v2 3/6] xfs: fix log recovery corruption error due to tail overwrite Brian Foster
2017-07-10 13:27 ` [PATCH v2 4/6] xfs: add log item pinning error injection tag Brian Foster
2017-07-10 13:27 ` [PATCH v2 5/6] xfs: handle -EFSCORRUPTED during head/tail verification Brian Foster
2017-07-10 13:28 ` [PATCH v2 6/6] xfs: add log recovery tracepoint for head/tail Brian Foster
2017-07-13  5:21 ` [PATCH v2 0/6] xfs: log recovery wrap and tail overwrite fixes Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox