[PATCH 0/8] xfs: log recovery torn write detection

From: Brian Foster @ 2015-11-09 20:21 UTC
  To: xfs

Hi all,

Here's a first real pass at XFS log recovery torn write detection. This
series has been tested via xfstests and via repetitive fsstress/shutdown
sequences followed by simulated CRC errors on log recovery. The latter
testing has proven useful in shaking out a few bugs, but I have still
reproduced fs inconsistency after a couple hundred iterations or so.
That said, I suspect the problems at this point are either actual
logging problems (e.g., all of the EFI/EFD logging patches and whatnot
originated from this kind of testing) or due to the nature of the error
simulation.

In short, the error injection simulates log corruption more so than torn
writes because it injects errors at recovery time. The log buffers are
written successfully at shutdown time, and therefore I believe it's
still possible for the filesystem to have modifications that depend on
committed transactions (which are ultimately skipped if a CRC error is
simulated). I've marked the error injection patch (patch 8) RFC for the
time being because I'd like to try to come up with something a bit more
deterministic, if possible (so long as it can be done reasonably
simply). For example, perhaps we can replace it with a similar debug
mode that intentionally corrupts a CRC at write time and shuts down the
fs on write completion such that the AIL is not updated and there is
less risk of inconsistency due to writing back metadata items in the
"corrupted" log buffer(s). Anyway, the current patch is included so the
current test procedure is documented, reviewable and repeatable.
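
To make the write-time alternative a bit more concrete, here is a rough
userspace sketch of the idea. It is not code from this series and not
how XFS implements anything: toy_mount, toy_iclog, toy_iclog_write() and
toy_iclog_done() are hypothetical names, and the "AIL" is reduced to a
single LSN. The point is only the ordering: break the CRC when the
buffer is written, then shut down at write completion before the AIL is
updated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy state; none of these names exist in XFS. */
struct toy_mount {
	bool     inject_crc_error; /* debug knob: corrupt the next log write */
	bool     shut_down;        /* set once the fs has been shut down */
	uint64_t ail_lsn;          /* last LSN whose items reached the toy "AIL" */
};

struct toy_iclog {
	uint64_t lsn;        /* LSN of this log buffer */
	uint32_t crc;        /* checksum stored in the buffer header */
	bool     corrupted;  /* true if the CRC was deliberately broken */
};

/* Stamp the header CRC at write time, breaking it if the knob is set. */
void toy_iclog_write(struct toy_mount *mp, struct toy_iclog *ic,
		     uint64_t lsn, uint32_t good_crc)
{
	assert(mp && ic);
	ic->lsn = lsn;
	ic->corrupted = mp->inject_crc_error;
	/* ~good_crc differs from good_crc in every bit, so it never matches. */
	ic->crc = ic->corrupted ? ~good_crc : good_crc;
}

/* Write completion: shut down instead of updating the AIL when corrupted. */
void toy_iclog_done(struct toy_mount *mp, const struct toy_iclog *ic)
{
	assert(mp && ic);
	if (mp->shut_down)
		return;
	if (ic->corrupted) {
		/* No AIL update, so nothing in this buffer gets written back. */
		mp->shut_down = true;
		return;
	}
	mp->ail_lsn = ic->lsn;
}
```

With the knob set, the last buffer written carries a CRC that cannot
verify and the toy AIL still points at the previous LSN, which is the
property the recovery-time injection cannot guarantee.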

Patch 1 is a bug fix for a problem exposed by this mechanism. Patches
2-6 are primarily refactoring and introduce the CRC-check-only log
recovery pass. Patch 7 enables log head/tail torn write detection. Patch
8 implements the DEBUG mode error injection mechanism described above.
Thoughts, reviews, flames appreciated.

Brian

v1:
- Added bug fix for mkfs log record header inconsistency.
- Refactored log recovery code to support a CRC-check-only recovery
  pass.
- CRC verify the last 8 records behind the head to account for
  concurrent log writes.
- Verify the tail of the log as well when the head is torn.
- Added (rfc) crc error injection patch for testing purposes.
rfc: http://oss.sgi.com/pipermail/xfs/2015-July/042415.html
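
For readers skimming the changelog, the head verification can be
pictured roughly as follows. This is a simplified userspace sketch, not
the code in xfs_log_recover.c: toy_rec, toy_csum(), toy_find_torn() and
toy_trim_head() are hypothetical names, the records sit in a flat array
rather than a circular on-disk log, and Adler-32 stands in for the real
log record checksum. It assumes that a torn head is handled by moving
the head back to the start block of the first record that fails
verification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Records behind the head to verify, per the changelog entry above. */
#define TOY_MAX_VERIFY 8

/* Hypothetical in-memory stand-in for a log record (not the XFS format). */
struct toy_rec {
	uint32_t       start_blk;  /* first block of the record */
	uint32_t       stored_crc; /* checksum recorded at write time */
	const uint8_t *data;       /* record payload */
	size_t         len;        /* payload length in bytes */
};

/* Adler-32, used here only as a placeholder for the real checksum. */
uint32_t toy_csum(const uint8_t *p, size_t len)
{
	uint32_t a = 1, b = 0;

	for (size_t i = 0; i < len; i++) {
		a = (a + p[i]) % 65521;
		b = (b + a) % 65521;
	}
	return (b << 16) | a;
}

/*
 * Verify the last TOY_MAX_VERIFY records (recs[nrecs - 1] is the newest),
 * oldest first. Return the index of the first record whose checksum does
 * not match, or -1 if every record in the window verifies.
 */
int toy_find_torn(const struct toy_rec *recs, int nrecs)
{
	assert(recs != NULL && nrecs >= 0);
	int first = nrecs > TOY_MAX_VERIFY ? nrecs - TOY_MAX_VERIFY : 0;

	for (int i = first; i < nrecs; i++) {
		if (toy_csum(recs[i].data, recs[i].len) != recs[i].stored_crc)
			return i;
	}
	return -1;
}

/* Return the trimmed head block, or old_head if no torn write was found. */
uint32_t toy_trim_head(const struct toy_rec *recs, int nrecs,
		       uint32_t old_head)
{
	int bad = toy_find_torn(recs, nrecs);

	return bad < 0 ? old_head : recs[bad].start_blk;
}
```

Scanning oldest first matters: with several log writes in flight at
once, a later record can verify even though an earlier one is torn, so
the head has to move back to the earliest failure in the window.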

Brian Foster (8):
  xfs: detect and handle invalid iclog size set by mkfs
  xfs: refactor log record unpack and data processing
  xfs: refactor and open code log record crc check
  xfs: return start block of first bad log record during recovery
  xfs: support a crc verification only log record pass
  xfs: refactor log record start detection into a new helper
  xfs: detect and trim torn writes during log recovery
  xfs: debug mode log recovery crc error injection

 fs/xfs/libxfs/xfs_log_recover.h |   1 +
 fs/xfs/xfs_globals.c            |   1 +
 fs/xfs/xfs_log_recover.c        | 646 +++++++++++++++++++++++++++++++++-------
 fs/xfs/xfs_sysctl.h             |   1 +
 fs/xfs/xfs_sysfs.c              |  31 ++
 5 files changed, 574 insertions(+), 106 deletions(-)

-- 
2.1.0
