From: Dave Chinner <david@fromorbit.com>
To: Jan Kara <jack@suse.cz>
Cc: Sage Weil <sage@newdream.net>, Josef Bacik <jbacik@fb.com>,
lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] Working towards better power fail testing
Date: Wed, 7 Jan 2015 09:07:06 +1100
Message-ID: <20150106220706.GD31508@dastard>
In-Reply-To: <20150106085347.GA15729@quack.suse.cz>

On Tue, Jan 06, 2015 at 09:53:47AM +0100, Jan Kara wrote:
> On Tue 06-01-15 08:47:55, Dave Chinner wrote:
> > > As things stand now the other devs are loathe to touch any remotely exotic
> > > fs call, but that hardly seems ideal. Hopefully a common framework for
> > > powerfail testing can improve on this. Perhaps there are other ways we
> > > make it easier to tell what is (well) tested, and conversely ensure that
> > > those tests are well-aligned with what real users are doing...
> >
> > We don't actually need power failure (or even device failure)
> > infrastructure to test data integrity on failure. Filesystems just
> > need a shutdown method that stops any IO from being issued once the
> > shutdown flag is set. XFS has this and it's used by xfstests via the
> > "godown" utility to shut the fileystem down in various
> > circumstances. We've been using this for data integrity and log
> > recovery testing in xfstests for many years.
> >
> > Hence we know if the device behaves correctly w.r.t cache flushes
> > and FUA then the filesystem will behave correctly on power loss. We
> > don't need a device power fail simulator to tell us violating
> > fundamental architectural assumptions will corrupt filesystems....
> I think that fs ioctl cannot easily simulate the situation where
> on-device volatile caches aren't properly flushed in all the necessary
> cases (we had bugs like this in ext3/4 in the past which were hit by real
> users).
Sure, I'm not arguing that it does. I'm suggesting that it's the
wrong place to be focussing effort on initially as it assumes the
filesystem behaves correctly on simple device failures. i.e. if
filesystems fail to do the right thing on a block device that isn't
lossy, then we've got big problems to solve before we even consider
random "volatile cache blocks went missing" corruption and recovery
issues.
i.e. what we need to focus on first is "failure paths are exercised
and work reliably". When we have decent coverage of that for most
filesystems (and we sure as hell don't for btrfs and ext4), then we
can focus on "in this corner case of broken/lying hardware..."
> I also think that simulating the device failure in a different layer is
> simpler than checking for superblock flag in all the places where the
> filesystem submits IO (e.g. ext4 doesn't have dedicated buffer layer like
> xfs has and we rely on flusher thread to flush committed metadata to final
flusher threads call back into the filesystems to write both data
and metadata, so I don't think that's an issue. And there are
relatively few places you'd need to add flag support to (i.e.
wrappers around submit_bh and submit_bio in the relevant layers)
and that would trap all IO.
Don't get fooled by the fact that XFS has lots of shutdown traps;
there really are only three shutdown traps that prevent IO - one in
xfs_buf_submit() for metadata IO, one in xfs_map_blocks() during
->writepage for data IO, and one in xlog_bdstrat() for log IO.
All the other shutdown traps are for aborting operations that may
not reach the IO layer (as many operations will hit cached objects)
or will fail later when the inevitable IO is done (e.g. on
transaction commit). Hence shutdown traps get us fast, reliable
responses to userspace when fatal corruption errors occur, and in
doing so they also provide hooks for testing error paths in ways
that otherwise are very difficult to exercise.
This is my point - shutdown traps are far more useful for *verifying
correct filesystem behaviour in error situations* than something
that just returns errors or corrupts blocks at the IO layer. If we
really want to test behaviour with corrupt random disk blocks,
fsfuzzer already exists ;)
> location on disk so that writeback path completely avoids ext4 code - it's
> a generic writeback of the block device mapping). So I like the solution
> with the dm target more than a fs ioctl although I agree that it's more
> clumsy from the xfstests perspective.
Wrong perspective. I'm looking at this from a filesystem layer
validation perspective, not a xfstests perspective. The fs ioctl is
far more useful for exercising and validating filesystem behaviour
in error conditions than a dm-device that targets a rare device
failure issue.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com