From: Brian Foster <bfoster@redhat.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>,
fstests <fstests@vger.kernel.org>,
linux-xfs <linux-xfs@vger.kernel.org>, Qu Wenruo <wqu@suse.com>,
Josef Bacik <josef@toxicpanda.com>
Subject: Re: [PATCH] generic: skip dm-log-writes tests on XFS v5 superblock filesystems
Date: Wed, 27 Feb 2019 08:23:02 -0500 [thread overview]
Message-ID: <20190227132302.GB49593@bfoster> (raw)
In-Reply-To: <CAOQ4uxgueAtF5o+Zjd+N0_h7784veL2FNd7-aB+33HWEN07gHg@mail.gmail.com>
On Wed, Feb 27, 2019 at 06:06:57AM +0200, Amir Goldstein wrote:
> On Wed, Feb 27, 2019 at 1:22 AM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Tue, Feb 26, 2019 at 11:10:02PM +0200, Amir Goldstein wrote:
> > > On Tue, Feb 26, 2019 at 8:14 PM Brian Foster <bfoster@redhat.com> wrote:
> > > >
> > > > The dm-log-writes mechanism runs a workload against a filesystem,
> > > > tracks underlying FUAs and restores the filesystem to various points
> > > > in time based on FUA marks. This allows fstests to check fs
> > > > consistency at various points and verify log recovery works as
> > > > expected.
> > > >
> > >
> > > Inaccurate. generic/482 restores to FUA points.
> > > generic/45[57] restore to user defined points in time (marks).
> > > dm-log-writes mechanism is capable of restoring either.
> > >
> > > > This mechanism does not play well with LSN based log recovery
> > > > ordering behavior on XFS v5 superblocks, however. For example,
> > > > generic/482 can reproduce false positive corruptions based on extent
> > > > to btree conversion of an inode if the inode and associated btree
> > > > block are written back after different checkpoints. Even though both
> > > > items are logged correctly in the extent-to-btree transaction, the
> > > > btree block can be relogged (multiple times) and only written back
> > > > once when the filesystem unmounts. If the inode was written back
> > > > after the initial conversion, recovery points between that mark and
> > > > when the btree block is ultimately written back will show corruption
> > > > because log recovery sees that the destination buffer is newer than
> > > > the recovered buffer and intentionally skips the buffer. This is a
> > > > false positive because the destination buffer was resiliently
> > > > written back after being physically relogged one or more times.
> > > >
> > >
> > > This story doesn't add up.
> > > Either dm-log-writes emulated power failure correctly or it doesn't.
> > > My understanding is that the issue you are seeing is a result of
> > > XFS seeing "data from the future" after a restore of a power failure
> > > snapshot, because the scratch device is not a clean slate.
> > > If I am right, then the correct solution is to wipe the journal before
> > > starting to replay restore points.
> >
> > If that is the problem, then I think we should be wiping the entire
> > block device before replaying the recorded logwrite.
> >
>
> Indeed.
>
> > i.e. this sounds like a "block device we are replaying onto has
> > stale data in it" problem because we are replaying the same
> > filesystem over the top of itself. Hence there are no unique
> > identifiers in the metadata that can detect stale metadata in
> > the block device.
> >
> > I'm surprised that we haven't tripped over this much earlier that
> > this...
> >
>
> I remember asking myself the same thing... it's coming back to me
> now. I really remember having this discussion during test review.
> generic/482 is an adaptation of Josef's test script [1], which
> does log recovery onto a snapshot on every FUA checkpoint.
>
> [1] https://github.com/josefbacik/log-writes/blob/master/replay-fsck-wrapper.sh
>
> Setting up snapshots for every checkpoint was found empirically to take
> more test runtime, than replaying log from the start for each checkpoint.
> That observation was limited to the systems that Qu and Eryu tested on.
>
> IRC, what usually took care of cleaning the block device is replaying the
> "discard everything" IO from mkfs time. dm-log-writes driver should take care
> of zeroing blocks upon replaying a discard IO even on a target device that
> doesn't support discard, but maybe if original device doesn't support discard,
> the IO is not recorded at all and therefore not replayed?
>
> Brian, can you check if your log-writes stream contains a discard IO for
> the entire block device? I do remember that the initial log-writes tests worked
> very reliably on my laptop with SSD and were a bit flaky on another test machine
> with spinning rust, but that machine had other hardware reliability issues at
> the time (bad SATA cable) so I attributed all issues to that problem.
>
FYI, the command from your other mail on a logwrites dev that
demonstrates this problem shows the following:
# ./src/log-writes/replay-log -vv --find --end-mark mkfs --log /dev/test/tmp | grep DISCARD
seek entry 0@2: 0, size 8388607, flags 0x4(DISCARD)
seek entry 1@3: 8388607, size 8388607, flags 0x4(DISCARD)
seek entry 2@4: 16777214, size 8388607, flags 0x4(DISCARD)
seek entry 3@5: 25165821, size 6291459, flags 0x4(DISCARD)
... which appears to cover the entire device. Is the intention that this
should wipe the scratch device?
Brian
> BTW, looking closer at generic/482, $prev does not seem to be used at all
> in the replay loop.
>
> > > > Update the dm-log-writes require checks to enforce v4 superblocks
> > > > when running against XFS and skip the test otherwise.
> > >
> > > You might as well disable dm-log-writes test for XFS completely.
> > > Who cares about v4 superblocks these days?
> >
> > Enough of the inflammatory hyperbole, Amir. Statements like this
> > serve no useful purpose.
> >
>
> <deep breath> Agreed.
>
> Thanks,
> Amir.
next prev parent reply other threads:[~2019-02-27 13:23 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-02-26 18:14 [PATCH] generic: skip dm-log-writes tests on XFS v5 superblock filesystems Brian Foster
2019-02-26 21:10 ` Amir Goldstein
2019-02-26 23:22 ` Dave Chinner
2019-02-27 4:06 ` Amir Goldstein
2019-02-27 4:19 ` Qu Wenruo
2019-02-27 4:49 ` Amir Goldstein
2019-02-27 5:01 ` Qu Wenruo
2019-02-27 5:19 ` Amir Goldstein
2019-02-27 5:32 ` Qu Wenruo
2019-02-27 5:58 ` Amir Goldstein
2019-02-27 6:15 ` Dave Chinner
2019-02-27 13:23 ` Brian Foster [this message]
2019-02-27 13:18 ` Brian Foster
2019-02-27 14:17 ` Brian Foster
2019-02-27 15:54 ` Josef Bacik
2019-02-27 17:11 ` Amir Goldstein
2019-02-27 17:13 ` Brian Foster
2019-02-27 18:46 ` Amir Goldstein
2019-02-27 20:45 ` Brian Foster
2019-02-27 19:27 ` Josef Bacik
2019-02-27 20:47 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190227132302.GB49593@bfoster \
--to=bfoster@redhat.com \
--cc=amir73il@gmail.com \
--cc=david@fromorbit.com \
--cc=fstests@vger.kernel.org \
--cc=josef@toxicpanda.com \
--cc=linux-xfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.