From: Brian Foster <bfoster@redhat.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>,
fstests <fstests@vger.kernel.org>,
linux-xfs <linux-xfs@vger.kernel.org>, Qu Wenruo <wqu@suse.com>,
Josef Bacik <josef@toxicpanda.com>
Subject: Re: [PATCH] generic: skip dm-log-writes tests on XFS v5 superblock filesystems
Date: Wed, 27 Feb 2019 08:23:02 -0500 [thread overview]
Message-ID: <20190227132302.GB49593@bfoster> (raw)
In-Reply-To: <CAOQ4uxgueAtF5o+Zjd+N0_h7784veL2FNd7-aB+33HWEN07gHg@mail.gmail.com>
On Wed, Feb 27, 2019 at 06:06:57AM +0200, Amir Goldstein wrote:
> On Wed, Feb 27, 2019 at 1:22 AM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Tue, Feb 26, 2019 at 11:10:02PM +0200, Amir Goldstein wrote:
> > > On Tue, Feb 26, 2019 at 8:14 PM Brian Foster <bfoster@redhat.com> wrote:
> > > >
> > > > The dm-log-writes mechanism runs a workload against a filesystem,
> > > > tracks underlying FUAs and restores the filesystem to various points
> > > > in time based on FUA marks. This allows fstests to check fs
> > > > consistency at various points and verify log recovery works as
> > > > expected.
> > > >
> > >
> > > Inaccurate. generic/482 restores to FUA points.
> > > generic/45[57] restore to user defined points in time (marks).
> > > dm-log-writes mechanism is capable of restoring either.
> > >
> > > > This mechanism does not play well with LSN based log recovery
> > > > ordering behavior on XFS v5 superblocks, however. For example,
> > > > generic/482 can reproduce false positive corruptions based on extent
> > > > to btree conversion of an inode if the inode and associated btree
> > > > block are written back after different checkpoints. Even though both
> > > > items are logged correctly in the extent-to-btree transaction, the
> > > > btree block can be relogged (multiple times) and only written back
> > > > once when the filesystem unmounts. If the inode was written back
> > > > after the initial conversion, recovery points between that mark and
> > > > when the btree block is ultimately written back will show corruption
> > > > because log recovery sees that the destination buffer is newer than
> > > > the recovered buffer and intentionally skips the buffer. This is a
> > > > false positive because the destination buffer was resiliently
> > > > written back after being physically relogged one or more times.
> > > >
> > >
> > > This story doesn't add up.
> > > Either dm-log-writes emulated power failure correctly or it doesn't.
> > > My understanding is that the issue you are seeing is a result of
> > > XFS seeing "data from the future" after a restore of a power failure
> > > snapshot, because the scratch device is not a clean slate.
> > > If I am right, then the correct solution is to wipe the journal before
> > > starting to replay restore points.
> >
> > If that is the problem, then I think we should be wiping the entire
> > block device before replaying the recorded logwrite.
> >
>
> Indeed.
>
> > i.e. this sounds like a "block device we are replaying onto has
> > stale data in it" problem because we are replaying the same
> > filesystem over the top of itself. Hence there are no unique
> > identifiers in the metadata that can detect stale metadata in
> > the block device.
> >
> > I'm surprised that we haven't tripped over this much earlier that
> > this...
> >
>
> I remember asking myself the same thing... it's coming back to me
> now. I really remember having this discussion during test review.
> generic/482 is an adaptation of Josef's test script [1], which
> does log recovery onto a snapshot on every FUA checkpoint.
>
> [1] https://github.com/josefbacik/log-writes/blob/master/replay-fsck-wrapper.sh
>
> Setting up snapshots for every checkpoint was found empirically to take
> more test runtime, than replaying log from the start for each checkpoint.
> That observation was limited to the systems that Qu and Eryu tested on.
>
> IRC, what usually took care of cleaning the block device is replaying the
> "discard everything" IO from mkfs time. dm-log-writes driver should take care
> of zeroing blocks upon replaying a discard IO even on a target device that
> doesn't support discard, but maybe if original device doesn't support discard,
> the IO is not recorded at all and therefore not replayed?
>
> Brian, can you check if your log-writes stream contains a discard IO for
> the entire block device? I do remember that the initial log-writes tests worked
> very reliably on my laptop with SSD and were a bit flaky on another test machine
> with spinning rust, but that machine had other hardware reliability issues at
> the time (bad SATA cable) so I attributed all issues to that problem.
>
FYI, the command from your other mail on a logwrites dev that
demonstrates this problem shows the following:
# ./src/log-writes/replay-log -vv --find --end-mark mkfs --log /dev/test/tmp | grep DISCARD
seek entry 0@2: 0, size 8388607, flags 0x4(DISCARD)
seek entry 1@3: 8388607, size 8388607, flags 0x4(DISCARD)
seek entry 2@4: 16777214, size 8388607, flags 0x4(DISCARD)
seek entry 3@5: 25165821, size 6291459, flags 0x4(DISCARD)
... which appears to cover the entire device. Is the intention that this
should wipe the scratch device?
Brian
> BTW, looking closer at generic/482, $prev does not seem to be used at all
> in the replay loop.
>
> > > > Update the dm-log-writes require checks to enforce v4 superblocks
> > > > when running against XFS and skip the test otherwise.
> > >
> > > You might as well disable dm-log-writes test for XFS completely.
> > > Who cares about v4 superblocks these days?
> >
> > Enough of the inflammatory hyperbole, Amir. Statements like this
> > serve no useful purpose.
> >
>
> <deep breath> Agreed.
>
> Thanks,
> Amir.
next prev parent reply other threads:[~2019-02-27 13:23 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-02-26 18:14 [PATCH] generic: skip dm-log-writes tests on XFS v5 superblock filesystems Brian Foster
2019-02-26 21:10 ` Amir Goldstein
2019-02-26 23:22 ` Dave Chinner
2019-02-27 4:06 ` Amir Goldstein
2019-02-27 4:19 ` Qu Wenruo
2019-02-27 4:49 ` Amir Goldstein
2019-02-27 5:01 ` Qu Wenruo
2019-02-27 5:19 ` Amir Goldstein
2019-02-27 5:32 ` Qu Wenruo
2019-02-27 5:58 ` Amir Goldstein
2019-02-27 6:15 ` Dave Chinner
2019-02-27 13:23 ` Brian Foster [this message]
2019-02-27 13:18 ` Brian Foster
2019-02-27 14:17 ` Brian Foster
2019-02-27 15:54 ` Josef Bacik
2019-02-27 17:11 ` Amir Goldstein
2019-02-27 17:13 ` Brian Foster
2019-02-27 18:46 ` Amir Goldstein
2019-02-27 20:45 ` Brian Foster
2019-02-27 19:27 ` Josef Bacik
2019-02-27 20:47 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190227132302.GB49593@bfoster \
--to=bfoster@redhat.com \
--cc=amir73il@gmail.com \
--cc=david@fromorbit.com \
--cc=fstests@vger.kernel.org \
--cc=josef@toxicpanda.com \
--cc=linux-xfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox