All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Amir Goldstein <amir73il@gmail.com>
Cc: Josef Bacik <josef@toxicpanda.com>,
	fstests <fstests@vger.kernel.org>,
	linux-xfs <linux-xfs@vger.kernel.org>, Qu Wenruo <wqu@suse.com>
Subject: Re: [PATCH] generic: skip dm-log-writes tests on XFS v5 superblock filesystems
Date: Wed, 27 Feb 2019 15:45:22 -0500	[thread overview]
Message-ID: <20190227204521.GB50261@bfoster> (raw)
In-Reply-To: <CAOQ4uxgZruCPAgk0-GP8=Ngecn=Yjv+6KKLmkUuDjh+QDqTWUg@mail.gmail.com>

On Wed, Feb 27, 2019 at 08:46:42PM +0200, Amir Goldstein wrote:
> > > > FYI, I gave this a try and it didn't ultimately work because mkfs didn't
> > > > clear the device either. I ended up reproducing the problem, physically
> > > > zeroing the device, replaying the associated FUA and observing the
> > > > problem go away. From there, if I replay to the final FUA mark and go
> > > > back to the (originally) problematic FUA, the problem is reintroduced.
> > > >
> > >
> > > Sorry guys, whenever I run log-writes on xfs I use my helper script here
> > >
> > > https://github.com/josefbacik/log-writes
> > >
> > > specifically replay-individual-faster.sh.  This creates a snapshot at every
> > > replay point, mounts and checks the fs, and then destroys the snapshot and keeps
> > > going.  This way you don't end up with the "new" data still being on the device.
> > > It's not super fast, but this is usually a fire and forget sort of thing.  I
> > > could probably integrate this into xfstests for our log-writes tests, those tend
> > > to not generate large logs so wouldn't take super long.  Does this fix the
> > > problem for you Brian?
> > >
> >
> > Thanks Josef. At a glance at that script I'm not quite following how
> > this fits together. Are you taking a snapshot of the original device
> > before the workload being tested is run against dm-log-writes, then
> > replaying on top of that? In general, anything that puts the device back
> > into the state from before the workload ran and replays from there
> > should be enough to fix the problem I think. As long as a replay
> > sequence runs in order, I don't think snapshots of each replay point
> > should technically be necessary (vs a single replay snapshot, for e.g.).
> >
> > Note that I ended up testing generic/482 with a loop device for a
> > scratch device and that also worked around the problem, I suspect
> > because that allowed the aforementioned discards to actually reset the
> > underlying the device via loop discard->hole punch (which doesn't occur
> > with my original, non-loop scratch dev). So it seems that another option
> 
> Urrgh! take a look at src/log-writes/log-writes.c:log_discard()->zero_range().
> I remembered correctly the replay log falls back to zeroing device with no
> discard support, but wasn't aware that it drops large discard requests
> (> 128MB), so there we have it.
> 

Heh, Ok.

> > for more deterministic behavior in fstests could be to just enforce the
> > use of a loop device as the underlying target in these particular tests.
> > For example, have the log-writes init create a file on the test device
> > and loop mount that rather than using the scratch device. Just a thought
> > though, it's not clear to me that's any more simple than what you have
> > already implemented here..
> >
> 
> Seems like a pretty good idea to me, but in doing that we may loose
> diversity in the tested block devices. On different test machines mkfs
> and later fsstress ops will result in slightly different io patterns.
> Since generic/482 is not a deterministic test (random fsstress seed)
> than we cannot argue that reproducible results are better than
> deterministic results.
> 
> Another option is to relax the  log->max_zero_size limit of
> log_discard() and allow it to zero out devices that don't support
> discard, but that is likely going to cost much more runtime.
> Note that this simple patch could also work for you:
> 
> --- a/src/log-writes/log-writes.c
> +++ b/src/log-writes/log-writes.c
> @@ -95,7 +95,7 @@ int log_discard(struct log *log, struct
> log_write_entry *entry)
>  {
>         u64 start = le64_to_cpu(entry->sector) * log->sectorsize;
>         u64 size = le64_to_cpu(entry->nr_sectors) * log->sectorsize;
> -       u64 max_chunk = 1 * 1024 * 1024 * 1024;
> +       u64 max_chunk = log->max_zero_size;
> 
> ---
> 
> Brian,
> 
> Can you try this patch on you test machine and say how much
> worse the runtime is compared to the loop device setup?
> 
> In general, I am in favor of the simplest possible fix
> and in keeping runtime shortest, so if the above fix
> results in longer runtime, I would prefer to make the test always
> run on loopdev as you suggested.
> 

Note that this is a low resource VM with a spinning disk. The scratch
device is a 15G LVM volume. My last run of generic/482 was 139s (which I
think was using loop for a scratch device). With this change and back to
using the LVM volume, I ended up stopping it because I started seeing a
~160s or so delay between XFS mount cycles in the syslog (i.e., per
replay iteration). ;P

Brian

> Thanks,
> Amir.

  reply	other threads:[~2019-02-27 20:45 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-26 18:14 [PATCH] generic: skip dm-log-writes tests on XFS v5 superblock filesystems Brian Foster
2019-02-26 21:10 ` Amir Goldstein
2019-02-26 23:22   ` Dave Chinner
2019-02-27  4:06     ` Amir Goldstein
2019-02-27  4:19       ` Qu Wenruo
2019-02-27  4:49         ` Amir Goldstein
2019-02-27  5:01           ` Qu Wenruo
2019-02-27  5:19             ` Amir Goldstein
2019-02-27  5:32               ` Qu Wenruo
2019-02-27  5:58                 ` Amir Goldstein
2019-02-27  6:15           ` Dave Chinner
2019-02-27 13:23       ` Brian Foster
2019-02-27 13:18   ` Brian Foster
2019-02-27 14:17     ` Brian Foster
2019-02-27 15:54       ` Josef Bacik
2019-02-27 17:11         ` Amir Goldstein
2019-02-27 17:13         ` Brian Foster
2019-02-27 18:46           ` Amir Goldstein
2019-02-27 20:45             ` Brian Foster [this message]
2019-02-27 19:27           ` Josef Bacik
2019-02-27 20:47             ` Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190227204521.GB50261@bfoster \
    --to=bfoster@redhat.com \
    --cc=amir73il@gmail.com \
    --cc=fstests@vger.kernel.org \
    --cc=josef@toxicpanda.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.