public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@kernel.org>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Qu Wenruo <wqu@suse.com>, fstests <fstests@vger.kernel.org>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH] btrfs: add test case to make sure autodefrag works even the extent maps are read from disk
Date: Fri, 11 Feb 2022 11:52:22 +0000	[thread overview]
Message-ID: <YgZN9up8fuOt894m@debian9.Home> (raw)
In-Reply-To: <08a1e4df-27b2-23e4-ec1d-1ba1c4fe7e2a@gmx.com>

On Fri, Feb 11, 2022 at 08:01:24AM +0800, Qu Wenruo wrote:
> 
> 
> On 2022/2/11 00:37, Filipe Manana wrote:
> > On Tue, Feb 8, 2022 at 12:00 PM Qu Wenruo <wqu@suse.com> wrote:
> > > 
> > > There is a long existing problem that extent_map::generation is not
> > > populated (thus always 0) if its read from disk.
> > > 
> > > This can prevent btrfs autodefrag from working as it relies on
> > > extent_map::generation.
> > > If it's always 0, then autodefrag will not consider the range as a
> > > defrag target.
> > > 
> > > The test case itself will verify the behavior by:
> > > 
> > > - Create a fragmented file
> > >    By writing backwards with OSYNC
> > >    This will also queue the file for autodefrag.
> > > 
> > > - Drop all cache
> > >    Including the extent map cache, meaning later read will
> > >    all get extent map by reading from on-disk file extent items.
> > > 
> > > - Trigger autodefrag and verify the file layout
> > >    If defrag works, the new file layout should differ from the original
> > >    one.
> > > 
> > > The kernel fix is titled:
> > > 
> > >    btrfs: populate extent_map::generation when reading from disk
> > > 
> > > Signed-off-by: Qu Wenruo <wqu@suse.com>
> > > ---
> > >   tests/btrfs/259     | 64 +++++++++++++++++++++++++++++++++++++++++++++
> > >   tests/btrfs/259.out |  2 ++
> > >   2 files changed, 66 insertions(+)
> > >   create mode 100755 tests/btrfs/259
> > >   create mode 100644 tests/btrfs/259.out
> > > 
> > > diff --git a/tests/btrfs/259 b/tests/btrfs/259
> > > new file mode 100755
> > > index 00000000..577e4ce4
> > > --- /dev/null
> > > +++ b/tests/btrfs/259
> > > @@ -0,0 +1,64 @@
> > > +#! /bin/bash
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# Copyright (C) 2022 SUSE Linux Products GmbH. All Rights Reserved.
> > > +#
> > > +# FS QA Test 259
> > > +#
> > > +# Make sure autodefrag can still defrag the file even their extent maps are
> > > +# read from disk
> > > +#
> > > +. ./common/preamble
> > > +_begin_fstest auto quick defrag
> > > +
> > > +# Override the default cleanup function.
> > > +# _cleanup()
> > > +# {
> > > +#      cd /
> > > +#      rm -r -f $tmp.*
> > > +# }
> > > +
> > > +# Import common functions.
> > > +# . ./common/filter
> > > +
> > > +# real QA test starts here
> > > +
> > > +# Modify as appropriate.
> > > +_supported_fs btrfs
> > > +_require_scratch
> > > +
> > > +# Need 4K sectorsize, as the autodefrag threshold is only 64K,
> > > +# thus 64K sectorsize will not work.
> > > +_require_btrfs_support_sectorsize 4096
> > 
> > Missing a:
> > 
> > _require_xfs_io_command fiemap
> > 
> > > +_scratch_mkfs -s 4k >> $seqres.full
> > > +_scratch_mount -o datacow,autodefrag
> > > +
> > > +# Create fragmented write
> > > +$XFS_IO_PROG -f -s -c "pwrite 24k 8k" -c "pwrite 16k 8k" \
> > > +               -c "pwrite 8k 8k" -c "pwrite 0 8k" \
> > > +               "$SCRATCH_MNT/foobar" >> $seqres.full
> > > +sync
> > 
> > A comment on why this sync is needed would be good to have.
> > It may be confusing to the reader since we were doing synchronous writes before.
> > 
> > > +
> > > +echo "=== Before autodefrag ===" >> $seqres.full
> > > +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $tmp.before
> > > +cat $tmp.before >> $seqres.full
> > > +
> > > +# Drop the cache (including extent map cache per-inode)
> > > +echo 3 > /proc/sys/vm/drop_caches
> > > +
> > > +# Now trigger autodefrag
> > 
> > A bit more explanation would be useful.
> > 
> > Set the commit interval to 1 second, so that 1 second after the
> > remount the transaction kthread runs
> > and wakes up the cleaner kthread, which in turn will run autodefrag.
> > 
> > > +_scratch_remount commit=1
> > > +sleep 3
> > > +sync
> > 
> > This sync is useless, so it should go away.
> 
> Autodefrag doesn't write data back at all.
> It just mark the target range dirty and wait for later writeback.
> 
> Thus sync is still needed AFAIK.

Well, something weird is going on then.

Removing the sync, makes the test still fail on an unpatched kernel,
and on a patched kernel, it still succeeds.

Looking at the .full file, the results are correct:

=== Before autodefrag ===
/home/fdmanana/btrfs-tests/scratch_1/foobar:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..15]:         26672..26687        16   0x0
   1: [16..31]:        26656..26671        16   0x0
   2: [32..47]:        26640..26655        16   0x0
   3: [48..63]:        26624..26639        16   0x1
=== After autodefrag ===
/home/fdmanana/btrfs-tests/scratch_1/foobar:
 EXT: FILE-OFFSET      BLOCK-RANGE      TOTAL FLAGS
   0: [0..63]:         26688..26751        64   0x1

So why this worked seems to be because:

- After doing the writes to the file, a delayed iput on the inode
  did a wakeup on the cleaner kthread;

- The cleaner kthread ran defrag;

- btrfs_remount() calls sync_filesystem(), flushing all delalloc.

So, yes, it's better to leave the 'sync', as there's no way to
guarantee the cleaner ran defrag before remount.

Perhaps a comment about that sync would be useful as well.

Thanks.

> 
> Thanks,
> Qu
> 
> > 
> > Otherwise, it looks good and the test works as expected.
> > 
> > Thanks for doing it.
> > 
> > > +
> > > +echo "=== After autodefrag ===" >> $seqres.full
> > > +$XFS_IO_PROG -c "fiemap -v" "$SCRATCH_MNT/foobar" >> $tmp.after
> > > +cat $tmp.after >> $seqres.full
> > > +
> > > +# The layout should differ if autodefrag is working
> > > +diff $tmp.before $tmp.after > /dev/null && echo "autodefrag didn't defrag the file"
> > > +
> > > +echo "Silence is golden"
> > > +
> > > +# success, all done
> > > +status=0
> > > +exit
> > > diff --git a/tests/btrfs/259.out b/tests/btrfs/259.out
> > > new file mode 100644
> > > index 00000000..bfbd2dea
> > > --- /dev/null
> > > +++ b/tests/btrfs/259.out
> > > @@ -0,0 +1,2 @@
> > > +QA output created by 259
> > > +Silence is golden
> > > --
> > > 2.34.1
> > > 

      reply	other threads:[~2022-02-11 11:52 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-08  7:14 [PATCH] btrfs: add test case to make sure autodefrag works even the extent maps are read from disk Qu Wenruo
2022-02-10 16:37 ` Filipe Manana
2022-02-11  0:01   ` Qu Wenruo
2022-02-11 11:52     ` Filipe Manana [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YgZN9up8fuOt894m@debian9.Home \
    --to=fdmanana@kernel.org \
    --cc=fstests@vger.kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox