From: Filipe Manana <fdmanana@kernel.org>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit
Date: Wed, 26 Jan 2022 11:40:36 +0000 [thread overview]
Message-ID: <YfEzNCybtrSufSvu@debian9.Home> (raw)
In-Reply-To: <20220126005850.14729-2-wqu@suse.com>
On Wed, Jan 26, 2022 at 08:58:49AM +0800, Qu Wenruo wrote:
> In defrag_lookup_extent() we use hardcoded extent size threshold, SZ_128K,
> other than @extent_thresh in btrfs_defrag_file().
>
> This can lead to some inconsistent behavior, especially the default
> extent size threshold is 256K.
>
> Fix this by passing @extent_thresh into defrag_check_next_extent() and
> use that value.
>
> Also, since the extent_thresh check should be applied to all extents,
> not only physically adjacent extents, move the threshold check into a
> dedicate if ().
>
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> fs/btrfs/ioctl.c | 12 +++++++-----
> 1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 0d8bfc716e6b..2911df12fc48 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1050,7 +1050,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
> }
>
> static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> - bool locked)
> + u32 extent_thresh, bool locked)
> {
> struct extent_map *next;
> bool ret = false;
> @@ -1066,9 +1066,11 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> /* Preallocated */
> if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
> goto out;
> - /* Physically adjacent and large enough */
> - if ((em->block_start + em->block_len == next->block_start) &&
> - (em->block_len > SZ_128K && next->block_len > SZ_128K))
> + /* Extent is already large enough */
> + if (next->len >= extent_thresh)
> + goto out;
So this will trigger unnecessary rewrites of compressed extents.
The SZ_128K is there to deal with compressed extents, it has nothing to
do with the threshold passed to the ioctl.
After applying this patchset, if you run a trivial test like this:
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
mkfs.btrfs -f $DEV
mount -o compress $DEV $MNT
xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
sync
# Write to some other file so that the next extent for foobar
# is not contiguous with the first extent.
xfs_io -f -c "pwrite 0 128K" $MNT/baz
sync
xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
sync
echo -e "\n\nTree after creating file:\n\n"
btrfs inspect-internal dump-tree -t 5 $DEV
btrfs filesystem defragment $MNT/foobar
sync
echo -e "\n\nTree after defrag:\n\n"
btrfs inspect-internal dump-tree -t 5 $DEV
umount $MNT
It will result in rewriting the two 128K compressed extents:
(...)
Tree after write and sync:
btrfs-progs v5.12.1
fs tree key (FS_TREE ROOT_ITEM 0)
(...)
item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
index 2 namelen 6 name: foobar
item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
generation 6 type 1 (regular)
extent data disk byte 13631488 nr 4096
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
generation 8 type 1 (regular)
extent data disk byte 14163968 nr 4096
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
(...)
Tree after defrag:
btrfs-progs v5.12.1
fs tree key (FS_TREE ROOT_ITEM 0)
(...)
item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
index 2 namelen 6 name: foobar
item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
generation 9 type 1 (regular)
extent data disk byte 14430208 nr 4096
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
generation 9 type 1 (regular)
extent data disk byte 13635584 nr 4096
extent data offset 0 nr 131072 ram 131072
extent compression 1 (zlib)
In other words, a waste of IO and CPU time.
So it needs to check if we are dealing with compressed extents, and
if so, skip either of them has a size of SZ_128K (and changelog updated).
Thanks.
> + /* Physically adjacent */
> + if ((em->block_start + em->block_len == next->block_start))
> goto out;
> ret = true;
> out:
> @@ -1231,7 +1233,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
> goto next;
>
> next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
> - locked);
> + extent_thresh, locked);
> if (!next_mergeable) {
> struct defrag_target_range *last;
>
> --
> 2.34.1
>
next prev parent reply other threads:[~2022-01-26 11:40 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-01-26 0:58 [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Qu Wenruo
2022-01-26 0:58 ` [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit Qu Wenruo
2022-01-26 11:40 ` Filipe Manana [this message]
2022-01-26 12:26 ` Qu Wenruo
2022-01-26 12:36 ` Filipe Manana
2022-01-26 13:00 ` Qu Wenruo
2022-01-26 13:37 ` Filipe Manana
2022-01-26 23:57 ` Qu Wenruo
2022-01-27 10:58 ` Filipe Manana
2022-01-27 11:11 ` Forza
2022-01-26 0:58 ` [PATCH 3/3] btrfs: defrag: remove the physical adjacent extents rejection in defrag_check_next_extent() Qu Wenruo
2022-01-26 11:44 ` Filipe Manana
2022-01-26 11:26 ` [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Filipe Manana
2022-01-26 11:33 ` Qu Wenruo
2022-01-26 11:47 ` Filipe Manana
2022-01-28 6:31 ` Qu Wenruo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YfEzNCybtrSufSvu@debian9.Home \
--to=fdmanana@kernel.org \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.