public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@kernel.org>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit
Date: Wed, 26 Jan 2022 11:40:36 +0000	[thread overview]
Message-ID: <YfEzNCybtrSufSvu@debian9.Home> (raw)
In-Reply-To: <20220126005850.14729-2-wqu@suse.com>

On Wed, Jan 26, 2022 at 08:58:49AM +0800, Qu Wenruo wrote:
> In defrag_lookup_extent() we use hardcoded extent size threshold, SZ_128K,
> other than @extent_thresh in btrfs_defrag_file().
> 
> This can lead to some inconsistent behavior, especially the default
> extent size threshold is 256K.
> 
> Fix this by passing @extent_thresh into defrag_check_next_extent() and
> use that value.
> 
> Also, since the extent_thresh check should be applied to all extents,
> not only physically adjacent extents, move the threshold check into a
> dedicate if ().
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
>  fs/btrfs/ioctl.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
> index 0d8bfc716e6b..2911df12fc48 100644
> --- a/fs/btrfs/ioctl.c
> +++ b/fs/btrfs/ioctl.c
> @@ -1050,7 +1050,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start,
>  }
>  
>  static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
> -				     bool locked)
> +				     u32 extent_thresh, bool locked)
>  {
>  	struct extent_map *next;
>  	bool ret = false;
> @@ -1066,9 +1066,11 @@ static bool defrag_check_next_extent(struct inode *inode, struct extent_map *em,
>  	/* Preallocated */
>  	if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
>  		goto out;
> -	/* Physically adjacent and large enough */
> -	if ((em->block_start + em->block_len == next->block_start) &&
> -	    (em->block_len > SZ_128K && next->block_len > SZ_128K))
> +	/* Extent is already large enough */
> +	if (next->len >= extent_thresh)
> +		goto out;

So this will trigger unnecessary rewrites of compressed extents.
The SZ_128K is there to deal with compressed extents, it has nothing to
do with the threshold passed to the ioctl.

After applying this patchset, if you run a trivial test like this:

   #!/bin/bash

   DEV=/dev/sdj
   MNT=/mnt/sdj

   mkfs.btrfs -f $DEV
   mount -o compress $DEV $MNT

   xfs_io -f -c "pwrite -S 0xab 0 128K" $MNT/foobar
   sync
   # Write to some other file so that the next extent for foobar
   # is not contiguous with the first extent.
   xfs_io -f -c "pwrite 0 128K" $MNT/baz
   sync
   xfs_io -f -c "pwrite -S 0xcd 128K 128K" $MNT/foobar
   sync

   echo -e "\n\nTree after creating file:\n\n"
   btrfs inspect-internal dump-tree -t 5 $DEV

   btrfs filesystem defragment $MNT/foobar
   sync

   echo -e "\n\nTree after defrag:\n\n"
   btrfs inspect-internal dump-tree -t 5 $DEV

   umount $MNT

It will result in rewriting the two 128K compressed extents:

(...)
Tree after write and sync:

btrfs-progs v5.12.1 
fs tree key (FS_TREE ROOT_ITEM 0) 
(...)
	item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
		index 2 namelen 6 name: foobar
	item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
		generation 6 type 1 (regular)
		extent data disk byte 13631488 nr 4096
		extent data offset 0 nr 131072 ram 131072
		extent compression 1 (zlib)
	item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
		generation 8 type 1 (regular)
		extent data disk byte 14163968 nr 4096
		extent data offset 0 nr 131072 ram 131072
		extent compression 1 (zlib)
(...)

Tree after defrag:

btrfs-progs v5.12.1 
fs tree key (FS_TREE ROOT_ITEM 0) 
(...)
	item 7 key (257 INODE_REF 256) itemoff 15797 itemsize 16
		index 2 namelen 6 name: foobar
	item 8 key (257 EXTENT_DATA 0) itemoff 15744 itemsize 53
		generation 9 type 1 (regular)
		extent data disk byte 14430208 nr 4096
		extent data offset 0 nr 131072 ram 131072
		extent compression 1 (zlib)
	item 9 key (257 EXTENT_DATA 131072) itemoff 15691 itemsize 53
		generation 9 type 1 (regular)
		extent data disk byte 13635584 nr 4096
		extent data offset 0 nr 131072 ram 131072
		extent compression 1 (zlib)

In other words, a waste of IO and CPU time.

So it needs to check if we are dealing with compressed extents, and
if so, skip either of them has a size of SZ_128K (and changelog updated).

Thanks.

> +	/* Physically adjacent */
> +	if ((em->block_start + em->block_len == next->block_start))
>  		goto out;
>  	ret = true;
>  out:
> @@ -1231,7 +1233,7 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>  			goto next;
>  
>  		next_mergeable = defrag_check_next_extent(&inode->vfs_inode, em,
> -							  locked);
> +							  extent_thresh, locked);
>  		if (!next_mergeable) {
>  			struct defrag_target_range *last;
>  
> -- 
> 2.34.1
> 

  reply	other threads:[~2022-01-26 11:40 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-26  0:58 [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Qu Wenruo
2022-01-26  0:58 ` [PATCH 2/3] btrfs: defrag: use extent_thresh to replace the hardcoded size limit Qu Wenruo
2022-01-26 11:40   ` Filipe Manana [this message]
2022-01-26 12:26     ` Qu Wenruo
2022-01-26 12:36       ` Filipe Manana
2022-01-26 13:00         ` Qu Wenruo
2022-01-26 13:37           ` Filipe Manana
2022-01-26 23:57             ` Qu Wenruo
2022-01-27 10:58               ` Filipe Manana
2022-01-27 11:11                 ` Forza
2022-01-26  0:58 ` [PATCH 3/3] btrfs: defrag: remove the physical adjacent extents rejection in defrag_check_next_extent() Qu Wenruo
2022-01-26 11:44   ` Filipe Manana
2022-01-26 11:26 ` [PATCH v3 1/3] btrfs: defrag: don't try to merge regular extents with preallocated extents Filipe Manana
2022-01-26 11:33   ` Qu Wenruo
2022-01-26 11:47     ` Filipe Manana
2022-01-28  6:31 ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YfEzNCybtrSufSvu@debian9.Home \
    --to=fdmanana@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox