public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Filipe Manana <fdmanana@kernel.org>, Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org,
	Christoph Anton Mitterer <calestyo@scientia.org>
Subject: Re: [PATCH 1/2] btrfs: defrag: add under utilized extent to defrag target list
Date: Wed, 7 Feb 2024 07:11:08 +1030	[thread overview]
Message-ID: <54a1bb50-7fd2-440f-8563-a82c54bb2179@gmx.com> (raw)
In-Reply-To: <CAL3q7H72pQ=3wPZpN9zow8+G4xnhSP5UKH1ev808Y5GYYB2BQw@mail.gmail.com>



On 2024/2/7 02:53, Filipe Manana wrote:
> On Mon, Feb 5, 2024 at 11:46 PM Qu Wenruo <wqu@suse.com> wrote:
>>
>> [BUG]
>> The following script can lead to a very under utilized extent and we
>> have no way to use defrag to properly reclaim its wasted space:
>>
>>    # mkfs.btrfs -f $dev
>>    # mount $dev $mnt
>>    # xfs_io -f -c "pwrite 0 128M" $mnt/foobar
>>    # sync
>>    # truncate -s 4k $mnt/foobar
>>    # btrfs filesystem defrag $mnt/foobar
>>    # sync
>>
>> After the above operations, the file "foobar" is still utilizing the
>> whole 128M:
>>
>>          item 4 key (257 INODE_ITEM 0) itemoff 15883 itemsize 160
>>                  generation 7 transid 8 size 4096 nbytes 4096
>>                  block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0
>>                  sequence 32770 flags 0x0(none)
>>          item 5 key (257 INODE_REF 256) itemoff 15869 itemsize 14
>>                  index 2 namelen 4 name: file
>>          item 6 key (257 EXTENT_DATA 0) itemoff 15816 itemsize 53
>>                  generation 7 type 1 (regular)
>>                  extent data disk byte 298844160 nr 134217728 <<<
>>                  extent data offset 0 nr 4096 ram 134217728
>>                  extent compression 0 (none)
>>
>> This is the common btrfs bookend behavior, that 128M extent would only
>> be freed if the last referencer of the extent is freed.
>>
>> The problem is, normally we would go defrag to free that 128M extent,
>> but defrag would not touch the extent at all.
>>
>> [CAUSE]
>> The file extent has no adjacent extent at all, thus all existing defrag
>> code consider it a perfectly good file extent, even if it's only
>> utilizing a very tiny amount of space.
>>
>> [FIX]
>> For a file extent without any adjacent file extent, we should still
>> consider to defrag such under utilized extent, base on the following
>> conditions:
>>
>> - utilization ratio
>>    If the extent is utilizing less than 1/16 of the on-disk extent size,
>>    then it would be a defrag target.
>>
>> - wasted space
>>    If we defrag the extent and can free at least 16MiB, then it would be
>>    a defrag target.
>>
>> Reported-by: Christoph Anton Mitterer <calestyo@scientia.org>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>
> As I said in my previous review, please include the lore link to the
> report, it's always useful.
>
>> ---
>>   fs/btrfs/defrag.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 42 insertions(+)
>>
>> diff --git a/fs/btrfs/defrag.c b/fs/btrfs/defrag.c
>> index 8fc8118c3225..85c6e45d0cd4 100644
>> --- a/fs/btrfs/defrag.c
>> +++ b/fs/btrfs/defrag.c
>> @@ -950,6 +950,38 @@ struct defrag_target_range {
>>          u64 len;
>>   };
>>
>> +/*
>> + * Special entry for extents that do not have any adjacent extents.
>> + *
>> + * This is for cases like the only and truncated extent of a file.
>> + * Normally they won't be defraged at all (as they won't be merged with
>> + * any adjacent ones), but we may still want to defrag them, to free up
>> + * some space if possible.
>> + */
>> +static bool should_defrag_under_utilized(struct extent_map *em)
>
> Can be made const.
>
>> +{
>> +       /*
>> +        * Ratio based check.
>> +        *
>> +        * If the current extent is only utilizing 1/16 of its on-disk size,
>> +        * it's definitely under-utilized, and defragging it may free up
>> +        * the whole extent.
>> +        */
>> +       if (em->len < em->orig_block_len / 16)
>> +               return true;
>> +
>> +       /*
>> +        * Wasted space based check.
>> +        *
>> +        * If we can free up at least 16MiB, then it may be a good idea
>> +        * to defrag.
>> +        */
>> +       if (em->len < em->orig_block_len &&
>> +           em->orig_block_len - em->len > SZ_16M)
>
> The first check, len < orig_block_len, is redundant, isn't it?
> em->len can only be less than or equals to em->orig_block_len.
> The second condition is enough to have.

Not exactly, don't forget compressed file extents.

>
>> +               return true;
>> +       return false;
>> +}
>> +
>>   /*
>>    * Collect all valid target extents.
>>    *
>> @@ -1070,6 +1102,16 @@ static int defrag_collect_targets(struct btrfs_inode *inode,
>>                  if (!next_mergeable) {
>>                          struct defrag_target_range *last;
>>
>> +                       /*
>> +                        * This is a single extent without any chance to merge
>> +                        * with any adjacent extent.
>> +                        *
>> +                        * But if we may free up some space, it is still worth
>> +                        * defragging.
>> +                        */
>> +                       if (should_defrag_under_utilized(em))
>> +                               goto add;
>> +
>
> So this logic is making some cases worse actually, making us use more
> disk space.
> For example:
>
> DEV=/dev/sdi
> MNT=/mnt/sdi
>
> mkfs.btrfs -f $DEV
> mount $DEV $MNT
>
> xfs_io -f -c "pwrite 0 128M" $MNT/foobar
> sync
> xfs_io -c "truncate 40M" $MNT/foobar
> btrfs filesystem defrag $MNT/foobar
>
> After this patch, we get:
>
> item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
> generation 7 type 1 (regular)
> extent data disk byte 1104150528 nr 134217728
> extent data offset 0 nr 8650752 ram 134217728
> extent compression 0 (none)
> item 7 key (257 EXTENT_DATA 8650752) itemoff 15757 itemsize 53
> generation 8 type 1 (regular)
> extent data disk byte 1238368256 nr 33292288
> extent data offset 0 nr 33292288 ram 33292288
> extent compression 0 (none)

This behavior is unexpected, as we should redirty the whole 40M, but the
first 8.25M didn't got re-dirtied is a big problem to me.
Will look into the situation.

>
> So we're now using 128M + 32M of disk space where before defrag we used 128M.
> Before the defrag we had:
>
> item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
> generation 7 type 1 (regular)
> extent data disk byte 1104150528 nr 134217728
> extent data offset 0 nr 41943040 ram 134217728
> extent compression 0 (none)
>
> So something's not good in this logic.
>
> Also, there's something else worth considering here, which is extent sharedness.

This is a known problem for defrag, thus for snapshot/reflink it's
really on the end user to determine whether they really need defrag.

Thanks,
Qu
> If an extent is shared we don't want to always dirty the respective
> range and rewrite it, as that may result in using more disk space...
>
> Example:
>
> xfs_io -f -c "pwrite 0 128M" $MNT/foobar
> btrfs subvolume snapshot $MNT $MNT/snap
> xfs_io -c "truncate 16M" $MNT/foobar
> btrfs filesystem defrag $MNT/foobar
>
> We end up consuming an additional 16M of data without any benefits.
> Without this patch, this wouldn't happen.
>
> Thanks.
>
>
>>                          /* Empty target list, no way to merge with last entry */
>>                          if (list_empty(target_list))
>>                                  goto next;
>> --
>> 2.43.0
>>
>>
>

  reply	other threads:[~2024-02-06 20:41 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-05 23:45 [PATCH 0/2] btrfs: defrag: better lone extent handling Qu Wenruo
2024-02-05 23:45 ` [PATCH 1/2] btrfs: defrag: add under utilized extent to defrag target list Qu Wenruo
2024-02-06 16:23   ` Filipe Manana
2024-02-06 20:41     ` Qu Wenruo [this message]
2024-02-06 22:26       ` Qu Wenruo
2024-02-07 12:19       ` Filipe Manana
2024-02-05 23:45 ` [PATCH 2/2] btrfs: defrag: allow fine-tuning on lone extent defrag behavior Qu Wenruo
2024-02-06 17:03   ` Filipe Manana
2024-02-06 20:50     ` Qu Wenruo
2024-02-07 12:27       ` Filipe Manana
2024-02-07 17:40         ` Christoph Anton Mitterer
2024-02-06 20:27 ` [PATCH 0/2] btrfs: defrag: better lone extent handling Christoph Anton Mitterer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54a1bb50-7fd2-440f-8563-a82c54bb2179@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=calestyo@scientia.org \
    --cc=fdmanana@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox