Linux Btrfs filesystem development
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: Andrei Borzenkov <arvidjaar@gmail.com>, linux-btrfs@vger.kernel.org
Cc: Christoph Anton Mitterer <calestyo@scientia.org>
Subject: Re: [PATCH] btrfs: defrag: add under utilized extent to defrag target list
Date: Sat, 6 Jan 2024 06:41:56 +1030	[thread overview]
Message-ID: <a83cb94c-df03-446c-ab20-93f252d95724@suse.com> (raw)
In-Reply-To: <dd314b94-6c45-4c3d-9c6e-4def5b67aafe@gmail.com>


[-- Attachment #1.1.1: Type: text/plain, Size: 4182 bytes --]



On 2024/1/6 03:15, Andrei Borzenkov wrote:
> On 05.01.2024 10:33, Qu Wenruo wrote:
>> [BUG]
>> The following script can lead to a very under utilized extent and we
>> have no way to use defrag to properly reclaim its wasted space:
>>
>>    # mkfs.btrfs -f $dev
>>    # mount $dev $mnt
>>    # xfs_io -f -c "pwrite 0 128M" $mnt/foobar
>>    # sync
>>    # btrfs filesystem defrag $mnt/foobar
>>    # sync
>>
>> After the above operations, the file "foobar" is still utilizing the
>> whole 128M:
>>
>>          item 4 key (257 INODE_ITEM 0) itemoff 15883 itemsize 160
>>                  generation 7 transid 8 size 4096 nbytes 4096
>>                  block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0
>>                  sequence 32770 flags 0x0(none)
>>          item 5 key (257 INODE_REF 256) itemoff 15869 itemsize 14
>>                  index 2 namelen 4 name: file
>>          item 6 key (257 EXTENT_DATA 0) itemoff 15816 itemsize 53
>>                  generation 7 type 1 (regular)
>>                  extent data disk byte 298844160 nr 134217728 <<<
>>                  extent data offset 0 nr 4096 ram 134217728
>>                  extent compression 0 (none)
>>
>> Meaning the expected defrag way to reclaim the space is not working.
>>
>> [CAUSE]
>> The file extent has no adjacent extent at all, thus all existing defrag
>> code consider it a perfectly good file extent, even if it's only
>> utilizing a very tiny amount of space.
>>
>> [FIX]
>> Add a special handling for under utilized extents, currently the ratio
>> is 6.25% (1/16).
>>
>> This would allow us to add such extent to our defrag target list,
>> resulting it to be properly defragged.
>>
> 
> This sounds like the very wrong thing to do unconditionally. You have no 
> knowledge of application I/O pattern and do not know whether application 
> is going to fill this extent or not. Instead of de-fragmenting you are 
> effectively fragmenting.

This is not exactly true.

If you have one file extent only referring 1/16 of a larger extent, it 
already means the extent itself is fragmented.

Either through truncate (the case we're getting into), or tons of later 
COW leading to this situation (which would also be defragged by the 
existing defrag conditions)

If you want to prove it wrong, please give me a counter example.

Thanks,
Qu
> 
> This sounds more like a target for a different operation, which may be 
> called "garbage collection". But it should be explicit operation not 
> tied to defragmentation (or at least explicit opt-in for defragmentation).
> 
>> Reported-by: Christoph Anton Mitterer <calestyo@scientia.org>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/defrag.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/fs/btrfs/defrag.c b/fs/btrfs/defrag.c
>> index c276b136ab63..cc319190b6fb 100644
>> --- a/fs/btrfs/defrag.c
>> +++ b/fs/btrfs/defrag.c
>> @@ -1070,6 +1070,17 @@ static int defrag_collect_targets(struct 
>> btrfs_inode *inode,
>>           if (!next_mergeable) {
>>               struct defrag_target_range *last;
>> +            /*
>> +             * Special entry point utilization ratio under 1/16 (only
>> +             * referring 1/16 of an on-disk extent).
>> +             * This can happen for a truncated large extent.
>> +             * If we don't add them, then for a truncated file
>> +             * (may be the last 4K of a 128M extent) it will never
>> +             * be defraged.
>> +             */
>> +            if (em->ram_bytes < em->orig_block_len / 16)
>> +                goto add;
>> +
>>               /* Empty target list, no way to merge with last entry */
>>               if (list_empty(target_list))
>>                   goto next;
> 

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 7027 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

  reply	other threads:[~2024-01-05 20:12 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-05  7:33 [PATCH] btrfs: defrag: add under utilized extent to defrag target list Qu Wenruo
2024-01-05 16:45 ` Andrei Borzenkov
2024-01-05 20:11   ` Qu Wenruo [this message]
2024-01-09 14:55 ` Filipe Manana
2024-01-09 16:12   ` Filipe Manana
2024-01-09 21:04   ` Qu Wenruo
2024-01-09 21:57     ` Christoph Anton Mitterer
2024-01-09 22:17       ` Qu Wenruo
2024-01-10 17:09 ` David Sterba
2024-01-11  6:24   ` Qu Wenruo
2024-01-12 15:58     ` David Sterba
2024-01-13  3:17       ` Qu Wenruo
2024-01-13  8:05         ` Andrei Borzenkov
2024-01-13  8:32           ` Qu Wenruo
2024-01-13  3:47       ` Christoph Anton Mitterer
2024-02-05  5:39 ` Christoph Anton Mitterer
2024-02-05  5:42   ` Qu Wenruo
2024-04-20  4:30     ` Skirnir Torvaldsson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a83cb94c-df03-446c-ab20-93f252d95724@suse.com \
    --to=wqu@suse.com \
    --cc=arvidjaar@gmail.com \
    --cc=calestyo@scientia.org \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox