public inbox for linux-btrfs@vger.kernel.org
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Hanabishi <i.r.e.c.c.a.k.u.n+kernel.org@gmail.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: 'btrfs filesystem defragment' makes files explode in size, especially fallocated ones
Date: Tue, 6 Aug 2024 20:53:04 +0930	[thread overview]
Message-ID: <d089a164-b2e8-4d29-8d96-41b12cbfae42@gmx.com> (raw)
In-Reply-To: <ef164317-6472-4808-83cf-acaa2b8ab758@gmail.com>



On 2024/8/6 20:35, Hanabishi wrote:
> On 8/6/24 10:42, Qu Wenruo wrote:
>
>> Too low values means kernel will trigger dirty writeback aggressively, I
>> believe for all extent based file systems (ext4/xfs/btrfs etc), it would
>> cause a huge waste of metadata, due to the huge amount of small extents.
>>
>> So yes, that setting is the cause, although it will reduce the memory
>> used by page cache (it still counts as memory pressure), but the cost is
>> more fragmented extents and overall worse fs performance and possibly
>> more wear on NAND based storage.
>
> Thanks for the explanation. I'm aware of the performance tradeoffs of a
> low dirty page cache; I prefer more reliability in case of system
> failure / power outage.
> But that raises questions anyway.
>
> 1. Why are files ok initially regardless of page cache size? It only
> blows up with an explicit run of the defragment command. And I didn't
> face anything similar with other filesystems either.

Because btrfs merges extents that are physically adjacent at fiemap time.

Especially if you use fallocate: the initial writes are guaranteed to
land in that preallocated range.
Although they may be split into many small extents, they are still
physically adjacent.

When defrag happens, it triggers data COW, which breaks that physical
adjacency.
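
As a quick illustration (just a sketch on any Linux box, not
btrfs-specific; the path and sizes are made up), you can watch small
writes into a preallocated range stay physically adjacent:

```shell
# Hypothetical demo: preallocate 16 MiB, fill it with 1 MiB writes,
# then dump the extent layout via the FIEMAP ioctl (filefrag -v).
# On btrfs, fiemap merges extents that are physically adjacent.
f=/tmp/prealloc-demo.bin
fallocate -l 16M "$f"                    # reserve the range up front
for i in $(seq 0 15); do
    dd if=/dev/zero of="$f" bs=1M count=1 seek="$i" \
       conv=notrunc status=none          # small writes inside the range
done
sync                                     # flush delayed allocation
if command -v filefrag >/dev/null 2>&1; then
    filefrag -v "$f"                     # extent count + physical offsets
fi
```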

>
> 2. How do I get my space back without deleting the files? Even if I
> crank up the page cache amount and then defragment "properly", it
> doesn't reclaim the actual space.
>
> # btrfs filesystem defragment mingw-w64-gcc-13.1.0-1-x86_64.pkg.tar.zst
>
> # compsize mingw-w64-gcc-13.1.0-1-x86_64.pkg.tar.zst
> Processed 1 file, 3 regular extents (3 refs), 0 inline.
> Type       Perc     Disk Usage   Uncompressed Referenced
> TOTAL      100%      449M         449M         224M
> none       100%      449M         449M         224M
>
> There are 3 extents, it's definitely not a metadata overhead.

I'm not sure what value you set, but please at least redo everything
with the default kernel settings, not with the dirty thresholds
cranked down.

And have you tried sync before compsize/fiemap?

If you still have problems reclaiming the space, please provide the
fiemap output (before defrag, and again after defrag and sync).
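
Something like this would capture it (just a sketch, guarded so it is a
no-op without btrfs-progs; the file name is the one from your example):

```shell
# Capture extent maps before and after defrag + sync.
f=mingw-w64-gcc-13.1.0-1-x86_64.pkg.tar.zst
if command -v btrfs >/dev/null 2>&1 && [ -e "$f" ]; then
    sync                                  # flush delalloc first
    filefrag -v "$f" > fiemap.before      # FIEMAP before defrag
    btrfs filesystem defragment "$f"
    sync
    filefrag -v "$f" > fiemap.after       # FIEMAP after defrag + sync
fi
```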

>
> 3. Regardless of settings, what if users do end up in low memory
> conditions for some reason? It's not an uncommon scenario.
> You end up with Btrfs borking your disk space. In my opinion it looks
> like a bug and should not happen.
>

If we tried to lock the defrag range to ensure the pages land in a
larger extent, I'm 100% sure the MM folks wouldn't be happy: it would
block the most common way to reclaim memory.

That approach would only exhaust system memory at the worst possible
time.


IIRC it's already in the documentation, although not that clearly:

   The value is only advisory and the final size of the extents may
   differ, depending on the state of the free space and fragmentation or
   other internal logic.
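
For context, that passage is about the target extent size knob (-t) of
the defragment command; the size and path below are just placeholders:

```shell
# Ask (advisory only) for 32 MiB target extents on one file.
# The kernel may still emit smaller extents depending on free space
# fragmentation and internal logic, as the documentation says.
f=/path/to/some/file
if command -v btrfs >/dev/null 2>&1 && [ -e "$f" ]; then
    btrfs filesystem defragment -t 32M -v "$f"
fi
```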

To be honest, defrag is already not recommended for modern extent-based
file systems, so there is no longer a common, good example to follow.

And for COW file systems, btrfs' specific bookend-extent behavior adds
a new level of complexity on top of that.

So overall, if you're not sure what the defrag internal logic is, and
don't have a clear problem you want to solve, do not defrag.

Thanks,
Qu

Thread overview: 19+ messages
2024-08-04  9:20 'btrfs filesystem defragment' makes files explode in size, especially fallocated ones i.r.e.c.c.a.k.u.n+kernel.org
2024-08-04 22:19 ` Qu Wenruo
2024-08-05 18:16   ` Hanabishi
2024-08-05 22:47     ` Qu Wenruo
2024-08-06  7:19       ` Hanabishi
2024-08-06  9:55         ` Qu Wenruo
2024-08-06 10:23           ` Hanabishi
2024-08-06 10:42             ` Qu Wenruo
2024-08-06 11:05               ` Hanabishi
2024-08-06 11:23                 ` Qu Wenruo [this message]
2024-08-06 12:08                   ` Hanabishi
2024-08-06 22:10                     ` Qu Wenruo
2024-08-06 22:42                       ` Hanabishi
2024-08-06 22:51                         ` Qu Wenruo
2024-08-06 23:04                           ` Hanabishi
2024-08-06 12:17                   ` Hanabishi
2024-08-06 13:22                     ` Hanabishi
2024-08-06 22:18                       ` Qu Wenruo
2024-08-06 22:55                         ` Hanabishi
