public inbox for linux-btrfs@vger.kernel.org
From: Hanabishi <i.r.e.c.c.a.k.u.n+kernel.org@gmail.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>, linux-btrfs@vger.kernel.org
Subject: Re: 'btrfs filesystem defragment' makes files explode in size, especially fallocated ones
Date: Tue, 6 Aug 2024 11:05:57 +0000	[thread overview]
Message-ID: <ef164317-6472-4808-83cf-acaa2b8ab758@gmail.com> (raw)
In-Reply-To: <e72e1aed-4493-4d03-81cd-a88abcda5051@gmx.com>

On 8/6/24 10:42, Qu Wenruo wrote:

> Too low values means kernel will trigger dirty writeback aggressively, I
> believe for all extent based file systems (ext4/xfs/btrfs etc), it would
> cause a huge waste of metadata, due to the huge amount of small extents.
> 
> So yes, that setting is the cause, although it will reduce the memory
> used by page cache (it still counts as memory pressure), but the cost is
> more fragmented extents and overall worse fs performance and possibly
> more wear on NAND based storage.

Thanks for the explanation. I'm aware of the performance tradeoffs of a low dirty page cache; I prefer more reliability in case of a system failure or power outage.
But that raises questions anyway.

1. Why are files fine initially, regardless of the page cache size? The size only blows up after an explicit run of the defragment command. And I didn't encounter anything similar on other filesystems either.

2. How do I get my space back without deleting the files? Even if I crank the page cache limits back up and then defragment "properly", the actual space is not reclaimed.

# btrfs filesystem defragment mingw-w64-gcc-13.1.0-1-x86_64.pkg.tar.zst

# compsize mingw-w64-gcc-13.1.0-1-x86_64.pkg.tar.zst
Processed 1 file, 3 regular extents (3 refs), 0 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL      100%      449M         449M         224M
none       100%      449M         449M         224M

With only 3 extents, it's definitely not metadata overhead.
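
For completeness, the one workaround that does seem to release the space without deleting data is rewriting the file entirely, so the old extents lose their last reference (a sketch; `cp --reflink=never` is GNU coreutils and forces a real data copy rather than sharing extents):

```shell
# Rewrite a file in place so btrfs allocates fresh extents and can free the
# old, partially-referenced ones. Guarded so it is a no-op if the file is absent.
f=mingw-w64-gcc-13.1.0-1-x86_64.pkg.tar.zst
if [ -f "$f" ]; then
    cp --reflink=never --preserve=all "$f" "$f.tmp"
    mv "$f.tmp" "$f"
fi
```

But that is a workaround, not an answer to why defragment pins the space in the first place.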

3. Regardless of the settings, what if users do end up in low memory conditions for some reason? It's not an uncommon scenario.
They end up with Btrfs wasting their disk space. In my opinion this looks like a bug and should not happen.



Thread overview: 19+ messages
2024-08-04  9:20 'btrfs filesystem defragment' makes files explode in size, especially fallocated ones i.r.e.c.c.a.k.u.n+kernel.org
2024-08-04 22:19 ` Qu Wenruo
2024-08-05 18:16   ` Hanabishi
2024-08-05 22:47     ` Qu Wenruo
2024-08-06  7:19       ` Hanabishi
2024-08-06  9:55         ` Qu Wenruo
2024-08-06 10:23           ` Hanabishi
2024-08-06 10:42             ` Qu Wenruo
2024-08-06 11:05               ` Hanabishi [this message]
2024-08-06 11:23                 ` Qu Wenruo
2024-08-06 12:08                   ` Hanabishi
2024-08-06 22:10                     ` Qu Wenruo
2024-08-06 22:42                       ` Hanabishi
2024-08-06 22:51                         ` Qu Wenruo
2024-08-06 23:04                           ` Hanabishi
2024-08-06 12:17                   ` Hanabishi
2024-08-06 13:22                     ` Hanabishi
2024-08-06 22:18                       ` Qu Wenruo
2024-08-06 22:55                         ` Hanabishi
