From: Leo Martins <loemra.dev@gmail.com>
To: Boris Burkov <boris@bur.io>
Cc: Chris Murphy <lists@colorremedies.com>,
kernel-team@fb.com, Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH] btrfs: make periodic dynamic reclaim the default for data
Date: Thu, 23 Oct 2025 16:27:35 -0700
Message-ID: <20251023232737.3346933-1-loemra.dev@gmail.com>
In-Reply-To: <20251022010215.GA167205@zen.localdomain>
On Tue, 21 Oct 2025 18:02:15 -0700 Boris Burkov <boris@bur.io> wrote:
> On Tue, Oct 21, 2025 at 08:37:18PM -0400, Chris Murphy wrote:
> > Thanks for the response.
> >
> > On Tue, Oct 21, 2025, at 6:39 PM, Leo Martins wrote:
> >
> > >
> > > Wanted to provide some data from the Meta rollout to give more context on the
> > > decision to enable dynamic+periodic reclaim by default for data. All the before
> > > numbers are with bg_reclaim_threshold set to 30.
> > >
> > > Enabling dynamic+periodic reclaim for data block groups dramatically decreases
> > > number of reclaims per host, going from 150/day to just 5/day (p99), and from
> > > 6/day to 0/day (p50). The trade-offs are increases in fragmentation, and a
> > > slight uptick in enospcs.
> > >
> > > I currently don't have direct fragmentation metrics, though that is a
> > > work in progress, but I'm tracking FP as a proxy for fragmentation.
> > >
> > > FP = (allocated - used) / allocated
> > > So if there are 100G allocated for data and 80G are used, FP = (100 -
> > > 80) / 100 = 20%.
> > >
> > > FP has increased from 30% to 45% (p99), and from 5% to 7% (p50).
> > > Enospc rates have gone from around 0.5/day to 1/day per 100k hosts.
>
> Leo, correct me if I'm wrong, but we have yet to investigate a system
> where unallocated steadily marched down to 0 since the introduction of
> dynamic reclaim and then it ENOSPC'd, right? If there is a strong,
> undeniable increase in ENOSPCs we should absolutely look for such
> systems in those regions to motivate further improvements with
> full/filling filesystems.
After digging some more, the only examples I found of btrfs hitting ENOSPC
from a lack of unallocated space were true ENOSPCs where either data or
metadata was entirely full.
>
> There is also the confounding variable of the bug fixed here:
> https://lore.kernel.org/linux-btrfs/22e8b64df3d4984000713433a89cfc14309b75fc.1759430967.git.boris@bur.io/
> that has been plaguing our fleet causing ENOSPC issues.
Yes, a deeper look revealed that the increase in ENOSPCs is
due to this bug and not to dynamic+periodic reclaim. In fact,
hosts with dynamic+periodic reclaim enabled see roughly half
the ENOSPC rate of the rest of the fleet.
>
> > > This is a doubling in rate, but still a very small absolute number
> > > of enospcs. The unallocated space on disk decreases by ~15G (p99)
> > > and ~5G (p50) after rollout.
> >
> > I'm curious how it compares with default btrfsmaintenance btrfs-balance.timer/service - I'm guessing this is a bit harder to test at Meta in production due to the strictly time based trigger. And customization ends up being a choice between even higher reclaim or higher enospc.
> >
>
> Yeah, we don't have that data unfortunately.
>
> > > That being said I don't think bg_reclaim_threshold is enabled by default,
> > > and I am comfortable saying dynamic+periodic reclaim is better than no
> > > automatic reclaim!
> >
> > So there are still corner cases occurring even with dynamic periodic reclaim. What do those look like? Is the file system unable to write metadata for arbitrary deletes to back the file system out? Or is it stuck in some cases?
> >
>
> I would imagine the cases that are tough for dynamic reclaim are:
> 1. genuinely quite full fs
> 2. rapidly needs a big hunk of metadata between entering the dynamic
> reclaim zone but before the cleaner thread / reclaim worker can run.
Regarding point 1, dynamic+periodic reclaim actually does a pretty good
job here. I haven't seen any signs of thrashing with low unallocated space.
>
> > ext4 users are used to 5% of space being held in reserve for root user processes. I'm not sure if xfs has such a concept. Btrfs global reserve is different in that even root can't use it, it's really reserved for the kernel. But sometimes it's still possible to exhaust this metadata space, and be unable to delete files or balance even 1 data bg to back the file system out of the situation. The wedged in file system that keeps going read-only and appears stuck is a big concern since users have no idea what to do. And internet searches tend to produce results that are less help than no help.
> >
> > --
> > Chris Murphy
>
> Anyway, I think Leo's forthcoming detailed per-BG fragmentation data
> should be the most telling. System level fragmentation percentage
> isn't the most useful IMO.
>
> Thanks,
> Boris
Since the uptick in ENOSPCs is not actually linked to dynamic+periodic
reclaim, I now feel confident saying that dynamic+periodic reclaim
should be enabled by default for data.
Thanks,
Leo Martins.
Thread overview: 14+ messages
2025-07-15 18:58 [PATCH] btrfs: make periodic dynamic reclaim the default for data Boris Burkov
2025-07-16 6:24 ` Johannes Thumshirn
2025-07-16 15:56 ` Boris Burkov
2025-07-17 12:55 ` Johannes Thumshirn
2025-10-21 18:52 ` Chris Murphy
2025-10-21 22:39 ` Leo Martins
2025-10-22 0:37 ` Chris Murphy
2025-10-22 1:02 ` Boris Burkov
2025-10-23 23:27 ` Leo Martins [this message]
2025-12-13 22:09 ` Neal Gompa
2025-12-26 3:07 ` Sun Yangkai
2025-12-30 0:00 ` Boris Burkov
2025-12-30 1:29 ` Sun Yangkai
2025-12-30 1:41 ` Sun Yangkai