Linux Btrfs filesystem development
From: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
To: Boris Burkov <boris@bur.io>
Cc: "linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	"kernel-team@fb.com" <kernel-team@fb.com>,
	Hans Holmberg <Hans.Holmberg@wdc.com>, hch <hch@lst.de>
Subject: Re: [PATCH] btrfs: make periodic dynamic reclaim the default for data
Date: Thu, 17 Jul 2025 12:55:11 +0000
Message-ID: <051be284-6fe7-4982-a834-e46ce9c124a9@wdc.com>
In-Reply-To: <20250716155640.GA2275999@zen.localdomain>

[+Cc Hans and Christoph, who have looked a lot at GC on zoned XFS lately]

On 16.07.25 17:55, Boris Burkov wrote:
> Thank you for running your perf test on it, excited to hear the results!

The net result is that reclaim kicks in earlier, but the overwrite phase
still isn't as good as I'd like it to be (kind of expected, as you
describe below).

> The reason I didn't propose enabling it for zoned is that I assumed the
> reclaim strategy was too conservative for zoned filesystems. I figured
> you would be reclaiming block_groups more regularly and that the hard
> coded 10G headroom wouldn't work in practice. Also, I'm not sure how the
> flipped threshold works. AFAIK, currently zoned inverts the meaning of
> bg_reclaim_threshold compared to non-zoned so I wonder if it will use a
> threshold of 90 at 9 unalloc down to 10 at 1 unalloc for dynamic...
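
The arithmetic the quoted paragraph wonders about can be sketched in
userspace C. This is a simplified illustration, not the kernel's code:
the function names and the linear mapping are assumptions, and only the
10G headroom and the 90/10 figures come from the quote.

```c
#include <assert.h>
#include <stdint.h>

#define GIB (1ULL << 30)
/* The "hard coded 10G headroom" mentioned in the quote. */
#define HEADROOM_SKETCH (10 * GIB)

/*
 * Assumed non-zoned reading: the less unallocated space is left below the
 * headroom, the higher the reclaim threshold (in percent).
 */
static unsigned int dynamic_threshold_sketch(uint64_t unalloc)
{
	if (unalloc >= HEADROOM_SKETCH)
		return 0;
	return (unsigned int)(((HEADROOM_SKETCH - unalloc) * 100) /
			      HEADROOM_SKETCH);
}

/*
 * The flipped reading from the quote: 90 at 9G unallocated, 10 at 1G.
 * Only meaningful while unalloc is below the headroom.
 */
static unsigned int flipped_threshold_sketch(uint64_t unalloc)
{
	assert(unalloc < HEADROOM_SKETCH);
	return 100 - dynamic_threshold_sketch(unalloc);
}
```

Under these assumptions the two readings move in opposite directions as
unallocated space shrinks, which is the ambiguity being raised.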

Yes, on a zoned FS we (at the moment) don't look at unallocated space
but at space we can't use (zone_unusable), because it is either:

a) an old generation of the data, or
b) the difference between zone_size and zone_capacity on ZNS drives.

But I have the feeling that mixing these two is a problem we didn't
consider back then. For example, on a ZNS drive with a zone size of 2G
and a zone capacity of 1G, 50% of the drive is zone_unusable right
after mkfs.
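
To put a number on that example, here is a small userspace sketch (the
helper name is made up for illustration and is not a btrfs function) of
the share of each zone that is lost to the zone_size/zone_capacity gap:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: percentage of a zone that can never hold data
 * because zone_capacity is smaller than zone_size.
 */
static unsigned int capacity_gap_percent(uint64_t zone_size,
					 uint64_t zone_capacity)
{
	assert(zone_size != 0 && zone_capacity <= zone_size);
	return (unsigned int)(((zone_size - zone_capacity) * 100) / zone_size);
}
```

For a 2G zone with 1G capacity this evaluates to 50, before a single
byte of data has been written.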

Looking at the unusable space instead of the unallocated space might be
a mistake in hindsight, especially as btrfs_zoned_should_reclaim()
compares everything used in the FS (data + unusable + metadata) against
the total size.
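
A minimal userspace sketch of that kind of check, following only the
description above and not the actual kernel implementation. The struct,
the function name and the treatment of a zero threshold as "disabled"
are assumptions made for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the space accounting described above. */
struct space_sketch {
	uint64_t data;
	uint64_t metadata;
	uint64_t zone_unusable;
	uint64_t total;
};

/*
 * Returns true once (data + metadata + unusable) reaches the given
 * percentage of the total size.
 */
static bool should_reclaim_sketch(const struct space_sketch *s,
				  unsigned int threshold_percent)
{
	uint64_t used;

	assert(s->total != 0);
	/* Assumption: a threshold of 0 turns the check off. */
	if (threshold_percent == 0)
		return false;

	used = s->data + s->metadata + s->zone_unusable;
	return (used * 100) / s->total >= threshold_percent;
}
```

With an example threshold of 75, a 200G device built from 2G zones with
1G capacity starts out at a ratio of 50% (100G unusable) and reaches the
threshold once 50G of data has been written, that is, at half of its
writable capacity.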

> While we're on the topic, what would the ideal auto reclaim for zoned
> look like? 

Good question. Unfortunately I've been thinking about this for several
weeks now and haven't found an answer yet.

> Maybe we could track "finished" block_groups and trigger
> reclaim on the smallest ones (perhaps with the full-ness threshold) as
> that number goes up?

That was more or less the idea with the current zoned GC code. If 75% of
the drive is unusable, start cleaning it up. But it does so in one
batch, causing latency spikes and/or premature ENOSPC, because it runs
in the cleaner kthread and the ticketing code isn't aware of it (see my
RFC patches from the last 4-6 weeks on the list, which document my
failed attempts).
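
The quoted suggestion (pick the smallest "finished" block groups, gated
by a fullness threshold) could look roughly like the sketch below. It is
a userspace illustration only; the struct, its fields and the function
name are hypothetical and do not correspond to existing btrfs code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a block group. */
struct bg_sketch {
	uint64_t length;   /* size of the block group */
	uint64_t used;     /* bytes still referenced */
	bool finished;     /* no further writes will land here */
};

/*
 * Returns the index of the finished block group with the fewest used
 * bytes whose fullness is at most max_used_percent, or -1 if none fits.
 */
static long pick_reclaim_candidate(const struct bg_sketch *bgs, size_t n,
				   unsigned int max_used_percent)
{
	long best = -1;

	for (size_t i = 0; i < n; i++) {
		const struct bg_sketch *bg = &bgs[i];

		assert(bg->length != 0);
		if (!bg->finished)
			continue;
		if ((bg->used * 100) / bg->length > max_used_percent)
			continue;
		if (best < 0 || bg->used < bgs[best].used)
			best = (long)i;
	}
	return best;
}
```

Picking one candidate per call, instead of sweeping everything over the
threshold, is one way to avoid the single large batch described above.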

> Another idea for an extension that I was kicking around that I think
> would make sense for both zoned and non-zoned was to keep the current
> logic for the "we're out of unallocated" side of things but to add a
> slow burn of reclaims metered by reclaim_bytes / reclaim_extents at some
> slow pace. This would try to reasonably keep up with general
> fragmentation in the sub-critical condition without ever doing a large
> amount of reclaim.

This one sounds like an interesting idea. Give me some more time to
think it over.
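
One possible shape for such a metered "slow burn" is a refillable byte
budget, sketched below in userspace C. Everything here is hypothetical:
the struct, the function names and the per-second refill are assumptions
for illustration, not something proposed in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical budget for background reclaim. */
struct reclaim_budget {
	uint64_t rate_bytes_per_sec; /* how fast the budget refills */
	uint64_t cap_bytes;          /* ceiling, so idle time can't bank a burst */
	uint64_t avail_bytes;        /* bytes that may be relocated right now */
};

/* Add elapsed_sec worth of budget, clamped to the ceiling. */
static void budget_refill(struct reclaim_budget *b, uint64_t elapsed_sec)
{
	uint64_t add = b->rate_bytes_per_sec * elapsed_sec;
	uint64_t room;

	assert(b->avail_bytes <= b->cap_bytes);
	room = b->cap_bytes - b->avail_bytes;
	b->avail_bytes += add < room ? add : room;
}

/* Spend budget for one block group; refuse if it doesn't fit. */
static bool budget_try_reclaim(struct reclaim_budget *b,
			       uint64_t bg_used_bytes)
{
	if (bg_used_bytes > b->avail_bytes)
		return false;
	b->avail_bytes -= bg_used_bytes;
	return true;
}
```

The ceiling is what keeps this from ever turning into a large amount of
reclaim at once: a long idle period cannot accumulate more than
cap_bytes of work.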

Thread overview: 14+ messages
2025-07-15 18:58 [PATCH] btrfs: make periodic dynamic reclaim the default for data Boris Burkov
2025-07-16  6:24 ` Johannes Thumshirn
2025-07-16 15:56   ` Boris Burkov
2025-07-17 12:55     ` Johannes Thumshirn [this message]
2025-10-21 18:52 ` Chris Murphy
2025-10-21 22:39   ` Leo Martins
2025-10-22  0:37     ` Chris Murphy
2025-10-22  1:02       ` Boris Burkov
2025-10-23 23:27         ` Leo Martins
2025-12-13 22:09           ` Neal Gompa
2025-12-26  3:07 ` Sun Yangkai
2025-12-30  0:00   ` Boris Burkov
2025-12-30  1:29     ` Sun Yangkai
2025-12-30  1:41     ` Sun Yangkai
