From: Boris Burkov <boris@bur.io>
To: Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>,
linux-btrfs@vger.kernel.org, Filipe Manana <fdmanana@suse.com>,
David Sterba <dsterba@suse.com>,
Hans Holmberg <Hans.Holmberg@wdc.com>,
Damien Le Moal <dlemoal@kernel.org>,
Naohiro Aota <naohiro.aota@wdc.com>
Subject: Re: [PATCH 5/7] btrfs: zoned: subtract zone_unusable space in statfs
Date: Fri, 15 May 2026 14:05:11 -0700
Message-ID: <20260515210511.GA2887949@zen.localdomain>
In-Reply-To: <20260515113430.GA25099@lst.de>

On Fri, May 15, 2026 at 01:34:30PM +0200, Christoph Hellwig wrote:
> On Fri, May 15, 2026 at 11:26:22AM +0200, Johannes Thumshirn wrote:
> > On 5/15/26 6:39 AM, Christoph Hellwig wrote:
> >> On Wed, May 13, 2026 at 02:34:43PM +0200, Johannes Thumshirn wrote:
> >>> On zoned filesystems, space in block groups that has been freed but not
> >>> yet reset is tracked in bytes_zone_unusable. This space cannot be used for
> >>> new allocations until zone reclaim resets the zones, but it was being
> >>> reported as available space in statfs.
> >>>
> >>> This caused statfs to over-report free space, leading to ENOSPC errors
> >>> when applications tried to allocate based on the reported free space.
> >>>
> >>> Fix this by subtracting bytes_zone_unusable from total_free_data in the
> >>> statfs calculation for zoned filesystems.
> >> This OTOH sounds very wrong. Freed but not reclaimed space is usable,
> >> it just needs work.
> >>
> > man statfs says:
> >
> > fsblkcnt_t f_bavail; /* Free blocks available to
> > unprivileged user */
> >
> > which in my interpretation would include zone_unusable_bytes, as they're
> > not "free blocks available". Or would that be accounted as f_bfree?
>
> The only difference between f_bavail and f_bfree is reservations for
> the superuser, which IIRC only extN do on Linux.
>
> f_bavail is all the space you as a user could potentially use on
> the file system (modulo quota restrictions).
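For reference, a minimal sketch of how userspace sees the two fields
being discussed. It uses Python's os.statvfs(), which wraps statvfs(3);
the helper name free_space_report and the path "/" are placeholders
picked for illustration, not anything from the patch series. It only
reads what the filesystem reports and says nothing about how btrfs
computes the values.

```python
import os


def free_space_report(path="/"):
    """Return (free_bytes, avail_bytes) as reported by statvfs for `path`.

    free_bytes  comes from f_bfree  (all free blocks)
    avail_bytes comes from f_bavail (free blocks available to
                                     unprivileged users)
    """
    st = os.statvfs(path)
    # f_bfree and f_bavail are counted in units of f_frsize bytes.
    free_bytes = st.f_bfree * st.f_frsize
    avail_bytes = st.f_bavail * st.f_frsize
    return free_bytes, avail_bytes


if __name__ == "__main__":
    free_bytes, avail_bytes = free_space_report("/")
    print(f"f_bfree : {free_bytes} bytes")
    print(f"f_bavail: {avail_bytes} bytes")
    # Whatever the filesystem withholds from unprivileged users shows
    # up as the gap between the two numbers.
    print(f"gap     : {free_bytes - avail_bytes} bytes")
```

An application that sizes its writes from f_bavail is the case the
commit message describes, so the number reported there is what matters
for the ENOSPC behavior.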
I think it would be great to discuss this in more detail. The whole
"reservations for the superuser" thing never made a huge amount of sense
to me, and it feels under-specified.

I take it to spiritually mean "space allocated to other crap definitely
needed to do your data allocation anyway". So in btrfs that would
mean something like "an estimate of how much metadata you would need for
the remaining space"?

Returning to the actual implementation and this concrete question:
AIUI, we calculate f_bavail without accounting for any stranded metadata
space that we could free up by running 'btrfs balance start -m', for
example.
Furthermore, f_bavail drops discontinuously to zero in edge cases where
the potential space for metadata is exhausted, even if there is still
space allocated for data.

So I think Johannes's change certainly fits with that conservative (and
RAID-aware) nature of the bavail calculation.

On the other hand, given that zone reclaim is guaranteed to happen
automatically, I can see how it could be treated differently from
metadata balancing. But if we made metadata balancing happen
automatically in a flusher, would we need to change f_bavail to count
stranded metadata space too?

There was another recent discussion on this, where you said we should
avoid over-estimating:
https://lore.kernel.org/linux-btrfs/efcb3c8470bf099534fec1c1745780eba445e111.1774394322.git.loemra.dev@gmail.com/

So I think the main question is whether btrfs guarantees that an
allocation will succeed when the space it needs is stranded in zones and
only becomes available "after a balance". Should it admit those writes?
(Is this the right test?)

I think in this case, given that balance is in a flusher, it should, but
it is better to say "everything" out loud to leave less to the
imagination. And we should confirm that this property really holds: that
such a write will succeed after painful extra balancing work. It is
possible for the balance to fail, for example, due to insufficient free
space for a large extent that needs balancing. Maybe because of the
reloc zone, that isn't true for zoned? Either way, does that aspect of
it change how we feel?

Thanks,
Boris