From: Nicholas D Steeves <nsteeves@gmail.com>
To: Boris Burkov <boris@bur.io>, Chris Murphy <lists@colorremedies.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: permanently wedged in filesystem, fs/btrfs/relocation.c:1937 prepare_to_merge
Date: Thu, 03 Aug 2023 21:23:34 -0400
Message-ID: <87fs4ztxbd.fsf@digitalMercury.freeddns.org>
In-Reply-To: <20230803211258.GA3669918@zen>
Boris Burkov <boris@bur.io> writes:
> On Thu, Jul 20, 2023 at 09:42:37AM -0400, Chris Murphy wrote:
>
> The btrfs allocator is far from perfect and despite a few measures that
> attempt to prevent fragmentation, it can still happen. If you have a
> system that reproduces this, you can consider using the scripts I wrote
> here: https://github.com/josefbacik/fsperf/tree/master/src/frag to dump
> the fragmentation level of the FS (and even visualize it) to confirm my
> hypothesis. I'm happy to help you get that up and running.
>
> Now let's suppose you do have a workload that challenges our allocator,
> fragments the data block groups, and chews through all the unallocated
> space. We have a lot of those at Meta, so luckily, there is some relief
> available.
>
> Fundamentally the remediation is to defragment the disk, which we do
> with data block group balancing. You can invoke this manually with:
> `btrfs balance start -dusage=<thresh> <fs>`
> where <thresh> is a percentage fullness of data block_groups to target
> with balancing. Lower is more conservative so you can start low and
> increase it to 80 or so till you reclaim enough space. If you use that,
> it's better to do it proactively periodically rather than after you get
> stuck, 'cause as you saw, balances start failing with ENOSPC too.
> (see point 2. above :))
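For reference, the "start low and increase" approach quoted above could
be scripted roughly as follows. This is only a sketch under some
assumptions: the function names, the 10-to-80 range, and the step of 10
are illustrative choices of mine, not anything from fsperf or
btrfs-progs. The only real interface used is the `usage` filter of
`btrfs balance start` (`-dusage=N`), which restricts the balance to data
block groups that are at most N percent full.

```python
import subprocess


def balance_steps(mount, start=10, stop=80, step=10):
    """Build one `btrfs balance start` argv per usage threshold, low to high.

    start/stop/step are illustrative defaults (assumptions), not
    recommended values.
    """
    if not (0 <= start <= stop <= 100) or step <= 0:
        raise ValueError("need 0 <= start <= stop <= 100 and step > 0")
    return [
        ["btrfs", "balance", "start", f"-dusage={thresh}", mount]
        for thresh in range(start, stop + 1, step)
    ]


def run_balance(mount, dry_run=True, **thresholds):
    """Run the stepped balance, stopping at the first failing step.

    With dry_run=True (the default) the commands are only printed.
    Returns the list of argv lists that were printed or executed.
    """
    done = []
    for argv in balance_steps(mount, **thresholds):
        if dry_run:
            print(" ".join(argv))
        elif subprocess.run(argv).returncode != 0:
            # A failing step (for example ENOSPC) ends the run early.
            break
        done.append(argv)
    return done
```

Calling `run_balance("/mnt")` prints the eight commands without touching
the filesystem; `run_balance("/mnt", dry_run=False)` would need root and
rewrites data, so it is something to try on a test filesystem first.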
Would it be useful to use fsperf's frag (module?) in combination with
the required btrd to periodically assess the state of fragmentation?
What are the downsides of doing this?
I'm specifically interested in minimising the risk of "everything was
fine until the fs blew up". Running this test periodically seems like it
would give the sysadmin the data needed to judge whether the risk of
rewriting data at rest with a rebalance is lower than the risk of
hitting issues triggered by the less-than-perfect allocator.
Because it sounds like there still exist workloads that necessitate
periodic rebalancing, sysadmins need a way to determine the degree of
need for rebalancing in order to define a mitigation policy in a
fact-based way.
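To make "fact-based" concrete, here is a minimal sketch of the kind of
check such a policy might use. It is an assumption-laden illustration,
not a substitute for fsperf's frag output: it only looks at how much
device space is unallocated and how full the allocated data block groups
are (figures that could be read from `btrfs filesystem usage -b`), and
it says nothing about free-space fragmentation inside block groups. The
function name and both thresholds are hypothetical.

```python
def balance_looks_worthwhile(data_allocated, data_used, unallocated,
                             device_size,
                             min_unallocated_frac=0.10,
                             max_data_fill=0.75):
    """Return True when a data balance looks worth its rewrite cost.

    All sizes are in bytes. The two thresholds are hypothetical policy
    knobs (assumptions), not values recommended by anyone in this thread.
    """
    if device_size <= 0 or data_allocated <= 0:
        raise ValueError("device_size and data_allocated must be positive")
    # Share of the device not yet assigned to any block group.
    unallocated_frac = unallocated / device_size
    # How full the already-allocated data block groups are.
    data_fill = data_used / data_allocated
    # Little unallocated space left while the data block groups are
    # sparsely filled suggests a balance could hand space back.
    return (unallocated_frac < min_unallocated_frac
            and data_fill < max_data_fill)
```

A cron job or monitoring check could call this with the current numbers
and alert, rather than balance automatically, when it returns True.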
Is fsperf the correct tool for this general case, or should we be using
something else?
Thanks!
Nicholas
P.S. Please CC me in replies.