From: Brandon Heisner <brandonh@wolfram.com>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: btrfs metadata has reserved 1T of extra space and balances don't reclaim it
Date: Fri, 1 Oct 2021 02:49:39 -0500 (CDT) [thread overview]
Message-ID: <1185660843.2173930.1633074579864.JavaMail.zimbra@wolfram.com> (raw)
In-Reply-To: <20210929173055.GO29026@hungrycats.org>
A reboot of the server helped quite a bit, but it didn't fix the problem completely. I went from having 1.08T reserved for metadata to "only" 446G reserved, and my free space went from 346G to 1010G, so at least I have some breathing room again. I'd prefer not to defrag, since that breaks the CoW links and disk usage would go up. I haven't yet tried balancing all of the metadata, which might be resource-intensive.
# btrfs fi us /opt/zimbra/ -T
Overall:
    Device size:                   5.82TiB
    Device allocated:              4.36TiB
    Device unallocated:            1.46TiB
    Device missing:                  0.00B
    Used:                          3.05TiB
    Free (estimated):           1010.62GiB  (min: 1010.62GiB)
    Data ratio:                       2.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB  (used: 0.00B)

             Data       Metadata   System
Id Path      RAID10     RAID10     RAID10     Unallocated
-- --------  ---------  ---------  ---------  -----------
 1 /dev/sdc  446.25GiB  111.50GiB   32.00MiB    932.63GiB
 2 /dev/sdd  446.25GiB  111.50GiB   32.00MiB    932.63GiB
 3 /dev/sde  446.25GiB  111.50GiB   32.00MiB    932.63GiB
 4 /dev/sdf  446.25GiB  111.50GiB   32.00MiB    932.63GiB
-- --------  ---------  ---------  ---------  -----------
   Total       1.74TiB  446.00GiB  128.00MiB      3.64TiB
   Used        1.49TiB   38.16GiB  464.00KiB
# btrfs fi df /opt/zimbra/
Data, RAID10: total=1.74TiB, used=1.49TiB
System, RAID10: total=128.00MiB, used=464.00KiB
Metadata, RAID10: total=446.00GiB, used=38.19GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
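For anyone hitting this later: the number to watch above is the gap between metadata "total" (allocated) and "used". As a rough sketch (my own script, not a btrfs tool; the parsing regex is just my assumption about the `btrfs fi df` line format shown above), that slack can be computed like this:

```python
import re

# Byte multipliers for the IEC units btrfs-progs prints.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# One 'btrfs fi df' line, e.g. "Metadata, RAID10: total=446.00GiB, used=38.19GiB"
LINE_RE = re.compile(
    r"^(?P<bg>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+)(?P<tu>[KMGT]?iB|B), "
    r"used=(?P<used>[\d.]+)(?P<uu>[KMGT]?iB|B)$"
)

def parse_fi_df(text):
    """Map block-group name -> (total_bytes, used_bytes)."""
    rows = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            rows[m["bg"]] = (
                float(m["total"]) * UNITS[m["tu"]],
                float(m["used"]) * UNITS[m["uu"]],
            )
    return rows

def slack_gib(rows, bg="Metadata"):
    """Allocated-but-unused space for one block group, in GiB."""
    total, used = rows[bg]
    return (total - used) / UNITS["GiB"]
```

On the output above this reports roughly 408GiB of metadata slack, versus the ~1TiB before the reboot.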
----- On Sep 29, 2021, at 12:31 PM, Zygo Blaxell <ce3g8jdj@umail.furryterror.org> wrote:
> On Tue, Sep 28, 2021 at 09:23:01PM -0500, Brandon Heisner wrote:
>> I have a server running CentOS 7 on 4.9.5-1.el7.elrepo.x86_64 #1 SMP
>> Fri Jan 20 11:34:13 EST 2017 x86_64 x86_64 x86_64 GNU/Linux. It is
>
> That is a really old kernel. I recall there were some anomalous
> metadata allocation behaviors with kernels of that age, e.g. running
> scrub and balance at the same time would allocate a lot of metadata
> because scrub would lock a metadata block group immediately after
> it had been allocated, forcing another metadata block group to be
> allocated immediately. The symptom of that bug is very similar to
> yours--without warning, hundreds of GB of metadata block groups are
> allocated, all empty, during a scrub or balance operation.
>
> Unfortunately I don't have a better solution than "upgrade to a newer
> kernel", as that particular bug was solved years ago (along with
> hundreds of others).
>
>> version locked to that kernel. The metadata has reserved a full
>> 1T of disk space, while only using ~38G. I've tried to balance the
>> metadata to reclaim that so it can be used for data, but it doesn't
>> work and gives no errors. It just says it balanced the chunks but the
>> size doesn't change. The metadata total is still growing as well,
>> as it used to be 1.04 and now it is 1.08 with only about 10G more
>> of metadata used. I've tried doing balances up to 70 or 80 musage I
>> think, and the total metadata does not decrease. I've done so many
>> attempts at balancing, I've probably tried to move 300 chunks or more.
>> None have resulted in any change to the metadata total like they do
>> on other servers running btrfs. I first started with very low musage,
>> like 10 and then increased it by 10 to try to see if that would balance
>> any chunks out, but with no success.
>
> Have you tried rebooting? The block groups may be stuck in a locked
> state in memory or pinned by pending discard requests, in which case
> balance won't touch them. For that matter, try turning off discard
> (it's usually better to run fstrim once a day anyway, and not use
> the discard mount option).
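For the archives, my reading of that suggestion looks something like the following (a sketch, not commands Zygo gave verbatim; the timer comes with util-linux on systemd distros, adjust for yours):

```shell
# 1. Drop the 'discard' mount option at runtime:
mount -o remount,nodiscard /opt/zimbra
# ...and also remove 'discard' from the filesystem's /etc/fstab entry
# so it stays off across reboots.

# 2. Trim once a day instead. util-linux ships a systemd timer for this:
systemctl enable --now fstrim.timer

# Or, without systemd, a daily cron job:
#   @daily /sbin/fstrim /opt/zimbra
```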
>
>> # /sbin/btrfs balance start -musage=60 -mlimit=30 /opt/zimbra
>> Done, had to relocate 30 out of 2127 chunks
>>
>> I can do that command over and over again, or increase the mlimit,
>> and it doesn't change the metadata total ever.
>
> I would use just -m here (no filters, only metadata). If it gets the
> allocation under control, run 'btrfs balance cancel'; if it doesn't,
> let it run all the way to the end. Each balance starts from the last
> block group, so you are effectively restarting balance to process the
> same 30 block groups over and over here.
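So the plain `-m` run would be something like this (sketch of what I understand the suggestion to be; `--background` needs a reasonably recent btrfs-progs, and on a 4.9-era kernel this may be slow and I/O-heavy):

```shell
# Full metadata balance, no usage/limit filters, detached so it can be
# cancelled early:
btrfs balance start --background -m /opt/zimbra

# Watch progress:
btrfs balance status /opt/zimbra

# Stop early once Metadata 'total' has shrunk back toward 'used':
btrfs balance cancel /opt/zimbra
```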
>
>> # btrfs fi show /opt/zimbra/
>> Label: 'Data' uuid: ece150db-5817-4704-9e84-80f7d8a3b1da
>> Total devices 4 FS bytes used 1.48TiB
>> devid 1 size 1.46TiB used 1.38TiB path /dev/sde
>> devid 2 size 1.46TiB used 1.38TiB path /dev/sdf
>> devid 3 size 1.46TiB used 1.38TiB path /dev/sdg
>> devid 4 size 1.46TiB used 1.38TiB path /dev/sdh
>>
>> # btrfs fi df /opt/zimbra/
>> Data, RAID10: total=1.69TiB, used=1.45TiB
>> System, RAID10: total=64.00MiB, used=640.00KiB
>> Metadata, RAID10: total=1.08TiB, used=37.69GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>>
>> # btrfs fi us /opt/zimbra/ -T
>> Overall:
>>     Device size:                   5.82TiB
>>     Device allocated:              5.54TiB
>>     Device unallocated:          291.54GiB
>>     Device missing:                  0.00B
>>     Used:                          2.96TiB
>>     Free (estimated):            396.36GiB  (min: 396.36GiB)
>>     Data ratio:                       2.00
>>     Metadata ratio:                   2.00
>>     Global reserve:              512.00MiB  (used: 0.00B)
>>
>>              Data       Metadata   System
>> Id Path      RAID10     RAID10     RAID10     Unallocated
>> -- --------  ---------  ---------  ---------  -----------
>>  1 /dev/sde  432.75GiB  276.00GiB   16.00MiB    781.65GiB
>>  2 /dev/sdf  432.75GiB  276.00GiB   16.00MiB    781.65GiB
>>  3 /dev/sdg  432.75GiB  276.00GiB   16.00MiB    781.65GiB
>>  4 /dev/sdh  432.75GiB  276.00GiB   16.00MiB    781.65GiB
>> -- --------  ---------  ---------  ---------  -----------
>>    Total       1.69TiB    1.08TiB   64.00MiB      3.05TiB
>>    Used        1.45TiB   37.69GiB  640.00KiB
>>
>> --
>> Brandon Heisner
>> System Administrator
>> Wolfram Research
Thread overview: 11+ messages
2021-09-29 2:23 btrfs metadata has reserved 1T of extra space and balances don't reclaim it Brandon Heisner
2021-09-29 7:23 ` Forza
2021-09-29 14:34 ` Brandon Heisner
2021-10-03 11:26 ` Forza
2021-10-03 18:21 ` Zygo Blaxell
2021-09-29 8:22 ` Qu Wenruo
2021-09-29 15:18 ` Andrea Gelmini
2021-09-29 16:39 ` Forza
2021-09-29 18:55 ` Andrea Gelmini
2021-09-29 17:31 ` Zygo Blaxell
2021-10-01 7:49 ` Brandon Heisner [this message]