From: mailing@dmilz.net
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Filesystem Read Only due to errno=-28 during metadata allocation
Date: Mon, 25 Oct 2021 11:53:08 +0200
Message-ID: <537ba7e11aae3e2c2cf28f546268c356@dmilz.net>
In-Reply-To: <20211019132631.GB1208@hungrycats.org>
On 19.10.2021 15:26, Zygo Blaxell wrote:
> On Tue, Oct 19, 2021 at 12:57:39PM +0200, mailing@dmilz.net wrote:
>> On 18.10.2021 16:09, Zygo Blaxell wrote:
>> > On Wed, Oct 13, 2021 at 02:35:39PM +0200, mailing@dmilz.net wrote:
>> > > Hello,
>> > >
>> > > I faced an issue with the btrfs FS /var being forced to RO due to
>> > > errno=-28 (no space left).
>
>> > Obviously, keeping 7GB reserved for metadata doesn't work on a device
>> > that is only 2.5GB to start with. Even if you elect not to use discard
>> > or scrub, you'd still need 2.5GB for dup metadata.
>> >
>>
>> Since this incident, the FS has been extended to 5GB.
>>
>> But in our case the chunk size for metadata is 128MB, so:
>> 2 [dup metadata] * ( 128 * ( 1 [disk] + 1 [discard] + 1 [balance] )
>> + 512 [Global Reserve] ) = 1792 MB ?
>
> The global reserve is normally much less than 512 MB on smaller disks.
> It is only 13MB on the 5GB disks.
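
(Redoing the arithmetic above with the actual 13MB global reserve, as
a quick back-of-envelope sketch; the chunk size, chunk count, and
reserve are the figures from this thread, not queried from btrfs:)

    # Back-of-envelope for the worst-case metadata reservation on
    # this FS: one chunk in use, plus one each potentially pinned by
    # discard and balance, plus the global reserve, all doubled
    # because dup metadata stores every block twice.
    CHUNK_MIB = 128          # observed metadata chunk size here
    GLOBAL_RESERVE_MIB = 13  # from 'btrfs fi usage' on the 5GB volume
    DUP = 2

    need = DUP * (CHUNK_MIB * 3 + GLOBAL_RESERVE_MIB)
    print(f"worst-case metadata reservation: {need} MiB")  # -> 794 MiB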
>
>> > While it's possible to use non-mixed block groups on a filesystem that
>> > has only a few GB, it's not possible to use a significant number of
>> > btrfs features, including scrub, balance, RAID disk replacement, online
>> > conversion to other RAID profiles, device shrink, or discard, due to
>> > the requirement to have an extra unlocked block group with available
>> > free space during these operations.
>>
>> Last weekend I had the same behavior: the size allocated to metadata
>> grew from 128MB up to 768MB (i.e. up to 6 * 128MB chunks), but the
>> metadata usage didn't grow:
>>
>> ### Sat Oct 16 23:59:57 CEST 2021
>> Overall:
>> Device size: 5120.00MiB
>> Device allocated: 2647.25MiB
>> Device unallocated: 2472.75MiB
>> Device missing: 0.00MiB
>> Used: 1097.75MiB
>> Free (estimated): 2942.38MiB (min: 1706.00MiB)
>> Data ratio: 1.00
>> Metadata ratio: 2.00
>> Global reserve: 13.00MiB (used: 0.00MiB)
>>
>> Data,single: Size:1559.25MiB, Used:1089.62MiB
>> /dev/mapper/rootvg-varvol 1559.25MiB
>>
>> Metadata,DUP: Size:512.00MiB, Used:4.00MiB
>> /dev/mapper/rootvg-varvol 1024.00MiB
>>
>> System,DUP: Size:32.00MiB, Used:0.06MiB
>> /dev/mapper/rootvg-varvol 64.00MiB
>>
>> Unallocated:
>> /dev/mapper/rootvg-varvol 2472.75MiB
>>
>>
>>
>> ### Sun Oct 17 00:00:32 CEST 2021
>> Overall:
>> Device size: 5120.00MiB
>> Device allocated: 2903.25MiB
>> Device unallocated: 2216.75MiB
>> Device missing: 0.00MiB
>> Used: 1097.44MiB
>> Free (estimated): 2686.69MiB (min: 1578.31MiB)
>> Data ratio: 1.00
>> Metadata ratio: 2.00
>> Global reserve: 13.00MiB (used: 0.00MiB)
>>
>> Data,single: Size:1559.25MiB, Used:1089.31MiB
>> /dev/mapper/rootvg-varvol 1559.25MiB
>>
>> Metadata,DUP: Size:640.00MiB, Used:4.00MiB
>> /dev/mapper/rootvg-varvol 1280.00MiB
>>
>> System,DUP: Size:32.00MiB, Used:0.06MiB
>> /dev/mapper/rootvg-varvol 64.00MiB
>>
>> Unallocated:
>> /dev/mapper/rootvg-varvol 2216.75MiB
>>
>>
>> ### Sun Oct 17 00:05:27 CEST 2021
>> Overall:
>> Device size: 5120.00MiB
>> Device allocated: 2903.25MiB
>> Device unallocated: 2216.75MiB
>> Device missing: 0.00MiB
>> Used: 1099.69MiB
>> Free (estimated): 2684.44MiB (min: 1576.06MiB)
>> Data ratio: 1.00
>> Metadata ratio: 2.00
>> Global reserve: 13.00MiB (used: 0.00MiB)
>>
>> Data,single: Size:1559.25MiB, Used:1091.56MiB
>> /dev/mapper/rootvg-varvol 1559.25MiB
>>
>> Metadata,DUP: Size:640.00MiB, Used:4.00MiB
>> /dev/mapper/rootvg-varvol 1280.00MiB
>>
>> System,DUP: Size:32.00MiB, Used:0.06MiB
>> /dev/mapper/rootvg-varvol 64.00MiB
>>
>> Unallocated:
>> /dev/mapper/rootvg-varvol 2216.75MiB
>>
>>
>> ### Sun Oct 17 00:05:32 CEST 2021
>> Overall:
>> Device size: 5120.00MiB
>> Device allocated: 3159.25MiB
>> Device unallocated: 1960.75MiB
>> Device missing: 0.00MiB
>> Used: 1100.25MiB
>> Free (estimated): 2427.88MiB (min: 1447.50MiB)
>> Data ratio: 1.00
>> Metadata ratio: 2.00
>> Global reserve: 13.00MiB (used: 0.00MiB)
>>
>> Data,single: Size:1559.25MiB, Used:1092.12MiB
>> /dev/mapper/rootvg-varvol 1559.25MiB
>>
>> Metadata,DUP: Size:768.00MiB, Used:4.00MiB
>> /dev/mapper/rootvg-varvol 1536.00MiB
>>
>> System,DUP: Size:32.00MiB, Used:0.06MiB
>> /dev/mapper/rootvg-varvol 64.00MiB
>>
>> Unallocated:
>> /dev/mapper/rootvg-varvol 1960.75MiB
>>
>> ### Sun Oct 17 00:12:53 CEST 2021
>> Overall:
>> Device size: 5120.00MiB
>> Device allocated: 3159.25MiB
>> Device unallocated: 1960.75MiB
>> Device missing: 0.00MiB
>> Used: 1100.62MiB
>> Free (estimated): 2427.50MiB (min: 1447.12MiB)
>> Data ratio: 1.00
>> Metadata ratio: 2.00
>> Global reserve: 13.00MiB (used: 0.00MiB)
>>
>> Data,single: Size:1559.25MiB, Used:1092.50MiB
>> /dev/mapper/rootvg-varvol 1559.25MiB
>>
>> Metadata,DUP: Size:768.00MiB, Used:4.00MiB
>> /dev/mapper/rootvg-varvol 1536.00MiB
>>
>> System,DUP: Size:32.00MiB, Used:0.06MiB
>> /dev/mapper/rootvg-varvol 64.00MiB
>>
>> Unallocated:
>> /dev/mapper/rootvg-varvol 1960.75MiB
>> ### Sun Oct 17 00:12:58 CEST 2021
>> Overall:
>> Device size: 5120.00MiB
>> Device allocated: 2903.25MiB
>> Device unallocated: 2216.75MiB
>> Device missing: 0.00MiB
>> Used: 1100.69MiB
>> Free (estimated): 2683.44MiB (min: 1575.06MiB)
>> Data ratio: 1.00
>> Metadata ratio: 2.00
>> Global reserve: 13.00MiB (used: 0.12MiB)
>>
>> Data,single: Size:1559.25MiB, Used:1092.56MiB
>> /dev/mapper/rootvg-varvol 1559.25MiB
>>
>> Metadata,DUP: Size:640.00MiB, Used:4.00MiB
>> /dev/mapper/rootvg-varvol 1280.00MiB
>>
>> System,DUP: Size:32.00MiB, Used:0.06MiB
>> /dev/mapper/rootvg-varvol 64.00MiB
>>
>> Unallocated:
>> /dev/mapper/rootvg-varvol 2216.75MiB
>>
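
(As an aside, a minimal sketch for eyeballing this overallocation by
diffing the "Device allocated" line between two saved snapshots of the
usage output; the file names are hypothetical and the parsing assumes
MiB-formatted output like the dumps above:)

    import re

    def allocated_mib(path):
        # Pull the "Device allocated" value out of a saved
        # 'btrfs fi usage' dump.
        with open(path) as f:
            for line in f:
                m = re.search(r"Device allocated:\s*([\d.]+)MiB", line)
                if m:
                    return float(m.group(1))
        raise ValueError(f"no 'Device allocated' line in {path}")

    before = allocated_mib("usage-2021-10-16-2359.txt")  # hypothetical
    after = allocated_mib("usage-2021-10-17-0005.txt")   # hypothetical
    print(f"allocation changed by {after - before:+.2f} MiB")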
>> So if I understand properly, it might be due to a chunk being needed
>> for balance, discard, or scrub.
>> Are there 3rd-party tools which are also able to lock metadata block
>> groups? It seems to happen at the same time the backup is running
>> (Spectrum), and/or the HANA backup, and/or the ReaR backup.
>
> If the 3rd party tools are triggering any of the btrfs maintenance
> functions then they'll lock block groups; otherwise, normal filesystem
> operations don't generally hold locks at the block group level.
>
> There might be some additional issues with the 4.12 kernel that cause
> it to allocate more than the minimum metadata. I recall there were
> some problems with old kernels where multiple threads allocating at
> the same time would all grab their own chunks, but I'm not sure which
> kernel those were fixed in. There are also changes to the allocator's
> behavior with the 'ssd' and 'nossd' mount options in 4.14 which might
> cause these effects.
>
> Some temporary overallocation is normal. It's likely it would have
> worked even without the overallocation, but the overallocation has a
> pretty large impact on these tiny filesystems. This seems a bit
> higher than the amount I've come to expect from modern btrfs.
Thanks for all the information!
If the chunk size (128MB, 256MB... 1GB) is related to the FS size, is
there a command to determine the chunk size for data and metadata?
Should I expect BTRFS to start allocating bigger chunks at some point
after the filesystem extension?
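
For reference, my rough understanding of the metadata chunk sizing
heuristic, loosely based on fs/btrfs/volumes.c in recent kernels (the
4.12 code may differ, so treat this sketch as an approximation only):

    # Approximation of btrfs metadata chunk sizing: a base cap of
    # 256MiB (1GiB on filesystems over ~50GiB), further limited to
    # roughly 10% of writable space, and halved for dup because each
    # chunk is stored twice on the device.
    def metadata_chunk_mib(fs_size_gib, dup=True):
        cap = 1024 if fs_size_gib > 50 else 256
        cap = min(cap, int(fs_size_gib * 1024) // 10)
        if dup:
            cap //= 2
        return cap

    for size in (2.5, 5, 100):
        print(f"{size:>5} GiB fs -> ~{metadata_chunk_mib(size)} MiB chunks")

By that approximation the chunk size would stay at 128MB (dup) even
after the extension to 5GB, and wouldn't grow until the filesystem
crosses the ~50GiB threshold. Newer btrfs-progs can also list the
existing chunks and their sizes with
'btrfs inspect-internal dump-tree -t chunk <dev>', though availability
depends on the progs version.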