From: "Joshua" <joshua@mailmag.net>
To: "Qu Wenruo" <quwenruo.btrfs@gmx.com>, linux-btrfs@vger.kernel.org
Subject: Re: Large multi-device BTRFS array (usually) fails to mount on boot.
Date: Sat, 06 Feb 2021 05:00:24 +0000
Message-ID: <4ea56b2ba7077892079754c2449e923f@mailmag.net>
In-Reply-To: <45064ba0-08e5-f311-1f9e-9a4ec62abaab@gmx.com>
February 3, 2021 4:56 PM, "Qu Wenruo" <quwenruo.btrfs@gmx.com> wrote:
> On 2021/2/4 5:54 AM, joshua@mailmag.net wrote:
>
>> Good Evening.
>>
>> I have a large BTRFS array, (14 Drives, ~100 TB RAW) which has been having problems mounting on
>> boot without timing out. This causes the system to drop to emergency mode. I am then able to mount
>> the array in emergency mode and all data appears fine, but upon reboot it fails again.
>>
>> I actually first had this problem around a year ago, and initially put considerable effort into
>> extending the timeout in systemd, as I believed that to be the problem. However, all the methods I
>> attempted did not work properly or caused the system to continue booting before the array was
>> mounted, causing all sorts of issues. Eventually, I was able to almost completely resolve it by
>> defragmenting the extent tree and subvolume tree for each subvolume. (btrfs fi defrag
>> /mountpoint/subvolume/) This seemed to reduce the time required to mount, and made it mount on boot
>> the majority of the time.
>>
>> Recently I expanded the array yet again by adding another drive, (and some more data) and now I am
>> having the same issue again. I've posted the relevant entries from my dmesg, as well as some
>> information on my array and system below. I ran a defrag as mentioned above on each subvolume, and
>> was able to get the system to boot successfully. Any ideas on a more reliable and permanent
>> solution to this? Thanks much!
>>
>> dmesg entries upon boot:
>> [ 22.775439] BTRFS info (device sdh): use lzo compression, level 0
>> [ 22.775441] BTRFS info (device sdh): using free space tree
>> [ 22.775442] BTRFS info (device sdh): has skinny extents
>> [ 124.250554] BTRFS error (device sdh): open_ctree failed
>>
>> dmesg entries after running 'mount -a' in emergency mode:
>> [ 178.317339] BTRFS info (device sdh): force zstd compression, level 2
>> [ 178.317342] BTRFS info (device sdh): using free space tree
>> [ 178.317343] BTRFS info (device sdh): has skinny extents
>>
>> uname -a:
>> Linux HOSTNAME 5.10.0-2-amd64 #1 SMP Debian 5.10.9-1 (2021-01-20) x86-64 GNU/Linux
>>
>> btrfs --version:
>> btrfs-progs v5.10
>>
>> btrfs fi show /mountpoint:
>> Label: 'DATA' uuid: {snip}
>> Total devices 14 FS bytes used 41.94TiB
>> devid 1 size 2.73TiB used 2.46TiB path /dev/sdh
>> devid 2 size 7.28TiB used 6.87TiB path /dev/sdm
>> devid 3 size 2.73TiB used 2.46TiB path /dev/sdk
>> devid 4 size 9.10TiB used 8.57TiB path /dev/sdj
>> devid 5 size 9.10TiB used 8.57TiB path /dev/sde
>> devid 6 size 9.10TiB used 8.57TiB path /dev/sdn
>> devid 7 size 7.28TiB used 4.65TiB path /dev/sdc
>> devid 9 size 9.10TiB used 8.57TiB path /dev/sdf
>> devid 10 size 2.73TiB used 2.21TiB path /dev/sdl
>> devid 12 size 2.73TiB used 2.20TiB path /dev/sdg
>> devid 13 size 9.10TiB used 8.57TiB path /dev/sdd
>> devid 15 size 7.28TiB used 6.75TiB path /dev/sda
>> devid 16 size 7.28TiB used 6.75TiB path /dev/sdi
>> devid 17 size 7.28TiB used 6.75TiB path /dev/sdb
>
> With such a large array, the extent tree is considerably large.
>
> And that's causing the mount time problem, as at mount we need to load
> each block group item into memory.
> When the extent tree grows large, the reads are mostly random, which is
> never a good thing for HDDs.
>
> I have been pushing a skinny block group tree for btrfs, which arranges block
> group items into a very compact tree, just like the chunk tree.
>
> This should greatly improve the mount performance, but there are several
> problems:
> - The feature is not yet merged
> - The feature needs to convert existing fs to the new tree
> For your fs, it may take quite some time
>
> So unfortunately, no good short term solution yet.
>
> Thanks,
> Qu
Thanks for the information; that's more or less what I suspected, but I didn't know for sure.
Luckily, the solution proposed by Graham appears to be working and has 'solved' the problem for me, allowing my system to boot reliably.
The only remaining annoyance is the long boot (mount) time, but that's not a big deal in my situation, as I don't need to reboot (mount) very frequently.
Thanks,
--Joshua Villwock
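
To put a number on the mount time, the timestamps in the failed boot log quoted above can be subtracted directly. The following is a minimal sketch: the log lines are copied from the dmesg excerpt in this thread, while the helper names (parse, seconds_until_failure) are hypothetical and only for illustration. It measures the gap between the first BTRFS message and the open_ctree failure, and says nothing about which timeout actually fired.

```python
import re

# dmesg lines copied from the failed boot quoted in this thread.
BOOT_LOG = """\
[ 22.775439] BTRFS info (device sdh): use lzo compression, level 0
[ 22.775441] BTRFS info (device sdh): using free space tree
[ 22.775442] BTRFS info (device sdh): has skinny extents
[ 124.250554] BTRFS error (device sdh): open_ctree failed
"""

# "[ <seconds since boot>] BTRFS <level> (device <name>): <message>"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+BTRFS (\w+) \(device (\w+)\): (.*)$")


def parse(log):
    """Return (timestamp, level, device, message) for each BTRFS line."""
    events = []
    for line in log.splitlines():
        m = LINE_RE.match(line)
        if m:
            events.append((float(m.group(1)), m.group(2), m.group(3), m.group(4)))
    return events


def seconds_until_failure(log):
    """Seconds from the first BTRFS line to 'open_ctree failed', or None."""
    events = parse(log)
    if not events:
        return None
    start = events[0][0]
    for ts, level, _dev, msg in events:
        if level == "error" and "open_ctree failed" in msg:
            return ts - start
    return None


if __name__ == "__main__":
    elapsed = seconds_until_failure(BOOT_LOG)
    print(f"mount attempt ran for about {elapsed:.1f} s before failing")
```

For the log above this comes to roughly 101.5 seconds between the start of the mount attempt and the failure.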
>> btrfs fi usage /mountpoint:
>> Overall:
>> Device size: 92.78TiB
>> Device allocated: 83.96TiB
>> Device unallocated: 8.83TiB
>> Device missing: 0.00B
>> Used: 83.94TiB
>> Free (estimated): 4.42TiB (min: 2.95TiB)
>> Free (statfs, df): 3.31TiB
>> Data ratio: 2.00
>> Metadata ratio: 3.00
>> Global reserve: 512.00MiB (used: 0.00B)
>> Multiple profiles: no
>>
>> Data,RAID1: Size:41.88TiB, Used:41.877TiB (99.99%)
>> {snip}
>>
>> Metadata,RAID1C3: Size:68GiB, Used:63.79GiB (93.81%)
>> {snip}
>>
>> System,RAID1C3: Size:32MiB, Used:6.69MiB (20.90%)
>> {snip}
>>
>> Unallocated:
>> {snip}
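
The summary figures in the usage output above can be cross-checked against the per-profile sizes and the replication ratios. This is a rough sketch only: the numbers are copied from the output quoted above, the function names are hypothetical, and the formulas for the free-space estimates are an assumption about how the tool derives them (unallocated raw space divided by the data ratio, or by the largest ratio for the minimum, plus unused space inside data chunks).

```python
# Figures copied from the `btrfs fi usage` output quoted above, in TiB.
GIB = 1 / 1024        # one GiB expressed in TiB
MIB = 1 / 1024 ** 2   # one MiB expressed in TiB

DATA_SIZE, DATA_USED, DATA_RATIO = 41.88, 41.877, 2.0            # RAID1
META_SIZE, META_USED, META_RATIO = 68 * GIB, 63.79 * GIB, 3.0    # RAID1C3
SYS_SIZE, SYS_USED, SYS_RATIO = 32 * MIB, 6.69 * MIB, 3.0        # RAID1C3
UNALLOCATED = 8.83


def raw_allocated():
    """Raw space taken by chunks: logical size times copies per profile."""
    return (DATA_SIZE * DATA_RATIO
            + META_SIZE * META_RATIO
            + SYS_SIZE * SYS_RATIO)


def raw_used():
    """Raw space actually occupied, again counting every copy."""
    return (DATA_USED * DATA_RATIO
            + META_USED * META_RATIO
            + SYS_USED * SYS_RATIO)


def free_estimated():
    """Assumed formula: unallocated / data ratio + slack in data chunks."""
    return UNALLOCATED / DATA_RATIO + (DATA_SIZE - DATA_USED)


def free_min():
    """Assumed formula: as above, but dividing by the largest ratio."""
    worst_ratio = max(DATA_RATIO, META_RATIO, SYS_RATIO)
    return UNALLOCATED / worst_ratio + (DATA_SIZE - DATA_USED)


if __name__ == "__main__":
    print(f"Device allocated: {raw_allocated():.2f} TiB")
    print(f"Used:             {raw_used():.2f} TiB")
    print(f"Free (estimated): {free_estimated():.2f} TiB")
    print(f"Free (min):       {free_min():.2f} TiB")
```

Under those assumptions the results agree, to two decimal places, with the reported 83.96TiB allocated, 83.94TiB used, and 4.42TiB (min: 2.95TiB) free. The metadata itself is small relative to the data (about 64GiB used), even though loading the block group items from it is what slows the mount.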