linux-btrfs.vger.kernel.org archive mirror
* Large multi-device BTRFS array (usually) fails to mount on boot.
@ 2021-02-03 21:54 joshua
  2021-02-03 23:08 ` Graham Cobb
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: joshua @ 2021-02-03 21:54 UTC (permalink / raw)
  To: linux-btrfs

Good Evening.

I have a large BTRFS array (14 drives, ~100 TB raw) which usually cannot finish mounting at boot before the mount times out. This causes the system to drop to emergency mode. I am then able to mount the array from emergency mode and all data appears fine, but upon reboot it fails again.

I actually first had this problem around a year ago, and initially put considerable effort into extending the timeout in systemd, as I believed that to be the problem. However, the methods I attempted either did not work properly or caused the system to continue booting before the array was mounted, which caused all sorts of issues. Eventually I was able to almost completely resolve it by defragmenting the extent tree and subvolume tree for each subvolume (btrfs fi defrag /mountpoint/subvolume/). This seemed to reduce the time required to mount, and made the array mount on boot the majority of the time.
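
For reference, this is roughly the loop I run, where /mountpoint is a placeholder for my actual mount point and every top-level directory there is assumed to be a subvolume:

  # no -r: defragments the directory/subvolume metadata only, not file data
  for sub in /mountpoint/*/; do
      btrfs filesystem defragment "$sub"
  done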

Recently I expanded the array yet again by adding another drive (and some more data), and now I am having the same issue again. I've posted the relevant entries from my dmesg, as well as some information on my array and system, below. I ran a defrag as mentioned above on each subvolume and was able to get the system to boot successfully. Any ideas on a more reliable and permanent solution to this? Thanks much!

dmesg entries upon boot:
[ 22.775439] BTRFS info (device sdh): use lzo compression, level 0
[ 22.775441] BTRFS info (device sdh): using free space tree
[ 22.775442] BTRFS info (device sdh): has skinny extents
[ 124.250554] BTRFS error (device sdh): open_ctree failed

dmesg entries after running 'mount -a' in emergency mode:
[ 178.317339] BTRFS info (device sdh): force zstd compression, level 2
[ 178.317342] BTRFS info (device sdh): using free space tree
[ 178.317343] BTRFS info (device sdh): has skinny extents

uname -a:
Linux HOSTNAME 5.10.0-2-amd64 #1 SMP Debian 5.10.9-1 (2021-01-20) x86-64 GNU/Linux

btrfs --version:
btrfs-progs v5.10

btrfs fi show /mountpoint:
Label: 'DATA' uuid: {snip}
Total devices 14 FS bytes used 41.94TiB
devid 1 size 2.73TiB used 2.46TiB path /dev/sdh
devid 2 size 7.28TiB used 6.87TiB path /dev/sdm
devid 3 size 2.73TiB used 2.46TiB path /dev/sdk
devid 4 size 9.10TiB used 8.57TiB path /dev/sdj
devid 5 size 9.10TiB used 8.57TiB path /dev/sde
devid 6 size 9.10TiB used 8.57TiB path /dev/sdn
devid 7 size 7.28TiB used 4.65TiB path /dev/sdc
devid 9 size 9.10TiB used 8.57TiB path /dev/sdf
devid 10 size 2.73TiB used 2.21TiB path /dev/sdl
devid 12 size 2.73TiB used 2.20TiB path /dev/sdg
devid 13 size 9.10TiB used 8.57TiB path /dev/sdd
devid 15 size 7.28TiB used 6.75TiB path /dev/sda
devid 16 size 7.28TiB used 6.75TiB path /dev/sdi
devid 17 size 7.28TiB used 6.75TiB path /dev/sdb

btrfs fi usage /mountpoint:
Overall:
Device size: 92.78TiB
Device allocated: 83.96TiB
Device unallocated: 8.83TiB
Device missing: 0.00B
Used: 83.94TiB
Free (estimated): 4.42TiB (min: 2.95TiB)
Free (statfs, df): 3.31TiB
Data ratio: 2.00
Metadata ratio: 3.00
Global reserve: 512.00MiB (used: 0.00B)
Multiple profiles: no

Data,RAID1: Size:41.88TiB, Used:41.877TiB (99.99%)
{snip}

Metadata,RAID1C3: Size:68GiB, Used:63.79GiB (93.81%)
{snip}

System,RAID1C3: Size:32MiB, Used:6.69MiB (20.90%)
{snip}

Unallocated:
{snip}


* Re: Large multi-device BTRFS array (usually) fails to mount on boot.
  2021-02-03 21:54 Large multi-device BTRFS array (usually) fails to mount on boot joshua
@ 2021-02-03 23:08 ` Graham Cobb
  2021-02-04  0:56 ` Qu Wenruo
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Graham Cobb @ 2021-02-03 23:08 UTC (permalink / raw)
  To: joshua, linux-btrfs

On 03/02/2021 21:54, joshua@mailmag.net wrote:
> Good Evening.
> 
> I have a large BTRFS array, (14 Drives, ~100 TB RAW) which has been having problems mounting on boot without timing out. This causes the system to drop to emergency mode. I am then able to mount the array in emergency mode and all data appears fine, but upon reboot it fails again.
> 
> I actually first had this problem around a year ago, and initially put considerable effort into extending the timeout in systemd, as I believed that to be the problem. However, all the methods I attempted did not work properly or caused the system to continue booting before the array was mounted, causing all sorts of issues. Eventually, I was able to almost completely resolve it by defragmenting the extent tree and subvolume tree for each subvolume. (btrfs fi defrag /mountpoint/subvolume/) This seemed to reduce the time required to mount, and made it mount on boot the majority of the time.
> 

Not what you asked, but adding "x-systemd.mount-timeout=180s" to the
mount options in /etc/fstab works reliably for me to extend the timeout.
Of course, my largest filesystem is only 20TB, across only two devices
(two lvm-over-LUKS, each on separate physical drives) but it has very
heavy use of snapshot creation and deletion. I also run with commit=15
as power is not too reliable here and losing power is the most frequent
cause of a reboot.
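
For concreteness, the only change is the extra option in the fourth fstab field, so something along these lines (the label and mount point are placeholders, of course):

  LABEL=data  /mnt/data  btrfs  defaults,x-systemd.mount-timeout=180s  0  0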



* Re: Large multi-device BTRFS array (usually) fails to mount on boot.
  2021-02-03 21:54 Large multi-device BTRFS array (usually) fails to mount on boot joshua
  2021-02-03 23:08 ` Graham Cobb
@ 2021-02-04  0:56 ` Qu Wenruo
  2021-02-06  5:00 ` Joshua
  2021-02-19 17:42 ` Joshua
  3 siblings, 0 replies; 7+ messages in thread
From: Qu Wenruo @ 2021-02-04  0:56 UTC (permalink / raw)
  To: joshua, linux-btrfs



On 2021/2/4 5:54 AM, joshua@mailmag.net wrote:
> Good Evening.
>
> I have a large BTRFS array, (14 Drives, ~100 TB RAW) which has been having problems mounting on boot without timing out. This causes the system to drop to emergency mode. I am then able to mount the array in emergency mode and all data appears fine, but upon reboot it fails again.
>
> I actually first had this problem around a year ago, and initially put considerable effort into extending the timeout in systemd, as I believed that to be the problem. However, all the methods I attempted did not work properly or caused the system to continue booting before the array was mounted, causing all sorts of issues. Eventually, I was able to almost completely resolve it by defragmenting the extent tree and subvolume tree for each subvolume. (btrfs fi defrag /mountpoint/subvolume/) This seemed to reduce the time required to mount, and made it mount on boot the majority of the time.
>
> Recently I expanded the array yet again by adding another drive, (and some more data) and now I am having the same issue again. I've posted the relevant entries from my dmesg, as well as some information on my array and system below. I ran a defrag as mentioned above on each subvolume, and was able to get the system to boot successfully. Any ideas on a more reliable and permanent solution this this? Thanks much!
>
> dmesg entries upon boot:
> [ 22.775439] BTRFS info (device sdh): use lzo compression, level 0
> [ 22.775441] BTRFS info (device sdh): using free space tree
> [ 22.775442] BTRFS info (device sdh): has skinny extents
> [ 124.250554] BTRFS error (device sdh): open_ctree failed
>
> dmesg entries after running 'mount -a' in emergency mode:
> [ 178.317339] BTRFS info (device sdh): force zstd compression, level 2
> [ 178.317342] BTRFS info (device sdh): using free space tree
> [ 178.317343] BTRFS info (device sdh): has skinny extents
>
> uname -a:
> Linux HOSTNAME 5.10.0-2-amd64 #1 SMP Debian 5.10.9-1 (2021-01-20) x86-64 GNU/Linux
>
> btrfs --version:
> btrfs-progs v5.10
>
> btrfs fi show /mountpoint:
> Label: 'DATA' uuid: {snip}
> Total devices 14 FS bytes used 41.94TiB
> devid 1 size 2.73TiB used 2.46TiB path /dev/sdh
> devid 2 size 7.28TiB used 6.87TiB path /dev/sdm
> devid 3 size 2.73TiB used 2.46TiB path /dev/sdk
> devid 4 size 9.10TiB used 8.57TiB path /dev/sdj
> devid 5 size 9.10TiB used 8.57TiB path /dev/sde
> devid 6 size 9.10TiB used 8.57TiB path /dev/sdn
> devid 7 size 7.28TiB used 4.65TiB path /dev/sdc
> devid 9 size 9.10TiB used 8.57TiB path /dev/sdf
> devid 10 size 2.73TiB used 2.21TiB path /dev/sdl
> devid 12 size 2.73TiB used 2.20TiB path /dev/sdg
> devid 13 size 9.10TiB used 8.57TiB path /dev/sdd
> devid 15 size 7.28TiB used 6.75TiB path /dev/sda
> devid 16 size 7.28TiB used 6.75TiB path /dev/sdi
> devid 17 size 7.28TiB used 6.75TiB path /dev/sdb

With such a large array, the extent tree itself becomes very large.

And that is what causes the mount time problem: at mount time we need to load
every block group item into memory.
When the extent tree grows this large, those reads are mostly random reads,
which is never a good thing for HDDs.
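
To give a rough sense of scale (assuming the usual ~1GiB data block groups, which is an assumption on my part):

  41.88 TiB of data / ~1 GiB per block group  =  roughly 43,000 block group items

and each of those is a small item scattered through the extent tree, so the mount ends up doing tens of thousands of mostly random metadata reads.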

I have been pushing a skinny block group tree for btrfs, which arranges the
block group items into a very compact tree, just like the chunk tree.

This should greatly improve mount performance, but there are several
problems:
- The feature is not yet merged
- An existing fs needs to be converted to the new tree
  For your fs, that conversion may take quite some time

So unfortunately, no good short term solution yet.

Thanks,
Qu
>
> btrfs fi usage /mountpoint:
> Overall:
> Device size: 92.78TiB
> Device allocated: 83.96TiB
> Device unallocated: 8.83TiB
> Device missing: 0.00B
> Used: 83.94TiB
> Free (estimated): 4.42TiB (min: 2.95TiB)
> Free (statfs, df): 3.31TiB
> Data ratio: 2.00
> Metadata ratio: 3.00
> Global reserve: 512.00MiB (used: 0.00B)
> Multiple profiles: no
>
> Data,RAID1: Size:41.88TiB, Used:41.877TiB (99.99%)
> {snip}
>
> Metadata,RAID1C3: Size:68GiB, Used:63.79GiB (93.81%)
> {snip}
>
> System,RAID1C3: Size:32MiB, Used:6.69MiB (20.90%)
> {snip}
>
> Unallocated:
> {snip}
>


* Re: Large multi-device BTRFS array (usually) fails to mount on boot.
  2021-02-03 21:54 Large multi-device BTRFS array (usually) fails to mount on boot joshua
  2021-02-03 23:08 ` Graham Cobb
  2021-02-04  0:56 ` Qu Wenruo
@ 2021-02-06  5:00 ` Joshua
  2021-02-19 17:42 ` Joshua
  3 siblings, 0 replies; 7+ messages in thread
From: Joshua @ 2021-02-06  5:00 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs

February 3, 2021 4:56 PM, "Qu Wenruo" <quwenruo.btrfs@gmx.com> wrote:

> On 2021/2/4 5:54 AM, joshua@mailmag.net wrote:
> 
>> Good Evening.
>> 
>> I have a large BTRFS array, (14 Drives, ~100 TB RAW) which has been having problems mounting on
>> boot without timing out. This causes the system to drop to emergency mode. I am then able to mount
>> the array in emergency mode and all data appears fine, but upon reboot it fails again.
>> 
>> I actually first had this problem around a year ago, and initially put considerable effort into
>> extending the timeout in systemd, as I believed that to be the problem. However, all the methods I
>> attempted did not work properly or caused the system to continue booting before the array was
>> mounted, causing all sorts of issues. Eventually, I was able to almost completely resolve it by
>> defragmenting the extent tree and subvolume tree for each subvolume. (btrfs fi defrag
>> /mountpoint/subvolume/) This seemed to reduce the time required to mount, and made it mount on boot
>> the majority of the time.
>> 
>> Recently I expanded the array yet again by adding another drive, (and some more data) and now I am
>> having the same issue again. I've posted the relevant entries from my dmesg, as well as some
>> information on my array and system below. I ran a defrag as mentioned above on each subvolume, and
>> was able to get the system to boot successfully. Any ideas on a more reliable and permanent
>> solution this this? Thanks much!
>> 
>> dmesg entries upon boot:
>> [ 22.775439] BTRFS info (device sdh): use lzo compression, level 0
>> [ 22.775441] BTRFS info (device sdh): using free space tree
>> [ 22.775442] BTRFS info (device sdh): has skinny extents
>> [ 124.250554] BTRFS error (device sdh): open_ctree failed
>> 
>> dmesg entries after running 'mount -a' in emergency mode:
>> [ 178.317339] BTRFS info (device sdh): force zstd compression, level 2
>> [ 178.317342] BTRFS info (device sdh): using free space tree
>> [ 178.317343] BTRFS info (device sdh): has skinny extents
>> 
>> uname -a:
>> Linux HOSTNAME 5.10.0-2-amd64 #1 SMP Debian 5.10.9-1 (2021-01-20) x86-64 GNU/Linux
>> 
>> btrfs --version:
>> btrfs-progs v5.10
>> 
>> btrfs fi show /mountpoint:
>> Label: 'DATA' uuid: {snip}
>> Total devices 14 FS bytes used 41.94TiB
>> devid 1 size 2.73TiB used 2.46TiB path /dev/sdh
>> devid 2 size 7.28TiB used 6.87TiB path /dev/sdm
>> devid 3 size 2.73TiB used 2.46TiB path /dev/sdk
>> devid 4 size 9.10TiB used 8.57TiB path /dev/sdj
>> devid 5 size 9.10TiB used 8.57TiB path /dev/sde
>> devid 6 size 9.10TiB used 8.57TiB path /dev/sdn
>> devid 7 size 7.28TiB used 4.65TiB path /dev/sdc
>> devid 9 size 9.10TiB used 8.57TiB path /dev/sdf
>> devid 10 size 2.73TiB used 2.21TiB path /dev/sdl
>> devid 12 size 2.73TiB used 2.20TiB path /dev/sdg
>> devid 13 size 9.10TiB used 8.57TiB path /dev/sdd
>> devid 15 size 7.28TiB used 6.75TiB path /dev/sda
>> devid 16 size 7.28TiB used 6.75TiB path /dev/sdi
>> devid 17 size 7.28TiB used 6.75TiB path /dev/sdb
> 
> With such a large array, the extent tree is considerably large.
> 
> And that's causing the mount time problem, as at mount we need to load
> each block group item into memory.
> When extent tree goes large, the read is mostly random read which is
> never a good thing for HDD.
> 
> I was pushing skinny block group tree for btrfs, which arrange block
> group items into a very compact tree, just like chunk tree.
> 
> This should greatly improve the mount performance, but there are several
> problems:
> - The feature is not yet merged
> - The feature needs to convert existing fs to the new tree
> For your fs, it may take quite some time
> 
> So unfortunately, no good short term solution yet.
> 
> THanks,
> Qu

Thanks for the information, that's more or less what I was wondering, but didn't really know.

Luckily the solution proposed by Graham appears to be working, and 'solved' the problem for me, allowing my system to boot reliably.

The only remaining issue is the annoyance of boot times (mount times) being so long, but luckily that's not a very big deal for my situation, and I don't need to reboot (mount) very frequently.


Thanks,
--Joshua Villwock


>> btrfs fi usage /mountpoint:
>> Overall:
>> Device size: 92.78TiB
>> Device allocated: 83.96TiB
>> Device unallocated: 8.83TiB
>> Device missing: 0.00B
>> Used: 83.94TiB
>> Free (estimated): 4.42TiB (min: 2.95TiB)
>> Free (statfs, df): 3.31TiB
>> Data ratio: 2.00
>> Metadata ratio: 3.00
>> Global reserve: 512.00MiB (used: 0.00B)
>> Multiple profiles: no
>> 
>> Data,RAID1: Size:41.88TiB, Used:41.877TiB (99.99%)
>> {snip}
>> 
>> Metadata,RAID1C3: Size:68GiB, Used:63.79GiB (93.81%)
>> {snip}
>> 
>> System,RAID1C3: Size:32MiB, Used:6.69MiB (20.90%)
>> {snip}
>> 
>> Unallocated:
>> {snip}


* Re: Large multi-device BTRFS array (usually) fails to mount on boot.
  2021-02-03 21:54 Large multi-device BTRFS array (usually) fails to mount on boot joshua
                   ` (2 preceding siblings ...)
  2021-02-06  5:00 ` Joshua
@ 2021-02-19 17:42 ` Joshua
  2021-02-19 22:45   ` Graham Cobb
  2021-02-19 23:56   ` Joshua
  3 siblings, 2 replies; 7+ messages in thread
From: Joshua @ 2021-02-19 17:42 UTC (permalink / raw)
  To: Graham Cobb, linux-btrfs

February 3, 2021 3:16 PM, "Graham Cobb" <g.btrfs@cobb.uk.net> wrote:

> On 03/02/2021 21:54, joshua@mailmag.net wrote:
> 
>> Good Evening.
>> 
>> I have a large BTRFS array, (14 Drives, ~100 TB RAW) which has been having problems mounting on
>> boot without timing out. This causes the system to drop to emergency mode. I am then able to mount
>> the array in emergency mode and all data appears fine, but upon reboot it fails again.
>> 
>> I actually first had this problem around a year ago, and initially put considerable effort into
>> extending the timeout in systemd, as I believed that to be the problem. However, all the methods I
>> attempted did not work properly or caused the system to continue booting before the array was
>> mounted, causing all sorts of issues. Eventually, I was able to almost completely resolve it by
>> defragmenting the extent tree and subvolume tree for each subvolume. (btrfs fi defrag
>> /mountpoint/subvolume/) This seemed to reduce the time required to mount, and made it mount on boot
>> the majority of the time.
> 
> Not what you asked, but adding "x-systemd.mount-timeout=180s" to the
> mount options in /etc/fstab works reliably for me to extend the timeout.
> Of course, my largest filesystem is only 20TB, across only two devices
> (two lvm-over-LUKS, each on separate physical drives) but it has very
> heavy use of snapshot creation and deletion. I also run with commit=15
> as power is not too reliable here and losing power is the most frequent
> cause of a reboot.

Thanks for the suggestion, but I have not been able to get this method to work either.

Here's what my fstab looks like; let me know if this is not what you meant!

UUID={snip} /         ext4  errors=remount-ro 0 0
UUID={snip} /mnt/data btrfs defaults,noatime,compress-force=zstd:2,x-systemd.mount-timeout=300s 0 0

However, the system still gives up on the mount in less than 5 minutes and drops to emergency mode.
Upon checking the dmesg logs, it is clear the system only waits 120 seconds before giving up on mounting and dropping to emergency mode.
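
For what it's worth, the way I've been checking what systemd generated from fstab (assuming it names the unit mnt-data.mount for /mnt/data) is:

  systemctl cat mnt-data.mount
  systemctl show mnt-data.mount -p TimeoutUSec

which should show whether the x-systemd.mount-timeout option made it into the generated unit and what timeout systemd thinks applies.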

--Joshua


* Re: Large multi-device BTRFS array (usually) fails to mount on boot.
  2021-02-19 17:42 ` Joshua
@ 2021-02-19 22:45   ` Graham Cobb
  2021-02-19 23:56   ` Joshua
  1 sibling, 0 replies; 7+ messages in thread
From: Graham Cobb @ 2021-02-19 22:45 UTC (permalink / raw)
  To: Joshua, linux-btrfs


On 19/02/2021 17:42, Joshua wrote:
> February 3, 2021 3:16 PM, "Graham Cobb" <g.btrfs@cobb.uk.net> wrote:
> 
>> On 03/02/2021 21:54, joshua@mailmag.net wrote:
>>
>>> Good Evening.
>>>
>>> I have a large BTRFS array, (14 Drives, ~100 TB RAW) which has been having problems mounting on
>>> boot without timing out. This causes the system to drop to emergency mode. I am then able to mount
>>> the array in emergency mode and all data appears fine, but upon reboot it fails again.
>>>
>>> I actually first had this problem around a year ago, and initially put considerable effort into
>>> extending the timeout in systemd, as I believed that to be the problem. However, all the methods I
>>> attempted did not work properly or caused the system to continue booting before the array was
>>> mounted, causing all sorts of issues. Eventually, I was able to almost completely resolve it by
>>> defragmenting the extent tree and subvolume tree for each subvolume. (btrfs fi defrag
>>> /mountpoint/subvolume/) This seemed to reduce the time required to mount, and made it mount on boot
>>> the majority of the time.
>>
>> Not what you asked, but adding "x-systemd.mount-timeout=180s" to the
>> mount options in /etc/fstab works reliably for me to extend the timeout.
>> Of course, my largest filesystem is only 20TB, across only two devices
>> (two lvm-over-LUKS, each on separate physical drives) but it has very
>> heavy use of snapshot creation and deletion. I also run with commit=15
>> as power is not too reliable here and losing power is the most frequent
>> cause of a reboot.
> 
> Thanks for the suggestion, but I have not been able to get this method to work either.
> 
> Here's what my fstab looks like, let me know if this is not what you meant!
> 
> UUID={snip} /         ext4  errors=remount-ro 0 0
> UUID={snip} /mnt/data btrfs defaults,noatime,compress-force=zstd:2,x-systemd.mount-timeout=300s 0 0

Hmmm. The line from my fstab is:

LABEL=lvmdata   /mnt/data       btrfs
defaults,subvolid=0,noatime,nodiratime,compress=lzo,skip_balance,commit=15,space_cache=v2,x-systemd.mount-timeout=180s,nofail
  0       3

I note that I do have "nofail" in there, although it doesn't fail for me
so I assume it shouldn't make a difference.

I can't swear that the disk is currently taking longer to mount than the
systemd default (and I will not be in a position to reboot this system
any time soon to check). But I am quite sure this made a difference when
I added it.
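
If you want to see how long the mount actually takes on a given boot, something like

  systemd-analyze blame | grep -i mnt

or

  journalctl -b -u mnt-data.mount

should show the time systemd attributes to the mount unit (mnt-data.mount being the name systemd should generate for /mnt/data).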

Not sure why it isn't working for you, unless it is some systemd
problem. It isn't systemd giving up and dropping to emergency because of
some other startup problem that occurs before the mount is finished, is
it? I could believe systemd cancels any mounts in progress when that
happens.
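
One quick way to check for that, next time it drops to emergency mode, is to run

  systemctl --failed
  journalctl -b -p err

from the emergency shell; if some other unit failed before the mount finished, it should show up there.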

Graham


* Re: Large multi-device BTRFS array (usually) fails to mount on boot.
  2021-02-19 17:42 ` Joshua
  2021-02-19 22:45   ` Graham Cobb
@ 2021-02-19 23:56   ` Joshua
  1 sibling, 0 replies; 7+ messages in thread
From: Joshua @ 2021-02-19 23:56 UTC (permalink / raw)
  To: Graham Cobb, linux-btrfs

February 19, 2021 2:45 PM, "Graham Cobb" <g.btrfs@cobb.uk.net> wrote:

> On 19/02/2021 17:42, Joshua wrote:
> 
>> February 3, 2021 3:16 PM, "Graham Cobb" <g.btrfs@cobb.uk.net> wrote:
>> 
>>> On 03/02/2021 21:54, joshua@mailmag.net wrote:
>> 
>> Good Evening.
>> 
>> I have a large BTRFS array, (14 Drives, ~100 TB RAW) which has been having problems mounting on
>> boot without timing out. This causes the system to drop to emergency mode. I am then able to mount
>> the array in emergency mode and all data appears fine, but upon reboot it fails again.
>> 
>> I actually first had this problem around a year ago, and initially put considerable effort into
>> extending the timeout in systemd, as I believed that to be the problem. However, all the methods I
>> attempted did not work properly or caused the system to continue booting before the array was
>> mounted, causing all sorts of issues. Eventually, I was able to almost completely resolve it by
>> defragmenting the extent tree and subvolume tree for each subvolume. (btrfs fi defrag
>> /mountpoint/subvolume/) This seemed to reduce the time required to mount, and made it mount on boot
>> the majority of the time.
>>> Not what you asked, but adding "x-systemd.mount-timeout=180s" to the
>>> mount options in /etc/fstab works reliably for me to extend the timeout.
>>> Of course, my largest filesystem is only 20TB, across only two devices
>>> (two lvm-over-LUKS, each on separate physical drives) but it has very
>>> heavy use of snapshot creation and deletion. I also run with commit=15
>>> as power is not too reliable here and losing power is the most frequent
>>> cause of a reboot.
>> 
>> Thanks for the suggestion, but I have not been able to get this method to work either.
>> 
>> Here's what my fstab looks like, let me know if this is not what you meant!
>> 
>> UUID={snip} / ext4 errors=remount-ro 0 0
>> UUID={snip} /mnt/data btrfs defaults,noatime,compress-force=zstd:2,x-systemd.mount-timeout=300s 0 0
> 
> Hmmm. The line from my fstab is:
> 
> LABEL=lvmdata /mnt/data btrfs defaults,subvolid=0,noatime,nodiratime,compress=lzo,skip_balance,commit=15,space_cache=v2,x-systemd.mount-timeout=180s,nofail 0 3

Not very important, but note that noatime implies nodiratime.  https://lwn.net/Articles/245002/

> I note that I do have "nofail" in there, although it doesn't fail for me
> so I assume it shouldn't make a difference.

Ahh, I bet you're right, at least indirectly.

It appears nofail makes the system continue booting even if the mount was unsuccessful, which I'd rather avoid, since some services do depend on this volume. For example, some docker containers could misbehave if the path to the data they expect doesn't exist.

Not exactly the outcome I'd prefer (since services that depend on the mount would still be allowed to start), but it may work.
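
If I do end up relying on nofail, my current thought (just a sketch, not tested, and assuming docker runs here as the stock docker.service) is to make the dependent services wait on the mount explicitly:

  systemctl edit docker.service

and add a drop-in like:

  [Unit]
  RequiresMountsFor=/mnt/data

so that docker is ordered after the data mount and shouldn't be started if the mount fails.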


I'm really very unsure how nofail interacts with x-systemd.mount-timeout. I would have thought the timeout option alone would extend the timeout period, but that's not what I'm seeing. Perhaps there's some other internal systemd timeout, and the boot gives up and continues after that runs out, while the mount itself is allowed to keep going for the time specified? Seems kinda weird.

I'll give it a try and see what happens, and I'll try to remember to report back here with the result.


> I can't swear that the disk is currently taking longer to mount than the
> systemd default (and I will not be in a position to reboot this system
> any time soon to check). But I am quite sure this made a difference when
> I added it.
> 
> Not sure why it isn't working for you, unless it is some systemd
> problem. It isn't systemd giving up and dropping to emergency because of
> some other startup problem that occurs before the mount is finished, is
> it? I could believe systemd cancels any mounts in progress when that
> happens.
> 
> Graham

