* Re: race condition mounting multi-device iscsi volume, not resolved in newer kernels
2023-07-22 0:40 race condition mounting multi-device iscsi volume, not resolved in newer kernels Eric Levy
@ 2023-07-22 4:34 ` Andrei Borzenkov
2023-07-22 6:04 ` Eric Levy
2023-07-22 5:27 ` Torbjörn Jansson
2023-07-22 11:50 ` Anand Jain
2 siblings, 1 reply; 12+ messages in thread
From: Andrei Borzenkov @ 2023-07-22 4:34 UTC (permalink / raw)
To: Eric Levy, linux-btrfs
On 22.07.2023 03:40, Eric Levy wrote:
> I mount a volume that spans multiple devices, all provided as logical
> units from an iSCSI target.
>
> Mounting is inconsistently successful at boot.
>
> The problem had presented with both kernel versions 5.19.0 and 6.2.0.
>
> After learning that a patch for similarly reported problems had been
> submitted in commit 50d281fc434c, I further tested with kernels 6.3.0
> and 6.5.0-rc2.
>
> The general problem has persisted even with the newer kernels.
>
> The system is Linux Mint and all kernels are Ubuntu mainline.
>
> Mounting is possible manually after boot, though the mount status of
> the volume appears to be left in an inconsistent state at such time.
>
> The below system log, with redaction for unrelated messages, is taken
> from a boot sequence for which mounting has failed, with kernel version
> 6.3.0.
>
Log with timestamps could be more useful.
> Note that attachment of device `sdc` follows the mount attempt, as well
> as the announcement of scan completion.
>
> ---
>
> kernel: Linux version 6.3.0-060300-generic (kernel@sita) (x86_64-linux-gnu-gcc-12 (Ubuntu 12.2.0-9ubuntu1) 12.2.0, GNU ld (GNU Binutils for Ubuntu) 2.39) #202304232030 SMP PREEMPT_DYNAMIC Sun Apr 23 20:37:49 UTC 2023
>
> systemd[1]: Starting Login to default iSCSI targets...
>
> systemd[1]: Starting iSCSI initiator daemon (iscsid)...
> iscsid[749]: iSCSI logger with pid=754 started!
> systemd[1]: iscsid.service: Failed to parse PID from file /run/iscsid.pid: Invalid argument
> iscsid[754]: iSCSI daemon with pid=755 started!
> systemd[1]: Started iSCSI initiator daemon (iscsid).
>
> kernel: Loading iSCSI transport class v2.0-870.
>
> kernel: iscsi: registered transport (tcp)
> kernel: scsi host4: iSCSI Initiator over TCP/IP
>
> kernel: scsi 4:0:0:6: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:6: [sdc] 2097152000 512-byte logical blocks: (1.07 TB/1000 GiB)
> kernel: sd 4:0:0:6: [sdc] Write Protect is off
> kernel: sd 4:0:0:6: [sdc] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:6: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:6: [sdc] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:6: [sdc] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:6: Attached scsi generic sg4 type 0
> kernel: scsi 4:0:0:7: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:7: Attached scsi generic sg5 type 0
> kernel: sd 4:0:0:7: [sdd] 2097152000 512-byte logical blocks: (1.07 TB/1000 GiB)
> kernel: scsi 4:0:0:3: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:7: [sdd] Write Protect is off
> kernel: sd 4:0:0:7: [sdd] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:7: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:7: [sdd] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:7: [sdd] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:3: [sde] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:3: [sde] Write Protect is off
> kernel: sd 4:0:0:3: [sde] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:3: Attached scsi generic sg6 type 0
> kernel: sd 4:0:0:3: [sde] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: scsi 4:0:0:4: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:3: [sde] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:3: [sde] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:4: [sdf] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:4: [sdf] Write Protect is off
> kernel: sd 4:0:0:4: [sdf] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:4: [sdf] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:4: [sdf] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:4: [sdf] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:7: [sdd] Attached SCSI disk
> kernel: sd 4:0:0:3: [sde] Attached SCSI disk
> kernel: sd 4:0:0:4: [sdf] Attached SCSI disk
> kernel: sd 4:0:0:4: Attached scsi generic sg7 type 0
> kernel: scsi 4:0:0:2: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:2: Attached scsi generic sg8 type 0
> kernel: sd 4:0:0:2: [sdg] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:2: [sdg] Write Protect is off
> kernel: sd 4:0:0:2: [sdg] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:2: [sdg] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:2: [sdg] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:2: [sdg] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: scsi 4:0:0:1: Direct-Access SYNOLOGY iSCSI Storage 4.0 PQ: 0 ANSI: 5
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 11 transid 309446 /dev/sdd scanned by systemd-udevd (371)
> kernel: sd 4:0:0:1: Attached scsi generic sg9 type 0
> kernel: sd 4:0:0:1: [sdh] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:1: [sdh] Write Protect is off
> kernel: sd 4:0:0:1: [sdh] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:1: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:1: [sdh] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:1: [sdh] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> iscsiadm[721]: Logging in to [iface: default, target: iqn.2000-01.com.synology:diskstation.default-target.b890048b949, portal: 10.4.1.2,3260]
> iscsiadm[721]: Login to [iface: default, target: iqn.2000-01.com.synology:diskstation.default-target.b890048b949, portal: 10.4.1.2,3260] successful.
> kernel: sd 4:0:0:2: [sdg] Attached SCSI disk
> kernel: sd 4:0:0:1: [sdh] Attached SCSI disk
>
> systemd[1]: Finished Login to default iSCSI targets.
> systemd[1]: Reached target Preparation for Remote File Systems.
> systemd[1]: Mounting SAN Storage...
> systemd[1]: Finished Availability of block devices.
> kernel: BTRFS info (device sdd): using crc32c (crc32c-intel) checksum algorithm
> kernel: BTRFS info (device sdd): turning on async discard
> kernel: BTRFS info (device sdd): disk space caching is enabled
> kernel: BTRFS error (device sdd): devid 8 uuid 3d66ced8-24c1-45dc-9d70-8a921764dc88 is missing
> kernel: BTRFS error (device sdd): failed to read the system array: -2
> kernel: BTRFS error: device /dev/sdf belongs to fsid c6f83d24-1ac3-4417-bdd9-6249c899604d, and the fs is already mounted
> nm-dispatcher[919]: /etc/network/if-up.d/resolved: 12: mystatedir: not found
> kernel: BTRFS error (device sdd): open_ctree failed
> mount[921]: mount: /srv/store: wrong fs type, bad option, bad superblock on /dev/sdd, missing codepage or helper program, or other error.
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 8 transid 309446 /dev/sde scanned by systemd-udevd (376)
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 6 transid 309446 /dev/sdg scanned by systemd-udevd (365)
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 7 transid 309446 /dev/sdh scanned by systemd-udevd (363)
> systemd[1]: srv-store.mount: Mount process exited, code=exited, status=32/n/a
>
Something tries to mount this filesystem before all its devices are
present. You need to find out what does it. btrfs is just a messenger here.
> systemd[1]: srv-store.mount: Failed with result 'exit-code'.
> systemd[1]: Failed to mount SAN Storage.
> systemd[1]: Dependency failed for Remote File Systems.
> systemd[1]: remote-fs.target: Job remote-fs.target/start failed with result 'dependency'.
>
> kernel: sd 4:0:0:6: [sdc] Attached SCSI disk
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 12 transid 309446 /dev/sdc scanned by systemd-udevd (372)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: race condition mounting multi-device iscsi volume, not resolved in newer kernels
2023-07-22 4:34 ` Andrei Borzenkov
@ 2023-07-22 6:04 ` Eric Levy
2023-07-22 6:52 ` Andrei Borzenkov
0 siblings, 1 reply; 12+ messages in thread
From: Eric Levy @ 2023-07-22 6:04 UTC (permalink / raw)
To: Andrei Borzenkov; +Cc: linux-btrfs
On Sat, Jul 22 2023 at 07:34:26 AM +0300, Andrei Borzenkov
<arvidjaar@gmail.com> wrote:
> Log with timestamps could be more useful.
I would need to generate a report showing millisecond granularity in
order to reveal intervals not appearing as zero duration.
> Something tries to mount this filesystem before all its devices are
> present. You need to find out what does it. btrfs is just a messenger
> here.
The systemd mount unit is defined to start after iscsi finishes
initializing. The idea that the kernel or filesystem layers wer causing
problems follows from a remark given to me through a different thread
that I opened just days ago. I have tried many different variations for
the systemd mount unit without reaching any success for the issue.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: race condition mounting multi-device iscsi volume, not resolved in newer kernels
2023-07-22 6:04 ` Eric Levy
@ 2023-07-22 6:52 ` Andrei Borzenkov
0 siblings, 0 replies; 12+ messages in thread
From: Andrei Borzenkov @ 2023-07-22 6:52 UTC (permalink / raw)
To: Eric Levy; +Cc: linux-btrfs
On 22.07.2023 09:04, Eric Levy wrote:
>
>
> On Sat, Jul 22 2023 at 07:34:26 AM +0300, Andrei Borzenkov
> <arvidjaar@gmail.com> wrote:
>
>> Log with timestamps could be more useful.
>
> I would need to generate a report showing millisecond granularity in
> order to reveal intervals not appearing as zero duration.
>
>> Something tries to mount this filesystem before all its devices are
>> present. You need to find out what does it. btrfs is just a messenger
>> here.
>
> The systemd mount unit is defined to start after iscsi finishes
> initializing.
"iSCSI finishes initializing" just means iscsid was started. The actual
device discovery can take arbitrary amount of time due to network
condition, target behavior and overall system load. So condition "iscsi
finishes initializing" is absolutely wrong for "we can now start
mounting remote devices".
With default btrfs udev rules systemd should wait for device required by
mount unit to become ready; in case of btrfs all devices except the last
discovered one will be "missing" from systemd point of view and only the
last device should allow mount unit to start.
Provide full output of
journalctl -b --no-pager --full
immediately after boot (before doing anything else), it may give some hints.
> The idea that the kernel or filesystem layers wer causing
> problems follows from a remark given to me through a different thread
> that I opened just days ago. I have tried many different variations for
> the systemd mount unit without reaching any success for the issue.
>
Systemd was never designed for dynamic device management when devices
can come and go at arbitrary time and even less for multi-device
filesystems.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: race condition mounting multi-device iscsi volume, not resolved in newer kernels
2023-07-22 0:40 race condition mounting multi-device iscsi volume, not resolved in newer kernels Eric Levy
2023-07-22 4:34 ` Andrei Borzenkov
@ 2023-07-22 5:27 ` Torbjörn Jansson
2023-07-22 11:50 ` Anand Jain
2 siblings, 0 replies; 12+ messages in thread
From: Torbjörn Jansson @ 2023-07-22 5:27 UTC (permalink / raw)
To: Eric Levy, linux-btrfs
On 2023-07-22 02:40, Eric Levy wrote:
> I mount a volume that spans multiple devices, all provided as logical
> units from an iSCSI target.
>
> Mounting is inconsistently successful at boot.
>
> The problem had presented with both kernel versions 5.19.0 and 6.2.0.
>
> After learning that a patch for similarly reported problems had been
> submitted in commit 50d281fc434c, I further tested with kernels 6.3.0
> and 6.5.0-rc2.
>
> The general problem has persisted even with the newer kernels.
>
> The system is Linux Mint and all kernels are Ubuntu mainline.
>
> Mounting is possible manually after boot, though the mount status of
> the volume appears to be left in an inconsistent state at such time.
>
> The below system log, with redaction for unrelated messages, is taken
> from a boot sequence for which mounting has failed, with kernel version
> 6.3.0.
>
> Note that attachment of device `sdc` follows the mount attempt, as well
> as the announcement of scan completion.
>
i assume you mount your filesystem via fstab.
i had issues once in the past with a network filesystem and getting the timing
right, i then discovered that you can add a few extra options in fstab to tell
systemd how to behave.
from memory i think it was x-systemd.requires= i had to use in fstab.
see:
https://www.freedesktop.org/software/systemd/man/systemd.mount.html
perhaps you can make your fstab entry depend on your iscsi service and control
the order that way.
> ---
>
> kernel: Linux version 6.3.0-060300-generic (kernel@sita) (x86_64-linux-gnu-gcc-12 (Ubuntu 12.2.0-9ubuntu1) 12.2.0, GNU ld (GNU Binutils for Ubuntu) 2.39) #202304232030 SMP PREEMPT_DYNAMIC Sun Apr 23 20:37:49 UTC 2023
>
> systemd[1]: Starting Login to default iSCSI targets...
>
> systemd[1]: Starting iSCSI initiator daemon (iscsid)...
> iscsid[749]: iSCSI logger with pid=754 started!
> systemd[1]: iscsid.service: Failed to parse PID from file /run/iscsid.pid: Invalid argument
> iscsid[754]: iSCSI daemon with pid=755 started!
> systemd[1]: Started iSCSI initiator daemon (iscsid).
>
> kernel: Loading iSCSI transport class v2.0-870.
>
> kernel: iscsi: registered transport (tcp)
> kernel: scsi host4: iSCSI Initiator over TCP/IP
>
> kernel: scsi 4:0:0:6: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:6: [sdc] 2097152000 512-byte logical blocks: (1.07 TB/1000 GiB)
> kernel: sd 4:0:0:6: [sdc] Write Protect is off
> kernel: sd 4:0:0:6: [sdc] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:6: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:6: [sdc] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:6: [sdc] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:6: Attached scsi generic sg4 type 0
> kernel: scsi 4:0:0:7: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:7: Attached scsi generic sg5 type 0
> kernel: sd 4:0:0:7: [sdd] 2097152000 512-byte logical blocks: (1.07 TB/1000 GiB)
> kernel: scsi 4:0:0:3: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:7: [sdd] Write Protect is off
> kernel: sd 4:0:0:7: [sdd] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:7: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:7: [sdd] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:7: [sdd] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:3: [sde] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:3: [sde] Write Protect is off
> kernel: sd 4:0:0:3: [sde] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:3: Attached scsi generic sg6 type 0
> kernel: sd 4:0:0:3: [sde] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: scsi 4:0:0:4: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:3: [sde] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:3: [sde] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:4: [sdf] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:4: [sdf] Write Protect is off
> kernel: sd 4:0:0:4: [sdf] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:4: [sdf] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:4: [sdf] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:4: [sdf] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:7: [sdd] Attached SCSI disk
> kernel: sd 4:0:0:3: [sde] Attached SCSI disk
> kernel: sd 4:0:0:4: [sdf] Attached SCSI disk
> kernel: sd 4:0:0:4: Attached scsi generic sg7 type 0
> kernel: scsi 4:0:0:2: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:2: Attached scsi generic sg8 type 0
> kernel: sd 4:0:0:2: [sdg] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:2: [sdg] Write Protect is off
> kernel: sd 4:0:0:2: [sdg] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:2: [sdg] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:2: [sdg] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:2: [sdg] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: scsi 4:0:0:1: Direct-Access SYNOLOGY iSCSI Storage 4.0 PQ: 0 ANSI: 5
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 11 transid 309446 /dev/sdd scanned by systemd-udevd (371)
> kernel: sd 4:0:0:1: Attached scsi generic sg9 type 0
> kernel: sd 4:0:0:1: [sdh] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:1: [sdh] Write Protect is off
> kernel: sd 4:0:0:1: [sdh] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:1: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:1: [sdh] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:1: [sdh] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> iscsiadm[721]: Logging in to [iface: default, target: iqn.2000-01.com.synology:diskstation.default-target.b890048b949, portal: 10.4.1.2,3260]
> iscsiadm[721]: Login to [iface: default, target: iqn.2000-01.com.synology:diskstation.default-target.b890048b949, portal: 10.4.1.2,3260] successful.
> kernel: sd 4:0:0:2: [sdg] Attached SCSI disk
> kernel: sd 4:0:0:1: [sdh] Attached SCSI disk
>
> systemd[1]: Finished Login to default iSCSI targets.
> systemd[1]: Reached target Preparation for Remote File Systems.
> systemd[1]: Mounting SAN Storage...
> systemd[1]: Finished Availability of block devices.
> kernel: BTRFS info (device sdd): using crc32c (crc32c-intel) checksum algorithm
> kernel: BTRFS info (device sdd): turning on async discard
> kernel: BTRFS info (device sdd): disk space caching is enabled
> kernel: BTRFS error (device sdd): devid 8 uuid 3d66ced8-24c1-45dc-9d70-8a921764dc88 is missing
> kernel: BTRFS error (device sdd): failed to read the system array: -2
> kernel: BTRFS error: device /dev/sdf belongs to fsid c6f83d24-1ac3-4417-bdd9-6249c899604d, and the fs is already mounted
> nm-dispatcher[919]: /etc/network/if-up.d/resolved: 12: mystatedir: not found
> kernel: BTRFS error (device sdd): open_ctree failed
> mount[921]: mount: /srv/store: wrong fs type, bad option, bad superblock on /dev/sdd, missing codepage or helper program, or other error.
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 8 transid 309446 /dev/sde scanned by systemd-udevd (376)
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 6 transid 309446 /dev/sdg scanned by systemd-udevd (365)
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 7 transid 309446 /dev/sdh scanned by systemd-udevd (363)
> systemd[1]: srv-store.mount: Mount process exited, code=exited, status=32/n/a
>
> systemd[1]: srv-store.mount: Failed with result 'exit-code'.
> systemd[1]: Failed to mount SAN Storage.
> systemd[1]: Dependency failed for Remote File Systems.
> systemd[1]: remote-fs.target: Job remote-fs.target/start failed with result 'dependency'.
>
> kernel: sd 4:0:0:6: [sdc] Attached SCSI disk
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 12 transid 309446 /dev/sdc scanned by systemd-udevd (372)
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: race condition mounting multi-device iscsi volume, not resolved in newer kernels
2023-07-22 0:40 race condition mounting multi-device iscsi volume, not resolved in newer kernels Eric Levy
2023-07-22 4:34 ` Andrei Borzenkov
2023-07-22 5:27 ` Torbjörn Jansson
@ 2023-07-22 11:50 ` Anand Jain
2023-07-22 19:07 ` Eric Levy
2 siblings, 1 reply; 12+ messages in thread
From: Anand Jain @ 2023-07-22 11:50 UTC (permalink / raw)
To: Eric Levy, linux-btrfs
> After learning that a patch for similarly reported problems had been
> submitted in commit 50d281fc434c, I further tested with kernels 6.3.0
> and 6.5.0-rc2.
>
> The general problem has persisted even with the newer kernels.
So its not that commit then. Let's see more.
>
> The system is Linux Mint and all kernels are Ubuntu mainline.
>
> Mounting is possible manually after boot,
Noted.
> though the mount status of
> the volume appears to be left in an inconsistent state at such time.
Hm. Do you mean the filesystem is in an inconsistent state after
the manual successful manual mount? Do you have any error/warn logs?
What tells you that the filesysem is inconsistent?
>
> The below system log, with redaction for unrelated messages, is taken
> from a boot sequence for which mounting has failed, with kernel version
> 6.3.0.
>
> Note that attachment of device `sdc` follows the mount attempt, as well
> as the announcement of scan completion.
>
> ---
>
> kernel: Linux version 6.3.0-060300-generic (kernel@sita) (x86_64-linux-gnu-gcc-12 (Ubuntu 12.2.0-9ubuntu1) 12.2.0, GNU ld (GNU Binutils for Ubuntu) 2.39) #202304232030 SMP PREEMPT_DYNAMIC Sun Apr 23 20:37:49 UTC 2023
>
> systemd[1]: Starting Login to default iSCSI targets...
>
> systemd[1]: Starting iSCSI initiator daemon (iscsid)...
> iscsid[749]: iSCSI logger with pid=754 started!
> systemd[1]: iscsid.service: Failed to parse PID from file /run/iscsid.pid: Invalid argument
> iscsid[754]: iSCSI daemon with pid=755 started!
> systemd[1]: Started iSCSI initiator daemon (iscsid).
>
> kernel: Loading iSCSI transport class v2.0-870.
>
> kernel: iscsi: registered transport (tcp)
> kernel: scsi host4: iSCSI Initiator over TCP/IP
>
> kernel: scsi 4:0:0:6: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:6: [sdc] 2097152000 512-byte logical blocks: (1.07 TB/1000 GiB)
> kernel: sd 4:0:0:6: [sdc] Write Protect is off
> kernel: sd 4:0:0:6: [sdc] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:6: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:6: [sdc] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:6: [sdc] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:6: Attached scsi generic sg4 type 0
> kernel: scsi 4:0:0:7: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:7: Attached scsi generic sg5 type 0
> kernel: sd 4:0:0:7: [sdd] 2097152000 512-byte logical blocks: (1.07 TB/1000 GiB)
> kernel: scsi 4:0:0:3: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:7: [sdd] Write Protect is off
> kernel: sd 4:0:0:7: [sdd] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:7: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:7: [sdd] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:7: [sdd] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:3: [sde] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:3: [sde] Write Protect is off
> kernel: sd 4:0:0:3: [sde] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:3: Attached scsi generic sg6 type 0
> kernel: sd 4:0:0:3: [sde] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: scsi 4:0:0:4: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:3: [sde] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:3: [sde] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:4: [sdf] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:4: [sdf] Write Protect is off
> kernel: sd 4:0:0:4: [sdf] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:4: [sdf] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:4: [sdf] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:4: [sdf] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: sd 4:0:0:7: [sdd] Attached SCSI disk
> kernel: sd 4:0:0:3: [sde] Attached SCSI disk
> kernel: sd 4:0:0:4: [sdf] Attached SCSI disk
> kernel: sd 4:0:0:4: Attached scsi generic sg7 type 0
> kernel: scsi 4:0:0:2: Direct-Access SYNOLOGY Storage 4.0 PQ: 0 ANSI: 5
> kernel: sd 4:0:0:2: Attached scsi generic sg8 type 0
> kernel: sd 4:0:0:2: [sdg] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:2: [sdg] Write Protect is off
> kernel: sd 4:0:0:2: [sdg] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:2: [sdg] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:2: [sdg] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:2: [sdg] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> kernel: scsi 4:0:0:1: Direct-Access SYNOLOGY iSCSI Storage 4.0 PQ: 0 ANSI: 5
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 11 transid 309446 /dev/sdd scanned by systemd-udevd (371)
> kernel: sd 4:0:0:1: Attached scsi generic sg9 type 0
> kernel: sd 4:0:0:1: [sdh] 524288000 512-byte logical blocks: (268 GB/250 GiB)
> kernel: sd 4:0:0:1: [sdh] Write Protect is off
> kernel: sd 4:0:0:1: [sdh] Mode Sense: 43 00 10 08
> kernel: sd 4:0:0:1: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
> kernel: sd 4:0:0:1: [sdh] Preferred minimum I/O size 512 bytes
> kernel: sd 4:0:0:1: [sdh] Optimal transfer size 16384 logical blocks > dev_max (8192 logical blocks)
> iscsiadm[721]: Logging in to [iface: default, target: iqn.2000-01.com.synology:diskstation.default-target.b890048b949, portal: 10.4.1.2,3260]
> iscsiadm[721]: Login to [iface: default, target: iqn.2000-01.com.synology:diskstation.default-target.b890048b949, portal: 10.4.1.2,3260] successful.
> kernel: sd 4:0:0:2: [sdg] Attached SCSI disk
> kernel: sd 4:0:0:1: [sdh] Attached SCSI disk
>
> systemd[1]: Finished Login to default iSCSI targets.
> systemd[1]: Reached target Preparation for Remote File Systems.
> systemd[1]: Mounting SAN Storage...
> systemd[1]: Finished Availability of block devices.
6 devices sd[c-h] got discovered at this stage.
> kernel: BTRFS info (device sdd): using crc32c (crc32c-intel) checksum algorithm
> kernel: BTRFS info (device sdd): turning on async discard
> kernel: BTRFS info (device sdd): disk space caching is enabled
> kernel: BTRFS error (device sdd): devid 8 uuid 3d66ced8-24c1-45dc-9d70-8a921764dc88 is missing
Devid 8! How many devices do we have to this fsid?
As before do you have the dump-super?
Do you have -o degraded in the mount option?
Before (what ever is) calling the mount has it run
btrfs device scan ?
No I think because I don't see any device scanned messages.
> kernel: BTRFS error (device sdd): failed to read the system array: -2 > kernel: BTRFS error: device /dev/sdf belongs to fsid
c6f83d24-1ac3-4417-bdd9-6249c899604d, and the fs is already mounted
So what is the time diff between above two messages?
Already mounted => might be due to a race condition?
> nm-dispatcher[919]: /etc/network/if-up.d/resolved: 12: mystatedir: not found
> kernel: BTRFS error (device sdd): open_ctree failed
> mount[921]: mount: /srv/store: wrong fs type, bad option, bad superblock on /dev/sdd, missing codepage or helper program, or other error.
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 8 transid 309446 /dev/sde scanned by systemd-udevd (376)
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 6 transid 309446 /dev/sdg scanned by systemd-udevd (365)
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 7 transid 309446 /dev/sdh scanned by systemd-udevd (363)
Scanned messages appearing after the an attempt to mount so for sure
mount shall fail.
> systemd[1]: srv-store.mount: Mount process exited, code=exited, status=32/n/a
>
> systemd[1]: srv-store.mount: Failed with result 'exit-code'.
> systemd[1]: Failed to mount SAN Storage.
> systemd[1]: Dependency failed for Remote File Systems.
> systemd[1]: remote-fs.target: Job remote-fs.target/start failed with result 'dependency'.
>
> kernel: sd 4:0:0:6: [sdc] Attached SCSI disk
> kernel: BTRFS: device fsid c6f83d24-1ac3-4417-bdd9-6249c899604d devid 12 transid 309446 /dev/sdc scanned by systemd-udevd (372)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: race condition mounting multi-device iscsi volume, not resolved in newer kernels
2023-07-22 11:50 ` Anand Jain
@ 2023-07-22 19:07 ` Eric Levy
2023-07-22 19:11 ` Andrei Borzenkov
0 siblings, 1 reply; 12+ messages in thread
From: Eric Levy @ 2023-07-22 19:07 UTC (permalink / raw)
To: Anand Jain; +Cc: linux-btrfs
On Sat, Jul 22 2023 at 07:50:38 PM +0800, Anand Jain
<anand.jain@oracle.com> wrote
>> Hm. Do you mean the filesystem is in an inconsistent state after
> the manual successful manual mount? Do you have any error/warn logs?
> What tells you that the filesysem is inconsistent?
The state inconsistency refers to the mount session not to the file
system. The file system is completely healthy, having been cleanly
unmounted during the previous shutdown sequence, and no mount operation
having been completed during the current boot session.
However, within the running session, the system reports the file system
being already mounted, when mounting is attempted, even though the
mount point is empty and the mount operation had produced failure
messages.
> Devid 8! How many devices do we have to this fsid?
> As before do you have the dump-super?
>
> Do you have -o degraded in the mount option?
>
> Before (what ever is) calling the mount has it run
> btrfs device scan ?
The system has eight devices. The first two are system devices, the
remaining ones comprise the spanned volume.
I have not attempted mounting in degraded mode, nor would I do so,
since the file system itself is healthy, and I wish to keep it as such.
All of the failures have been failures to mount due to some devices
being unattached. As long as the system is not mounting cleanly, then
not mounting remains preferred, to maintain its health.
> So what is the time diff between above two messages?
> Already mounted => might be due to a race condition?
I could check the timestamps, but the differences I am certain are less
than one second. The messages are printed sequentially, and operations
are clearly not in the preferred order. Mounting should not be
attempted until all devices are attached.
> Scanned messages appearing after the an attempt to mount so for sure
> mount shall fail.
Right.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: race condition mounting multi-device iscsi volume, not resolved in newer kernels
2023-07-22 19:07 ` Eric Levy
@ 2023-07-22 19:11 ` Andrei Borzenkov
2023-07-24 3:27 ` Anand Jain
0 siblings, 1 reply; 12+ messages in thread
From: Andrei Borzenkov @ 2023-07-22 19:11 UTC (permalink / raw)
To: Eric Levy, Anand Jain; +Cc: linux-btrfs
On 22.07.2023 22:07, Eric Levy wrote:
> Mounting should not be
> attempted until all devices are attached.
>
btrfs does not attempt any mounting on its own. You need to raise this
up with whatever software component does it.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: race condition mounting multi-device iscsi volume, not resolved in newer kernels
2023-07-22 19:11 ` Andrei Borzenkov
@ 2023-07-24 3:27 ` Anand Jain
2023-07-24 4:08 ` Eric Levy
0 siblings, 1 reply; 12+ messages in thread
From: Anand Jain @ 2023-07-24 3:27 UTC (permalink / raw)
To: Andrei Borzenkov, Eric Levy; +Cc: linux-btrfs
On 23/7/23 03:11, Andrei Borzenkov wrote:
> On 22.07.2023 22:07, Eric Levy wrote:
>> Mounting should not be
>> attempted until all devices are attached.
>>
>
> btrfs does not attempt any mounting on its own. You need to raise this
> up with whatever software component does it.
>
Yep, systemd or whatever should call 'btrfs device scan' before
mounting. According to the logs, it seems that this didn't happen. As
the device scan occurs later, after the failed mount attempt, manually
mounting from the terminal is successful.
Furthermore, to understand how it worked before the upgrade, you can
share with us the boot logs from the older kernel.
Thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: race condition mounting multi-device iscsi volume, not resolved in newer kernels
2023-07-24 3:27 ` Anand Jain
@ 2023-07-24 4:08 ` Eric Levy
2023-07-24 4:42 ` Anand Jain
2023-07-24 6:36 ` Andrei Borzenkov
0 siblings, 2 replies; 12+ messages in thread
From: Eric Levy @ 2023-07-24 4:08 UTC (permalink / raw)
To: Anand Jain; +Cc: Andrei Borzenkov, linux-btrfs
On Mon, Jul 24 2023 at 11:27:25 AM +0800, Anand Jain
<anand.jain@oracle.com> wrote:
>
> Yep, systemd or whatever should call 'btrfs device scan' before
> mounting. According to the logs, it seems that this didn't happen. As
> the device scan occurs later, after the failed mount attempt,
> manually mounting from the terminal is successful.
>
> Furthermore, to understand how it worked before the upgrade, you can
> share with us the boot logs from the older kernel.
>
> Thanks.
I no longer believe that the upgrade introduced any new problem.
Broader testing with various kernel versions between 5.19.0 and
6.5.0-rc2 reveals no general differences among them.
My earlier complaint about the upgrade leading to degraded behavior, I
believe was premature, based simply on the bad luck of encountering the
race condition on particular instances immediately after the upgrade,
despite its having always been a problem.
The race condition itself remains an issue.
I have found a workaround, of moving the mount from a systemd unit to a
fstab line, and adding the _netdev fake mount option.
However, I think systemd should complete the device scan before
clearing the units remote-fs-pre.target and iscsi.service.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: race condition mounting multi-device iscsi volume, not resolved in newer kernels
2023-07-24 4:08 ` Eric Levy
@ 2023-07-24 4:42 ` Anand Jain
2023-07-24 6:36 ` Andrei Borzenkov
1 sibling, 0 replies; 12+ messages in thread
From: Anand Jain @ 2023-07-24 4:42 UTC (permalink / raw)
To: Eric Levy; +Cc: Andrei Borzenkov, linux-btrfs
On 24/7/23 12:08, Eric Levy wrote:
>
>
> On Mon, Jul 24 2023 at 11:27:25 AM +0800, Anand Jain
> <anand.jain@oracle.com> wrote:
>>
>> Yep, systemd or whatever should call 'btrfs device scan' before
>> mounting. According to the logs, it seems that this didn't happen. As
>> the device scan occurs later, after the failed mount attempt, manually
>> mounting from the terminal is successful.
>>
>> Furthermore, to understand how it worked before the upgrade, you can
>> share with us the boot logs from the older kernel.
>>
>> Thanks.
>
> I no longer believe that the upgrade introduced any new problem. Broader
> testing with various kernel versions between 5.19.0 and 6.5.0-rc2
> reveals no general differences among them.
>
> My earlier complaint about the upgrade leading to degraded behavior, I
> believe was premature, based simply on the bad luck of encountering the
> race condition on particular instances immediately after the upgrade,
> despite its having always been a problem.
>
> The race condition itself remains an issue.
I don't believe you will see error the "already mounted" if there is a
successful 'btrfs device scan'. So it shouldn't be an issue.
>
> I have found a workaround, of moving the mount from a systemd unit to a
> fstab line, and adding the _netdev fake mount option.
>
> However, I think systemd should complete the device scan before clearing
> the units remote-fs-pre.target and iscsi.service.
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: race condition mounting multi-device iscsi volume, not resolved in newer kernels
2023-07-24 4:08 ` Eric Levy
2023-07-24 4:42 ` Anand Jain
@ 2023-07-24 6:36 ` Andrei Borzenkov
1 sibling, 0 replies; 12+ messages in thread
From: Andrei Borzenkov @ 2023-07-24 6:36 UTC (permalink / raw)
To: Eric Levy; +Cc: Anand Jain, linux-btrfs
On Mon, Jul 24, 2023 at 7:38 AM Eric Levy <contact@ericlevy.name> wrote:
>
> However, I think systemd should complete the device scan before
> clearing the units remote-fs-pre.target and iscsi.service.
>
As I already told you, that is what systemd should be doing by default
for mount units generated from /etc/fstab. Or more precisely, the
default dependencies used by systemd should make sure btrfs mount is
not attempted until all devices that are part of this filesystem are
present. If you have evidence that it does not work you should file an
issue on systemd github.
If you are using your own mount unit it is entirely up to you to make
sure dependencies are correct. Do not blame the messenger.
^ permalink raw reply [flat|nested] 12+ messages in thread