* [PATCH] btrfs: prevent device path updating during mount
@ 2025-08-11 7:33 Qu Wenruo
2025-08-15 20:45 ` Boris Burkov
0 siblings, 1 reply; 3+ messages in thread
From: Qu Wenruo @ 2025-08-11 7:33 UTC (permalink / raw)
To: linux-btrfs
[REGRESSION]
After commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until
super block is created"), test case btrfs/315 can fail like this
randomly:
btrfs/315 1s ... - output mismatch (see /home/adam/xfstests/results//btrfs/315.out.bad)
--- tests/btrfs/315.out 2025-08-11 16:40:36.496000000 +0930
+++ /home/adam/xfstests/results//btrfs/315.out.bad 2025-08-11 16:41:04.304000000 +0930
@@ -1,7 +1,7 @@
QA output created by 315
---- seed_device_must_fail ----
mount: SCRATCH_MNT: WARNING: source write-protected, mounted read-only.
-mount: File exists
+mount: /mnt/test/315/tempfsid_mnt: WARNING: source write-protected, mounted read-only
---- device_add_must_fail ----
wrote 9000/9000 bytes at offset 0
...
(Run 'diff -u /home/adam/xfstests/tests/btrfs/315.out /home/adam/xfstests/results//btrfs/315.out.bad' to see the entire diff)
Ran: btrfs/315
Failures: btrfs/315
Failed 1 of 1 tests
[CAUSE]
The failure is that the second seed device (with a duplicated fsid and
dev uuid) is mounted successfully, which is unexpected.
In my environment, the following 2 devices involved in the
"seed_device_must_fail" run are:
lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch1 -> ../dm-2
lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch2 -> ../dm-4
Note the kernel dmesg of that run, when mounting the first seed device
(scratch1), the real device got mounted is the second seed device
(scratch2):
BTRFS: device fsid 7c8a5017-0c44-456f-8acf-57663f954e53 devid 1 transid 9 /dev/mapper/test-scratch1 (253:2) scanned by mount (3343974)
BTRFS info (device dm-4): first mount of filesystem 7c8a5017-0c44-456f-8acf-57663f954e53
BTRFS info (device dm-4): using crc32c (crc32c-generic) checksum algorithm
BTRFS info (device dm-4): using free-space-tre
Note that "(device dm-4)" line, this means /dev/test/scratch2 is
mounted when running "mount /dev/test/scratch1 /mnt/scratch".
Then when trying to mount /dev/test/scratch2, since the same device is
already mounted, it returns the same super block and do not fail.
The root cause is, when setting seed device flags for both devices,
a btrfs device scan is triggered, that scan is delayed and can happen at
any time.
So there is a race window between scanning the second device and
mounting the first device, where the device path can be replaced
halfway:
Mount scratch1 | Scanning scratch2
-----------------------------------+-------------------------------------
btrfs_get_tree_super() |
|- btrfs_scan_one_device() |
| We're mounting "scratch1" |
| |
|- btrfs_fs_devices_inc_holding() |
|- mutex_unlock(&uuid_mutex); |
| | btrfs_scan_one_device()
| | |- device_list_add()
| | |- rcu_assign_pointer()
| | This changes the device->name
| | to "scratch2"
|- sget_fc() |
| This creates a new super block |
|- mutex_lock(&uuid_mutex) |
|- btrfs_fs_devices_dec_holding() |
|- btrfs_open_devices() |
|- btrfs_open_one_device() |
"scratch2" is opened as that |
is recorded in device->name |
Commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until super
block is created") introduced fs_devices holding mechanism to allow
devices to be opened after super block is created.
But that holding period doesn't keep device->name untouched, allowing
a duplicated device to replace the path, thus mounting the incorrect
device.
[FIX]
Also check fs_devices->holding value before replacing the device->name.
If fs_devices->holding is not zero, meaning someone is trying to mount
the fs, then do not allow device->name to be updated.
Although this situation is rare, require certain race window and
two devices with duplicated fsid/dev uuid (which is already very cursed),
still output a warning message so that end users can be informed about
such cursed situation.
Fixes: bddf57a70781 ("btrfs: delay btrfs_open_devices() until super block is created")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/volumes.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index fa7a929a0461..4fdd84e0bff9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -934,6 +934,20 @@ static noinline struct btrfs_device *device_list_add(const char *path,
path, found_transid, device->generation);
return ERR_PTR(-EEXIST);
}
+ if (fs_devices->holding) {
+ /*
+ * The fs_devices are already hold by an ongoing mount.
+ *
+ * We can not update the device path, or a duplicated
+ * fsid/dev-uuid can replace the original path, causing
+ * another device to be mounted.
+ */
+ btrfs_warn(NULL,
+ "device %s is trying to update path for a device being mounted",
+ path);
+ mutex_unlock(&fs_devices->device_list_mutex);
+ return ERR_PTR(-EEXIST);
+ }
/*
* We are going to replace the device path for a given devid,
--
2.50.1
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH] btrfs: prevent device path updating during mount
2025-08-11 7:33 [PATCH] btrfs: prevent device path updating during mount Qu Wenruo
@ 2025-08-15 20:45 ` Boris Burkov
2025-08-15 23:06 ` Qu Wenruo
0 siblings, 1 reply; 3+ messages in thread
From: Boris Burkov @ 2025-08-15 20:45 UTC (permalink / raw)
To: Qu Wenruo; +Cc: linux-btrfs
On Mon, Aug 11, 2025 at 05:03:15PM +0930, Qu Wenruo wrote:
> [REGRESSION]
> After commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until
> super block is created"), test case btrfs/315 can fail like this
> randomly:
>
> btrfs/315 1s ... - output mismatch (see /home/adam/xfstests/results//btrfs/315.out.bad)
> --- tests/btrfs/315.out 2025-08-11 16:40:36.496000000 +0930
> +++ /home/adam/xfstests/results//btrfs/315.out.bad 2025-08-11 16:41:04.304000000 +0930
> @@ -1,7 +1,7 @@
> QA output created by 315
> ---- seed_device_must_fail ----
> mount: SCRATCH_MNT: WARNING: source write-protected, mounted read-only.
> -mount: File exists
> +mount: /mnt/test/315/tempfsid_mnt: WARNING: source write-protected, mounted read-only
> ---- device_add_must_fail ----
> wrote 9000/9000 bytes at offset 0
> ...
> (Run 'diff -u /home/adam/xfstests/tests/btrfs/315.out /home/adam/xfstests/results//btrfs/315.out.bad' to see the entire diff)
> Ran: btrfs/315
> Failures: btrfs/315
> Failed 1 of 1 tests
>
> [CAUSE]
> The failure is that the second seed device (with a duplicated fsid and
> dev uuid) is mounted successfully, which is unexpected.
>
> In my environment, the following 2 devices involved in the
> "seed_device_must_fail" run are:
>
> lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch1 -> ../dm-2
> lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch2 -> ../dm-4
>
> Note the kernel dmesg of that run, when mounting the first seed device
> (scratch1), the real device got mounted is the second seed device
> (scratch2):
>
> BTRFS: device fsid 7c8a5017-0c44-456f-8acf-57663f954e53 devid 1 transid 9 /dev/mapper/test-scratch1 (253:2) scanned by mount (3343974)
> BTRFS info (device dm-4): first mount of filesystem 7c8a5017-0c44-456f-8acf-57663f954e53
> BTRFS info (device dm-4): using crc32c (crc32c-generic) checksum algorithm
> BTRFS info (device dm-4): using free-space-tre
>
> Note that "(device dm-4)" line, this means /dev/test/scratch2 is
> mounted when running "mount /dev/test/scratch1 /mnt/scratch".
>
> Then when trying to mount /dev/test/scratch2, since the same device is
> already mounted, it returns the same super block and do not fail.
>
> The root cause is, when setting seed device flags for both devices,
> a btrfs device scan is triggered, that scan is delayed and can happen at
> any time.
>
> So there is a race window between scanning the second device and
> mounting the first device, where the device path can be replaced
> halfway:
>
> Mount scratch1 | Scanning scratch2
> -----------------------------------+-------------------------------------
> btrfs_get_tree_super() |
> |- btrfs_scan_one_device() |
> | We're mounting "scratch1" |
> | |
> |- btrfs_fs_devices_inc_holding() |
> |- mutex_unlock(&uuid_mutex); |
> | | btrfs_scan_one_device()
> | | |- device_list_add()
> | | |- rcu_assign_pointer()
> | | This changes the device->name
> | | to "scratch2"
> |- sget_fc() |
> | This creates a new super block |
> |- mutex_lock(&uuid_mutex) |
> |- btrfs_fs_devices_dec_holding() |
> |- btrfs_open_devices() |
> |- btrfs_open_one_device() |
> "scratch2" is opened as that |
> is recorded in device->name |
>
> Commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until super
> block is created") introduced fs_devices holding mechanism to allow
> devices to be opened after super block is created.
>
> But that holding period doesn't keep device->name untouched, allowing
> a duplicated device to replace the path, thus mounting the incorrect
> device.
>
> [FIX]
> Also check fs_devices->holding value before replacing the device->name.
> If fs_devices->holding is not zero, meaning someone is trying to mount
> the fs, then do not allow device->name to be updated.
>
> Although this situation is rare, require certain race window and
> two devices with duplicated fsid/dev uuid (which is already very cursed),
> still output a warning message so that end users can be informed about
> such cursed situation.
Thanks for working on this, the last time I worked in this code it made
my head hurt, so I appreciate this is annoying to fix.
With that said, I have one concern: do you know if there are existing
tests exercising the legit name change behaviors outlined in the big
comment starting with "When FS is already mounted"? I have a feeling
that this check will prevent those as well, but I am not sufficiently
familiar with the new holding delayed open logic to be sure.
Thanks,
Boris
>
> Fixes: bddf57a70781 ("btrfs: delay btrfs_open_devices() until super block is created")
> Signed-off-by: Qu Wenruo <wqu@suse.com>
> ---
> fs/btrfs/volumes.c | 14 ++++++++++++++
> 1 file changed, 14 insertions(+)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index fa7a929a0461..4fdd84e0bff9 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -934,6 +934,20 @@ static noinline struct btrfs_device *device_list_add(const char *path,
> path, found_transid, device->generation);
> return ERR_PTR(-EEXIST);
> }
> + if (fs_devices->holding) {
> + /*
> + * The fs_devices are already hold by an ongoing mount.
> + *
> + * We can not update the device path, or a duplicated
> + * fsid/dev-uuid can replace the original path, causing
> + * another device to be mounted.
> + */
> + btrfs_warn(NULL,
> + "device %s is trying to update path for a device being mounted",
> + path);
> + mutex_unlock(&fs_devices->device_list_mutex);
> + return ERR_PTR(-EEXIST);
> + }
>
> /*
> * We are going to replace the device path for a given devid,
> --
> 2.50.1
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] btrfs: prevent device path updating during mount
2025-08-15 20:45 ` Boris Burkov
@ 2025-08-15 23:06 ` Qu Wenruo
0 siblings, 0 replies; 3+ messages in thread
From: Qu Wenruo @ 2025-08-15 23:06 UTC (permalink / raw)
To: Boris Burkov; +Cc: linux-btrfs
在 2025/8/16 06:15, Boris Burkov 写道:
> On Mon, Aug 11, 2025 at 05:03:15PM +0930, Qu Wenruo wrote:
>> [REGRESSION]
>> After commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until
>> super block is created"), test case btrfs/315 can fail like this
>> randomly:
>>
>> btrfs/315 1s ... - output mismatch (see /home/adam/xfstests/results//btrfs/315.out.bad)
>> --- tests/btrfs/315.out 2025-08-11 16:40:36.496000000 +0930
>> +++ /home/adam/xfstests/results//btrfs/315.out.bad 2025-08-11 16:41:04.304000000 +0930
>> @@ -1,7 +1,7 @@
>> QA output created by 315
>> ---- seed_device_must_fail ----
>> mount: SCRATCH_MNT: WARNING: source write-protected, mounted read-only.
>> -mount: File exists
>> +mount: /mnt/test/315/tempfsid_mnt: WARNING: source write-protected, mounted read-only
>> ---- device_add_must_fail ----
>> wrote 9000/9000 bytes at offset 0
>> ...
>> (Run 'diff -u /home/adam/xfstests/tests/btrfs/315.out /home/adam/xfstests/results//btrfs/315.out.bad' to see the entire diff)
>> Ran: btrfs/315
>> Failures: btrfs/315
>> Failed 1 of 1 tests
>>
>> [CAUSE]
>> The failure is that the second seed device (with a duplicated fsid and
>> dev uuid) is mounted successfully, which is unexpected.
>>
>> In my environment, the following 2 devices involved in the
>> "seed_device_must_fail" run are:
>>
>> lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch1 -> ../dm-2
>> lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch2 -> ../dm-4
>>
>> Note the kernel dmesg of that run, when mounting the first seed device
>> (scratch1), the real device got mounted is the second seed device
>> (scratch2):
>>
>> BTRFS: device fsid 7c8a5017-0c44-456f-8acf-57663f954e53 devid 1 transid 9 /dev/mapper/test-scratch1 (253:2) scanned by mount (3343974)
>> BTRFS info (device dm-4): first mount of filesystem 7c8a5017-0c44-456f-8acf-57663f954e53
>> BTRFS info (device dm-4): using crc32c (crc32c-generic) checksum algorithm
>> BTRFS info (device dm-4): using free-space-tre
>>
>> Note that "(device dm-4)" line, this means /dev/test/scratch2 is
>> mounted when running "mount /dev/test/scratch1 /mnt/scratch".
>>
>> Then when trying to mount /dev/test/scratch2, since the same device is
>> already mounted, it returns the same super block and do not fail.
>>
>> The root cause is, when setting seed device flags for both devices,
>> a btrfs device scan is triggered, that scan is delayed and can happen at
>> any time.
>>
>> So there is a race window between scanning the second device and
>> mounting the first device, where the device path can be replaced
>> halfway:
>>
>> Mount scratch1 | Scanning scratch2
>> -----------------------------------+-------------------------------------
>> btrfs_get_tree_super() |
>> |- btrfs_scan_one_device() |
>> | We're mounting "scratch1" |
>> | |
>> |- btrfs_fs_devices_inc_holding() |
>> |- mutex_unlock(&uuid_mutex); |
>> | | btrfs_scan_one_device()
>> | | |- device_list_add()
>> | | |- rcu_assign_pointer()
>> | | This changes the device->name
>> | | to "scratch2"
>> |- sget_fc() |
>> | This creates a new super block |
>> |- mutex_lock(&uuid_mutex) |
>> |- btrfs_fs_devices_dec_holding() |
>> |- btrfs_open_devices() |
>> |- btrfs_open_one_device() |
>> "scratch2" is opened as that |
>> is recorded in device->name |
>>
>> Commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until super
>> block is created") introduced fs_devices holding mechanism to allow
>> devices to be opened after super block is created.
>>
>> But that holding period doesn't keep device->name untouched, allowing
>> a duplicated device to replace the path, thus mounting the incorrect
>> device.
>>
>> [FIX]
>> Also check fs_devices->holding value before replacing the device->name.
>> If fs_devices->holding is not zero, meaning someone is trying to mount
>> the fs, then do not allow device->name to be updated.
>>
>> Although this situation is rare, require certain race window and
>> two devices with duplicated fsid/dev uuid (which is already very cursed),
>> still output a warning message so that end users can be informed about
>> such cursed situation.
>
> Thanks for working on this, the last time I worked in this code it made
> my head hurt, so I appreciate this is annoying to fix.
>
> With that said, I have one concern: do you know if there are existing
> tests exercising the legit name change behaviors outlined in the big
> comment starting with "When FS is already mounted"?
No, I can not find such tests nor I experienced any failure with this
patch applied.
Furthermore the cases mentioned in the original comment only applies for
fs already mounted, for fses during mount I think those cases should
directly cause failure.
Although I understand there are cases like devices disappear and
re-appear, personally I doubt if we should even just re-add the device
automatically.
There are more problems, especially related to the recently introduced
block device holder.
We relies on the block device holder to reach a super block, but during
scan we don't provide any device holder.
Thus if we really go into the branch that calls "device->devt =
path_devt;", it will screw up future bdev freeze/thaw/sync/remove_bdev
callbacks.
Thanks,
Qu
> I have a feeling
> that this check will prevent those as well, but I am not sufficiently
> familiar with the new holding delayed open logic to be sure.
>
> Thanks,
> Boris
>
>>
>> Fixes: bddf57a70781 ("btrfs: delay btrfs_open_devices() until super block is created")
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>> fs/btrfs/volumes.c | 14 ++++++++++++++
>> 1 file changed, 14 insertions(+)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index fa7a929a0461..4fdd84e0bff9 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -934,6 +934,20 @@ static noinline struct btrfs_device *device_list_add(const char *path,
>> path, found_transid, device->generation);
>> return ERR_PTR(-EEXIST);
>> }
>> + if (fs_devices->holding) {
>> + /*
>> + * The fs_devices are already hold by an ongoing mount.
>> + *
>> + * We can not update the device path, or a duplicated
>> + * fsid/dev-uuid can replace the original path, causing
>> + * another device to be mounted.
>> + */
>> + btrfs_warn(NULL,
>> + "device %s is trying to update path for a device being mounted",
>> + path);
>> + mutex_unlock(&fs_devices->device_list_mutex);
>> + return ERR_PTR(-EEXIST);
>> + }
>>
>> /*
>> * We are going to replace the device path for a given devid,
>> --
>> 2.50.1
>>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-08-15 23:07 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-11 7:33 [PATCH] btrfs: prevent device path updating during mount Qu Wenruo
2025-08-15 20:45 ` Boris Burkov
2025-08-15 23:06 ` Qu Wenruo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox