* [PATCH] btrfs: prevent device path updating during mount
@ 2025-08-11 7:33 Qu Wenruo
2025-08-15 20:45 ` Boris Burkov
0 siblings, 1 reply; 3+ messages in thread
From: Qu Wenruo @ 2025-08-11 7:33 UTC (permalink / raw)
To: linux-btrfs
[REGRESSION]
After commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until
super block is created"), test case btrfs/315 can fail like this
randomly:
btrfs/315 1s ... - output mismatch (see /home/adam/xfstests/results//btrfs/315.out.bad)
--- tests/btrfs/315.out 2025-08-11 16:40:36.496000000 +0930
+++ /home/adam/xfstests/results//btrfs/315.out.bad 2025-08-11 16:41:04.304000000 +0930
@@ -1,7 +1,7 @@
QA output created by 315
---- seed_device_must_fail ----
mount: SCRATCH_MNT: WARNING: source write-protected, mounted read-only.
-mount: File exists
+mount: /mnt/test/315/tempfsid_mnt: WARNING: source write-protected, mounted read-only
---- device_add_must_fail ----
wrote 9000/9000 bytes at offset 0
...
(Run 'diff -u /home/adam/xfstests/tests/btrfs/315.out /home/adam/xfstests/results//btrfs/315.out.bad' to see the entire diff)
Ran: btrfs/315
Failures: btrfs/315
Failed 1 of 1 tests
[CAUSE]
The failure is that the second seed device (with a duplicated fsid and
dev uuid) is mounted successfully, which is unexpected.
In my environment, the following 2 devices involved in the
"seed_device_must_fail" run are:
lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch1 -> ../dm-2
lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch2 -> ../dm-4
Note the kernel dmesg of that run, when mounting the first seed device
(scratch1), the real device got mounted is the second seed device
(scratch2):
BTRFS: device fsid 7c8a5017-0c44-456f-8acf-57663f954e53 devid 1 transid 9 /dev/mapper/test-scratch1 (253:2) scanned by mount (3343974)
BTRFS info (device dm-4): first mount of filesystem 7c8a5017-0c44-456f-8acf-57663f954e53
BTRFS info (device dm-4): using crc32c (crc32c-generic) checksum algorithm
BTRFS info (device dm-4): using free-space-tre
Note that "(device dm-4)" line, this means /dev/test/scratch2 is
mounted when running "mount /dev/test/scratch1 /mnt/scratch".
Then when trying to mount /dev/test/scratch2, since the same device is
already mounted, it returns the same super block and do not fail.
The root cause is, when setting seed device flags for both devices,
a btrfs device scan is triggered, that scan is delayed and can happen at
any time.
So there is a race window between scanning the second device and
mounting the first device, where the device path can be replaced
halfway:
Mount scratch1 | Scanning scratch2
-----------------------------------+-------------------------------------
btrfs_get_tree_super() |
|- btrfs_scan_one_device() |
| We're mounting "scratch1" |
| |
|- btrfs_fs_devices_inc_holding() |
|- mutex_unlock(&uuid_mutex); |
| | btrfs_scan_one_device()
| | |- device_list_add()
| | |- rcu_assign_pointer()
| | This changes the device->name
| | to "scratch2"
|- sget_fc() |
| This creates a new super block |
|- mutex_lock(&uuid_mutex) |
|- btrfs_fs_devices_dec_holding() |
|- btrfs_open_devices() |
|- btrfs_open_one_device() |
"scratch2" is opened as that |
is recorded in device->name |
Commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until super
block is created") introduced fs_devices holding mechanism to allow
devices to be opened after super block is created.
But that holding period doesn't keep device->name untouched, allowing
a duplicated device to replace the path, thus mounting the incorrect
device.
[FIX]
Also check fs_devices->holding value before replacing the device->name.
If fs_devices->holding is not zero, meaning someone is trying to mount
the fs, then do not allow device->name to be updated.
Although this situation is rare, require certain race window and
two devices with duplicated fsid/dev uuid (which is already very cursed),
still output a warning message so that end users can be informed about
such cursed situation.
Fixes: bddf57a70781 ("btrfs: delay btrfs_open_devices() until super block is created")
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
fs/btrfs/volumes.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index fa7a929a0461..4fdd84e0bff9 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -934,6 +934,20 @@ static noinline struct btrfs_device *device_list_add(const char *path,
path, found_transid, device->generation);
return ERR_PTR(-EEXIST);
}
+ if (fs_devices->holding) {
+ /*
+ * The fs_devices are already hold by an ongoing mount.
+ *
+ * We can not update the device path, or a duplicated
+ * fsid/dev-uuid can replace the original path, causing
+ * another device to be mounted.
+ */
+ btrfs_warn(NULL,
+ "device %s is trying to update path for a device being mounted",
+ path);
+ mutex_unlock(&fs_devices->device_list_mutex);
+ return ERR_PTR(-EEXIST);
+ }
/*
* We are going to replace the device path for a given devid,
--
2.50.1
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH] btrfs: prevent device path updating during mount 2025-08-11 7:33 [PATCH] btrfs: prevent device path updating during mount Qu Wenruo @ 2025-08-15 20:45 ` Boris Burkov 2025-08-15 23:06 ` Qu Wenruo 0 siblings, 1 reply; 3+ messages in thread From: Boris Burkov @ 2025-08-15 20:45 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs On Mon, Aug 11, 2025 at 05:03:15PM +0930, Qu Wenruo wrote: > [REGRESSION] > After commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until > super block is created"), test case btrfs/315 can fail like this > randomly: > > btrfs/315 1s ... - output mismatch (see /home/adam/xfstests/results//btrfs/315.out.bad) > --- tests/btrfs/315.out 2025-08-11 16:40:36.496000000 +0930 > +++ /home/adam/xfstests/results//btrfs/315.out.bad 2025-08-11 16:41:04.304000000 +0930 > @@ -1,7 +1,7 @@ > QA output created by 315 > ---- seed_device_must_fail ---- > mount: SCRATCH_MNT: WARNING: source write-protected, mounted read-only. > -mount: File exists > +mount: /mnt/test/315/tempfsid_mnt: WARNING: source write-protected, mounted read-only > ---- device_add_must_fail ---- > wrote 9000/9000 bytes at offset 0 > ... > (Run 'diff -u /home/adam/xfstests/tests/btrfs/315.out /home/adam/xfstests/results//btrfs/315.out.bad' to see the entire diff) > Ran: btrfs/315 > Failures: btrfs/315 > Failed 1 of 1 tests > > [CAUSE] > The failure is that the second seed device (with a duplicated fsid and > dev uuid) is mounted successfully, which is unexpected. > > In my environment, the following 2 devices involved in the > "seed_device_must_fail" run are: > > lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch1 -> ../dm-2 > lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch2 -> ../dm-4 > > Note the kernel dmesg of that run, when mounting the first seed device > (scratch1), the real device got mounted is the second seed device > (scratch2): > > BTRFS: device fsid 7c8a5017-0c44-456f-8acf-57663f954e53 devid 1 transid 9 /dev/mapper/test-scratch1 (253:2) scanned by mount (3343974) > BTRFS info (device dm-4): first mount of filesystem 7c8a5017-0c44-456f-8acf-57663f954e53 > BTRFS info (device dm-4): using crc32c (crc32c-generic) checksum algorithm > BTRFS info (device dm-4): using free-space-tre > > Note that "(device dm-4)" line, this means /dev/test/scratch2 is > mounted when running "mount /dev/test/scratch1 /mnt/scratch". > > Then when trying to mount /dev/test/scratch2, since the same device is > already mounted, it returns the same super block and do not fail. > > The root cause is, when setting seed device flags for both devices, > a btrfs device scan is triggered, that scan is delayed and can happen at > any time. > > So there is a race window between scanning the second device and > mounting the first device, where the device path can be replaced > halfway: > > Mount scratch1 | Scanning scratch2 > -----------------------------------+------------------------------------- > btrfs_get_tree_super() | > |- btrfs_scan_one_device() | > | We're mounting "scratch1" | > | | > |- btrfs_fs_devices_inc_holding() | > |- mutex_unlock(&uuid_mutex); | > | | btrfs_scan_one_device() > | | |- device_list_add() > | | |- rcu_assign_pointer() > | | This changes the device->name > | | to "scratch2" > |- sget_fc() | > | This creates a new super block | > |- mutex_lock(&uuid_mutex) | > |- btrfs_fs_devices_dec_holding() | > |- btrfs_open_devices() | > |- btrfs_open_one_device() | > "scratch2" is opened as that | > is recorded in device->name | > > Commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until super > block is created") introduced fs_devices holding mechanism to allow > devices to be opened after super block is created. > > But that holding period doesn't keep device->name untouched, allowing > a duplicated device to replace the path, thus mounting the incorrect > device. > > [FIX] > Also check fs_devices->holding value before replacing the device->name. > If fs_devices->holding is not zero, meaning someone is trying to mount > the fs, then do not allow device->name to be updated. > > Although this situation is rare, require certain race window and > two devices with duplicated fsid/dev uuid (which is already very cursed), > still output a warning message so that end users can be informed about > such cursed situation. Thanks for working on this, the last time I worked in this code it made my head hurt, so I appreciate this is annoying to fix. With that said, I have one concern: do you know if there are existing tests exercising the legit name change behaviors outlined in the big comment starting with "When FS is already mounted"? I have a feeling that this check will prevent those as well, but I am not sufficiently familiar with the new holding delayed open logic to be sure. Thanks, Boris > > Fixes: bddf57a70781 ("btrfs: delay btrfs_open_devices() until super block is created") > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > fs/btrfs/volumes.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index fa7a929a0461..4fdd84e0bff9 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -934,6 +934,20 @@ static noinline struct btrfs_device *device_list_add(const char *path, > path, found_transid, device->generation); > return ERR_PTR(-EEXIST); > } > + if (fs_devices->holding) { > + /* > + * The fs_devices are already hold by an ongoing mount. > + * > + * We can not update the device path, or a duplicated > + * fsid/dev-uuid can replace the original path, causing > + * another device to be mounted. > + */ > + btrfs_warn(NULL, > + "device %s is trying to update path for a device being mounted", > + path); > + mutex_unlock(&fs_devices->device_list_mutex); > + return ERR_PTR(-EEXIST); > + } > > /* > * We are going to replace the device path for a given devid, > -- > 2.50.1 > ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH] btrfs: prevent device path updating during mount 2025-08-15 20:45 ` Boris Burkov @ 2025-08-15 23:06 ` Qu Wenruo 0 siblings, 0 replies; 3+ messages in thread From: Qu Wenruo @ 2025-08-15 23:06 UTC (permalink / raw) To: Boris Burkov; +Cc: linux-btrfs 在 2025/8/16 06:15, Boris Burkov 写道: > On Mon, Aug 11, 2025 at 05:03:15PM +0930, Qu Wenruo wrote: >> [REGRESSION] >> After commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until >> super block is created"), test case btrfs/315 can fail like this >> randomly: >> >> btrfs/315 1s ... - output mismatch (see /home/adam/xfstests/results//btrfs/315.out.bad) >> --- tests/btrfs/315.out 2025-08-11 16:40:36.496000000 +0930 >> +++ /home/adam/xfstests/results//btrfs/315.out.bad 2025-08-11 16:41:04.304000000 +0930 >> @@ -1,7 +1,7 @@ >> QA output created by 315 >> ---- seed_device_must_fail ---- >> mount: SCRATCH_MNT: WARNING: source write-protected, mounted read-only. >> -mount: File exists >> +mount: /mnt/test/315/tempfsid_mnt: WARNING: source write-protected, mounted read-only >> ---- device_add_must_fail ---- >> wrote 9000/9000 bytes at offset 0 >> ... >> (Run 'diff -u /home/adam/xfstests/tests/btrfs/315.out /home/adam/xfstests/results//btrfs/315.out.bad' to see the entire diff) >> Ran: btrfs/315 >> Failures: btrfs/315 >> Failed 1 of 1 tests >> >> [CAUSE] >> The failure is that the second seed device (with a duplicated fsid and >> dev uuid) is mounted successfully, which is unexpected. >> >> In my environment, the following 2 devices involved in the >> "seed_device_must_fail" run are: >> >> lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch1 -> ../dm-2 >> lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch2 -> ../dm-4 >> >> Note the kernel dmesg of that run, when mounting the first seed device >> (scratch1), the real device got mounted is the second seed device >> (scratch2): >> >> BTRFS: device fsid 7c8a5017-0c44-456f-8acf-57663f954e53 devid 1 transid 9 /dev/mapper/test-scratch1 (253:2) scanned by mount (3343974) >> BTRFS info (device dm-4): first mount of filesystem 7c8a5017-0c44-456f-8acf-57663f954e53 >> BTRFS info (device dm-4): using crc32c (crc32c-generic) checksum algorithm >> BTRFS info (device dm-4): using free-space-tre >> >> Note that "(device dm-4)" line, this means /dev/test/scratch2 is >> mounted when running "mount /dev/test/scratch1 /mnt/scratch". >> >> Then when trying to mount /dev/test/scratch2, since the same device is >> already mounted, it returns the same super block and do not fail. >> >> The root cause is, when setting seed device flags for both devices, >> a btrfs device scan is triggered, that scan is delayed and can happen at >> any time. >> >> So there is a race window between scanning the second device and >> mounting the first device, where the device path can be replaced >> halfway: >> >> Mount scratch1 | Scanning scratch2 >> -----------------------------------+------------------------------------- >> btrfs_get_tree_super() | >> |- btrfs_scan_one_device() | >> | We're mounting "scratch1" | >> | | >> |- btrfs_fs_devices_inc_holding() | >> |- mutex_unlock(&uuid_mutex); | >> | | btrfs_scan_one_device() >> | | |- device_list_add() >> | | |- rcu_assign_pointer() >> | | This changes the device->name >> | | to "scratch2" >> |- sget_fc() | >> | This creates a new super block | >> |- mutex_lock(&uuid_mutex) | >> |- btrfs_fs_devices_dec_holding() | >> |- btrfs_open_devices() | >> |- btrfs_open_one_device() | >> "scratch2" is opened as that | >> is recorded in device->name | >> >> Commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until super >> block is created") introduced fs_devices holding mechanism to allow >> devices to be opened after super block is created. >> >> But that holding period doesn't keep device->name untouched, allowing >> a duplicated device to replace the path, thus mounting the incorrect >> device. >> >> [FIX] >> Also check fs_devices->holding value before replacing the device->name. >> If fs_devices->holding is not zero, meaning someone is trying to mount >> the fs, then do not allow device->name to be updated. >> >> Although this situation is rare, require certain race window and >> two devices with duplicated fsid/dev uuid (which is already very cursed), >> still output a warning message so that end users can be informed about >> such cursed situation. > > Thanks for working on this, the last time I worked in this code it made > my head hurt, so I appreciate this is annoying to fix. > > With that said, I have one concern: do you know if there are existing > tests exercising the legit name change behaviors outlined in the big > comment starting with "When FS is already mounted"? No, I can not find such tests nor I experienced any failure with this patch applied. Furthermore the cases mentioned in the original comment only applies for fs already mounted, for fses during mount I think those cases should directly cause failure. Although I understand there are cases like devices disappear and re-appear, personally I doubt if we should even just re-add the device automatically. There are more problems, especially related to the recently introduced block device holder. We relies on the block device holder to reach a super block, but during scan we don't provide any device holder. Thus if we really go into the branch that calls "device->devt = path_devt;", it will screw up future bdev freeze/thaw/sync/remove_bdev callbacks. Thanks, Qu > I have a feeling > that this check will prevent those as well, but I am not sufficiently > familiar with the new holding delayed open logic to be sure. > > Thanks, > Boris > >> >> Fixes: bddf57a70781 ("btrfs: delay btrfs_open_devices() until super block is created") >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> fs/btrfs/volumes.c | 14 ++++++++++++++ >> 1 file changed, 14 insertions(+) >> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >> index fa7a929a0461..4fdd84e0bff9 100644 >> --- a/fs/btrfs/volumes.c >> +++ b/fs/btrfs/volumes.c >> @@ -934,6 +934,20 @@ static noinline struct btrfs_device *device_list_add(const char *path, >> path, found_transid, device->generation); >> return ERR_PTR(-EEXIST); >> } >> + if (fs_devices->holding) { >> + /* >> + * The fs_devices are already hold by an ongoing mount. >> + * >> + * We can not update the device path, or a duplicated >> + * fsid/dev-uuid can replace the original path, causing >> + * another device to be mounted. >> + */ >> + btrfs_warn(NULL, >> + "device %s is trying to update path for a device being mounted", >> + path); >> + mutex_unlock(&fs_devices->device_list_mutex); >> + return ERR_PTR(-EEXIST); >> + } >> >> /* >> * We are going to replace the device path for a given devid, >> -- >> 2.50.1 >> ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2025-08-15 23:07 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-08-11 7:33 [PATCH] btrfs: prevent device path updating during mount Qu Wenruo 2025-08-15 20:45 ` Boris Burkov 2025-08-15 23:06 ` Qu Wenruo
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox