public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: Boris Burkov <boris@bur.io>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: prevent device path updating during mount
Date: Sat, 16 Aug 2025 08:36:59 +0930	[thread overview]
Message-ID: <f05dc011-5fcd-4d84-a85b-2eebdbbf741a@suse.com> (raw)
In-Reply-To: <20250815204441.GA2973697@zen.localdomain>



在 2025/8/16 06:15, Boris Burkov 写道:
> On Mon, Aug 11, 2025 at 05:03:15PM +0930, Qu Wenruo wrote:
>> [REGRESSION]
>> After commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until
>> super block is created"), test case btrfs/315 can fail like this
>> randomly:
>>
>> btrfs/315 1s ... - output mismatch (see /home/adam/xfstests/results//btrfs/315.out.bad)
>>      --- tests/btrfs/315.out	2025-08-11 16:40:36.496000000 +0930
>>      +++ /home/adam/xfstests/results//btrfs/315.out.bad	2025-08-11 16:41:04.304000000 +0930
>>      @@ -1,7 +1,7 @@
>>       QA output created by 315
>>       ---- seed_device_must_fail ----
>>       mount: SCRATCH_MNT: WARNING: source write-protected, mounted read-only.
>>      -mount: File exists
>>      +mount: /mnt/test/315/tempfsid_mnt: WARNING: source write-protected, mounted read-only
>>       ---- device_add_must_fail ----
>>       wrote 9000/9000 bytes at offset 0
>>      ...
>>      (Run 'diff -u /home/adam/xfstests/tests/btrfs/315.out /home/adam/xfstests/results//btrfs/315.out.bad'  to see the entire diff)
>> Ran: btrfs/315
>> Failures: btrfs/315
>> Failed 1 of 1 tests
>>
>> [CAUSE]
>> The failure is that the second seed device (with a duplicated fsid and
>> dev uuid) is mounted successfully, which is unexpected.
>>
>> In my environment, the following 2 devices involved in the
>> "seed_device_must_fail" run are:
>>
>>   lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch1 -> ../dm-2
>>   lrwxrwxrwx 1 root root 7 Aug 11 09:03 /dev/test/scratch2 -> ../dm-4
>>
>> Note the kernel dmesg of that run, when mounting the first seed device
>> (scratch1), the real device got mounted is the second seed device
>> (scratch2):
>>
>>   BTRFS: device fsid 7c8a5017-0c44-456f-8acf-57663f954e53 devid 1 transid 9 /dev/mapper/test-scratch1 (253:2) scanned by mount (3343974)
>>   BTRFS info (device dm-4): first mount of filesystem 7c8a5017-0c44-456f-8acf-57663f954e53
>>   BTRFS info (device dm-4): using crc32c (crc32c-generic) checksum algorithm
>>   BTRFS info (device dm-4): using free-space-tre
>>
>> Note that "(device dm-4)" line, this means /dev/test/scratch2 is
>> mounted when running "mount /dev/test/scratch1 /mnt/scratch".
>>
>> Then when trying to mount /dev/test/scratch2, since the same device is
>> already mounted, it returns the same super block and do not fail.
>>
>> The root cause is, when setting seed device flags for both devices,
>> a btrfs device scan is triggered, that scan is delayed and can happen at
>> any time.
>>
>> So there is a race window between scanning the second device and
>> mounting the first device, where the device path can be replaced
>> halfway:
>>
>>       Mount scratch1                |          Scanning scratch2
>> -----------------------------------+-------------------------------------
>> btrfs_get_tree_super()             |
>> |- btrfs_scan_one_device()         |
>> |  We're mounting "scratch1"       |
>> |                                  |
>> |- btrfs_fs_devices_inc_holding()  |
>> |- mutex_unlock(&uuid_mutex);      |
>> |                                  | btrfs_scan_one_device()
>> |                                  | |- device_list_add()
>> |                                  |    |- rcu_assign_pointer()
>> |                                  |       This changes the device->name
>> |                                  |       to "scratch2"
>> |- sget_fc()                       |
>> |  This creates a new super block  |
>> |- mutex_lock(&uuid_mutex)         |
>> |- btrfs_fs_devices_dec_holding()  |
>> |- btrfs_open_devices()            |
>>     |- btrfs_open_one_device()      |
>>        "scratch2" is opened as that |
>>        is recorded in device->name   |
>>
>> Commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until super
>> block is created") introduced fs_devices holding mechanism to allow
>> devices to be opened after super block is created.
>>
>> But that holding period doesn't keep device->name untouched, allowing
>> a duplicated device to replace the path, thus mounting the incorrect
>> device.
>>
>> [FIX]
>> Also check fs_devices->holding value before replacing the device->name.
>> If fs_devices->holding is not zero, meaning someone is trying to mount
>> the fs, then do not allow device->name to be updated.
>>
>> Although this situation is rare, require certain race window and
>> two devices with duplicated fsid/dev uuid (which is already very cursed),
>> still output a warning message so that end users can be informed about
>> such cursed situation.
> 
> Thanks for working on this, the last time I worked in this code it made
> my head hurt, so I appreciate this is annoying to fix.
> 
> With that said, I have one concern: do you know if there are existing
> tests exercising the legit name change behaviors outlined in the big
> comment starting with "When FS is already mounted"?

No, I can not find such tests nor I experienced any failure with this 
patch applied.

Furthermore the cases mentioned in the original comment only applies for 
fs already mounted, for fses during mount I think those cases should 
directly cause failure.


Although I understand there are cases like devices disappear and 
re-appear, personally I doubt if we should even just re-add the device 
automatically.

There are more problems, especially related to the recently introduced 
block device holder.
We relies on the block device holder to reach a super block, but during 
scan we don't provide any device holder.

Thus if we really go into the branch that calls "device->devt = 
path_devt;", it will screw up future bdev freeze/thaw/sync/remove_bdev 
callbacks.

Thanks,
Qu

> I have a feeling
> that this check will prevent those as well, but I am not sufficiently
> familiar with the new holding delayed open logic to be sure.
> 
> Thanks,
> Boris
> 
>>
>> Fixes: bddf57a70781 ("btrfs: delay btrfs_open_devices() until super block is created")
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>> ---
>>   fs/btrfs/volumes.c | 14 ++++++++++++++
>>   1 file changed, 14 insertions(+)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index fa7a929a0461..4fdd84e0bff9 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -934,6 +934,20 @@ static noinline struct btrfs_device *device_list_add(const char *path,
>>   				  path, found_transid, device->generation);
>>   			return ERR_PTR(-EEXIST);
>>   		}
>> +		if (fs_devices->holding) {
>> +			/*
>> +			 * The fs_devices are already hold by an ongoing mount.
>> +			 *
>> +			 * We can not update the device path, or a duplicated
>> +			 * fsid/dev-uuid can replace the original path, causing
>> +			 * another device to be mounted.
>> +			 */
>> +			btrfs_warn(NULL,
>> +				   "device %s is trying to update path for a device being mounted",
>> +				   path);
>> +			mutex_unlock(&fs_devices->device_list_mutex);
>> +			return ERR_PTR(-EEXIST);
>> +		}
>>   
>>   		/*
>>   		 * We are going to replace the device path for a given devid,
>> -- 
>> 2.50.1
>>


      reply	other threads:[~2025-08-15 23:07 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-11  7:33 [PATCH] btrfs: prevent device path updating during mount Qu Wenruo
2025-08-15 20:45 ` Boris Burkov
2025-08-15 23:06   ` Qu Wenruo [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f05dc011-5fcd-4d84-a85b-2eebdbbf741a@suse.com \
    --to=wqu@suse.com \
    --cc=boris@bur.io \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox