From: Anand Jain <anand.jain@oracle.com>
To: Alex Romosan <aromosan@gmail.com>
Cc: Filipe Manana <fdmanana@kernel.org>,
David Sterba <dsterba@suse.com>,
CHECK_1234543212345@protonmail.com, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v4 RFC] btrfs: do not skip re-registration for the mounted device
Date: Fri, 8 Mar 2024 21:18:38 +0530 [thread overview]
Message-ID: <9a8f7496-b5ad-45f5-ba0b-5690a8a39fa6@oracle.com> (raw)
In-Reply-To: <CAKLYgeLgp4=QxmSEZS1+eUhdfjh-S-hu+HgtFeq51-jgj2EGTQ@mail.gmail.com>
On 3/8/24 21:07, Alex Romosan wrote:
> confirming that update-grub works with v4 of the patch. these are the
> relevant entries in the log:
>
> Btrfs loaded, debug=on, zoned=no, fsverity=no
> BTRFS: device fsid 695aa7ac-862a-4de3-ae59-c96f784600a0 devid 1
> transid 2026388 /dev/root scanned by swapper/0 (1)
> BTRFS info (device nvme0n1p3): first mount of filesystem
> 695aa7ac-862a-4de3-ae59-c96f784600a0
> BTRFS info (device nvme0n1p3): using crc32c (crc32c-generic) checksum algorithm
> BTRFS info (device nvme0n1p3): using free-space-tree
> VFS: Mounted root (btrfs filesystem) readonly on device 0:19.
> BTRFS info: devid 1 device path /dev/root changed to /dev/nvme0n1p3
> scanned by (udev-worker) (279)
Thank you for reconfirming and for sharing the logs, which clearly
depict the device path updates. Additionally, there is a separate
patch to include the major and minor numbers in these messages,
enhancing clarity further.
-Anand
> On Fri, Mar 8, 2024 at 3:37 PM Anand Jain <anand.jain@oracle.com> wrote:
>>
>>
>> Sure.
>> Here is the link to the latest version of the patch, which is v4.
>> It is based on the mainline master.
>>
>> https://patchwork.kernel.org/project/linux-btrfs/patch/65a11e853a31b18b620f31cbbddf03e277fe3edf.1709809171.git.anand.jain@oracle.com/
>>
>> Thanks, Anand
>>
>> On 3/8/24 20:02, Alex Romosan wrote:
>>> sorry about the previous html mail.
>>>
>>> Just to eliminate any confusion, can you please provide either a link
>>> to v4 of the patch or include it in the reply to this and explicitly
>>> labeled as such? I am beginning to have doubts as to the version I was
>>> testing. Thanks
>>>
>>> On Fri, Mar 8, 2024 at 2:52 PM Anand Jain <anand.jain@oracle.com> wrote:
>>>>
>>>>
>>>>
>>>> On 3/8/24 17:45, Filipe Manana wrote:
>>>>> On Fri, Mar 8, 2024 at 2:28 AM Anand Jain <anand.jain@oracle.com> wrote:
>>>>>>
>>>>>> Filipe,
>>>>>>
>>>>>> We've received confirmation from the user that the original update-grub
>>>>>> issue has been fixed. Additionally, your reported issue using
>>>>>> './check btrfs/14[6-9] btrfs/15[8-9]' has been resolved.
>>>>>>
>>>>>> However, reproducing the bug has been inconsistent on my systems.
>>>>>> If you could try checking that as well, it would be appreciated.
>>>>>
>>>>> Sure, but I'm lost as to what I should test.
>>>>> There are several patches, and multiple versions, in the mailing list.
>>>>>
>>>>> What should be tested on top of the for-next branch?
>>>>
>>>> v4 is the latest version of this patch, which is based on the mainline
>>>> master. As you reported that you were able to make btrfs/159 fail with
>>>> this patch at v2, v4 of this patch theoretically fixes the bug you
>>>> reported. So, I wanted to know if you are still able to reproduce
>>>> the bug with v4?
>>>>
>>>> Test case:
>>>> ./check btrfs/14[6-9] btrfs/15[8-9]
>>>>
>>>> Thanks.
>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>>>
>>>>>> David,
>>>>>>
>>>>>> If everything is good with v4, would you like v5 with the RFC
>>>>>> removed and "CC: stable@vger.kernel.org # 6.7+" added? Or if
>>>>>> it could be done during integration? I'm fine either way.
>>>>>>
>>>>>> Thanks,
>>>>>> Anand
>>>>>>
>>>>>> On 3/7/24 16:38, Anand Jain wrote:
>>>>>>> There are reports that since version 6.7 update-grub fails to find the
>>>>>>> device of the root on systems without initrd and on a single device.
>>>>>>>
>>>>>>> This looks like the device name changed in the output of
>>>>>>> /proc/self/mountinfo:
>>>>>>>
>>>>>>> 6.5-rc5 working
>>>>>>>
>>>>>>> 18 1 0:16 / / rw,noatime - btrfs /dev/sda8 ...
>>>>>>>
>>>>>>> 6.7 not working:
>>>>>>>
>>>>>>> 17 1 0:15 / / rw,noatime - btrfs /dev/root ...
>>>>>>>
>>>>>>> and "update-grub" shows this error:
>>>>>>>
>>>>>>> /usr/sbin/grub-probe: error: cannot find a device for / (is /dev mounted?)
>>>>>>>
>>>>>>> This looks like it's related to the device name, but grub-probe
>>>>>>> recognizes the "/dev/root" path and tries to find the underlying device.
>>>>>>> However there's a special case for some filesystems, for btrfs in
>>>>>>> particular.
>>>>>>>
>>>>>>> The generic root device detection heuristic is not done and it all
>>>>>>> relies on reading the device infos by a btrfs specific ioctl. This ioctl
>>>>>>> returns the device name as it was saved at the time of device scan (in
>>>>>>> this case it's /dev/root).
>>>>>>>
>>>>>>> The change in 6.7 for temp_fsid to allow several single device
>>>>>>> filesystem to exist with the same fsid (and transparently generate a new
>>>>>>> UUID at mount time) was to skip caching/registering such devices.
>>>>>>>
>>>>>>> This also skipped mounted device. One step of scanning is to check if
>>>>>>> the device name hasn't changed, and if yes then update the cached value.
>>>>>>>
>>>>>>> This broke the grub-probe as it always read the device /dev/root and
>>>>>>> couldn't find it in the system. A temporary workaround is to create a
>>>>>>> symlink but this does not survive reboot.
>>>>>>>
>>>>>>> The right fix is to allow updating the device path of a mounted
>>>>>>> filesystem even if this is a single device one.
>>>>>>>
>>>>>>> In the fix, check if the device's major:minor number matches with the
>>>>>>> cached device. If they do, then we can allow the scan to happen so that
>>>>>>> device_list_add() can take care of updating the device path. The file
>>>>>>> descriptor remains unchanged.
>>>>>>>
>>>>>>> This does not affect the temp_fsid feature, the UUID of the mounted
>>>>>>> filesystem remains the same and the matching is based on device major:minor
>>>>>>> which is unique per mounted filesystem.
>>>>>>>
>>>>>>> This covers the path when the device (that exists for all mounted
>>>>>>> devices) name changes, updating /dev/root to /dev/sdx. Any other single
>>>>>>> device with filesystem and is not mounted is still skipped.
>>>>>>>
>>>>>>> Note that if a system is booted and initial mount is done on the
>>>>>>> /dev/root device, this will be the cached name of the device. Only after
>>>>>>> the command "btrfs device scan" it will change as it triggers the
>>>>>>> rename.
>>>>>>>
>>>>>>> The fix was verified by users whose systems were affected.
>>>>>>>
>>>>>>> Fixes: bc27d6f0aa0e ("btrfs: scan but don't register device on single device filesystem")
>>>>>>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=218353
>>>>>>> Link: https://lore.kernel.org/lkml/CAKLYgeJ1tUuqLcsquwuFqjDXPSJpEiokrWK2gisPKDZLs8Y2TQ@mail.gmail.com/
>>>>>>> Tested-by: Alex Romosan <aromosan@gmail.com>
>>>>>>> Tested-by: CHECK_1234543212345@protonmail.com
>>>>>>> Reviewed-by: David Sterba <dsterba@suse.com>
>>>>>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>>>>>> ---
>>>>>>> v4:
>>>>>>> I removed CC: stable@vger.kernel.org # 6.7+ as this is still in the RFC stage.
>>>>>>> I need this patch verified by the bug filer.
>>>>>>> Use devt from bdev->bd_dev
>>>>>>> Rebased on mainline kernel.org master branch
>>>>>>>
>>>>>>> v3:
>>>>>>> https://lore.kernel.org/linux-btrfs/e2add8d54fbbd813305ba014c11d21d297ad87d0.1709782041.git.anand.jain@oracle.com/T/#u
>>>>>>>
>>>>>>> fs/btrfs/volumes.c | 57 ++++++++++++++++++++++++++++++++++++++--------
>>>>>>> 1 file changed, 47 insertions(+), 10 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>>>>>> index d67785be2c77..75bfef1b973b 100644
>>>>>>> --- a/fs/btrfs/volumes.c
>>>>>>> +++ b/fs/btrfs/volumes.c
>>>>>>> @@ -1301,6 +1301,47 @@ int btrfs_forget_devices(dev_t devt)
>>>>>>> return ret;
>>>>>>> }
>>>>>>>
>>>>>>> +static bool btrfs_skip_registration(struct btrfs_super_block *disk_super,
>>>>>>> + const char *path, dev_t devt,
>>>>>>> + bool mount_arg_dev)
>>>>>>> +{
>>>>>>> + struct btrfs_fs_devices *fs_devices;
>>>>>>> +
>>>>>>> + /*
>>>>>>> + * Do not skip device registration for mounted devices with matching
>>>>>>> + * maj:min but different paths. Booting without initrd relies on
>>>>>>> + * /dev/root initially, later replaced with the actual root device.
>>>>>>> + * A successful scan ensures update-grub selects the correct device.
>>>>>>> + */
>>>>>>> + list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
>>>>>>> + struct btrfs_device *device;
>>>>>>> +
>>>>>>> + mutex_lock(&fs_devices->device_list_mutex);
>>>>>>> +
>>>>>>> + if (!fs_devices->opened) {
>>>>>>> + mutex_unlock(&fs_devices->device_list_mutex);
>>>>>>> + continue;
>>>>>>> + }
>>>>>>> +
>>>>>>> + list_for_each_entry(device, &fs_devices->devices, dev_list) {
>>>>>>> + if ((device->devt == devt) &&
>>>>>>> + strcmp(device->name->str, path)) {
>>>>>>> + mutex_unlock(&fs_devices->device_list_mutex);
>>>>>>> +
>>>>>>> + /* Do not skip registration */
>>>>>>> + return false;
>>>>>>> + }
>>>>>>> + }
>>>>>>> + mutex_unlock(&fs_devices->device_list_mutex);
>>>>>>> + }
>>>>>>> +
>>>>>>> + if (!mount_arg_dev && btrfs_super_num_devices(disk_super) == 1 &&
>>>>>>> + !(btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING))
>>>>>>> + return true;
>>>>>>> +
>>>>>>> + return false;
>>>>>>> +}
>>>>>>> +
>>>>>>> /*
>>>>>>> * Look for a btrfs signature on a device. This may be called out of the mount path
>>>>>>> * and we are not allowed to call set_blocksize during the scan. The superblock
>>>>>>> @@ -1357,18 +1398,14 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
>>>>>>> goto error_bdev_put;
>>>>>>> }
>>>>>>>
>>>>>>> - if (!mount_arg_dev && btrfs_super_num_devices(disk_super) == 1 &&
>>>>>>> - !(btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)) {
>>>>>>> - dev_t devt;
>>>>>>> + if (btrfs_skip_registration(disk_super, path, bdev_handle->bdev->bd_dev,
>>>>>>> + mount_arg_dev)) {
>>>>>>> + pr_debug("BTRFS: skip registering single non-seed device %s (%d:%d)\n",
>>>>>>> + path, MAJOR(bdev_handle->bdev->bd_dev),
>>>>>>> + MINOR(bdev_handle->bdev->bd_dev));
>>>>>>>
>>>>>>> - ret = lookup_bdev(path, &devt);
>>>>>>> - if (ret)
>>>>>>> - btrfs_warn(NULL, "lookup bdev failed for path %s: %d",
>>>>>>> - path, ret);
>>>>>>> - else
>>>>>>> - btrfs_free_stale_devices(devt, NULL);
>>>>>>> + btrfs_free_stale_devices(bdev_handle->bdev->bd_dev, NULL);
>>>>>>>
>>>>>>> - pr_debug("BTRFS: skip registering single non-seed device %s\n", path);
>>>>>>> device = NULL;
>>>>>>> goto free_disk_super;
>>>>>>> }
>>>>>>
next prev parent reply other threads:[~2024-03-08 15:49 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-03-07 11:08 [PATCH v4 RFC] btrfs: do not skip re-registration for the mounted device Anand Jain
2024-03-07 12:12 ` Alex Romosan
2024-03-08 2:28 ` Anand Jain
2024-03-08 12:15 ` Filipe Manana
2024-03-08 13:52 ` Anand Jain
2024-03-08 14:32 ` Alex Romosan
2024-03-08 14:37 ` Anand Jain
2024-03-08 15:37 ` Alex Romosan
2024-03-08 15:48 ` Anand Jain [this message]
2024-03-08 15:54 ` Alex Romosan
2024-03-10 17:11 ` Filipe Manana
2024-03-11 2:00 ` Anand Jain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9a8f7496-b5ad-45f5-ba0b-5690a8a39fa6@oracle.com \
--to=anand.jain@oracle.com \
--cc=CHECK_1234543212345@protonmail.com \
--cc=aromosan@gmail.com \
--cc=dsterba@suse.com \
--cc=fdmanana@kernel.org \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox