From: Anand Jain <anand.jain@oracle.com>
To: Alex Romosan <aromosan@gmail.com>
Cc: bernd.feige@gmx.net, dsterba@suse.cz, dsterba@suse.com,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: do not skip re-registration for the mounted device
Date: Thu, 8 Feb 2024 22:52:00 +0530 [thread overview]
Message-ID: <bb7f33ba-5c8f-4b07-8d79-d0d191ce1fcf@oracle.com> (raw)
In-Reply-To: <CAKLYgeJCqu_9aCO+s74rcFh5R6EdLeNwe43MhRmjQ=soFX-rcQ@mail.gmail.com>
Thanks for the kernel messages with debug enabled.
I don't see the message to skip scannaing for
the mounted device. So it's not what we thought
was the issue.
pr_debug("BTRFS: skip registering single non-seed device %s\n", path);
Based on the assumption above, I have a fix below,
but I doubt its effectiveness.
https://patchwork.kernel.org/project/linux-btrfs/patch/8dd1990114aabb775d4631969f1beabeadaac5b7.1707132247.git.anand.jain@oracle.com/
-Anand
On 2/8/24 18:05, Alex Romosan wrote:
> i'm attaching my boot log with 6.8.0-rc3 no fixes and btrfs debug
> enabled (i assume this is what you wanted). update-grub doesn't work.
> there was no patch in your last message. do you want me to try the
> patch you sent on monday? confused
>
> On Thu, Feb 8, 2024 at 3:23 AM Anand Jain <anand.jain@oracle.com> wrote:
>>
>>
>> Alex,
>>
>> Please provide the kernel boot messages with debugging enabled but
>> no fixes applied. Kindly collect these messages after reproducing
>> the problem.
>>
>> We've found issues with the previous fix. Please test this
>> new patch, as it may address the bug. Keep debugging enabled
>> during testing.
>>
>>
>> Thanks ,Anand
>>
>>
>> On 2/7/24 23:48, Alex Romosan wrote:
>>> Which version of the patch are we talking about? Let me know and I’ll
>>> try it with debugging on. I tried David’s patch and it seemed to work (I
>>> just booted into that kernel and ran update-grub) but I’ll try something
>>> else…
>>>
>>> On Wed, Feb 7, 2024 at 19:08 Anand Jain <anand.jain@oracle.com
>>> <mailto:anand.jain@oracle.com>> wrote:
>>>
>>>
>>>
>>> On 2/7/24 08:08, Anand Jain wrote:
>>> >
>>> >
>>> >
>>> > On 2/5/24 18:27, David Sterba wrote:
>>> >> On Mon, Feb 05, 2024 at 07:45:05PM +0800, Anand Jain wrote:
>>> >>> We skip device registration for a single device. However, we do
>>> not do
>>> >>> that if the device is already mounted, as it might be coming in
>>> again
>>> >>> for scanning a different path.
>>> >>>
>>> >>> This patch is lightly tested; for verification if it fixes.
>>> >>>
>>> >>> Signed-off-by: Anand Jain <anand.jain@oracle.com
>>> <mailto:anand.jain@oracle.com>>
>>> >>> ---
>>> >>> I still have some unknowns about the problem. Pls test if this
>>> fixes
>>> >>> the problem.
>>>
>>> Successfully tested with fstests (-g volume) and temp-fsid test cases.
>>>
>>> Can someone verify if this patch fixes the problem? Also, when problem
>>> occurs please provide kernel messages with Btrfs debugging support
>>> option compiled in.
>>>
>>> Thanks, Anand
>>>
>>>
>>> >>>
>>> >>> fs/btrfs/volumes.c | 44
>>> ++++++++++++++++++++++++++++++++++----------
>>> >>> fs/btrfs/volumes.h | 1 -
>>> >>> 2 files changed, 34 insertions(+), 11 deletions(-)
>>> >>>
>>> >>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> >>> index 474ab7ed65ea..192c540a650c 100644
>>> >>> --- a/fs/btrfs/volumes.c
>>> >>> +++ b/fs/btrfs/volumes.c
>>> >>> @@ -1299,6 +1299,31 @@ int btrfs_forget_devices(dev_t devt)
>>> >>> return ret;
>>> >>> }
>>> >>> +static bool btrfs_skip_registration(struct btrfs_super_block
>>> >>> *disk_super,
>>> >>> + dev_t devt, bool mount_arg_dev)
>>> >>> +{
>>> >>> + struct btrfs_fs_devices *fs_devices;
>>> >>> +
>>> >>> + list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
>>> >>> + struct btrfs_device *device;
>>> >>> +
>>> >>> + mutex_lock(&fs_devices->device_list_mutex);
>>> >>> + list_for_each_entry(device, &fs_devices->devices,
>>> dev_list) {
>>> >>> + if (device->devt == devt) {
>>> >>> + mutex_unlock(&fs_devices->device_list_mutex);
>>> >>> + return false;
>>> >>> + }
>>> >>> + }
>>> >>> + mutex_unlock(&fs_devices->device_list_mutex);
>>> >>
>>> >> This is locking and unlocking again before going to
>>> device_list_add, so
>>> >> if something changes regarding the registered device then it's
>>> not up to
>>> >> date.
>>> >>
>>> >
>>>
>>> We are in the uuid_mutex, a potentially racing thread will have to
>>> acquire this mutex to delete from the list. So there can't a race.
>>>
>>>
>>>
>>> > Right. A race might happen, but it is not an issue. At worst, there
>>> > will be a stale device in the cache, which gets removed or re-used
>>> > in the next mkfs or mount of the same device.
>>> >
>>> > However, this is a rough cut that we need to fix. I am reviewing
>>> > your approach as well. I'm fine with any fix.
>>> >
>>> >
>>> >>
>>> >>> + }
>>> >>> +
>>> >>> + if (!mount_arg_dev && btrfs_super_num_devices(disk_super)
>>> == 1 &&
>>> >>> + !(btrfs_super_flags(disk_super) &
>>> BTRFS_SUPER_FLAG_SEEDING))
>>> >>> + return true;
>>> >>
>>> >> The way I implemented it is to check the above conditions as a
>>> >> prerequisite but leave the heavy work for device_list_add that
>>> does all
>>> >> the uuid and device list locking and we are quite sure it
>>> survives all
>>> >> the races between scanning and mounts.
>>> >>
>>> >
>>> > Hm. But isn't that the bug we need to fix? That we skipped the device
>>> > scan thread that wanted to replace the device path from /dev/root to
>>> > /dev/sdx?
>>> >
>>> > And we skipped, because it was not a mount thread
>>> > (%mount_arg_dev=false), and the device is already mounted and the
>>> > devt will match?
>>> >
>>> > So my fix also checked if devt is a match, then allow it to scan
>>> > (so that the device path can be updated, such as /dev/root to
>>> /dev/sdx).
>>> >
>>> > To confirm the bug, I asked for the debug kernel messages, I don't
>>> > this we got it. Also, the existing kernel log shows no such issue.
>>> >
>>> >
>>> >>> +
>>> >>> + return false;
>>> >>> +}
>>> >>> +
>>> >>> /*
>>> >>> * Look for a btrfs signature on a device. This may be called
>>> out
>>> >>> of the mount path
>>> >>> * and we are not allowed to call set_blocksize during the scan.
>>> >>> The superblock
>>> >>> @@ -1316,6 +1341,7 @@ struct btrfs_device
>>> >>> *btrfs_scan_one_device(const char *path, blk_mode_t flags,
>>> >>> struct btrfs_device *device = NULL;
>>> >>> struct bdev_handle *bdev_handle;
>>> >>> u64 bytenr, bytenr_orig;
>>> >>> + dev_t devt = 0;
>>> >>> int ret;
>>> >>> lockdep_assert_held(&uuid_mutex);
>>> >>> @@ -1355,18 +1381,16 @@ struct btrfs_device
>>> >>> *btrfs_scan_one_device(const char *path, blk_mode_t flags,
>>> >>> goto error_bdev_put;
>>> >>> }
>>> >>> - if (!mount_arg_dev && btrfs_super_num_devices(disk_super)
>>> == 1 &&
>>> >>> - !(btrfs_super_flags(disk_super) &
>>> BTRFS_SUPER_FLAG_SEEDING)) {
>>> >>> - dev_t devt;
>>> >>> + ret = lookup_bdev(path, &devt);
>>> >>> + if (ret)
>>> >>> + btrfs_warn(NULL, "lookup bdev failed for path %s: %d",
>>> >>> + path, ret);
>>> >>> - ret = lookup_bdev(path, &devt);
>>> >>
>>> >> Do we actually need this check? It was added with the patch
>>> skipping the
>>> >> registration, so it's validating the block device but how can we
>>> pass
>>> >> something that is not a valid block device?
>>> >>
>>> >
>>> > Do you mean to check if the lookup_bdev() is successful? Hm. It
>>> should
>>> > be okay not to check, but we do that consistently in other places.
>>> >
>>> >> Besides there's a call to bdev_open_by_path() that in turn does the
>>> >> lookup_bdev so checking it here is redundant. It's not related
>>> to the
>>> >> fix itself but I deleted it in my fix.
>>> >>
>>> >
>>> > Oh no. We need %devt to be set because:
>>> >
>>> > It will match if that device is already mounted/scanned.
>>> > It will also free stale entries.
>>> >
>>> > Thx, Anand
>>> >
>>> >>> - if (ret)
>>> >>> - btrfs_warn(NULL, "lookup bdev failed for path %s: %d",
>>> >>> - path, ret);
>>> >>> - else
>>> >>> + if (btrfs_skip_registration(disk_super, devt,
>>> mount_arg_dev)) {
>>> >>> + pr_debug("BTRFS: skip registering single non-seed
>>> device %s\n",
>>> >>> + path);
>>> >>> + if (devt)
>>> >>> btrfs_free_stale_devices(devt, NULL);
>>> >>> -
>>> >>> - pr_debug("BTRFS: skip registering single non-seed device
>>> >>> %s\n", path);
>>> >>> device = NULL;
>>> >>> goto free_disk_super;
>>> >>> }
>>>
next prev parent reply other threads:[~2024-02-08 17:22 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-05 11:45 [PATCH] btrfs: do not skip re-registration for the mounted device Anand Jain
2024-02-05 12:57 ` David Sterba
2024-02-07 2:38 ` Anand Jain
2024-02-07 18:08 ` Anand Jain
[not found] ` <CAKLYge+9ngrW-1EffUhyU1y13MzgP33osNDi3D6y6UVW6poJZA@mail.gmail.com>
2024-02-08 2:23 ` Anand Jain
2024-02-08 12:35 ` Alex Romosan
2024-02-08 17:22 ` Anand Jain [this message]
2024-02-08 19:45 ` Alex Romosan
2024-02-08 20:04 ` Alex Romosan
2024-02-09 8:11 ` Anand Jain
2024-02-09 15:27 ` Dr. Bernd Feige
2024-02-12 14:59 ` David Sterba
2024-02-13 0:35 ` Anand Jain
2024-02-13 19:38 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bb7f33ba-5c8f-4b07-8d79-d0d191ce1fcf@oracle.com \
--to=anand.jain@oracle.com \
--cc=aromosan@gmail.com \
--cc=bernd.feige@gmx.net \
--cc=dsterba@suse.com \
--cc=dsterba@suse.cz \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox