From: Anand Jain <anand.jain@oracle.com>
To: Alex Romosan <aromosan@gmail.com>
Cc: bernd.feige@gmx.net, dsterba@suse.cz, dsterba@suse.com,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: do not skip re-registration for the mounted device
Date: Thu, 8 Feb 2024 07:53:22 +0530 [thread overview]
Message-ID: <55c879b6-5e6b-4602-b558-e52540b67bda@oracle.com> (raw)
In-Reply-To: <CAKLYge+9ngrW-1EffUhyU1y13MzgP33osNDi3D6y6UVW6poJZA@mail.gmail.com>
Alex,
Please provide the kernel boot messages with debugging enabled but
no fixes applied. Kindly collect these messages after reproducing
the problem.
We've found issues with the previous fix. Please test this
new patch, as it may address the bug. Keep debugging enabled
during testing.
Thanks ,Anand
On 2/7/24 23:48, Alex Romosan wrote:
> Which version of the patch are we talking about? Let me know and I’ll
> try it with debugging on. I tried David’s patch and it seemed to work (I
> just booted into that kernel and ran update-grub) but I’ll try something
> else…
>
> On Wed, Feb 7, 2024 at 19:08 Anand Jain <anand.jain@oracle.com
> <mailto:anand.jain@oracle.com>> wrote:
>
>
>
> On 2/7/24 08:08, Anand Jain wrote:
> >
> >
> >
> > On 2/5/24 18:27, David Sterba wrote:
> >> On Mon, Feb 05, 2024 at 07:45:05PM +0800, Anand Jain wrote:
> >>> We skip device registration for a single device. However, we do
> not do
> >>> that if the device is already mounted, as it might be coming in
> again
> >>> for scanning a different path.
> >>>
> >>> This patch is lightly tested; for verification if it fixes.
> >>>
> >>> Signed-off-by: Anand Jain <anand.jain@oracle.com
> <mailto:anand.jain@oracle.com>>
> >>> ---
> >>> I still have some unknowns about the problem. Pls test if this
> fixes
> >>> the problem.
>
> Successfully tested with fstests (-g volume) and temp-fsid test cases.
>
> Can someone verify if this patch fixes the problem? Also, when problem
> occurs please provide kernel messages with Btrfs debugging support
> option compiled in.
>
> Thanks, Anand
>
>
> >>>
> >>> fs/btrfs/volumes.c | 44
> ++++++++++++++++++++++++++++++++++----------
> >>> fs/btrfs/volumes.h | 1 -
> >>> 2 files changed, 34 insertions(+), 11 deletions(-)
> >>>
> >>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> >>> index 474ab7ed65ea..192c540a650c 100644
> >>> --- a/fs/btrfs/volumes.c
> >>> +++ b/fs/btrfs/volumes.c
> >>> @@ -1299,6 +1299,31 @@ int btrfs_forget_devices(dev_t devt)
> >>> return ret;
> >>> }
> >>> +static bool btrfs_skip_registration(struct btrfs_super_block
> >>> *disk_super,
> >>> + dev_t devt, bool mount_arg_dev)
> >>> +{
> >>> + struct btrfs_fs_devices *fs_devices;
> >>> +
> >>> + list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
> >>> + struct btrfs_device *device;
> >>> +
> >>> + mutex_lock(&fs_devices->device_list_mutex);
> >>> + list_for_each_entry(device, &fs_devices->devices,
> dev_list) {
> >>> + if (device->devt == devt) {
> >>> + mutex_unlock(&fs_devices->device_list_mutex);
> >>> + return false;
> >>> + }
> >>> + }
> >>> + mutex_unlock(&fs_devices->device_list_mutex);
> >>
> >> This is locking and unlocking again before going to
> device_list_add, so
> >> if something changes regarding the registered device then it's
> not up to
> >> date.
> >>
> >
>
> We are in the uuid_mutex, a potentially racing thread will have to
> acquire this mutex to delete from the list. So there can't a race.
>
>
>
> > Right. A race might happen, but it is not an issue. At worst, there
> > will be a stale device in the cache, which gets removed or re-used
> > in the next mkfs or mount of the same device.
> >
> > However, this is a rough cut that we need to fix. I am reviewing
> > your approach as well. I'm fine with any fix.
> >
> >
> >>
> >>> + }
> >>> +
> >>> + if (!mount_arg_dev && btrfs_super_num_devices(disk_super)
> == 1 &&
> >>> + !(btrfs_super_flags(disk_super) &
> BTRFS_SUPER_FLAG_SEEDING))
> >>> + return true;
> >>
> >> The way I implemented it is to check the above conditions as a
> >> prerequisite but leave the heavy work for device_list_add that
> does all
> >> the uuid and device list locking and we are quite sure it
> survives all
> >> the races between scanning and mounts.
> >>
> >
> > Hm. But isn't that the bug we need to fix? That we skipped the device
> > scan thread that wanted to replace the device path from /dev/root to
> > /dev/sdx?
> >
> > And we skipped, because it was not a mount thread
> > (%mount_arg_dev=false), and the device is already mounted and the
> > devt will match?
> >
> > So my fix also checked if devt is a match, then allow it to scan
> > (so that the device path can be updated, such as /dev/root to
> /dev/sdx).
> >
> > To confirm the bug, I asked for the debug kernel messages, I don't
> > this we got it. Also, the existing kernel log shows no such issue.
> >
> >
> >>> +
> >>> + return false;
> >>> +}
> >>> +
> >>> /*
> >>> * Look for a btrfs signature on a device. This may be called
> out
> >>> of the mount path
> >>> * and we are not allowed to call set_blocksize during the scan.
> >>> The superblock
> >>> @@ -1316,6 +1341,7 @@ struct btrfs_device
> >>> *btrfs_scan_one_device(const char *path, blk_mode_t flags,
> >>> struct btrfs_device *device = NULL;
> >>> struct bdev_handle *bdev_handle;
> >>> u64 bytenr, bytenr_orig;
> >>> + dev_t devt = 0;
> >>> int ret;
> >>> lockdep_assert_held(&uuid_mutex);
> >>> @@ -1355,18 +1381,16 @@ struct btrfs_device
> >>> *btrfs_scan_one_device(const char *path, blk_mode_t flags,
> >>> goto error_bdev_put;
> >>> }
> >>> - if (!mount_arg_dev && btrfs_super_num_devices(disk_super)
> == 1 &&
> >>> - !(btrfs_super_flags(disk_super) &
> BTRFS_SUPER_FLAG_SEEDING)) {
> >>> - dev_t devt;
> >>> + ret = lookup_bdev(path, &devt);
> >>> + if (ret)
> >>> + btrfs_warn(NULL, "lookup bdev failed for path %s: %d",
> >>> + path, ret);
> >>> - ret = lookup_bdev(path, &devt);
> >>
> >> Do we actually need this check? It was added with the patch
> skipping the
> >> registration, so it's validating the block device but how can we
> pass
> >> something that is not a valid block device?
> >>
> >
> > Do you mean to check if the lookup_bdev() is successful? Hm. It
> should
> > be okay not to check, but we do that consistently in other places.
> >
> >> Besides there's a call to bdev_open_by_path() that in turn does the
> >> lookup_bdev so checking it here is redundant. It's not related
> to the
> >> fix itself but I deleted it in my fix.
> >>
> >
> > Oh no. We need %devt to be set because:
> >
> > It will match if that device is already mounted/scanned.
> > It will also free stale entries.
> >
> > Thx, Anand
> >
> >>> - if (ret)
> >>> - btrfs_warn(NULL, "lookup bdev failed for path %s: %d",
> >>> - path, ret);
> >>> - else
> >>> + if (btrfs_skip_registration(disk_super, devt,
> mount_arg_dev)) {
> >>> + pr_debug("BTRFS: skip registering single non-seed
> device %s\n",
> >>> + path);
> >>> + if (devt)
> >>> btrfs_free_stale_devices(devt, NULL);
> >>> -
> >>> - pr_debug("BTRFS: skip registering single non-seed device
> >>> %s\n", path);
> >>> device = NULL;
> >>> goto free_disk_super;
> >>> }
>
next prev parent reply other threads:[~2024-02-08 2:23 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-05 11:45 [PATCH] btrfs: do not skip re-registration for the mounted device Anand Jain
2024-02-05 12:57 ` David Sterba
2024-02-07 2:38 ` Anand Jain
2024-02-07 18:08 ` Anand Jain
[not found] ` <CAKLYge+9ngrW-1EffUhyU1y13MzgP33osNDi3D6y6UVW6poJZA@mail.gmail.com>
2024-02-08 2:23 ` Anand Jain [this message]
2024-02-08 12:35 ` Alex Romosan
2024-02-08 17:22 ` Anand Jain
2024-02-08 19:45 ` Alex Romosan
2024-02-08 20:04 ` Alex Romosan
2024-02-09 8:11 ` Anand Jain
2024-02-09 15:27 ` Dr. Bernd Feige
2024-02-12 14:59 ` David Sterba
2024-02-13 0:35 ` Anand Jain
2024-02-13 19:38 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=55c879b6-5e6b-4602-b558-e52540b67bda@oracle.com \
--to=anand.jain@oracle.com \
--cc=aromosan@gmail.com \
--cc=bernd.feige@gmx.net \
--cc=dsterba@suse.com \
--cc=dsterba@suse.cz \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox