public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Anand Jain <anand.jain@oracle.com>
To: dsterba@suse.cz
Cc: linux-btrfs@vger.kernel.org, dsterba@suse.com,
	aromosan@gmail.com, bernd.feige@gmx.net
Subject: Re: [PATCH] btrfs: do not skip re-registration for the mounted device
Date: Wed, 7 Feb 2024 08:08:06 +0530	[thread overview]
Message-ID: <e718b759-e597-440f-9fd0-351686bd6b5e@oracle.com> (raw)
In-Reply-To: <20240205125704.GD355@twin.jikos.cz>




On 2/5/24 18:27, David Sterba wrote:
> On Mon, Feb 05, 2024 at 07:45:05PM +0800, Anand Jain wrote:
>> We skip device registration for a single device. However, we do not do
>> that if the device is already mounted, as it might be coming in again
>> for scanning a different path.
>>
>> This patch is lightly tested; for verification if it fixes.
>>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>> I still have some unknowns about the problem. Pls test if this fixes
>> the problem.
>>
>>   fs/btrfs/volumes.c | 44 ++++++++++++++++++++++++++++++++++----------
>>   fs/btrfs/volumes.h |  1 -
>>   2 files changed, 34 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 474ab7ed65ea..192c540a650c 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -1299,6 +1299,31 @@ int btrfs_forget_devices(dev_t devt)
>>   	return ret;
>>   }
>>   
>> +static bool btrfs_skip_registration(struct btrfs_super_block *disk_super,
>> +				    dev_t devt, bool mount_arg_dev)
>> +{
>> +	struct btrfs_fs_devices *fs_devices;
>> +
>> +	list_for_each_entry(fs_devices, &fs_uuids, fs_list) {
>> +		struct btrfs_device *device;
>> +
>> +		mutex_lock(&fs_devices->device_list_mutex);
>> +		list_for_each_entry(device, &fs_devices->devices, dev_list) {
>> +			if (device->devt == devt) {
>> +				mutex_unlock(&fs_devices->device_list_mutex);
>> +				return false;
>> +			}
>> +		}
>> +		mutex_unlock(&fs_devices->device_list_mutex);
> 
> This is locking and unlocking again before going to device_list_add, so
> if something changes regarding the registered device then it's not up to
> date.
> 

Right. A race might happen, but it is not an issue. At worst, there
will be a stale device in the cache, which gets removed or re-used
in the next mkfs or mount of the same device.

However, this is a rough cut that we need to fix. I am reviewing
your approach as well. I'm fine with any fix.


> 
>> +	}
>> +
>> +	if (!mount_arg_dev && btrfs_super_num_devices(disk_super) == 1 &&
>> +	    !(btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING))
>> +		return true;
> 
> The way I implemented it is to check the above conditions as a
> prerequisite but leave the heavy work for device_list_add that does all
> the uuid and device list locking and we are quite sure it survives all
> the races between scanning and mounts.
> 

Hm. But isn't that the bug we need to fix? That we skipped the device
scan thread that wanted to replace the device path from /dev/root to
/dev/sdx?

And we skipped, because it was not a mount thread
(%mount_arg_dev=false), and the device is already mounted and the
devt will match?

So my fix also checked if devt is a match, then allow it to scan
(so that the device path can be updated, such as /dev/root to /dev/sdx).

To confirm the bug, I asked for the debug kernel messages, I don't
this we got it. Also, the existing kernel log shows no such issue.


>> +
>> +	return false;
>> +}
>> +
>>   /*
>>    * Look for a btrfs signature on a device. This may be called out of the mount path
>>    * and we are not allowed to call set_blocksize during the scan. The superblock
>> @@ -1316,6 +1341,7 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
>>   	struct btrfs_device *device = NULL;
>>   	struct bdev_handle *bdev_handle;
>>   	u64 bytenr, bytenr_orig;
>> +	dev_t devt = 0;
>>   	int ret;
>>   
>>   	lockdep_assert_held(&uuid_mutex);
>> @@ -1355,18 +1381,16 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, blk_mode_t flags,
>>   		goto error_bdev_put;
>>   	}
>>   
>> -	if (!mount_arg_dev && btrfs_super_num_devices(disk_super) == 1 &&
>> -	    !(btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING)) {
>> -		dev_t devt;
>> +	ret = lookup_bdev(path, &devt);
>> +	if (ret)
>> +		btrfs_warn(NULL, "lookup bdev failed for path %s: %d",
>> +			   path, ret);
>>   
>> -		ret = lookup_bdev(path, &devt);
> 
> Do we actually need this check? It was added with the patch skipping the
> registration, so it's validating the block device but how can we pass
> something that is not a valid block device?
> 

Do you mean to check if the lookup_bdev() is successful? Hm. It should
be okay not to check, but we do that consistently in other places.

> Besides there's a call to bdev_open_by_path() that in turn does the
> lookup_bdev so checking it here is redundant. It's not related to the
> fix itself but I deleted it in my fix.
> 

Oh no. We need %devt to be set because:

It will match if that device is already mounted/scanned.
It will also free stale entries.

Thx, Anand

>> -		if (ret)
>> -			btrfs_warn(NULL, "lookup bdev failed for path %s: %d",
>> -				   path, ret);
>> -		else
>> +	if (btrfs_skip_registration(disk_super, devt, mount_arg_dev)) {
>> +		pr_debug("BTRFS: skip registering single non-seed device %s\n",
>> +			  path);
>> +		if (devt)
>>   			btrfs_free_stale_devices(devt, NULL);
>> -
>> -		pr_debug("BTRFS: skip registering single non-seed device %s\n", path);
>>   		device = NULL;
>>   		goto free_disk_super;
>>   	}

  reply	other threads:[~2024-02-07  2:38 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-05 11:45 [PATCH] btrfs: do not skip re-registration for the mounted device Anand Jain
2024-02-05 12:57 ` David Sterba
2024-02-07  2:38   ` Anand Jain [this message]
2024-02-07 18:08     ` Anand Jain
     [not found]       ` <CAKLYge+9ngrW-1EffUhyU1y13MzgP33osNDi3D6y6UVW6poJZA@mail.gmail.com>
2024-02-08  2:23         ` Anand Jain
2024-02-08 12:35           ` Alex Romosan
2024-02-08 17:22             ` Anand Jain
2024-02-08 19:45               ` Alex Romosan
2024-02-08 20:04               ` Alex Romosan
2024-02-09  8:11                 ` Anand Jain
2024-02-09 15:27                   ` Dr. Bernd Feige
2024-02-12 14:59 ` David Sterba
2024-02-13  0:35   ` Anand Jain
2024-02-13 19:38     ` David Sterba

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e718b759-e597-440f-9fd0-351686bd6b5e@oracle.com \
    --to=anand.jain@oracle.com \
    --cc=aromosan@gmail.com \
    --cc=bernd.feige@gmx.net \
    --cc=dsterba@suse.com \
    --cc=dsterba@suse.cz \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox