public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <wqu@suse.com>
To: Johannes Thumshirn <jth@kernel.org>, linux-btrfs@vger.kernel.org
Cc: Boris Burkov <boris@bur.io>, Christoph Hellwig <hch@lst.de>,
	David Sterba <dsterba@suse.com>,
	Josef Bacik <josef@toxicpanda.com>,
	Johannes Thumshirn <johannes.thumshirn@wdc.com>
Subject: Re: [PATCH v2 3/5] btrfs: split btrfs_fs_devices.opened
Date: Tue, 17 Jun 2025 08:07:02 +0930	[thread overview]
Message-ID: <c34b2c63-66f9-422f-bba4-997bc530c673@suse.com> (raw)
In-Reply-To: <f1088e76-896c-4b83-8477-b7bdad7e8efd@suse.com>



在 2025/6/16 19:56, Qu Wenruo 写道:
> 
> 
> 在 2025/6/11 19:33, Johannes Thumshirn 写道:
>> From: Christoph Hellwig <hch@lst.de>
>>
>> The btrfs_fs_devices.opened member mixes an in use counter for the
>> fs_devices structure that prevents it from being garbage collected with
>> a flag if the underlying devices were actually opened.  This not only
>> makes the code hard to follow, but also prevents btrfs from switching
>> to opening the block device only after super block creation.  Split it
>> into an in_use counter and an is_open boolean flag instead.
> 
> I'm hitting problems with this patch applied (previous patch 1 and 2 
> also applied and tested), that it causes problem setting up other dm 
> devices.
> Thus failing future tests that requires SCRATCH_DEV for dm setup.
> 
> Furthermore at btrfs unload time, the warnings at free_fs_devices() got 
> triggered.
> 
> But without this one (only patch 1 & 2), I have not hit anything like 
> that at all.
> 
> So this means, even without extra in_use without opening cases, this is 
> already causing problems.
> 
> 
> Furthermore the in-use-but-not-opened state for the next patch (delayed 
> devices opening), may not be that important anymore.
> 
> With extra comments added into btrfs_get_tree_super(), we can explain 
> the cleanup properly.
> And even if there are races between devices open (protected by 
> uuid_mutex) and sget_fc(), and different threads win the two races, it 
> doesn't really matter.
> 
> As long as the final btrfs_fill_super() get all its devices opened, and 
> the opened ref count is working, there is nothing to worry.
> 
> So I'm afraid both patch 3 and 4 may be dropped.

But we still need the delay of btrfs_open_devices() after sget_fc() to 
use the super block as blk holder.

So my current version is to expand the critical section of uuid_mutex to 
cover sget_fc(), then call btrfs_open_devices(), and finally unlock 
uuid_mutex.

Still running the tests but so far so good.

Thanks,
Qu

> 
> Thanks,
> Qu
> 
>>
>> Signed-off-by: Christoph Hellwig <hch@lst.de>
>> Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
>> ---
>>   fs/btrfs/volumes.c | 53 ++++++++++++++++++++++++++--------------------
>>   fs/btrfs/volumes.h |  6 ++++--
>>   2 files changed, 34 insertions(+), 25 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 1ebfc69012a2..00b64f98e3bd 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -413,7 +413,8 @@ static void free_fs_devices(struct 
>> btrfs_fs_devices *fs_devices)
>>   {
>>       struct btrfs_device *device;
>> -    WARN_ON(fs_devices->opened);
>> +    WARN_ON_ONCE(fs_devices->in_use);
>> +    WARN_ON_ONCE(fs_devices->is_open);
>>       while (!list_empty(&fs_devices->devices)) {
>>           device = list_first_entry(&fs_devices->devices,
>>                         struct btrfs_device, dev_list);
>> @@ -541,7 +542,7 @@ static int btrfs_free_stale_devices(dev_t devt, 
>> struct btrfs_device *skip_device
>>                   continue;
>>               if (devt && devt != device->devt)
>>                   continue;
>> -            if (fs_devices->opened) {
>> +            if (fs_devices->in_use) {
>>                   if (devt)
>>                       ret = -EBUSY;
>>                   break;
>> @@ -613,7 +614,7 @@ static struct btrfs_fs_devices *find_fsid_by_device(
>>       if (found_by_devt) {
>>           /* Existing device. */
>>           if (fsid_fs_devices == NULL) {
>> -            if (devt_fs_devices->opened == 0) {
>> +            if (devt_fs_devices->in_use == 0) {
>>                   /* Stale device. */
>>                   return NULL;
>>               } else {
>> @@ -848,7 +849,7 @@ static noinline struct btrfs_device 
>> *device_list_add(const char *path,
>>       if (!device) {
>>           unsigned int nofs_flag;
>> -        if (fs_devices->opened) {
>> +        if (fs_devices->in_use) {
>>               btrfs_err(NULL,
>>   "device %s (%d:%d) belongs to fsid %pU, and the fs is already 
>> mounted, scanned by %s (%d)",
>>                     path, MAJOR(path_devt), MINOR(path_devt),
>> @@ -916,7 +917,7 @@ static noinline struct btrfs_device 
>> *device_list_add(const char *path,
>>            * tracking a problem where systems fail mount by subvolume id
>>            * when we reject replacement on a mounted FS.
>>            */
>> -        if (!fs_devices->opened && found_transid < device->generation) {
>> +        if (!fs_devices->in_use && found_transid < device->generation) {
>>               /*
>>                * That is if the FS is _not_ mounted and if you
>>                * are here, that means there is more than one
>> @@ -977,7 +978,7 @@ static noinline struct btrfs_device 
>> *device_list_add(const char *path,
>>        * it back. We need it to pick the disk with largest generation
>>        * (as above).
>>        */
>> -    if (!fs_devices->opened) {
>> +    if (!fs_devices->in_use) {
>>           device->generation = found_transid;
>>           fs_devices->latest_generation = max_t(u64, found_transid,
>>                           fs_devices->latest_generation);
>> @@ -1177,15 +1178,19 @@ static void close_fs_devices(struct 
>> btrfs_fs_devices *fs_devices)
>>       lockdep_assert_held(&uuid_mutex);
>> -    if (--fs_devices->opened > 0)
>> +    if (--fs_devices->in_use > 0)
>>           return;
>> +    if (!fs_devices->is_open)
>> +        goto done;
>> +
>>       list_for_each_entry_safe(device, tmp, &fs_devices->devices, 
>> dev_list)
>>           btrfs_close_one_device(device);
>>       WARN_ON(fs_devices->open_devices);
>>       WARN_ON(fs_devices->rw_devices);
>> -    fs_devices->opened = 0;
>> +    fs_devices->is_open = false;
>> +done:
>>       fs_devices->seeding = false;
>>       fs_devices->fs_info = NULL;
>>   }
>> @@ -1197,7 +1202,7 @@ void btrfs_close_devices(struct btrfs_fs_devices 
>> *fs_devices)
>>       mutex_lock(&uuid_mutex);
>>       close_fs_devices(fs_devices);
>> -    if (!fs_devices->opened) {
>> +    if (!fs_devices->in_use) {
>>           list_splice_init(&fs_devices->seed_list, &list);
>>           /*
>> @@ -1253,7 +1258,7 @@ static int open_fs_devices(struct 
>> btrfs_fs_devices *fs_devices,
>>           return -EINVAL;
>>       }
>> -    fs_devices->opened = 1;
>> +    fs_devices->is_open = true;
>>       fs_devices->latest_dev = latest_dev;
>>       fs_devices->total_rw_bytes = 0;
>>       fs_devices->chunk_alloc_policy = BTRFS_CHUNK_ALLOC_REGULAR;
>> @@ -1306,16 +1311,14 @@ int btrfs_open_devices(struct btrfs_fs_devices 
>> *fs_devices,
>>        * We also don't need the lock here as this is called during 
>> mount and
>>        * exclusion is provided by uuid_mutex
>>        */
>> -
>> -    if (fs_devices->opened) {
>> -        fs_devices->opened++;
>> -        ret = 0;
>> -    } else {
>> +    if (!fs_devices->is_open) {
>>           list_sort(NULL, &fs_devices->devices, devid_cmp);
>>           ret = open_fs_devices(fs_devices, flags, holder);
>> +        if (ret)
>> +            return ret;
>>       }
>> -
>> -    return ret;
>> +    fs_devices->in_use++;
>> +    return 0;
>>   }
>>   void btrfs_release_disk_super(struct btrfs_super_block *super)
>> @@ -1407,7 +1410,7 @@ static bool btrfs_skip_registration(struct 
>> btrfs_super_block *disk_super,
>>           mutex_lock(&fs_devices->device_list_mutex);
>> -        if (!fs_devices->opened) {
>> +        if (!fs_devices->is_open) {
>>               mutex_unlock(&fs_devices->device_list_mutex);
>>               continue;
>>           }
>> @@ -2314,13 +2317,14 @@ int btrfs_rm_device(struct btrfs_fs_info 
>> *fs_info,
>>        * This can happen if cur_devices is the private seed devices 
>> list.  We
>>        * cannot call close_fs_devices() here because it expects the 
>> uuid_mutex
>>        * to be held, but in fact we don't need that for the private
>> -     * seed_devices, we can simply decrement cur_devices->opened and 
>> then
>> +     * seed_devices, we can simply decrement cur_devices->in_use and 
>> then
>>        * remove it from our list and free the fs_devices.
>>        */
>>       if (cur_devices->num_devices == 0) {
>>           list_del_init(&cur_devices->seed_list);
>> -        ASSERT(cur_devices->opened == 1, "opened=%d", cur_devices- 
>> >opened);
>> -        cur_devices->opened--;
>> +        ASSERT(cur_devices->in_use == 1, "opened=%d", cur_devices- 
>> >in_use);
>> +        cur_devices->in_use--;
>> +        cur_devices->is_open = false;
>>           free_fs_devices(cur_devices);
>>       }
>> @@ -2549,7 +2553,8 @@ static struct btrfs_fs_devices 
>> *btrfs_init_sprout(struct btrfs_fs_info *fs_info)
>>       list_add(&old_devices->fs_list, &fs_uuids);
>>       memcpy(seed_devices, fs_devices, sizeof(*seed_devices));
>> -    seed_devices->opened = 1;
>> +    seed_devices->in_use = 1;
>> +    seed_devices->is_open = true;
>>       INIT_LIST_HEAD(&seed_devices->devices);
>>       INIT_LIST_HEAD(&seed_devices->alloc_list);
>>       mutex_init(&seed_devices->device_list_mutex);
>> @@ -7162,7 +7167,8 @@ static struct btrfs_fs_devices 
>> *open_seed_devices(struct btrfs_fs_info *fs_info,
>>               return fs_devices;
>>           fs_devices->seeding = true;
>> -        fs_devices->opened = 1;
>> +        fs_devices->in_use = 1;
>> +        fs_devices->is_open = true;
>>           return fs_devices;
>>       }
>> @@ -7179,6 +7185,7 @@ static struct btrfs_fs_devices 
>> *open_seed_devices(struct btrfs_fs_info *fs_info,
>>           free_fs_devices(fs_devices);
>>           return ERR_PTR(ret);
>>       }
>> +    fs_devices->in_use = 1;
>>       if (!fs_devices->seeding) {
>>           close_fs_devices(fs_devices);
>> diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
>> index afa71d315c46..06cf8c99befe 100644
>> --- a/fs/btrfs/volumes.h
>> +++ b/fs/btrfs/volumes.h
>> @@ -419,8 +419,10 @@ struct btrfs_fs_devices {
>>       struct list_head seed_list;
>> -    /* Count fs-devices opened. */
>> -    int opened;
>> +    /* Count if fs_device is in used. */
>> +    unsigned int in_use;
>> +    /* True if the devices were opened. */
>> +    bool is_open;
>>       /* Set when we find or add a device that doesn't have the nonrot 
>> flag set. */
>>       bool rotating;
> 
> 



  reply	other threads:[~2025-06-16 22:37 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-11 10:02 [PATCH v2 0/5] btrfs: use the super_block as bdev holder Johannes Thumshirn
2025-06-11 10:02 ` [PATCH v2 1/5] btrfs: always open the device read-only in btrfs_scan_one_device Johannes Thumshirn
2025-06-11 21:28   ` Qu Wenruo
2025-06-11 10:03 ` [PATCH v2 2/5] btrfs: call btrfs_close_devices from ->kill_sb Johannes Thumshirn
2025-06-11 21:37   ` Qu Wenruo
2025-06-11 10:03 ` [PATCH v2 3/5] btrfs: split btrfs_fs_devices.opened Johannes Thumshirn
2025-06-11 21:44   ` Qu Wenruo
2025-06-16 10:26   ` Qu Wenruo
2025-06-16 22:37     ` Qu Wenruo [this message]
2025-06-11 10:03 ` [PATCH v2 4/5] btrfs: open block devices after superblock creation Johannes Thumshirn
2025-06-11 10:03 ` [PATCH v2 5/5] btrfs: use the super_block as holder when mounting file systems Johannes Thumshirn
2025-06-11 21:49 ` [PATCH v2 0/5] btrfs: use the super_block as bdev holder Qu Wenruo
2025-06-11 22:06   ` Qu Wenruo
2025-06-12 12:15     ` Johannes Thumshirn
2025-06-12 22:21       ` Qu Wenruo
2025-06-13  5:59         ` hch
2025-06-13  8:01           ` Qu Wenruo
2025-06-13  8:14             ` Johannes Thumshirn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c34b2c63-66f9-422f-bba4-997bc530c673@suse.com \
    --to=wqu@suse.com \
    --cc=boris@bur.io \
    --cc=dsterba@suse.com \
    --cc=hch@lst.de \
    --cc=johannes.thumshirn@wdc.com \
    --cc=josef@toxicpanda.com \
    --cc=jth@kernel.org \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox