public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Anand Jain <anand.jain@oracle.com>
To: Boris Burkov <boris@bur.io>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] btrfs: do not clear read-only when adding sprout device
Date: Thu, 24 Mar 2022 19:16:13 +0800	[thread overview]
Message-ID: <795bf289-6520-cb90-51c5-7809d0683fa4@oracle.com> (raw)
In-Reply-To: <YjtmLqBGlqaQXf2u@zen>



On 24/03/2022 02:25, Boris Burkov wrote:
> On Wed, Mar 23, 2022 at 06:44:42PM +0800, Anand Jain wrote:
>> On 22/03/2022 07:56, Boris Burkov wrote:
>>> If you follow the seed/sprout wiki, it suggests the following workflow:
>>>
>>> btrfstune -S 1 seed_dev
>>> mount seed_dev mnt
>>> btrfs device add sprout_dev
>>
>>> mount -o remount,rw mnt
>> or
>>   umount mnt
>>   mount sprout mnt
> 
> Agreed. FWIW, I tested that umount/mount is not vulnerable to the bug.
> However, a user might be using one of the documented workflows anyway,
> and would need it for an fs they add a sprout to while it is in use.

Yep.

>>
>>> The first mount mounts the FS readonly, which results in not setting
>>> BTRFS_FS_OPEN, and setting the readonly bit on the sb.
>>
>>   Why not set the BTRFS_FS_OPEN?
>>
> 
> One reason is that there is other logic that runs when transitioning
> from ro->rw in remount besides just setting BTRFS_FS_OPEN. By not
> improperly clearing ro, we let that logic do its thing naturally.
> 

>> @@ -3904,8 +3904,11 @@ int __cold open_ctree(struct super_block *sb, struct
>> btrfs_fs_devices *fs_device
>>                  goto fail_qgroup;
>>          }
>>
>> -       if (sb_rdonly(sb))
>> +       if (sb_rdonly(sb)) {
>> +               btrfs_set_sb_rdonly(sb);
>> +               set_bit(BTRFS_FS_OPEN, &fs_info->flags);

  Looked more deep. No. I misunderstood BTRFS_FS_OPEN, it implies
  open_ctree() is complete and read-writeable. So this won't work.

>>                  goto clear_oneshot;
>> +       }
>>
>>          ret = btrfs_start_pre_rw_mount(fs_info);
>>          if (ret) {
>>
>>> The device add
>>> somewhat surprisingly clears the readonly bit on the sb (though the
>>> mount is still practically readonly, from the users perspective...).
>>> Finally, the remount checks the readonly bit on the sb against the flag
>>> and sees no change, so it does not run the code intended to run on
>>> ro->rw transitions, leaving BTRFS_FS_OPEN unset.
>>
>>   Originally, the step 'btrfs device add sprout_dev' provided seed
>>   fs writeable without a remount.
> 
> That is very interesting. Do you remember why we changed this behavior
> to the current behavior of leaving the seed readonly until the
> remount/mount cycle?

  So far, I couldn't find the commit that changed it.
  Perhaps Josef may have an idea?

  More below.

>>   I think the btrfs_clear_sb_rdonly(sb) in btrfs_init_new_device()
>>   was part of it.
>>
>>   Removing it doesn't seem to affect the seed-sprout functionality
>>   (did I miss anything?) either the -o remount,rw
>>   or mount recycle will get it writeable.
> 
> My current understanding (probably flawed..):
> before this patch: we clear the rdonly bit, but the fs is still readonly
> until remount (should figure out exactly how)
> after this patch: behavior is the same, except the rdonly bit gets
> cleared along with the mounting, not the device add.
> 
>>
>>> As a result, when the cleaner_kthread runs, it sees no BTRFS_FS_OPEN and
>>> does no work. This results in leaking deleted snapshots until we run out
>>> of space.
>>
>>
>>> I propose fixing it at the first departure from what feels reasonable:
>>> when we clear the readonly bit on the sb during device add. I have a
>>> reproducer of the issue here:
>>> https://raw.githubusercontent.com/boryas/scripts/main/sh/seed/mkseed.sh
>>> and confirm that this patch fixes it, and seems to work OK, otherwise. I
>>> will admit that I couldn't dig up the original rationale for clearing
>>> the bit here (it dates back to the original seed/sprout commit without
>>> explicit explanation) so it's hard to imagine all the ramifications of
>>> the change.
>>
>>   We got fstests -g seed to test the seed-sprout stuff. Your test case
>>   here fits in it. IMO.
> 
> Thanks for the tip, I'll add it there, regardless of how we fix it.
> 
>>
>> Thanks, Anand
>>
>>


>>> Signed-off-by: Boris Burkov <boris@bur.io>
>>> ---
>>>    fs/btrfs/volumes.c | 4 ----
>>>    1 file changed, 4 deletions(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 3fd17e87815a..75d7eeb26fe6 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -2675,8 +2675,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
>>>    	set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE);
>>>    	if (seeding_dev) {
>>> -		btrfs_clear_sb_rdonly(sb);
>>> -
>>>    		/* GFP_KERNEL allocation must not be under device_list_mutex */
>>>    		seed_devices = btrfs_init_sprout(fs_info);
>>>    		if (IS_ERR(seed_devices)) {
>>> @@ -2819,8 +2817,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
>>>    	mutex_unlock(&fs_info->chunk_mutex);
>>>    	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>>>    error_trans:
>>> -	if (seeding_dev)
>>> -		btrfs_set_sb_rdonly(sb);
>>>    	if (trans)
>>>    		btrfs_end_transaction(trans);
>>>    error_free_zone:
>>

This looks good to me. You may add.

Reviewed-by: Anand Jain <anand.jain@oracle.com>

Thanks, Anand

  reply	other threads:[~2022-03-24 11:16 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-21 23:56 [PATCH] btrfs: do not clear read-only when adding sprout device Boris Burkov
2022-03-22 21:46 ` Josef Bacik
2022-03-23  0:52 ` Naohiro Aota
2022-03-23 18:16   ` Boris Burkov
2022-03-28 11:11     ` Anand Jain
2022-03-29  4:33       ` Naohiro Aota
2022-03-29 19:45         ` Boris Burkov
2022-03-23 10:44 ` Anand Jain
2022-03-23 18:25   ` Boris Burkov
2022-03-24 11:16     ` Anand Jain [this message]
2022-03-23 20:17   ` Josef Bacik
  -- strict thread matches above, loose matches on Subject: below --
2024-10-15 21:38 Boris Burkov
2024-10-15 22:00 ` Qu Wenruo
2024-10-15 22:12   ` Qu Wenruo
2024-10-15 23:23     ` Boris Burkov
2024-10-16 17:14 ` Anand Jain
2024-10-16 17:24   ` Boris Burkov
2024-10-17 20:47   ` Qu Wenruo
2024-10-18 11:54     ` Anand Jain
2024-10-17 14:01 ` David Sterba
2024-10-17 16:41   ` Boris Burkov
2024-10-21 18:56     ` David Sterba
2024-10-21 19:29       ` Boris Burkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=795bf289-6520-cb90-51c5-7809d0683fa4@oracle.com \
    --to=anand.jain@oracle.com \
    --cc=boris@bur.io \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox