From: Boris Burkov <boris@bur.io>
To: Anand Jain <anand.jain@oracle.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] btrfs: do not clear read-only when adding sprout device
Date: Wed, 23 Mar 2022 11:25:50 -0700 [thread overview]
Message-ID: <YjtmLqBGlqaQXf2u@zen> (raw)
In-Reply-To: <b4ff2316-fca8-2f04-bf0a-d7747118b768@oracle.com>
On Wed, Mar 23, 2022 at 06:44:42PM +0800, Anand Jain wrote:
> On 22/03/2022 07:56, Boris Burkov wrote:
> > If you follow the seed/sprout wiki, it suggests the following workflow:
> >
> > btrfstune -S 1 seed_dev
> > mount seed_dev mnt
> > btrfs device add sprout_dev
>
> > mount -o remount,rw mnt
> or
> umount mnt
> mount sprout mnt
Agreed. FWIW, I tested that umount/mount is not vulnerable to the bug.
However, a user might be using one of the documented workflows anyway,
and would need it for an fs they add a sprout to while it is in use.
>
> > The first mount mounts the FS readonly, which results in not setting
> > BTRFS_FS_OPEN, and setting the readonly bit on the sb.
>
> Why not set the BTRFS_FS_OPEN?
>
One reason is that there is other logic that runs when transitioning
from ro->rw in remount besides just setting BTRFS_FS_OPEN. By not
improperly clearing ro, we let that logic do its thing naturally.
> @@ -3904,8 +3904,11 @@ int __cold open_ctree(struct super_block *sb, struct
> btrfs_fs_devices *fs_device
> goto fail_qgroup;
> }
>
> - if (sb_rdonly(sb))
> + if (sb_rdonly(sb)) {
> + btrfs_set_sb_rdonly(sb);
> + set_bit(BTRFS_FS_OPEN, &fs_info->flags);
> goto clear_oneshot;
> + }
>
> ret = btrfs_start_pre_rw_mount(fs_info);
> if (ret) {
>
> > The device add
> > somewhat surprisingly clears the readonly bit on the sb (though the
> > mount is still practically readonly, from the users perspective...).
> > Finally, the remount checks the readonly bit on the sb against the flag
> > and sees no change, so it does not run the code intended to run on
> > ro->rw transitions, leaving BTRFS_FS_OPEN unset.
>
> Originally, the step 'btrfs device add sprout_dev' provided seed
> fs writeable without a remount.
That is very interesting. Do you remember why we changed this behavior
to the current behavior of leaving the seed readonly until the
remount/mount cycle?
>
> I think the btrfs_clear_sb_rdonly(sb) in btrfs_init_new_device()
> was part of it.
>
> Removing it doesn't seem to affect the seed-sprout functionality
> (did I miss anything?) either the -o remount,rw
> or mount recycle will get it writeable.
My current understanding (probably flawed..):
before this patch: we clear the rdonly bit, but the fs is still readonly
until remount (should figure out exactly how)
after this patch: behavior is the same, except the rdonly bit gets
cleared along with the mounting, not the device add.
>
> > As a result, when the cleaner_kthread runs, it sees no BTRFS_FS_OPEN and
> > does no work. This results in leaking deleted snapshots until we run out
> > of space.
>
>
> > I propose fixing it at the first departure from what feels reasonable:
> > when we clear the readonly bit on the sb during device add. I have a
> > reproducer of the issue here:
> > https://raw.githubusercontent.com/boryas/scripts/main/sh/seed/mkseed.sh
> > and confirm that this patch fixes it, and seems to work OK, otherwise. I
> > will admit that I couldn't dig up the original rationale for clearing
> > the bit here (it dates back to the original seed/sprout commit without
> > explicit explanation) so it's hard to imagine all the ramifications of
> > the change.
>
> We got fstests -g seed to test the seed-sprout stuff. Your test case
> here fits in it. IMO.
Thanks for the tip, I'll add it there, regardless of how we fix it.
>
> Thanks, Anand
>
>
> > Signed-off-by: Boris Burkov <boris@bur.io>
> > ---
> > fs/btrfs/volumes.c | 4 ----
> > 1 file changed, 4 deletions(-)
> >
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index 3fd17e87815a..75d7eeb26fe6 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -2675,8 +2675,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
> > set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE);
> > if (seeding_dev) {
> > - btrfs_clear_sb_rdonly(sb);
> > -
> > /* GFP_KERNEL allocation must not be under device_list_mutex */
> > seed_devices = btrfs_init_sprout(fs_info);
> > if (IS_ERR(seed_devices)) {
> > @@ -2819,8 +2817,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
> > mutex_unlock(&fs_info->chunk_mutex);
> > mutex_unlock(&fs_info->fs_devices->device_list_mutex);
> > error_trans:
> > - if (seeding_dev)
> > - btrfs_set_sb_rdonly(sb);
> > if (trans)
> > btrfs_end_transaction(trans);
> > error_free_zone:
>
next prev parent reply other threads:[~2022-03-23 18:25 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-21 23:56 [PATCH] btrfs: do not clear read-only when adding sprout device Boris Burkov
2022-03-22 21:46 ` Josef Bacik
2022-03-23 0:52 ` Naohiro Aota
2022-03-23 18:16 ` Boris Burkov
2022-03-28 11:11 ` Anand Jain
2022-03-29 4:33 ` Naohiro Aota
2022-03-29 19:45 ` Boris Burkov
2022-03-23 10:44 ` Anand Jain
2022-03-23 18:25 ` Boris Burkov [this message]
2022-03-24 11:16 ` Anand Jain
2022-03-23 20:17 ` Josef Bacik
-- strict thread matches above, loose matches on Subject: below --
2024-10-15 21:38 Boris Burkov
2024-10-15 22:00 ` Qu Wenruo
2024-10-15 22:12 ` Qu Wenruo
2024-10-15 23:23 ` Boris Burkov
2024-10-16 17:14 ` Anand Jain
2024-10-16 17:24 ` Boris Burkov
2024-10-17 20:47 ` Qu Wenruo
2024-10-18 11:54 ` Anand Jain
2024-10-17 14:01 ` David Sterba
2024-10-17 16:41 ` Boris Burkov
2024-10-21 18:56 ` David Sterba
2024-10-21 19:29 ` Boris Burkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YjtmLqBGlqaQXf2u@zen \
--to=boris@bur.io \
--cc=anand.jain@oracle.com \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.