From: Boris Burkov <boris@bur.io>
To: Naohiro Aota <Naohiro.Aota@wdc.com>
Cc: Anand Jain <anand.jain@oracle.com>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
"kernel-team@fb.com" <kernel-team@fb.com>
Subject: Re: [PATCH] btrfs: do not clear read-only when adding sprout device
Date: Tue, 29 Mar 2022 12:45:21 -0700 [thread overview]
Message-ID: <YkNh0X7wx8uk5aat@zen> (raw)
In-Reply-To: <20220329043320.bak6zyigz2g5facj@naota-xeon>
On Tue, Mar 29, 2022 at 04:33:21AM +0000, Naohiro Aota wrote:
> On Mon, Mar 28, 2022 at 07:11:39PM +0800, Anand Jain wrote:
> > On 24/03/2022 02:16, Boris Burkov wrote:
> > > On Wed, Mar 23, 2022 at 12:52:15AM +0000, Naohiro Aota wrote:
> > > > On Mon, Mar 21, 2022 at 04:56:17PM -0700, Boris Burkov wrote:
> > > > > If you follow the seed/sprout wiki, it suggests the following workflow:
> > > > >
> > > > > btrfstune -S 1 seed_dev > > mount seed_dev mnt
> > > > > btrfs device add sprout_dev
> > > > > mount -o remount,rw mnt
> > > > >
> > > > > The first mount mounts the FS readonly, which results in not setting
> > > > > BTRFS_FS_OPEN, and setting the readonly bit on the sb. The device add
> > > > > somewhat surprisingly clears the readonly bit on the sb (though the
> > > > > mount is still practically readonly, from the users perspective...).
> > > > > Finally, the remount checks the readonly bit on the sb against the flag
> > > > > and sees no change, so it does not run the code intended to run on
> > > > > ro->rw transitions, leaving BTRFS_FS_OPEN unset.
> > > > >
> > > > > As a result, when the cleaner_kthread runs, it sees no BTRFS_FS_OPEN and
> > > > > does no work. This results in leaking deleted snapshots until we run out
> > > > > of space.
> > > > >
> > > > > I propose fixing it at the first departure from what feels reasonable:
> > > > > when we clear the readonly bit on the sb during device add. I have a
> > > > > reproducer of the issue here:
> > > > > https://raw.githubusercontent.com/boryas/scripts/main/sh/seed/mkseed.sh
> > > > > and confirm that this patch fixes it, and seems to work OK, otherwise. I
> > > > > will admit that I couldn't dig up the original rationale for clearing
> > > > > the bit here (it dates back to the original seed/sprout commit without
> > > > > explicit explanation) so it's hard to imagine all the ramifications of
> > > > > the change.
> > > > >
> > > > > Signed-off-by: Boris Burkov <boris@bur.io>
> > > > > ---
> > > > > fs/btrfs/volumes.c | 4 ----
> > > > > 1 file changed, 4 deletions(-)
> > > > >
> > > > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > > > > index 3fd17e87815a..75d7eeb26fe6 100644
> > > > > --- a/fs/btrfs/volumes.c
> > > > > +++ b/fs/btrfs/volumes.c
> > > > > @@ -2675,8 +2675,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
> > > > > set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE);
> > > > > if (seeding_dev) {
> > > > > - btrfs_clear_sb_rdonly(sb);
> > > > > -
> > > >
> > > > After this line, it updates the metadata e.g, with
> > > > init_first_rw_device() and writes them out with
> > > > btrfs_commit_transaction(). Is that OK to do so with the SB_RDONLY
> > > > flag set?
> > >
> >
> > It is ok as the device-add step creates a _new_ sprout filesystem which is
> > RW-able. btrfs_setup_sprout() resets the seeding flag.
> >
> > super_flags = btrfs_super_flags(disk_super) &
> > ~BTRFS_SUPER_FLAG_SEEDING;
> > btrfs_set_super_flags(disk_super, super_flags);
> >
> > Thanks, Anand
>
> Yeah, I see that point. I'm concerned about an interaction with the
> VFS code. With this patch applied, it is going to do file-system
> internal writes (updating the trees and transaction commit) with the
> SB_RDONLY flag set. Doesn't it break the current and future assumptions
> of the VFS side?
>
> But, it's just a concern. It might not be a bid deal.
I discussed this with Chris and he pointed out an existing case:
log-replay. Log replay runs in mount before the sb_rdonly check and
checks rw_devices, but not the fs readonly bit. It uses transactions and
such to replay the log. This seems similar enough here: the device is rw,
but the fs is ro.
>
> > > Good question. As far as I can tell, the functions don't explicitly
> > > check sb_rdonly, though that could be because they expect that to be
> > > checked before you ever try to commit a transaction, for example..
> > >
> > > If there is an issue, it's probably somewhat subtle, because the basic
> > > behavior does work.
> > >
> > > >
> > > > > /* GFP_KERNEL allocation must not be under device_list_mutex */
> > > > > seed_devices = btrfs_init_sprout(fs_info);
> > > > > if (IS_ERR(seed_devices)) {
> > > > > @@ -2819,8 +2817,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
> > > > > mutex_unlock(&fs_info->chunk_mutex);
> > > > > mutex_unlock(&fs_info->fs_devices->device_list_mutex);
> > > > > error_trans:
> > > > > - if (seeding_dev)
> > > > > - btrfs_set_sb_rdonly(sb);
> > > > > if (trans)
> > > > > btrfs_end_transaction(trans);
> > > > > error_free_zone:
> > > > > --
> > > > > 2.30.2
> > > > >
> >
next prev parent reply other threads:[~2022-03-29 19:45 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-21 23:56 [PATCH] btrfs: do not clear read-only when adding sprout device Boris Burkov
2022-03-22 21:46 ` Josef Bacik
2022-03-23 0:52 ` Naohiro Aota
2022-03-23 18:16 ` Boris Burkov
2022-03-28 11:11 ` Anand Jain
2022-03-29 4:33 ` Naohiro Aota
2022-03-29 19:45 ` Boris Burkov [this message]
2022-03-23 10:44 ` Anand Jain
2022-03-23 18:25 ` Boris Burkov
2022-03-24 11:16 ` Anand Jain
2022-03-23 20:17 ` Josef Bacik
-- strict thread matches above, loose matches on Subject: below --
2024-10-15 21:38 Boris Burkov
2024-10-15 22:00 ` Qu Wenruo
2024-10-15 22:12 ` Qu Wenruo
2024-10-15 23:23 ` Boris Burkov
2024-10-16 17:14 ` Anand Jain
2024-10-16 17:24 ` Boris Burkov
2024-10-17 20:47 ` Qu Wenruo
2024-10-18 11:54 ` Anand Jain
2024-10-17 14:01 ` David Sterba
2024-10-17 16:41 ` Boris Burkov
2024-10-21 18:56 ` David Sterba
2024-10-21 19:29 ` Boris Burkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YkNh0X7wx8uk5aat@zen \
--to=boris@bur.io \
--cc=Naohiro.Aota@wdc.com \
--cc=anand.jain@oracle.com \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox