All of lore.kernel.org
 help / color / mirror / Atom feed
From: Boris Burkov <boris@bur.io>
To: Naohiro Aota <Naohiro.Aota@wdc.com>
Cc: Anand Jain <anand.jain@oracle.com>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
	"kernel-team@fb.com" <kernel-team@fb.com>
Subject: Re: [PATCH] btrfs: do not clear read-only when adding sprout device
Date: Tue, 29 Mar 2022 12:45:21 -0700	[thread overview]
Message-ID: <YkNh0X7wx8uk5aat@zen> (raw)
In-Reply-To: <20220329043320.bak6zyigz2g5facj@naota-xeon>

On Tue, Mar 29, 2022 at 04:33:21AM +0000, Naohiro Aota wrote:
> On Mon, Mar 28, 2022 at 07:11:39PM +0800, Anand Jain wrote:
> > On 24/03/2022 02:16, Boris Burkov wrote:
> > > On Wed, Mar 23, 2022 at 12:52:15AM +0000, Naohiro Aota wrote:
> > > > On Mon, Mar 21, 2022 at 04:56:17PM -0700, Boris Burkov wrote:
> > > > > If you follow the seed/sprout wiki, it suggests the following workflow:
> > > > > 
> > > > > btrfstune -S 1 seed_dev > > mount seed_dev mnt
> > > > > btrfs device add sprout_dev
> > > > > mount -o remount,rw mnt
> > > > > 
> > > > > The first mount mounts the FS readonly, which results in not setting
> > > > > BTRFS_FS_OPEN, and setting the readonly bit on the sb. The device add
> > > > > somewhat surprisingly clears the readonly bit on the sb (though the
> > > > > mount is still practically readonly, from the users perspective...).
> > > > > Finally, the remount checks the readonly bit on the sb against the flag
> > > > > and sees no change, so it does not run the code intended to run on
> > > > > ro->rw transitions, leaving BTRFS_FS_OPEN unset.
> > > > > 
> > > > > As a result, when the cleaner_kthread runs, it sees no BTRFS_FS_OPEN and
> > > > > does no work. This results in leaking deleted snapshots until we run out
> > > > > of space.
> > > > > 
> > > > > I propose fixing it at the first departure from what feels reasonable:
> > > > > when we clear the readonly bit on the sb during device add. I have a
> > > > > reproducer of the issue here:
> > > > > https://raw.githubusercontent.com/boryas/scripts/main/sh/seed/mkseed.sh
> > > > > and confirm that this patch fixes it, and seems to work OK, otherwise. I
> > > > > will admit that I couldn't dig up the original rationale for clearing
> > > > > the bit here (it dates back to the original seed/sprout commit without
> > > > > explicit explanation) so it's hard to imagine all the ramifications of
> > > > > the change.
> > > > > 
> > > > > Signed-off-by: Boris Burkov <boris@bur.io>
> > > > > ---
> > > > >   fs/btrfs/volumes.c | 4 ----
> > > > >   1 file changed, 4 deletions(-)
> > > > > 
> > > > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > > > > index 3fd17e87815a..75d7eeb26fe6 100644
> > > > > --- a/fs/btrfs/volumes.c
> > > > > +++ b/fs/btrfs/volumes.c
> > > > > @@ -2675,8 +2675,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
> > > > >   	set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE);
> > > > >   	if (seeding_dev) {
> > > > > -		btrfs_clear_sb_rdonly(sb);
> > > > > -
> > > > 
> > > > After this line, it updates the metadata e.g, with
> > > > init_first_rw_device() and writes them out with
> > > > btrfs_commit_transaction(). Is that OK to do so with the SB_RDONLY
> > > > flag set?
> > > 
> > 
> > It is ok as the device-add step creates a _new_ sprout filesystem which is
> > RW-able. btrfs_setup_sprout() resets the seeding flag.
> > 
> >  super_flags = btrfs_super_flags(disk_super) &
> >  ~BTRFS_SUPER_FLAG_SEEDING;
> >  btrfs_set_super_flags(disk_super, super_flags);
> > 
> > Thanks, Anand
> 
> Yeah, I see that point. I'm concerned about an interaction with the
> VFS code. With this patch applied, it is going to do file-system
> internal writes (updating the trees and transaction commit) with the
> SB_RDONLY flag set. Doesn't it break the current and future assumptions
> of the VFS side?
> 
> But, it's just a concern. It might not be a bid deal.

I discussed this with Chris and he pointed out an existing case:
log-replay. Log replay runs in mount before the sb_rdonly check and
checks rw_devices, but not the fs readonly bit. It uses transactions and
such to replay the log. This seems similar enough here: the device is rw,
but the fs is ro.

> 
> > > Good question. As far as I can tell, the functions don't explicitly
> > > check sb_rdonly, though that could be because they expect that to be
> > > checked before you ever try to commit a transaction, for example..
> > > 
> > > If there is an issue, it's probably somewhat subtle, because the basic
> > > behavior does work.
> > > 
> > > > 
> > > > >   		/* GFP_KERNEL allocation must not be under device_list_mutex */
> > > > >   		seed_devices = btrfs_init_sprout(fs_info);
> > > > >   		if (IS_ERR(seed_devices)) {
> > > > > @@ -2819,8 +2817,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
> > > > >   	mutex_unlock(&fs_info->chunk_mutex);
> > > > >   	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
> > > > >   error_trans:
> > > > > -	if (seeding_dev)
> > > > > -		btrfs_set_sb_rdonly(sb);
> > > > >   	if (trans)
> > > > >   		btrfs_end_transaction(trans);
> > > > >   error_free_zone:
> > > > > -- 
> > > > > 2.30.2
> > > > > 
> > 

  reply	other threads:[~2022-03-29 19:45 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-21 23:56 [PATCH] btrfs: do not clear read-only when adding sprout device Boris Burkov
2022-03-22 21:46 ` Josef Bacik
2022-03-23  0:52 ` Naohiro Aota
2022-03-23 18:16   ` Boris Burkov
2022-03-28 11:11     ` Anand Jain
2022-03-29  4:33       ` Naohiro Aota
2022-03-29 19:45         ` Boris Burkov [this message]
2022-03-23 10:44 ` Anand Jain
2022-03-23 18:25   ` Boris Burkov
2022-03-24 11:16     ` Anand Jain
2022-03-23 20:17   ` Josef Bacik
  -- strict thread matches above, loose matches on Subject: below --
2024-10-15 21:38 Boris Burkov
2024-10-15 22:00 ` Qu Wenruo
2024-10-15 22:12   ` Qu Wenruo
2024-10-15 23:23     ` Boris Burkov
2024-10-16 17:14 ` Anand Jain
2024-10-16 17:24   ` Boris Burkov
2024-10-17 20:47   ` Qu Wenruo
2024-10-18 11:54     ` Anand Jain
2024-10-17 14:01 ` David Sterba
2024-10-17 16:41   ` Boris Burkov
2024-10-21 18:56     ` David Sterba
2024-10-21 19:29       ` Boris Burkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YkNh0X7wx8uk5aat@zen \
    --to=boris@bur.io \
    --cc=Naohiro.Aota@wdc.com \
    --cc=anand.jain@oracle.com \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.