public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Anand Jain <anand.jain@oracle.com>
To: Boris Burkov <boris@bur.io>,
	linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] btrfs: do not clear read-only when adding sprout device
Date: Thu, 17 Oct 2024 01:14:16 +0800	[thread overview]
Message-ID: <694552b3-5f70-48d2-a62f-4c2b8caf10fd@oracle.com> (raw)
In-Reply-To: <a9aa42f6bc2739ab46ce015f749e15177f8730d6.1729028033.git.boris@bur.io>

On 16/10/24 05:38, Boris Burkov wrote:
> If you follow the seed/sprout wiki, it suggests the following workflow:
> 
> btrfstune -S 1 seed_dev
> mount seed_dev mnt
> btrfs device add sprout_dev
> mount -o remount,rw mnt
> 



> The first mount mounts the FS readonly, which results in not setting
> BTRFS_FS_OPEN, and setting the readonly bit on the sb. The device add
> somewhat surprisingly clears the readonly bit on the sb (though the
> mount is still practically readonly, from the users perspective...).
> Finally, the remount checks the readonly bit on the sb against the flag
> and sees no change, so it does not run the code intended to run on
> ro->rw transitions, leaving BTRFS_FS_OPEN unset.
> 
> As a result, when the cleaner_kthread runs, it sees no BTRFS_FS_OPEN and
> does no work. This results in leaking deleted snapshots until we run out
> of space.
> 
> I propose fixing it at the first departure from what feels reasonable:
> when we clear the readonly bit on the sb during device add.
> 
> A new fstest I have written reproduces the bug and confirms the fix.
> 
> Signed-off-by: Boris Burkov <boris@bur.io>
> ---
> Note that this is a resend of an old unmerged fix:
> https://lore.kernel.org/linux-btrfs/16c05d39566858bb8bc1e03bd19947cf2b601b98.1647906815.git.boris@bur.io/T/#u
> Some other ideas for fixing it by modifying how we set BTRFS_FS_OPEN
> were also explored but not merged around that time:
> https://lore.kernel.org/linux-btrfs/cover.1654216941.git.anand.jain@oracle.com/
> 
> I don't have a strong preference, but I would really like to see this
> trivial bug fixed. For what it is worth, we have been carrying this
> patch internally at Meta since I first sent it with no incident.
> ---


I remember fixing this before. I tested on 5.15, and the bug isn't
there, but it’s back in 6.10, so something broke in between.
We need to track it down.

The original design (kernel 4.x and below) makes the filesystem switch
to read-write mode after adding a sprout because:

    You can’t add a device to a normal read-only filesystem
  so with seed read-only mount is different.
    With a seed device, adding a writable device transforms
  it into a new read-write filesystem with a _new_ FSID and
  fs_devices. Logically, read-write at this stage makes sense,
  but I’m okay without it and in fact we had fixed this before,
  but a patch somewhere seems to have broken it again.


(Demo below. :<x> is the return code from the 'run' command at
  https://github.com/asj/run.git)


----- 5.15.0-208.159.3.2.el9uek.x86_64 ----
$ mkfs.btrfs -fq /dev/loop0 :0
$ btrfstune -S1 /dev/loop0 :0
$ mount /dev/loop0 /btrfs :0
mount: /btrfs: WARNING: source write-protected, mounted read-only.

$ cat /proc/self/mounts | grep btrfs :0
/dev/loop0 /btrfs btrfs ro,relatime,space_cache=v2,subvolid=5,subvol=/ 0 0

$ findmnt -o SOURCE,UUID /btrfs :0
SOURCE     UUID
/dev/loop0 64f21b87-4e4c-4786-b2cd-c09a5ccd2afa

$ btrfs fi show -m :0
Label: none  uuid: 64f21b87-4e4c-4786-b2cd-c09a5ccd2afa
	Total devices 1 FS bytes used 144.00KiB
	devid    1 size 3.00GiB used 536.00MiB path /dev/loop0

$ ls /sys/fs/btrfs :0
64f21b87-4e4c-4786-b2cd-c09a5ccd2afa
features

$ btrfs dev add -f /dev/loop1 /btrfs :0

# After adding the device, the path and UUID are different,
# so it’s a new filesystem. (But, as I said, I’m fine with
# keeping it read-only and needing remount,rw.

$ cat /proc/self/mounts | grep btrfs :0
/dev/loop1 /btrfs btrfs ro,relatime,space_cache=v2,subvolid=5,subvol=/ 0 0

$ findmnt -o SOURCE,UUID /btrfs :0
SOURCE     UUID
/dev/loop1 948cea35-18db-45da-9ec8-3d46cb5f0413

$ btrfs fi show -m :0
Label: none  uuid: 948cea35-18db-45da-9ec8-3d46cb5f0413
	Total devices 2 FS bytes used 144.00KiB
	devid    1 size 3.00GiB used 520.00MiB path /dev/loop0
	devid    2 size 3.00GiB used 576.00MiB path /dev/loop1


$ ls /sys/fs/btrfs :0
948cea35-18db-45da-9ec8-3d46cb5f0413
features
---------


Thanks, Anand

>   fs/btrfs/volumes.c | 4 ----
>   1 file changed, 4 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index dc9f54849f39..84e861dcb350 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -2841,8 +2841,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
>   	set_blocksize(device->bdev_file, BTRFS_BDEV_BLOCKSIZE);
>   
>   	if (seeding_dev) {
> -		btrfs_clear_sb_rdonly(sb);
> -
>   		/* GFP_KERNEL allocation must not be under device_list_mutex */
>   		seed_devices = btrfs_init_sprout(fs_info);
>   		if (IS_ERR(seed_devices)) {
> @@ -2985,8 +2983,6 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
>   	mutex_unlock(&fs_info->chunk_mutex);
>   	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>   error_trans:
> -	if (seeding_dev)
> -		btrfs_set_sb_rdonly(sb);
>   	if (trans)
>   		btrfs_end_transaction(trans);
>   error_free_zone:


  parent reply	other threads:[~2024-10-16 17:14 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-15 21:38 [PATCH] btrfs: do not clear read-only when adding sprout device Boris Burkov
2024-10-15 22:00 ` Qu Wenruo
2024-10-15 22:12   ` Qu Wenruo
2024-10-15 23:23     ` Boris Burkov
2024-10-16 17:14 ` Anand Jain [this message]
2024-10-16 17:24   ` Boris Burkov
2024-10-17 20:47   ` Qu Wenruo
2024-10-18 11:54     ` Anand Jain
2024-10-17 14:01 ` David Sterba
2024-10-17 16:41   ` Boris Burkov
2024-10-21 18:56     ` David Sterba
2024-10-21 19:29       ` Boris Burkov
  -- strict thread matches above, loose matches on Subject: below --
2022-03-21 23:56 Boris Burkov
2022-03-22 21:46 ` Josef Bacik
2022-03-23  0:52 ` Naohiro Aota
2022-03-23 18:16   ` Boris Burkov
2022-03-28 11:11     ` Anand Jain
2022-03-29  4:33       ` Naohiro Aota
2022-03-29 19:45         ` Boris Burkov
2022-03-23 10:44 ` Anand Jain
2022-03-23 18:25   ` Boris Burkov
2022-03-24 11:16     ` Anand Jain
2022-03-23 20:17   ` Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=694552b3-5f70-48d2-a62f-4c2b8caf10fd@oracle.com \
    --to=anand.jain@oracle.com \
    --cc=boris@bur.io \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox