From: Anand Jain <anand.jain@oracle.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>, Boris Burkov <boris@bur.io>,
linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] btrfs: do not clear read-only when adding sprout device
Date: Fri, 18 Oct 2024 19:54:33 +0800
Message-ID: <56258d9b-484d-4340-b6ad-82ac8648a03a@oracle.com>
In-Reply-To: <12c0bb30-7ee5-4aec-9fe8-f40ee01ec9a7@gmx.com>
On 18/10/24 04:47, Qu Wenruo wrote:
>
>
> On 2024/10/17 03:44, Anand Jain wrote:
>> On 16/10/24 05:38, Boris Burkov wrote:
>>> If you follow the seed/sprout wiki, it suggests the following workflow:
>>>
>>> btrfstune -S 1 seed_dev
>>> mount seed_dev mnt
>>> btrfs device add sprout_dev
>>> mount -o remount,rw mnt
>>>
>>
>>
>>
>>> The first mount mounts the FS readonly, which results in not setting
>>> BTRFS_FS_OPEN, and setting the readonly bit on the sb. The device add
>>> somewhat surprisingly clears the readonly bit on the sb (though the
>>> mount is still practically readonly, from the user's perspective...).
>>> Finally, the remount checks the readonly bit on the sb against the flag
>>> and sees no change, so it does not run the code intended to run on
>>> ro->rw transitions, leaving BTRFS_FS_OPEN unset.
>>>
>>> As a result, when the cleaner_kthread runs, it sees no BTRFS_FS_OPEN and
>>> does no work. This results in leaking deleted snapshots until we run out
>>> of space.
>>>
>>> I propose fixing it at the first departure from what feels reasonable:
>>> when we clear the readonly bit on the sb during device add.
>>>
>>> A new fstest I have written reproduces the bug and confirms the fix.
>>>
>>> Signed-off-by: Boris Burkov <boris@bur.io>
>>> ---
>>> Note that this is a resend of an old unmerged fix:
>>> https://lore.kernel.org/linux-btrfs/16c05d39566858bb8bc1e03bd19947cf2b601b98.1647906815.git.boris@bur.io/T/#u
>>> Some other ideas for fixing it by modifying how we set BTRFS_FS_OPEN
>>> were also explored but not merged around that time:
>>> https://lore.kernel.org/linux-btrfs/cover.1654216941.git.anand.jain@oracle.com/
>>>
>>> I don't have a strong preference, but I would really like to see this
>>> trivial bug fixed. For what it is worth, we have been carrying this
>>> patch internally at Meta since I first sent it with no incident.
>>> ---
>>
>>
>> I remember fixing this before. I tested on 5.15, and the bug isn't
>> there, but it’s back in 6.10, so something broke in between.
>> We need to track it down.
>>
>> The original design (kernel 4.x and below) makes the filesystem switch
>> to read-write mode after adding a sprout because:
>>
>> You can’t add a device to a normal read-only filesystem,
>> so a read-only mount of a seed device is a different case.
>> With a seed device, adding a writable device transforms
>> it into a new read-write filesystem with a _new_ FSID and
>> fs_devices. Logically, read-write at this stage makes sense,
>> but I’m okay without it and in fact we had fixed this before,
>> but a patch somewhere seems to have broken it again.
>>
>>
>> (Demo below. :<x> is the return code from the 'run' command at
>> https://github.com/asj/run.git)
>>
>>
>> ----- 5.15.0-208.159.3.2.el9uek.x86_64 ----
>
> I also tried it on upstream kernel v5.15.94, and the behavior is still
> the old one: the mount changes to RW immediately after device add:
>
> [adam@btrfs-vm ~]$ uname -a
> Linux btrfs-vm 5.15.94-1-lts #1 SMP Wed, 15 Feb 2023 07:09:02 +0000
> x86_64 GNU/Linux
> [adam@btrfs-vm ~]$ sudo mkfs.btrfs -f /dev/test/scratch1 > /dev/null
> [adam@btrfs-vm ~]$ sudo btrfstune -S 1 /dev/test/scratch1
> [adam@btrfs-vm ~]$ sudo mount /dev/test/scratch1 /mnt/btrfs/
> mount: /mnt/btrfs: WARNING: source write-protected, mounted read-only.
> [adam@btrfs-vm ~]$ sudo btrfs device add -f /dev/test/scratch2 /mnt/btrfs/
> Performing full device TRIM /dev/test/scratch2 (10.00GiB) ...
> [adam@btrfs-vm ~]$ sudo touch /mnt/btrfs/file
> [adam@btrfs-vm ~]$ mount | grep mnt/btrfs
> /dev/mapper/test-scratch2 on /mnt/btrfs type btrfs
> (rw,relatime,space_cache=v2,subvolid=5,subvol=/)
>
> So it looks like it's some extra backports causing the behavior change.
>
Actually, it is caused by util-linux, specifically libmount.
With the same kernel (without the fix), v2.38 is good but
v2.39-rc1 is bad.
Unfortunately, a bunch of the libmount commits between these
versions are not bisectable, so I cannot point to a specific
commit, only to the range from 2b1db0951b9d to f07412a04ca8.
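For anyone checking which side of that range a system falls on, here is a minimal sketch. It assumes `mount --version` prints the libmount version in its usual `(libmount X.Y...: ...)` form. The helper names `libmount_version` and `libmount_is_suspect` are made up for illustration, and the 2.39 cutoff rests only on the v2.38-good / v2.39-rc1-bad observation, not on an identified commit.

```shell
#!/bin/sh
# Hypothetical helper: extract the libmount version from a line such as
# "mount from util-linux 2.39.3 (libmount 2.39.3: selinux, ...)".
libmount_version() {
    printf '%s\n' "$1" | sed -n 's/.*libmount \([0-9][0-9A-Za-z.-]*\).*/\1/p'
}

# Hypothetical helper: succeed if version $1 is 2.39 or later
# (release candidates such as "2.39-rc1" count as 2.39).
# An empty version is treated as "not suspect".
libmount_is_suspect() {
    [ -n "$1" ] || return 1
    ver=${1%%-*}        # strip a "-rc1" style suffix
    major=${ver%%.*}    # text before the first dot
    rest=${ver#*.}      # text after the first dot
    minor=${rest%%.*}   # second version component
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 39 ]; }
}
```

One way to use it: `libmount_is_suspect "$(libmount_version "$(mount --version)")" && echo "libmount >= 2.39"`.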
> But I still strongly prefer to keep it RO.
> Even if it's a different fs under the hood, it still suddenly changes
> the RO/RW status of a mount point without letting the user know.
Just my perspective: typically, we add an RW device to make a seed
filesystem writable. If one command can do it, that's great from an
ease-of-use point of view. But I’m fine with RO; it’s cleaner.
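Whichever behavior is kept, a script can avoid depending on it by checking the mount state after the device add and remounting only when needed. A small sketch, assuming the `/proc/self/mounts` line format seen in the demo output in this thread; `mount_mode` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print "ro" or "rw" for mount point $1.
# Reads /proc/self/mounts-formatted lines on stdin; if the mount point
# appears more than once, the last matching entry wins.
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw")
                    mode = opts[i]
        }
        END { if (mode != "") print mode }
    '
}
```

For example: `[ "$(mount_mode /btrfs < /proc/self/mounts)" = ro ] && mount -o remount,rw /btrfs`.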
Thanks, Anand
>
> Thanks,
> Qu
>
>> $ mkfs.btrfs -fq /dev/loop0 :0
>> $ btrfstune -S1 /dev/loop0 :0
>> $ mount /dev/loop0 /btrfs :0
>> mount: /btrfs: WARNING: source write-protected, mounted read-only.
>>
>> $ cat /proc/self/mounts | grep btrfs :0
>> /dev/loop0 /btrfs btrfs ro,relatime,space_cache=v2,subvolid=5,subvol=/
>> 0 0
>>
>> $ findmnt -o SOURCE,UUID /btrfs :0
>> SOURCE UUID
>> /dev/loop0 64f21b87-4e4c-4786-b2cd-c09a5ccd2afa
>>
>> $ btrfs fi show -m :0
>> Label: none uuid: 64f21b87-4e4c-4786-b2cd-c09a5ccd2afa
>> Total devices 1 FS bytes used 144.00KiB
>> devid 1 size 3.00GiB used 536.00MiB path /dev/loop0
>>
>> $ ls /sys/fs/btrfs :0
>> 64f21b87-4e4c-4786-b2cd-c09a5ccd2afa
>> features
>>
>> $ btrfs dev add -f /dev/loop1 /btrfs :0
>>
>> # After adding the device, the path and UUID are different,
>> # so it’s a new filesystem. (But, as I said, I’m fine with
>> # keeping it read-only and needing remount,rw.)
>>
>> $ cat /proc/self/mounts | grep btrfs :0
>> /dev/loop1 /btrfs btrfs ro,relatime,space_cache=v2,subvolid=5,subvol=/
>> 0 0
>>
>> $ findmnt -o SOURCE,UUID /btrfs :0
>> SOURCE UUID
>> /dev/loop1 948cea35-18db-45da-9ec8-3d46cb5f0413
>>
>> $ btrfs fi show -m :0
>> Label: none uuid: 948cea35-18db-45da-9ec8-3d46cb5f0413
>> Total devices 2 FS bytes used 144.00KiB
>> devid 1 size 3.00GiB used 520.00MiB path /dev/loop0
>> devid 2 size 3.00GiB used 576.00MiB path /dev/loop1
>>
>>
>> $ ls /sys/fs/btrfs :0
>> 948cea35-18db-45da-9ec8-3d46cb5f0413
>> features
>> ---------
>>
>>
>> Thanks, Anand
>>
>>> fs/btrfs/volumes.c | 4 ----
>>> 1 file changed, 4 deletions(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index dc9f54849f39..84e861dcb350 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -2841,8 +2841,6 @@ int btrfs_init_new_device(struct btrfs_fs_info
>>> *fs_info, const char *device_path
>>> set_blocksize(device->bdev_file, BTRFS_BDEV_BLOCKSIZE);
>>> if (seeding_dev) {
>>> - btrfs_clear_sb_rdonly(sb);
>>> -
>>> /* GFP_KERNEL allocation must not be under
>>> device_list_mutex */
>>> seed_devices = btrfs_init_sprout(fs_info);
>>> if (IS_ERR(seed_devices)) {
>>> @@ -2985,8 +2983,6 @@ int btrfs_init_new_device(struct btrfs_fs_info
>>> *fs_info, const char *device_path
>>> mutex_unlock(&fs_info->chunk_mutex);
>>> mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>>> error_trans:
>>> - if (seeding_dev)
>>> - btrfs_set_sb_rdonly(sb);
>>> if (trans)
>>> btrfs_end_transaction(trans);
>>> error_free_zone:
>>
>>
>