public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Anand Jain <anand.jain@oracle.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>, Boris Burkov <boris@bur.io>,
	linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] btrfs: do not clear read-only when adding sprout device
Date: Fri, 18 Oct 2024 19:54:33 +0800	[thread overview]
Message-ID: <56258d9b-484d-4340-b6ad-82ac8648a03a@oracle.com> (raw)
In-Reply-To: <12c0bb30-7ee5-4aec-9fe8-f40ee01ec9a7@gmx.com>



On 18/10/24 04:47, Qu Wenruo wrote:
> 
> 
> 在 2024/10/17 03:44, Anand Jain 写道:
>> On 16/10/24 05:38, Boris Burkov wrote:
>>> If you follow the seed/sprout wiki, it suggests the following workflow:
>>>
>>> btrfstune -S 1 seed_dev
>>> mount seed_dev mnt
>>> btrfs device add sprout_dev
>>> mount -o remount,rw mnt
>>>
>>
>>
>>
>>> The first mount mounts the FS readonly, which results in not setting
>>> BTRFS_FS_OPEN, and setting the readonly bit on the sb. The device add
>>> somewhat surprisingly clears the readonly bit on the sb (though the
>>> mount is still practically readonly, from the users perspective...).
>>> Finally, the remount checks the readonly bit on the sb against the flag
>>> and sees no change, so it does not run the code intended to run on
>>> ro->rw transitions, leaving BTRFS_FS_OPEN unset.
>>>
>>> As a result, when the cleaner_kthread runs, it sees no BTRFS_FS_OPEN and
>>> does no work. This results in leaking deleted snapshots until we run out
>>> of space.
>>>
>>> I propose fixing it at the first departure from what feels reasonable:
>>> when we clear the readonly bit on the sb during device add.
>>>
>>> A new fstest I have written reproduces the bug and confirms the fix.
>>>
>>> Signed-off-by: Boris Burkov <boris@bur.io>
>>> ---
>>> Note that this is a resend of an old unmerged fix:
>>> https://lore.kernel.org/linux-
>>> btrfs/16c05d39566858bb8bc1e03bd19947cf2b601b98.1647906815.git.boris@bur.io/T/#u
>>> Some other ideas for fixing it by modifying how we set BTRFS_FS_OPEN
>>> were also explored but not merged around that time:
>>> https://lore.kernel.org/linux-btrfs/
>>> cover.1654216941.git.anand.jain@oracle.com/
>>>
>>> I don't have a strong preference, but I would really like to see this
>>> trivial bug fixed. For what it is worth, we have been carrying this
>>> patch internally at Meta since I first sent it with no incident.
>>> ---
>>
>>
>> I remember fixing this before. I tested on 5.15, and the bug isn't
>> there, but it’s back in 6.10, so something broke in between.
>> We need to track it down.
>>
>> The original design (kernel 4.x and below) makes the filesystem switch
>> to read-write mode after adding a sprout because:
>>
>>     You can’t add a device to a normal read-only filesystem
>>   so with seed read-only mount is different.
>>     With a seed device, adding a writable device transforms
>>   it into a new read-write filesystem with a _new_ FSID and
>>   fs_devices. Logically, read-write at this stage makes sense,
>>   but I’m okay without it and in fact we had fixed this before,
>>   but a patch somewhere seems to have broken it again.
>>
>>
>> (Demo below. :<x> is the return code from the 'run' command at
>>   https://github.com/asj/run.git)
>>
>>
>> ----- 5.15.0-208.159.3.2.el9uek.x86_64 ----
> 
> I also tried it on upstream kernel v5.15.94, the behavior is still the
> old changed to RW immediately after device add:
> 
> [adam@btrfs-vm ~]$ uname -a
> Linux btrfs-vm 5.15.94-1-lts #1 SMP Wed, 15 Feb 2023 07:09:02 +0000
> x86_64 GNU/Linux
> [adam@btrfs-vm ~]$ sudo mkfs.btrfs  -f /dev/test/scratch1 > /dev/null
> [adam@btrfs-vm ~]$ sudo btrfstune -S 1 /dev/test/scratch1
> [adam@btrfs-vm ~]$ sudo mount /dev/test/scratch1  /mnt/btrfs/
> mount: /mnt/btrfs: WARNING: source write-protected, mounted read-only.
> [adam@btrfs-vm ~]$ sudo btrfs device add -f /dev/test/scratch2  /mnt/btrfs/
> Performing full device TRIM /dev/test/scratch2 (10.00GiB) ...
> [adam@btrfs-vm ~]$ sudo touch /mnt/btrfs/file
> [adam@btrfs-vm ~]$ mount | grep mnt/btrfs
> /dev/mapper/test-scratch2 on /mnt/btrfs type btrfs
> (rw,relatime,space_cache=v2,subvolid=5,subvol=/)
> 
> So it looks like it's some extra backports causing the behavior change.
> 

Actually, it is caused by util-linux, specifically the libmount.
v2.38 is good, but v2.39-rc1 is bad with the same kernel without
the fix.
Unfortunately, a bunch of libmount commits between these
versions are not bisectable.
So I have no specific commit but commits from 2b1db0951b9d
to f07412a04ca8.


> But I still strongly prefer to keep it RO.
> Even if it's a different fs under the hood, it still suddenly changes
> the RO/RW status of a mount point without letting the user to know.

Just my perspective—typically, we add an RW device to make a seed
filesystem writable. If one command can do it, that's great from
ease of use pov. But I’m fine with RO; it’s cleaner.


Thanks, Anand

> 
> Thanks,
> Qu
> 
>> $ mkfs.btrfs -fq /dev/loop0 :0
>> $ btrfstune -S1 /dev/loop0 :0
>> $ mount /dev/loop0 /btrfs :0
>> mount: /btrfs: WARNING: source write-protected, mounted read-only.
>>
>> $ cat /proc/self/mounts | grep btrfs :0
>> /dev/loop0 /btrfs btrfs ro,relatime,space_cache=v2,subvolid=5,subvol=/ 
>> 0 0
>>
>> $ findmnt -o SOURCE,UUID /btrfs :0
>> SOURCE     UUID
>> /dev/loop0 64f21b87-4e4c-4786-b2cd-c09a5ccd2afa
>>
>> $ btrfs fi show -m :0
>> Label: none  uuid: 64f21b87-4e4c-4786-b2cd-c09a5ccd2afa
>>      Total devices 1 FS bytes used 144.00KiB
>>      devid    1 size 3.00GiB used 536.00MiB path /dev/loop0
>>
>> $ ls /sys/fs/btrfs :0
>> 64f21b87-4e4c-4786-b2cd-c09a5ccd2afa
>> features
>>
>> $ btrfs dev add -f /dev/loop1 /btrfs :0
>>
>> # After adding the device, the path and UUID are different,
>> # so it’s a new filesystem. (But, as I said, I’m fine with
>> # keeping it read-only and needing remount,rw.
>>
>> $ cat /proc/self/mounts | grep btrfs :0
>> /dev/loop1 /btrfs btrfs ro,relatime,space_cache=v2,subvolid=5,subvol=/ 
>> 0 0
>>
>> $ findmnt -o SOURCE,UUID /btrfs :0
>> SOURCE     UUID
>> /dev/loop1 948cea35-18db-45da-9ec8-3d46cb5f0413
>>
>> $ btrfs fi show -m :0
>> Label: none  uuid: 948cea35-18db-45da-9ec8-3d46cb5f0413
>>      Total devices 2 FS bytes used 144.00KiB
>>      devid    1 size 3.00GiB used 520.00MiB path /dev/loop0
>>      devid    2 size 3.00GiB used 576.00MiB path /dev/loop1
>>
>>
>> $ ls /sys/fs/btrfs :0
>> 948cea35-18db-45da-9ec8-3d46cb5f0413
>> features
>> ---------
>>
>>
>> Thanks, Anand
>>
>>>   fs/btrfs/volumes.c | 4 ----
>>>   1 file changed, 4 deletions(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index dc9f54849f39..84e861dcb350 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -2841,8 +2841,6 @@ int btrfs_init_new_device(struct btrfs_fs_info
>>> *fs_info, const char *device_path
>>>       set_blocksize(device->bdev_file, BTRFS_BDEV_BLOCKSIZE);
>>>       if (seeding_dev) {
>>> -        btrfs_clear_sb_rdonly(sb);
>>> -
>>>           /* GFP_KERNEL allocation must not be under 
>>> device_list_mutex */
>>>           seed_devices = btrfs_init_sprout(fs_info);
>>>           if (IS_ERR(seed_devices)) {
>>> @@ -2985,8 +2983,6 @@ int btrfs_init_new_device(struct btrfs_fs_info
>>> *fs_info, const char *device_path
>>>       mutex_unlock(&fs_info->chunk_mutex);
>>>       mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>>>   error_trans:
>>> -    if (seeding_dev)
>>> -        btrfs_set_sb_rdonly(sb);
>>>       if (trans)
>>>           btrfs_end_transaction(trans);
>>>   error_free_zone:
>>
>>
> 


  reply	other threads:[~2024-10-18 11:54 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-15 21:38 [PATCH] btrfs: do not clear read-only when adding sprout device Boris Burkov
2024-10-15 22:00 ` Qu Wenruo
2024-10-15 22:12   ` Qu Wenruo
2024-10-15 23:23     ` Boris Burkov
2024-10-16 17:14 ` Anand Jain
2024-10-16 17:24   ` Boris Burkov
2024-10-17 20:47   ` Qu Wenruo
2024-10-18 11:54     ` Anand Jain [this message]
2024-10-17 14:01 ` David Sterba
2024-10-17 16:41   ` Boris Burkov
2024-10-21 18:56     ` David Sterba
2024-10-21 19:29       ` Boris Burkov
  -- strict thread matches above, loose matches on Subject: below --
2022-03-21 23:56 Boris Burkov
2022-03-22 21:46 ` Josef Bacik
2022-03-23  0:52 ` Naohiro Aota
2022-03-23 18:16   ` Boris Burkov
2022-03-28 11:11     ` Anand Jain
2022-03-29  4:33       ` Naohiro Aota
2022-03-29 19:45         ` Boris Burkov
2022-03-23 10:44 ` Anand Jain
2022-03-23 18:25   ` Boris Burkov
2022-03-24 11:16     ` Anand Jain
2022-03-23 20:17   ` Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56258d9b-484d-4340-b6ad-82ac8648a03a@oracle.com \
    --to=anand.jain@oracle.com \
    --cc=boris@bur.io \
    --cc=kernel-team@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox