From: Anand Jain <anand.jain@oracle.com>
To: Qu Wenruo <wqu@suse.com>, dsterba@suse.cz
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] btrfs: reject device with CHANGING_FSID_V2
Date: Fri, 18 Aug 2023 16:27:58 +0800
Message-ID: <f45f036b-72df-1ec6-ac32-4a2b98be5b3a@oracle.com>
In-Reply-To: <5fafa142-7995-4603-a6b2-8360dda8e7fb@suse.com>

On 18/08/2023 08:21, Qu Wenruo wrote:
>
>
> On 2023/8/18 01:19, Anand Jain wrote:
>>
>>
>> On 17/8/23 20:04, David Sterba wrote:
>>> On Wed, Aug 16, 2023 at 08:30:40PM +0800, Anand Jain wrote:
>>>> The BTRFS_SUPER_FLAG_CHANGING_FSID_V2 flag indicates a transient state
>>>> where the device in the userspace btrfstune -m|-M operation failed to
>>>> complete changing the fsid.
>>>>
>>>> This flag makes the kernel automatically determine the other
>>>> partner devices to which a given device can be associated, based on the
>>>> fsid, metadata_uuid and generation values.
>>>>
>>>> The btrfstune -m|-M feature is especially useful in virtual cloud
>>>> setups, where compute instances (disk images) are quickly copied,
>>>> fsid changed, and launched. Given numerous disk images with the same
>>>> metadata_uuid but different fsid, there's no clear way a device can
>>>> be correctly assembled with the proper partners when the
>>>> CHANGING_FSID_V2 flag is set. So, the disk could be assembled
>>>> incorrectly, as in the example below:
>>>>
>>>> Before this patch:
>>>>
>>>> Consider the following two filesystems:
>>>> /dev/loop[2-3] are raw copies of /dev/loop[0-1], and the
>>>> btrfstune -m operation fails.
>>>>
>>>> In this scenario, /dev/loop0's fsid change is interrupted, and the
>>>> CHANGING_FSID_V2 flag is set, as shown below.
>>>>
>>>> $ p="device|devid|^metadata_uuid|^fsid|^incom|^generation|^flags"
>>>>
>>>> $ btrfs inspect dump-super /dev/loop0 | egrep "$p"
>>>> superblock: bytenr=65536, device=/dev/loop0
>>>> flags 0x1000000001
>>>> fsid 7d4b4b93-2b27-4432-b4e4-4be1fbccbd45
>>>> metadata_uuid bb040a9f-233a-4de2-ad84-49aa5a28059b
>>>> generation 9
>>>> num_devices 2
>>>> incompat_flags 0x741
>>>> dev_item.devid 1
>>>>
>>>> $ btrfs inspect dump-super /dev/loop1 | egrep "$p"
>>>> superblock: bytenr=65536, device=/dev/loop1
>>>> flags 0x1
>>>> fsid 11d2af4d-1b71-45a9-83f6-f2100766939d
>>>> metadata_uuid bb040a9f-233a-4de2-ad84-49aa5a28059b
>>>> generation 10
>>>> num_devices 2
>>>> incompat_flags 0x741
>>>> dev_item.devid 2
>>>>
>>>> $ btrfs inspect dump-super /dev/loop2 | egrep "$p"
>>>> superblock: bytenr=65536, device=/dev/loop2
>>>> flags 0x1
>>>> fsid 7d4b4b93-2b27-4432-b4e4-4be1fbccbd45
>>>> metadata_uuid bb040a9f-233a-4de2-ad84-49aa5a28059b
>>>> generation 8
>>>> num_devices 2
>>>> incompat_flags 0x741
>>>> dev_item.devid 1
>>>>
>>>> $ btrfs inspect dump-super /dev/loop3 | egrep "$p"
>>>> superblock: bytenr=65536, device=/dev/loop3
>>>> flags 0x1
>>>> fsid 7d4b4b93-2b27-4432-b4e4-4be1fbccbd45
>>>> metadata_uuid bb040a9f-233a-4de2-ad84-49aa5a28059b
>>>> generation 8
>>>> num_devices 2
>>>> incompat_flags 0x741
>>>> dev_item.devid 2
>>>>
>>>>
>>>> It is normal that some devices aren't instantly discovered during
>>>> system boot or iSCSI discovery. The controlled scan below demonstrates
>>>> this.
>>>>
>>>> $ btrfs device scan --forget
>>>> $ btrfs device scan /dev/loop0
>>>> Scanning for btrfs filesystems on '/dev/loop0'
>>>> $ mount /dev/loop3 /btrfs
>>>> $ btrfs filesystem show -m
>>>> Label: none uuid: 7d4b4b93-2b27-4432-b4e4-4be1fbccbd45
>>>> Total devices 2 FS bytes used 144.00KiB
>>>> devid 1 size 300.00MiB used 48.00MiB path /dev/loop0
>>>> devid 2 size 300.00MiB used 40.00MiB path /dev/loop3
>>>>
>>>> /dev/loop0 and /dev/loop3 are incorrectly partnered.
>>>>
>>>> This kernel patch removes functions and code connected to the
>>>> CHANGING_FSID_V2 flag.
>>>
>>> I didn't have a closer look but it seems you're removing all the logic
>>> to make the metadata uuid robust and usable in case of interrupted
>>> conversion, while finding another case where it does not work as you
>>> expect. With this it would be a change in behaviour, we need to check
>>> the original use case. IIRC as the metadata uuid change is lightweight
>>> we want to try harder to deal with the easy errors instead of rejecting
>>> the filesystem mount.
>>
>> Robust indeed. But isn't silently assembling the wrong devices a
>> data-loss risk? Failing to assemble is still safe.
>>
>> I think it is better to introduce a sub-command to clone a btrfs
>> filesystem with a new device-uuid and the same fsid (as it looks like
>> the same fsid has some use case).
>>
>> Thanks, Anand
>
> Oh, my memory comes back: the original design for the two-stage
> commit is to avoid split-brain cases where one device is committed
> with the new flag while the remaining one isn't.
>
> With the extra stage, even if at stage 1 or 2 the transaction is
> interrupted and only one device got the new flag, it can help us to
> locate the stage and recover.

As this comment is about the btrfstune patch

  [PATCH RFC] btrfs-progs: btrfstune -m|M remove 2-stage commit

let's discuss it there.

Thanks.
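
Note on the "flags" field in the dump-super output quoted above: the value 0x1000000001 on /dev/loop0 can be decoded as below. This is a minimal sketch, assuming the bit positions used by the kernel's btrfs on-disk format headers (BTRFS_HEADER_FLAG_WRITTEN as bit 0 and BTRFS_SUPER_FLAG_CHANGING_FSID_V2 as bit 36). The helper name has_changing_fsid_v2 is hypothetical and is not part of the kernel or btrfs-progs; it only illustrates the "reject device with CHANGING_FSID_V2" idea from the patch subject, not the actual kernel code.

```python
# Bit positions assumed from the kernel's btrfs on-disk format headers.
BTRFS_HEADER_FLAG_WRITTEN = 1 << 0
BTRFS_SUPER_FLAG_CHANGING_FSID_V2 = 1 << 36

# "flags" values copied from the dump-super output quoted in this thread.
SUPER_FLAGS = {
    "/dev/loop0": 0x1000000001,
    "/dev/loop1": 0x1,
    "/dev/loop2": 0x1,
    "/dev/loop3": 0x1,
}


def has_changing_fsid_v2(flags: int) -> bool:
    """Hypothetical helper: True if an fsid change was left incomplete."""
    return bool(flags & BTRFS_SUPER_FLAG_CHANGING_FSID_V2)


# Devices that a "reject on CHANGING_FSID_V2" policy would refuse to assemble.
rejected = [dev for dev, flags in SUPER_FLAGS.items() if has_changing_fsid_v2(flags)]
print(rejected)  # ['/dev/loop0']
```

With such a check, /dev/loop0 would be refused at scan time instead of being partnered with /dev/loop3.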

Thread overview: 7+ messages
2023-08-16 12:30 [PATCH] btrfs: reject device with CHANGING_FSID_V2 Anand Jain
2023-08-17 12:04 ` David Sterba
2023-08-17 17:19 ` Anand Jain
2023-08-18 0:21 ` Qu Wenruo
2023-08-18 8:27 ` Anand Jain [this message]
2023-09-18 22:44 ` David Sterba
2023-09-19 11:40 ` Anand Jain