From: Anand Jain <anand.jain@oracle.com>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org, dsterba@suse.cz
Subject: Re: [PATCH RFC] btrfs-progs: btrfstune -m|M remove 2-stage commit
Date: Sat, 19 Aug 2023 19:14:40 +0800 [thread overview]
Message-ID: <df569834-1949-5c1e-dd99-8da105a10b38@oracle.com> (raw)
In-Reply-To: <ef88fc2a-b845-4637-b006-43ecc511d9fd@suse.com>
On 18/08/2023 17:27, Qu Wenruo wrote:
>
>
> On 2023/8/18 17:21, Anand Jain wrote:
>> On 17/08/2023 19:52, David Sterba wrote:
>>> On Tue, Aug 15, 2023 at 08:53:24PM +0800, Anand Jain wrote:
>>>> The btrfstune -m|M operation changes the fsid and preserves the
>>>> original
>>>> fsid in the metadata_uuid.
>>>>
>>>> Changing the fsid happens in two commits in the function
>>>> set_metadata_uuid()
>>>> - Stage 1 commits the superblock with the CHANGING_FSID_V2 flag.
>>>> - Stage 2 updates the fsid in fs_info->super_copy (local memory),
>>>> resets the CHANGING_FSID_V2 flag (local memory),
>>>> and then calls commit.
>>>>
>>>> The two-stage operation with the CHANGING_FSID flag makes sense for
>>>> btrfstune -u|U, where the fsid is changed on each and every tree block,
>>>> involving many commits. The CHANGING_FSID flag helps to identify the
>>>> transient incomplete operation.
>>>>
>>>> However, for btrfstune -m|M, we don't need the CHANGING_FSID_V2
>>>> flag, and
>>>> a single commit would have been sufficient. The original git commit
>>>> that
>>>> added it, 493dfc9d1fc4 ("btrfs-progs: btrfstune: Add support for
>>>> changing
>>>> the metadata uuid"), provides no reasoning or did I miss something?
>>>> (So marking this patch as RFC).
>>>
>>> I remember discussions with Nikolay about the corner cases that could
>>> happen due to interrupted conversion and there was a document explaining
>>> that. Part of that ended up in kernel in the logic to detect partially
>>> enabled metadata_uuid. So there was reason to do it in two phases but
>>> I'd have to look it up in mails or if I still have a link or copy of the
>>> document.
>>
>>
>> On 18/08/2023 08:21, Qu Wenruo wrote:
>>
>> > Oh, my memory comes back, the original design for the two stage
>> > commitment is to avoid split brain cases when one device is committed
>> > with the new flag, while the remaining one doesn't.
>> >
>> > With the extra stage, even if at stage 1 or 2 the transaction is
>> > interrupted and only one device got the new flag, it can help us to
>> > locate the stage and recover.
>>
>> It is useful for `btrfstune -u`
>> when there are many transaction commits to write. It uses the
>> `CHANGING_FSID` flag for this purpose. Any device with the
>> `CHANGING_FSID` flag fails to mount, and `btrfstune` should be called
>> again to continue rewrite the new FSID. This is a fair process.
>>
>>
>> However, in the case of `CHANGING_FSID_V2`, which the `btrfstune -m`
>> command uses, only one transaction is required. How does this help?
>>
>> Disk1 Disk2
>>
>> Commit1 CHANGING_FSID_V2 CHANGING_FSID_V2
>> Commit2 new-fsid new-fsid
>>
>>
>>
>
> Instead if there is only one transaction to enable metadata_uuid
> feature, we can have the following situation where for the single
> transaction we only committed on disk 1:
>
> Disk 1 Disk 2
> Meta uuid Regular UUID
>
> Then how do kernel/progs proper recover from this situation?
>
> Although I'd say, it's still possible to recover, but significantly more
> difficult, as we need to properly handle different generations of super
> blocks.
>
> For the two stage one, it's a little simpler but I'm not sure if we have
> the extra handling. E.g. if commit 1 failed partially:
>
> Disk 1 Disk 2
> Changing_fsid_v2 no changing_fsid_v2
>
> In this case, we can detects we're trying to start fsid change using
> metadata uuid.
>
> The same thing for the 2nd commit.
The changing_fsid_v2 flag an unnecessary overhead, right? As it could be
something like:
. Write new-fsid and Verify. Revert if needed and verify.
------
Summarizing the overall patches:
Patchset [1] is a port of kernel changing_fsid_v2 recovery automation
functions to the progs. So, hosts with older progs and can't upgrade to
find a tmp host with the newer progs to fix the disks?
[1] [PATCH 00/16] btrfs-progs: recover from failed metadata_uuid
Automation in progs/kernel cannot fix all possible scenarios; we still
need user intervention to reunite, as in patchset [2]. It adds --device
and --noscan options to reunite. (Needs comments, so that I can rebase).
[2] [PATCH 00/10] btrfs-progs: check and tune: add device and noscan
options
Patchset [3] removes the changing_fsid_v2 recovery code from the kernel,
as pointed out- recovery code is robust in general, except in corner
cases. So, after this, similar to changing_fsid, disks with
changing_fsid_v2 are rejected.
But when progs can recover the failed situation and could remove the use
of the changing_fsid_v2 flag (patch [4]), we don't actually need the
recovery in the kernel.
[3] [PATCH] btrfs: reject device with CHANGING_FSID_V2
[4] [PATCH RFC] btrfs-progs: btrfstune -m|M remove 2-stage commit
-------
Thanks,
Anand
next prev parent reply other threads:[~2023-08-19 11:17 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-15 12:53 [PATCH RFC] btrfs-progs: btrfstune -m|M remove 2-stage commit Anand Jain
2023-08-16 2:17 ` Qu Wenruo
2023-08-17 11:52 ` David Sterba
2023-08-18 9:21 ` Anand Jain
2023-08-18 9:27 ` Qu Wenruo
2023-08-19 11:14 ` Anand Jain [this message]
2023-09-18 22:38 ` David Sterba
2023-09-20 23:01 ` Anand Jain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=df569834-1949-5c1e-dd99-8da105a10b38@oracle.com \
--to=anand.jain@oracle.com \
--cc=dsterba@suse.cz \
--cc=linux-btrfs@vger.kernel.org \
--cc=wqu@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).