public inbox for linux-btrfs@vger.kernel.org
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Kyle Smith <mr.kyle.smith@gmail.com>, fdmanana@kernel.org
Cc: linux-btrfs@vger.kernel.org
Subject: Re: mounting causes errors after power loss
Date: Fri, 16 Feb 2024 11:19:19 +1030
Message-ID: <ae41aadd-d4bf-4cb7-9b53-2e44ceeff6b2@gmx.com>
In-Reply-To: <CAKb79g1KrW2KVdZkThu6X26wMKTyErq-eT+r555H4kXCTGDa1w@mail.gmail.com>



On 2024/2/16 10:51, Kyle Smith wrote:
> On Thu, Feb 15, 2024 at 3:23 PM Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
[...]
>>
>> Those are all fixable by the latest btrfs-progs, so no big deal.
>>
>> Furthermore, this is not caused by some powerloss, but more like some
>> older btrfs bugs.
>> Or sometimes even memory bitflips (this need extra debugging to confirm).
>>
>> By all means, it's recommended to use kernel newer than v5.11 at least
>> (thus recommended to go at least 5.15).
>
> I'm currently using OpenWrt 22.03.5 which uses the 5.10 kernel, and I
> am eventually going to move to OpenWrt 23.05 with the 5.15 kernel. In
> the meantime, are there any btrfs patches that I should backport to
> the 5.10 kernel?

I don't think so, unless you want to backport all the tree-checker code
to the 5.10 kernel.

> Is there any problem upgrading the kernel from 5.10
> to 5.15 while btrfs has these errors?

Still hard to say. See my reply about the dump below.

> Would upgrading alone be enough
> to fix these errors or is a "btrfs check --repair" required?

--repair is required. The corruption is already on disk, so upgrading
the kernel alone will not fix it.
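
A minimal sketch of one possible order of operations, assuming the
filesystem is unmounted first. The helper name repair_plan is
hypothetical, and it only prints the commands instead of running them,
so the plan can be reviewed before anything touches the disk:

```shell
# Hypothetical helper: print (do not execute) a check/repair sequence
# for an unmounted btrfs device given as the first argument.
repair_plan() {
    dev=$1
    # 1. Read-only pass to record the errors before changing anything.
    printf '%s\n' "btrfs check --readonly $dev"
    # 2. The actual repair; this writes to the device.
    printf '%s\n' "btrfs check --repair $dev"
    # 3. Second read-only pass to confirm the errors are gone.
    printf '%s\n' "btrfs check --readonly $dev"
}
```

For example, "repair_plan /dev/mapper/luks-part" prints the three
commands for the device used elsewhere in this thread.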

>
> OpenWrt also provides btrfs-progs 6.0.1. Is this version new enough to
> safely and reliably fix these errors? "btrfs check --repair" has been
> successful everytime.

I believe it should be fine.

>
> Please provide the debug steps to check for memory bitflips. The
> system has been very stable so while I don't think this is a memory
> issue it would be good to rule it out.

If the system is x86_64 based, you can boot a UEFI payload such as
memtest86+.

If not, you can run the memtester program.

>
[...]
>>
>> Unfortunately the dump is not enough to confirm anything.
>>
>> Please try the following ones:
>>
>> # btrfs ins dump-tree -t /dev/mapper/luks-part | grep -A5 "(27535265
>> DIR_INDEX 55000957)"
>>
>> # btrfs ins dump-tree -t /dev/mapper/luks-part | grep -A5 "(27535266
>> DIR_INDEX 55000959)"
>>
>> After the direct match, there would be a line like:
>>
>>          location key (XXXX INODE_ITEM 0) type XXX
>>
>> Use that key to do such search again.
>
> I wasn't able to find the  "(27535265 DIR_INDEX 55000957)" or
> "(27535266 DIR_INDEX 55000959)"  strings in the dump. Here are the
> lines matching any of those values. I get the same output with "-t 5"
> or just removing the option. "-t" alone was throwing an error.
>
> # btrfs ins dump-tree /dev/mapper/luks-part | grep -A3 -E
> "27535265|55000957|27535266|55000959"
>      item 61 key (256 DIR_INDEX 55000957) itemoff 13638 itemsize 45
>          location key (27535265 INODE_ITEM 0) type FILE
>          transid 17119099 data_len 0 name_len 15
>          name: .sharedContents
>      item 62 key (256 DIR_INDEX 55000959) itemoff 13593 itemsize 45
>          location key (27535266 INODE_ITEM 0) type FILE
>          transid 17119099 data_len 0 name_len 15
>          name: .sharedContents
>      item 63 key (256 DIR_INDEX 55415388) itemoff 13545 itemsize 48

So it really means that inodes 27535265 and 27535266 are gone.
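
The same check can be sketched as a filter over the dump-tree text: it
collects every inode number referenced by a "location key (N INODE_ITEM
0)" line and reports those with no "item ... key (N INODE_ITEM 0)" entry
of their own. The function name find_dangling_inodes is hypothetical,
and the sketch assumes the dump covers a single tree (e.g. "-t 5"),
since inode numbers can repeat across subvolumes:

```shell
# Hypothetical helper: read dump-tree text on stdin and print inode
# numbers that a directory entry points at but that have no INODE_ITEM.
find_dangling_inodes() {
    awk '
        # Inode item itself: item N key (INO INODE_ITEM 0) ...
        $1 == "item" && $3 == "key" && $5 == "INODE_ITEM" {
            present[substr($4, 2)] = 1      # drop the leading "("
        }
        # Directory entry target: location key (INO INODE_ITEM 0) type ...
        $1 == "location" && $2 == "key" && $4 == "INODE_ITEM" {
            referenced[substr($3, 2)] = 1
        }
        END {
            for (ino in referenced)
                if (!(ino in present))
                    print ino
        }
    ' | sort -n
}
```

It could be fed with something like
"btrfs ins dump-tree -t 5 /dev/mapper/luks-part | find_dangling_inodes".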

It may be related to the transaction split in older kernels, as the
deletion of the inode item and of those dir items should happen in the
same transaction.

But it's a pretty old kernel, so I'm not sure whether it's possible to
pin down the fixing/offending commit.

In that case, there is no obvious memory bitflip.
But since the damage is already done, a --repair is required.
>
> I see these two "location key" lines but no new key values to search
> for. Should I be looking for something else?
>
>          location key (27535265 INODE_ITEM 0) type FILE
>          location key (27535266 INODE_ITEM 0) type FILE

Considering that's the only error, either those two inode items really
are missing, or the dir index items were not properly deleted.

[...]
>>
>> For your case, it's completely unrelated, but I'd like more dump to make
>> sure it's not some weird memory bitflip.
>
> This is good to know. Can I rule out the lower LUKS layer and the disk
> firmware since I'm not seeing a transid mismatch? These btrfs errors
> are the only problems I've had with LUKS2 on eMMC.

The problem is that you can only tell whether something is wrong with
flush once you have already hit a transid error.

So set aside flush-related problems for now.

>
> Please let me know about backporting any relevant btrfs patches or
> debugging a possible memory bitflip.

I don't have a good idea of how this happened.

Adding Filipe, as he may know which commit is the cause/fix.

Thanks,
Qu

>
> Thank you for your quick help.
>
>
> Kyle
>
>> Thanks,
>> Qu
>>
>>>
>>>
>>> Thank you,
>>> Kyle
>>>
>>> [0]: https://lore.kernel.org/linux-btrfs/CA+XNQ=ixcfB1_CXHf5azsB4gX87vvdmei+fxv5dj4K_4=H1=ag@mail.gmail.com/
>>>
