public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: kreijack@inwind.it, Qu Wenruo <wqu@suse.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v2 0/6] btrfs-progs: introduce "btrfs rescue fix-data-checksum"
Date: Sun, 1 Jun 2025 07:37:57 +0930	[thread overview]
Message-ID: <e156fb78-928b-4fea-b29d-c06f70744fdf@gmx.com> (raw)
In-Reply-To: <828702dd-33e8-48c0-85f8-455763e34ed2@libero.it>



在 2025/6/1 02:28, Goffredo Baroncelli 写道:
> On 15/05/2025 10.00, Qu Wenruo wrote:
>> [CHANGELOG]
>> v2:
>> - Rename the subcommand to "fix-data-checksum"
>>    It's better to use full name in the command name
>>
>> - Remove unused members inside corrupted_block
>>    The old @extent_bytenr and @extent_len is no longer needed, even for
>>    the future file-deletion action.
>>
>> - Fix the bitmap size off-by-1 bug
>>    We must use the bit 0 to represent mirror 1, or the bitmap size will
>>    exceed num_mirrors.
>>
>> - Introduce -i|--interactive mode
>>    Will ask the user for the action on the corrupted block, including:
>>
>>    * Ignore
>>      The default behavior if no command is provided
>>
>>    * Use specified mirror to update the data checksum item
>>      The user must input a number inside range [1, num_mirrors].
>>
>> - Introduce -m|--mirror <num> mode
>>    Use specified mirror for all corrupted blocks.
>>    The value <num> must be >= 1. And if the value is larger than the
>>    actual max mirror number, the real mirror number will be
>>    `num % (num_mirror + 1)`.
>>
>> We have a long history of data csum mismatch, caused by direct IO and
>> buffered being modified during writeback.
>>
> What about having an ioctl (on a mounted fs) which allow us to read data 
> from a file even
> if the csum doesn't match ?

The problem is again with mirrors.

Unless you ask the end user to provide a mirror number, for a with 
multiple mirrors, and the mirrors doesn't match each other, the behavior 
will be a mess.

That's why I'm planning to add something like a mirror priority, with a 
new mirror "best" to try use the mirror with the most matches.


Despite the mirror number problem (we need to ask the end user for 
mirror number), I think it's possible to implement the feature (mostly) 
in user space.

E.g. combine fiemap result with btrfs-map-logical, then read from the 
disk directly.

> I asked that because the problem usually 
> happen on a specific file
> and not to an entire filesystem. In this case I think that it would be 
> more practical to read the block
> using the IOCTL, and then rewrite the block, at the specific offset (to 
> update the checksum).

I'm fine with the idea of reading from raw data idea (although prefer to 
implement it in user space), but not a huge fan to simply re-write with COW.

E.g. the bad csum is still there, can still be exposed by scrub, even 
it's not referred by any file anymore.

> 
> Of course there are several tradeoff: the "unmounted" version, doesn't 
> duplicate a shared block,
> where my approach (read the data using an ioctl to avoid the CSUM 
> mismatch and rewrite it) make
> a fork if the block is shared. However, as told before, the problem is 
> related to specifics file and it seems
> a waste of resource reading an entire filesystem to correct few files. 
> No to mention that the IOCTL
> can be done on a "live" filesystem.

I do not think we should treat the csum error so easily, it's still a 
big problem.

Even it's known that direct IO with buffer modified halfway can lead to 
corruption, the csum mismatch is still a huge problem.
The end user should check if their problem is properly written, or use 
the latest kernel with proper backport.

If it's really caused by hardware (memory or disk or whatever), it's a 
huge deal and definitely needs proper inspection and verification.

So overall, if one hits a csum mismatch, especially after the direct IO 
fix, then it should not be treated as "something that can be easily 
fixed online", it should be something huge, at least needs some tool to 
handle it offline.

Thanks,
Qu

> 
> Said that, of course "btrfs rescue fix-data-checksum" is better than 
> nothing.
> 
> BR
> G.Baroncelli
> 
>> Although the problem is worked around in v6.15 (and being backported),
>> for the affected fs there is no good way to fix them, other than complex
>> manually find out which files are affected and delete them.
>>
>> This series introduce the initial implementation of "btrfs rescue
>> fix-data-checksum", which is designed to fix such problem by either:
>>
>> - Update the csum items using the data from specified copy
>>
>> The subcommand has 3 modes so far:
>>
>> - Readonly mode
>>    Only report all corrupted blocks and affected files, no repair is
>>    done.
>>
>> - Interactive mode
>>    Ask for what to do, including
>>
>>    * Ignore (the default)
>>    * Use certain mirror to update the checksum item
>>
>> - Mirror mode
>>    Use specified mirror to update the checksum item, the batch mode of
>>    the interactive one.
>>
>> In the future, there will be one more mode:
>>
>> - Delete mode
>>    Delete all involved files.
>>
>>    There are still some points to address before implementing this mode.
>>
>> Qu Wenruo (6):
>>    btrfs-progs: introduce "btrfs rescue fix-data-checksum"
>>    btrfs-progs: fix a bug in btrfs_find_item()
>>    btrfs-progs: fix-data-checksum: show affected files
>>    btrfs-progs: fix-data-checksum: introduce interactive mode
>>    btrfs-progs: fix-data-checksum: update csum items to fix csum mismatch
>>    btrfs-progs: fix-data-checksum: introduce -m|--mirror option
>>
>>   Documentation/btrfs-rescue.rst  |  28 ++
>>   Makefile                        |   2 +-
>>   cmds/rescue-fix-data-checksum.c | 511 ++++++++++++++++++++++++++++++++
>>   cmds/rescue.c                   |  65 ++++
>>   cmds/rescue.h                   |  10 +
>>   kernel-shared/ctree.c           |  17 +-
>>   kernel-shared/file-item.c       |   2 +-
>>   kernel-shared/file-item.h       |   5 +
>>   8 files changed, 635 insertions(+), 5 deletions(-)
>>   create mode 100644 cmds/rescue-fix-data-checksum.c
>>
>> -- 
>> 2.49.0
>>
>>
> 
> 


  reply	other threads:[~2025-05-31 22:08 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-15  8:00 [PATCH v2 0/6] btrfs-progs: introduce "btrfs rescue fix-data-checksum" Qu Wenruo
2025-05-15  8:00 ` [PATCH v2 1/6] " Qu Wenruo
2025-05-30 11:10   ` David Sterba
2025-05-15  8:00 ` [PATCH v2 2/6] btrfs-progs: fix a bug in btrfs_find_item() Qu Wenruo
2025-05-30 11:11   ` David Sterba
2025-05-15  8:00 ` [PATCH v2 3/6] btrfs-progs: fix-data-checksum: show affected files Qu Wenruo
2025-05-30 11:14   ` David Sterba
2025-05-15  8:00 ` [PATCH v2 4/6] btrfs-progs: fix-data-checksum: introduce interactive mode Qu Wenruo
2025-05-30 11:18   ` David Sterba
2025-05-15  8:00 ` [PATCH v2 5/6] btrfs-progs: fix-data-checksum: update csum items to fix csum mismatch Qu Wenruo
2025-05-30 11:19   ` David Sterba
2025-05-30 22:23     ` Qu Wenruo
2025-05-15  8:00 ` [PATCH v2 6/6] btrfs-progs: fix-data-checksum: introduce -m|--mirror option Qu Wenruo
2025-05-30 11:07 ` [PATCH v2 0/6] btrfs-progs: introduce "btrfs rescue fix-data-checksum" David Sterba
2025-05-31 16:58 ` Goffredo Baroncelli
2025-05-31 22:07   ` Qu Wenruo [this message]
2025-06-02 17:28     ` Goffredo Baroncelli
2025-06-02 22:55       ` Qu Wenruo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e156fb78-928b-4fea-b29d-c06f70744fdf@gmx.com \
    --to=quwenruo.btrfs@gmx.com \
    --cc=kreijack@inwind.it \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=wqu@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox