* [PATCH 0/3] btrfs: raid56: make raid56 to use more accurate error bitmap for error detection
@ 2022-11-07  7:32 Qu Wenruo
  2022-11-07  7:32 ` [PATCH 1/3] btrfs: raid56: introduce btrfs_raid_bio::error_bitmap Qu Wenruo
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Qu Wenruo @ 2022-11-07  7:32 UTC
  To: linux-btrfs

Currently btrfs raid56 uses stripe-based error detection.

This means any error (ranging from a single-sector csum mismatch to a
missing device) marks the whole horizontal stripe as failed.

This can lead to some unexpected behavior, for example:


             0        32K       64K
     Data 1  |XXXXXXXX|         |
     Data 2  |        |XXXXXXXXX|
     Parity  |        |         |

When reading data 1 [0, 32K), we get a csum mismatch and go down the
RAID56 recovery path.

On the old path, we mark the whole of data 1 [0, 64K) as failed, and
recover it using data 2 and parity.

But since data 2 [32K, 64K) is also corrupted, the recovered data 1
[32K, 64K) will be corrupted as well.
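
To make the failure mode above concrete, here is a minimal userspace
sketch, not kernel code. It assumes a toy RAID5 layout with one parity
stripe, 2 sectors per stripe and 4-byte sectors; all toy_* names and
TOY_* constants are hypothetical and exist only for this illustration.
It rebuilds the whole of data 1 from data 2 and parity, the way
stripe-wide error marking would, and reports which rebuilt sectors end
up wrong.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy geometry, far smaller than a real 64K stripe. */
#define TOY_NSECTORS    2
#define TOY_SECTORSIZE  4
#define TOY_STRIPE_LEN  (TOY_NSECTORS * TOY_SECTORSIZE)

/* RAID5 parity and single-stripe rebuild are both a byte-wise XOR. */
static void toy_xor(uint8_t *dst, const uint8_t *a, const uint8_t *b,
		    size_t len)
{
	for (size_t i = 0; i < len; i++)
		dst[i] = a[i] ^ b[i];
}

/*
 * Rebuild all of data 1 from data 2 + parity after the second sector
 * of data 2 has been corrupted. Returns a bitmask where bit N is set
 * if rebuilt sector N differs from the original data 1.
 */
static unsigned int toy_stripe_wide_rebuild(void)
{
	const uint8_t data1[TOY_STRIPE_LEN] = { 1, 2, 3, 4, 5, 6, 7, 8 };
	uint8_t data2[TOY_STRIPE_LEN] = { 9, 10, 11, 12, 13, 14, 15, 16 };
	uint8_t parity[TOY_STRIPE_LEN];
	uint8_t rebuilt[TOY_STRIPE_LEN];
	unsigned int bad = 0;

	toy_xor(parity, data1, data2, TOY_STRIPE_LEN);

	/* Sanity check: with a clean data 2 the rebuild is exact. */
	toy_xor(rebuilt, data2, parity, TOY_STRIPE_LEN);
	assert(memcmp(rebuilt, data1, TOY_STRIPE_LEN) == 0);

	/* Corrupt the second sector of data 2 (the [32K, 64K) analogue). */
	memset(data2 + TOY_SECTORSIZE, 0xff, TOY_SECTORSIZE);

	/* Stripe-wide marking: rebuild every sector of data 1. */
	toy_xor(rebuilt, data2, parity, TOY_STRIPE_LEN);

	for (unsigned int s = 0; s < TOY_NSECTORS; s++) {
		size_t off = (size_t)s * TOY_SECTORSIZE;

		if (memcmp(rebuilt + off, data1 + off, TOY_SECTORSIZE) != 0)
			bad |= 1U << s;
	}
	return bad;
}
```

In this sketch only the second sector of the rebuilt data 1 is wrong:
it was rebuilt from the corrupted half of data 2, while the first
sector is rebuilt correctly.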

Thankfully, this problem is mostly avoided since commit f6065f8edeb2
("btrfs: raid56: don't trust any cached sector in
__raid56_parity_recover()"): when we read the sectors in data 2 [32K,
64K), recovery discards all cached results.


This patchset changes the behavior by introducing an error bitmap that
records corrupted sectors one by one, so in the above case, at least we
won't try to recover data 1 [32K, 64K) using incorrect data.
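
As a rough illustration of per-sector error tracking, the sketch below
keeps one bit per (stripe, sector) pair and counts errors per vertical
sector position. It is a userspace toy and not the code from this
patchset: struct toy_rbio, the toy_* helpers, the 4K sector size and
the 3-stripe RAID5 layout are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: data 1, data 2, parity; 64K stripes of 4K sectors. */
#define TOY_NR_STRIPES       3
#define TOY_STRIPE_NSECTORS  16

/* Hypothetical stand-in for a raid bio; 48 of the 64 bits are used. */
struct toy_rbio {
	uint64_t error_bitmap;
};

/* Record an error for exactly one sector of one stripe. */
static void toy_mark_error(struct toy_rbio *rbio, unsigned int stripe,
			   unsigned int sector)
{
	assert(stripe < TOY_NR_STRIPES && sector < TOY_STRIPE_NSECTORS);
	rbio->error_bitmap |=
		UINT64_C(1) << (stripe * TOY_STRIPE_NSECTORS + sector);
}

/* Count how many stripes have an error at this vertical position. */
static unsigned int toy_vertical_errors(const struct toy_rbio *rbio,
					unsigned int sector)
{
	unsigned int nr = 0;

	for (unsigned int stripe = 0; stripe < TOY_NR_STRIPES; stripe++) {
		unsigned int bit = stripe * TOY_STRIPE_NSECTORS + sector;

		if (rbio->error_bitmap & (UINT64_C(1) << bit))
			nr++;
	}
	return nr;
}

/* Single parity can rebuild at most one bad sector per vertical. */
static int toy_sector_recoverable(const struct toy_rbio *rbio,
				  unsigned int sector)
{
	return toy_vertical_errors(rbio, sector) <= 1;
}
```

With the corruption pattern from the diagram (data 1 sectors 0-7 and
data 2 sectors 8-15 marked), every vertical position has exactly one
error, so each sector stays recoverable in this model, whereas counting
failed stripes as a whole would see two failures.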

The true solution to this destructive RMW problem will be read-time csum
verification, but this patchset lays the groundwork for handling extra
csum mismatch errors better (a csum mismatch will also be marked as an
error, but only for the offending sectors).

This patchset doesn't improve the raid56 destructive RMW situation by
itself, but it will make the later destructive RMW fix much easier to
implement.

Qu Wenruo (3):
  btrfs: raid56: introduce btrfs_raid_bio::error_bitmap
  btrfs: raid56: migrate recovery and scrub recovery path to use
    error_bitmap
  btrfs: raid56: remove the old error tracing system

 fs/btrfs/raid56.c | 572 ++++++++++++++++++++++++++--------------------
 fs/btrfs/raid56.h |  19 +-
 2 files changed, 334 insertions(+), 257 deletions(-)

-- 
2.38.1



Thread overview: 6+ messages
2022-11-07  7:32 [PATCH 0/3] btrfs: raid56: make raid56 to use more accurate error bitmap for error detection Qu Wenruo
2022-11-07  7:32 ` [PATCH 1/3] btrfs: raid56: introduce btrfs_raid_bio::error_bitmap Qu Wenruo
2022-11-07 17:03   ` David Sterba
2022-11-07  7:32 ` [PATCH 2/3] btrfs: raid56: migrate recovery and scrub recovery path to use error_bitmap Qu Wenruo
2022-11-07  7:32 ` [PATCH 3/3] btrfs: raid56: remove the old error tracing system Qu Wenruo
2022-11-07 17:04 ` [PATCH 0/3] btrfs: raid56: make raid56 to use more accurate error bitmap for error detection David Sterba
