linux-block.vger.kernel.org archive mirror
* What would happen if the block device driver/firmware found some block of a bio is corrupted?
@ 2023-01-24  2:31 Qu Wenruo
  2023-01-24  4:44 ` Keith Busch
  0 siblings, 1 reply; 7+ messages in thread
From: Qu Wenruo @ 2023-01-24  2:31 UTC
  To: linux-block@vger.kernel.org, Linux FS Devel

Hi,

I'm wondering what happens if we submit a read bio covering multiple
sectors, and the block device driver/firmware keeps internal checksums
and finds that just one of those sectors is corrupted (it mismatches
the internal csum).

For example, we submit a 16KiB read bio, and the device has a 4KiB
sector size (like most modern HDDs/SSDs).
The corruption is in the 2nd sector of the 16KiB.

My instinct says it has to be one of these:

A) Mark the whole 16KiB bio as BLK_STS_IOERR
    This means that even though we have 3 good sectors, we have to
    treat all 4 as errors.

B) Ignore the error and mark the bio as BLK_STS_OK
    This means the higher layers must have their own way to verify the
    contents.
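
To make the two options concrete, here is a minimal userspace sketch in
Python. It is not kernel code: read_bio(), the policy argument and the
string status values are hypothetical stand-ins, and it only models what
the submitter would see for the 16KiB example above.

```python
# Userspace model only; none of these names are kernel APIs.
SECTOR_SIZE = 4096               # 4KiB sectors, as in the example
BLK_STS_OK = "BLK_STS_OK"        # stand-ins for the kernel status codes
BLK_STS_IOERR = "BLK_STS_IOERR"


def read_bio(first_sector, nr_sectors, bad_sectors, policy):
    """Return (status, what the submitter may assume about each sector).

    policy "A": any bad sector in the range fails the whole bio.
    policy "B": the bio always completes OK; bad data is passed through.
    """
    sectors = range(first_sector, first_sector + nr_sectors)
    hit = any(s in bad_sectors for s in sectors)
    if policy == "A" and hit:
        # One status for the whole bio: the 3 good sectors are lost too.
        return BLK_STS_IOERR, [False] * nr_sectors
    # The submitter is told everything is fine, even if it is not.
    return BLK_STS_OK, [True] * nr_sectors


# 16KiB read = 4 sectors; the 2nd sector (index 1) is corrupted.
nr = 16 * 1024 // SECTOR_SIZE
status_a, usable_a = read_bio(0, nr, {1}, "A")
status_b, usable_b = read_bio(0, nr, {1}, "B")
print(status_a, usable_a)
print(status_b, usable_b)
```

In this model the bio carries a single status, so neither option can
tell the submitter which of the 4 sectors is the bad one.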

But my concern is that if we go with path A), then after a read bio
fails we have to retry the read with progressively smaller sizes, until
we narrow the failure down to a single sector.
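
As a rough illustration of that retry loop, the sketch below halves a
failed range until each failure is pinned to one sector. It is a
userspace model under assumptions: device_read() is a hypothetical
device that behaves like path A), and bisection is just one possible
split strategy, not necessarily what any filesystem actually does.

```python
def device_read(first, count, bad_sectors):
    """Hypothetical path-A device: fails the request if any sector is bad."""
    return not any(s in bad_sectors for s in range(first, first + count))


def split_and_retry(first, count, bad_sectors, stats):
    """Return the sectors that still fail when read on their own."""
    stats["reads"] += 1
    if device_read(first, count, bad_sectors):
        return []                      # the whole range read fine
    if count == 1:
        return [first]                 # failure narrowed to one sector
    half = count // 2
    # Retry each half separately; the left half is visited first,
    # so the result comes back in ascending sector order.
    return (split_and_retry(first, half, bad_sectors, stats)
            + split_and_retry(first + half, count - half, bad_sectors, stats))


stats = {"reads": 0}
bad = split_and_retry(0, 4, {1}, stats)   # 16KiB bio, 2nd sector bad
print(bad, stats["reads"])
```

For the 16KiB example this model issues 5 reads in total (the original
one plus 4 retries) to isolate the single bad sector, which is the extra
cost path A) pushes onto the submitter.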

IIRC the VFS does some retrying, but otherwise, when the FS/driver
layer issues its own internal reads and hits an error, it has to do the
split-and-retry manually.

On the other hand, path B) seems more straightforward, but the problem
is also obvious: corrupted data is returned as if it were good.
Thankfully, most filesystems already checksum at least their metadata.
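
A minimal sketch of what that upper-layer verification could look like,
assuming the filesystem recorded one checksum per 4KiB sector at write
time. zlib.crc32 is only a stand-in for whatever checksum a real
filesystem stores, and the helper names are hypothetical.

```python
import zlib

SECTOR_SIZE = 4096


def checksum_sectors(data):
    """One CRC32 per 4KiB sector; a stand-in for the FS's stored csums."""
    return [zlib.crc32(data[off:off + SECTOR_SIZE])
            for off in range(0, len(data), SECTOR_SIZE)]


def find_corrupt_sectors(data, expected):
    """Return the indexes of sectors whose checksum does not match."""
    actual = checksum_sectors(data)
    return [i for i, (got, want) in enumerate(zip(actual, expected))
            if got != want]


good = bytes(range(256)) * 64           # 16KiB of known contents
expected = checksum_sectors(good)       # checksums recorded at write time

# Flip one byte inside the 2nd sector, as if the device returned bad
# data while still completing the bio with BLK_STS_OK.
corrupted = bytearray(good)
corrupted[SECTOR_SIZE + 10] ^= 0xFF
print(find_corrupt_sectors(bytes(corrupted), expected))
```

With per-sector checksums the upper layer can point at the bad sector
directly, without any extra reads, but only for data it actually
checksums.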


So what's the common solution in the real world for device
drivers/firmware?
Path A, path B, or something else?

And should the upper layers do the extra split-and-retry themselves?
I know the btrfs scrub and repair code does such split-and-retry, but
I'm not 100% sure whether it is really needed or helpful in the real
world.

Thanks,
Qu

Thread overview: 7+ messages
2023-01-24  2:31 What would happen if the block device driver/firmware found some block of a bio is corrupted? Qu Wenruo
2023-01-24  4:44 ` Keith Busch
2023-01-24  5:38   ` Qu Wenruo
2023-01-24  6:32     ` Christoph Hellwig
2023-01-24  7:52       ` Qu Wenruo
2023-01-24  7:54         ` Christoph Hellwig
2023-01-24 16:08       ` Matthew Wilcox
