public inbox for linux-btrfs@vger.kernel.org
From: Qu Wenruo <wqu@suse.com>
To: Przemek Klosowski <przemek.klosowski@gmail.com>,
	linux-btrfs@vger.kernel.org
Subject: Re: Fwd: HTML message rejected: btrfs checksum error
Date: Tue, 29 Apr 2025 15:02:30 +0930
Message-ID: <1be5f421-a36e-4a27-8c4b-73140f94a217@suse.com>
In-Reply-To: <CAC=1GgGRobZ7sMN6iBExMuYCRzNyei_mngkyRd=kvOX9rj90Lg@mail.gmail.com>



On 2025/4/29 11:55, Przemek Klosowski wrote:
> I have a RAID1 btrfs root/home on Fedora 42 that developed what
> appears to be a single data checksum error. RAM tests fine, but it's a
> DELL system that had memory problems early on (years ago), that were
> fixed by Dell BIOS memory tests  (which changed the mem controller
> settings).
> 
> The errors seem to have started right after a scrub (see btrfs
> messages from journal below)
> 
> btrfs check --readonly --force --check-data-csum -p /dev/nvme0n1p2
> 
> shows a cascade of errors (which seem to be increasing in number)
> ..
> [4/7] checking fs roots                        (0:00:04 elapsed, 60923
> items checked)
> mirror 1 bytenr 299511672832 csum 0x125beb3c expected csum 0xc8374bb5
> mirror 1 bytenr 299511676928 csum 0x4c6adf72 expected csum 0xd82f54b8
> mirror 2 bytenr 299511672832 csum 0x125beb3c expected csum 0xc8374bb5
> mirror 2 bytenr 299511676928 csum 0x4c6adf72 expected csum 0xd82f54b8
> mirror 1 bytenr 306513821696 csum 0x8941f998 expected csum 0xa5fe1bfd
> mirror 1 bytenr 306513825792 csum 0x8941f998 expected csum 0x77c755d4
> .. and many more
> 
> I can recover the file with only 1 4kB block zeroed out.
> 
> Is there a way to read the bad sector? I thought that
> mount -o ro,degraded,rescue=ignoredatacsums /dev/sda5 /mnt
> would read data ignoring the bad checksum? as it is, it replicates the
> I/O error that is raised when reading the original file.

This turns out to be a bug in the implementation: the option is meant to
ignore data csum errors and return the data as read, but that is not
implemented if the csum tree is still valid...

I'll send out a patch for that, but it will also mean that with the
rescue=idatacsums mount option, the data returned will simply be the
first copy btrfs reads.

It'll be fine for your case, as both mirrors have the same csum.
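
Whether both mirrors really hold identical data for every reported block
can be checked from the `btrfs check` output itself. A minimal Python
sketch, assuming each error line has the form
`mirror N bytenr B csum X expected csum Y` as in the quoted output, and
that the first csum is the one computed from the data on that mirror.
The function names are made up for illustration:

```python
import re
from collections import defaultdict

# Matches e.g. "mirror 1 bytenr 299511672832 csum 0x125beb3c expected csum ..."
LINE_RE = re.compile(r"\bmirror (\d+) bytenr (\d+) csum (0x[0-9a-f]+)")


def csums_by_block(check_output: str) -> dict[int, dict[int, str]]:
    """Map each bytenr to {mirror number: csum reported for that mirror}."""
    blocks: dict[int, dict[int, str]] = defaultdict(dict)
    for mirror, bytenr, csum in LINE_RE.findall(check_output):
        blocks[int(bytenr)][int(mirror)] = csum
    return dict(blocks)


def mirrors_agree(check_output: str) -> dict[int, bool]:
    """True only if at least two mirrors were reported and all csums match."""
    return {
        bytenr: len(per_mirror) > 1 and len(set(per_mirror.values())) == 1
        for bytenr, per_mirror in csums_by_block(check_output).items()
    }
```

Blocks reported for only one mirror are flagged as not agreeing, since
nothing can be concluded from a single copy.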

> 
> Do you think that deleting the file with the bad checksum will solve
> this?

Yes.

> or should I move to rebuilding and restoring from backups?

No need. "btrfs check --check-data-csum" is the most comprehensive check
we have (better than scrub because of its comprehensive metadata checks),
and so far it has only reported data checksum errors.

You will still need to find all the affected files, though. Scrub does a
good job of resolving the paths, but its output may be rate-limited.

I'd recommend writing a small script that feeds each unique affected
bytenr into `btrfs ins logical` (short for `btrfs inspect-internal
logical-resolve`) to get the full paths of the affected files.
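
Such a script could look roughly like the sketch below. It assumes the
`btrfs check --check-data-csum` output was saved to a file, that the
filesystem is mounted (the mount point is passed as the second argument),
and that it runs as root. The file and function names are hypothetical:

```python
import re
import subprocess
import sys

# Matches e.g. "mirror 1 bytenr 299511672832 csum 0x125beb3c expected csum ..."
BYTENR_RE = re.compile(r"\bmirror \d+ bytenr (\d+) csum 0x[0-9a-f]+")


def unique_bytenrs(check_output: str) -> list[int]:
    """Return the distinct logical addresses (bytenr) in ascending order."""
    return sorted({int(m.group(1)) for m in BYTENR_RE.finditer(check_output)})


def resolve(bytenr: int, mountpoint: str) -> str:
    """Ask btrfs-progs which file(s) reference the given logical address."""
    result = subprocess.run(
        ["btrfs", "inspect-internal", "logical-resolve", str(bytenr), mountpoint],
        capture_output=True, text=True, check=False,
    )
    # On failure, show the error text instead of an empty line.
    return result.stdout.strip() or result.stderr.strip()


if __name__ == "__main__" and len(sys.argv) == 3:
    log_path, mountpoint = sys.argv[1], sys.argv[2]
    with open(log_path) as f:
        for bytenr in unique_bytenrs(f.read()):
            print(bytenr, resolve(bytenr, mountpoint))
```

It would be run as, for example, `python3 resolve_csum.py check.log /mnt`.
Since every block is reported once per mirror, deduplicating the bytenr
values first avoids resolving the same address twice.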

Thanks,
Qu

> 
> 
> Apr 26 22:41:04 fedora kernel: BTRFS info (device nvme0n1p2): scrub:
> started on devid 1
> Apr 26 22:41:04 fedora kernel: BTRFS info (device nvme0n1p2): scrub:
> started on devid 2
> Apr 26 22:41:36 fedora kernel: BTRFS error (device nvme0n1p2): unable
> to fixup (regular) error at logical 452965761024 on dev /dev/nvme0n1p2
> physical 74303995904
> Apr 26 22:41:36 fedora kernel: BTRFS warning (device nvme0n1p2):
> checksum error at logical 452965761024 on dev /dev/nvme0n1p2, physical
> 74303995904, root 257, inode 35328, offset 26034176, length 4096,
> links 1 (path: usr/lib/sysimage/rpm/rpmdb.sqlite-wal)
> Apr 26 22:42:52 fedora kernel: BTRFS info (device nvme0n1p2): scrub:
> finished on devid 1 with status: 0
> Apr 26 22:45:46 fedora kernel: BTRFS error (device nvme0n1p2): unable
> to fixup (regular) error at logical 452965761024 on dev /dev/sda5
> physical 147297468416
> Apr 26 22:45:46 fedora kernel: BTRFS warning (device nvme0n1p2):
> checksum error at logical 452965761024 on dev /dev/sda5, physical
> 147297468416, root 257, inode 35328, offset 26034176, length 4096,
> links 1 (path: usr/lib/sysimage/rpm/rpmdb.sqlite-wal)
> Apr 26 22:48:45 fedora kernel: BTRFS info (device nvme0n1p2): scrub:
> finished on devid 2 with status: 0
> Apr 26 22:53:23 fedora kernel: BTRFS info (device nvme0n1p2): scrub:
> started on devid 2
> Apr 26 22:53:23 fedora kernel: BTRFS info (device nvme0n1p2): scrub:
> started on devid 1
> Apr 26 22:53:52 fedora kernel: BTRFS error (device nvme0n1p2): unable
> to fixup (regular) error at logical 452965761024 on dev /dev/nvme0n1p2
> physical 74303995904
> Apr 26 22:53:52 fedora kernel: BTRFS warning (device nvme0n1p2):
> checksum error at logical 452965761024 on dev /dev/nvme0n1p2, physical
> 74303995904, root 257, inode 35328, offset 26034176, length 4096,
> links 1 (path: usr/lib/sysimage/rpm/rpmdb.sqlite-wal)
> Apr 26 22:55:07 fedora kernel: BTRFS info (device nvme0n1p2): scrub:
> finished on devid 1 with status: 0
> Apr 26 22:58:04 fedora kernel: BTRFS error (device nvme0n1p2): unable
> to fixup (regular) error at logical 452965761024 on dev /dev/sda5
> physical 147297468416
> Apr 26 22:58:04 fedora kernel: BTRFS warning (device nvme0n1p2):
> checksum error at logical 452965761024 on dev /dev/sda5, physical
> 147297468416, root 257, inode 35328, offset 26034176, length 4096,
> links 1 (path: usr/lib/sysimage/rpm/rpmdb.sqlite-wal)
> Apr 26 23:01:01 fedora kernel: BTRFS info (device nvme0n1p2): scrub:
> finished on devid 2 with status: 0
> Apr 27 07:35:32 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x862b6025 expected csum
> 0xcf4a5572 mirror 1
> Apr 27 07:35:32 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
> Apr 27 07:35:32 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x127b77ee expected csum
> 0xcf4a5572 mirror 2
> Apr 27 07:35:32 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/sda5 errs: wr 0, rd 0, flush 0, corrupt 1, gen 0
> Apr 27 07:35:32 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x862b6025 expected csum
> 0xcf4a5572 mirror 1
> Apr 27 07:35:32 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
> Apr 27 07:35:32 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x127b77ee expected csum
> 0xcf4a5572 mirror 2
> Apr 27 07:35:32 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/sda5 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
> Apr 27 07:35:33 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x862b6025 expected csum
> 0xcf4a5572 mirror 1
> Apr 27 07:35:33 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
> Apr 27 07:35:33 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x127b77ee expected csum
> 0xcf4a5572 mirror 2
> Apr 27 07:35:33 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/sda5 errs: wr 0, rd 0, flush 0, corrupt 3, gen 0
> Apr 27 07:35:33 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x862b6025 expected csum
> 0xcf4a5572 mirror 1
> Apr 27 07:35:33 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
> Apr 27 07:35:33 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x127b77ee expected csum
> 0xcf4a5572 mirror 2
> Apr 27 07:35:33 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/sda5 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
> Apr 27 07:35:33 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x862b6025 expected csum
> 0xcf4a5572 mirror 1
> Apr 27 07:35:33 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
> Apr 27 07:35:33 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x127b77ee expected csum
> 0xcf4a5572 mirror 2
> Apr 27 07:35:33 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/sda5 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
> Apr 27 07:35:55 fedora kernel: btrfs_print_data_csum_error: 2
> callbacks suppressed
> Apr 27 07:35:55 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x127b77ee expected csum
> 0xcf4a5572 mirror 2
> Apr 27 07:35:55 fedora kernel: btrfs_dev_stat_inc_and_print: 2
> callbacks suppressed
> Apr 27 07:35:55 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/sda5 errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
> Apr 27 07:35:55 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x862b6025 expected csum
> 0xcf4a5572 mirror 1
> Apr 27 07:35:55 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
> Apr 27 07:35:55 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x127b77ee expected csum
> 0xcf4a5572 mirror 2
> Apr 27 07:35:55 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/sda5 errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
> Apr 27 07:35:55 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x862b6025 expected csum
> 0xcf4a5572 mirror 1
> Apr 27 07:35:55 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
> Apr 27 07:35:55 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x127b77ee expected csum
> 0xcf4a5572 mirror 2
> Apr 27 07:35:55 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/sda5 errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
> Apr 27 07:35:55 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x862b6025 expected csum
> 0xcf4a5572 mirror 1
> Apr 27 07:35:55 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
> Apr 27 07:35:55 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x127b77ee expected csum
> 0xcf4a5572 mirror 2
> Apr 27 07:35:55 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/sda5 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
> Apr 27 07:35:55 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x862b6025 expected csum
> 0xcf4a5572 mirror 1
> Apr 27 07:35:55 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
> Apr 27 07:35:55 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x127b77ee expected csum
> 0xcf4a5572 mirror 2
> Apr 27 07:35:55 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/sda5 errs: wr 0, rd 0, flush 0, corrupt 11, gen 0
> Apr 27 07:35:55 fedora kernel: BTRFS warning (device nvme0n1p2): csum
> failed root 257 ino 35328 off 26079232 csum 0x862b6025 expected csum
> 0xcf4a5572 mirror 1
> Apr 27 07:35:55 fedora kernel: BTRFS error (device nvme0n1p2): bdev
> /dev/nvme0n1p2 errs: wr 0, rd 0, flush 0, corrupt 11, gen 0
> 

