public inbox for linux-btrfs@vger.kernel.org
* Scrub problem with Debian kernel 6.12.33+deb13-amd64
@ 2025-07-07 10:55 Russell Coker
  2025-07-07 22:30 ` Qu Wenruo
From: Russell Coker @ 2025-07-07 10:55 UTC
  To: Btrfs BTRFS

I ran a scrub on my laptop running the latest Debian/Testing setup.  It's a 
Thinkpad X1 Carbon Gen6 that has just been updated to the latest firmware 
(Thinkpad BIOS, management engine, and some 3rd thing on the motherboard).  It 
had crashed a few times before, which I think has been fixed by the firmware 
update.  It is plausible that the crashes caused some corruption.

The system is running LUKS encryption.  After the monthly btrfs scrub I got 
the following in the cron output:

ERROR: there are 1 uncorrectable errors
Starting scrub on devid 1
scrub done for d90583c8-9284-48b4-9444-abd00924002a
Scrub started:    Mon Jul  7 02:30:01 2025
Status:           finished
Duration:         0:02:46
Total to scrub:   226.35GiB
Rate:             1.36GiB/s
Error summary:    csum=110693
  Corrected:      0
  Uncorrectable:  110693
  Unverified:     0

I ran the following commands to get more data; their output is below.  It 
seems that we have a clear problem: btrfs dev sta reports 0 errors even though 
the scrub counted 110693 uncorrectable checksum errors!

root@dojacat:/var/log# btrfs dev sta /
[/dev/mapper/root].write_io_errs    0
[/dev/mapper/root].read_io_errs     0
[/dev/mapper/root].flush_io_errs    0
[/dev/mapper/root].corruption_errs  0
[/dev/mapper/root].generation_errs  0
root@dojacat:/var/log# btrfs scrub status /
UUID:             d90583c8-9284-48b4-9444-abd00924002a
Scrub started:    Mon Jul  7 02:30:01 2025
Status:           finished
Duration:         0:02:46
Total to scrub:   226.34GiB
Rate:             1.36GiB/s
Error summary:    csum=110693
  Corrected:      0
  Uncorrectable:  110693
  Unverified:     0
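
The comparison above can also be made mechanically.  What follows is only a 
minimal sketch in Python, assuming the two text formats look exactly like the 
output pasted above; the function names and the sample strings are made up for 
illustration and are not part of btrfs-progs.

```python
import re

# Sample text copied from the output above (single device).
SCRUB_STATUS = """\
UUID:             d90583c8-9284-48b4-9444-abd00924002a
Status:           finished
Error summary:    csum=110693
  Corrected:      0
  Uncorrectable:  110693
  Unverified:     0
"""

DEV_STATS = """\
[/dev/mapper/root].write_io_errs    0
[/dev/mapper/root].read_io_errs     0
[/dev/mapper/root].flush_io_errs    0
[/dev/mapper/root].corruption_errs  0
[/dev/mapper/root].generation_errs  0
"""


def parse_scrub_status(text):
    """Collect the error counters from `btrfs scrub status` style output."""
    counters = {}
    summary = re.search(r"^Error summary:\s+(.*)$", text, re.MULTILINE)
    if summary:
        # e.g. "csum=110693"; several key=value pairs may share this line
        for key, value in re.findall(r"(\w+)=(\d+)", summary.group(1)):
            counters[key] = int(value)
    for name in ("Corrected", "Uncorrectable", "Unverified"):
        found = re.search(rf"^[ \t]*{name}:[ \t]+(\d+)[ \t]*$", text, re.MULTILINE)
        if found:
            counters[name.lower()] = int(found.group(1))
    return counters


def parse_dev_stats(text):
    """Map device -> {counter name: value} from `btrfs dev sta` style output."""
    stats = {}
    pattern = r"^\[(.+?)\]\.(\w+)[ \t]+(\d+)[ \t]*$"
    for dev, name, value in re.findall(pattern, text, re.MULTILINE):
        stats.setdefault(dev, {})[name] = int(value)
    return stats


def scrub_errors_missing_from_stats(scrub, stats):
    """True when scrub saw csum errors but no device shows corruption_errs."""
    total = sum(dev.get("corruption_errs", 0) for dev in stats.values())
    return scrub.get("csum", 0) > 0 and total == 0


scrub = parse_scrub_status(SCRUB_STATUS)
stats = parse_dev_stats(DEV_STATS)
print(scrub_errors_missing_from_stats(scrub, stats))
```

Something like this could run from the same cron job as the monthly scrub, so 
that a disagreement between the two counters gets flagged automatically.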


[190966.907320] BTRFS info (device dm-0): scrub: started on devid 1
[191057.409078] scrub_stripe_report_errors: 110553 callbacks suppressed
[191057.409081] scrub_stripe_report_errors: 110576 callbacks suppressed
[191057.409084] BTRFS error (device dm-0): unable to fixup (regular) error at logical 327469629440 on dev /dev/mapper/root physical 147760480256
[191057.409138] BTRFS error (device dm-0): unable to fixup (regular) error at logical 327469563904 on dev /dev/mapper/root physical 147760414720
[191057.409300] _btrfs_printk: 290 callbacks suppressed
[191057.409303] BTRFS warning (device dm-0): checksum error at logical 327469629440 on dev /dev/mapper/root, physical 147760480256, root 540, inode 1826602, offset 2087845888, length 4096, links 1 (path: home.old/tv/Foo.2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv)

[many more about similar files]

[191057.410987] BTRFS warning (device dm-0): checksum error at logical 327469629440 on dev /dev/mapper/root, physical 147760480256, root 522, inode 174508, offset 2087845888, length 4096, links 1 (path: tv/Foo.2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv)
[191057.411281] BTRFS error (device dm-0): unable to fixup (regular) error at logical 327469629440 on dev /dev/mapper/root physical 147760480256
[191057.411285] BTRFS error (device dm-0): unable to fixup (regular) error at logical 327469563904 on dev /dev/mapper/root physical 147760414720
[191057.411458] BTRFS error (device dm-0): unable to fixup (regular) error at logical 327469432832 on dev /dev/mapper/root physical 147760283648
[191057.411461] BTRFS error (device dm-0): unable to fixup (regular) error at logical 327469367296 on dev /dev/mapper/root physical 147760218112
[191057.411907] BTRFS error (device dm-0): unable to fixup (regular) error at logical 327469498368 on dev /dev/mapper/root physical 147760349184
[191057.413012] BTRFS error (device dm-0): unable to fixup (regular) error at logical 327469629440 on dev /dev/mapper/root physical 147760480256
[191131.353819] BTRFS info (device dm-0): scrub: finished on devid 1 with status: 0
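
The "checksum error" warnings above carry the file path, so the affected files 
can be pulled out of the log and grouped by logical address.  This is a rough 
sketch, assuming each warning has first been rejoined onto a single line (the 
mail client wraps them) and that the message format matches the two samples 
above; the regex and function name are made up for illustration.

```python
import re
from collections import defaultdict

# Matches the "checksum error at logical ..." warnings shown above.
CSUM_WARNING = re.compile(
    r"checksum error at logical (?P<logical>\d+) on dev (?P<dev>\S+), "
    r"physical (?P<physical>\d+), root (?P<root>\d+), inode (?P<inode>\d+), "
    r"offset (?P<offset>\d+), length (?P<length>\d+), links \d+ "
    r"\(path: (?P<path>.+)\)\s*$"
)

# The two warnings from the log above, each rejoined onto one line.
LOG_LINES = [
    "[191057.409303] BTRFS warning (device dm-0): checksum error at logical "
    "327469629440 on dev /dev/mapper/root, physical 147760480256, root 540, "
    "inode 1826602, offset 2087845888, length 4096, links 1 "
    "(path: home.old/tv/Foo.2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv)",
    "[191057.410987] BTRFS warning (device dm-0): checksum error at logical "
    "327469629440 on dev /dev/mapper/root, physical 147760480256, root 522, "
    "inode 174508, offset 2087845888, length 4096, links 1 "
    "(path: tv/Foo.2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv)",
]


def paths_by_logical(lines):
    """Group (root id, path) pairs by the logical address of the bad block."""
    grouped = defaultdict(set)
    for line in lines:
        match = CSUM_WARNING.search(line)
        if match:
            key = int(match["logical"])
            grouped[key].add((int(match["root"]), match["path"]))
    return dict(grouped)


for logical, owners in paths_by_logical(LOG_LINES).items():
    for root, path in sorted(owners):
        print(logical, root, path)
```

On the two sample lines this shows one logical block referenced from two 
different roots (540 and 522), which is what I would expect for reflinked 
copies.  Note that the kernel suppressed most of these messages, so the log 
can only ever give a partial list of affected files.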

# md5sum Foo.2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv
md5sum: Foo.2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv: Input/output error

The files in question had been subject to "cp -a --reflink=auto" across 
subvols.  After I deleted them from one subvol and deleted the snapshots of 
that subvol, I ran another scrub and now see the following:

# /bin/btrfs scrub start -B /
Starting scrub on devid 1
scrub done for d90583c8-9284-48b4-9444-abd00924002a
Scrub started:    Mon Jul  7 20:33:18 2025
Status:           finished
Duration:         0:03:01
Total to scrub:   220.04GiB
Rate:             1.21GiB/s
Error summary:    csum=110693
  Corrected:      0
  Uncorrectable:  110693
  Unverified:     0
ERROR: there are 1 uncorrectable errors
# btrfs dev sta /
[/dev/mapper/root].write_io_errs    0
[/dev/mapper/root].read_io_errs     0
[/dev/mapper/root].flush_io_errs    0
[/dev/mapper/root].corruption_errs  689
[/dev/mapper/root].generation_errs  0

So it looks like the failure to report error counts in btrfs dev sta may be 
related to cp --reflink=auto across subvols.  The csum=110693 doesn't match 
the "corruption_errs  689" but at least it's not 0.

I removed another file that was listed as having uncorrectable errors and now 
I get the following:

# /bin/btrfs scrub start -B /
Starting scrub on devid 1
scrub done for d90583c8-9284-48b4-9444-abd00924002a
Scrub started:    Mon Jul  7 20:46:05 2025
Status:           finished
Duration:         0:02:17
Total to scrub:   173.88GiB
Rate:             1.27GiB/s
Error summary:    csum=7137
  Corrected:      0
  Uncorrectable:  7137
  Unverified:     0
ERROR: there are 1 uncorrectable errors
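
For a rough sense of scale, the csum counts from the scrub runs can be turned 
into an amount of data.  This is only back-of-the-envelope arithmetic, and it 
assumes that every csum error stands for one 4096-byte block, as the "length 
4096" in the kernel warnings suggests; I have not confirmed that this is how 
scrub counts them.

```python
# Assumption: one csum error == one 4096-byte block ("length 4096" in the log).
BLOCK_SIZE = 4096


def affected_mib(csum_errors, block_size=BLOCK_SIZE):
    """Convert a scrub csum error count into MiB of unreadable data."""
    return csum_errors * block_size / 2**20


before = 110693  # csum count in the first two scrub runs
after = 7137     # csum count after removing another damaged file

print(f"before: {affected_mib(before):.2f} MiB")
print(f"after:  {affected_mib(after):.2f} MiB")
print(f"errors that went away with the removed file: {before - after}")
```

Under that assumption the damage is a few hundred MiB at most, which is small 
next to the 226.35GiB that was scrubbed.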

Below are the kernel messages.  They make no mention of files or directories, 
so the scrub doesn't seem to be doing its job well here.  It should either fix 
things or tell me what rm command I can use to replace things that can't be 
fixed!

Jul 07 20:47:20 dojacat kernel: scrub_stripe_report_errors: 7116 callbacks suppressed
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup (regular) error at logical 327893450752 on dev /dev/mapper/root physical 148184301568
Jul 07 20:47:20 dojacat kernel: scrub_stripe_report_errors: 7117 callbacks suppressed
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup (regular) error at logical 327893450752 on dev /dev/mapper/root physical 148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup (regular) error at logical 327893450752 on dev /dev/mapper/root physical 148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup (regular) error at logical 327893450752 on dev /dev/mapper/root physical 148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup (regular) error at logical 327893450752 on dev /dev/mapper/root physical 148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup (regular) error at logical 327893450752 on dev /dev/mapper/root physical 148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup (regular) error at logical 327893450752 on dev /dev/mapper/root physical 148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup (regular) error at logical 327893450752 on dev /dev/mapper/root physical 148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup (regular) error at logical 327893450752 on dev /dev/mapper/root physical 148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup (regular) error at logical 327893450752 on dev /dev/mapper/root physical 148184301568
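
Since these messages only give logical addresses, one possible workaround is 
to collect the distinct addresses and ask "btrfs inspect-internal 
logical-resolve" which files reference them.  The following is a sketch that 
only builds the command lines and does not run them; it assumes the log lines 
have been rejoined onto single lines, and the function names are made up.  I 
don't know whether logical-resolve will return paths for these particular 
blocks.

```python
import re
from collections import Counter

# Matches the "unable to fixup" errors shown above.
FIXUP_ERROR = re.compile(
    r"unable to fixup \(regular\) error at logical (\d+) "
    r"on dev (\S+) physical (\d+)"
)

# Three of the (identical) messages from the log above, one per line.
LOG_LINES = [
    "Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to "
    "fixup (regular) error at logical 327893450752 on dev /dev/mapper/root "
    "physical 148184301568",
] * 3


def count_logicals(lines):
    """Count how often each logical address appears in fixup errors."""
    counts = Counter()
    for line in lines:
        match = FIXUP_ERROR.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def resolve_commands(logicals, mountpoint="/"):
    """Build one logical-resolve command line per distinct logical address."""
    return [
        f"btrfs inspect-internal logical-resolve {logical} {mountpoint}"
        for logical in sorted(logicals)
    ]


for command in resolve_commands(count_logicals(LOG_LINES)):
    print(command)
```

In the log above every message repeats the same logical address, so this 
collapses to a single command to run as root.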

I don't think that BTRFS is responsible for the data loss here; I think that 
is entirely due to the system crashing.  But BTRFS really isn't handling the 
recovery as well as I think it should and could.

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/




Thread overview: 3+ messages
2025-07-07 10:55 Scrub problem with Debian kernel 6.12.33+deb13-amd64 Russell Coker
2025-07-07 22:30 ` Qu Wenruo
2025-07-11 21:03   ` Nicholas D Steeves
