From: Russell Coker <russell@coker.com.au>
To: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Scrub problem with Debian kernel 6.12.33+deb13-amd64
Date: Mon, 07 Jul 2025 20:55:07 +1000
Message-ID: <3036994.e9J7NaK4W3@dojacat>
I ran a scrub on my laptop running the latest Debian/Testing setup. It's a
Thinkpad X1 Carbon Gen6 that has just been updated to the latest firmware
(Thinkpad BIOS, management engine, and a third component on the motherboard).
It had crashed a few times before, which I think has been fixed by the firmware
update. It is plausible that the crashes caused some corruption.
The system is running LUKS encryption. After the monthly btrfs scrub I got
the following in the cron output:
ERROR: there are 1 uncorrectable errors
Starting scrub on devid 1
scrub done for d90583c8-9284-48b4-9444-abd00924002a
Scrub started: Mon Jul 7 02:30:01 2025
Status: finished
Duration: 0:02:46
Total to scrub: 226.35GiB
Rate: 1.36GiB/s
Error summary: csum=110693
Corrected: 0
Uncorrectable: 110693
Unverified: 0
I ran the following commands to get more data; their output, followed by the
kernel log, is below. It seems that we have a clear problem: btrfs dev sta
reports 0 corruption errors while the scrub reports 110693 uncorrectable
checksum errors!
root@dojacat:/var/log# btrfs dev sta /
[/dev/mapper/root].write_io_errs 0
[/dev/mapper/root].read_io_errs 0
[/dev/mapper/root].flush_io_errs 0
[/dev/mapper/root].corruption_errs 0
[/dev/mapper/root].generation_errs 0
root@dojacat:/var/log# btrfs scrub status /
UUID: d90583c8-9284-48b4-9444-abd00924002a
Scrub started: Mon Jul 7 02:30:01 2025
Status: finished
Duration: 0:02:46
Total to scrub: 226.34GiB
Rate: 1.36GiB/s
Error summary: csum=110693
Corrected: 0
Uncorrectable: 110693
Unverified: 0
[190966.907320] BTRFS info (device dm-0): scrub: started on devid 1
[191057.409078] scrub_stripe_report_errors: 110553 callbacks suppressed
[191057.409081] scrub_stripe_report_errors: 110576 callbacks suppressed
[191057.409084] BTRFS error (device dm-0): unable to fixup (regular) error at
logical 327469629440 on dev /dev/mapper/root physical 147760480256
[191057.409138] BTRFS error (device dm-0): unable to fixup (regular) error at
logical 327469563904 on dev /dev/mapper/root physical 147760414720
[191057.409300] _btrfs_printk: 290 callbacks suppressed
[191057.409303] BTRFS warning (device dm-0): checksum error at logical
327469629440 on dev /dev/mapper/root, physical 147760480256, root 540, inode
1826602, offset 2087845888, length 4096, links 1 (path: home.old/tv/Foo.
2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv)
[many more about similar files]
[191057.410987] BTRFS warning (device dm-0): checksum error at logical
327469629440 on dev /dev/mapper/root, physical 147760480256, root 522, inode
174508, offset 2087845888, length 4096, links 1 (path: tv/Foo.
2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv)
[191057.411281] BTRFS error (device dm-0): unable to fixup (regular) error at
logical 327469629440 on dev /dev/mapper/root physical 147760480256
[191057.411285] BTRFS error (device dm-0): unable to fixup (regular) error at
logical 327469563904 on dev /dev/mapper/root physical 147760414720
[191057.411458] BTRFS error (device dm-0): unable to fixup (regular) error at
logical 327469432832 on dev /dev/mapper/root physical 147760283648
[191057.411461] BTRFS error (device dm-0): unable to fixup (regular) error at
logical 327469367296 on dev /dev/mapper/root physical 147760218112
[191057.411907] BTRFS error (device dm-0): unable to fixup (regular) error at
logical 327469498368 on dev /dev/mapper/root physical 147760349184
[191057.413012] BTRFS error (device dm-0): unable to fixup (regular) error at
logical 327469629440 on dev /dev/mapper/root physical 147760480256
[191131.353819] BTRFS info (device dm-0): scrub: finished on devid 1 with
status: 0
# md5sum Foo.2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv
md5sum: Foo.2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv: Input/output error
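The checksum warnings above report the same logical address (327469629440) under two different roots, 540 and 522. A minimal sketch for grouping such warnings by logical address follows, to show which paths share a damaged block. It assumes each kernel message sits on one line, as in raw dmesg output; the function name and the sample text are illustrative, not part of any btrfs tool.

```python
import re
from collections import defaultdict

# Field layout copied from the "checksum error" warnings in the log above.
CSUM_RE = re.compile(
    r"checksum error at logical (?P<logical>\d+) on dev \S+, "
    r"physical \d+, root (?P<root>\d+), inode \d+, offset \d+, "
    r"length \d+, links \d+ \(path: (?P<path>.*)\)\s*$"
)


def paths_by_logical(log_text):
    """Map each logical address to the set of (root, path) pairs that hit it.

    Assumes one kernel message per line; the line-wrapped form quoted in an
    email would need rejoining first.
    """
    result = defaultdict(set)
    for line in log_text.splitlines():
        m = CSUM_RE.search(line)
        if m:
            result[int(m["logical"])].add((int(m["root"]), m["path"]))
    return dict(result)


# Two warnings from the log above, rejoined onto single lines.
SAMPLE = "\n".join([
    "[191057.409303] BTRFS warning (device dm-0): checksum error at logical "
    "327469629440 on dev /dev/mapper/root, physical 147760480256, root 540, "
    "inode 1826602, offset 2087845888, length 4096, links 1 "
    "(path: home.old/tv/Foo.2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv)",
    "[191057.410987] BTRFS warning (device dm-0): checksum error at logical "
    "327469629440 on dev /dev/mapper/root, physical 147760480256, root 522, "
    "inode 174508, offset 2087845888, length 4096, links 1 "
    "(path: tv/Foo.2024.S01E08.1080p.WEB.H264-SuccessfulCrab.mkv)",
])

for logical, refs in paths_by_logical(SAMPLE).items():
    for root, path in sorted(refs):
        print(f"logical {logical}: root {root}: {path}")
```

Grouping by logical address rather than by path makes it visible when one bad block is referenced from several subvols, so every referencing file can be dealt with together.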
The files in question had been copied across subvols with
"cp -a --reflink=auto". After I deleted them from one subvol and deleted the
snapshots of that subvol, I ran another scrub and now see the following:
# /bin/btrfs scrub start -B /
Starting scrub on devid 1
scrub done for d90583c8-9284-48b4-9444-abd00924002a
Scrub started: Mon Jul 7 20:33:18 2025
Status: finished
Duration: 0:03:01
Total to scrub: 220.04GiB
Rate: 1.21GiB/s
Error summary: csum=110693
Corrected: 0
Uncorrectable: 110693
Unverified: 0
ERROR: there are 1 uncorrectable errors
# btrfs dev sta /
[/dev/mapper/root].write_io_errs 0
[/dev/mapper/root].read_io_errs 0
[/dev/mapper/root].flush_io_errs 0
[/dev/mapper/root].corruption_errs 689
[/dev/mapper/root].generation_errs 0
So it looks like the failure to report error counts in btrfs dev sta may be
related to cp --reflink=auto across subvols. The csum=110693 doesn't match
the "corruption_errs 689", but at least it's not 0.
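The mismatch can be checked mechanically. The sketch below parses the two text formats shown above (the "Error summary:" line from scrub status and the per-device lines from btrfs dev sta) and returns both counts side by side. It assumes that scrub checksum failures are the ones expected to appear under corruption_errs; the helper names are made up for this example.

```python
import re


def parse_dev_stats(text):
    """Parse lines like '[/dev/mapper/root].corruption_errs 689'."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\[(?P<dev>[^\]]+)\]\.(?P<name>\w+)\s+(?P<value>\d+)$",
                     line.strip())
        if m:
            stats.setdefault(m["dev"], {})[m["name"]] = int(m["value"])
    return stats


def parse_scrub_errors(text):
    """Parse the 'Error summary:' line, e.g. 'csum=110693' -> {'csum': 110693}."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Error summary:"):
            return {k: int(v) for k, v in re.findall(r"(\w+)=(\d+)", line)}
    return {}


def corruption_gap(scrub_text, stats_text):
    """Return (scrub csum errors, corruption_errs summed over all devices)."""
    csum = parse_scrub_errors(scrub_text).get("csum", 0)
    corruption = sum(dev.get("corruption_errs", 0)
                     for dev in parse_dev_stats(stats_text).values())
    return csum, corruption


# Output copied from the second scrub and the btrfs dev sta run above.
SCRUB_STATUS = """\
Status: finished
Error summary: csum=110693
Corrected: 0
Uncorrectable: 110693
"""
DEV_STATS = """\
[/dev/mapper/root].write_io_errs 0
[/dev/mapper/root].read_io_errs 0
[/dev/mapper/root].flush_io_errs 0
[/dev/mapper/root].corruption_errs 689
[/dev/mapper/root].generation_errs 0
"""

csum, corruption = corruption_gap(SCRUB_STATUS, DEV_STATS)
print(f"scrub csum errors: {csum}, dev stats corruption_errs: {corruption}")
```

A check like this could run from the same cron job as the monthly scrub, flagging any run where the two counters disagree.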
I removed another file that was listed as having uncorrectable errors and now
I get the following:
# /bin/btrfs scrub start -B /
Starting scrub on devid 1
scrub done for d90583c8-9284-48b4-9444-abd00924002a
Scrub started: Mon Jul 7 20:46:05 2025
Status: finished
Duration: 0:02:17
Total to scrub: 173.88GiB
Rate: 1.27GiB/s
Error summary: csum=7137
Corrected: 0
Uncorrectable: 7137
Unverified: 0
ERROR: there are 1 uncorrectable errors
Below are the kernel messages. There is no mention of files or directories, so
the scrub doesn't seem to be doing its job well here. It should either fix
things or tell me what rm command I can use to replace things that can't be
fixed!
Jul 07 20:47:20 dojacat kernel: scrub_stripe_report_errors: 7116 callbacks
suppressed
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
Jul 07 20:47:20 dojacat kernel: scrub_stripe_report_errors: 7117 callbacks
suppressed
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
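All ten messages above name the same logical address. One possible way to look for the owning file is btrfs inspect-internal logical-resolve, which takes a logical address and a path on the filesystem. The sketch below only extracts the distinct addresses from "unable to fixup" messages and builds the matching command lines without running them. The function name and the default mount point "/" are assumptions for this example, and nothing in the output above confirms that the command finds a path for these particular extents.

```python
import re

# Message layout copied from the "unable to fixup" errors in the log above.
FIXUP_RE = re.compile(
    r"unable to fixup \(regular\) error at logical (\d+) on dev \S+ physical \d+"
)


def resolve_commands(log_text, mountpoint="/"):
    """Build one logical-resolve argv list per distinct logical address.

    Whitespace is collapsed first, so messages that were wrapped across
    lines (as in an email) still match.
    """
    flat = re.sub(r"\s+", " ", log_text)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    logicals = list(dict.fromkeys(FIXUP_RE.findall(flat)))
    return [["btrfs", "inspect-internal", "logical-resolve", addr, mountpoint]
            for addr in logicals]


# Two of the wrapped messages from the log above.
SAMPLE = """\
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
Jul 07 20:47:20 dojacat kernel: BTRFS error (device dm-0): unable to fixup
(regular) error at logical 327893450752 on dev /dev/mapper/root physical
148184301568
"""

for argv in resolve_commands(SAMPLE):
    print(" ".join(argv))
```

The commands are returned as argv lists so that they can be reviewed, or passed to subprocess.run as root, without any shell quoting.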
I don't think that BTRFS is responsible for the data loss here; I think that
is entirely due to the system crashing. But BTRFS really isn't handling the
recovery as well as I think it should and could.
--
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/