From: Richard Weinberger <richard@nod.at>
To: linux-btrfs@vger.kernel.org
Subject: Re: Decoding "unable to fixup (regular)" errors
Date: Fri, 08 Nov 2019 23:06:50 +0100 [thread overview]
Message-ID: <2590197.gOHgNE8CYM@blindfold> (raw)
In-Reply-To: <1591390.YpsIS3gr9g@blindfold>
Am Dienstag, 5. November 2019, 23:03:01 CET schrieb Richard Weinberger:
> [10860370.764595] BTRFS error (device md1): unable to fixup (regular) error at logical 593483341824 on dev /dev/md1
> [10860395.236787] BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2292, gen 0
> [10860395.237267] BTRFS error (device md1): unable to fixup (regular) error at logical 595304841216 on dev /dev/md1
> [10860395.506085] BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2293, gen 0
> [10860395.506560] BTRFS error (device md1): unable to fixup (regular) error at logical 595326820352 on dev /dev/md1
> [10860395.511546] BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2294, gen 0
> [10860395.512061] BTRFS error (device md1): unable to fixup (regular) error at logical 595327647744 on dev /dev/md1
> [10860395.664956] BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2295, gen 0
> [10860395.664959] BTRFS error (device md1): unable to fixup (regular) error at logical 595344850944 on dev /dev/md1
> [10860395.677733] BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2296, gen 0
> [10860395.677736] BTRFS error (device md1): unable to fixup (regular) error at logical 595346452480 on dev /dev/md1
> [10860395.770918] BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2297, gen 0
> [10860395.771523] BTRFS error (device md1): unable to fixup (regular) error at logical 595357601792 on dev /dev/md1
> [10860395.789808] BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2298, gen 0
> [10860395.790455] BTRFS error (device md1): unable to fixup (regular) error at logical 595359870976 on dev /dev/md1
> [10860395.806699] BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2299, gen 0
> [10860395.807381] BTRFS error (device md1): unable to fixup (regular) error at logical 595361865728 on dev /dev/md1
> [10860395.918793] BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2300, gen 0
> [10860395.919513] BTRFS error (device md1): unable to fixup (regular) error at logical 595372343296 on dev /dev/md1
> [10860395.993817] BTRFS error (device md1): bdev /dev/md1 errs: wr 0, rd 0, flush 0, corrupt 2301, gen 0
> [10860395.994574] BTRFS error (device md1): unable to fixup (regular) error at logical 595384438784 on dev /dev/md1
> For obvious reasons the "BTRFS error (device md1): unable to fixup (regular) error" lines made me nervous
> and I would like to understand better what is going on.
> The system has ECC memory with md1 being a RAID1 which passes all health checks.
>
> I tried to find the inodes behind the erroneous addresses without success.
> e.g.
> $ btrfs inspect-internal logical-resolve -v -P 593483341824 /
> ioctl ret=0, total_size=4096, bytes_left=4080, bytes_missing=0, cnt=0, missed=0
> $ echo $?
> 1
>
> My kernel is 4.12.14-lp150.12.64-default (OpenSUSE 15.0), so not super recent but AFAICT btrfs should be sane
> there. :-)
>
> What could cause the errors and how to dig further?
I was able to reproduce this on vanilla v5.4-rc6.
Instrumenting btrfs revealed that all erroneous blocks are data blocks (BTRFS_EXTENT_FLAG_DATA)
and only have ->checksum_error set.
Both expected and computed checksums are non-zero.
To me it seems like all these blocks are orphaned data, while extent_from_logical() finds and extent
for the affected logical addresses, none of the extents belong to an inode.
This explains also why "btrfs inspect-internal logical-resolve" is unable to point me to an
inode. And why scrub_print_warning("checksum error", sblock_to_check) does not log anything.
The function returns early if no inode can be found for a data block...
This is something to worry about?
Why does the scrubbing mechanism check orphaned blocks?
Thanks,
//richard
next prev parent reply other threads:[~2019-11-08 22:06 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-05 22:03 Decoding "unable to fixup (regular)" errors Richard Weinberger
2019-11-08 22:06 ` Richard Weinberger [this message]
2019-11-08 22:16 ` Zygo Blaxell
2019-11-08 22:09 ` Zygo Blaxell
2019-11-08 22:21 ` Richard Weinberger
2019-11-08 22:25 ` Zygo Blaxell
2019-11-08 22:31 ` Richard Weinberger
2019-11-08 23:39 ` Zygo Blaxell
2019-11-09 9:58 ` checksum errors in orphaned blocks on multiple systems (Was: Re: Decoding "unable to fixup (regular)" errors) Richard Weinberger
2019-11-13 3:34 ` Zygo Blaxell
2019-11-09 10:00 ` Decoding "unable to fixup (regular)" errors Richard Weinberger
2019-11-13 3:31 ` Zygo Blaxell
2019-11-13 18:17 ` Chris Murphy
2019-11-13 18:24 ` Chris Murphy
2019-11-16 6:16 ` Zygo Blaxell
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2590197.gOHgNE8CYM@blindfold \
--to=richard@nod.at \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox