* kernel 6.8.5, bad tree block start, couldn't read tree root, including any of the backup roots
@ 2024-10-15 14:43 Chris Murphy
2024-10-15 22:59 ` Chris Murphy
2024-10-16 10:41 ` Mark Harmstone
0 siblings, 2 replies; 4+ messages in thread
From: Chris Murphy @ 2024-10-15 14:43 UTC
To: Btrfs BTRFS
Fedora user report:
kernel 6.8.5-301.fc40.x86_64
btrfs-progs 6.8
Kioxia NVMe KXG80ZNV1T02, firmware 11304102 (used by Dell, seems firmware is current)
https://discussion.fedoraproject.org/t/stuck-in-emergency-mode-after-force-shutdown/133615
Description:
The user reports that the system was suspended and wouldn't wake, so power was forced off to recover. The problem appeared on the subsequent boot. The post linked below contains the most relevant photo of the kernel messages showing the mount errors, which cover both copies (DUP metadata) of all the backup tree roots.
https://discussion.fedoraproject.org/t/133615/15
Supers all look good:
superblock: bytenr=65536, device=/dev/nvme0n1p3
---------------------------------------------------------
csum_type 0 (crc32c)
csum_size 4
csum 0x79e5611c [match]
bytenr 65536
flags 0x1
( WRITTEN )
magic _BHRfS_M [match]
fsid 2c03e734-1b38-49e9-991a-1f85b9cc97f7
metadata_uuid 00000000-0000-0000-0000-000000000000
label fedora
generation 38863
root 808009728
sys_array_size 129
chunk_root_generation 32970
root_level 0
chunk_root 24854528
chunk_root_level 0
log_root 0
log_root_transid (deprecated) 0
log_root_level 0
total_bytes 1022505254912
bytes_used 81702764544
sectorsize 4096
nodesize 16384
leafsize (deprecated) 16384
stripesize 4096
root_dir 6
num_devices 1
compat_flags 0x0
compat_ro_flags 0x3
( FREE_SPACE_TREE |
FREE_SPACE_TREE_VALID )
incompat_flags 0x371
( MIXED_BACKREF |
COMPRESS_ZSTD |
BIG_METADATA |
EXTENDED_IREF |
SKINNY_METADATA |
NO_HOLES )
cache_generation 0
uuid_tree_generation 38863
dev_item.uuid 0aab31a2-3c63-4c6f-bc4c-8f4b4bad66ac
dev_item.fsid 2c03e734-1b38-49e9-991a-1f85b9cc97f7 [match]
dev_item.type 0
dev_item.total_bytes 1022505254912
dev_item.bytes_used 89145737216
dev_item.io_align 4096
dev_item.io_width 4096
dev_item.sector_size 4096
dev_item.devid 1
dev_item.dev_group 0
dev_item.seek_speed 0
dev_item.bandwidth 0
dev_item.generation 0
sys_chunk_array[2048]:
item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 22020096)
length 8388608 owner 2 stripe_len 65536 type SYSTEM|DUP
io_align 65536 io_width 65536 sector_size 4096
num_stripes 2 sub_stripes 1
stripe 0 devid 1 offset 22020096
dev_uuid 0aab31a2-3c63-4c6f-bc4c-8f4b4bad66ac
stripe 1 devid 1 offset 30408704
dev_uuid 0aab31a2-3c63-4c6f-bc4c-8f4b4bad66ac
backup_roots[4]:
backup 0:
backup_tree_root: 803454976 gen: 38861 level: 0
backup_chunk_root: 24854528 gen: 32970 level: 0
backup_extent_root: 802881536 gen: 38861 level: 2
backup_fs_root: 30572544 gen: 9 level: 0
backup_dev_root: 73252864 gen: 36528 level: 0
csum_root: 801767424 gen: 38861 level: 2
backup_total_bytes: 1022505254912
backup_bytes_used: 81702764544
backup_num_devices: 1
backup 1:
backup_tree_root: 804864000 gen: 38862 level: 0
backup_chunk_root: 24854528 gen: 32970 level: 0
backup_extent_root: 804093952 gen: 38862 level: 2
backup_fs_root: 30572544 gen: 9 level: 0
backup_dev_root: 73252864 gen: 36528 level: 0
csum_root: 801931264 gen: 38862 level: 2
backup_total_bytes: 1022505254912
backup_bytes_used: 81702899712
backup_num_devices: 1
backup 2:
backup_tree_root: 808009728 gen: 38863 level: 0
backup_chunk_root: 24854528 gen: 32970 level: 0
backup_extent_root: 806535168 gen: 38863 level: 2
backup_fs_root: 30572544 gen: 9 level: 0
backup_dev_root: 73252864 gen: 36528 level: 0
csum_root: 803799040 gen: 38863 level: 2
backup_total_bytes: 1022505254912
backup_bytes_used: 81702764544
backup_num_devices: 1
backup 3:
backup_tree_root: 801734656 gen: 38860 level: 0
backup_chunk_root: 24854528 gen: 32970 level: 0
backup_extent_root: 800374784 gen: 38860 level: 2
backup_fs_root: 30572544 gen: 9 level: 0
backup_dev_root: 73252864 gen: 36528 level: 0
csum_root: 797999104 gen: 38860 level: 2
backup_total_bytes: 1022505254912
backup_bytes_used: 81702764544
backup_num_devices: 1
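For reference, the four backup slots appear to be written in rotation, so slot order is not age order. A minimal Python sketch (the values are just the bytenr/gen pairs copied from the dump above; the helper name is made up for illustration) that sorts them newest first and checks that the newest one agrees with the superblock's root and generation:

```python
# Values copied from the superblock dump above: (slot, backup_tree_root, gen).
BACKUPS = [
    (0, 803454976, 38861),
    (1, 804864000, 38862),
    (2, 808009728, 38863),
    (3, 801734656, 38860),
]
SUPER_ROOT = 808009728
SUPER_GENERATION = 38863


def newest_first(backups):
    """Return the backup slots ordered from highest to lowest generation."""
    return sorted(backups, key=lambda b: b[2], reverse=True)


ordered = newest_first(BACKUPS)
for slot, bytenr, gen in ordered:
    print(f"backup {slot}: tree root {bytenr} gen {gen}")

# The newest backup should describe the same tree root as the superblock.
newest_slot, newest_bytenr, newest_gen = ordered[0]
print("newest matches superblock:",
      (newest_bytenr, newest_gen) == (SUPER_ROOT, SUPER_GENERATION))
```

With these values the order is slots 2, 1, 0, 3, and the newest backup matches the superblock, which is consistent with the supers themselves being intact.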
btrfs insp dump-t -b 808009728 $DEV
btrfs-progs v6.8
checksum verify failed on 808009728 wanted 0x00000000 found 0xca72d647
checksum verify failed on 808009728 wanted 0x780fa0fd found 0x2ce4b8af
Couldn't read tree root
checksum verify failed on 808009728 wanted 0x00000000 found 0xca72d647
checksum verify failed on 808009728 wanted 0x780fa0fd found 0x2ce4b8af
ERROR: failed to read tree block 808009728
btrfs insp dump-t -b 80486400 $DEV
btrfs-progs v6.8
checksum verify failed on 808009728 wanted 0x00000000 found 0xca72d647
checksum verify failed on 808009728 wanted 0x780fa0fd found 0x2ce4b8af
Couldn't read tree root
checksum verify failed on 80486400 wanted 0x00000000 found 0x973bc4ee
checksum verify failed on 80486400 wanted 0x00000000 found 0x14f430d1
ERROR: failed to read tree block 80486400
btrfs insp dump-t -b 803454976 $DEV
btrfs-progs v6.8
checksum verify failed on 808009728 wanted 0x00000000 found 0xca72d647
checksum verify failed on 808009728 wanted 0x780fa0fd found 0x2ce4b8af
Couldn't read tree root
checksum verify failed on 803454976 wanted 0x080fd04c found 0x4431a259
checksum verify failed on 803454976 wanted 0x00000000 found 0x3be418cb
ERROR: failed to read tree block 803454976
btrfs insp dump-t -b 801734656 $DEV
btrfs-progs v6.8
checksum verify failed on 808009728 wanted 0x00000000 found 0xca72d647
checksum verify failed on 808009728 wanted 0x780fa0fd found 0x2ce4b8af
Couldn't read tree root
checksum verify failed on 801734656 wanted 0x080fd04c found 0x69344e69
checksum verify failed on 801734656 wanted 0x00000000 found 0x818d4e83
ERROR: failed to read tree block 801734656
I'm not sure where to suggest the user go from here, since none of the tree roots can be read. That seems like a lot of damage, all at once, to different parts of the media.
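A quick cross-check of the addresses, as a minimal Python sketch (both lists are copied from the superblock dump and the dump-t command lines above; the function names are hypothetical), comparing what was passed to -b against the backup tree roots:

```python
# Tree root addresses listed in the superblock's backup_roots above.
BACKUP_TREE_ROOTS = {0: 803454976, 1: 804864000, 2: 808009728, 3: 801734656}

# Addresses passed to "dump-t -b" in the four runs above.
DUMPED = [808009728, 80486400, 803454976, 801734656]


def unchecked_roots(backup_roots, dumped):
    """Return {slot: bytenr} for backup roots that no dump-tree run targeted."""
    seen = set(dumped)
    return {slot: b for slot, b in backup_roots.items() if b not in seen}


def stray_addresses(backup_roots, dumped):
    """Return dumped addresses that are not any backup tree root."""
    known = set(backup_roots.values())
    return [b for b in dumped if b not in known]


print("not yet dumped:", unchecked_roots(BACKUP_TREE_ROOTS, DUMPED))
print("not a backup root:", stray_addresses(BACKUP_TREE_ROOTS, DUMPED))
```

With the values as quoted, this flags that the second run used 80486400 while backup 1 lists 804864000, so that slot's block was not actually read by these runs. The kernel messages in the linked photo may still cover it.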
thanks,
--
Chris Murphy
* Re: kernel 6.8.5, bad tree block start, couldn't read tree root, including any of the backup roots
From: Chris Murphy @ 2024-10-15 22:59 UTC
To: Btrfs BTRFS
On Tue, Oct 15, 2024, at 10:43 AM, Chris Murphy wrote:
> Fedora user report:
> kernel 6.8.5-301.fc40.x86_64
> btrfs-progs 6.8
That should be the kernel 6.10 series.
The versions I reported were from the live USB boot. The installed version under which the problem occurred is probably 6.10.12.
--
Chris Murphy
* Re: kernel 6.8.5, bad tree block start, couldn't read tree root, including any of the backup roots
From: Mark Harmstone @ 2024-10-16 10:41 UTC
To: Chris Murphy, Btrfs BTRFS
This looks like a disk error to me.
This was interesting though:
> checksum verify failed on 808009728 wanted 0x00000000 found 0xca72d647
> checksum verify failed on 808009728 wanted 0x780fa0fd found 0x2ce4b8af
Why would btrfs check be reporting two different "found" values for the
same block?
Mark
* Re: kernel 6.8.5, bad tree block start, couldn't read tree root, including any of the backup roots
From: Chris Murphy @ 2024-10-16 18:36 UTC
To: Mark Harmstone, Btrfs BTRFS
On Wed, Oct 16, 2024, at 6:41 AM, Mark Harmstone wrote:
> This looks like a disk error to me.
>
> This was interesting though:
> > checksum verify failed on 808009728 wanted 0x00000000 found 0xca72d647
> > checksum verify failed on 808009728 wanted 0x780fa0fd found 0x2ce4b8af
>
> Why would btrfs check be reporting two different "found" values for the
> same block?
I'm assuming it's because of DUP metadata: the two lines are mirror 1 and mirror 2. Both copies are bad, but they are bad in different ways.
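To illustrate, here is a minimal Python sketch that groups the messages by block address. It assumes each distinct wanted/found pair corresponds to one copy of the block; the output itself doesn't label mirrors, so that mapping is my assumption. The log lines are copied from the first dump-tree run:

```python
import re
from collections import OrderedDict

# Lines copied from the first dump-tree run quoted earlier in the thread.
LOG = """\
checksum verify failed on 808009728 wanted 0x00000000 found 0xca72d647
checksum verify failed on 808009728 wanted 0x780fa0fd found 0x2ce4b8af
Couldn't read tree root
checksum verify failed on 808009728 wanted 0x00000000 found 0xca72d647
checksum verify failed on 808009728 wanted 0x780fa0fd found 0x2ce4b8af
"""

PATTERN = re.compile(
    r"checksum verify failed on (\d+) wanted (0x[0-9a-f]+) found (0x[0-9a-f]+)"
)


def distinct_results(log):
    """Map each block address to its distinct (wanted, found) pairs, in order."""
    per_block = OrderedDict()
    for match in PATTERN.finditer(log):
        bytenr = int(match.group(1))
        pair = (match.group(2), match.group(3))
        pairs = per_block.setdefault(bytenr, [])
        if pair not in pairs:
            pairs.append(pair)
    return per_block


for bytenr, pairs in distinct_results(LOG).items():
    print(bytenr, "->", len(pairs), "distinct result(s)")
    for wanted, found in pairs:
        print(f"  wanted {wanted} found {found}")
```

The same pair of results repeats on the second read attempt, which fits two stable but differently damaged copies rather than a read that returns different data each time.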
The user imaged the Btrfs partition with ddrescue, so we still have a copy of it for now.
The user then used blkdiscard to wipe the drive, reformatted it as Btrfs, and tested it with f3 [1]; there were no errors. They then clean-installed Fedora to get back to work, so I guess we'll see whether the issue happens again. It doesn't appear that the drive has failed.
I suspect a suspend-related bug, but far more blocks are corrupted here than we usually see when a drive that doesn't honor flush/FUA suffers a crash or power failure.
[1] https://github.com/AltraMayor/f3
--
Chris Murphy