public inbox for linux-btrfs@vger.kernel.org
* tree first key mismatch detected (reproducible error)
@ 2020-01-25 11:37 Thorsten Hirsch
  2020-01-25 11:46 ` Andrei Borzenkov
  2020-01-25 12:15 ` Qu Wenruo
  0 siblings, 2 replies; 8+ messages in thread
From: Thorsten Hirsch @ 2020-01-25 11:37 UTC (permalink / raw)
  To: linux-btrfs

Hi, here's a btrfs problem that started happening today on my main computer:

BTRFS error (device nvme0n1p3): tree first key mismatch detected,
bytenr=109690880 parent_transid=1329869 key
expected=(48044838912,168,12288) has=(48045363200,168,12288)

It always occurs some minutes after booting, sometimes even seconds
after booting. The partition is then remounted read-only. I already
tried scrubbing the partition (aborts itself after some seconds) and
balancing (seems to trigger the error immediately and doesn't even
start).

I attached some more output of dmesg. The distribution is Arch Linux
and the kernel is the most recent one in Arch's default kernel
package: 5.4.14-arch1-1 (I upgraded from 5.4.13 to 5.4.14 just
yesterday).

Best regards,
Thorsten

[Jan25 12:00] BTRFS error (device nvme0n1p3): tree first key mismatch
detected, bytenr=109690880 parent_transid=1329869 key
expected=(48044838912,168,12288) has=(48045363200,168,12288)
[  +0,000003] ------------[ cut here ]------------
[  +0,000001] BTRFS: Transaction aborted (error -117)
[  +0,000041] WARNING: CPU: 7 PID: 382 at fs/btrfs/extent-tree.c:3080
__btrfs_free_extent.isra.0+0x694/0x9e0 [btrfs]
[  +0,000000] Modules linked in: xt_nat xt_tcpudp veth xt_conntrack
xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo
xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc edac_mce_amd
kvm_amd snd_hda_codec_ca0110 snd_hda_codec_generic wmi_bmof kvm
ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_nhlt pktcdvd
irqbypass snd_hda_codec uvcvideo snd_hda_core snd_hwdep
videobuf2_vmalloc snd_pcm videobuf2_memops nls_iso8859_1
videobuf2_v4l2 nls_cp437 videobuf2_common snd_timer crct10dif_pclmul
vfat crc32_pclmul videodev fat snd joydev ghash_clmulni_intel
input_leds mousedev mc psmouse aesni_intel r8169 crypto_simd realtek
cryptd ccp glue_helper k10temp i2c_piix4 soundcore libphy rng_core wmi
gpio_amdpt evdev mac_hid pinctrl_amd acpi_cpufreq fuse vboxnetflt(OE)
vboxnetadp(OE) vboxdrv(OE) sg crypto_user ip_tables x_tables sr_mod
cdrom sd_mod hid_generic usbhid hid serio_raw atkbd libps2 ahci
libahci libata xhci_pci
[  +0,000018]  xhci_hcd scsi_mod i8042 serio amdgpu gpu_sched
i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops drm agpgart btrfs libcrc32c crc32c_generic crc32c_intel
xor raid6_pq
[  +0,000005] CPU: 7 PID: 382 Comm: btrfs-transacti Tainted: G
  OE     5.4.14-arch1-1 #1
[  +0,000001] Hardware name: Gigabyte Technology Co., Ltd.
AB350M-DS3H/AB350M-DS3H-CF, BIOS F50a 11/27/2019
[  +0,000010] RIP: 0010:__btrfs_free_extent.isra.0+0x694/0x9e0 [btrfs]
[  +0,000001] Code: e8 c1 ee 00 00 8b 4c 24 38 85 c9 0f 84 39 fe ff ff
48 8b 54 24 48 e9 04 fe ff ff 44 89 fe 48 c7 c7 a0 ce 30 c0 e8 ba 48
c4 d1 <0f> 0b 48 8b 3c 24 44 89 f9 ba 08 0c 00 00 48 c7 c6 a0 20 30 c0
e8
[  +0,000001] RSP: 0018:ffff8fc081363ba0 EFLAGS: 00010286
[  +0,000001] RAX: 0000000000000000 RBX: 0000000000000192 RCX: 0000000000000000
[  +0,000000] RDX: 0000000000000001 RSI: 0000000000000096 RDI: 00000000ffffffff
[  +0,000001] RBP: 0000000b3090a000 R08: 000000000000049b R09: 0000000000000004
[  +0,000000] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8b958a1c9c40
[  +0,000001] R13: 0000000000000000 R14: 0000000000000001 R15: 00000000ffffff8b
[  +0,000001] FS:  0000000000000000(0000) GS:ffff8b958e9c0000(0000)
knlGS:0000000000000000
[  +0,000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0,000001] CR2: 00007fdcf263d000 CR3: 000000032f11a000 CR4: 00000000003406e0
[  +0,000001] Call Trace:
[  +0,000012]  ? __btrfs_run_delayed_refs+0xc9f/0xff0 [btrfs]
[  +0,000009]  __btrfs_run_delayed_refs+0x25e/0xff0 [btrfs]
[  +0,000011]  btrfs_run_delayed_refs+0x6a/0x180 [btrfs]
[  +0,000013]  btrfs_start_dirty_block_groups+0x28e/0x470 [btrfs]
[  +0,000011]  btrfs_commit_transaction+0x116/0x9b0 [btrfs]
[  +0,000003]  ? _raw_spin_unlock+0x16/0x30
[  +0,000010]  ? join_transaction+0x108/0x3a0 [btrfs]
[  +0,000010]  transaction_kthread+0x13a/0x180 [btrfs]
[  +0,000002]  kthread+0xfb/0x130
[  +0,000010]  ? btrfs_cleanup_transaction+0x560/0x560 [btrfs]
[  +0,000001]  ? kthread_park+0x90/0x90
[  +0,000001]  ret_from_fork+0x1f/0x40
[  +0,000002] ---[ end trace 51366456523028bd ]---
[  +0,000001] BTRFS: error (device nvme0n1p3) in
__btrfs_free_extent:3080: errno=-117 unknown
[  +0,000001] BTRFS info (device nvme0n1p3): forced readonly
[  +0,000002] BTRFS: error (device nvme0n1p3) in
btrfs_run_delayed_refs:2188: errno=-117 unknown
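
For readers decoding the log above: the kernel reports failures as negated errno values, and the message itself prints "errno=-117 unknown". A minimal Python sketch of the lookup follows, assuming the generic Linux errno numbering. The table is a hand-copied subset, not queried from the running system, and `describe_kernel_errno` is a hypothetical helper name used only for illustration.

```python
# Hand-copied subset of generic Linux errno values (assumption: x86-64 Linux).
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    117: ("EUCLEAN", "Structure needs cleaning"),
}


def describe_kernel_errno(value: int) -> str:
    """Map a negated kernel errno such as -117 to its symbolic name."""
    name, text = LINUX_ERRNO.get(abs(value), ("UNKNOWN", "not in this table"))
    return f"{value}: {name} ({text})"


print(describe_kernel_errno(-117))
```

EUCLEAN is the code Linux filesystems commonly use to report detected on-disk corruption, which fits the "Transaction aborted" and "forced readonly" lines in the trace.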


* Re: tree first key mismatch detected (reproducible error)
  2020-01-25 11:37 tree first key mismatch detected (reproducible error) Thorsten Hirsch
@ 2020-01-25 11:46 ` Andrei Borzenkov
  2020-01-25 12:23   ` Qu Wenruo
  2020-01-25 12:15 ` Qu Wenruo
  1 sibling, 1 reply; 8+ messages in thread
From: Andrei Borzenkov @ 2020-01-25 11:46 UTC (permalink / raw)
  To: Thorsten Hirsch, linux-btrfs

25.01.2020 14:37, Thorsten Hirsch wrote:
> Hi, here's a btrfs problem that started happening today on my main computer:
> 
> BTRFS error (device nvme0n1p3): tree first key mismatch detected,
> bytenr=109690880 parent_transid=1329869 key
> expected=(48044838912,168,12288) has=(48045363200,168,12288)
> 

This looks like a single bit flip:

48044838912 == 0xB2FB21000
48045363200 == 0xB2FBA1000

so the usual recommendation applies: check your RAM.
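
The comparison can be checked mechanically by XORing the two key objectids from the error message and listing which bit positions differ. A minimal Python sketch (the variable names are illustrative only):

```python
expected = 48044838912  # objectid from "expected=(...)" in the error message
has = 48045363200       # objectid from "has=(...)" in the error message

# XOR leaves a 1 in every bit position where the two values disagree.
diff = expected ^ has
flipped_bits = [i for i in range(diff.bit_length()) if (diff >> i) & 1]

print(f"expected = {expected:#x}")
print(f"has      = {has:#x}")
print(f"xor      = {diff:#x}")
print(f"differing bit positions: {flipped_bits}")
```

If the list holds exactly one position, the two values differ by a single bit. That pattern is typical of a memory error, though it does not prove one.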


* Re: tree first key mismatch detected (reproducible error)
  2020-01-25 11:37 tree first key mismatch detected (reproducible error) Thorsten Hirsch
  2020-01-25 11:46 ` Andrei Borzenkov
@ 2020-01-25 12:15 ` Qu Wenruo
  1 sibling, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2020-01-25 12:15 UTC (permalink / raw)
  To: Thorsten Hirsch, linux-btrfs


On 2020/1/25 7:37 PM, Thorsten Hirsch wrote:
> Hi, here's a btrfs problem that started happening today on my main computer:
> 
> BTRFS error (device nvme0n1p3): tree first key mismatch detected,
> bytenr=109690880 parent_transid=1329869 key
> expected=(48044838912,168,12288) has=(48045363200,168,12288)

This means your fs is already corrupted.

The only good news is that the corruption is in the extent tree,
so you can still salvage your data in RO mode.
> 
> It always occurs some minutes after booting, sometimes even seconds
> after booting. The partition is then remounted read-only. I already
> tried scrubbing the partition (aborts itself after some seconds) and
> balancing (seems to trigger the error immediately and doesn't even
> start).

Please run `btrfs check` on the unmounted fs. (Since you're already
using Arch, booting the latest Arch ISO looks like the best option if
it's your root fs.)

If you feel like having an adventure, you could try `btrfs check
--init-extent-tree` after posting the `btrfs check` result.

It can be very slow, and may not always fix your problem.

> 
> I attached some more output of dmesg. The distribution is Arch Linux
> and the kernel is the most recent one in Arch's default kernel
> package: 5.4.14-arch1-1 (I upgraded from 5.4.13 to 5.4.14 just
> yesterday).

Arch's kernel stays close to upstream, which is generally good for
btrfs, and the same goes for its btrfs-progs version.

Thanks,
Qu


* Re: tree first key mismatch detected (reproducible error)
  2020-01-25 11:46 ` Andrei Borzenkov
@ 2020-01-25 12:23   ` Qu Wenruo
  2020-01-25 15:44     ` Thorsten Hirsch
  0 siblings, 1 reply; 8+ messages in thread
From: Qu Wenruo @ 2020-01-25 12:23 UTC (permalink / raw)
  To: Andrei Borzenkov, Thorsten Hirsch, linux-btrfs


On 2020/1/25 7:46 PM, Andrei Borzenkov wrote:
> 25.01.2020 14:37, Thorsten Hirsch wrote:
>> Hi, here's a btrfs problem that started happening today on my main computer:
>>
>> BTRFS error (device nvme0n1p3): tree first key mismatch detected,
>> bytenr=109690880 parent_transid=1329869 key
>> expected=(48044838912,168,12288) has=(48045363200,168,12288)
>>
> 
> This looks like bit flip
> 
> 48044838912 == B2FB21000
> 48045363200 == B2FBA1000
> 
> with usual recommendation to check your RAM.
>

Oops, I forgot the bit-flip case.

As Andrei mentioned, make sure the memory problem is solved first,
then run `btrfs check`.

Thanks,
Qu


* Re: tree first key mismatch detected (reproducible error)
  2020-01-25 12:23   ` Qu Wenruo
@ 2020-01-25 15:44     ` Thorsten Hirsch
  2020-01-25 16:01       ` Martin Raiber
  0 siblings, 1 reply; 8+ messages in thread
From: Thorsten Hirsch @ 2020-01-25 15:44 UTC (permalink / raw)
  To: linux-btrfs

Thanks, guys.

However, checking the RAM with memtest86 hasn't revealed any errors.
I'm currently letting it run another pass, but so far everything's good.
Here's the output of btrfs check:

[1/7] checking root items
[2/7] checking extents
leaf parent key incorrect 109690880
bad block 109690880
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space cache
[4/7] checking fs roots
root 5 inode 3583162 errors 1040, bad file extent, some csum missing
root 5 inode 3767022 errors 1040, bad file extent, some csum missing
root 5 inode 3819591 errors 1040, bad file extent, some csum missing
root 5 inode 4108194 errors 1040, bad file extent, some csum missing
ERROR: errors found in fs roots
Opening filesystem to check...
Checking filesystem on /dev/nvme0n1p3
UUID: 26717c9f-df62-4c57-a482-b9e4880b31e6
found 6132469760 bytes used, error(s) found
total csum bytes: 0
total tree bytes: 4161536
total fs tree bytes: 0
total extent tree bytes: 3850240
btree space waste bytes: 1115823
file data blocks allocated: 108003328
 referenced 108003328

-- 
Thorsten



* Re: tree first key mismatch detected (reproducible error)
  2020-01-25 15:44     ` Thorsten Hirsch
@ 2020-01-25 16:01       ` Martin Raiber
  2020-01-26 11:17         ` Thorsten Hirsch
  0 siblings, 1 reply; 8+ messages in thread
From: Martin Raiber @ 2020-01-25 16:01 UTC (permalink / raw)
  To: Thorsten Hirsch, linux-btrfs

On 25.01.2020 16:44 Thorsten Hirsch wrote:
> Thanks, guys.
>
> However, checking the RAM with memtest86 hasn't revealed any errors.
> Currently I let it run another pass, but so far everything's good.
> Here's the output of btrfs check...

Just from my experience with non-ECC RAM:
When I had RAM corruption, it only showed up after a few days of uptime,
and only when I ran memtester on Linux. memtest86/memtest86+ didn't show
any problems even after running for a week (in multi-CPU mode too).


* Re: tree first key mismatch detected (reproducible error)
  2020-01-25 16:01       ` Martin Raiber
@ 2020-01-26 11:17         ` Thorsten Hirsch
  2020-01-26 13:03           ` Qu Wenruo
  0 siblings, 1 reply; 8+ messages in thread
From: Thorsten Hirsch @ 2020-01-26 11:17 UTC (permalink / raw)
  To: linux-btrfs

Thank you, Martin. I started memtester yesterday, and it has since
run 90 loops without any errors.
Back to btrfs:

- I could restore pretty much all data with "btrfs restore", except
for some virtualbox disk images
- "btrfs check --init-extent-tree" took some hours to finish, but I
still couldn't mount the partition due to multiple "corrupt leaf"
errors
- mounting with "-o backuproot" resulted in the same error
- "btrfs rescue super-recover" said everything was fine
- after "btrfs rescue chunk-recover" or "btrfs check --repair" there
was only 1 "corrupt leaf" error left, but mounting was still not
possible

So basically the mount errors after "btrfs check --init-extent-tree"
and all later commands looked like this:

[64385.439530] BTRFS critical (device nvme0n1p3): corrupt leaf:
block=156450816 slot=30 extent bytenr=51548897280 len=262144 invalid
generation, have 315981823 expect (0, 2265510]
[64385.440779] BTRFS error (device nvme0n1p3): block=156450816 read
time tree block corruption detected
[64385.440785] BTRFS error (device nvme0n1p3): failed to read block groups: -5
[64385.493696] BTRFS error (device nvme0n1p3): open_ctree failed
mount: /mnt/nvme: wrong fs type, bad option, bad superblock on
/dev/nvme0n1p3, missing codepage or helper program, or other error.
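
For readers unfamiliar with the "invalid generation" line above: "expect (0, 2265510]" is interval notation, open at 0 and closed at the upper bound. The following is only a minimal sketch that re-reads the numbers printed in that message and applies the interval as written; the helper names are hypothetical and this is not the kernel's or btrfs-progs' code.

```python
import re

# Hypothetical helper, not part of btrfs-progs or the kernel: it only
# re-reads the numbers printed in the dmesg line quoted above.
PATTERN = re.compile(
    r"corrupt leaf: block=(?P<block>\d+) slot=(?P<slot>\d+) "
    r"extent bytenr=(?P<bytenr>\d+) len=(?P<len>\d+) "
    r"invalid generation, have (?P<have>\d+) expect \(0, (?P<upper>\d+)\]"
)


def parse_corrupt_leaf(line):
    """Return the numeric fields of an 'invalid generation' message, or None."""
    # Mail clients and dmesg wrap long lines; collapse whitespace before matching.
    match = PATTERN.search(" ".join(line.split()))
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def generation_in_range(have, upper):
    """Apply the printed interval (0, upper]: open at 0, closed at upper."""
    return 0 < have <= upper


message = """[64385.439530] BTRFS critical (device nvme0n1p3): corrupt leaf:
block=156450816 slot=30 extent bytenr=51548897280 len=262144 invalid
generation, have 315981823 expect (0, 2265510]"""

fields = parse_corrupt_leaf(message)
print(fields)
print(generation_in_range(fields["have"], fields["upper"]))
```

With the values from the log, the reported generation 315981823 lies far above the upper bound 2265510, which is why the leaf is rejected at read time.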

Then I gave up and called mkfs.btrfs. Currently the restored data is
on its way back to the device.

-- 
Thorsten

On Sat, 25 Jan 2020 at 17:01, Martin Raiber <martin@urbackup.org> wrote:
>
> On 25.01.2020 16:44 Thorsten Hirsch wrote:
> > Thanks, guys.
> >
> > However, checking the RAM with memtest86 hasn't revealed any errors.
> > Currently I let it run another pass, but so far everything's good.
> > Here's the output of btrfs check...
>
> just from my experience with non-ECC RAM:
> When I had RAM corruption it only occurred after a few days of uptime
> and only when I ran memtester on Linux. memtest86/memtest86+ didn't show
> any problems even when running for a week (and in multi cpu mode).
>
> > [1/7] checking root items
> > [2/7] checking extents
> > leaf parent key incorrect 109690880
> > bad block 109690880
> > ERROR: errors found in extent allocation tree or chunk allocation
> > [3/7] checking free space cache
> > [4/7] checking fs roots
> > root 5 inode 3583162 errors 1040, bad file extent, some csum missing
> > root 5 inode 3767022 errors 1040, bad file extent, some csum missing
> > root 5 inode 3819591 errors 1040, bad file extent, some csum missing
> > root 5 inode 4108194 errors 1040, bad file extent, some csum missing
> > ERROR: errors found in fs roots
> > Opening filesystem to check...
> > Checking filesystem on /dev/nvme0n1p3
> > UUID: 26717c9f-df62-4c57-a482-b9e4880b31e6
> > found 6132469760 bytes used, error(s) found
> > total csum bytes: 0
> > total tree bytes: 4161536
> > total fs tree bytes: 0
> > total extent tree bytes: 3850240
> > btree space waste bytes: 1115823
> > file data blocks allocated: 108003328
> >  referenced 108003328
> >
>


* Re: tree first key mismatch detected (reproducible error)
  2020-01-26 11:17         ` Thorsten Hirsch
@ 2020-01-26 13:03           ` Qu Wenruo
  0 siblings, 0 replies; 8+ messages in thread
From: Qu Wenruo @ 2020-01-26 13:03 UTC (permalink / raw)
  To: Thorsten Hirsch, linux-btrfs





On 2020/1/26 7:17 PM, Thorsten Hirsch wrote:
> Thank you, Martin. So I started memtester yesterday and meanwhile it
> has run 90 loops w/o any errors.
> Back to btrfs:
> 
> - I could restore pretty much all data with "btrfs restore", except
> for some virtualbox disk images
> - "btrfs check --init-extent-tree" took some hours to finish, but I
> still couldn't mount the partition due to multiple "corrupt leaf"
> errors

That's due to a bug in btrfs-progs where extent item generation is not
reset properly.

You can either use the devel branch
https://github.com/kdave/btrfs-progs/tree/devel

Or at least apply this commit to fix it, without pulling in the other
patches, which may break --init-extent-tree.
https://github.com/kdave/btrfs-progs/commit/8d45dc270a3791d7217625190c9fc8f7cc129285

Or you can just use v5.3 to skip this warning, then do a full balance to
reset the whole extent tree and call it a day.

Thanks,
Qu

> - mounting with "-o backuproot" resulted in the same error
> - "btrfs rescue super-recover" said everything was fine
> - after "btrfs rescue chunk-recover" or "btrfs check --repair" there
> was only 1 "corrupt leaf" error left, but mounting was still not
> possible
> 
> So basically the mount errors after "btrfs check --init-extent-tree"
> and all later commands looked like this:
> 
> [64385.439530] BTRFS critical (device nvme0n1p3): corrupt leaf:
> block=156450816 slot=30 extent bytenr=51548897280 len=262144 invalid
> generation, have 315981823 expect (0, 2265510]
> [64385.440779] BTRFS error (device nvme0n1p3): block=156450816 read
> time tree block corruption detected
> [64385.440785] BTRFS error (device nvme0n1p3): failed to read block groups: -5
> [64385.493696] BTRFS error (device nvme0n1p3): open_ctree failed
> mount: /mnt/nvme: wrong fs type, bad option, bad superblock on
> /dev/nvme0n1p3, missing codepage or helper program, or other error.
> 
> Then I gave up and called mkfs.btrfs. Currently the restored data is
> on its way back to the device.
> 




Thread overview: 8+ messages
2020-01-25 11:37 tree first key mismatch detected (reproducible error) Thorsten Hirsch
2020-01-25 11:46 ` Andrei Borzenkov
2020-01-25 12:23   ` Qu Wenruo
2020-01-25 15:44     ` Thorsten Hirsch
2020-01-25 16:01       ` Martin Raiber
2020-01-26 11:17         ` Thorsten Hirsch
2020-01-26 13:03           ` Qu Wenruo
2020-01-25 12:15 ` Qu Wenruo
