* Kernel lockup, might be helpful log.
From: Birdsarenice @ 2015-12-13 22:55 UTC
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 1423 bytes --]
I've finally finished deleting all those nasty unreliable Seagate drives
from my array. During the process I crashed my server - over, and over,
and over. Completely gone - screen blank, controls unresponsive, no
network activity (no, I don't have root on btrfs - data only). Most
annoying, but I think btrfs survived it all somehow - it's scrubbing now.
Meanwhile, I did get lucky: At one crash I happened to be logged in and
was able to hit dmesg seconds before it went completely. So what I have
here is information that looks like it'll help you track down a
rarely-encountered and hard-to-reproduce bug which can cause the system
to lock up completely in the event of certain types of hard drive failure.
It might be nothing, but perhaps someone will find it of use - because
it'd be tricky both to reproduce and to get a good error report for if it
did occur.
I see an 'invalid opcode' error in here, which is pretty unusual - and
again it even gives a file name and line number to look at
(fs/btrfs/extent-tree.c:1832). The root
cause of all my issues is the NCQ issue with Seagate 8TB archive drives,
which is Someone Else's Problem - but I think some good can come of
this, as these exotic forms of corruption and weird drive semi-failures
have revealed ways in which btrfs's error handling could be made more
graceful.
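
In case it helps anyone triage a log like the attached one, here is a
rough sketch of how the interesting lines could be pulled out of it. The
function name summarize() and the three regular expressions are my own
assumptions, written against the message shapes that appear in this
particular log; other kernel versions may word these lines differently.

```python
import re
from collections import Counter

# Patterns are assumptions based on the messages in the attached log.
# "kernel BUG at <file>:<line>!" marks the BUG_ON() that fired.
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
# "soft lockup - CPU#N stuck for XXs! [task:pid]" from the NMI watchdog.
LOCKUP_RE = re.compile(
    r"soft lockup - CPU#(?P<cpu>\d+) stuck for \d+s! \[(?P<task>[^\]]+)\]"
)
# "ataN.MM: failed command: <COMMAND>" from libata error handling.
ATA_FAIL_RE = re.compile(r"(?P<port>ata\d+\.\d+): failed command: (?P<cmd>.+)$")


def summarize(log_text):
    """Return (bugs, lockups, ata_failures) found in a dmesg dump.

    bugs          -- list of (source file, line number) tuples
    lockups       -- Counter keyed by (cpu, "task:pid")
    ata_failures  -- Counter keyed by (port, command name)
    """
    bugs = []
    lockups = Counter()
    ata_failures = Counter()
    for line in log_text.splitlines():
        match = BUG_RE.search(line)
        if match:
            bugs.append((match.group("file"), int(match.group("line"))))
            continue
        match = LOCKUP_RE.search(line)
        if match:
            lockups[(int(match.group("cpu")), match.group("task"))] += 1
            continue
        match = ATA_FAIL_RE.search(line)
        if match:
            ata_failures[(match.group("port"), match.group("cmd").strip())] += 1
    return bugs, lockups, ata_failures
```

Calling summarize(open("btrfslog.txt").read()) on the attachment should
list the BUG location first, then show which tasks the watchdog kept
reporting as stuck and which queued commands timed out on the drive.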
Meanwhile I remain impressed that btrfs appears to have kept all my data
intact even through all these issues.
[-- Attachment #2: btrfslog.txt --]
[-- Type: text/plain, Size: 36293 bytes --]
[11668.697976] BTRFS info (device sde1): relocating block group 5932520046592 flags 17
[11676.977183] BTRFS info (device sde1): found 20 extents
[11686.138376] BTRFS info (device sde1): found 20 extents
[11686.567242] BTRFS info (device sde1): relocating block group 5935741272064 flags 17
[11695.452025] BTRFS info (device sde1): found 17 extents
[11704.627191] BTRFS info (device sde1): found 17 extents
[11705.966792] BTRFS info (device sde1): relocating block group 5938962497536 flags 17
[11715.343790] BTRFS info (device sde1): found 15 extents
[11724.219660] BTRFS info (device sde1): found 15 extents
[11724.910970] BTRFS info (device sde1): relocating block group 5940036239360 flags 17
[11733.289804] BTRFS info (device sde1): found 22 extents
[11741.538676] BTRFS info (device sde1): found 22 extents
[11742.019752] BTRFS info (device sde1): relocating block group 5941109981184 flags 17
[11751.676514] BTRFS info (device sde1): found 14 extents
[11759.404371] ------------[ cut here ]------------
[11759.404439] kernel BUG at ../fs/btrfs/extent-tree.c:1832!
[11759.404514] invalid opcode: 0000 [#1] PREEMPT SMP
[11759.404600] Modules linked in: xt_nat nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_conntrack xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables af_packet bridge stp llc iscsi_ibft iscsi_boot_sysfs btrfs xor x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel raid6_pq aesni_intel aes_x86_64 lrw gf128mul iTCO_wdt glue_helper ablk_helper iTCO_vendor_support cryptd pcspkr i2c_i801 ib_mthca lpc_ich tpm_tis 8250_fintek ie31200_edac mfd_core shpchp battery edac_core thermal tpm video fan button processor hid_generic usbhid uas usb_storage amdkfd amd_iommu_v2 radeon igb dca i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[11759.405914] fb_sys_fops ttm drm xhci_pci xhci_hcd ehci_pci ehci_hcd usbcore usb_common e1000e ptp pps_core fjes vhost_net tun vhost macvtap macvlan sg rpcrdma sunrpc rdma_cm iw_cm ib_ipoib ib_cm ib_sa ib_umad ib_mad ib_core ib_addr
[11759.406328] CPU: 2 PID: 2060 Comm: btrfs Not tainted 4.3.0-2-default #1
[11759.406414] Hardware name: FUJITSU PRIMERGY TX100 S3P/D3009-B1, BIOS V4.6.5.3 R1.10.0 for D3009-B1x 12/18/2012
[11759.406555] task: ffff88042f832040 ti: ffff88041cae4000 task.ti: ffff88041cae4000
[11759.406659] RIP: 0010:[<ffffffffa0b53cf6>] [<ffffffffa0b53cf6>] insert_inline_extent_backref+0xc6/0xd0 [btrfs]
[11759.406815] RSP: 0018:ffff88041cae7830 EFLAGS: 00010293
[11759.406889] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
[11759.406986] RDX: ffff880000000000 RSI: 0000000000000001 RDI: 0000000000000000
[11759.407085] RBP: ffff88041cae7890 R08: 0000000000004000 R09: ffff88041cae7748
[11759.407184] R10: 0000000000000000 R11: 0000000000000003 R12: ffff880412615800
[11759.407283] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8800c92aef50
[11759.407383] FS: 00007f2e3b1678c0(0000) GS:ffff88042fd00000(0000) knlGS:0000000000000000
[11759.407497] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11759.407576] CR2: 000055f473f59f28 CR3: 00000004180be000 CR4: 00000000001406e0
[11759.407675] Stack:
[11759.407706] 0000000000000000 0000000000000102 0000000000000000 0000000000000000
[11759.407831] ffffffff00000001 ffff88041170d800 00000000000032b6 ffff88041170d800
[11759.407949] ffff88030f0203b0 ffff8800c92aef50 0000000000000102 ffff88040b22e000
[11759.408069] Call Trace:
[11759.408127] [<ffffffffa0b54188>] __btrfs_inc_extent_ref.isra.52+0x98/0x250 [btrfs]
[11759.408239] [<ffffffffa0b59709>] __btrfs_run_delayed_refs+0xc79/0x10a0 [btrfs]
[11759.408349] [<ffffffffa0b5c7d2>] btrfs_run_delayed_refs+0x82/0x290 [btrfs]
[11759.408453] [<ffffffffa0b70cc3>] btrfs_commit_transaction+0x43/0xb60 [btrfs]
[11759.408562] [<ffffffffa0bc0e90>] prepare_to_merge+0x200/0x210 [btrfs]
[11759.408663] [<ffffffffa0bc189f>] relocate_block_group+0x24f/0x6a0 [btrfs]
[11759.408766] [<ffffffffa0bc1ea3>] btrfs_relocate_block_group+0x1b3/0x290 [btrfs]
[11759.408878] [<ffffffffa0b98c02>] btrfs_relocate_chunk.isra.36+0x52/0xe0 [btrfs]
[11759.408990] [<ffffffffa0b996c3>] btrfs_shrink_device+0x1a3/0x530 [btrfs]
[11759.409092] [<ffffffffa0b9d83c>] btrfs_rm_device+0x31c/0x800 [btrfs]
[11759.409187] [<ffffffffa0ba9118>] btrfs_ioctl+0x16f8/0x2870 [btrfs]
[11759.409273] [<ffffffff811f5645>] do_vfs_ioctl+0x285/0x460
[11759.409351] [<ffffffff811f5899>] SyS_ioctl+0x79/0x90
[11759.409419] [<ffffffff8166e4b6>] entry_SYSCALL_64_fastpath+0x16/0x75
[11759.411644] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x16/0x75
[11759.411751] Leftover inexact backtrace:
[11759.411825] Code: 45 10 49 89 d9 48 8b 55 d0 4c 89 34 24 4c 89 e9 4c 89 fe 4c 89 e7 48 89 44 24 10 8b 45 28 89 44 24 08 e8 ae d7 ff ff 31 c0 eb bb <0f> 0b 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 be 01 00 00 00
[11759.412375] RIP [<ffffffffa0b53cf6>] insert_inline_extent_backref+0xc6/0xd0 [btrfs]
[11759.412483] RSP <ffff88041cae7830>
[11759.437278] ---[ end trace 8d5c08952a1ee527 ]---
[11759.513591] note: btrfs[2060] exited with preempt_count 1
[11786.883269] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kworker/u8:5:3901]
[11786.889457] Modules linked in: xt_nat nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_conntrack xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables af_packet bridge stp llc iscsi_ibft iscsi_boot_sysfs btrfs xor x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel raid6_pq aesni_intel aes_x86_64 lrw gf128mul iTCO_wdt glue_helper ablk_helper iTCO_vendor_support cryptd pcspkr i2c_i801 ib_mthca lpc_ich tpm_tis 8250_fintek ie31200_edac mfd_core shpchp battery edac_core thermal tpm video fan button processor hid_generic usbhid uas usb_storage amdkfd amd_iommu_v2 radeon igb dca i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[11786.922302] fb_sys_fops ttm drm xhci_pci xhci_hcd ehci_pci ehci_hcd usbcore usb_common e1000e ptp pps_core fjes vhost_net tun vhost macvtap macvlan sg rpcrdma sunrpc rdma_cm iw_cm ib_ipoib ib_cm ib_sa ib_umad ib_mad ib_core ib_addr
[11786.935936] CPU: 3 PID: 3901 Comm: kworker/u8:5 Tainted: G D 4.3.0-2-default #1
[11786.942792] Hardware name: FUJITSU PRIMERGY TX100 S3P/D3009-B1, BIOS V4.6.5.3 R1.10.0 for D3009-B1x 12/18/2012
[11786.949788] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
[11786.956829] task: ffff88019ffec0c0 ti: ffff880253e94000 task.ti: ffff880253e94000
[11786.963886] RIP: 0010:[<ffffffff810b11ec>] [<ffffffff810b11ec>] queued_write_lock_slowpath+0x3c/0x80
[11786.971005] RSP: 0018:ffff880253e979b0 EFLAGS: 00000286
[11786.978112] RAX: 00000000000000ff RBX: ffff88030f0a3700 RCX: 0000000000000001
[11786.985135] RDX: 0000000000000001 RSI: ffff88030f0a36a0 RDI: ffff88030f0a3700
[11786.992032] RBP: ffff880253e979b8 R08: 00000000b5b2d000 R09: ffff88019ffec0c0
[11786.998790] R10: ffff880000000000 R11: 0000000000000001 R12: ffff88030f0a3700
[11787.005402] R13: ffff880403c40d14 R14: ffff880253e97aee R15: ffff880403c40d18
[11787.011885] FS: 0000000000000000(0000) GS:ffff88042fd80000(0000) knlGS:0000000000000000
[11787.018288] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11787.024558] CR2: 000055f473f6d2ef CR3: 0000000001e0f000 CR4: 00000000001406e0
[11787.030830] Stack:
[11787.037003] ffff88030f0a36a0 ffff880253e979c8 ffffffff8166e1a5 ffff880253e979e8
[11787.043239] ffffffffa0baa62a ffff880403c40d10 0000000000000000 ffff880253e97a98
[11787.049404] ffffffffa0b4c95e ffff880253e97a20 ffffffff811d26f6 ffffea000291ed80
[11787.055507] Call Trace:
[11787.061537] [<ffffffff8166e1a5>] _raw_write_lock+0x25/0x30
[11787.067647] [<ffffffffa0baa62a>] btrfs_try_tree_write_lock+0x2a/0x70 [btrfs]
[11787.073796] [<ffffffffa0b4c95e>] btrfs_search_slot+0x93e/0xa70 [btrfs]
[11787.079941] [<ffffffffa0b5378e>] lookup_inline_extent_backref+0xde/0x580 [btrfs]
[11787.086134] [<ffffffffa0b53c85>] insert_inline_extent_backref+0x55/0xd0 [btrfs]
[11787.092274] [<ffffffffa0b54188>] __btrfs_inc_extent_ref.isra.52+0x98/0x250 [btrfs]
[11787.098399] [<ffffffffa0b59709>] __btrfs_run_delayed_refs+0xc79/0x10a0 [btrfs]
[11787.104511] [<ffffffffa0b5c7d2>] btrfs_run_delayed_refs+0x82/0x290 [btrfs]
[11787.110595] [<ffffffffa0b5ca17>] delayed_ref_async_start+0x37/0x90 [btrfs]
[11787.116653] [<ffffffffa0ba104a>] btrfs_scrubparity_helper+0xca/0x300 [btrfs]
[11787.122710] [<ffffffffa0ba12be>] btrfs_extent_refs_helper+0xe/0x10 [btrfs]
[11787.128768] [<ffffffff81080c89>] process_one_work+0x159/0x470
[11787.134785] [<ffffffff81080fe8>] worker_thread+0x48/0x4a0
[11787.140757] [<ffffffff81086c79>] kthread+0xc9/0xe0
[11787.146702] [<ffffffff8166e84f>] ret_from_fork+0x3f/0x70
[11787.153690] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[11787.165525] Leftover inexact backtrace:
[11787.177257] [<ffffffff81086bb0>] ? kthread_worker_fn+0x170/0x170
[11787.183186] Code: 48 89 fb f0 0f b1 57 04 85 c0 75 4b 8b 03 85 c0 75 0d ba ff 00 00 00 f0 0f b1 13 85 c0 74 31 ba 01 00 00 00 eb 02 f3 90 0f b6 03 <84> c0 75 f7 f0 0f b0 13 84 c0 75 ef ba ff 00 00 00 eb 02 f3 90
[11814.877090] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kworker/u8:5:3901]
[11814.883399] Modules linked in: xt_nat nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_conntrack xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables af_packet bridge stp llc iscsi_ibft iscsi_boot_sysfs btrfs xor x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel raid6_pq aesni_intel aes_x86_64 lrw gf128mul iTCO_wdt glue_helper ablk_helper iTCO_vendor_support cryptd pcspkr i2c_i801 ib_mthca lpc_ich tpm_tis 8250_fintek ie31200_edac mfd_core shpchp battery edac_core thermal tpm video fan button processor hid_generic usbhid uas usb_storage amdkfd amd_iommu_v2 radeon igb dca i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[11814.917910] fb_sys_fops ttm drm xhci_pci xhci_hcd ehci_pci ehci_hcd usbcore usb_common e1000e ptp pps_core fjes vhost_net tun vhost macvtap macvlan sg rpcrdma sunrpc rdma_cm iw_cm ib_ipoib ib_cm ib_sa ib_umad ib_mad ib_core ib_addr
[11814.932558] CPU: 3 PID: 3901 Comm: kworker/u8:5 Tainted: G D L 4.3.0-2-default #1
[11814.939752] Hardware name: FUJITSU PRIMERGY TX100 S3P/D3009-B1, BIOS V4.6.5.3 R1.10.0 for D3009-B1x 12/18/2012
[11814.946900] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
[11814.954020] task: ffff88019ffec0c0 ti: ffff880253e94000 task.ti: ffff880253e94000
[11814.961141] RIP: 0010:[<ffffffff810b11ec>] [<ffffffff810b11ec>] queued_write_lock_slowpath+0x3c/0x80
[11814.968357] RSP: 0018:ffff880253e979b0 EFLAGS: 00000286
[11814.975549] RAX: 00000000000000ff RBX: ffff88030f0a3700 RCX: 0000000000000001
[11814.982749] RDX: 0000000000000001 RSI: ffff88030f0a36a0 RDI: ffff88030f0a3700
[11814.989814] RBP: ffff880253e979b8 R08: 00000000b5b2d000 R09: ffff88019ffec0c0
[11814.996750] R10: ffff880000000000 R11: 0000000000000001 R12: ffff88030f0a3700
[11815.003546] R13: ffff880403c40d14 R14: ffff880253e97aee R15: ffff880403c40d18
[11815.010206] FS: 0000000000000000(0000) GS:ffff88042fd80000(0000) knlGS:0000000000000000
[11815.016783] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11815.023223] CR2: 000055f473f6d2ef CR3: 0000000001e0f000 CR4: 00000000001406e0
[11815.029569] Stack:
[11815.035820] ffff88030f0a36a0 ffff880253e979c8 ffffffff8166e1a5 ffff880253e979e8
[11815.042141] ffffffffa0baa62a ffff880403c40d10 0000000000000000 ffff880253e97a98
[11815.048382] ffffffffa0b4c95e ffff880253e97a20 ffffffff811d26f6 ffffea000291ed80
[11815.054609] Call Trace:
[11815.060712] [<ffffffff8166e1a5>] _raw_write_lock+0x25/0x30
[11815.066819] [<ffffffffa0baa62a>] btrfs_try_tree_write_lock+0x2a/0x70 [btrfs]
[11815.072929] [<ffffffffa0b4c95e>] btrfs_search_slot+0x93e/0xa70 [btrfs]
[11815.079041] [<ffffffffa0b5378e>] lookup_inline_extent_backref+0xde/0x580 [btrfs]
[11815.085165] [<ffffffffa0b53c85>] insert_inline_extent_backref+0x55/0xd0 [btrfs]
[11815.091275] [<ffffffffa0b54188>] __btrfs_inc_extent_ref.isra.52+0x98/0x250 [btrfs]
[11815.097405] [<ffffffffa0b59709>] __btrfs_run_delayed_refs+0xc79/0x10a0 [btrfs]
[11815.103484] [<ffffffffa0b5c7d2>] btrfs_run_delayed_refs+0x82/0x290 [btrfs]
[11815.109546] [<ffffffffa0b5ca17>] delayed_ref_async_start+0x37/0x90 [btrfs]
[11815.115578] [<ffffffffa0ba104a>] btrfs_scrubparity_helper+0xca/0x300 [btrfs]
[11815.121606] [<ffffffffa0ba12be>] btrfs_extent_refs_helper+0xe/0x10 [btrfs]
[11815.127574] [<ffffffff81080c89>] process_one_work+0x159/0x470
[11815.133535] [<ffffffff81080fe8>] worker_thread+0x48/0x4a0
[11815.139490] [<ffffffff81086c79>] kthread+0xc9/0xe0
[11815.145377] [<ffffffff8166e84f>] ret_from_fork+0x3f/0x70
[11815.152308] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[11815.164034] Leftover inexact backtrace:
[11815.175660] [<ffffffff81086bb0>] ? kthread_worker_fn+0x170/0x170
[11815.181508] Code: 48 89 fb f0 0f b1 57 04 85 c0 75 4b 8b 03 85 c0 75 0d ba ff 00 00 00 f0 0f b1 13 85 c0 74 31 ba 01 00 00 00 eb 02 f3 90 0f b6 03 <84> c0 75 f7 f0 0f b0 13 84 c0 75 ef ba ff 00 00 00 eb 02 f3 90
[11818.875865] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [btrfs-transacti:2055]
[11818.882095] Modules linked in: xt_nat nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_conntrack xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables af_packet bridge stp llc iscsi_ibft iscsi_boot_sysfs btrfs xor x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel raid6_pq aesni_intel aes_x86_64 lrw gf128mul iTCO_wdt glue_helper ablk_helper iTCO_vendor_support cryptd pcspkr i2c_i801 ib_mthca lpc_ich tpm_tis 8250_fintek ie31200_edac mfd_core shpchp battery edac_core thermal tpm video fan button processor hid_generic usbhid uas usb_storage amdkfd amd_iommu_v2 radeon igb dca i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[11818.916096] fb_sys_fops ttm drm xhci_pci xhci_hcd ehci_pci ehci_hcd usbcore usb_common e1000e ptp pps_core fjes vhost_net tun vhost macvtap macvlan sg rpcrdma sunrpc rdma_cm iw_cm ib_ipoib ib_cm ib_sa ib_umad ib_mad ib_core ib_addr
[11818.930570] CPU: 2 PID: 2055 Comm: btrfs-transacti Tainted: G D L 4.3.0-2-default #1
[11818.937796] Hardware name: FUJITSU PRIMERGY TX100 S3P/D3009-B1, BIOS V4.6.5.3 R1.10.0 for D3009-B1x 12/18/2012
[11818.945137] task: ffff880414eaa180 ti: ffff880414eb0000 task.ti: ffff880414eb0000
[11818.952537] RIP: 0010:[<ffffffff810b1169>] [<ffffffff810b1169>] queued_read_lock_slowpath+0x49/0x90
[11818.960011] RSP: 0018:ffff880414eb39c0 EFLAGS: 00000246
[11818.967332] RAX: 00000000000000ff RBX: ffff8801906fd188 RCX: 0000000000000004
[11818.974542] RDX: 0000000000000001 RSI: 00000000000001ff RDI: ffff8801906fd18c
[11818.981705] RBP: ffff880414eb39c8 R08: 0000000000000009 R09: 0000000000000001
[11818.988755] R10: 00000000000058ec R11: 0000000000000000 R12: ffff8801906fd188
[11818.995661] R13: ffff880414eaa180 R14: ffff880414eaa180 R15: ffff880414eb3a00
[11819.002452] FS: 0000000000000000(0000) GS:ffff88042fd00000(0000) knlGS:0000000000000000
[11819.009170] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11819.015743] CR2: 0000560a819e7001 CR3: 0000000001e0f000 CR4: 00000000001406e0
[11819.022222] Stack:
[11819.028520] ffff8801906fd128 ffff880414eb39d8 ffffffff8166e24a ffff880414eb3a38
[11819.034896] ffffffffa0baa47b ffff8803cb3bd2a0 0000000000000000 ffffffff8137342a
[11819.041222] 0000000000000000 ffff880414eb3a30 ffff8801906fd128 ffff880412615800
[11819.047487] Call Trace:
[11819.053666] [<ffffffff8166e24a>] _raw_read_lock+0x2a/0x30
[11819.059839] [<ffffffffa0baa47b>] btrfs_tree_read_lock+0x3b/0x100 [btrfs]
[11819.065989] [<ffffffffa0b47854>] btrfs_read_lock_root_node+0x34/0x50 [btrfs]
[11819.072114] [<ffffffffa0b4c6f1>] btrfs_search_slot+0x6d1/0xa70 [btrfs]
[11819.078245] [<ffffffffa0b5378e>] lookup_inline_extent_backref+0xde/0x580 [btrfs]
[11819.084399] [<ffffffffa0b53c85>] insert_inline_extent_backref+0x55/0xd0 [btrfs]
[11819.090540] [<ffffffffa0b54188>] __btrfs_inc_extent_ref.isra.52+0x98/0x250 [btrfs]
[11819.096697] [<ffffffffa0b59709>] __btrfs_run_delayed_refs+0xc79/0x10a0 [btrfs]
[11819.102802] [<ffffffffa0b5c7d2>] btrfs_run_delayed_refs+0x82/0x290 [btrfs]
[11819.108892] [<ffffffffa0b70cc3>] btrfs_commit_transaction+0x43/0xb60 [btrfs]
[11819.114959] [<ffffffffa0b6c373>] transaction_kthread+0x1c3/0x230 [btrfs]
[11819.121005] [<ffffffff81086c79>] kthread+0xc9/0xe0
[11819.127000] [<ffffffff8166e84f>] ret_from_fork+0x3f/0x70
[11819.134035] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[11819.145940] Leftover inexact backtrace:
[11819.157701] [<ffffffff81086bb0>] ? kthread_worker_fn+0x170/0x170
[11819.163623] Code: 00 01 00 00 ba 01 00 00 00 48 8d 7f 04 f0 0f b1 53 04 85 c0 75 45 b8 00 01 00 00 f0 0f c1 03 0f b6 c0 3d ff 00 00 00 75 0e f3 90 <8b> 03 0f b6 c0 3d ff 00 00 00 74 f2 c6 43 04 00 5b 5d f3 c3 40
[11832.221823] ata7.00: exception Emask 0x0 SAct 0x3f00 SErr 0x0 action 0x6 frozen
[11832.228088] ata7.00: failed command: READ FPDMA QUEUED
[11832.234346] ata7.00: cmd 60/08:40:58:df:81/00:00:31:00:00/40 tag 8 ncq 4096 in
res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[11832.246923] ata7.00: status: { DRDY }
[11832.253190] ata7.00: failed command: READ FPDMA QUEUED
[11832.259498] ata7.00: cmd 60/08:48:98:7d:87/00:00:27:00:00/40 tag 9 ncq 4096 in
res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[11832.272160] ata7.00: status: { DRDY }
[11832.278507] ata7.00: failed command: READ FPDMA QUEUED
[11832.284882] ata7.00: cmd 60/08:50:00:59:80/00:00:38:00:00/40 tag 10 ncq 4096 in
res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[11832.297686] ata7.00: status: { DRDY }
[11832.304090] ata7.00: failed command: WRITE FPDMA QUEUED
[11832.310543] ata7.00: cmd 61/08:58:e8:08:00/00:00:00:00:00/40 tag 11 ncq 4096 out
res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[11832.323365] ata7.00: status: { DRDY }
[11832.329544] ata7.00: failed command: WRITE FPDMA QUEUED
[11832.335700] ata7.00: cmd 61/10:60:b8:0c:c0/00:00:2c:00:00/40 tag 12 ncq 8192 out
res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[11832.347678] ata7.00: status: { DRDY }
[11832.353457] ata7.00: failed command: WRITE FPDMA QUEUED
[11832.359118] ata7.00: cmd 61/08:68:00:08:40/00:00:39:00:00/40 tag 13 ncq 4096 out
res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[11832.370128] ata7.00: status: { DRDY }
[11832.375410] ata7: hard resetting link
[11832.705753] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[11837.708611] ata7.00: qc timeout (cmd 0xec)
[11837.720643] ata7.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[11837.725692] ata7.00: revalidation failed (errno=-5)
[11837.730681] ata7: hard resetting link
[11838.060538] ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[11842.870910] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kworker/u8:5:3901]
[11842.875955] Modules linked in: xt_nat nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_conntrack xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables af_packet bridge stp llc iscsi_ibft iscsi_boot_sysfs btrfs xor x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel raid6_pq aesni_intel aes_x86_64 lrw gf128mul iTCO_wdt glue_helper ablk_helper iTCO_vendor_support cryptd pcspkr i2c_i801 ib_mthca lpc_ich tpm_tis 8250_fintek ie31200_edac mfd_core shpchp battery edac_core thermal tpm video fan button processor hid_generic usbhid uas usb_storage amdkfd amd_iommu_v2 radeon igb dca i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[11842.903884] fb_sys_fops ttm drm xhci_pci xhci_hcd ehci_pci ehci_hcd usbcore usb_common e1000e ptp pps_core fjes vhost_net tun vhost macvtap macvlan sg rpcrdma sunrpc rdma_cm iw_cm ib_ipoib ib_cm ib_sa ib_umad ib_mad ib_core ib_addr
[11842.915689] CPU: 3 PID: 3901 Comm: kworker/u8:5 Tainted: G D L 4.3.0-2-default #1
[11842.921553] Hardware name: FUJITSU PRIMERGY TX100 S3P/D3009-B1, BIOS V4.6.5.3 R1.10.0 for D3009-B1x 12/18/2012
[11842.927510] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
[11842.933445] task: ffff88019ffec0c0 ti: ffff880253e94000 task.ti: ffff880253e94000
[11842.939419] RIP: 0010:[<ffffffff810b11e7>] [<ffffffff810b11e7>] queued_write_lock_slowpath+0x37/0x80
[11842.945473] RSP: 0018:ffff880253e979b0 EFLAGS: 00000286
[11842.951519] RAX: 00000000000000ff RBX: ffff88030f0a3700 RCX: 0000000000000001
[11842.957604] RDX: 0000000000000001 RSI: ffff88030f0a36a0 RDI: ffff88030f0a3700
[11842.963678] RBP: ffff880253e979b8 R08: 00000000b5b2d000 R09: ffff88019ffec0c0
[11842.969770] R10: ffff880000000000 R11: 0000000000000001 R12: ffff88030f0a3700
[11842.975840] R13: ffff880403c40d14 R14: ffff880253e97aee R15: ffff880403c40d18
[11842.981902] FS: 0000000000000000(0000) GS:ffff88042fd80000(0000) knlGS:0000000000000000
[11842.988014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11842.994116] CR2: 000055f473f6d2ef CR3: 0000000001e0f000 CR4: 00000000001406e0
[11843.000254] Stack:
[11843.006342] ffff88030f0a36a0 ffff880253e979c8 ffffffff8166e1a5 ffff880253e979e8
[11843.012587] ffffffffa0baa62a ffff880403c40d10 0000000000000000 ffff880253e97a98
[11843.018817] ffffffffa0b4c95e ffff880253e97a20 ffffffff811d26f6 ffffea000291ed80
[11843.025046] Call Trace:
[11843.031231] [<ffffffff8166e1a5>] _raw_write_lock+0x25/0x30
[11843.037495] [<ffffffffa0baa62a>] btrfs_try_tree_write_lock+0x2a/0x70 [btrfs]
[11843.043777] [<ffffffffa0b4c95e>] btrfs_search_slot+0x93e/0xa70 [btrfs]
[11843.049906] [<ffffffffa0b5378e>] lookup_inline_extent_backref+0xde/0x580 [btrfs]
[11843.055911] [<ffffffffa0b53c85>] insert_inline_extent_backref+0x55/0xd0 [btrfs]
[11843.061854] [<ffffffffa0b54188>] __btrfs_inc_extent_ref.isra.52+0x98/0x250 [btrfs]
[11843.067808] [<ffffffffa0b59709>] __btrfs_run_delayed_refs+0xc79/0x10a0 [btrfs]
[11843.073739] [<ffffffffa0b5c7d2>] btrfs_run_delayed_refs+0x82/0x290 [btrfs]
[11843.079671] [<ffffffffa0b5ca17>] delayed_ref_async_start+0x37/0x90 [btrfs]
[11843.085561] [<ffffffffa0ba104a>] btrfs_scrubparity_helper+0xca/0x300 [btrfs]
[11843.091477] [<ffffffffa0ba12be>] btrfs_extent_refs_helper+0xe/0x10 [btrfs]
[11843.097350] [<ffffffff81080c89>] process_one_work+0x159/0x470
[11843.103210] [<ffffffff81080fe8>] worker_thread+0x48/0x4a0
[11843.109028] [<ffffffff81086c79>] kthread+0xc9/0xe0
[11843.114832] [<ffffffff8166e84f>] ret_from_fork+0x3f/0x70
[11843.121682] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[11843.133173] Leftover inexact backtrace:
[11843.144541] [<ffffffff81086bb0>] ? kthread_worker_fn+0x170/0x170
[11843.150252] Code: 00 48 89 e5 53 48 89 fb f0 0f b1 57 04 85 c0 75 4b 8b 03 85 c0 75 0d ba ff 00 00 00 f0 0f b1 13 85 c0 74 31 ba 01 00 00 00 eb 02 <f3> 90 0f b6 03 84 c0 75 f7 f0 0f b0 13 84 c0 75 ef ba ff 00 00
[11846.869687] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [btrfs-transacti:2055]
[11846.875750] Modules linked in: xt_nat nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_conntrack xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables af_packet bridge stp llc iscsi_ibft iscsi_boot_sysfs btrfs xor x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel raid6_pq aesni_intel aes_x86_64 lrw gf128mul iTCO_wdt glue_helper ablk_helper iTCO_vendor_support cryptd pcspkr i2c_i801 ib_mthca lpc_ich tpm_tis 8250_fintek ie31200_edac mfd_core shpchp battery edac_core thermal tpm video fan button processor hid_generic usbhid uas usb_storage amdkfd amd_iommu_v2 radeon igb dca i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[11846.908771] fb_sys_fops ttm drm xhci_pci xhci_hcd ehci_pci ehci_hcd usbcore usb_common e1000e ptp pps_core fjes vhost_net tun vhost macvtap macvlan sg rpcrdma sunrpc rdma_cm iw_cm ib_ipoib ib_cm ib_sa ib_umad ib_mad ib_core ib_addr
[11846.922743] CPU: 2 PID: 2055 Comm: btrfs-transacti Tainted: G D L 4.3.0-2-default #1
[11846.929734] Hardware name: FUJITSU PRIMERGY TX100 S3P/D3009-B1, BIOS V4.6.5.3 R1.10.0 for D3009-B1x 12/18/2012
[11846.936810] task: ffff880414eaa180 ti: ffff880414eb0000 task.ti: ffff880414eb0000
[11846.943912] RIP: 0010:[<ffffffff810b116b>] [<ffffffff810b116b>] queued_read_lock_slowpath+0x4b/0x90
[11846.951098] RSP: 0018:ffff880414eb39c0 EFLAGS: 00000246
[11846.958280] RAX: 00000000000001ff RBX: ffff8801906fd188 RCX: 0000000000000004
[11846.965494] RDX: 0000000000000001 RSI: 00000000000001ff RDI: ffff8801906fd18c
[11846.972667] RBP: ffff880414eb39c8 R08: 0000000000000009 R09: 0000000000000001
[11846.979722] R10: 00000000000058ec R11: 0000000000000000 R12: ffff8801906fd188
[11846.986646] R13: ffff880414eaa180 R14: ffff880414eaa180 R15: ffff880414eb3a00
[11846.993452] FS: 0000000000000000(0000) GS:ffff88042fd00000(0000) knlGS:0000000000000000
[11847.000174] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11847.006759] CR2: 0000560a819e7001 CR3: 0000000001e0f000 CR4: 00000000001406e0
[11847.013249] Stack:
[11847.019566] ffff8801906fd128 ffff880414eb39d8 ffffffff8166e24a ffff880414eb3a38
[11847.025958] ffffffffa0baa47b ffff8803cb3bd2a0 0000000000000000 ffffffff8137342a
[11847.032293] 0000000000000000 ffff880414eb3a30 ffff8801906fd128 ffff880412615800
[11847.038581] Call Trace:
[11847.044760] [<ffffffff8166e24a>] _raw_read_lock+0x2a/0x30
[11847.050947] [<ffffffffa0baa47b>] btrfs_tree_read_lock+0x3b/0x100 [btrfs]
[11847.057105] [<ffffffffa0b47854>] btrfs_read_lock_root_node+0x34/0x50 [btrfs]
[11847.063237] [<ffffffffa0b4c6f1>] btrfs_search_slot+0x6d1/0xa70 [btrfs]
[11847.069375] [<ffffffffa0b5378e>] lookup_inline_extent_backref+0xde/0x580 [btrfs]
[11847.075530] [<ffffffffa0b53c85>] insert_inline_extent_backref+0x55/0xd0 [btrfs]
[11847.081667] [<ffffffffa0b54188>] __btrfs_inc_extent_ref.isra.52+0x98/0x250 [btrfs]
[11847.087827] [<ffffffffa0b59709>] __btrfs_run_delayed_refs+0xc79/0x10a0 [btrfs]
[11847.093942] [<ffffffffa0b5c7d2>] btrfs_run_delayed_refs+0x82/0x290 [btrfs]
[11847.100033] [<ffffffffa0b70cc3>] btrfs_commit_transaction+0x43/0xb60 [btrfs]
[11847.106111] [<ffffffffa0b6c373>] transaction_kthread+0x1c3/0x230 [btrfs]
[11847.112157] [<ffffffff81086c79>] kthread+0xc9/0xe0
[11847.118149] [<ffffffff8166e84f>] ret_from_fork+0x3f/0x70
[11847.125196] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[11847.137110] Leftover inexact backtrace:
[11847.148896] [<ffffffff81086bb0>] ? kthread_worker_fn+0x170/0x170
[11847.154826] Code: 00 00 ba 01 00 00 00 48 8d 7f 04 f0 0f b1 53 04 85 c0 75 45 b8 00 01 00 00 f0 0f c1 03 0f b6 c0 3d ff 00 00 00 75 0e f3 90 8b 03 <0f> b6 c0 3d ff 00 00 00 74 f2 c6 43 04 00 5b 5d f3 c3 40 0f b6
[11848.062327] ata7.00: qc timeout (cmd 0xec)
[11848.074328] ata7.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[11848.080540] ata7.00: revalidation failed (errno=-5)
[11848.086743] ata7: limiting SATA link speed to 3.0 Gbps
[11848.092929] ata7: hard resetting link
[11853.465134] ata7: link is slow to respond, please be patient (ready=0)
[11858.116113] ata7: COMRESET failed (errno=-16)
[11858.122356] ata7: hard resetting link
[11863.486922] ata7: link is slow to respond, please be patient (ready=0)
[11868.137901] ata7: COMRESET failed (errno=-16)
[11868.144178] ata7: hard resetting link
[11870.864731] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kworker/u8:5:3901]
[11870.871053] Modules linked in: xt_nat nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_conntrack xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables af_packet bridge stp llc iscsi_ibft iscsi_boot_sysfs btrfs xor x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel raid6_pq aesni_intel aes_x86_64 lrw gf128mul iTCO_wdt glue_helper ablk_helper iTCO_vendor_support cryptd pcspkr i2c_i801 ib_mthca lpc_ich tpm_tis 8250_fintek ie31200_edac mfd_core shpchp battery edac_core thermal tpm video fan button processor hid_generic usbhid uas usb_storage amdkfd amd_iommu_v2 radeon igb dca i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[11870.905338] fb_sys_fops ttm drm xhci_pci xhci_hcd ehci_pci ehci_hcd usbcore usb_common e1000e ptp pps_core fjes vhost_net tun vhost macvtap macvlan sg rpcrdma sunrpc rdma_cm iw_cm ib_ipoib ib_cm ib_sa ib_umad ib_mad ib_core ib_addr
[11870.919238] CPU: 3 PID: 3901 Comm: kworker/u8:5 Tainted: G D L 4.3.0-2-default #1
[11870.925988] Hardware name: FUJITSU PRIMERGY TX100 S3P/D3009-B1, BIOS V4.6.5.3 R1.10.0 for D3009-B1x 12/18/2012
[11870.932707] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
[11870.939301] task: ffff88019ffec0c0 ti: ffff880253e94000 task.ti: ffff880253e94000
[11870.945796] RIP: 0010:[<ffffffff810b11e9>] [<ffffffff810b11e9>] queued_write_lock_slowpath+0x39/0x80
[11870.952246] RSP: 0018:ffff880253e979b0 EFLAGS: 00000286
[11870.958546] RAX: 00000000000000ff RBX: ffff88030f0a3700 RCX: 0000000000000001
[11870.964830] RDX: 0000000000000001 RSI: ffff88030f0a36a0 RDI: ffff88030f0a3700
[11870.971030] RBP: ffff880253e979b8 R08: 00000000b5b2d000 R09: ffff88019ffec0c0
[11870.977168] R10: ffff880000000000 R11: 0000000000000001 R12: ffff88030f0a3700
[11870.983245] R13: ffff880403c40d14 R14: ffff880253e97aee R15: ffff880403c40d18
[11870.989273] FS: 0000000000000000(0000) GS:ffff88042fd80000(0000) knlGS:0000000000000000
[11870.995334] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11871.001395] CR2: 000055f473f6d2ef CR3: 0000000001e0f000 CR4: 00000000001406e0
[11871.007489] Stack:
[11871.013538] ffff88030f0a36a0 ffff880253e979c8 ffffffff8166e1a5 ffff880253e979e8
[11871.019709] ffffffffa0baa62a ffff880403c40d10 0000000000000000 ffff880253e97a98
[11871.025851] ffffffffa0b4c95e ffff880253e97a20 ffffffff811d26f6 ffffea000291ed80
[11871.031949] Call Trace:
[11871.037977] [<ffffffff8166e1a5>] _raw_write_lock+0x25/0x30
[11871.044018] [<ffffffffa0baa62a>] btrfs_try_tree_write_lock+0x2a/0x70 [btrfs]
[11871.050068] [<ffffffffa0b4c95e>] btrfs_search_slot+0x93e/0xa70 [btrfs]
[11871.056088] [<ffffffffa0b5378e>] lookup_inline_extent_backref+0xde/0x580 [btrfs]
[11871.062117] [<ffffffffa0b53c85>] insert_inline_extent_backref+0x55/0xd0 [btrfs]
[11871.068093] [<ffffffffa0b54188>] __btrfs_inc_extent_ref.isra.52+0x98/0x250 [btrfs]
[11871.074086] [<ffffffffa0b59709>] __btrfs_run_delayed_refs+0xc79/0x10a0 [btrfs]
[11871.080071] [<ffffffffa0b5c7d2>] btrfs_run_delayed_refs+0x82/0x290 [btrfs]
[11871.086072] [<ffffffffa0b5ca17>] delayed_ref_async_start+0x37/0x90 [btrfs]
[11871.092059] [<ffffffffa0ba104a>] btrfs_scrubparity_helper+0xca/0x300 [btrfs]
[11871.098055] [<ffffffffa0ba12be>] btrfs_extent_refs_helper+0xe/0x10 [btrfs]
[11871.104042] [<ffffffff81080c89>] process_one_work+0x159/0x470
[11871.110019] [<ffffffff81080fe8>] worker_thread+0x48/0x4a0
[11871.115973] [<ffffffff81086c79>] kthread+0xc9/0xe0
[11871.121904] [<ffffffff8166e84f>] ret_from_fork+0x3f/0x70
[11871.128864] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[11871.140621] Leftover inexact backtrace:
[11871.152326] [<ffffffff81086bb0>] ? kthread_worker_fn+0x170/0x170
[11871.158205] Code: 89 e5 53 48 89 fb f0 0f b1 57 04 85 c0 75 4b 8b 03 85 c0 75 0d ba ff 00 00 00 f0 0f b1 13 85 c0 74 31 ba 01 00 00 00 eb 02 f3 90 <0f> b6 03 84 c0 75 f7 f0 0f b0 13 84 c0 75 ef ba ff 00 00 00 eb
[11873.524706] ata7: link is slow to respond, please be patient (ready=0)
[11874.863507] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [btrfs-transacti:2055]
[11874.869694] Modules linked in: xt_nat nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables xt_conntrack xt_tcpudp ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables af_packet bridge stp llc iscsi_ibft iscsi_boot_sysfs btrfs xor x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel raid6_pq aesni_intel aes_x86_64 lrw gf128mul iTCO_wdt glue_helper ablk_helper iTCO_vendor_support cryptd pcspkr i2c_i801 ib_mthca lpc_ich tpm_tis 8250_fintek ie31200_edac mfd_core shpchp battery edac_core thermal tpm video fan button processor hid_generic usbhid uas usb_storage amdkfd amd_iommu_v2 radeon igb dca i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[11874.902740] fb_sys_fops ttm drm xhci_pci xhci_hcd ehci_pci ehci_hcd usbcore usb_common e1000e ptp pps_core fjes vhost_net tun vhost macvtap macvlan sg rpcrdma sunrpc rdma_cm iw_cm ib_ipoib ib_cm ib_sa ib_umad ib_mad ib_core ib_addr
[11874.916784] CPU: 2 PID: 2055 Comm: btrfs-transacti Tainted: G D L 4.3.0-2-default #1
[11874.923777] Hardware name: FUJITSU PRIMERGY TX100 S3P/D3009-B1, BIOS V4.6.5.3 R1.10.0 for D3009-B1x 12/18/2012
[11874.930887] task: ffff880414eaa180 ti: ffff880414eb0000 task.ti: ffff880414eb0000
[11874.938005] RIP: 0010:[<ffffffff810b1169>] [<ffffffff810b1169>] queued_read_lock_slowpath+0x49/0x90
[11874.945190] RSP: 0018:ffff880414eb39c0 EFLAGS: 00000246
[11874.952391] RAX: 00000000000000ff RBX: ffff8801906fd188 RCX: 0000000000000004
[11874.959587] RDX: 0000000000000001 RSI: 00000000000001ff RDI: ffff8801906fd18c
[11874.966655] RBP: ffff880414eb39c8 R08: 0000000000000009 R09: 0000000000000001
[11874.973600] R10: 00000000000058ec R11: 0000000000000000 R12: ffff8801906fd188
[11874.980418] R13: ffff880414eaa180 R14: ffff880414eaa180 R15: ffff880414eb3a00
[11874.987113] FS: 0000000000000000(0000) GS:ffff88042fd00000(0000) knlGS:0000000000000000
[11874.993724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11875.000193] CR2: 0000560a819e7001 CR3: 0000000001e0f000 CR4: 00000000001406e0
[11875.006573] Stack:
[11875.012853] ffff8801906fd128 ffff880414eb39d8 ffffffff8166e24a ffff880414eb3a38
[11875.019195] ffffffffa0baa47b ffff8803cb3bd2a0 0000000000000000 ffffffff8137342a
[11875.025480] 0000000000000000 ffff880414eb3a30 ffff8801906fd128 ffff880412615800
[11875.031756] Call Trace:
[11875.037915] [<ffffffff8166e24a>] _raw_read_lock+0x2a/0x30
[11875.044081] [<ffffffffa0baa47b>] btrfs_tree_read_lock+0x3b/0x100 [btrfs]
[11875.050254] [<ffffffffa0b47854>] btrfs_read_lock_root_node+0x34/0x50 [btrfs]
[11875.056431] [<ffffffffa0b4c6f1>] btrfs_search_slot+0x6d1/0xa70 [btrfs]
[11875.062586] [<ffffffffa0b5378e>] lookup_inline_extent_backref+0xde/0x580 [btrfs]
[11875.068782] [<ffffffffa0b53c85>] insert_inline_extent_backref+0x55/0xd0 [btrfs]
[11875.074952] [<ffffffffa0b54188>] __btrfs_inc_extent_ref.isra.52+0x98/0x250 [btrfs]
[11875.081103] [<ffffffffa0b59709>] __btrfs_run_delayed_refs+0xc79/0x10a0 [btrfs]
[11875.087219] [<ffffffffa0b5c7d2>] btrfs_run_delayed_refs+0x82/0x290 [btrfs]
[11875.093316] [<ffffffffa0b70cc3>] btrfs_commit_transaction+0x43/0xb60 [btrfs]
[11875.099392] [<ffffffffa0b6c373>] transaction_kthread+0x1c3/0x230 [btrfs]
[11875.105435] [<ffffffff81086c79>] kthread+0xc9/0xe0
[11875.111458] [<ffffffff8166e84f>] ret_from_fork+0x3f/0x70
[11875.118550] DWARF2 unwinder stuck at ret_from_fork+0x3f/0x70
[11875.130489] Leftover inexact backtrace:
[11875.142319] [<ffffffff81086bb0>] ? kthread_worker_fn+0x170/0x170
[11875.148261] Code: 00 01 00 00 ba 01 00 00 00 48 8d 7f 04 f0 0f b1 53 04 85 c0 75 45 b8 00 01 00 00 f0 0f c1 03 0f b6 c0 3d ff 00 00 00 75 0e f3 90 <8b> 03 0f b6 c0 3d ff 00 00 00 74 f2 c6 43 04 00 5b 5d f3 c3 40
* Re: Kernel lockup, might be helpful log.
2015-12-13 22:55 Kernel lockup, might be helpful log Birdsarenice
@ 2015-12-14 6:51 ` Duncan
2015-12-14 8:35 ` Hugo Mills
2015-12-14 7:36 ` Chris Murphy
2015-12-14 12:06 ` Filipe Manana
2 siblings, 1 reply; 7+ messages in thread
From: Duncan @ 2015-12-14 6:51 UTC (permalink / raw)
To: linux-btrfs
Birdsarenice posted on Sun, 13 Dec 2015 22:55:19 +0000 as excerpted:
> Meanwhile, I did get lucky: At one crash I happened to be logged in and
> was able to hit dmesg seconds before it went completely. So what I have
> here is information that looks like it'll help you track down a
> rarely-encountered and hard-to-reproduce bug which can cause the system
> to lock up completely in event of certain types of hard drive failure.
> It might be nothing, but perhaps someone will find it of use - because
> it'd be a tricky one to both reproduce and get a good error report if it
> did occur.
>
> I see an 'invalid opcode' error in here, that's pretty unusual
Disclaimer: I'm a list regular and (small-scale) sysadmin, not a dev,
and most certainly not a btrfs dev. Take what I say with that in mind,
tho I've been active on-list for over a year and thus now have a
reasonable level of practical sysadmin configuration and crisis recovery
level btrfs experience.
You could well be quite correct with the unusual crash log and its value,
I'll leave that up to the devs to decide, but that "invalid opcode: 0000"
bit is in fact not at all unusual on btrfs. Tho I can say it fooled me
originally as well, because it certainly /looks/ both suspicious and in
general unusual.
Based on how a dev explained it to me, I believe btrfs actually
deliberately uses opcode 0000 to trigger a semi-controlled crash in
instances where code that "should never happen" actually gets executed
for some reason, leaving the kernel in an unknown state and thus not
trustworthy enough to reliably write to storage devices and do a
controlled shutdown. That's of course why the tracebacks are there, to
help the devs figure out where it was and what triggered it, but the 0000
opcode itself is actually quite frequently found in these tracebacks,
because it's the method chosen to deliberately trigger them.
I'd guess the same technique is actually used in various other (non-
btrfs) kernel code as well, but in fully stable code it actually is very
rarely seen, precisely because it /does/ mean the kernel reached code
that it is never expected to reach, meaning something specific went wrong
to get to that point, and in fully stable code, it's rare that any code
paths actually leading to that sort of execution point remain, as they've
all been found over the years.
But of course btrfs, while no longer experimental, remains "still
stabilizing and maturing, not yet fully stable or mature", so there's
still code paths left that do still occasionally reach these intended to
be unreachable code points, and when that happens, triggering a crash and
hopefully getting a traceback that helps the devs figure out which code
path has the bug and why, is a good thing to do, and this is apparently
the way it's done.
(BTW, compliments on the nick and email address. =:^)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
* Re: Kernel lockup, might be helpful log.
2015-12-13 22:55 Kernel lockup, might be helpful log Birdsarenice
2015-12-14 6:51 ` Duncan
@ 2015-12-14 7:36 ` Chris Murphy
2015-12-14 8:28 ` Birdsarenice
2015-12-14 12:06 ` Filipe Manana
2 siblings, 1 reply; 7+ messages in thread
From: Chris Murphy @ 2015-12-14 7:36 UTC (permalink / raw)
To: Birdsarenice, Btrfs BTRFS
I can't help with the call traces. But several (not all) of the hard
resetting link messages are hallmark cases where the SCSI command
timer default of 30 seconds looks like it's being hit while the drive
itself is hung up doing a sector read recovery (multiple attempts).
It's worth seeing if 'smartctl -l scterc <dev>' will report back that
SCT ERC is supported and that it's just disabled, meaning you can change
this to something sane with 'smartctl -l scterc,70,70 <dev>' which will
make the drive time out before the linux kernel command timer. That'll
let Btrfs do the right thing, rather than constantly getting poked in
both eyes by link resets.
Chris Murphy
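To make the timing argument above concrete, here is a minimal Python sketch, not anything from the thread itself. It assumes the usual conventions: the two scterc values are given in tenths of a second (so 70 means 7.0 seconds), a value of 0 means error recovery control is disabled, and the kernel's per-device command timer is exposed in seconds in /sys/block/<dev>/device/timeout with a default of 30. The function names are made up for illustration.

```python
from pathlib import Path


def drive_gives_up_first(erc_deciseconds: int, kernel_timeout_s: int = 30) -> bool:
    """Return True if the drive's ERC limit expires before the kernel command timer.

    erc_deciseconds is the value passed to 'smartctl -l scterc,N,N' (units of 100 ms).
    """
    if erc_deciseconds <= 0:
        # 0 means ERC is disabled: the drive may keep retrying a bad sector
        # for longer than the kernel is willing to wait.
        return False
    return erc_deciseconds / 10 < kernel_timeout_s


def read_kernel_timeout(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Read the kernel command timer (seconds) for a block device such as 'sda'."""
    return int(Path(sysfs_root, dev, "device", "timeout").read_text().strip())


if __name__ == "__main__":
    # 70 deciseconds = 7 s, well inside the default 30 s command timer.
    print(drive_gives_up_first(70))
```

With ERC at 70 the drive reports a read error after about 7 seconds, so the filesystem sees an ordinary I/O error instead of a link reset. If a drive does not support SCT ERC at all, the other commonly suggested option is to raise the kernel timer instead, by writing a larger number of seconds to the same sysfs file.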
* Re: Kernel lockup, might be helpful log.
2015-12-14 7:36 ` Chris Murphy
@ 2015-12-14 8:28 ` Birdsarenice
0 siblings, 0 replies; 7+ messages in thread
From: Birdsarenice @ 2015-12-14 8:28 UTC (permalink / raw)
To: Chris Murphy, Btrfs BTRFS
I've no need for a fix. I know exactly what the underlying cause is:
Those Seagate 8TB Archive drives and their known compatibility issues
with some kernel versions. I just shared the log because it's a
situation that btrfs handles very, very poorly, and the error handling
could be improved. If a drive is unresponsive, btrfs really should be
able to just cease using it and treat it as failed, or even unmount the
entire filesystem - either would be preferable to what actually happens
(at least for me), a system hang that leaves nothing functional whatsoever.
I've 'solved' it by removing all drives of that model. It's been running
without issue since I did that.
On 14/12/15 07:36, Chris Murphy wrote:
> I can't help with the call traces. But several (not all) of the hard
> resetting link messages are hallmark cases where the SCSI command
> timer default of 30 seconds looks like it's being hit while the drive
> itself is hung up doing a sector read recovery (multiple attempts).
> It's worth seeing if 'smartctl -l scterc <dev>' will report back that
> SCT ERC is supported and that it's just disabled, meaning you can change
> this to something sane with 'smartctl -l scterc,70,70 <dev>' which will
> make the drive time out before the linux kernel command timer. That'll
> let Btrfs do the right thing, rather than constantly getting poked in
> both eyes by link resets.
>
>
> Chris Murphy
>
* Re: Kernel lockup, might be helpful log.
2015-12-14 6:51 ` Duncan
@ 2015-12-14 8:35 ` Hugo Mills
2015-12-14 12:38 ` Duncan
0 siblings, 1 reply; 7+ messages in thread
From: Hugo Mills @ 2015-12-14 8:35 UTC (permalink / raw)
To: Duncan; +Cc: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 2300 bytes --]
On Mon, Dec 14, 2015 at 06:51:41AM +0000, Duncan wrote:
> Birdsarenice posted on Sun, 13 Dec 2015 22:55:19 +0000 as excerpted:
>
> > Meanwhile, I did get lucky: At one crash I happened to be logged in and
> > was able to hit dmesg seconds before it went completely. So what I have
> > here is information that looks like it'll help you track down a
> > rarely-encountered and hard-to-reproduce bug which can cause the system
> > to lock up completely in event of certain types of hard drive failure.
> > It might be nothing, but perhaps someone will find it of use - because
> > it'd be a tricky one to both reproduce and get a good error report if it
> > did occur.
> >
> > I see an 'invalid opcode' error in here, that's pretty unusual
>
> Disclaimer: I'm a list regular and (small-scale) sysadmin, not a dev,
> and most certainly not a btrfs dev. Take what I saw with that in mind,
> tho I've been active on-list for over a year and thus now have a
> reasonable level of practical sysadmin configuration and crisis recovery
> level btrfs experience.
>
> You could well be quite correct with the unusual crash log and its value,
> I'll leave that up to the devs to decide, but that "invalid opcode: 0000"
> bit is in fact not at all unusual on btrfs. Tho I can say it fooled me
> originally as well, because it certainly /looks/ both suspicious and in
> general unusual.
>
> Based on how a dev explained it to me, I believe btrfs actually
> deliberately uses opcode 0000 to trigger a semi-controlled crash in
> instances where code that "should never happen" actually gets executed
> for some reason, leaving the kernel is an unknown and thus not
> trustworthy enough to reliably write to storage devices and do a
> controlled shutdown. That's of course why the tracebacks are there, to
> help the devs figure out where it was and what triggered it, but the 0000
> opcode itself is actually quite frequently found in these tracebacks,
> because it's the method chosen to deliberately trigger them.
It's not just btrfs. Invalid opcode is the way that the kernel's
BUG and BUG_ON macros are implemented.
Hugo.
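As background to the point above: on x86, BUG() compiles down to an instruction that is guaranteed to be undefined, so executing it raises an invalid-opcode trap. The kernel's trap handler then prints a "kernel BUG at <file>:<line>!" line, followed by "invalid opcode: 0000", where the 0000 is the trap's error code rather than an opcode value. The following is a small Python sketch for pulling those file and line pairs out of a saved dmesg. The function name and the sample text are made up for illustration and are not taken from the attached log.

```python
import re

# Matches the line the kernel prints when a BUG()/BUG_ON() fires.
BUG_LINE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")


def find_bug_sites(log_text: str) -> list[tuple[str, int]]:
    """Return (source file, line number) for every BUG report in a dmesg dump."""
    return [(m["file"], int(m["line"])) for m in BUG_LINE.finditer(log_text)]


if __name__ == "__main__":
    # Hypothetical sample lines, NOT from the log in this thread.
    sample = (
        "[  100.000000] kernel BUG at fs/example/file.c:123!\n"
        "[  100.000001] invalid opcode: 0000 [#1] SMP\n"
    )
    print(find_bug_sites(sample))
```

The file and line pair is usually the most useful part of such a report, since it points straight at the BUG_ON condition that failed.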
--
Hugo Mills | Great oxymorons of the world, no. 10:
hugo@... carfax.org.uk | Business Ethics
http://carfax.org.uk/ |
PGP: E2AB1DE4 |
* Re: Kernel lockup, might be helpful log.
2015-12-13 22:55 Kernel lockup, might be helpful log Birdsarenice
2015-12-14 6:51 ` Duncan
2015-12-14 7:36 ` Chris Murphy
@ 2015-12-14 12:06 ` Filipe Manana
2 siblings, 0 replies; 7+ messages in thread
From: Filipe Manana @ 2015-12-14 12:06 UTC (permalink / raw)
To: Birdsarenice; +Cc: linux-btrfs@vger.kernel.org
On Sun, Dec 13, 2015 at 10:55 PM, Birdsarenice <Quail@birds-are-nice.me> wrote:
> I've finally finished deleting all those nasty unreliable Seagate drives
> from my array. During the process I crashed my server - over, and over, and
> over. Completely gone - screen blank, controls unresponsive, no network
> activity (no, I don't have root on btrfs - data only). Most annoying, but I
> think btrfs survived it all somehow - it's scrubbing now.
>
> Meanwhile, I did get lucky: At one crash I happened to be logged in and was
> able to hit dmesg seconds before it went completely. So what I have here is
> information that looks like it'll help you track down a rarely-encountered
> and hard-to-reproduce bug which can cause the system to lock up completely
> in event of certain types of hard drive failure. It might be nothing, but
> perhaps someone will find it of use - because it'd be a tricky one to both
> reproduce and get a good error report if it did occur.
>
> I see an 'invalid opcode' error in here, that's pretty unusual - and again
> it even gives a file name and line number to look at. The root cause of all
> my issues is the NCQ issue with Seagate 8TB archive drives, which is Someone
> Else's Problem - but I think some good can come of this, as these exotic
> forms of corruption and weird drive semi-failures have revealed ways in
> which btrfs's error handling could be made more graceful.
>
> Meanwhile I remain impressed that btrfs appears to have kept all my data
> intact even though all these issues.
Regarding the trace you got, from a BUG_ON, it's due to a regression
present in 4.2 and 4.3 kernels that got fixed in 4.4-rc. The fixes are
scheduled for the next stable releases of 4.2.x and 4.3.x. A ton of
people have hit this (one example report
http://www.spinics.net/lists/linux-btrfs/msg49766.html).
--
Filipe David Manana,
"Reasonable men adapt themselves to the world.
Unreasonable men adapt the world to themselves.
That's why all progress depends on unreasonable men."
* Re: Kernel lockup, might be helpful log.
2015-12-14 8:35 ` Hugo Mills
@ 2015-12-14 12:38 ` Duncan
0 siblings, 0 replies; 7+ messages in thread
From: Duncan @ 2015-12-14 12:38 UTC (permalink / raw)
To: linux-btrfs
Hugo Mills posted on Mon, 14 Dec 2015 08:35:24 +0000 as excerpted:
> It's not just btrfs. Invalid opcode is the way that the kernel's BUG and
> BUG_ON macro is implemented.
Thanks. I indicated that I suspected broader kernel use further down the
reply, but it's very nice to have confirmation, both of invalid opcode
use elsewhere, and of it being the kernel's general implementation for
BUG and BUG_ON.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman