From mboxrd@z Thu Jan 1 00:00:00 1970
From: Daniel Walker
Subject: Re: Hard CPU Lockup when accessing MD RAID5
Date: Wed, 20 Apr 2016 06:52:35 +0000
Message-ID: <57172735.9030202@ftwinc.net>
References: <570D6E79.1010201@ftwinc.net> <20160413170008.GA6186@kernel.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20160413170008.GA6186@kernel.org>
Sender: linux-raid-owner@vger.kernel.org
To: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

Hi,

I upgraded the kernel to the latest stable release (4.5.1) with debugging enabled, without any luck. This is the dmesg output:

[262448.558983] INFO: task php:13376 blocked for more than 120 seconds.
[262448.559057] Tainted: G W 4.5.1 #1
[262448.559092] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[262448.559246] php D ffff88001c297a18 0 13376 12277 0x00000000
[262448.559519] ffff88001c297a18 ffff881ff248c100 ffff880013e9b400 ffff881fea472000
[262448.559603] ffff88001c297ae8 ffff88001c298000 ffff881c5cac1b30 ffff880013e9b400
[262448.560046] 0000000000020001 0000000545ea7820 ffff88001c297a30 ffffffff814d5690
[262448.560485] Call Trace:
[262448.560541] [] schedule+0x30/0x80
[262448.560761] [] schedule_timeout+0x21e/0x2a0
[262448.560828] [] ? xfs_bmap_search_extents+0x7d/0x100
[262448.561000] [] ? down_trylock+0x29/0x40
[262448.561135] [] __down+0x5f/0xa0
[262448.561268] [] ? _xfs_buf_find+0x156/0x350
[262448.561347] [] down+0x3c/0x50
[262448.561390] [] xfs_buf_lock+0x37/0xf0
[262448.561435] [] _xfs_buf_find+0x156/0x350
[262448.561557] [] xfs_buf_get_map+0x25/0x280
[262448.561603] [] ? kmem_zone_alloc+0x7b/0x120
[262448.561666] [] xfs_buf_read_map+0x28/0x180
[262448.561768] [] xfs_trans_read_buf_map+0xeb/0x300
[262448.561809] [] xfs_imap_to_bp+0x5a/0xc0
[262448.561881] [] xfs_iunlink_remove+0x275/0x3a0
[262448.561943] [] ?
kmem_zone_alloc+0x7b/0x120
[262448.561988] [] xfs_ifree+0x33/0xd0
[262448.562033] [] xfs_inactive_ifree+0xb5/0x200
[262448.562109] [] xfs_inactive+0x88/0x110
[262448.562296] [] xfs_fs_evict_inode+0xc1/0x110
[262448.562344] [] evict+0xbb/0x180
[262448.562405] [] iput+0x193/0x200
[262448.562483] [] d_delete+0x122/0x160
[262448.562520] [] vfs_rmdir+0xf9/0x120
[262448.562559] [] do_rmdir+0x1b7/0x1d0
[262448.562607] [] ? exit_to_usermode_loop+0x90/0xb0
[262448.562665] [] SyS_rmdir+0x11/0x20
[262448.562891] [] entry_SYSCALL_64_fastpath+0x16/0x6e
[262489.707201] NMI watchdog: Watchdog detected hard LOCKUP on cpu 15
[262489.707227] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ipt_REJECT nf_reject_ipv4 iptable_mangle netconsole configfs tun xt_multiport ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding ext4 crc16 mbcache jbd2 raid1 raid0 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq md_mod sg sd_mod hid_generic usbhid hid x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_ssse3 iTCO_wdt sha256_generic iTCO_vendor_support hmac drbg xhci_pci ahci sb_edac ehci_pci ansi_cprng xhci_hcd ehci_hcd libahci i2c_i801 edac_core lpc_ich mei_me mfd_core libata usbcore igb mei megaraid_sas i2c_algo_bit usb_common ptp aesni_intel pps_core aes_x86_64 ioatdma lrw gf128mul glue_helper ablk_helper i2c_core scsi_mod dca cryptd ipmi_si ipmi_msghandler acpi_power_meter tpm_tis tpm processor button
[262489.708066] CPU: 15 PID: 17535 Comm: kworker/u32:6 Tainted: G W 4.5.1 #1
[262489.708124] Hardware name: Supermicro Super Server/X10DRi-LN4+, BIOS 2.0 12/17/2015
[262489.708187] Workqueue: writeback wb_workfn (flush-9:7)
[262489.708228] 0000000000000000 ffff88207fde5bd0 ffffffff812e00b8 0000000000000000
[262489.708298] 0000000000000000 ffff88207fde5be8 ffffffff810dff1d
ffff881ff2270000
[262489.708368] ffff88207fde5c20 ffffffff8110f8f8 0000000000000001 ffff88207fdeaf00
[262489.708438] Call Trace:
[262489.708467] [] dump_stack+0x4d/0x65
[262489.708512] [] watchdog_overflow_callback+0xdd/0xf0
[262489.708552] [] __perf_event_overflow+0x88/0x1d0
[262489.708589] [] perf_event_overflow+0x14/0x20
[262489.708627] [] intel_pmu_handle_irq+0x1d0/0x4a0
[262489.708666] [] ? vunmap_page_range+0x1a1/0x310
[262489.708703] [] ? unmap_kernel_range_noflush+0xc/0x10
[262489.708748] [] ? ghes_copy_tofrom_phys+0x113/0x1e0
[262489.708788] [] ? native_apic_wait_icr_idle+0x1a/0x30
[262489.708827] [] ? arch_irq_work_raise+0x30/0x40
[262489.708865] [] perf_event_nmi_handler+0x28/0x50
[262489.708902] [] nmi_handle+0x61/0x110
[262489.708939] [] do_nmi+0x117/0x3e0
[262489.708975] [] end_repeat_nmi+0x1a/0x1e
[262489.709013] [] ? raid5_unplug+0x70/0x130 [raid456]
[262489.709051] [] ? raid5_unplug+0x70/0x130 [raid456]
[262489.709089] [] ? raid5_unplug+0x70/0x130 [raid456]
[262489.709125] <> [] blk_flush_plug_list+0xa8/0x210
[262489.709169] [] ? bit_wait_timeout+0x70/0x70
[262489.709206] [] io_schedule_timeout+0x54/0x130
[262489.709242] [] bit_wait_io+0x16/0x60
[262489.709277] [] __wait_on_bit_lock+0x49/0xa0
[262489.709314] [] __lock_page+0xb0/0xc0
[262489.709352] [] ? autoremove_wake_function+0x30/0x30
[262489.709391] [] write_cache_pages+0x2f0/0x4d0
[262489.709427] [] ? wb_position_ratio+0x1f0/0x1f0
[262489.709465] [] generic_writepages+0x3e/0x60
[262489.709502] [] xfs_vm_writepages+0x38/0x40
[262489.709539] [] do_writepages+0x19/0x30
[262489.709574] [] __writeback_single_inode+0x40/0x310
[262489.709612] [] writeback_sb_inodes+0x242/0x520
[262489.709649] [] __writeback_inodes_wb+0x8a/0xc0
[262489.709686] [] wb_writeback+0x247/0x2d0
[262489.709721] [] wb_workfn+0x20f/0x3c0
[262489.709758] [] process_one_work+0x143/0x400
[262489.709795] [] worker_thread+0x61/0x490
[262489.709831] [] ?
max_active_store+0x60/0x60
[262489.709867] [] kthread+0xd6/0xf0
[262489.709901] [] ? kthread_park+0x50/0x50
[262489.709937] [] ret_from_fork+0x3f/0x70
[262489.709972] [] ? kthread_park+0x50/0x50
[262491.022971] NMI watchdog: Watchdog detected hard LOCKUP on cpu 0
[262491.023470] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ipt_REJECT nf_reject_ipv4 iptable_mangle netconsole configfs tun xt_multiport ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding ext4 crc16 mbcache jbd2 raid1 raid0 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq md_mod sg sd_mod hid_generic usbhid hid x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_ssse3 iTCO_wdt sha256_generic iTCO_vendor_support hmac drbg xhci_pci ahci sb_edac ehci_pci ansi_cprng xhci_hcd ehci_hcd libahci i2c_i801 edac_core lpc_ich mei_me mfd_core libata usbcore igb mei megaraid_sas i2c_algo_bit usb_common ptp aesni_intel pps_core aes_x86_64 ioatdma lrw gf128mul glue_helper ablk_helper i2c_core scsi_mod dca cryptd ipmi_si ipmi_msghandler acpi_power_meter tpm_tis tpm processor button
[262491.029705] CPU: 0 PID: 1178 Comm: md7_raid5 Tainted: G W 4.5.1 #1
[262491.029776] Hardware name: Supermicro Super Server/X10DRi-LN4+, BIOS 2.0 12/17/2015
[262491.029849] 0000000000000000 ffff88207fc05bd0 ffffffff812e00b8 0000000000000000
[262491.029988] 0000000000000000 ffff88207fc05be8 ffffffff810dff1d ffff881fff032000
[262491.030124] ffff88207fc05c20 ffffffff8110f8f8 0000000000000001 ffff88207fc0af00
[262491.030260] Call Trace:
[262491.030302] [] dump_stack+0x4d/0x65
[262491.030377] [] watchdog_overflow_callback+0xdd/0xf0
[262491.030432] [] __perf_event_overflow+0x88/0x1d0
[262491.030484] [] perf_event_overflow+0x14/0x20
[262491.030536] [] intel_pmu_handle_irq+0x1d0/0x4a0
[262491.030589] [] ?
vunmap_page_range+0x1a1/0x310
[262491.030640] [] ? unmap_kernel_range_noflush+0xc/0x10
[262491.030693] [] ? ghes_copy_tofrom_phys+0x113/0x1e0
[262491.030745] [] ? ghes_read_estatus+0x71/0x140
[262491.030797] [] perf_event_nmi_handler+0x28/0x50
[262491.030849] [] nmi_handle+0x61/0x110
[262491.030898] [] do_nmi+0x201/0x3e0
[262491.030949] [] end_repeat_nmi+0x1a/0x1e
[262491.030998] [] ? queued_spin_lock_slowpath+0x153/0x170
[262491.031050] [] ? queued_spin_lock_slowpath+0x153/0x170
[262491.031102] [] ? queued_spin_lock_slowpath+0x153/0x170
[262491.031153] <> [] _raw_spin_lock_irq+0x1c/0x20
[262491.031225] [] raid5d+0x91/0x720 [raid456]
[262491.031276] [] ? try_to_del_timer_sync+0x4a/0x60
[262491.031328] [] ? del_timer_sync+0x43/0x50
[262491.031377] [] ? schedule_timeout+0x14e/0x2a0
[262491.031428] [] ? trace_event_raw_event_tick_stop+0x100/0x100
[262491.031502] [] md_thread+0x12b/0x130 [md_mod]
[262491.031555] [] ? wait_woken+0x80/0x80
[262491.031605] [] ? find_pers+0x70/0x70 [md_mod]
[262491.031656] [] kthread+0xd6/0xf0
[262491.031704] [] ? kthread_park+0x50/0x50
[262491.031753] [] ret_from_fork+0x3f/0x70
[262491.031802] [] ? kthread_park+0x50/0x50

The server hosts plain VPSes. A few of them run rtorrent, which is quite disk-intensive, but from what I can see iowait is quite low.

Nothing at all is logged before the lockups: everything is running fine, and then it suddenly crashes. I'm beginning to think we might have a hardware problem, but I'm having a hard time finding the actual issue.

Any ideas?

Best regards

On 13-04-2016 at 19:00, Shaohua Li wrote:
> Looks like there is a deadlock trying to hold the device_lock or hash_lock. Anything
> abnormal printed out before the NMI watchdog? What is running on the machine?
> Looks like this is an old kernel; is it possible for you to try the latest kernel and report
> back?
>
> Thanks,
> Shaohua
>
> On Tue, Apr 12, 2016 at 09:54:08PM +0000, Daniel Walker wrote:
>> I'm having some issues on a brand new Supermicro server that we have running
>> in production alongside a few other machines which are identical to this
>> server.
>>
>> The output from the netconsole attached to the server is here:
>>
>> Apr 12 21:34:45 [75704.964946] NMI watchdog: Watchdog detected hard LOCKUP
>> on cpu 6
>> Apr 12 21:34:45
>> Apr 12 21:34:45 [75704.964973] Modules linked in:
>> Apr 12 21:34:45 ipt_REJECT
>> Apr 12 21:34:45 nf_reject_ipv4
>> Apr 12 21:34:45 iptable_mangle
>> Apr 12 21:34:45 tun
>> Apr 12 21:34:45 netconsole
>> Apr 12 21:34:45 configfs
>> Apr 12 21:34:45 xt_multiport
>> Apr 12 21:34:45 ip6table_filter
>> Apr 12 21:34:45 ip6_tables
>> Apr 12 21:34:45 iptable_filter
>> Apr 12 21:34:45 ip_tables
>> Apr 12 21:34:45 x_tables
>> Apr 12 21:34:45 bridge
>> Apr 12 21:34:45 stp
>> Apr 12 21:34:45 llc
>> Apr 12 21:34:45 bonding
>> Apr 12 21:34:45 ext4
>> Apr 12 21:34:45 crc16
>> Apr 12 21:34:45 mbcache
>> Apr 12 21:34:45 jbd2
>> Apr 12 21:34:45 raid1
>> Apr 12 21:34:45 raid0
>> Apr 12 21:34:45 raid456
>> Apr 12 21:34:45 async_raid6_recov
>> Apr 12 21:34:45 async_memcpy
>> Apr 12 21:34:45 async_pq
>> Apr 12 21:34:45 async_xor
>> Apr 12 21:34:45 xor
>> Apr 12 21:34:45 async_tx
>> Apr 12 21:34:45 raid6_pq
>> Apr 12 21:34:45 md_mod
>> Apr 12 21:34:45 sr_mod
>> Apr 12 21:34:45 cdrom
>> Apr 12 21:34:45 usb_storage
>> Apr 12 21:34:45 hid_generic
>> Apr 12 21:34:45 usbhid
>> Apr 12 21:34:45 hid
>> Apr 12 21:34:45 sg
>> Apr 12 21:34:45 sd_mod
>> Apr 12 21:34:45 x86_pkg_temp_thermal
>> Apr 12 21:34:45 coretemp
>> Apr 12 21:34:45 crct10dif_pclmul
>> Apr 12 21:34:45 crc32_pclmul
>> Apr 12 21:34:45 crc32c_intel
>> Apr 12 21:34:45 jitterentropy_rng
>> Apr 12 21:34:45 sha256_ssse3
>> Apr 12 21:34:45 sha256_generic
>> Apr 12 21:34:45 hmac
>> Apr 12 21:34:45 iTCO_wdt
>> Apr 12 21:34:45 iTCO_vendor_support
>> Apr 12 21:34:45 drbg
>> Apr 12 21:34:45 ansi_cprng
>> Apr
12 21:34:45 aesni_intel >> Apr 12 21:34:45 aes_x86_64 >> Apr 12 21:34:45 lrw >> Apr 12 21:34:45 gf128mul >> Apr 12 21:34:45 glue_helper >> Apr 12 21:34:45 ablk_helper >> Apr 12 21:34:45 cryptd >> Apr 12 21:34:45 ahci >> Apr 12 21:34:45 libahci >> Apr 12 21:34:45 sb_edac >> Apr 12 21:34:45 libata >> Apr 12 21:34:45 igb >> Apr 12 21:34:45 megaraid_sas >> Apr 12 21:34:45 xhci_pci >> Apr 12 21:34:45 ehci_pci >> Apr 12 21:34:45 i2c_algo_bit >> Apr 12 21:34:45 xhci_hcd >> Apr 12 21:34:45 ehci_hcd >> Apr 12 21:34:45 edac_core >> Apr 12 21:34:45 ptp >> Apr 12 21:34:45 mei_me >> Apr 12 21:34:45 lpc_ich >> Apr 12 21:34:45 i2c_i801 >> Apr 12 21:34:45 usbcore >> Apr 12 21:34:45 pps_core >> Apr 12 21:34:45 mfd_core >> Apr 12 21:34:45 mei >> Apr 12 21:34:45 usb_common >> Apr 12 21:34:45 i2c_core >> Apr 12 21:34:45 ioatdma >> Apr 12 21:34:45 scsi_mod >> Apr 12 21:34:45 dca >> Apr 12 21:34:45 ipmi_si >> Apr 12 21:34:45 ipmi_msghandler >> Apr 12 21:34:45 acpi_power_meter >> Apr 12 21:34:45 tpm_tis >> Apr 12 21:34:45 tpm >> Apr 12 21:34:45 processor >> Apr 12 21:34:45 button >> Apr 12 21:34:45 >> Apr 12 21:34:45 [75704.965874] CPU: 6 PID: 25339 Comm: main Not tainted >> 4.4.1 #2 >> Apr 12 21:34:45 [75704.965916] Hardware name: Supermicro Super >> Server/X10DRi-LN4+, BIOS 2.0 12/17/2015 >> Apr 12 21:34:45 [75704.965979] 0000000000000000 >> Apr 12 21:34:45 ffffffff812abdf3 >> Apr 12 21:34:45 0000000000000000 >> Apr 12 21:34:45 ffffffff810cf5f5 >> Apr 12 21:34:45 >> Apr 12 21:34:45 [75704.966054] ffff881ff2870000 >> Apr 12 21:34:45 ffffffff810fcea2 >> Apr 12 21:34:45 0000000000000001 >> Apr 12 21:34:45 ffff881fffcc5e58 >> Apr 12 21:34:45 >> Apr 12 21:34:45 [75704.966134] ffff881fffccaf00 >> Apr 12 21:34:45 ffff881fffccb100 >> Apr 12 21:34:45 ffff881ff2870000 >> Apr 12 21:34:45 ffffffff8101bc63 >> Apr 12 21:34:45 >> Apr 12 21:34:45 [75704.966211] Call Trace: >> Apr 12 21:34:45 [75704.966246] >> Apr 12 21:34:45 [] ? dump_stack+0x40/0x5d >> Apr 12 21:34:45 [75704.966297] [] ? 
>> watchdog_overflow_callback+0xb5/0xd0 >> Apr 12 21:34:45 [75704.966339] [] ? >> __perf_event_overflow+0x82/0x1c0 >> Apr 12 21:34:45 [75704.966384] [] ? >> intel_pmu_handle_irq+0x1c3/0x3e0 >> Apr 12 21:34:45 [75704.966431] [] ? >> vunmap_page_range+0x1bb/0x320 >> Apr 12 21:34:45 [75704.966474] [] ? >> ghes_copy_tofrom_phys+0x110/0x1d0 >> Apr 12 21:34:45 [75704.966519] [] ? >> perf_event_nmi_handler+0x23/0x40 >> Apr 12 21:34:45 [75704.966560] [] ? >> nmi_handle+0x65/0x100 >> Apr 12 21:34:45 [75704.966597] [] ? do_nmi+0x1de/0x360 >> Apr 12 21:34:45 [75704.970603] [] ? >> end_repeat_nmi+0x1a/0x1e >> Apr 12 21:34:45 [75704.970644] [] ? >> queued_spin_lock_slowpath+0xea/0x150 >> Apr 12 21:34:45 [75704.970685] [] ? >> queued_spin_lock_slowpath+0xea/0x150 >> Apr 12 21:34:45 [75704.970728] [] ? >> queued_spin_lock_slowpath+0xea/0x150 >> Apr 12 21:34:45 [75704.970768] <> >> Apr 12 21:34:45 [] ? make_request+0x60b/0xbd0 [raid456] >> Apr 12 21:34:45 [75704.970838] [] ? wait_woken+0x80/0x80 >> Apr 12 21:34:45 [75704.970878] [] ? >> kmem_cache_alloc+0xf4/0x120 >> Apr 12 21:34:45 [75704.970922] [] ? >> md_make_request+0xdd/0x220 [md_mod] >> Apr 12 21:34:45 [75704.970969] [] ? >> xfs_map_buffer.isra.12+0x2e/0x60 >> Apr 12 21:34:45 [75704.971012] [] ? >> generic_make_request+0xed/0x1d0 >> Apr 12 21:34:45 [75704.971052] [] ? >> submit_bio+0x5a/0x140 >> Apr 12 21:34:45 [75704.971098] [] ? >> release_pages+0xc9/0x270 >> Apr 12 21:34:45 [75704.971145] [] ? >> do_mpage_readpage+0x2d1/0x640 >> Apr 12 21:34:45 [75704.971187] [] ? >> mpage_readpages+0xdd/0x130 >> Apr 12 21:34:45 [75704.971226] [] ? >> __xfs_get_blocks+0x750/0x750 >> Apr 12 21:34:45 [75704.971267] [] ? >> __xfs_get_blocks+0x750/0x750 >> Apr 12 21:34:45 [75704.971313] [] ? >> alloc_pages_current+0x85/0x110 >> Apr 12 21:34:45 [75704.971354] [] ? >> __do_page_cache_readahead+0x165/0x1f0 >> Apr 12 21:34:45 [75704.971399] [] ? >> pagecache_get_page+0x22/0x1a0 >> Apr 12 21:34:45 [75704.971441] [] ? 
>> filemap_fault+0x37c/0x400 >> Apr 12 21:34:45 [75704.971481] [] ? >> xfs_filemap_fault+0x3b/0x80 >> Apr 12 21:34:45 [75704.971526] [] ? __do_fault+0x3a/0xc0 >> Apr 12 21:34:45 [75704.971564] [] ? >> handle_mm_fault+0x1063/0x1650 >> Apr 12 21:34:45 [75704.971614] [] ? >> __do_page_fault+0x11e/0x370 >> Apr 12 21:34:45 [75704.971653] [] ? >> SyS_epoll_wait+0x8f/0xd0 >> Apr 12 21:34:45 [75704.971694] [] ? page_fault+0x1f/0x30 >> Apr 12 21:34:45 [75705.493640] NMI watchdog: Watchdog detected hard LOCKUP >> on cpu 12 >> Apr 12 21:34:45 >> Apr 12 21:34:45 [75705.493668] Modules linked in: >> Apr 12 21:34:45 ipt_REJECT >> Apr 12 21:34:45 nf_reject_ipv4 >> Apr 12 21:34:45 iptable_mangle >> Apr 12 21:34:45 tun >> Apr 12 21:34:45 netconsole >> Apr 12 21:34:45 configfs >> Apr 12 21:34:45 xt_multiport >> Apr 12 21:34:45 ip6table_filter >> Apr 12 21:34:45 ip6_tables >> Apr 12 21:34:45 iptable_filter >> Apr 12 21:34:45 ip_tables >> Apr 12 21:34:45 x_tables >> Apr 12 21:34:45 bridge >> Apr 12 21:34:45 stp >> Apr 12 21:34:45 llc >> Apr 12 21:34:45 bonding >> Apr 12 21:34:45 ext4 >> Apr 12 21:34:45 crc16 >> Apr 12 21:34:45 mbcache >> Apr 12 21:34:45 jbd2 >> Apr 12 21:34:45 raid1 >> Apr 12 21:34:45 raid0 >> Apr 12 21:34:45 raid456 >> Apr 12 21:34:45 async_raid6_recov >> Apr 12 21:34:45 async_memcpy >> Apr 12 21:34:45 async_pq >> Apr 12 21:34:45 async_xor >> Apr 12 21:34:45 xor >> Apr 12 21:34:45 async_tx >> Apr 12 21:34:45 raid6_pq >> Apr 12 21:34:45 md_mod >> Apr 12 21:34:45 sr_mod >> Apr 12 21:34:45 cdrom >> Apr 12 21:34:45 usb_storage >> Apr 12 21:34:45 hid_generic >> Apr 12 21:34:45 usbhid >> Apr 12 21:34:45 hid >> Apr 12 21:34:45 sg >> Apr 12 21:34:45 sd_mod >> Apr 12 21:34:45 x86_pkg_temp_thermal >> Apr 12 21:34:45 coretemp >> Apr 12 21:34:45 crct10dif_pclmul >> Apr 12 21:34:45 crc32_pclmul >> Apr 12 21:34:45 crc32c_intel >> Apr 12 21:34:45 jitterentropy_rng >> Apr 12 21:34:45 sha256_ssse3 >> Apr 12 21:34:45 sha256_generic >> Apr 12 21:34:45 hmac >> Apr 12 21:34:45 iTCO_wdt 
>> Apr 12 21:34:45 iTCO_vendor_support >> Apr 12 21:34:45 drbg >> Apr 12 21:34:45 ansi_cprng >> Apr 12 21:34:45 aesni_intel >> Apr 12 21:34:45 aes_x86_64 >> Apr 12 21:34:45 lrw >> Apr 12 21:34:45 gf128mul >> Apr 12 21:34:45 glue_helper >> Apr 12 21:34:45 ablk_helper >> Apr 12 21:34:45 cryptd >> Apr 12 21:34:45 ahci >> Apr 12 21:34:45 libahci >> Apr 12 21:34:45 sb_edac >> Apr 12 21:34:45 libata >> Apr 12 21:34:45 igb >> Apr 12 21:34:45 megaraid_sas >> Apr 12 21:34:45 xhci_pci >> Apr 12 21:34:45 ehci_pci >> Apr 12 21:34:45 i2c_algo_bit >> Apr 12 21:34:45 xhci_hcd >> Apr 12 21:34:45 ehci_hcd >> Apr 12 21:34:45 edac_core >> Apr 12 21:34:45 ptp >> Apr 12 21:34:45 mei_me >> Apr 12 21:34:45 lpc_ich >> Apr 12 21:34:45 i2c_i801 >> Apr 12 21:34:45 usbcore >> Apr 12 21:34:45 pps_core >> Apr 12 21:34:45 mfd_core >> Apr 12 21:34:45 mei >> Apr 12 21:34:45 usb_common >> Apr 12 21:34:45 i2c_core >> Apr 12 21:34:45 ioatdma >> Apr 12 21:34:45 scsi_mod >> Apr 12 21:34:45 dca >> Apr 12 21:34:45 ipmi_si >> Apr 12 21:34:45 ipmi_msghandler >> Apr 12 21:34:45 acpi_power_meter >> Apr 12 21:34:45 tpm_tis >> Apr 12 21:34:45 tpm >> Apr 12 21:34:45 processor >> Apr 12 21:34:45 button >> Apr 12 21:34:45 >> Apr 12 21:34:45 [75705.494688] CPU: 12 PID: 32350 Comm: main Not tainted >> 4.4.1 #2 >> Apr 12 21:34:45 [75705.494728] Hardware name: Supermicro Super >> Server/X10DRi-LN4+, BIOS 2.0 12/17/2015 >> Apr 12 21:34:45 [75705.494790] 0000000000000000 >> Apr 12 21:34:45 ffffffff812abdf3 >> Apr 12 21:34:45 0000000000000000 >> Apr 12 21:34:45 ffffffff810cf5f5 >> Apr 12 21:34:45 >> Apr 12 21:34:45 [75705.494886] ffff883ff29a0000 >> Apr 12 21:34:45 ffffffff810fcea2 >> Apr 12 21:34:45 0000000000000001 >> Apr 12 21:34:45 ffff88407fc85e58 >> Apr 12 21:34:45 >> Apr 12 21:34:45 [75705.494976] ffff88407fc8af00 >> Apr 12 21:34:45 ffff88407fc8b100 >> Apr 12 21:34:45 ffff883ff29a0000 >> Apr 12 21:34:45 ffffffff8101bc63 >> Apr 12 21:34:45 >> Apr 12 21:34:45 [75705.495064] Call Trace: >> Apr 12 21:34:45 
[75705.495094] >> Apr 12 21:34:45 [] ? dump_stack+0x40/0x5d >> Apr 12 21:34:45 [75705.495150] [] ? >> watchdog_overflow_callback+0xb5/0xd0 >> Apr 12 21:34:45 [75705.495193] [] ? >> __perf_event_overflow+0x82/0x1c0 >> Apr 12 21:34:45 [75705.495237] [] ? >> intel_pmu_handle_irq+0x1c3/0x3e0 >> Apr 12 21:34:45 [75705.495284] [] ? >> vunmap_page_range+0x1bb/0x320 >> Apr 12 21:34:45 [75705.495330] [] ? >> ghes_copy_tofrom_phys+0x110/0x1d0 >> Apr 12 21:34:45 [75705.495373] [] ? >> perf_event_nmi_handler+0x23/0x40 >> Apr 12 21:34:45 [75705.495418] [] ? >> nmi_handle+0x65/0x100 >> Apr 12 21:34:45 [75705.495458] [] ? do_nmi+0x10e/0x360 >> Apr 12 21:34:45 [75705.495497] [] ? >> end_repeat_nmi+0x1a/0x1e >> Apr 12 21:34:45 [75705.495540] [] ? >> queued_spin_lock_slowpath+0xea/0x150 >> Apr 12 21:34:45 [75705.495581] [] ? >> queued_spin_lock_slowpath+0xea/0x150 >> Apr 12 21:34:45 [75705.495621] [] ? >> queued_spin_lock_slowpath+0xea/0x150 >> Apr 12 21:34:45 [75705.495661] <> >> Apr 12 21:34:45 [] ? make_request+0x60b/0xbd0 [raid456] >> Apr 12 21:34:45 [75705.495733] [] ? >> blk_rq_init+0x87/0xa0 >> Apr 12 21:34:45 [75705.495771] [] ? >> get_request+0x29c/0x6e0 >> Apr 12 21:34:45 [75705.495812] [] ? wait_woken+0x80/0x80 >> Apr 12 21:34:45 [75705.495853] [] ? >> md_make_request+0xdd/0x220 [md_mod] >> Apr 12 21:34:45 [75705.495898] [] ? >> blk_queue_bio+0x15e/0x350 >> Apr 12 21:34:45 [75705.495937] [] ? >> generic_make_request+0xed/0x1d0 >> Apr 12 21:34:45 [75705.495978] [] ? >> submit_bio+0x5a/0x140 >> Apr 12 21:34:45 [75705.496018] [] ? >> mpage_bio_submit+0x1e/0x30 >> Apr 12 21:34:45 [75705.496057] [] ? >> mpage_readpages+0x106/0x130 >> Apr 12 21:34:45 [75705.496102] [] ? >> __xfs_get_blocks+0x750/0x750 >> Apr 12 21:34:45 [75705.496144] [] ? >> __xfs_get_blocks+0x750/0x750 >> Apr 12 21:34:45 [75705.496185] [] ? >> alloc_pages_current+0x85/0x110 >> Apr 12 21:34:45 [75705.496227] [] ? >> __do_page_cache_readahead+0x165/0x1f0 >> Apr 12 21:34:45 [75705.496268] [] ? 
vma_link+0x75/0xb0 >> Apr 12 21:34:45 [75705.496307] [] ? >> force_page_cache_readahead+0x9b/0xe0 >> Apr 12 21:34:45 [75705.496352] [] ? >> madvise_willneed+0x76/0x140 >> Apr 12 21:34:45 [75705.496395] [] ? >> handle_mm_fault+0x9ae/0x1650 >> Apr 12 21:34:45 [75705.496437] [] ? find_vma+0x5b/0x70 >> Apr 12 21:34:45 [75705.496476] [] ? >> SyS_madvise+0x312/0x6f0 >> Apr 12 21:34:45 [75705.496515] [] ? >> entry_SYSCALL_64_fastpath+0x16/0x6e >> Apr 12 21:34:47 [75707.118049] NMI watchdog: Watchdog detected hard LOCKUP >> on cpu 15 >> Apr 12 21:34:47 >> Apr 12 21:34:47 [75707.118078] Modules linked in: >> Apr 12 21:34:47 ipt_REJECT >> Apr 12 21:34:47 nf_reject_ipv4 >> Apr 12 21:34:47 iptable_mangle >> Apr 12 21:34:47 tun >> Apr 12 21:34:47 netconsole >> Apr 12 21:34:47 configfs >> Apr 12 21:34:47 xt_multiport >> Apr 12 21:34:47 ip6table_filter >> Apr 12 21:34:47 ip6_tables >> Apr 12 21:34:47 iptable_filter >> Apr 12 21:34:47 ip_tables >> Apr 12 21:34:47 x_tables >> Apr 12 21:34:47 bridge >> Apr 12 21:34:47 stp >> Apr 12 21:34:47 llc >> Apr 12 21:34:47 bonding >> Apr 12 21:34:47 ext4 >> Apr 12 21:34:47 crc16 >> Apr 12 21:34:47 mbcache >> Apr 12 21:34:47 jbd2 >> Apr 12 21:34:47 raid1 >> Apr 12 21:34:47 raid0 >> Apr 12 21:34:47 raid456 >> Apr 12 21:34:47 async_raid6_recov >> Apr 12 21:34:47 async_memcpy >> Apr 12 21:34:47 async_pq >> Apr 12 21:34:47 async_xor >> Apr 12 21:34:47 xor >> Apr 12 21:34:47 async_tx >> Apr 12 21:34:47 raid6_pq >> Apr 12 21:34:47 md_mod >> Apr 12 21:34:47 sr_mod >> Apr 12 21:34:47 cdrom >> Apr 12 21:34:47 usb_storage >> Apr 12 21:34:47 hid_generic >> Apr 12 21:34:47 usbhid >> Apr 12 21:34:47 hid >> Apr 12 21:34:47 sg >> Apr 12 21:34:47 sd_mod >> Apr 12 21:34:47 x86_pkg_temp_thermal >> Apr 12 21:34:47 coretemp >> Apr 12 21:34:47 crct10dif_pclmul >> Apr 12 21:34:47 crc32_pclmul >> Apr 12 21:34:47 crc32c_intel >> Apr 12 21:34:47 jitterentropy_rng >> Apr 12 21:34:47 sha256_ssse3 >> Apr 12 21:34:47 sha256_generic >> Apr 12 21:34:47 hmac >> Apr 12 
21:34:47 iTCO_wdt >> Apr 12 21:34:47 iTCO_vendor_support >> Apr 12 21:34:47 drbg >> Apr 12 21:34:47 ansi_cprng >> Apr 12 21:34:47 aesni_intel >> Apr 12 21:34:47 aes_x86_64 >> Apr 12 21:34:47 lrw >> Apr 12 21:34:47 gf128mul >> Apr 12 21:34:47 glue_helper >> Apr 12 21:34:47 ablk_helper >> Apr 12 21:34:47 cryptd >> Apr 12 21:34:47 ahci >> Apr 12 21:34:47 libahci >> Apr 12 21:34:47 sb_edac >> Apr 12 21:34:47 libata >> Apr 12 21:34:47 igb >> Apr 12 21:34:47 megaraid_sas >> Apr 12 21:34:47 xhci_pci >> Apr 12 21:34:47 ehci_pci >> Apr 12 21:34:47 i2c_algo_bit >> Apr 12 21:34:47 xhci_hcd >> Apr 12 21:34:47 ehci_hcd >> Apr 12 21:34:47 edac_core >> Apr 12 21:34:47 ptp >> Apr 12 21:34:47 mei_me >> Apr 12 21:34:47 lpc_ich >> Apr 12 21:34:47 i2c_i801 >> Apr 12 21:34:47 usbcore >> Apr 12 21:34:47 pps_core >> Apr 12 21:34:47 mfd_core >> Apr 12 21:34:47 mei >> Apr 12 21:34:47 usb_common >> Apr 12 21:34:47 i2c_core >> Apr 12 21:34:47 ioatdma >> Apr 12 21:34:47 scsi_mod >> Apr 12 21:34:47 dca >> Apr 12 21:34:47 ipmi_si >> Apr 12 21:34:47 ipmi_msghandler >> Apr 12 21:34:47 acpi_power_meter >> Apr 12 21:34:47 tpm_tis >> Apr 12 21:34:47 tpm >> Apr 12 21:34:47 processor >> Apr 12 21:34:47 button >> Apr 12 21:34:47 >> Apr 12 21:34:47 [75707.119088] CPU: 15 PID: 31940 Comm: main Not tainted >> 4.4.1 #2 >> Apr 12 21:34:47 [75707.119134] Hardware name: Supermicro Super >> Server/X10DRi-LN4+, BIOS 2.0 12/17/2015 >> Apr 12 21:34:47 [75707.119196] 0000000000000000 >> Apr 12 21:34:47 ffffffff812abdf3 >> Apr 12 21:34:47 0000000000000000 >> Apr 12 21:34:47 ffffffff810cf5f5 >> Apr 12 21:34:47 >> Apr 12 21:34:47 [75707.119277] ffff883ff2a20000 >> Apr 12 21:34:47 ffffffff810fcea2 >> Apr 12 21:34:47 0000000000000001 >> Apr 12 21:34:47 ffff88407fce5e58 >> Apr 12 21:34:47 >> Apr 12 21:34:47 [75707.119360] ffff88407fceaf00 >> Apr 12 21:34:47 ffff88407fceb100 >> Apr 12 21:34:47 ffff883ff2a20000 >> Apr 12 21:34:47 ffffffff8101bc63 >> Apr 12 21:34:47 >> Apr 12 21:34:47 [75707.119439] Call Trace: >> Apr 12 
21:34:47 [75707.119471] >> Apr 12 21:34:47 [] ? dump_stack+0x40/0x5d >> Apr 12 21:34:47 [75707.119527] [] ? >> watchdog_overflow_callback+0xb5/0xd0 >> Apr 12 21:34:47 [75707.119571] [] ? >> __perf_event_overflow+0x82/0x1c0 >> Apr 12 21:34:47 [75707.119614] [] ? >> intel_pmu_handle_irq+0x1c3/0x3e0 >> Apr 12 21:34:47 [75707.119657] [] ? >> vunmap_page_range+0x1bb/0x320 >> Apr 12 21:34:47 [75707.119703] [] ? >> ghes_copy_tofrom_phys+0x110/0x1d0 >> Apr 12 21:34:47 [75707.119758] [] ? >> perf_event_nmi_handler+0x23/0x40 >> Apr 12 21:34:47 [75707.119800] [] ? >> nmi_handle+0x65/0x100 >> Apr 12 21:34:47 [75707.119838] [] ? do_nmi+0x10e/0x360 >> Apr 12 21:34:47 [75707.119878] [] ? >> end_repeat_nmi+0x1a/0x1e >> Apr 12 21:34:47 [75707.119920] [] ? >> queued_spin_lock_slowpath+0xea/0x150 >> Apr 12 21:34:47 [75707.119962] [] ? >> queued_spin_lock_slowpath+0xea/0x150 >> Apr 12 21:34:47 [75707.120002] [] ? >> queued_spin_lock_slowpath+0xea/0x150 >> Apr 12 21:34:47 [75707.120042] <> >> Apr 12 21:34:47 [] ? make_request+0x60b/0xbd0 [raid456] >> Apr 12 21:34:47 [75707.120113] [] ? wait_woken+0x80/0x80 >> Apr 12 21:34:47 [75707.120152] [] ? >> md_make_request+0xdd/0x220 [md_mod] >> Apr 12 21:34:47 [75707.120195] [] ? >> generic_make_request+0xed/0x1d0 >> Apr 12 21:34:47 [75707.120236] [] ? >> submit_bio+0x5a/0x140 >> Apr 12 21:34:47 [75707.120277] [] ? >> workingset_refault+0x4f/0xa0 >> Apr 12 21:34:47 [75707.120320] [] ? >> mpage_bio_submit+0x1e/0x30 >> Apr 12 21:34:47 [75707.120359] [] ? >> mpage_readpages+0x106/0x130 >> Apr 12 21:34:47 [75707.120401] [] ? >> __xfs_get_blocks+0x750/0x750 >> Apr 12 21:34:47 [75707.120439] [] ? >> __xfs_get_blocks+0x750/0x750 >> Apr 12 21:34:47 [75707.120481] [] ? >> alloc_pages_current+0x85/0x110 >> Apr 12 21:34:47 [75707.120523] [] ? >> __do_page_cache_readahead+0x165/0x1f0 >> Apr 12 21:34:47 [75707.120564] [] ? vma_link+0x75/0xb0 >> Apr 12 21:34:47 [75707.120602] [] ? >> force_page_cache_readahead+0x77/0xe0 >> Apr 12 21:34:47 [75707.120644] [] ? 
>> madvise_willneed+0x76/0x140
>> Apr 12 21:34:47 [75707.120683] [] ?
>> handle_mm_fault+0x9ae/0x1650
>> Apr 12 21:34:47 [75707.120722] [] ? find_vma+0x5b/0x70
>> Apr 12 21:34:47 [75707.120760] [] ?
>> SyS_madvise+0x312/0x6f0
>> Apr 12 21:34:47 [75707.120799] [] ?
>> entry_SYSCALL_64_fastpath+0x16/0x6e
>>
>> Once this starts, a couple of minutes go by and the machine locks up
>> completely.
>>
>> I have been unable to locate the problem here. Can anyone point me in
>> the right direction?
>>
>> Best regards
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html