From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: George Spelvin <linux@horizon.com>
Cc: DL-MPTFusionLinux@lsi.com, kashyap.desai@lsi.com,
linux-scsi@vger.kernel.org, Nagalakshmi.Nandigama@lsi.com
Subject: Re: [mptscsih] Watchdog detected hard LOCKUP on cpu 0
Date: Mon, 25 Nov 2013 16:01:54 +0400
Message-ID: <1385380914.2354.38.camel@dabdike>
In-Reply-To: <20131125074849.21605.qmail@science.horizon.com>
On Mon, 2013-11-25 at 02:48 -0500, George Spelvin wrote:
> I first reported this in mid-October, but I've been AFK for a month
> and haven't done anything about it in that time. Basically, sustained
> linear reads from 6 (7200 RPM 2 TB) disks on a BR10i controller cause
> a hard lockup.
>
> Anyway, I recompiled with CONFIG_LOCKUP_DETECTOR, and it didn't take
> long to capture this (hand-transcribed, but double-checked). I omitted
> most of the timestamps, as they're not very interesting, but I included
> a few at the end that had significant delays between them.
>
> Does anyone have any ideas for where to start debugging this?

The reason for the lack of replies is that no-one has much of an idea.
This really looks like a hardware problem. The qi_submit_sync() in the
trace is suggestive: it's the Intel IOMMU mapping call ... have you tried
reproducing this with the IOMMU disabled?
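
Before re-running the read test, it helps to confirm that the kernel really
was booted with the IOMMU off. A minimal sketch, assuming the usual x86 boot
parameters intel_iommu=off or iommu=off are used for this; the helper name
iommu_disabled is hypothetical and not part of any kernel tooling:

```python
# Hypothetical helper (not from the original thread): report whether a kernel
# command line asks for the IOMMU to be disabled.
# Assumption: the IOMMU is turned off with "intel_iommu=off" or "iommu=off";
# comma-separated option lists such as "iommu=off,..." are not handled here.
def iommu_disabled(cmdline: str) -> bool:
    """Return True if the command line contains intel_iommu=off or iommu=off."""
    params = cmdline.split()  # boot parameters are whitespace-separated
    return "intel_iommu=off" in params or "iommu=off" in params
```

On a running system the string to check is the contents of /proc/cmdline,
for example iommu_disabled(open("/proc/cmdline").read()). If it returns
False, the lockup was not reproduced with the IOMMU out of the picture.
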
James
> Thank you very much!
>
> [ 321.243221] ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 0 at kernel/watchdog.c:245 watchdog_overflow_callback+0x9a/0xc0()
> Watchdog detected hard LOCKUP on cpu 0
> Modules linked in: twofish_generic twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common ecb cmac xcbc fuse
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.1-00045-g27b879d64d #306
> Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./X79-UP4, BIOS F2 07/16/2012
> 0000000000000009 ffff88043fc06c40 ffffffff815d0ee9 ffff88043fc06c88
> ffff88043fc06c78 ffffffff8104fef3 ffff88042d816800 0000000000000000
> ffff88043fc06da0 0000000000000000 ffff88043fc06ef8 ffff88043fc06cd8
> Call Trace:
> <NMI> [<ffffffff815d0ee9>] dump_stack+0x54/0x74
> [<ffffffff8104fef3>] warn_slowpath_common+0x73/0x90
> [<ffffffff8104ff57>] warn_slowpath_fmt+0x47/0x50
> [<ffffffff810bc990>] ? restart_watchdog_hrtimer+0x40/0x40
> [<ffffffff810bca2a>] watchdog_overflow_callback+0x9a/0xc0
> [<ffffffff810c924e>] __perf_event_overflow+0x8e/0x2c0
> [<ffffffff810c9c44>] perf_event_overflow+0x14/0x20
> [<ffffffff8101be36>] intel_pmu_handle_irq+0x1b6/0x390
> [<ffffffff810150cb>] perf_event_nmi_handler+0x2b/0x50
> [<ffffffff81006857>] nmi_handle.isra.3+0x87/0x140
> [<ffffffff810069e0>] do_nmi+0xd0/0x340
> [<ffffffff815d9ab7>] end_repeat_nmi+0x1e/0x2e
> [<ffffffff815d9161>] ? _raw_spin_lock+0x11/0x40
> [<ffffffff815d9161>] ? _raw_spin_lock+0x11/0x40
> [<ffffffff815d9161>] ? _raw_spin_lock+0x11/0x40
> <<EOE>> <IRQ> [<ffffffff814dbc2a>] ? qi_submit_sync+0x28a/0x450
> [<ffffffff813b1e1d>] ? scsi_run_queue+0x11d/0x280
> [<ffffffff814dbeca>] qi_flush_iotlb+0x5a/0x60
> [<ffffffff814dce9a>] flush_unmaps+0x15a/0x170
> [<ffffffff814dceb0>] ? flush_unmaps+0x170/0x170
> [<ffffffff814dcec9>] flush_unmaps_timeout+0x19/0x30
> [<ffffffff8105a7c2>] call_timer_fn.isra.29+0x22/0x80
> [<ffffffff8105a9d9>] run_timer_softirq+0x1b9/0x290
> [<ffffffff8120cc00>] ? timerqueue_add+0x60/0xb0
> [<ffffffff810546c9>] __do_softirq+0xd9/0x1a0
> [<ffffffff815daf7c>] call_softirq+0x1c/0x30
> [<ffffffff81004d75>] do_softirq+0x35/0x70
> [<ffffffff810548e5>] irq_exit+0x95/0xa0
> [<ffffffff8102c08f>] smp_apic_timer_interrupt+0x3f/0x50
> [<ffffffff815da90a>] apic_timer_interrupt+0x6a/0x70
> <EOI> [<ffffffff81070b52>] ? __hrtimer_start_range_ns+0x1f2/0x3b0
> [<ffffffff814ca1c7>] ? cpuidle_enter_state+0x47/0xc0
> [<ffffffff814ca1c3>] ? cpuidle_enter_state+0x43/0xc0
> [<ffffffff814ca2e9>] cpuidle_idle_call+0xa9/0x150
> [<ffffffff8100bed9>] arch_cpu_idle+0x9/0x20
> [<ffffffff8109619e>] cpu_startup_entry+0x7e/0x170
> [<ffffffff815c97eb>] rest_init+0x8b/0x90
> [<ffffffff81ab5d35>] start_kernel+0x2d9/0x2e4
> [<ffffffff81ab5865>] ? repair_env_string+0x5c/0x5c
> [<ffffffff81ab55a3>] x86_64_start_reservations+0x2a/0x2c
> [<ffffffff81ab566c>] x86_64_start_kernel+0xc7/0xca
> [ 321.271385] ---[ end trace e25797a0833ba41e ]---
> [ 321.272175] perf samples too long (226338 > 2500), lowering kernel.perf_event_max_sample_rate to 50100
> [ 321.272986] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 29.766 msecs
> [ 329.848706] perf samples too long (224588 > 4990), lowering kernel.perf_event_max_sample_rate to 25200
> [ 338.553847] perf samples too long (222847 > 9920), lowering kernel.perf_event_max_sample_rate to 12600
> [ 339.993145] mptscsih: ioc0: attempting task abort! (sc=ffff880422009d00)
> [ 339.993331] sd 14:0:3:0: [sdf] CDB:
> [ 339.993603] Read(10): 28 00 01 fa 8d 00 00 04 00 00

Thread overview (6+ messages):
2013-10-13 13:42 Hard lockup during intense reads from BR10i George Spelvin
2013-10-14 9:08 ` George Spelvin
2013-11-25 7:48 ` [mptscsih] Watchdog detected hard LOCKUP on cpu 0 George Spelvin
2013-11-25 12:01 ` James Bottomley [this message]
2013-11-25 17:16 ` George Spelvin
2013-11-28 10:06 ` George Spelvin