From: "Darrick J. Wong" <djwong@us.ibm.com>
To: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: linux-scsi <linux-scsi@vger.kernel.org>,
James Bottomley <James.Bottomley@SteelEye.com>
Subject: Re: aic94xx IO errors with "escb_tasklet_complete: phy0: REQ_TASK_ABORT"
Date: Wed, 04 Oct 2006 17:50:35 -0700 [thread overview]
Message-ID: <452456DB.20407@us.ibm.com> (raw)
In-Reply-To: <20061004164438.GC5091@rhun.haifa.ibm.com>
Muli Ben-Yehuda wrote:
> [resending as it probably hit the 100K limit the first time]
>
> I'm seeing these aic94xx IO errors on an IBM x366, usually after I
> copy ~20GB but occasionally as soon as heavy IO starts. Happens with
> and without Calgary enabled (iommu=off). I'm seeing this on two
> different disks which badblocks claims are ok. The machine usually
> stays up and keeps chugging along after this happens.
I hit a real REQ_TASK_ABORT about five minutes into a pounder run.
Below is the serial log from what happened. Muli, do you see something
like this? (REQ_TASK_ABORT w/ reason code 0x6 (PROTOCOL ERROR)?)
I'm testing my experimental patch to feed these REQ_* errors up to
libsas; also note that there appear to be bugs in my implementation. :)
--D
[ 862.993067] aic94xx: escb_tasklet_complete: phy0: REQ_TASK_ABORT(f0) tc: 16 stat: 6 dl->idx: 0
[ 863.001658] aic94xx: escb_tasklet_complete: kicking ascb ffff810096953880
[ 863.047452] aic94xx: escb_tasklet_complete: kicking ascb ffff810096953880
Suspicious that we try to fail this twice... looks like I have something to do tomorrow. :)
[ 863.085458] ----------- [cut here ] --------- [please bite here ] ---------
[ 863.092397] Kernel BUG at include/linux/mm.h:300
[ 863.096998] invalid opcode: 0000 [1] PREEMPT SMP
[ 863.101714] CPU 0
[ 863.103725] Modules linked in: ext2 ext3 jbd mbcache acpi_cpufreq processor cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_onde
mand freq_table cpufreq_conservative dm_mod md_mod ipv6 sg sd_mod aic94xx libsas firmware_class scsi_transport_sas ide_cd cdrom ata_generic a
ta_piix generic serio_raw ahci ehci_hcd libata scsi_mod piix ide_core shpchp pci_hotplug uhci_hcd usbcore mousedev tsdev evdev unix
[ 863.140063] Pid: 3838, comm: memxfer5b Not tainted 2.6.18-git4-dic94xx #104
[ 863.147002] RIP: 0010:[<ffffffff8012e033>] [<ffffffff8012e033>] __free_pages+0xb/0x32
[ 863.154909] RSP: 0000:ffffffff80513d70 EFLAGS: 00010046
[ 863.160203] RAX: 0000000000000000 RBX: ffff810098478000 RCX: 000000000000003f
[ 863.167314] RDX: ffff81000000d000 RSI: 0000000000000000 RDI: ffff8100bf13e940
[ 863.174426] RBP: ffffffff80513d70 R08: 0000000000000002 R09: ffffffff80115ab8
[ 863.181538] R10: ffffffff80115ab8 R11: 00000000000f4240 R12: 0000000000000000
[ 863.188650] R13: ffff8100ba048000 R14: ffff810096953880 R15: ffff8100ba126d08
[ 863.195763] FS: 00002b3835d4c6d0(0000) GS:ffffffff808af000(0000) knlGS:0000000000000000
[ 863.203827] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 863.209553] CR2: 00002ac0e3e27000 CR3: 00000000baea3000 CR4: 00000000000006e0
[ 863.216666] Process memxfer5b (pid: 3838, threadinfo ffff810071e1c000, task ffff81003d284080)
[ 863.225162] Stack: ffffffff80513d90 ffffffff80135dd6 00000000000001c0 ffff810098478000
[ 863.233186] ffffffff80513db0 ffffffff8016fbc2 ffff8100ba759680 ffff810081ed9480
[ 863.240594] ffffffff80513de0 ffffffff88181965 0000000000000002 0000000000000006
[ 863.247819] Call Trace:
[ 863.250555] [<ffffffff80135dd6>] free_pages+0x85/0x8a
[ 863.255787] [<ffffffff8016fbc2>] dma_free_coherent+0x41/0x46
[ 863.261539] [<ffffffff88181965>] :aic94xx:asd_unbuild_ssp_ascb+0x98/0xfa
[ 863.268320] [<ffffffff88182be3>] :aic94xx:asd_escb_tasklet_complete+0x2dc/0x465
[ 863.275704] [<ffffffff8817e3d8>] :aic94xx:escb_tasklet_complete+0x8d1/0xa25
[ 863.282739] [<ffffffff88173916>] :aic94xx:asd_dl_tasklet_handler+0xd0/0x103
[ 863.289768] [<ffffffff8018e03f>] tasklet_action+0x6d/0xc5
[ 863.295294] [<ffffffff80110837>] __do_softirq+0x6b/0xf6
[ 863.300646] [<ffffffff8015dae8>] call_softirq+0x1c/0x28
[ 863.305950] DWARF2 unwinder stuck at call_softirq+0x1c/0x28
[ 863.311503] Leftover inexact backtrace:
[ 863.315323] <IRQ> [<ffffffff8016c0d3>] do_softirq+0x36/0x9c
[ 863.320983] [<ffffffff8018de7c>] irq_exit+0x4e/0x5a
[ 863.325933] [<ffffffff8016c2fd>] do_IRQ+0xf4/0xfe
[ 863.330710] [<ffffffff8015cd46>] ret_from_intr+0x0/0xf
[ 863.335917] <EOI>
[ 863.337938]
[ 863.337939] Code: 0f 0b 68 40 a6 37 80 c2 2c 01 f0 ff 4f 08 0f 94 c0 84 c0 74
[ 863.346801] RIP [<ffffffff8012e033>] __free_pages+0xb/0x32
[ 863.352365] RSP <ffffffff80513d70>
[ 863.356089] <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
[ 863.364077] in_atomic():1, irqs_disabled():1
[ 863.368329]
[ 863.368330] Call Trace:
[ 863.372338] [<ffffffff8016af36>] show_trace+0xae/0x33a
[ 863.377556] [<ffffffff8016b3d9>] dump_stack+0x13/0x15
[ 863.382686] [<ffffffff8010b294>] __might_sleep+0xb3/0xb5
[ 863.388112] [<ffffffff8019ce3a>] down_read+0x1a/0x42
[ 863.393225] [<ffffffff80194c87>] blocking_notifier_call_chain+0x18/0x3d
[ 863.399972] [<ffffffff8018ba8a>] profile_task_exit+0x15/0x17
[ 863.405755] [<ffffffff80113c1c>] do_exit+0x25/0x9c6
[ 863.410756] [<ffffffff8016b41f>] kernel_math_error+0x0/0x96
[ 863.416406] [<ffff81003d284080>]
[ 863.419711] DWARF2 unwinder stuck at 0xffff81003d284080
[ 863.424917] Leftover inexact backtrace:
[ 863.428737] <IRQ> [<ffffffff80164e70>] do_trap+0xdb/0xea
[ 863.434138] [<ffffffff8016b8dc>] do_invalid_op+0xac/0xb8
[ 863.439520] [<ffffffff8012e033>] __free_pages+0xb/0x32
[ 863.444731] [<ffffffff80115c52>] release_console_sem+0x1e4/0x21e
[ 863.450811] [<ffffffff8018b8c2>] vprintk+0x2d8/0x333
[ 863.455851] [<ffffffff8015d5c1>] error_exit+0x0/0x96
[ 863.460890] [<ffffffff80115ab8>] release_console_sem+0x4a/0x21e
[ 863.466878] [<ffffffff80115ab8>] release_console_sem+0x4a/0x21e
[ 863.472868] [<ffffffff8012e033>] __free_pages+0xb/0x32
[ 863.478080] [<ffffffff80135dd6>] free_pages+0x85/0x8a
[ 863.483205] [<ffffffff8016fbc2>] dma_free_coherent+0x41/0x46
[ 863.488941] [<ffffffff88181965>] :aic94xx:asd_unbuild_ssp_ascb+0x98/0xfa
[ 863.495715] [<ffffffff88182be3>] :aic94xx:asd_escb_tasklet_complete+0x2dc/0x465
[ 863.503100] [<ffffffff8817e3d8>] :aic94xx:escb_tasklet_complete+0x8d1/0xa25
[ 863.510133] [<ffffffff8019f3f7>] trace_hardirqs_on+0xe6/0x124
[ 863.515955] [<ffffffff88173916>] :aic94xx:asd_dl_tasklet_handler+0xd0/0x103
[ 863.522985] [<ffffffff8018e03f>] tasklet_action+0x6d/0xc5
[ 863.528455] [<ffffffff80110837>] __do_softirq+0x6b/0xf6
[ 863.533753] [<ffffffff8015dae8>] call_softirq+0x1c/0x28
[ 863.539050] [<ffffffff8016c0d3>] do_softirq+0x36/0x9c
[ 863.544173] [<ffffffff8018de7c>] irq_exit+0x4e/0x5a
[ 863.549122] [<ffffffff8016c2fd>] do_IRQ+0xf4/0xfe
[ 863.553899] [<ffffffff8015cd46>] ret_from_intr+0x0/0xf
[ 863.559106] <EOI>
[ 863.561147] Kernel panic - not syncing: Aiee, killing interrupt handler!
[ 863.567831] <0>Rebooting in 30 seconds..
prev parent reply other threads:[~2006-10-05 0:50 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-10-04 16:44 aic94xx IO errors with "escb_tasklet_complete: phy0: REQ_TASK_ABORT" Muli Ben-Yehuda
2006-10-04 18:29 ` Andy Warner
2006-10-04 20:12 ` Darrick J. Wong
2006-10-04 19:31 ` Andy Warner
2006-10-04 20:56 ` Muli Ben-Yehuda
2006-10-04 21:11 ` James Bottomley
2006-10-05 21:55 ` Muli Ben-Yehuda
2006-10-05 22:11 ` Darrick J. Wong
2006-10-16 19:51 ` Muli Ben-Yehuda
2006-10-16 20:00 ` James Bottomley
2006-10-16 20:47 ` Muli Ben-Yehuda
2006-10-05 0:50 ` Darrick J. Wong [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=452456DB.20407@us.ibm.com \
--to=djwong@us.ibm.com \
--cc=James.Bottomley@SteelEye.com \
--cc=linux-scsi@vger.kernel.org \
--cc=muli@il.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox