From: "Darrick J. Wong" <djwong@us.ibm.com>
To: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: linux-scsi <linux-scsi@vger.kernel.org>,
	James Bottomley <James.Bottomley@SteelEye.com>
Subject: Re: aic94xx IO errors with "escb_tasklet_complete: phy0: REQ_TASK_ABORT"
Date: Wed, 04 Oct 2006 17:50:35 -0700
Message-ID: <452456DB.20407@us.ibm.com>
In-Reply-To: <20061004164438.GC5091@rhun.haifa.ibm.com>

Muli Ben-Yehuda wrote:
> [resending as it probably hit the 100K limit the first time]
> 
> I'm seeing these aic94xx IO errors on an IBM x366, usually after I
> copy ~20GB but occasionally as soon as heavy IO starts. Happens with
> and without Calgary enabled (iommu=off). I'm seeing this on two
> different disks which badblocks claims are ok. The machine usually
> stays up and keeps chugging along after this happens.

I hit a real REQ_TASK_ABORT about five minutes into a pounder run.
Below is the serial log from what happened.  Muli, do you see something
like this?  (REQ_TASK_ABORT w/ reason code 0x6 (PROTOCOL ERROR)?)

I'm testing my experimental patch to feed these REQ_* errors up to
libsas; also note that there appear to be bugs in my implementation. :)

--D

[  862.993067] aic94xx: escb_tasklet_complete: phy0: REQ_TASK_ABORT(f0) tc: 16 stat: 6 dl->idx: 0
[  863.001658] aic94xx: escb_tasklet_complete: kicking ascb ffff810096953880 
[  863.047452] aic94xx: escb_tasklet_complete: kicking ascb ffff810096953880 

Suspicious that we try to fail this twice... looks like I have something to do tomorrow. :)
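
One way to check a captured serial log for this pattern is to count how often each ascb address shows up in a "kicking ascb" line. The following is only a rough sketch: the helper name repeated_kicks and the regular expression are made up for illustration and assume the message format shown in the log lines above; neither is part of the driver or any kernel tool.

```python
import re
from collections import Counter

# Assumed message format, taken from the log lines above:
# [  863.001658] aic94xx: escb_tasklet_complete: kicking ascb ffff810096953880
KICK_RE = re.compile(r"escb_tasklet_complete: kicking ascb ([0-9a-f]+)")


def repeated_kicks(log_text):
    """Return {ascb_address: count} for every ascb kicked more than once."""
    counts = Counter(KICK_RE.findall(log_text))
    return {addr: n for addr, n in counts.items() if n > 1}


# The three driver messages that precede the oops in this log.
SAMPLE = """\
[  862.993067] aic94xx: escb_tasklet_complete: phy0: REQ_TASK_ABORT(f0) tc: 16 stat: 6 dl->idx: 0
[  863.001658] aic94xx: escb_tasklet_complete: kicking ascb ffff810096953880
[  863.047452] aic94xx: escb_tasklet_complete: kicking ascb ffff810096953880
"""

if __name__ == "__main__":
    print(repeated_kicks(SAMPLE))  # prints {'ffff810096953880': 2}
```

Any address reported here was kicked at least twice, which is the case worth looking at first.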

[  863.085458] ----------- [cut here ] --------- [please bite here ] ---------
[  863.092397] Kernel BUG at include/linux/mm.h:300
[  863.096998] invalid opcode: 0000 [1] PREEMPT SMP 
[  863.101714] CPU 0 
[  863.103725] Modules linked in: ext2 ext3 jbd mbcache acpi_cpufreq processor cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_ondemand freq_table cpufreq_conservative dm_mod md_mod ipv6 sg sd_mod aic94xx libsas firmware_class scsi_transport_sas ide_cd cdrom ata_generic ata_piix generic serio_raw ahci ehci_hcd libata scsi_mod piix ide_core shpchp pci_hotplug uhci_hcd usbcore mousedev tsdev evdev unix
[  863.140063] Pid: 3838, comm: memxfer5b Not tainted 2.6.18-git4-dic94xx #104
[  863.147002] RIP: 0010:[<ffffffff8012e033>]  [<ffffffff8012e033>] __free_pages+0xb/0x32
[  863.154909] RSP: 0000:ffffffff80513d70  EFLAGS: 00010046
[  863.160203] RAX: 0000000000000000 RBX: ffff810098478000 RCX: 000000000000003f
[  863.167314] RDX: ffff81000000d000 RSI: 0000000000000000 RDI: ffff8100bf13e940
[  863.174426] RBP: ffffffff80513d70 R08: 0000000000000002 R09: ffffffff80115ab8
[  863.181538] R10: ffffffff80115ab8 R11: 00000000000f4240 R12: 0000000000000000
[  863.188650] R13: ffff8100ba048000 R14: ffff810096953880 R15: ffff8100ba126d08
[  863.195763] FS:  00002b3835d4c6d0(0000) GS:ffffffff808af000(0000) knlGS:0000000000000000
[  863.203827] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  863.209553] CR2: 00002ac0e3e27000 CR3: 00000000baea3000 CR4: 00000000000006e0
[  863.216666] Process memxfer5b (pid: 3838, threadinfo ffff810071e1c000, task ffff81003d284080)
[  863.225162] Stack:  ffffffff80513d90 ffffffff80135dd6 00000000000001c0 ffff810098478000
[  863.233186]  ffffffff80513db0 ffffffff8016fbc2 ffff8100ba759680 ffff810081ed9480
[  863.240594]  ffffffff80513de0 ffffffff88181965 0000000000000002 0000000000000006
[  863.247819] Call Trace:
[  863.250555]  [<ffffffff80135dd6>] free_pages+0x85/0x8a
[  863.255787]  [<ffffffff8016fbc2>] dma_free_coherent+0x41/0x46
[  863.261539]  [<ffffffff88181965>] :aic94xx:asd_unbuild_ssp_ascb+0x98/0xfa
[  863.268320]  [<ffffffff88182be3>] :aic94xx:asd_escb_tasklet_complete+0x2dc/0x465
[  863.275704]  [<ffffffff8817e3d8>] :aic94xx:escb_tasklet_complete+0x8d1/0xa25
[  863.282739]  [<ffffffff88173916>] :aic94xx:asd_dl_tasklet_handler+0xd0/0x103
[  863.289768]  [<ffffffff8018e03f>] tasklet_action+0x6d/0xc5
[  863.295294]  [<ffffffff80110837>] __do_softirq+0x6b/0xf6
[  863.300646]  [<ffffffff8015dae8>] call_softirq+0x1c/0x28
[  863.305950] DWARF2 unwinder stuck at call_softirq+0x1c/0x28
[  863.311503] Leftover inexact backtrace:
[  863.315323]  <IRQ> [<ffffffff8016c0d3>] do_softirq+0x36/0x9c
[  863.320983]  [<ffffffff8018de7c>] irq_exit+0x4e/0x5a
[  863.325933]  [<ffffffff8016c2fd>] do_IRQ+0xf4/0xfe
[  863.330710]  [<ffffffff8015cd46>] ret_from_intr+0x0/0xf
[  863.335917]  <EOI>
[  863.337938] 
[  863.337939] Code: 0f 0b 68 40 a6 37 80 c2 2c 01 f0 ff 4f 08 0f 94 c0 84 c0 74 
[  863.346801] RIP  [<ffffffff8012e033>] __free_pages+0xb/0x32
[  863.352365]  RSP <ffffffff80513d70>
[  863.356089]  <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
[  863.364077] in_atomic():1, irqs_disabled():1
[  863.368329] 
[  863.368330] Call Trace:
[  863.372338]  [<ffffffff8016af36>] show_trace+0xae/0x33a
[  863.377556]  [<ffffffff8016b3d9>] dump_stack+0x13/0x15
[  863.382686]  [<ffffffff8010b294>] __might_sleep+0xb3/0xb5
[  863.388112]  [<ffffffff8019ce3a>] down_read+0x1a/0x42
[  863.393225]  [<ffffffff80194c87>] blocking_notifier_call_chain+0x18/0x3d
[  863.399972]  [<ffffffff8018ba8a>] profile_task_exit+0x15/0x17
[  863.405755]  [<ffffffff80113c1c>] do_exit+0x25/0x9c6
[  863.410756]  [<ffffffff8016b41f>] kernel_math_error+0x0/0x96
[  863.416406]  [<ffff81003d284080>]
[  863.419711] DWARF2 unwinder stuck at 0xffff81003d284080
[  863.424917] Leftover inexact backtrace:
[  863.428737]  <IRQ> [<ffffffff80164e70>] do_trap+0xdb/0xea
[  863.434138]  [<ffffffff8016b8dc>] do_invalid_op+0xac/0xb8
[  863.439520]  [<ffffffff8012e033>] __free_pages+0xb/0x32
[  863.444731]  [<ffffffff80115c52>] release_console_sem+0x1e4/0x21e
[  863.450811]  [<ffffffff8018b8c2>] vprintk+0x2d8/0x333
[  863.455851]  [<ffffffff8015d5c1>] error_exit+0x0/0x96
[  863.460890]  [<ffffffff80115ab8>] release_console_sem+0x4a/0x21e
[  863.466878]  [<ffffffff80115ab8>] release_console_sem+0x4a/0x21e
[  863.472868]  [<ffffffff8012e033>] __free_pages+0xb/0x32
[  863.478080]  [<ffffffff80135dd6>] free_pages+0x85/0x8a
[  863.483205]  [<ffffffff8016fbc2>] dma_free_coherent+0x41/0x46
[  863.488941]  [<ffffffff88181965>] :aic94xx:asd_unbuild_ssp_ascb+0x98/0xfa
[  863.495715]  [<ffffffff88182be3>] :aic94xx:asd_escb_tasklet_complete+0x2dc/0x465
[  863.503100]  [<ffffffff8817e3d8>] :aic94xx:escb_tasklet_complete+0x8d1/0xa25
[  863.510133]  [<ffffffff8019f3f7>] trace_hardirqs_on+0xe6/0x124
[  863.515955]  [<ffffffff88173916>] :aic94xx:asd_dl_tasklet_handler+0xd0/0x103
[  863.522985]  [<ffffffff8018e03f>] tasklet_action+0x6d/0xc5
[  863.528455]  [<ffffffff80110837>] __do_softirq+0x6b/0xf6
[  863.533753]  [<ffffffff8015dae8>] call_softirq+0x1c/0x28
[  863.539050]  [<ffffffff8016c0d3>] do_softirq+0x36/0x9c
[  863.544173]  [<ffffffff8018de7c>] irq_exit+0x4e/0x5a
[  863.549122]  [<ffffffff8016c2fd>] do_IRQ+0xf4/0xfe
[  863.553899]  [<ffffffff8015cd46>] ret_from_intr+0x0/0xf
[  863.559106]  <EOI>
[  863.561147] Kernel panic - not syncing: Aiee, killing interrupt handler!
[  863.567831]  <0>Rebooting in 30 seconds..

