From: Tore Anderson <tore@linpro.no>
To: Andrew Vasquez <andrew.vasquez@qlogic.com>
Cc: Linux SCSI Mailing List <linux-scsi@vger.kernel.org>
Subject: Re: Recurring qla2xxx crashes (maybe APIC related)
Date: Fri, 25 Apr 2008 19:06:19 +0200
Message-ID: <48120F8B.7060005@linpro.no>
In-Reply-To: <20080425155018.GG8849@plap4>
Hi Andrew,
* Andrew Vasquez
> There's a slew of problem reports noted on the web with this 'APIC
> error' signature... From the qla2xxx driver perspective the following
> logs show the classic 'no interrupts being routed' failures:
Yes - I suspect this might not have anything to do with the HBA at all.
It is a bit odd, though, that it is always the qla2xxx driver that runs
into trouble. I/O to the local hard drives, for instance, continues to
work (which is fortunate, as that's where I have the kernel logs).
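One way to check the 'no interrupts being routed' theory on a live system is to compare the qla2xxx counters in /proc/interrupts between two samples taken a few seconds apart. Below is a minimal Python sketch, assuming the usual /proc/interrupts layout (an IRQ number followed by one counter per CPU and then a description). The function names are illustrative and not part of the driver or any existing tool.

```python
def irq_counts(interrupts_text, driver="qla2xxx"):
    """Return {irq: count summed over all CPUs} for lines naming the driver.

    Assumes the usual /proc/interrupts layout:
      " 16:   1200   300   IO-APIC-fasteoi   qla2xxx"
    """
    counts = {}
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Skip the CPU header line, blank lines, and unrelated IRQs.
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue
        if driver not in line:
            continue
        total = 0
        for field in fields[1:]:
            if not field.isdigit():
                break  # reached the controller/description columns
            total += int(field)
        counts[fields[0].rstrip(":")] = total
    return counts


def stalled_irqs(before, after):
    """Return the IRQs whose counter did not change between two samples."""
    return sorted(irq for irq, n in after.items() if before.get(irq) == n)
```

Feed it the contents of /proc/interrupts read twice while I/O is outstanding on the HBA. A counter that does not move is only a hint: it will also stay flat when the adapter is simply idle.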
> So do the abort requests fail with the similar signature (timeout)?
The log is unedited, so if it doesn't say, then I don't know.
I/O service never recovers after the crash, so the multipath maps block
all I/O until the machine is rebooted (which the remaining cluster
members take care of within a minute).
> There's a blanket suggestion that has helped others (perhaps by
> ignoring the problem), disable the APIC:
>
> apm=force noapic acpi=off pci=noacpi
>
> but that seems like a bandaid. I'd suggest you work this through your
> IBM support contract, if possible.
I will try to do both, thank you for the suggestions. I fear IBM will
hang up on me for not running SuSE or Red Hat, though...
> BTW: I'd like to take a look at several failure iterations, could you
> send the messages file during the failures...
Okay, sent you the (unedited) kern.log since the last log rotation. It
contains several crash events, as well as the bootup messages (left them
in there in case there's anything interesting for you to see).
I have many more crash events in the rotated logs. If you want I can
send you those too (maybe off list due to their size), just say so.
They all look the same, though: APIC errors followed by qla2xxx
attempting to fix it, but the rports never recover and in the end the
machine is rebooted by another cluster node.
Regards,
--
Tore Anderson
[-- Attachment #2: kern.log.gz --]
[-- Type: application/x-gzip, Size: 46847 bytes --]