* qla2xxx driver/firmware versions & thermal stability
@ 2004-08-16 8:07 Jamie Clark
2004-08-16 21:19 ` Andrew Vasquez
0 siblings, 1 reply; 3+ messages in thread
From: Jamie Clark @ 2004-08-16 8:07 UTC
To: linux-scsi
In the recent past we ran qla2300 driver version 6.04.00 with qla2300F
cards. The driver/card often died with the following messages (kernel
2.4.19-pre10aa4):
Jan 12 12:54:56 prodapp4 kernel: qla2x00: Status Entry invalid handle.
Jan 12 12:54:56 prodapp4 kernel: qla2x00: Performing ISP error recovery - ha= c493c07c.
...
Jan 12 12:55:22 prodapp4 kernel: qla2x00(2): ISP error recovery failed - board disabled
These failures seem to be temperature related. A slight upward shift in
the computer room temperature (from very cold to not-quite-as-cold)
*always* precipitated these errors. After upgrading all the machines to
qla2340 boards we still have the same problem.
Now, I'm quite aware that this could be caused by poor cooling
circulation (2U Intel Server chassis / STL2 board) but my basic tests
with temperature probes indicate nothing excessive - in fact everything
seems chilly. The room is kept at 18degC, the server cases are cold to
touch, internal case temperature around the cards of about 22degC. I
have detailed 5-minute temperature data going back for a year - all of
it well inside the QLogic operating specs.
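A minimal sketch of how the failure timestamps could be lined up against the 5-minute temperature readings, to check the "slight upward shift" observation numerically. The two error strings come from the log excerpt above; the function names, the (datetime, degC) sample format, the one-hour window and the need to pass the year by hand are all assumptions made for illustration, not part of any QLogic tool.

```python
import re
from bisect import bisect_right
from datetime import datetime, timedelta

# Error strings copied from the syslog excerpt; everything else is hypothetical.
ERROR_PATTERNS = (
    "Status Entry invalid handle",
    "ISP error recovery failed",
)

# Classic syslog layout: "Mon DD HH:MM:SS host kernel: message"
SYSLOG_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+kernel:\s+(.*)$")


def parse_errors(lines, year):
    """Yield (timestamp, message) for qla2x00 error lines in syslog text."""
    for line in lines:
        m = SYSLOG_RE.match(line)
        if not m:
            continue
        stamp, _host, msg = m.groups()
        if "qla2x00" in msg and any(p in msg for p in ERROR_PATTERNS):
            # Classic syslog omits the year, so the caller must supply it.
            yield datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"), msg


def temp_rise_before(samples, when, window=timedelta(hours=1)):
    """Return the temperature change over `window` ending at `when`.

    `samples` is a time-sorted list of (datetime, degC) tuples (assumed format).
    Returns None if no sample exists at or before either end of the window.
    """
    times = [t for t, _ in samples]
    end = bisect_right(times, when) - 1
    start = bisect_right(times, when - window) - 1
    if end < 0 or start < 0:
        return None
    return samples[end][1] - samples[start][1]
```

Calling temp_rise_before() for every timestamp from parse_errors(), and for a set of randomly chosen quiet timestamps, would show whether failures really cluster after a rise rather than at a particular absolute temperature.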
My question is: has anyone else seen anything like this? Does the older
QLogic firmware tend to run hotter? Is the newer firmware any better? At the
moment we're on 6.04.00 as this is the release that is supposedly
certified by Hitachi (storage vendor). I'd be happy to upgrade to a later
version if I knew it would cure the thermal sensitivity.
boot probe with 2340 and 6.04 driver:
qla2x00_set_info starts at address = c3ac0060
qla2x00: Found VID=1077 DID=2312 SSVID=1077 SSDID=100
scsi(2): Found a QLA2312 @ bus 1, device 0xa, irq 20, iobase 0x6400
scsi(2): Allocated 4096 SRB(s).
scsi(2): Configure NVRAM parameters...
scsi(2): 32 Bit PCI Addressing Enabled.
scsi(2): Verifying loaded RISC code...
scsi(2): Verifying chip...
scsi(2): Waiting for LIP to complete...
scsi(2): LOOP UP detected.
scsi(2): Topology - (F_Port), Host Loop address 0xffff
scsi2 : QLogic QLA2312 PCI to Fibre Channel Host Adapter: bus 1 device 10 irq 20
Firmware version: 3.01.18, Driver version 6.04.00
Vendor: HITACHI Model: DF600F Rev: 0000
Type: Direct-Access ANSI SCSI revision: 03
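For comparing machines, a small sketch that pulls the firmware and driver versions out of a boot banner like the one above. It assumes the banner always has the "Firmware version: X, Driver version Y" shape shown in this probe output; the function name is hypothetical.

```python
import re

# Matches the banner line printed in the boot probe output above.
VERSION_RE = re.compile(
    r"Firmware version:\s*(?P<firmware>\d+(?:\.\d+)*),\s*"
    r"Driver version\s*(?P<driver>\d+(?:\.\d+)*)"
)


def parse_versions(text):
    """Return (firmware, driver) as tuples of ints, or None if no banner is found."""
    m = VERSION_RE.search(text)
    if m is None:
        return None

    def to_tuple(version):
        # "6.04.00" -> (6, 4, 0), so versions compare numerically.
        return tuple(int(part) for part in version.split("."))

    return to_tuple(m.group("firmware")), to_tuple(m.group("driver"))
```

Run over the saved dmesg output of each host, this gives a quick inventory of which firmware/driver pairs were in use when a board was disabled.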
* Re: qla2xxx driver/firmware versions & thermal stability
From: Andrew Vasquez @ 2004-08-16 21:19 UTC
To: Jamie Clark; +Cc: linux-scsi
On Mon, 16 Aug 2004, Jamie Clark wrote:
>
> In the recent past we ran qla2300 driver version 6.04.00 with qla2300F
> cards. The driver/card often died with the following messages (kernel
> 2.4.19-pre10aa4):
>
> Jan 12 12:54:56 prodapp4 kernel: qla2x00: Status Entry invalid handle.
> Jan 12 12:54:56 prodapp4 kernel: qla2x00: Performing ISP error recovery - ha= c493c07c.
> ...
> Jan 12 12:55:22 prodapp4 kernel: qla2x00(2): ISP error recovery failed - board disabled
>
> These failures seem to be temperature related. A slight upward shift in
> the computer room temperature (from very cold to not-quite-as-cold)
> *always* precipitated these errors. After upgrading all the machines to
> qla2340 boards we still have the same problem.
>
Could you report your findings to QLogic tech-support
(www.qlogic.com/support)? They'll be able to assist in
troubleshooting.
Regards,
Andrew Vasquez
QLogic Corporation
* Re: qla2xxx driver/firmware versions & thermal stability
From: Jamie Clark @ 2004-08-16 23:41 UTC
To: Andrew Vasquez; +Cc: linux-scsi
Andrew Vasquez wrote:
>Could you report your findings to QLogic tech-support
>(www.qlogic.com/support), they'll be able to assist in
>troubleshooting.
>
:) Been there recently. In a nutshell: "it shouldn't happen,
but that's quite an old driver, perhaps you could try
upgrading".
Forgive me for not taking the advice immediately, but if
you knew how long I'd been looking at this, how much
equipment I've isolated/replaced and how many versions
of the qla driver I built/tried *before* 6.04...
When my colleague raised this on lkml 12-18 months
ago (before we knew it was heat related) he had an
off-list reply from someone who had spent a long
time chasing the same error - and traced it to heat
sensitivity (within operating spec).
I originally concluded that it was perhaps a bad batch
of cards - hence replacing all the 2300s with 2340s.
I may eventually replace the driver as I have almost
run out of options - but I was hoping for something
a little more scientific after my past driver upgrades.