linux-scsi.vger.kernel.org archive mirror
* qla2xxx driver/firmware versions & thermal stability
@ 2004-08-16  8:07 Jamie Clark
  2004-08-16 21:19 ` Andrew Vasquez
  0 siblings, 1 reply; 3+ messages in thread
From: Jamie Clark @ 2004-08-16  8:07 UTC (permalink / raw)
  To: linux-scsi

In the recent past we ran qla2300 driver version 6.04.00 with qla2300F 
cards. The driver/card often died with the following messages  (kernel  
2.4.19-pre10aa4):

Jan 12 12:54:56 prodapp4 kernel: qla2x00: Status Entry invalid handle.
Jan 12 12:54:56 prodapp4 kernel: qla2x00: Performing ISP error recovery - ha= c493c07c.
...
Jan 12 12:55:22 prodapp4 kernel: qla2x00(2): ISP error recovery failed - board disabled
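
For anyone wanting to count how often this sequence occurs, here is a minimal Python sketch that scans syslog lines for the three messages quoted above. The function name `find_qla_events`, the event labels, and the `year` argument are assumptions made for illustration (classic syslog timestamps omit the year); nothing here comes from the qla2x00 driver itself.

```python
import re
from datetime import datetime

# Message fragments copied from the log excerpt above; the labels are hypothetical.
PATTERNS = {
    "invalid_handle": "Status Entry invalid handle",
    "recovery_start": "Performing ISP error recovery",
    "board_disabled": "ISP error recovery failed",
}

# Classic syslog prefix: "Jan 12 12:54:56 host kernel: message"
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+kernel:\s+(.*)$")


def find_qla_events(lines, year):
    """Return (timestamp, host, label) tuples for qla2x00 failure messages."""
    events = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a kernel syslog line
        stamp, host, msg = m.groups()
        for label, fragment in PATTERNS.items():
            if fragment in msg:
                # The year is supplied by the caller because syslog omits it.
                when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
                events.append((when, host, label))
                break
    return events
```

Feeding it the lines of a messages file (for example `find_qla_events(open(path), year)`) gives a list of timestamps that can then be lined up against other data.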

These failures seem to be temperature related. A slight upward shift in 
the computer room temperature (from very cold to not-quite-as-cold) 
*always* precipitated these errors. After upgrading all the machines to 
qla2340 boards we still have the same problem.

Now, I'm quite aware that this could be caused by poor cooling 
circulation (2U Intel Server chassis / STL2 board) but my basic tests 
with temperature probes indicate nothing excessive - in fact everything 
seems chilly. The room is kept at 18degC, the server cases are cold to 
touch, and the internal case temperature around the cards is about 22degC. 
I have detailed 5-minute temperature data going back a year, all of 
it well inside the QLogic operating specs.
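
One way to test the "slight upward shift" observation against the 5-minute data is to look at how much the temperature moved in the window leading up to each failure. The sketch below is only an illustration: the function name `temp_change_before`, the one-hour default window, and the `(datetime, degC)` sample layout are all assumptions, not a description of the actual monitoring setup.

```python
from datetime import datetime, timedelta


def temp_change_before(samples, event_time, window=timedelta(hours=1)):
    """Return (start_temp, end_temp, delta) for the window ending at event_time.

    samples: list of (datetime, degC) tuples, sorted by time (assumed layout).
    Returns None if fewer than two samples fall inside the window.
    """
    inside = [(t, c) for t, c in samples
              if event_time - window <= t <= event_time]
    if len(inside) < 2:
        return None
    start, end = inside[0][1], inside[-1][1]
    return start, end, end - start
```

Comparing the deltas before failures with the deltas at randomly chosen times would show whether the rise really is unusual, even when the absolute temperature stays within spec.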

My question is: has anyone else seen anything like this?  Does the older 
QLogic firmware tend to run hotter?  Is the newer any better?  At the 
moment we're on 6.04.00 as this is the release that is supposedly 
certified by Hitachi (storage vendor). I'd be happy to upgrade to a later 
version if I knew it would cure the thermal sensitivity.

boot probe with 2340 and 6.04 driver:

qla2x00_set_info starts at address = c3ac0060
qla2x00: Found  VID=1077 DID=2312 SSVID=1077 SSDID=100
scsi(2): Found a QLA2312  @ bus 1, device 0xa, irq 20, iobase 0x6400
scsi(2): Allocated 4096 SRB(s).
scsi(2): Configure NVRAM parameters...
scsi(2): 32 Bit PCI Addressing Enabled.
scsi(2): Verifying loaded RISC code...
scsi(2): Verifying chip...
scsi(2): Waiting for LIP to complete...
scsi(2): LOOP UP detected.
scsi(2): Topology - (F_Port), Host Loop address 0xffff
scsi2 : QLogic QLA2312 PCI to Fibre Channel Host Adapter: bus 1 device 10 irq 20
        Firmware version:  3.01.18, Driver version 6.04.00
  Vendor: HITACHI   Model: DF600F            Rev: 0000
  Type:   Direct-Access                      ANSI SCSI revision: 03
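
When comparing machines, it can help to pull the firmware and driver versions out of each host's boot probe automatically. A minimal sketch follows, assuming the "Firmware version: ..., Driver version ..." line keeps the format shown above; `parse_versions` is a hypothetical helper, not part of any QLogic tool.

```python
import re

# Matches the version line as printed in the boot probe above.
VERSION_RE = re.compile(
    r"Firmware version:\s*([\d.]+),\s*Driver version\s*([\d.]+)"
)


def parse_versions(text):
    """Return {'firmware': ..., 'driver': ...} or None if no version line is found."""
    m = VERSION_RE.search(text)
    if m is None:
        return None
    return {"firmware": m.group(1), "driver": m.group(2)}
```

Running this over the saved boot output of each server gives a quick inventory to check whether failing and non-failing hosts differ in firmware or driver.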



* Re: qla2xxx driver/firmware versions & thermal stability
  2004-08-16  8:07 qla2xxx driver/firmware versions & thermal stability Jamie Clark
@ 2004-08-16 21:19 ` Andrew Vasquez
  2004-08-16 23:41   ` Jamie Clark
  0 siblings, 1 reply; 3+ messages in thread
From: Andrew Vasquez @ 2004-08-16 21:19 UTC (permalink / raw)
  To: Jamie Clark; +Cc: linux-scsi

On Mon, 16 Aug 2004, Jamie Clark wrote:

> 
> In the recent past we ran qla2300 driver version 6.04.00 with qla2300F 
> cards. The driver/card often died with the following messages  (kernel  
> 2.4.19-pre10aa4):
> 
> Jan 12 12:54:56 prodapp4 kernel: qla2x00: Status Entry invalid handle.
> Jan 12 12:54:56 prodapp4 kernel: qla2x00: Performing ISP error recovery 
> - ha= c493c07c.
> ...
> Jan 12 12:55:22 prodapp4 kernel: qla2x00(2): ISP error recovery failed - 
> board disabled
> 
> These failures seem to be temperature related. A slight upward shift in 
> the computer room temperature (from very cold to not-quite-as-cold) 
> *always* precipitated these errors. After upgrading all the machines to 
> qla2340 boards we still have the same problem.
> 

Could you report your findings to QLogic tech-support
(www.qlogic.com/support)?  They'll be able to assist in
troubleshooting.

Regards,
Andrew Vasquez
QLogic Corporation


* Re: qla2xxx driver/firmware versions & thermal stability
  2004-08-16 21:19 ` Andrew Vasquez
@ 2004-08-16 23:41   ` Jamie Clark
  0 siblings, 0 replies; 3+ messages in thread
From: Jamie Clark @ 2004-08-16 23:41 UTC (permalink / raw)
  To: Andrew Vasquez; +Cc: linux-scsi

Andrew Vasquez wrote:

>Could you report your findings to QLogic tech-support
>(www.qlogic.com/support), they'll be able to assist in
>troubleshooting.  
>
>  
>
:) Been there recently. In a nutshell: "it shouldn't happen,
but that's quite an old driver, perhaps you could try
upgrading"

Forgive me for not taking the advice immediately, but if
you knew how long I'd been looking at this, how much
equipment I isolated/replaced and how many versions
of the qla driver I built/tried *before* 6.04...

When my colleague raised this on lkml 12-18 months
ago (before we knew it was heat related) he had an
off-list reply from someone who had spent a long
time chasing the same error - and traced it to heat
sensitivity (within operating spec).

I originally concluded that it was perhaps a bad batch
of cards - hence replacing all the 2300s with 2340s.

I may eventually replace the driver as I have almost
run out of options - but I was hoping for something
a little more scientific after my past driver upgrades.
