public inbox for linux-kernel@vger.kernel.org
* 2.5.31 qlogic error "this should not happen"
From: rwhron @ 2002-08-22 22:39 UTC
  To: linux-kernel

While running bonnie++ with 2.5.31 and 2.5.31-mm1,
a quad xeon with QLogic Corp. QLA2200 (rev 05)
stopped responding.  These were the last lines
in /var/log/messages before the box was rebooted.

kernel: qlogicfc0 : no handle slots, this should not happen.
kernel: hostdata->queued is 6, in_ptr: 7d
kernel: qlogicfc0 : no handle slots, this should not happen.
kernel: hostdata->queued is 6, in_ptr: 18
kernel: qlogicfc0 : no handle slots, this should not happen.
kernel: hostdata->queued is 6, in_ptr: 33
kernel: qlogicfc0 : no handle slots, this should not happen.
kernel: hostdata->queued is 6, in_ptr: 33
kernel: qlogicfc0 : no handle slots, this should not happen.
kernel: hostdata->queued is 6, in_ptr: 69
kernel: qlogicfc0 : no handle slots, this should not happen.
kernel: hostdata->queued is 6, in_ptr: 69
kernel: qlogicfc0 : no handle slots, this should not happen.
kernel: hostdata->queued is 6, in_ptr: 4
kernel: qlogicfc0 : no handle slots, this should not happen.
kernel: hostdata->queued is 6, in_ptr: 1f
kernel: qlogicfc0 : no handle slots, this should not happen.
kernel: hostdata->queued is 6, in_ptr: 3a

This is the qlogic config:
# CONFIG_SCSI_QLOGIC_FAS is not set
CONFIG_SCSI_QLOGIC_ISP=y
CONFIG_SCSI_QLOGIC_FC=y
# CONFIG_SCSI_QLOGIC_FC_FIRMWARE is not set
# CONFIG_SCSI_QLOGIC_1280 is not set

The same config works fine on 2.4.

Does anyone know if the newer qlogic driver is compatible with 2.5?
http://marc.theaimsgroup.com/?l=linux-kernel&m=102919565520122&w=2

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html



* Re: 2.5.31 qlogic error "this should not happen"
From: Lincoln Dale @ 2002-08-22 22:50 UTC
  To: rwhron; +Cc: linux-kernel


>Does anyone know if the newer qlogic driver is compatible with 2.5?
>http://marc.theaimsgroup.com/?l=linux-kernel&m=102919565520122&w=2

i forward-ported their (beta) driver to 2.5 and posted the diff to l-k at 
the time.


cheers,

lincoln.



* Re: 2.5.31 qlogic error "this should not happen"
From: Doug Ledford @ 2002-08-22 22:54 UTC
  To: rwhron; +Cc: linux-kernel


On Thu, Aug 22, 2002 at 06:39:16PM -0400, rwhron@earthlink.net wrote:
> While running bonnie++ with 2.5.31 and 2.5.31-mm1,
> a quad xeon with QLogic Corp. QLA2200 (rev 05)
> stopped responding.  These were the last lines
> in /var/log/messages before the box was rebooted.
> 
> kernel: qlogicfc0 : no handle slots, this should not happen.
> kernel: hostdata->queued is 6, in_ptr: 7d

Hmmm...sounds like no one bothered to correct the lock usage in this 
driver after the 2.5 kernel switched to per-device queue locks in place of 
the global io_request_lock that this driver depended on to be safe.  
Try applying the attached patch and see if it helps you out any.

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606
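
The locking concern described above can be illustrated with a small user-space sketch. It is not the qlogicfc code: the struct, the four-entry table, and every function name below are hypothetical, and the "race" is replayed deterministically in a single thread rather than produced by real concurrency. It only shows why a find-free-slot step followed by a claim step needs to run under one common lock: if two submitters interleave, the handle table and the queued counter stop agreeing.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS 4 /* hypothetical table size, kept tiny for illustration */

/* Hypothetical stand-in for per-host driver state. */
struct hostdata_sketch {
	const void *handles[SLOTS]; /* NULL means the slot is free */
	int queued;                 /* commands the driver believes are queued */
};

/* Step 1 of a submission: look for a free slot. Returns -1 if none. */
static int find_free_slot(const struct hostdata_sketch *hd)
{
	for (int i = 0; i < SLOTS; i++)
		if (hd->handles[i] == NULL)
			return i;
	return -1; /* the "no handle slots" situation */
}

/* Step 2 of a submission: store the command and count it. */
static int claim_slot(struct hostdata_sketch *hd, int slot, const void *cmd)
{
	if (slot < 0)
		return -1;
	assert(slot < SLOTS);
	hd->handles[slot] = cmd;
	hd->queued++;
	return 0;
}

/* Number of table entries that currently hold a command. */
static int slots_in_use(const struct hostdata_sketch *hd)
{
	int used = 0;

	for (int i = 0; i < SLOTS; i++)
		if (hd->handles[i] != NULL)
			used++;
	return used;
}

/* What a shared lock guarantees: find+claim completes before the next starts. */
static void submit_serialized(struct hostdata_sketch *hd,
			      const void *a, const void *b)
{
	claim_slot(hd, find_free_slot(hd), a);
	claim_slot(hd, find_free_slot(hd), b);
}

/* What can happen without a common lock: both find before either claims. */
static void submit_interleaved(struct hostdata_sketch *hd,
			       const void *a, const void *b)
{
	int slot_a = find_free_slot(hd);
	int slot_b = find_free_slot(hd); /* same index as slot_a */

	claim_slot(hd, slot_a, a);
	claim_slot(hd, slot_b, b);       /* overwrites the handle for a */
}
```

After submit_serialized() on an empty table, queued and slots_in_use() are both 2. After submit_interleaved(), queued is 2 but only one slot is occupied, so one command's handle has been lost. Whether this particular interleaving is what produced the messages in the report is not established in this thread.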
  

[-- Attachment #2: linux-2.4.17-iorl_before.patch --]
[-- Type: text/plain, Size: 1674 bytes --]

diff -u --new-file --recursive --exclude-from /usr/src/exclude linux.19pre8/drivers/scsi/qlogicfc.c linux.19pre8-ac5/drivers/scsi/qlogicfc.c
--- linux.19pre8/drivers/scsi/qlogicfc.c	Thu May  9 22:40:23 2002
+++ linux.19pre8-ac5/drivers/scsi/qlogicfc.c	Thu May  9 22:40:52 2002
@@ -1343,18 +1343,11 @@
 
 	num_free = QLOGICFC_REQ_QUEUE_LEN - REQ_QUEUE_DEPTH(in_ptr, out_ptr);
 	num_free = (num_free > 2) ? num_free - 2 : 0;
-	host->can_queue = hostdata->queued + num_free;
+	host->can_queue = host->host_busy + num_free;
 	if (host->can_queue > QLOGICFC_REQ_QUEUE_LEN)
 		host->can_queue = QLOGICFC_REQ_QUEUE_LEN;
 	host->sg_tablesize = QLOGICFC_MAX_SG(num_free);
 
-	/* this is really gross */
-	if (host->can_queue <= host->host_busy){
-	        if (host->can_queue+2 < host->host_busy) 
-			DEBUG(printk("qlogicfc%d.c crosses its fingers.\n", hostdata->host_id));
-		host->can_queue = host->host_busy + 1;
-	}
-
 	LEAVE("isp2x00_queuecommand");
 
 	return 0;
@@ -1623,17 +1616,11 @@
 
 	num_free = QLOGICFC_REQ_QUEUE_LEN - REQ_QUEUE_DEPTH(in_ptr, out_ptr);
 	num_free = (num_free > 2) ? num_free - 2 : 0;
-	host->can_queue = hostdata->queued + num_free;
+	host->can_queue = host->host_busy + num_free;
 	if (host->can_queue > QLOGICFC_REQ_QUEUE_LEN)
 		host->can_queue = QLOGICFC_REQ_QUEUE_LEN;
 	host->sg_tablesize = QLOGICFC_MAX_SG(num_free);
 
-	if (host->can_queue <= host->host_busy){
-	        if (host->can_queue+2 < host->host_busy) 
-		        DEBUG(printk("qlogicfc%d : crosses its fingers.\n", hostdata->host_id));
-		host->can_queue = host->host_busy + 1;
-	}
-
 	outw(HCCR_CLEAR_RISC_INTR, host->io_port + HOST_HCCR);
 	LEAVE_INTR("isp2x00_intr_handler");
 }
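
The arithmetic touched by the patch above can be worked through in a stand-alone sketch. The three num_free/can_queue statements and the removed "crosses its fingers" bump are copied from the diff; the REQ_QUEUE_DEPTH definition is an assumption (a masked pointer difference), since the macro body is not shown in this thread, and the function names and sample values are made up for illustration.

```c
#include <assert.h>

#define QLOGICFC_REQ_QUEUE_LEN 127 /* value quoted in this thread */

/* Assumed definition: ring occupancy as a masked pointer difference. */
#define REQ_QUEUE_DEPTH(in, out) \
	((int)(((in) - (out)) & QLOGICFC_REQ_QUEUE_LEN))

/* Free request-queue entries, keeping two in reserve as the driver does. */
static int num_free_entries(unsigned in_ptr, unsigned out_ptr)
{
	int num_free;

	assert(in_ptr <= QLOGICFC_REQ_QUEUE_LEN);
	assert(out_ptr <= QLOGICFC_REQ_QUEUE_LEN);
	num_free = QLOGICFC_REQ_QUEUE_LEN - REQ_QUEUE_DEPTH(in_ptr, out_ptr);
	return (num_free > 2) ? num_free - 2 : 0;
}

/* can_queue as computed before the patch, including the removed bump. */
static int can_queue_original(int queued, int host_busy,
			      unsigned in_ptr, unsigned out_ptr)
{
	int can_queue = queued + num_free_entries(in_ptr, out_ptr);

	if (can_queue > QLOGICFC_REQ_QUEUE_LEN)
		can_queue = QLOGICFC_REQ_QUEUE_LEN;
	if (can_queue <= host_busy)
		can_queue = host_busy + 1; /* always admits one more command */
	return can_queue;
}

/* can_queue as computed after the patch. */
static int can_queue_patched(int host_busy,
			     unsigned in_ptr, unsigned out_ptr)
{
	int can_queue = host_busy + num_free_entries(in_ptr, out_ptr);

	if (can_queue > QLOGICFC_REQ_QUEUE_LEN)
		can_queue = QLOGICFC_REQ_QUEUE_LEN;
	return can_queue;
}
```

With a nearly full ring (in_ptr 0x10, out_ptr 0x12, so no usable free entries) and 20 commands outstanding, the original formula yields 21 and so still invites one more command, while the patched formula yields 20. Assuming the mid-layer stops issuing commands once host_busy reaches can_queue, the patched value tells it the adapter is full.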


* Re: 2.5.31 qlogic error "this should not happen"
From: Eric Weigle @ 2002-08-22 23:08 UTC
  To: rwhron; +Cc: Linux kernel mailing list (lkml)


FWIW-

I occasionally saw that error on our 2.4 RAID system; it went away when I
increased the size of the handles array in qlogicfc.h:

-#define QLOGICFC_REQ_QUEUE_LEN  127 /* must be power of two - 1 */
+#define QLOGICFC_REQ_QUEUE_LEN  255 /* must be power of two - 1 */


I know this probably isn't the ``right'' solution, but it worked for me...
your mileage may vary.

-Eric

-- 
------------------------------------------------
 Eric H. Weigle -- http://public.lanl.gov/ehw/ 
------------------------------------------------
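
The "must be power of two - 1" comment on the define above can be checked with a short sketch. It assumes the constant is used as a bit mask when computing ring occupancy, which this thread does not show; the helper names are hypothetical. Under that assumption, 127 and 255 both work, while an arbitrary value such as 200 gives wrong depths.

```c
#include <assert.h>

/* True when len + 1 is a power of two, i.e. len is all ones in binary. */
static int is_valid_queue_len(unsigned len)
{
	return len != 0 && ((len + 1) & len) == 0;
}

/* Mask-based ring occupancy, as a driver might compute it. */
static unsigned ring_depth_masked(unsigned in, unsigned out, unsigned len)
{
	return (in - out) & len;
}

/* Reference occupancy using explicit modulo over len + 1 entries. */
static unsigned ring_depth_modulo(unsigned in, unsigned out, unsigned len)
{
	assert(in <= len && out <= len);
	return (in + (len + 1) - out) % (len + 1);
}
```

For len = 255, in = 4 and out = 250, both functions return 10. For len = 200, in = 4 and out = 190, the modulo version returns 15 but the masked version returns 64, which is why only values of the form 2^n - 1 are safe for a mask.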


* Re: 2.5.31 qlogic error "this should not happen"
From: rwhron @ 2002-08-23  0:52 UTC
  To: ltd; +Cc: linux-kernel

> i forward-ported their (beta) driver to 2.5 and posted the diff to l-k at
> the time.

The qlogic (beta) driver gives 10-20% higher throughput and lower latency
in the tests I run.  I'd love to see the qlogic patch + your
http://marc.theaimsgroup.com/?l=linux-kernel&m=102643109230093&w=2
patch in 2.5.

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html



* Re: 2.5.31 qlogic error "this should not happen"
From: rwhron @ 2002-08-23  1:08 UTC
  To: dledford; +Cc: linux-kernel

> Try applying the attached patch and see if it helps you out any.

Thanks Doug.  I'm running your patch now.  It takes about 8-12 hours
to get past bonnie++.  So far, so good.

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html



* 2.5.31 qlogic error "this should not happen"
From: rwhron @ 2002-08-23 11:26 UTC
  To: dledford; +Cc: linux-kernel

> Try applying the attached patch and see if it helps you out any.

The box locked up while running bonnie++ with the patch.
One difference: without the patch, it printed
"this should not happen" 9 times.  With the patch, it printed it
54 times (6 times as many), if that's any kind of clue.

I'm trying Eric's suggestion to change QLOGICFC_REQ_QUEUE_LEN
from 127 to 255 on top of your patch.

-- 
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html



* Re: 2.5.31 qlogic error "this should not happen"
From: rwhron @ 2002-08-25 19:18 UTC
  To: ehw, dledford; +Cc: linux-kernel

> I occasionally saw that error on our 2.4 RAID system; it went away when I
> increased the size of the handles array in qlogicfc.h:

-#define QLOGICFC_REQ_QUEUE_LEN  127 /* must be power of two - 1 */
+#define QLOGICFC_REQ_QUEUE_LEN  255 /* must be power of two - 1 */

That change, in addition to Doug's patch in 
http://marc.theaimsgroup.com/?l=linux-kernel&m=103005703808312&w=2
has done the trick. 2.5.31-mm1-dl-ew completed the 54 hour
benchmarathon (the first 2.5 kernel to finish). 

Details at:
http://home.earthlink.net/~rwhron/kernel/bigbox.html
-- 
Randy Hron



* Re: 2.5.31 qlogic error "this should not happen"
From: Patrick Mansfield @ 2002-08-26 19:36 UTC
  To: rwhron; +Cc: ehw, dledford, linux-kernel, linux-scsi

On Sun, Aug 25, 2002 at 03:18:54PM -0400, rwhron@earthlink.net wrote:
> > I occasionally saw that error on our 2.4 RAID system; it went away when I
> > increased the size of the handles array in qlogicfc.h:
> 
> -#define QLOGICFC_REQ_QUEUE_LEN  127 /* must be power of two - 1 */
> +#define QLOGICFC_REQ_QUEUE_LEN  255 /* must be power of two - 1 */
> 
> That change, in addition to Doug's patch in 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=103005703808312&w=2
> has done the trick. 2.5.31-mm1-dl-ew completed the 54 hour
> benchmarathon (the first 2.5 kernel to finish). 
> 
> Details at:
> http://home.earthlink.net/~rwhron/kernel/bigbox.html
> -- 
> Randy Hron

That's a huge amount of data. Can you show the output of cat /proc/scsi/scsi
on the machine (as well as lspci, etc.)?

Does RAID-5 mean you have a storage array attached? e.g. IBM 3542, fastt 200?

It looks like the latest qlogic qla driver includes 2.5.x support.
I haven't tried it yet, but will at some point:

http://download.qlogic.com/drivers/5639/qla2x00-v6.1b5-dist.tgz

-- Patrick Mansfield
