* QLA12160 ring buffer starvation? on 2.4.x
From: j-nomura @ 2002-11-01 11:50 UTC
To: linux-scsi, jes; +Cc: j-nomura
Hello,
I'm using qla1280 driver 3.23 Beta with the 2.4.19 kernel and am having
trouble with it.
When I access multiple disks connected to a single SCSI card,
some of the disks suffer QLA12160 ring buffer starvation.
This eventually causes a SCSI timeout in the mid-layer (and panics).
Since the timed-out command had stayed in the qla1280 per-LUN queue for more
than 30 seconds while commands in other queues had not, I suspect that the
current qla1280 driver has a structural problem that makes command issuing
unfair, as described below.
Do you have any idea?
I can work around it by limiting ha->bus_settings[].hiwat to 16.
The command issuing process of qla1280 is:
1. scsi_dispatch_cmd puts commands into a per-LUN queue via
   qla1280_queuecommand().
2. Queued commands are issued to the QLA12160 ring buffer by qla1280_next()
   until the ring becomes full or the number of outstanding commands for the
   queue reaches its maximum.
   The maximum number of outstanding commands per queue is determined
   by ha->bus_settings[].hiwat.
3. When the ring buffer does not have enough room, commands are put back
   at the tail of the queue by qla1280_putq_t().
qla1280_next is executed either when the first command is queued by
qla1280_queuecommand or when an issued command completes in qla1280_done,
and it works only on the specified queue.
Because of this, if ha->bus_settings[].hiwat and the ring size are of similar
value, a queue that has already issued many commands has a better chance of
issuing its next commands than the others do, doesn't it?
Best regards.
--
NOMURA, Jun'ichi <j-nomura@ce.jp.nec.com, nomura@hpc.bs1.fc.nec.co.jp>
HPC Operating System Group, 1st Computers Software Division,
Computers Software Operations Unit, NEC Solutions.
* Re: QLA12160 ring buffer starvation? on 2.4.x
From: Patrick Mansfield @ 2002-11-01 17:04 UTC
To: j-nomura; +Cc: linux-scsi, jes
On Fri, Nov 01, 2002 at 08:50:09PM +0900, j-nomura@ce.jp.nec.com wrote:
> Hello,
>
> I'm using qla1280 driver 3.23 Beta with the 2.4.19 kernel and am having
> trouble with it.
>
> Since the timed-out command had stayed in the qla1280 per-LUN queue for more
> than 30 seconds while commands in other queues had not, I suspect that the
> current qla1280 driver has a structural problem that makes command issuing
> unfair, as described below.
>
> Do you have any idea?
>
> I can work around it by limiting ha->bus_settings[].hiwat to 16.
> qla1280_next is executed either when the first command is queued by
> qla1280_queuecommand or when an issued command completes in qla1280_done,
> and it works only on the specified queue.
>
> Because of this, if ha->bus_settings[].hiwat and the ring size are of similar
> value, a queue that has already issued many commands has a better chance of
> issuing its next commands than the others do, doesn't it?
>
> Best regards.
> --
> NOMURA, Jun'ichi <j-nomura@ce.jp.nec.com, nomura@hpc.bs1.fc.nec.co.jp>
Hi -
The mid-layer has a similar algorithm, such that it should allow IO to
all devices, but not in a fair way.
But there is a flaw: it requeues IO for a finished command before
checking for starved IO. This could explain your timeout.
I have not hit the problem or tried this fix out (other than compiling it).
Does this help any?
--- linux-2.4.19/drivers/scsi/scsi_lib.c-orig	Fri Aug  2 17:39:44 2002
+++ linux-2.4.19/drivers/scsi/scsi_lib.c	Fri Nov  1 07:59:44 2002
@@ -265,11 +265,6 @@
 		list_add(&SCpnt->request.queue, &q->queue_head);
 	}
 
-	/*
-	 * Just hit the requeue function for the queue.
-	 */
-	q->request_fn(q);
-
 	SDpnt = (Scsi_Device *) q->queuedata;
 	SHpnt = SDpnt->host;
 
@@ -328,6 +323,12 @@
 			SHpnt->some_device_starved = 0;
 		}
 	}
+
+	/*
+	 * Just hit the requeue function for the queue.
+	 */
+	q->request_fn(q);
+
 	spin_unlock_irqrestore(&io_request_lock, flags);
 }
-- Patrick Mansfield
* Re: QLA12160 ring buffer starvation? on 2.4.x
From: Patrick Mansfield @ 2002-11-01 19:42 UTC
To: j-nomura; +Cc: linux-scsi, jes
On Fri, Nov 01, 2002 at 09:04:19AM -0800, Patrick Mansfield wrote:
> On Fri, Nov 01, 2002 at 08:50:09PM +0900, j-nomura@ce.jp.nec.com wrote:
> > qla1280_next is executed either when the first command is queued by
> > qla1280_queuecommand or when an issued command completes in qla1280_done,
> > and it works only on the specified queue.
I didn't notice the per-queue aspect of the above; that implies
it can completely starve one LU.
You could try setting can_queue to a reasonable limit (rather than
0xfffff), like the size of the ring buffer, and perhaps also lower
the queue depth (similar to lowering hiwat).
If you are running fine with hiwat lowered, that sounds like a decent
solution (and similar to what other adapters do - don't allow
queue depth * number of devices to exceed the capabilities of the
adapter).
-- Patrick Mansfield
* Re: QLA12160 ring buffer starvation? on 2.4.x
From: j-nomura @ 2002-11-05 11:04 UTC
To: patmans; +Cc: linux-scsi, jes
Hello Patrick,
Thank you for your mails, and excuse the late reply.
As you wrote in your latter mail, moving the request function forward does
not solve this problem.
The mid-layer queue_depth is set to the hiwat value, so lowering hiwat
sets the queue depth to a decent value in both the mid- and low-level layers.
Setting can_queue to the size of the ring buffer seems to make things better.
Also, qla1280_next puts a command back at the tail of the queue when
it cannot find room for it in the ring buffer. This makes the problem worse.
It would be fairer to make qla1280_next put the request back at the head
of the device queue rather than at the tail.
From: Patrick Mansfield <patmans@us.ibm.com>
Subject: Re: QLA12160 ring buffer starvation? on 2.4.x
Date: Fri, 1 Nov 2002 11:42:48 -0800
> On Fri, Nov 01, 2002 at 09:04:19AM -0800, Patrick Mansfield wrote:
> > On Fri, Nov 01, 2002 at 08:50:09PM +0900, j-nomura@ce.jp.nec.com wrote:
> > > qla1280_next is executed either when the first command is queued by
> > > qla1280_queuecommand or when an issued command completes in qla1280_done,
> > > and it works only on the specified queue.
>
> I didn't notice the per-queue aspect of the above, that implies
> it can completely starve one LU.
>
> You could try setting can_queue to a reasonable limit (rather than
> 0xfffff), like the size of the ring buffer, and perhaps also lower
> the queue depth (similar to lowering hiwat).
>
> If you are running fine with hiwat lowered, that sounds like a decent
> solution (and similar to what other adapters do - don't allow
> queue depth * number of devices to exceed the capabilities of the
> adapter).
Best regards.
--
NOMURA, Jun'ichi <j-nomura@ce.jp.nec.com, nomura@hpc.bs1.fc.nec.co.jp>
HPC Operating System Group, 1st Computers Software Division,
Computers Software Operations Unit, NEC Solutions.