Linux PARISC architecture development
* [parisc-linux] Re: tag starvation
       [not found] <200201240200.VAA00858@monmouth.com>
@ 2002-01-25  6:55 ` Grant Grundler
  2002-01-25  9:57   ` Richard Hirst
  0 siblings, 1 reply; 8+ messages in thread
From: Grant Grundler @ 2002-01-25  6:55 UTC (permalink / raw)
  To: Vlad Markov; +Cc: debian-hppa, parisc-linux

Follow up to parisc-linux@lists.parisc-linux.org only please.
(ie this isn't a debian linux issue)

Vlad Markov wrote:
> I get the following message when I boot up on a 735/99:
> Jan 22 19:12:41 grumpy kernel: scsi0 (1:0) Target is suffering from tag
>	starvation.

Seems only 53c700.c prints this msg. Is this the right driver?

Normally you shouldn't see this *if* the device supports tagged commands.
I can never remember the SCSI driver options (the parisc-linux FAQ has 
the URL to them) but one of them will either disable or limit 
"queue depth" for Queue Tags and that should take care of it.

BTW, normally you want to post the /proc/scsi/scsi info for
the devices in question.

> The kernel I am using is:
> Jan 22 19:12:41 grumpy kernel: Linux version 2.4.16-32 (root@paer) (gcc version
> 3.0.3) #1 Sat Dec 29 01:28:13 MST 2001
> 
> I suspect this is not a good message - should I worry? 

Panic would be worse ;^)

grant

* [parisc-linux] Re: tag starvation
  2002-01-25  6:55 ` [parisc-linux] Re: tag starvation Grant Grundler
@ 2002-01-25  9:57   ` Richard Hirst
  2002-01-26  5:28     ` Grant Grundler
  0 siblings, 1 reply; 8+ messages in thread
From: Richard Hirst @ 2002-01-25  9:57 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Vlad Markov, debian-hppa, parisc-linux

On Thu, Jan 24, 2002 at 11:55:01PM -0700, Grant Grundler wrote:
> 
> Follow up to parisc-linux@lists.parisc-linux.org only please.
> (ie this isn't a debian linux issue)
> 
> Vlad Markov wrote:
> > I get the following message when I boot up on a 735/99:
> > Jan 22 19:12:41 grumpy kernel: scsi0 (1:0) Target is suffering from tag
> >	starvation.
> 
> Seems only 53c700.c prints this msg. Is this the right driver?

Should be; 735 has unsupported 53c720 FWD, and 53c700 or 710 driven by
53c700.c.  I assume Vlad has external narrow SE disks attached.

> Normally you shouldn't see this *if* the device supports tagged commands.

I used to get this a lot, until James changed the driver to only report
the first occurrence of it.  Don't remember the exact details, but iirc
it is for info only, and non-fatal.  From the code it looks like it
means some cmnd has been sitting in the drive unprocessed for too long,
and the code rejects new cmds until those older ones have been processed
or timed out.

> I can never remember the SCSI driver options (the parisc-linux FAQ has 
> the URL to them) but one of them will either disable or limit 
> "queue depth" for Queue Tags and that should take care of it.

Hmm, I thought that would be a feature of a specific driver, not the
upper layers.  53c700.c doesn't (yet) have any boot options to disable
tags.

Richard

* [parisc-linux] Re: tag starvation
  2002-01-25  9:57   ` Richard Hirst
@ 2002-01-26  5:28     ` Grant Grundler
  2002-01-26 17:22       ` Richard Hirst
  0 siblings, 1 reply; 8+ messages in thread
From: Grant Grundler @ 2002-01-26  5:28 UTC (permalink / raw)
  To: Richard Hirst; +Cc: Vlad Markov, parisc-linux

Richard Hirst wrote:
> I used to get this a lot, until James changed the driver to only report
> the first occurrence of it.  Don't remember the exact details, but iirc
> it is for info only, and non-fatal.  From the code it looks like it
> means some cmnd has been sitting in the drive unprocessed for too long,
> and the code rejects new cmds until those older ones have been processed
> or timed out.

Ah ok...is "starvation" properly described by the following scenario?

While the drive is completing IO requests as fast as it can,
*some* IO's don't complete because they never become the most
optimal one to complete. This is an inherent "unfairness"
in the normal (simple) SCSI queue tag. (Two other types of tags exist but
are never used by Unix OS's AFAICT: ordered and head-of-queue.)

HP branded (and tested) drives are required to return a completion
for any outstanding IO within 3 seconds. ie if 8 tags are active at
any given time, the drive can complete the IO's in any order until
an IO reaches this 3 second limit. The reason for the limit is
to prevent outstanding file system meta data from locking up access
to portions of the file system for 30+ seconds.

> Hmm, I thought that would be a feature of a specific driver, not the
> upper layers.  53c700.c doesn't (yet) have any boot options to disable
> tags.

hmmm...could we just replace the following constant with
a MODULE_PARAM() variable?

53c700.h:#define NCR_700_MAX_TAGS               16

In the interim, I suggest reducing this until the problem
goes away for that disk.
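
A rough sketch of what a runtime-tunable limit could look like, written as plain userspace C rather than kernel code. The names `max_tags`, `set_max_tags` and `can_queue_tagged` are made up for illustration and do not come from 53c700.c; only `NCR_700_MAX_TAGS` and its value of 16 are taken from the header line quoted above. Treating 0 as "tagged queueing off" is also an assumption.

```c
/* Compile-time ceiling, as quoted from 53c700.h. */
#define NCR_700_MAX_TAGS 16

/* Hypothetical runtime limit; in a real driver this would be set from a
 * module or boot parameter instead of a plain function call. */
static int max_tags = NCR_700_MAX_TAGS;

/* Clamp a requested queue depth into [0, NCR_700_MAX_TAGS] and store it.
 * Returns the value actually in effect. 0 is assumed to mean "no tags". */
int set_max_tags(int requested)
{
    if (requested < 0)
        requested = 0;
    if (requested > NCR_700_MAX_TAGS)
        requested = NCR_700_MAX_TAGS;
    max_tags = requested;
    return max_tags;
}

/* May another tagged command be sent, given how many are in flight? */
int can_queue_tagged(int in_flight)
{
    return in_flight < max_tags;
}
```

Stepping the value down (16, 8, 4, ...) until the message stops would correspond to the interim suggestion of reducing the constant by hand.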

IIRC, HPUX allows the user to set this on a per disk basis
using scsictl command.

thanks,
grant

* [parisc-linux] Re: tag starvation
  2002-01-26  5:28     ` Grant Grundler
@ 2002-01-26 17:22       ` Richard Hirst
  0 siblings, 0 replies; 8+ messages in thread
From: Richard Hirst @ 2002-01-26 17:22 UTC (permalink / raw)
  To: Grant Grundler; +Cc: Vlad Markov, parisc-linux

On Fri, Jan 25, 2002 at 10:28:33PM -0700, Grant Grundler wrote:
> Ah ok...is "starvation" properly described by the following scenario?
> 
> While the drive is completing IO requests as fast as it can,
> *some* IO's don't complete because they never become the most
> optimal one to complete. This is an inherent "unfairness"
> in the normal SCSI  Queue tag. (Two other types of tags exist but
> are never used by Unix OS's AFAICT: ordered, Head)

Yes, that sums it up nicely.

> hmmm...could we just replace the following constant with
> a MODULE_PARAM() variable?
> 
> 53c700.h:#define NCR_700_MAX_TAGS               16

That sounds like a good idea; I think James was planning to add some
options like that to the driver, so we should check with him before
doing anything.

Richard

* [parisc-linux] Re: tag starvation
@ 2002-01-26 17:23 James Bottomley
  2002-01-27  8:41 ` Grant Grundler
  0 siblings, 1 reply; 8+ messages in thread
From: James Bottomley @ 2002-01-26 17:23 UTC (permalink / raw)
  To: Richard Hirst; +Cc: Grant Grundler, parisc-linux

> > Normally you shouldn't see this *if* the device supports tagged
> > commands.

> I used to get this a lot, until James changed the driver to only
> report the first occurrence of it.  Don't remember the exact details,
> but iirc it is for info only, and non-fatal.  From the code it looks
> like it means some cmnd has been sitting in the drive unprocessed for
> too long, and the code rejects new cmds until those older ones have
> been processed or timed out.

That's essentially it.  A driver is allowed to execute simple tagged commands 
in any order it chooses (since it knows its own internal platter topology, it 
is supposed to order the execution to be the fastest and most efficient 
possible).  However, the queuing algorithm on some drives can be inherently 
unfair; usually if you have a steady stream of I/Os to one part of the platter 
and a single I/O waiting for a different one.  An unfair algorithm may simply 
ignore a pending tagged command for quite a period of time (this is what is 
known as tag starvation).  If the command remains unprocessed for >2s, the 
mid-layer will begin error recovery, which can cause all sorts of problems.
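
To make the unfairness concrete, here is a toy model, not a description of any real drive firmware. It assumes the device always services whichever queued command is nearest the current head position, and that a fresh command near `near_base` arrives every time one completes. The function name and parameters are hypothetical.

```c
#include <stdlib.h>

/* Count how many nearby commands are serviced before the single far-away
 * command gets its turn, giving up after max_steps.
 * Assumption: the drive picks whichever command is closest to the head. */
int services_before_far(int head, int far_cyl, int near_base, int max_steps)
{
    int step;

    for (step = 0; step < max_steps; step++) {
        /* Steady stream of commands clustered around near_base. */
        int near_cyl = near_base + (step % 8);

        /* The far command only wins if it is at least as close. */
        if (abs(far_cyl - head) <= abs(near_cyl - head))
            return step;

        /* Otherwise the nearby command is serviced and the head moves. */
        head = near_cyl;
    }
    return max_steps;   /* the far command was starved for the whole run */
}
```

With the head at cylinder 100, nearby traffic around cylinder 100 and one command waiting at cylinder 9000, the far command is never chosen however long the loop runs, which is the starvation pattern described above.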

Almost every good driver that implements tagged commands has some sort of 
algorithm to detect this situation and correct it before the mid layer comes 
in with the big hammer.  The message is a harmless warning that this type of 
correction has been activated in the driver.  For those who're interested in 
the details, I attach the explanation of what it actually does at the bottom.

> > I can never remember the SCSI driver options (the parisc-linux FAQ has
> > the URL to them) but one of them will either disable or limit 
> > "queue depth" for Queue Tags and that should take care of it.

> Hmm, I thought that would be a feature of a specific driver, not the
> upper layers.  53c700.c doesn't (yet) have any boot options to disable
> tags.

OK, my fault, I keep meaning to add it.  One thing that irritates me about 
this option is that it should be a global one (belonging to the whole SCSI 
subsystem) not local to each driver.  However, that's just a pet peeve of mine 
(in fact the SCSI subsystem should do an awful lot more of this type of option 
tracking and helping).  It's not too difficult to implement, so I'll get on 
with doing it.

James

How to Counter Tag Starvation
==============================

Most of the maintained drivers in Linux do this by keeping a timer on the 
outstanding tagged commands.  When they see the timer expire they switch from 
simple tags to ordered tags (an ordered tag is like a marker in the 
queue---you can't execute any command after an ordered tag until all those 
before it have completed).
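
A minimal sketch of the timer-based approach, under stated assumptions: the three message codes are the SCSI-2 queue tag message values, but `pick_tag_message`, its parameters and the millisecond clock are invented for illustration and are not taken from any particular Linux driver.

```c
#include <stdint.h>

/* SCSI-2 queue tag message codes. */
enum {
    SIMPLE_QUEUE_TAG  = 0x20,
    HEAD_OF_QUEUE_TAG = 0x21,
    ORDERED_QUEUE_TAG = 0x22
};

/* Choose the tag message for the next command (hypothetical helper).
 * If the oldest outstanding command has waited limit_ms or longer, send
 * an ordered tag so the device must finish everything queued before it. */
int pick_tag_message(uint32_t now_ms, uint32_t oldest_issue_ms,
                     uint32_t limit_ms, int have_outstanding)
{
    /* Unsigned subtraction stays correct across clock wraparound. */
    uint32_t age_ms = now_ms - oldest_issue_ms;

    if (have_outstanding && age_ms >= limit_ms)
        return ORDERED_QUEUE_TAG;
    return SIMPLE_QUEUE_TAG;
}
```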

The 53c700 has a much simpler approach:  A tag is simply a number between 0 
and 255 identifying the command.  Obviously, there cannot be two tags with the 
same number to the same device outstanding at any one time.  For each device 
the 53c700 keeps track of the tag number of the oldest outstanding command and 
the next tag to allocate (the latter is incremented by one [modulo 256] every 
time a command goes out).  You can think of this as hands on a clock with 256 
graduations.  All outstanding tags are between the two hands.  The driver 
detects tag starvation when the hands try to cross (i.e. the next tag to be 
allocated would be the same tag number as the oldest outstanding command).  At 
that point, it prints the message and refuses to accept any further I/Os from 
the mid layer.  Eventually, the offending outstanding command will clear 
(possibly after all the rest of the commands are emptied) and the driver 
begins accepting I/Os again.
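
The clock-hands scheme can be sketched roughly as follows. This is a standalone illustration of the description above, not the actual 53c700.c code: the struct, field and function names are hypothetical, and the `busy` array is just one convenient way to find the new oldest tag after a completion.

```c
#include <stdbool.h>
#include <stdint.h>

#define TAG_SPACE 256   /* tag numbers run 0..255 */

/* Hypothetical per-device state. Zero-initialise before use. */
struct tag_clock {
    bool     busy[TAG_SPACE]; /* tags currently outstanding */
    uint8_t  oldest;          /* trailing hand: oldest outstanding tag */
    uint8_t  next;            /* leading hand: next tag to allocate */
    unsigned in_flight;       /* number of outstanding tags */
    bool     blocked;         /* true while refusing new I/O */
};

/* Allocate the next tag, or return -1 if the hands would cross. */
int tag_alloc(struct tag_clock *tc)
{
    int tag;

    if (tc->in_flight > 0 && tc->next == tc->oldest) {
        tc->blocked = true;   /* tag starvation detected */
        return -1;
    }
    tag = tc->next;
    tc->busy[tag] = true;
    tc->in_flight++;
    tc->next = (uint8_t)(tc->next + 1);   /* wraps modulo 256 */
    return tag;
}

/* Mark a tag complete and move the trailing hand forward. */
void tag_complete(struct tag_clock *tc, uint8_t tag)
{
    if (!tc->busy[tag])
        return;
    tc->busy[tag] = false;
    tc->in_flight--;

    if (tc->in_flight == 0) {
        tc->oldest = tc->next;            /* nothing outstanding */
    } else {
        while (!tc->busy[tc->oldest])     /* skip completed tags */
            tc->oldest = (uint8_t)(tc->oldest + 1);
    }
    if (tc->in_flight == 0 || tc->next != tc->oldest)
        tc->blocked = false;              /* accept I/O again */
}
```

If tag 0 stays outstanding while tags 1 through 255 are issued and completed, the next allocation would reuse tag 0, so `tag_alloc` returns -1 until tag 0 completes.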

The reason for this approach in the 53c700 is that it is driving much older 
(and buggier) devices.  If the device messed up on the ordered queue tag we 
could get into a whole heap of trouble.

Obviously, since the SCSI mid-layer also keeps a timer on outstanding 
commands, it is a complete waste to duplicate this inside the driver.  
Unfortunately, the first the driver hears from the mid-layer about a problem 
command is when the mid-layer wants it aborted, by which time it is a bit late.

* Re: [parisc-linux] Re: tag starvation
  2002-01-26 17:23 James Bottomley
@ 2002-01-27  8:41 ` Grant Grundler
  2002-01-27 16:52   ` James Bottomley
  0 siblings, 1 reply; 8+ messages in thread
From: Grant Grundler @ 2002-01-27  8:41 UTC (permalink / raw)
  To: James Bottomley; +Cc: Richard Hirst, parisc-linux

James Bottomley wrote:
> That's essentially it.  A driver is allowed to execute simple tagged commands
> in any order it chooses (since it knows its own internal platter topology, it

James,
thanks for the excellent explanation.
But did you mean device or drive?  (instead of "driver")

> ignore a pending tagged command for quite a period of time (this is what is 
> known as tag starvation).  If the command remains unprocessed for >2s, the 
> mid-layer will begin error recovery, which can cause all sorts of problems.

ah...that explains it. Most HP drives are expected to complete any
outstanding IO within 3s.

> One thing that irritates me about 
> this option is that it should be a global one (belonging to the whole SCSI 
> subsystem) not local to each driver.

It should also be *per drive*. Different drives implement
different numbers of queue tags (eg disk array vs simple mech).

> How to Counter Tag Starvation
> ==============================
> 
> Most of the maintained drivers in Linux do this by keeping a timer on the 
> outstanding tagged commands.  When they see the timer expire they switch from
> simple tags to ordered tags (an ordered tag is like a marker in the 
> queue---you can't execute any command after an ordered tag until all those 
> before it have completed).

AFAIK, HP does not test disk drives to verify ordered tags work
correctly. One reason is we didn't want to expose new bugs by mixing
ordered with simple tags. The other reason is we saw a 25% performance
hit. The 5400 rpm 2GB drives at the time could complete ~80 IO/s with
simple tags. This dropped to 60-65 IO/s for ordered tags. Ordered tags
were considered an unacceptable solution at that point.

> The driver detects tag starvation when the hands try to cross (i.e. the
> next tag to be
> allocated would be the same tag number as the oldest outstanding command).
> At that point, it prints the message and refuses to accept any further
> I/Os from the mid layer.

Well done - I like this solution too.

> The reason for this approach in the 53c700 is that it is driving much older 
> (and buggier) devices.  If the device messed up on the ordered queue tag we 
> could get into a whole heap of trouble.

Exactly. Best case is the drive gets confused and locks up.
Worst case is it loses the data.

> Obviously, since the SCSI mid-layer also keeps a timer on outstanding 
> commands, it is a complete waste to duplicate this inside the driver.  
> Unfortunately, the first the driver hears from the mid-layer about a problem 
> command is when the mid-layer wants it aborted, by which time it is a bit late.

This is the fun part about driver interactions in the error recovery path.
Could one avoid this mess if the SCSI interface driver could guarantee
the IO will complete (with or w/o error) within the time frame
specified by the device (eg tape or disk) driver?

grant

* Re: [parisc-linux] Re: tag starvation
  2002-01-27  8:41 ` Grant Grundler
@ 2002-01-27 16:52   ` James Bottomley
  2002-01-27 19:05     ` Grant Grundler
  0 siblings, 1 reply; 8+ messages in thread
From: James Bottomley @ 2002-01-27 16:52 UTC (permalink / raw)
  To: Grant Grundler; +Cc: James Bottomley, Richard Hirst, parisc-linux

> > That's essentially it.  A driver is allowed to execute simple tagged
> > commands
> > in any order it chooses (since it knows its own internal platter
> > topology, it

> James, thanks for the excellent explanation. But did you mean device
> or drive?  (instead of "driver") 

drive (but actually device is more accurate).

> > One thing that irritates me about 
> > this option is that it should be a global one (belonging to the whole
> > SCSI 
> > subsystem) not local to each driver.

> It should also be *per drive*. Different drives implement different
> numbers of queue tags (eg disk array vs simple mech). 

That's what I mean:  Inside scsi_scan.c there's a table which identifies 
various devices (by INQUIRY string) and takes certain actions (scan all luns, 
turn on/off tag queueing etc.).  There should be module parameters that allow 
adding to or modifying this.  In addition, there should be a module parameter 
that allows setting these actions by (scsi,channel,target,lun) quad, then we 
wouldn't need each of the low level drivers to have its own module commands 
for doing this.
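
One way to picture the proposal is a two-stage lookup: a per-address override consulted first, then a table keyed by INQUIRY strings. This is only a sketch of the idea, not the real scsi_scan.c table; every name, flag value and table entry below is made up, and "host" stands in for the first element of the quad.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical action flags. */
#define DEV_NO_TAGS   0x01u   /* turn tag queueing off */
#define DEV_SCAN_LUNS 0x02u   /* scan all LUNs */

/* Entry matched against the INQUIRY vendor and model strings. */
struct inquiry_quirk {
    const char *vendor;
    const char *model;    /* matched as a prefix of the reported model */
    unsigned    flags;
};

/* Entry matched against a (host, channel, target, lun) quad. */
struct quad_override {
    int host, channel, target, lun;
    unsigned flags;
};

/* Made-up built-in entries, for illustration only. */
static const struct inquiry_quirk quirk_table[] = {
    { "EXAMPLE", "OLD-DISK", DEV_NO_TAGS },
    { "EXAMPLE", "ARRAY",    DEV_SCAN_LUNS },
};

/* User-supplied quad overrides win; otherwise fall back to the table. */
unsigned device_flags(const struct quad_override *ov, size_t n_ov,
                      int host, int channel, int target, int lun,
                      const char *vendor, const char *model)
{
    size_t i;

    for (i = 0; i < n_ov; i++) {
        if (ov[i].host == host && ov[i].channel == channel &&
            ov[i].target == target && ov[i].lun == lun)
            return ov[i].flags;
    }
    for (i = 0; i < sizeof(quirk_table) / sizeof(quirk_table[0]); i++) {
        const struct inquiry_quirk *q = &quirk_table[i];

        if (strcmp(vendor, q->vendor) == 0 &&
            strncmp(model, q->model, strlen(q->model)) == 0)
            return q->flags;
    }
    return 0;   /* no special handling */
}
```

Keeping both lookups in one shared place is what would spare each low-level driver from growing its own options.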

> > Obviously, since the SCSI mid-layer also keeps a timer on outstanding 
> > commands, it is a complete waste to duplicate this inside the driver.
> 
> > Unfortunately, the first the driver hears from the mid-layer about a
> > problem 
> > command is when the mid-layer wants it aborted, by which time it is a
> > bit late.

> This is the fun part about driver interactions in the error recovery path.
> Could one avoid this mess if the SCSI interface driver could guarantee
> the IO will complete (with or w/o error) within the time frame
> specified by the device (eg tape or disk) driver? 

Actually, I looked and the timeout is now 10s, not 2s.  The upper device 
drivers (st, sd etc.) do get to override this.  The problems tend to come when 
the error recovery takes over.

James

* Re: [parisc-linux] Re: tag starvation
  2002-01-27 16:52   ` James Bottomley
@ 2002-01-27 19:05     ` Grant Grundler
  0 siblings, 0 replies; 8+ messages in thread
From: Grant Grundler @ 2002-01-27 19:05 UTC (permalink / raw)
  To: James Bottomley; +Cc: Richard Hirst, parisc-linux

James Bottomley wrote:
> Actually, I looked and the timeout is now 10s, not 2s.

Ah ok. Then really the question is whether this happens with HP firmware or not.
(Eg ST15150WD should have HP12 firmware rev, iirc). I don't expect it to.

> The upper device 
> drivers (st, sd etc.) do get to override this.  The problems tend to come
> when the error recovery takes over.

Error recovery is a nightmare. For HPUX, we concluded we couldn't safely
guarantee the bus would be cleared until about 45 seconds after an IO
had been initiated. Roughly in the following order:
	30 sec	IO timeout
	10 sec	abort cmd to timeout (attempt to kill the original IO)
	 5 sec	post-reset delay (SCSI device recovery from bus reset)
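
The 45 second figure is simply the sum of the three stages listed above. As a small sketch (the constant and function names are invented here, only the three durations come from the list):

```c
/* Durations in seconds, taken from the list above. */
enum {
    IO_TIMEOUT_S       = 30,  /* original IO timeout */
    ABORT_TIMEOUT_S    = 10,  /* abort command timeout */
    POST_RESET_DELAY_S = 5    /* device recovery after bus reset */
};

/* Worst-case time from IO start until the bus can be assumed clear. */
int worst_case_bus_clear_s(void)
{
    return IO_TIMEOUT_S + ABORT_TIMEOUT_S + POST_RESET_DELAY_S;
}
```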

Shortening the 30 second timer could be done but it would require
more restrictions on how much IO is going across the bus. A non-trivial
problem given these were SCSI clusters.

grant
