How many retries to allow?

public inbox for linux-scsi@vger.kernel.org

* How many retries to allow?
@ 2008-09-25 20:18 Alan Stern
  2008-09-28  9:25 ` Boaz Harrosh
  0 siblings, 1 reply; 5+ messages in thread
From: Alan Stern @ 2008-09-25 20:18 UTC (permalink / raw)
  To: James Bottomley, Boaz Harrosh; +Cc: SCSI development list

James and Boaz:

Here's a question.  Suppose a device returns NOT READY sense key 
repeatedly.  How long should the request be retried before we give up?  
If we never give up then the request will never finish, so the caller 
will hang.

Alan Stern


* Re: How many retries to allow?
  2008-09-25 20:18 How many retries to allow? Alan Stern
@ 2008-09-28  9:25 ` Boaz Harrosh
  2008-09-28 15:44   ` Alan Stern
  0 siblings, 1 reply; 5+ messages in thread
From: Boaz Harrosh @ 2008-09-28  9:25 UTC (permalink / raw)
  To: Alan Stern; +Cc: James Bottomley, SCSI development list

Alan Stern wrote:
> James and Boaz:
> 
> Here's a question.  Suppose a device returns NOT READY sense key 
> repeatedly.  How long should the request be retried before we give up?  
> If we never give up then the request will never finish, so the caller 
> will hang.
> 
> Alan Stern
> 
I always thought request->retries was for that. Perhaps I misunderstood.

I think there should be one user-settable global counter that limits
all retries of any kind.

Just my $0.017
Boaz


* Re: How many retries to allow?
  2008-09-28  9:25 ` Boaz Harrosh
@ 2008-09-28 15:44   ` Alan Stern
  2008-09-28 16:16     ` Boaz Harrosh
  0 siblings, 1 reply; 5+ messages in thread
From: Alan Stern @ 2008-09-28 15:44 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: James Bottomley, SCSI development list

On Sun, 28 Sep 2008, Boaz Harrosh wrote:

> Alan Stern wrote:
> > James and Boaz:
> > 
> > Here's a question.  Suppose a device returns NOT READY sense key 
> > repeatedly.  How long should the request be retried before we give up?  
> > If we never give up then the request will never finish, so the caller 
> > will hang.
> > 
> > Alan Stern
> > 
> I always thought  request->retries was for that. Perhaps I misunderstood.

Maybe it is intended for that purpose, but it isn't being used as far 
as I can tell.  req->retries is never decremented; instead 
scmd->allowed is initialized to req->retries when the request is 
prepped.  But when a command fails and scsi_requeue_command() is 
called, the request is un-prepped and put back on the queue.  Then it 
is prepped again and a new scmd is created -- with the same number of 
retries as before.  Thus we will never run out of retries.
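
The flow described above can be modelled in a few lines of user-space C.
This is only a sketch, not kernel code: the struct and function names
(model_request, model_cmnd, attempts_counter_on_command,
attempts_counter_on_request) are hypothetical stand-ins, the device is
assumed to answer NOT READY forever, and the second function (a counter
kept on the request) is an assumption added for contrast, not a fix
proposed in this thread.

```c
/* Hypothetical stand-ins for struct request and struct scsi_cmnd. */
struct model_request { int retries; };
struct model_cmnd { int retries; int allowed; };

/*
 * Counter lives on the command.  Every requeue discards the command and
 * prepping creates a fresh one, so the counter restarts from zero.
 * Returns the number of attempts made; max_attempts only exists so the
 * model terminates.
 */
static int attempts_counter_on_command(const struct model_request *req,
                                       int max_attempts)
{
    int attempts = 0;

    while (attempts < max_attempts) {
        /* prep: a new command, budget copied from the request */
        struct model_cmnd cmd = { .retries = 0, .allowed = req->retries };

        attempts++;          /* device answers NOT READY */
        cmd.retries++;
        if (cmd.retries > cmd.allowed)
            break;           /* give up: only reachable if allowed == 0 */
        /* requeue: cmd is thrown away here */
    }
    return attempts;
}

/*
 * Assumed alternative: the count of used retries survives the requeue
 * because it is kept outside the per-attempt command.
 */
static int attempts_counter_on_request(const struct model_request *req,
                                       int max_attempts)
{
    int attempts = 0;
    int used = 0;            /* persists across requeues */

    while (attempts < max_attempts) {
        attempts++;          /* device answers NOT READY */
        used++;
        if (used > req->retries)
            break;           /* budget exhausted: give up */
    }
    return attempts;
}
```

With req.retries = 5 and a cap of 1000, the first function runs all 1000
attempts without ever giving up, while the second stops after 6 (the
initial attempt plus 5 retries).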

> I think there should be one user settable global counter that will limit
> all retries of any kind.

You're missing a major point.  Suppose for example that the device
returns NOT READY because a new medium is being loaded, a procedure
that takes a couple of seconds.  But the SCSI core doesn't wait between
retries; a new command is sent as soon as the old one fails.  A retry
limit of 10 could easily be used up in a fraction of a second, and then
the request would fail.
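
To put rough numbers on this, here is a small user-space sketch in C.
Everything in it is an assumption made for illustration: the name
simulate_retries, the 1 ms command round trip, and the idea of a fixed
delay between retries are not taken from the SCSI core.

```c
#include <stdbool.h>

struct retry_outcome {
    bool succeeded;
    long elapsed_ms;   /* time at which the last attempt was issued */
};

/*
 * The device answers NOT READY until ready_after_ms has passed.  Each
 * failed attempt costs cmd_ms, plus delay_ms of waiting before the
 * next one.  At most retry_limit retries follow the first attempt.
 */
static struct retry_outcome simulate_retries(long ready_after_ms,
                                             int retry_limit,
                                             long cmd_ms, long delay_ms)
{
    struct retry_outcome out = { false, 0 };
    int retries_used = 0;
    long now = 0;

    for (;;) {
        if (now >= ready_after_ms) {     /* medium loaded: success */
            out.succeeded = true;
            break;
        }
        if (retries_used >= retry_limit) /* budget exhausted: fail */
            break;
        retries_used++;
        now += cmd_ms + delay_ms;
    }
    out.elapsed_ms = now;
    return out;
}
```

Under these assumptions, simulate_retries(2000, 10, 1, 0) fails after
only 10 ms, long before a 2-second medium load completes, whereas
simulate_retries(2000, 10, 1, 500) succeeds at 2004 ms with 4 retries
used.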

Is this how it's supposed to work?  Would it be better to invoke the 
error handler for this sort of thing?

Alan Stern


* Re: How many retries to allow?
  2008-09-28 15:44   ` Alan Stern
@ 2008-09-28 16:16     ` Boaz Harrosh
  2008-09-29 21:14       ` Alan Stern
  0 siblings, 1 reply; 5+ messages in thread
From: Boaz Harrosh @ 2008-09-28 16:16 UTC (permalink / raw)
  To: Alan Stern; +Cc: James Bottomley, SCSI development list

Alan Stern wrote:
> On Sun, 28 Sep 2008, Boaz Harrosh wrote:
> 
>> Alan Stern wrote:
>>> James and Boaz:
>>>
>>> Here's a question.  Suppose a device returns NOT READY sense key 
>>> repeatedly.  How long should the request be retried before we give up?  
>>> If we never give up then the request will never finish, so the caller 
>>> will hang.
>>>
>>> Alan Stern
>>>
>> I always thought  request->retries was for that. Perhaps I misunderstood.
> 
> Maybe it is intended for that purpose, but it isn't being used as far 
> as I can tell.  req->retries is never decremented; instead 
> scmd->allowed is initialized to req->retries when the request is 
> prepped.  But when a command fails and scsi_requeue_command() is 
> called, the request is un-prepped and put back on the queue.  Then it 
> is prepped again and a new scmd is created -- with the same number of 
> retries as before.  Thus we will never run out of retries.
> 

This sounds like a bug to me. It should be fixed. Perhaps it has been there
since the 2.6.18 changes, when direct scsi_cmnd requeuing was eliminated. A
test would be most welcome. It should be easy to prove. I would do it if you
don't beat me to it. (I am pretty busy.)

>> I think there should be one user settable global counter that will limit
>> all retries of any kind.
> 
> You're missing a major point.  Suppose for example that the device
> returns NOT READY because a new medium is being loaded, a procedure
> that takes a couple of seconds.  But the SCSI core doesn't wait between
> retries; a new command is sent as soon as the old one fails.  A retry
> limit of 10 could easily be used up in a fraction of a second, and then
> the request would fail.
> 
> Is this how it's supposed to work?  Would it be better to invoke the 
> error handler for this sort of thing?
> 

I always think of it as the timeout being the inner loop, with retries on top
of that, so a 2-second timeout with 5 retries means 10 seconds. But now that
you point it out, I can see how this breaks for some errors. A test with
scsi_debug error injection should be devised, to make sure things are fixed
and don't regress in the future.

I believe there are lots of theoretical catastrophes in the current code, but
not too many in practice. Though I agree that a pragmatic programming mindset
was practiced over a more generalized one.

> Alan Stern
> 
> --

Sorry, I will not have time to conduct any tests in the near future, so you're on
your own. But I'll review anything you can post in the matter.

Thanks
Boaz


* Re: How many retries to allow?
  2008-09-28 16:16     ` Boaz Harrosh
@ 2008-09-29 21:14       ` Alan Stern
  0 siblings, 0 replies; 5+ messages in thread
From: Alan Stern @ 2008-09-29 21:14 UTC (permalink / raw)
  To: Boaz Harrosh; +Cc: James Bottomley, SCSI development list

On Sun, 28 Sep 2008, Boaz Harrosh wrote:

> This sounds like a bug to me. It should be fixed. Perhaps it has been there
> since the 2.6.18 changes, when direct scsi_cmnd requeuing was eliminated. A
> test would be most welcome. It should be easy to prove. I would do it if you
> don't beat me to it. (I am pretty busy.)

I'll do some testing soon.

> Sorry, I will not have time to conduct any tests in the near future, so you're on
> your own. But I'll review anything you can post in the matter.

I just posted three initial patches; you can review those for now.  
They straighten out existing issues but ignore the matter of limiting
the number of retries.

Alan Stern


