* How many retries to allow?
From: Alan Stern @ 2008-09-25 20:18 UTC
To: James Bottomley, Boaz Harrosh; +Cc: SCSI development list

James and Boaz:

Here's a question. Suppose a device returns NOT READY sense key
repeatedly. How long should the request be retried before we give up?
If we never give up then the request will never finish, so the caller
will hang.

Alan Stern

* Re: How many retries to allow?
From: Boaz Harrosh @ 2008-09-28 9:25 UTC
To: Alan Stern; +Cc: James Bottomley, SCSI development list

Alan Stern wrote:
> James and Boaz:
>
> Here's a question. Suppose a device returns NOT READY sense key
> repeatedly. How long should the request be retried before we give up?
> If we never give up then the request will never finish, so the caller
> will hang.
>
> Alan Stern
>

I always thought request->retries was for that. Perhaps I misunderstood.

I think there should be one user settable global counter that will limit
all retries of any kind.

Just my $0.017
Boaz

* Re: How many retries to allow?
From: Alan Stern @ 2008-09-28 15:44 UTC
To: Boaz Harrosh; +Cc: James Bottomley, SCSI development list

On Sun, 28 Sep 2008, Boaz Harrosh wrote:

> Alan Stern wrote:
> > James and Boaz:
> >
> > Here's a question. Suppose a device returns NOT READY sense key
> > repeatedly. How long should the request be retried before we give up?
> > If we never give up then the request will never finish, so the caller
> > will hang.
> >
> > Alan Stern
> >
> I always thought request->retries was for that. Perhaps I misunderstood.

Maybe it is intended for that purpose, but it isn't being used as far
as I can tell. req->retries is never decremented; instead
scmd->allowed is initialized to req->retries when the request is
prepped. But when a command fails and scsi_requeue_command() is
called, the request is un-prepped and put back on the queue. Then it
is prepped again and a new scmd is created -- with the same number of
retries as before. Thus we will never run out of retries.

> I think there should be one user settable global counter that will limit
> all retries of any kind.

You're missing a major point. Suppose for example that the device
returns NOT READY because a new medium is being loaded, a procedure
that takes a couple of seconds. But the SCSI core doesn't wait between
retries; a new command is sent as soon as the old one fails. A retry
limit of 10 could easily be used up in a fraction of a second, and then
the request would fail.

Is this how it's supposed to work? Would it be better to invoke the
error handler for this sort of thing?

Alan Stern

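[Editorial sketch] The requeue behaviour Alan describes can be illustrated with a toy model. This is not kernel code: `Request`, `Command`, `prep`, `attempts_used`, and `safety_cap` are hypothetical names invented for the illustration, and the only behaviour taken from the message above is that each re-prep builds a new command whose allowance is copied from the request's `retries` value, which is never decremented.

```python
from dataclasses import dataclass


@dataclass
class Request:
    retries: int             # retry budget set by the submitter, never decremented
    attempts_used: int = 0   # hypothetical field, used only by the second variant


@dataclass
class Command:
    allowed: int             # copied from Request.retries at prep time
    retries: int = 0         # failures counted against this command only


def prep(req: Request) -> Command:
    """Build a fresh command for the request, as happens on every (re)prep."""
    return Command(allowed=req.retries)


def run_requeue_path(req: Request, safety_cap: int) -> int:
    """Device always reports NOT READY; each failure un-preps and requeues.

    Returns the number of attempts made before giving up, or safety_cap
    if the model never gives up on its own.
    """
    attempts = 0
    while attempts < safety_cap:
        cmd = prep(req)        # new command, so its counter starts at zero
        attempts += 1
        cmd.retries += 1       # the failure is recorded on this command ...
        if cmd.retries > cmd.allowed:
            return attempts    # unreachable whenever allowed >= 1
        # ... and the command is thrown away when the request is requeued
    return attempts


def run_persistent_count(req: Request, safety_cap: int) -> int:
    """Same device, but failures are counted on the request itself."""
    attempts = 0
    while attempts < safety_cap:
        attempts += 1
        req.attempts_used += 1
        if req.attempts_used > req.retries:
            return attempts    # one initial attempt plus req.retries retries
    return attempts
```

With a budget of 5, the first variant only stops at the artificial `safety_cap`, while the second gives up after 6 attempts. Keeping the count on the request is just one possible way to make the limit bite in the model; the thread does not say how the real code should be changed.
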
* Re: How many retries to allow?
From: Boaz Harrosh @ 2008-09-28 16:16 UTC
To: Alan Stern; +Cc: James Bottomley, SCSI development list

Alan Stern wrote:
> On Sun, 28 Sep 2008, Boaz Harrosh wrote:
>
>> Alan Stern wrote:
>>> James and Boaz:
>>>
>>> Here's a question. Suppose a device returns NOT READY sense key
>>> repeatedly. How long should the request be retried before we give up?
>>> If we never give up then the request will never finish, so the caller
>>> will hang.
>>>
>>> Alan Stern
>>>
>> I always thought request->retries was for that. Perhaps I misunderstood.
>
> Maybe it is intended for that purpose, but it isn't being used as far
> as I can tell. req->retries is never decremented; instead
> scmd->allowed is initialized to req->retries when the request is
> prepped. But when a command fails and scsi_requeue_command() is
> called, the request is un-prepped and put back on the queue. Then it
> is prepped again and a new scmd is created -- with the same number of
> retries as before. Thus we will never run out of retries.
>

This sounds like a bug to me. It should be fixed. Perhaps it's been there
since the 2.6.18 changes, when direct scsi_cmnd requeuing was eliminated.
A test would be most welcome. It should be easy to prove. I would do it,
if you don't beat me to it. (Am pretty busy.)

>> I think there should be one user settable global counter that will limit
>> all retries of any kind.
>
> You're missing a major point. Suppose for example that the device
> returns NOT READY because a new medium is being loaded, a procedure
> that takes a couple of seconds. But the SCSI core doesn't wait between
> retries; a new command is sent as soon as the old one fails. A retry
> limit of 10 could easily be used up in a fraction of a second, and then
> the request would fail.
>
> Is this how it's supposed to work? Would it be better to invoke the
> error handler for this sort of thing?
>

I always think of that as: timeout being the inner loop and retries on top
of that, so a 2-second timeout with 5 retries means 10 seconds. But now
that you point it out I can see how this breaks for some errors.

A test with scsi_debug error injection should be devised, to make sure
things are fixed and don't regress in the future.

I believe there are lots of theoretical catastrophes in the current code,
but not too many in practice. Though I agree that a pragmatic programming
mindset was practiced over a more generalized one.

> Alan Stern
>

Sorry, I will not have time to conduct any tests in the near future, so
you're on your own. But I'll review anything you can post in the matter.

Thanks
Boaz

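[Editorial sketch] The two timing pictures in this exchange can be put side by side with a little arithmetic. The functions below are hypothetical helpers, not kernel interfaces. The 2-second timeout, the retry counts of 5 and 10, and the "couple of seconds" medium load come from the messages above; the 1 ms cost of an immediately failing command is an assumed figure chosen only to show the scale.

```python
def retry_window_ms(attempt_ms: int, retries: int) -> int:
    """Total time spent retrying if every attempt takes attempt_ms."""
    return attempt_ms * retries


def outlasts(ready_after_ms: int, attempt_ms: int, retries: int) -> bool:
    """True if retrying continues at least until the device becomes ready."""
    return retry_window_ms(attempt_ms, retries) >= ready_after_ms


# Boaz's picture: every attempt runs into a 2-second timeout, 5 retries.
TIMEOUT_CASE_MS = retry_window_ms(attempt_ms=2000, retries=5)

# Alan's picture: NOT READY comes back at once and nothing waits between
# retries. The 1 ms per attempt is an assumption, not a measured value.
FAST_FAIL_CASE_MS = retry_window_ms(attempt_ms=1, retries=10)

# A medium load taking about 2 seconds (assumed as exactly 2000 ms here).
MEDIUM_LOAD_MS = 2000
```

Under these assumptions the timeout case covers 10 seconds, while the fast-failure case exhausts its 10 retries in about 10 ms, far short of the medium load time, so `outlasts(MEDIUM_LOAD_MS, 1, 10)` is false.
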
* Re: How many retries to allow?
From: Alan Stern @ 2008-09-29 21:14 UTC
To: Boaz Harrosh; +Cc: James Bottomley, SCSI development list

On Sun, 28 Sep 2008, Boaz Harrosh wrote:

> This sounds like a bug to me. It should be fixed. Perhaps it's been there
> since the 2.6.18 changes, when direct scsi_cmnd requeuing was eliminated.
> A test would be most welcome. It should be easy to prove. I would do it,
> if you don't beat me to it. (Am pretty busy.)

I'll do some testing soon.

> Sorry, I will not have time to conduct any tests in the near future, so
> you're on your own. But I'll review anything you can post in the matter.

I just posted three initial patches; you can review those for now.
They straighten out existing issues but ignore the matter of limiting
the number of retries.

Alan Stern