From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike Anderson
Subject: Re: [PATCH]: Flexible timeout infrastructure
Date: Wed, 16 Jun 2004 08:27:58 -0700
Sender: linux-scsi-owner@vger.kernel.org
Message-ID: <20040616152758.GB4288@us.ibm.com>
References: <40CF0F9F.4050902@adaptec.com> <1087313492.1796.37.camel@mulgrave> <40CF4A15.9060005@adaptec.com> <1087329285.2048.94.camel@mulgrave>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from e35.co.us.ibm.com ([32.97.110.133]:19619 "EHLO e35.co.us.ibm.com")
	by vger.kernel.org with ESMTP id S263174AbUFPP2G (ORCPT );
	Wed, 16 Jun 2004 11:28:06 -0400
Content-Disposition: inline
In-Reply-To: <1087329285.2048.94.camel@mulgrave>
List-Id: linux-scsi@vger.kernel.org
To: James Bottomley
Cc: Luben Tuikov, SCSI Mailing List

James Bottomley [James.Bottomley@steeleye.com] wrote:
> In the ensuing discussion there have been various changes to this
> suggested, which seem to provide a framework for the solution:
>
> 1. Timer handling would still all be done in the mid-layer
>
> 2. Any driver supplying the notify function would have it called on
> timer expiry.
>
> 3. The LLD communicates what action it wishes to be taken based on the
> return value from the notify. I suggest 3 possible return actions:
>
> a. Do nothing and continue with error handling
>
> b. I fixed the problem, complete the command immediately and proceed as
> though nothing went wrong.

Does this mean scsi_times_out will complete the command by calling a
SCSI mid-layer internal form of the scsi_done function (less the
scsi_delete_timer call), or that the LLDD will call scsi_done and we
will need to modify scsi_done to accept these no-timer-running cases?

>
> c. I need more time, reset the timer and notify me again when it fails.
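
To check that I am reading the three return actions correctly, here is
a rough userspace sketch in C. It is not a patch. Every name in it
(sketch_times_out, struct sketch_cmd, the enum values) is made up for
illustration and is not an existing mid-layer symbol. The retry cap is
my reading of the proposal for (c) quoted just below.

```c
#include <stddef.h>

/* Hypothetical return values for the LLD notify hook. */
enum timeout_action {
	TIMEOUT_CONTINUE_EH,	/* (a) do nothing, continue with error handling */
	TIMEOUT_COMPLETED,	/* (b) LLD fixed it, complete the command now */
	TIMEOUT_RESET_TIMER	/* (c) LLD needs more time, rearm the timer */
};

/* What the (simulated) mid-layer ends up doing on a timer expiry. */
enum timeout_result {
	RESULT_ERROR_HANDLING,
	RESULT_COMMAND_DONE,
	RESULT_TIMER_REARMED
};

/* Hypothetical stand-in for a command; not struct scsi_cmnd. */
struct sketch_cmd {
	int retries;	/* retries consumed so far */
	int allowed;	/* retries allowed, 0 for a no-retry command */
	enum timeout_action (*notify)(struct sketch_cmd *cmd);	/* optional */
};

/* Called on timer expiry; the timer itself stays in the mid-layer. */
static enum timeout_result sketch_times_out(struct sketch_cmd *cmd)
{
	if (cmd->notify == NULL)
		return RESULT_ERROR_HANDLING;	/* no hook supplied */

	switch (cmd->notify(cmd)) {
	case TIMEOUT_COMPLETED:
		return RESULT_COMMAND_DONE;
	case TIMEOUT_RESET_TIMER:
		/* Assumed cap: rearm at most (allowed + 1) times. */
		if (cmd->retries < cmd->allowed + 1) {
			cmd->retries++;
			return RESULT_TIMER_REARMED;
		}
		return RESULT_ERROR_HANDLING;	/* retries exhausted */
	case TIMEOUT_CONTINUE_EH:
	default:
		return RESULT_ERROR_HANDLING;
	}
}

/* Example hook for an LLD that always asks for more time. */
static enum timeout_action sketch_need_more_time(struct sketch_cmd *cmd)
{
	(void)cmd;
	return TIMEOUT_RESET_TIMER;
}

/* Count the timer periods that elapse before the command leaves the
 * timeout path (goes to error handling or is completed). */
static int sketch_timer_periods(struct sketch_cmd *cmd)
{
	int periods = 1;

	while (sketch_times_out(cmd) == RESULT_TIMER_REARMED)
		periods++;
	return periods;
}
```

With allowed = 0 and the sketch_need_more_time hook,
sketch_timer_periods() returns 2: a no-retry command sits through two
full timeout periods before error handling starts.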
>
> For (c), I propose that we use the same timeout period, but increment
> the retry count (and do this up to allowed retries plus one [so that
> no-retry commands have one crack at being recovered by the LLD]) when
> retries are exhausted, normal error handling would proceed on timer
> expiry leading to certain failure of the command since it would be
> ineligible to be retried.

The comment on the no-retry commands appears counter to the intent of
FASTFAIL. On a multi-ported device, if there really is a port /
controller issue, we have increased the failover time to 2x the timeout
value, which IIRC was one case that FASTFAIL wished to address.

>
> what additional features do you need beyond this proposal?
>

-andmike
--
Michael Anderson
andmike@us.ibm.com