public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
From: Michael Reed <mdr@sgi.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-scsi <linux-scsi@vger.kernel.org>, Jeremy Higdon <jeremy@sgi.com>
Subject: [PATCH 0/2] Prevent infinite retries due to DID_RESET return status
Date: Wed, 31 Jan 2007 12:54:06 -0600	[thread overview]
Message-ID: <45C0E5CE.7040203@sgi.com> (raw)
In-Reply-To: <20070102121514.GA31621@infradead.org>

I have implemented Christoph's suggestions and comments in his reply
to my RFC.
--------------------------------------------------------------------

Due to a firmware mismatch between a host and target (names withheld to
protect the innocent?), the LLDD was returning DID_RESET for every
i/o command.  This patch modifies the scsi layer to take into account
when the command which received DID_RESET was issued and eventually
give up on it instead of unconditionally reissuing it forever.
With this patch, on my test system, the command receiving the solid
DID_RESET times out after about 360 seconds.

The premise for this patch is that no command should have an infinite
lifetime.  The impetus for this patch was a system which would not
reach a command prompt without disconnecting the storage from the
host.

The significant change in this patch is to call scsi_queue_insert()
instead of scsi_requeue_command() if the command which receives a
DID_RESET did not complete any i/o (good_bytes==0).  scsi_queue_insert()
does not release the command and regenerate it like scsi_requeue_command()
does, hence jiffies_at_alloc reflects when the command was first issued.

Per Christoph's suggestion, I have broken this into two patches.  The
first implements the moving of the scsi_release_buffer() calls so that
it can be more easily reviewed.  The second patch implements the
lifetime timer for commands receiving DID_RESET.

These patches differ from the first posting in that scsi_queue_insert()
is called directly instead of calling the since removed scsi_retry_command();
comments have been cleaned up; and a comment has been added to indicate
that the code is supposed to fall through to call scsi_end_request()
if the command which received DID_RESET has expired.

These patches were tested by modifying a LLDD to return DID_RESET for
every command received.

Patches apply to 2.6.20-rc6-git1.

Signed-off-by: Michael Reed <mdr@sgi.com>

Christoph Hellwig wrote:
> On Mon, Dec 11, 2006 at 03:42:34PM -0600, Michael Reed wrote:
>> Due to a firmware mismatch between a host and target (names withheld to
>> protect the innocent?), the LLDD was returning DID_RESET for every
>> i/o command.  This patch modifies the scsi layer to take into account
>> when the command which received DID_RESET was issued and eventually
>> give up on it instead of unconditionally reissuing it forever
>> when it receives a DID_RESET.  With this patch, on my test system,
>> the command receiving the constant DID_RESET times out after about
>> 360 seconds.
>>
>> The premise for this patch is that no command should have an infinite
>> lifetime.  The impetus for this patch was a system which would not
>> reach a command prompt without disconnecting the storage from the
>> host.
>>
>> The significant change in this patch is to call scsi_retry_command()
>> instead of scsi_requeue_command() if the command which receives a
>> DID_RESET did not complete any i/o (good_bytes==0).  scsi_retry_command()
>> does not release the command and regenerate it like scsi_requeue_command()
>> does, hence jiffies_at_alloc reflects when the command was first issued.
> 
> Generally this patch looks good to me.  Some comments:
> 
> 
>> -extern int scsi_retry_command(struct scsi_cmnd *cmd);
>> +extern int scsi_retry_command(struct scsi_cmnd *cmd, int reason);
> 
> 	I've just sent a patch to kill scsi_retry_command.  Use
> 	scsi_queue_insert directly instead.
> 
>> +	scsi_release_buffers(cmd);
> 
> 	Can you please separate out the moving of the scsi_release_buffer
> 	calls into a separate patch so it can be audited better?
> 
>> @@ -961,9 +992,20 @@ void scsi_io_completion(struct scsi_cmnd
>>  		/* Third party bus reset or reset for error recovery
>>  		 * reasons.  Just retry the request and see what
>>  		 * happens.
>> +		 * If no data was transferred, just reissue this
>> +		 * command.  If data was transferred, regenerate
>> +		 * the command to transfer only untransferred data.
>>  		 */
> 
> The whole comment should look more like:
> 
> 		/*
> 		 * Third party bus reset or reset for error recovery reasons.
> 		 * If no data was transferred, just reissue this command.
> 		 * If data was transferred, regenerate the command to transfer
> 		 * only untransferred data.
> 		 */
> 
>> -		scsi_requeue_command(q, cmd);
>> -		return;
>> +		if (!good_bytes) {
>> +			if (!(scsi_command_expired(cmd))) {
>> +				scsi_retry_command(cmd, SCSI_MLQUEUE_DID_RESET);
>> +				return;
>> +			}
>> +		}
>> +		else {
>> +			scsi_requeue_command(q, cmd);
>> +			return;
>> +		}
> 
> 	With this code we now fallthrough if we don't have any good bytes
> 	and the command has expired.  Is this the expected behaviour? If
> 	yes we need a good comment describing it.
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

      reply	other threads:[~2007-01-31 18:54 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-12-11 21:42 [RFC] Prevent infinite retries due to DID_RESET return status Michael Reed
2007-01-02 12:15 ` Christoph Hellwig
2007-01-31 18:54   ` Michael Reed [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=45C0E5CE.7040203@sgi.com \
    --to=mdr@sgi.com \
    --cc=hch@infradead.org \
    --cc=jeremy@sgi.com \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox