All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Reed <mdr@sgi.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-scsi <linux-scsi@vger.kernel.org>, Jeremy Higdon <jeremy@sgi.com>
Subject: [PATCH 0/2] Prevent infinite retries due to DID_RESET return status
Date: Wed, 31 Jan 2007 12:54:06 -0600	[thread overview]
Message-ID: <45C0E5CE.7040203@sgi.com> (raw)
In-Reply-To: <20070102121514.GA31621@infradead.org>

I have implemented Christoph's suggestions and comments in his reply
to my RFC.
--------------------------------------------------------------------

Due to a firmware mismatch between a host and target (names withheld to
protect the innocent?), the LLDD was returning DID_RESET for every
i/o command.  This patch modifies the scsi layer to take into account
when the command which received DID_RESET was issued and eventually
give up on it instead of unconditionally reissuing it forever.
With this patch, on my test system, the command receiving the solid
DID_RESET times out after about 360 seconds.

The premise for this patch is that no command should have an infinite
lifetime.  The impetus for this patch was a system which would not
reach a command prompt without disconnecting the storage from the
host.

The significant change in this patch is to call scsi_queue_insert()
instead of scsi_requeue_command() if the command which receives a
DID_RESET did not complete any i/o (good_bytes==0).  scsi_queue_insert()
does not release the command and regenerate it like scsi_requeue_command()
does, hence jiffies_at_alloc reflects when the command was first issued.

Per Christoph's suggestion, I have broken this into two patches.  The
first implements the moving of the scsi_release_buffer() calls so that
it can be more easily reviewed.  The second patch implements the
lifetime timer for commands receiving DID_RESET.

These patches differ from the first posting in that scsi_queue_insert()
is called directly instead of calling the since removed scsi_retry_command();
comments have been cleaned up; and a comment has been added to indicate
that the code is supposed to fall through to call scsi_end_request()
if the command which received DID_RESET has expired.

These patches were tested by modifying a LLDD to return DID_RESET for
every command received.

Patches apply to 2.6.20-rc6-git1.

Signed-off-by: Michael Reed <mdr@sgi.com>

Christoph Hellwig wrote:
> On Mon, Dec 11, 2006 at 03:42:34PM -0600, Michael Reed wrote:
>> Due to a firmware mismatch between a host and target (names withheld to
>> protect the innocent?), the LLDD was returning DID_RESET for every
>> i/o command.  This patch modifies the scsi layer to take into account
>> when the command which received DID_RESET was issued and eventually
>> give up on it instead of unconditionally reissuing it forever
>> when it receives a DID_RESET.  With this patch, on my test system,
>> the command receiving the constant DID_RESET times out after about
>> 360 seconds.
>>
>> The premise for this patch is that no command should have an infinite
>> lifetime.  The impetus for this patch was a system which would not
>> reach a command prompt without disconnecting the storage from the
>> host.
>>
>> The significant change in this patch is to call scsi_retry_command()
>> instead of scsi_requeue_command() if the command which receives a
>> DID_RESET did not complete any i/o (good_bytes==0).  scsi_retry_command()
>> does not release the command and regenerate it like scsi_requeue_command()
>> does, hence jiffies_at_alloc reflects when the command was first issued.
> 
> Generally this patch looks good to me.  Some comments:
> 
> 
>> -extern int scsi_retry_command(struct scsi_cmnd *cmd);
>> +extern int scsi_retry_command(struct scsi_cmnd *cmd, int reason);
> 
> 	I've just sent a patch to kill scsi_retry_command.  Use
> 	scsi_queue_insert directly instead.
> 
>> +	scsi_release_buffers(cmd);
> 
> 	Can you please separate out the moving of the scsi_release_buffer
> 	calls into a separate patch so it can be audited better?
> 
>> @@ -961,9 +992,20 @@ void scsi_io_completion(struct scsi_cmnd
>>  		/* Third party bus reset or reset for error recovery
>>  		 * reasons.  Just retry the request and see what
>>  		 * happens.
>> +		 * If no data was transferred, just reissue this
>> +		 * command.  If data was transferred, regenerate
>> +		 * the command to transfer only untransferred data.
>>  		 */
> 
> The whole comment should look more like:
> 
> 		/*
> 		 * Third party bus reset or reset for error recovery reasons.
> 		 * If no data was transferred, just reissue this command.
> 		 * If data was transferred, regenerate the command to transfer
> 		 * only untransferred data.
> 		 */
> 
>> -		scsi_requeue_command(q, cmd);
>> -		return;
>> +		if (!good_bytes) {
>> +			if (!(scsi_command_expired(cmd))) {
>> +				scsi_retry_command(cmd, SCSI_MLQUEUE_DID_RESET);
>> +				return;
>> +			}
>> +		}
>> +		else {
>> +			scsi_requeue_command(q, cmd);
>> +			return;
>> +		}
> 
> 	With this code we now fallthrough if we don't have any good bytes
> 	and the command has expired.  Is this the expected behaviour? If
> 	yes we need a good comment describing it.
> 
> -
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 

      reply	other threads:[~2007-01-31 18:54 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-12-11 21:42 [RFC] Prevent infinite retries due to DID_RESET return status Michael Reed
2007-01-02 12:15 ` Christoph Hellwig
2007-01-31 18:54   ` Michael Reed [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=45C0E5CE.7040203@sgi.com \
    --to=mdr@sgi.com \
    --cc=hch@infradead.org \
    --cc=jeremy@sgi.com \
    --cc=linux-scsi@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.