All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Vasquez <andrew.vasquez@qlogic.com>
To: "michaelc@cs.wisc.edu" <michaelc@cs.wisc.edu>
Cc: "linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>
Subject: Re: [PATCH] scsi_lib: don't decrement busy counters when inserting commands
Date: Thu, 18 Dec 2008 01:03:41 -0800	[thread overview]
Message-ID: <20081218090341.GA1765@plap4-2.local> (raw)
In-Reply-To: <1229493343215-git-send-email-michaelc@cs.wisc.edu>

On Tue, 16 Dec 2008, michaelc@cs.wisc.edu wrote:

> This patch fixes a regression in scsi-misc introduced with:
> 312efe5efcdb02d604ea37a41d965f32ca276d6a
> [SCSI] simplify scsi_io_completion().
> 
> The problem is that in previous kernels scsi_io_completion would call
> scsi_requeue_command, but now it can call scsi_queue_insert for
> something like a UNIT_ATTENTION (for netapp targets we get
> UNIT_ATTENTION when restarting a iscsi session). And scsi_queue_insert
> will call scsi_device_unbusy, but in the scsi_io_completion code path
> scsi_finish_command has already called this so we now end up
> with invalid host, target and device busy values.

Thanks for finding this, as we've run into this problem ourselves.
After applying your fix, at least I/O appears to continue...  But
there's still some residual problems left over with the changes from
scsi_requeue_command() -> scsi_queue_insert() in
312efe5efcdb02d604ea37a41d965f32ca276d6a that are causing commands
returned by the LLD with a DID_RESET status to be reissued with
cleared cmd->sdb data which in our tests are manifesting in firmware
detected overruns.  Here's a snippet of a READ_10 scsi_cmnd upon
completion by the storage:

	[  107.083085] scsi(4:0:0): OVERRUN status detected 0x7-0x400
	[  107.087081] CDB: 0x28 0x0 0x0 0x0 0x28 0x0
	[  107.087081] CDB: 0x0 0x4 0x0 0x0 0x0 0x0
	[  107.087081] PID=0x82d req=0x0 xtra=0x80000 -- returning DID_ERROR status!
	[  107.087081] cmd: serial=82d scsi_bufflen=0 retries=0 allowed=5 sc_data_direction=2
	[  107.087081] cmd->sdb: length=0 resid=0
	[  107.087081] cmd->sdb.table: sgl=0000000000000000 nents=0 orig_nents=0
	[  107.087081] req: cmd_flags=83c60 hard_nr_sections=0x400 nr_sectors=0x400 data_len=0x80000

the CDB in question contains a valid transfer-length of 400h blocks,
but the scsi_bufflen() of the scsi_cmnd is set to 0.

With the new 'simplify' code, what appears to be happening is the
scsi_cmnd->sdb structure is never being re-populated within
scsi_io_completion() after the scsi_release_buffers() call as
scsi_queue_insert() simply reissues the request with the previously
allocated scsi_cmnd and the request's REQ_DONTPREP bit set...

The original 'pre-simplified' code, as Mike C. notes, would call
scsi_requeue_command() which tore-down the request's scsi_cmnd prior
to retry.

I've tried this small patch:

	--- a/drivers/scsi/scsi_lib.c
	+++ b/drivers/scsi/scsi_lib.c
	@@ -970,7 +973,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
			 * reasons.  Just retry the command and see what
			 * happens.
			 */
	-		action = ACTION_RETRY;
	+		action = ACTION_REPREP;
		} else if (sense_valid && !sense_deferred) {
			switch (sshdr.sense_key) {
			case UNIT_ATTENTION:

but, it doesn't appear to solve the problem either.  Additionally,
it's not going to cover any other cases which ultimately result in
scsi_queue_insert() being used for retries...

Any thoughts on how to handle this???

  reply	other threads:[~2008-12-18  9:03 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-12-17  5:55 [PATCH] scsi_lib: don't decrement busy counters when inserting commands michaelc
2008-12-18  9:03 ` Andrew Vasquez [this message]
2009-01-02 16:42 ` James Bottomley
2009-01-03  3:39   ` Alan Stern
2009-01-04 20:53   ` Mike Christie
2009-01-05 23:28   ` Andrew Vasquez
2009-01-06  1:56     ` James Bottomley
2009-01-06  7:01       ` Andrew Vasquez
2009-01-06 19:15         ` James Bottomley
2009-01-06 19:48           ` Andrew Vasquez

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20081218090341.GA1765@plap4-2.local \
    --to=andrew.vasquez@qlogic.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=michaelc@cs.wisc.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.