* [PATCH 0/14] scsi: scsi_decide_dispostion update
@ 2008-09-02 16:05 Mike Anderson
  2008-09-02 16:05 ` [PATCH 01/14] block: separate failfast into multiple bits Mike Anderson
                   ` (14 more replies)
  0 siblings, 15 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi

This patch series is an update to a previous set of patches posted by Mike
Christie in the thread referenced below.
http://thread.gmane.org/gmane.linux.scsi/44058/focus=4405

This patch series creates new return codes for scsi_decide_disposition and
scsi_check_sense so that retry restrictions and disposition can be implied
directly from the return code. Retry restrictions have also been moved
into the requeue function.

Mike C and I have tested a few of the return code types, but it would be
good to have the other return types checked on different hardware.

Changes in behavior:
	- dm-mp fast fails on transport errors.
	- device busy and host busy are limited by the wait_for check.
	- A few more error cases will return DRIVER_TIMEOUT. We should possibly
	  add more DRIVER_ error codes.

A summary of the current disposition is shown below.

1.) Current disposition policy.
2.6.27 policy + DID_TRANSPORT patches

==============================================================================
scsi_queue_insert
==============================================================================
	SCSI_MLQUEUE_EH_RETRY		retry
	SCSI_MLQUEUE_DEVICE_BUSY 	retry, set device_blocked
	SCSI_MLQUEUE_HOST_BUSY		retry, set host_blocked

==============================================================================
scsi_softirq_done
==============================================================================
	disposition SUCCESS:
		scsi_finish_command
	disposition NEEDS_RETRY:
		scsi_queue_insert SCSI_MLQUEUE_EH_RETRY
	disposition ADD_TO_MLQUEUE:
		scsi_queue_insert SCSI_MLQUEUE_DEVICE_BUSY
	default:
		scsi_eh_scmd_add
	

==============================================================================
host_byte		
==============================================================================

DID_OK			goto status_byte
DID_NO_CONNECT		SUCCESS
DID_BUS_BUSY		allowed && !blk_noretry
DID_TIME_OUT		SUCCESS (TUR/INQ) / FAILED
DID_BAD_TARGET		SUCCESS
DID_ABORT		SUCCESS
DID_PARITY		allowed && !blk_noretry
DID_ERROR		allowed && !blk_noretry (status for RES)
DID_RESET		SUCCESS
DID_BAD_INTR		default
DID_PASSTHROUGH		SUCCESS
DID_SOFT_ERROR		allowed && !blk_noretry
DID_IMM_RETRY		NEEDS_RETRY
DID_REQUEUE		ADD_TO_MLQUEUE
DID_TRANSPORT_DISRUPTED ADD_TO_MLQUEUE
DID_TRANSPORT_FAILFAST	SUCCESS
default			FAILED


==============================================================================
status_byte
==============================================================================

QUEUE_FULL		ADD_TO_MLQUEUE
BUSY			ADD_TO_MLQUEUE
GOOD			SUCCESS
COMMAND_TERMINATED	SUCCESS
TASK_ABORTED		SUCCESS
CHECK_CONDITION		SEE SENSE
CONDITION_GOOD		SUCCESS
INTERMEDIATE_GOOD	SUCCESS
INTERMEDIATE_C_GOOD	SUCCESS
ACA_ACTIVE		SUCCESS
RESERVATION_CONFLICT	SUCCESS
default			FAILED

==============================================================================
sense
==============================================================================

!normalize_sense	FAILED
scsi_sense_is_deferred	NEEDS_RETRY (allowed && !blk_noretry)

scsi_dh->check_sense	handler return

FILEMARK, EOM or ILI	SUCCESS

NO_SENSE		SUCCESS
RECOVERED_ERROR		SUCCESS
ABORTED_COMMAND		SUCCESS (DIF) / NEEDS_RETRY
NOT_READY		NEEDS_RETRY (ua, bc rdy) / FAILED (restart) / SUCCESS
UNIT_ATTENTION		NEEDS_RETRY (ua, bc rdy) / FAILED (restart) / SUCCESS
COPY_ABORTED		SUCCESS
VOLUME_OVERFLOW		SUCCESS
MISCOMPARE		SUCCESS
MEDIUM_ERROR		SUCCESS (0x11, 0x13, 0x14) / NEEDS_RETRY
HARDWARE_ERROR		ADD_TO_MLQUEUE (retry_hwerror) / SUCCESS
ILLEGAL_REQUEST		SUCCESS
BLANK_CHECK		SUCCESS
DATA_PROTECT		SUCCESS
default			SUCCESS




* [PATCH 01/14] block: separate failfast into multiple bits.
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:35   ` Grant Grundler
  2008-09-02 16:05 ` [PATCH 02/14] scsi: add transport host byte errors (v3) Mike Anderson
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi
  Cc: Mike Christie, Jens Axboe, Alasdair G Kergon, Neil Brown,
	Martin Schwidefsky

From: Mike Christie <michaelc@cs.wisc.edu>

Multipath is best at handling transport errors. If it gets a device
error then there is not much the multipath layer can do. It will just
access the same device but from a different path.

This patch breaks up failfast into device, transport and driver errors.
The multipath layers (md and dm multipath) only ask the lower levels to
fast fail transport errors. The other user of failfast, read-ahead, will
ask to fast fail on all errors.

Note that blk_noretry_request will return true if any failfast bit
is set. This allows drivers that do not support the multipath failfast
bits to continue to fail on any failfast error like before. Drivers
like scsi that are able to fail fast specific errors can check
for the specific fail fast type. In the next patch I will convert
scsi.
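
As a rough userspace illustration of the flag split described above (not
the kernel code itself), the sketch below models the three failfast bits
and the "any bit set" behaviour of blk_noretry_request. All sk_ names and
the bit positions are hypothetical; the real definitions are in the
include/linux/blkdev.h hunk of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for REQ_FAILFAST_DEV/TRANSPORT/DRIVER. The bit
 * positions are illustrative only, not the kernel's. */
enum {
	SK_FAILFAST_DEV       = 1 << 0,
	SK_FAILFAST_TRANSPORT = 1 << 1,
	SK_FAILFAST_DRIVER    = 1 << 2,
};

/* Minimal stand-in for struct request: only the flags word. */
struct sk_request {
	unsigned int cmd_flags;
};

static inline bool sk_failfast_dev(const struct sk_request *rq)
{
	assert(rq != NULL);
	return (rq->cmd_flags & SK_FAILFAST_DEV) != 0;
}

static inline bool sk_failfast_transport(const struct sk_request *rq)
{
	assert(rq != NULL);
	return (rq->cmd_flags & SK_FAILFAST_TRANSPORT) != 0;
}

static inline bool sk_failfast_driver(const struct sk_request *rq)
{
	assert(rq != NULL);
	return (rq->cmd_flags & SK_FAILFAST_DRIVER) != 0;
}

/* True if any failfast bit is set, so a driver that cannot tell the
 * error classes apart keeps failing fast on everything. */
static inline bool sk_noretry_request(const struct sk_request *rq)
{
	return sk_failfast_dev(rq) || sk_failfast_transport(rq) ||
	       sk_failfast_driver(rq);
}

/* Read-ahead asks to fail fast on every error class. */
static inline unsigned int sk_readahead_flags(void)
{
	return SK_FAILFAST_DEV | SK_FAILFAST_TRANSPORT | SK_FAILFAST_DRIVER;
}

/* Multipath asks to fail fast on transport errors only. */
static inline unsigned int sk_multipath_flags(void)
{
	return SK_FAILFAST_TRANSPORT;
}
```

A request built with sk_multipath_flags() still counts as "noretry" for a
driver that only calls sk_noretry_request(), while a driver that checks
sk_failfast_dev() would be free to retry device errors on it.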

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Alasdair G Kergon <agk@redhat.com>
Cc: Neil Brown <neilb@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 block/blk-core.c                            |   11 +++++++++--
 drivers/md/dm-mpath.c                       |    2 +-
 drivers/md/multipath.c                      |    4 ++--
 drivers/s390/block/dasd_diag.c              |    2 +-
 drivers/s390/block/dasd_eckd.c              |    2 +-
 drivers/s390/block/dasd_fba.c               |    2 +-
 drivers/scsi/device_handler/scsi_dh_alua.c  |    3 ++-
 drivers/scsi/device_handler/scsi_dh_emc.c   |    3 ++-
 drivers/scsi/device_handler/scsi_dh_hp_sw.c |    6 ++++--
 drivers/scsi/device_handler/scsi_dh_rdac.c  |    3 ++-
 drivers/scsi/scsi_transport_spi.c           |    4 +++-
 include/linux/bio.h                         |   26 +++++++++++++++++---------
 include/linux/blkdev.h                      |   15 ++++++++++++---
 13 files changed, 57 insertions(+), 26 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 4889eb8..f3c29d0 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1073,8 +1073,15 @@ void init_request_from_bio(struct request *req, struct bio *bio)
 	/*
 	 * inherit FAILFAST from bio (for read-ahead, and explicit FAILFAST)
 	 */
-	if (bio_rw_ahead(bio) || bio_failfast(bio))
-		req->cmd_flags |= REQ_FAILFAST;
+	if (bio_rw_ahead(bio))
+		req->cmd_flags |= (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
+				   REQ_FAILFAST_DRIVER);
+	if (bio_failfast_dev(bio))
+		req->cmd_flags |= REQ_FAILFAST_DEV;
+	if (bio_failfast_transport(bio))
+		req->cmd_flags |= REQ_FAILFAST_TRANSPORT;
+	if (bio_failfast_driver(bio))
+		req->cmd_flags |= REQ_FAILFAST_DRIVER;
 
 	/*
 	 * REQ_BARRIER implies no merging, but lets make it explicit
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 71dd65a..b48e201 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -827,7 +827,7 @@ static int multipath_map(struct dm_target *ti, struct bio *bio,
 	dm_bio_record(&mpio->details, bio);
 
 	map_context->ptr = mpio;
-	bio->bi_rw |= (1 << BIO_RW_FAILFAST);
+	bio->bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
 	r = map_io(m, bio, mpio, 0);
 	if (r < 0 || r == DM_MAPIO_REQUEUE)
 		mempool_free(mpio, m->mpio_pool);
diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
index c4779cc..2426201 100644
--- a/drivers/md/multipath.c
+++ b/drivers/md/multipath.c
@@ -172,7 +172,7 @@ static int multipath_make_request (struct request_queue *q, struct bio * bio)
 	mp_bh->bio = *bio;
 	mp_bh->bio.bi_sector += multipath->rdev->data_offset;
 	mp_bh->bio.bi_bdev = multipath->rdev->bdev;
-	mp_bh->bio.bi_rw |= (1 << BIO_RW_FAILFAST);
+	mp_bh->bio.bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
 	mp_bh->bio.bi_end_io = multipath_end_request;
 	mp_bh->bio.bi_private = mp_bh;
 	generic_make_request(&mp_bh->bio);
@@ -398,7 +398,7 @@ static void multipathd (mddev_t *mddev)
 			*bio = *(mp_bh->master_bio);
 			bio->bi_sector += conf->multipaths[mp_bh->path].rdev->data_offset;
 			bio->bi_bdev = conf->multipaths[mp_bh->path].rdev->bdev;
-			bio->bi_rw |= (1 << BIO_RW_FAILFAST);
+			bio->bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
 			bio->bi_end_io = multipath_end_request;
 			bio->bi_private = mp_bh;
 			generic_make_request(bio);
diff --git a/drivers/s390/block/dasd_diag.c b/drivers/s390/block/dasd_diag.c
index 85fcb43..7844461 100644
--- a/drivers/s390/block/dasd_diag.c
+++ b/drivers/s390/block/dasd_diag.c
@@ -544,7 +544,7 @@ static struct dasd_ccw_req *dasd_diag_build_cp(struct dasd_device *memdev,
 	}
 	cqr->retries = DIAG_MAX_RETRIES;
 	cqr->buildclk = get_clock();
-	if (req->cmd_flags & REQ_FAILFAST)
+	if (blk_noretry_request(req))
 		set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
 	cqr->startdev = memdev;
 	cqr->memdev = memdev;
diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
index 773b3fe..b11a221 100644
--- a/drivers/s390/block/dasd_eckd.c
+++ b/drivers/s390/block/dasd_eckd.c
@@ -1683,7 +1683,7 @@ static struct dasd_ccw_req *dasd_eckd_build_cp(struct dasd_device *startdev,
 			recid++;
 		}
 	}
-	if (req->cmd_flags & REQ_FAILFAST)
+	if (blk_noretry_request(req))
 		set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
 	cqr->startdev = startdev;
 	cqr->memdev = startdev;
diff --git a/drivers/s390/block/dasd_fba.c b/drivers/s390/block/dasd_fba.c
index aa0c533..115e032 100644
--- a/drivers/s390/block/dasd_fba.c
+++ b/drivers/s390/block/dasd_fba.c
@@ -355,7 +355,7 @@ static struct dasd_ccw_req *dasd_fba_build_cp(struct dasd_device * memdev,
 			recid++;
 		}
 	}
-	if (req->cmd_flags & REQ_FAILFAST)
+	if (blk_noretry_request(req))
 		set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
 	cqr->startdev = memdev;
 	cqr->memdev = memdev;
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 994da56..6bc55a6 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -109,7 +109,8 @@ static struct request *get_alua_req(struct scsi_device *sdev,
 	}
 
 	rq->cmd_type = REQ_TYPE_BLOCK_PC;
-	rq->cmd_flags |= REQ_FAILFAST | REQ_NOMERGE;
+	rq->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
+			 REQ_FAILFAST_DRIVER | REQ_NOMERGE;
 	rq->retries = ALUA_FAILOVER_RETRIES;
 	rq->timeout = ALUA_FAILOVER_TIMEOUT;
 
diff --git a/drivers/scsi/device_handler/scsi_dh_emc.c b/drivers/scsi/device_handler/scsi_dh_emc.c
index b9d23e9..64a56e5 100644
--- a/drivers/scsi/device_handler/scsi_dh_emc.c
+++ b/drivers/scsi/device_handler/scsi_dh_emc.c
@@ -304,7 +304,8 @@ static struct request *get_req(struct scsi_device *sdev, int cmd,
 
 	rq->cmd[4] = len;
 	rq->cmd_type = REQ_TYPE_BLOCK_PC;
-	rq->cmd_flags |= REQ_FAILFAST;
+	rq->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
+			 REQ_FAILFAST_DRIVER;
 	rq->timeout = CLARIION_TIMEOUT;
 	rq->retries = CLARIION_RETRIES;
 
diff --git a/drivers/scsi/device_handler/scsi_dh_hp_sw.c b/drivers/scsi/device_handler/scsi_dh_hp_sw.c
index a6a4ef3..08ba1ce 100644
--- a/drivers/scsi/device_handler/scsi_dh_hp_sw.c
+++ b/drivers/scsi/device_handler/scsi_dh_hp_sw.c
@@ -112,7 +112,8 @@ static int hp_sw_tur(struct scsi_device *sdev, struct hp_sw_dh_data *h)
 		return SCSI_DH_RES_TEMP_UNAVAIL;
 
 	req->cmd_type = REQ_TYPE_BLOCK_PC;
-	req->cmd_flags |= REQ_FAILFAST;
+	req->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
+			  REQ_FAILFAST_DRIVER;
 	req->cmd_len = COMMAND_SIZE(TEST_UNIT_READY);
 	memset(req->cmd, 0, MAX_COMMAND_SIZE);
 	req->cmd[0] = TEST_UNIT_READY;
@@ -205,7 +206,8 @@ static int hp_sw_start_stop(struct scsi_device *sdev, struct hp_sw_dh_data *h)
 		return SCSI_DH_RES_TEMP_UNAVAIL;
 
 	req->cmd_type = REQ_TYPE_BLOCK_PC;
-	req->cmd_flags |= REQ_FAILFAST;
+	req->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
+			  REQ_FAILFAST_DRIVER;
 	req->cmd_len = COMMAND_SIZE(START_STOP);
 	memset(req->cmd, 0, MAX_COMMAND_SIZE);
 	req->cmd[0] = START_STOP;
diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c
index 2dee69d..c504afe 100644
--- a/drivers/scsi/device_handler/scsi_dh_rdac.c
+++ b/drivers/scsi/device_handler/scsi_dh_rdac.c
@@ -228,7 +228,8 @@ static struct request *get_rdac_req(struct scsi_device *sdev,
 	memset(rq->cmd, 0, BLK_MAX_CDB);
 
 	rq->cmd_type = REQ_TYPE_BLOCK_PC;
-	rq->cmd_flags |= REQ_FAILFAST | REQ_NOMERGE;
+	rq->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
+			 REQ_FAILFAST_DRIVER;
 	rq->retries = RDAC_RETRIES;
 	rq->timeout = RDAC_TIMEOUT;
 
diff --git a/drivers/scsi/scsi_transport_spi.c b/drivers/scsi/scsi_transport_spi.c
index b29360e..7c2d289 100644
--- a/drivers/scsi/scsi_transport_spi.c
+++ b/drivers/scsi/scsi_transport_spi.c
@@ -109,7 +109,9 @@ static int spi_execute(struct scsi_device *sdev, const void *cmd,
 	for(i = 0; i < DV_RETRIES; i++) {
 		result = scsi_execute(sdev, cmd, dir, buffer, bufflen,
 				      sense, DV_TIMEOUT, /* retries */ 1,
-				      REQ_FAILFAST);
+				      REQ_FAILFAST_DEV |
+				      REQ_FAILFAST_TRANSPORT |
+				      REQ_FAILFAST_DRIVER);
 		if (result & DRIVER_SENSE) {
 			struct scsi_sense_hdr sshdr_tmp;
 			if (!sshdr)
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 0933a14..425a4ec 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -147,15 +147,20 @@ struct bio {
  * bit 0 -- read (not set) or write (set)
  * bit 1 -- rw-ahead when set
  * bit 2 -- barrier
- * bit 3 -- fail fast, don't want low level driver retries
- * bit 4 -- synchronous I/O hint: the block layer will unplug immediately
+ * bit 3 -- synchronous I/O hint: the block layer will unplug immediately
+ * bit 4 -- meta data
+ * bit 5 -- fail fast device errors
+ * bit 6 -- fail fast transport errors
+ * bit 7 -- fail fast driver errors
  */
-#define BIO_RW		0
-#define BIO_RW_AHEAD	1
-#define BIO_RW_BARRIER	2
-#define BIO_RW_FAILFAST	3
-#define BIO_RW_SYNC	4
-#define BIO_RW_META	5
+#define BIO_RW				0
+#define BIO_RW_AHEAD			1
+#define BIO_RW_BARRIER			2
+#define BIO_RW_SYNC			3
+#define BIO_RW_META			4
+#define BIO_RW_FAILFAST_DEV		5
+#define BIO_RW_FAILFAST_TRANSPORT	6
+#define BIO_RW_FAILFAST_DRIVER		7
 
 /*
  * upper 16 bits of bi_rw define the io priority of this bio
@@ -182,7 +187,10 @@ struct bio {
 #define bio_sectors(bio)	((bio)->bi_size >> 9)
 #define bio_barrier(bio)	((bio)->bi_rw & (1 << BIO_RW_BARRIER))
 #define bio_sync(bio)		((bio)->bi_rw & (1 << BIO_RW_SYNC))
-#define bio_failfast(bio)	((bio)->bi_rw & (1 << BIO_RW_FAILFAST))
+#define bio_failfast_dev(bio)	((bio)->bi_rw &	(1 << BIO_RW_FAILFAST_DEV))
+#define bio_failfast_transport(bio)	\
+	((bio)->bi_rw & (1 << BIO_RW_FAILFAST_TRANSPORT))
+#define bio_failfast_driver(bio) ((bio)->bi_rw & (1 << BIO_RW_FAILFAST_DRIVER))
 #define bio_rw_ahead(bio)	((bio)->bi_rw & (1 << BIO_RW_AHEAD))
 #define bio_rw_meta(bio)	((bio)->bi_rw & (1 << BIO_RW_META))
 #define bio_empty_barrier(bio)	(bio_barrier(bio) && !(bio)->bi_size)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e61f22b..3f37fb6 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -88,7 +88,9 @@ enum {
  */
 enum rq_flag_bits {
 	__REQ_RW,		/* not set, read. set, write */
-	__REQ_FAILFAST,		/* no low level driver retries */
+	__REQ_FAILFAST_DEV,	/* no driver retries of device errors */
+	__REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
+	__REQ_FAILFAST_DRIVER,	/* no driver retries of driver errors */
 	__REQ_SORTED,		/* elevator knows about this request */
 	__REQ_SOFTBARRIER,	/* may not be passed by ioscheduler */
 	__REQ_HARDBARRIER,	/* may not be passed by drive either */
@@ -111,7 +113,9 @@ enum rq_flag_bits {
 };
 
 #define REQ_RW		(1 << __REQ_RW)
-#define REQ_FAILFAST	(1 << __REQ_FAILFAST)
+#define REQ_FAILFAST_DEV	(1 << __REQ_FAILFAST_DEV)
+#define REQ_FAILFAST_TRANSPORT	(1 << __REQ_FAILFAST_TRANSPORT)
+#define REQ_FAILFAST_DRIVER	(1 << __REQ_FAILFAST_DRIVER)
 #define REQ_SORTED	(1 << __REQ_SORTED)
 #define REQ_SOFTBARRIER	(1 << __REQ_SOFTBARRIER)
 #define REQ_HARDBARRIER	(1 << __REQ_HARDBARRIER)
@@ -523,7 +527,12 @@ enum {
 #define blk_special_request(rq)	((rq)->cmd_type == REQ_TYPE_SPECIAL)
 #define blk_sense_request(rq)	((rq)->cmd_type == REQ_TYPE_SENSE)
 
-#define blk_noretry_request(rq)	((rq)->cmd_flags & REQ_FAILFAST)
+#define blk_failfast_dev(rq)	((rq)->cmd_flags & REQ_FAILFAST_DEV)
+#define blk_failfast_transport(rq) ((rq)->cmd_flags & REQ_FAILFAST_TRANSPORT)
+#define blk_failfast_driver(rq)	((rq)->cmd_flags & REQ_FAILFAST_DRIVER)
+#define blk_noretry_request(rq)	(blk_failfast_dev(rq) ||	\
+				 blk_failfast_transport(rq) ||	\
+				 blk_failfast_driver(rq))
 #define blk_rq_started(rq)	((rq)->cmd_flags & REQ_STARTED)
 
 #define blk_account_rq(rq)	(blk_rq_started(rq) && blk_fs_request(rq))
-- 
1.5.5.1



* [PATCH 02/14] scsi: add transport host byte errors (v3)
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
  2008-09-02 16:05 ` [PATCH 01/14] block: separate failfast into multiple bits Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:05 ` [PATCH 03/14] scsi: Move wait_for check Mike Anderson
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

From: Mike Christie <michaelc@cs.wisc.edu>

Currently, if there is a transport problem the iscsi drivers will return
outstanding commands (commands being executed by the driver/fw/hw) with
DID_BUS_BUSY and block the session so no new commands can be queued.
Commands that are caught between the failure handling and blocking are
failed with DID_IMM_RETRY or one of the scsi ml queuecommand return values.
When the recovery_timeout fires, the iscsi drivers then fail IO with
DID_NO_CONNECT.

For fcp, some drivers will fail some outstanding IO (disk but possibly not
tape) with DID_BUS_BUSY or DID_ERROR or some other value that causes a retry
and hits the scsi_error.c failfast check, block the rport, and commands
caught in the race are failed with DID_IMM_RETRY. Other drivers may
hold onto all IO and wait for the terminate_rport_io or dev_loss_tmo_callbk
to be called.

The following patches attempt to unify what upper layers will see, so that
drivers like multipath can make a good guess. This relies on drivers being
hooked into their transport class.

This first patch just defines two new host byte errors so drivers can
return the same value for when a rport/session is blocked and for
when the fast_io_fail_tmo fires.

The idea is that if the LLD/class detects a problem and is going to block
a rport/session, then if the LLD wants or must return the command to scsi-ml,
then it can return it with DID_TRANSPORT_DISRUPTED. This will requeue
the IO into the same scsi queue it came from, until the fast io fail timer
fires and the class decides what to do.

When using multipath and the fast_io_fail_tmo fires then the class
can fail commands with DID_TRANSPORT_FAILFAST or drivers can use
DID_TRANSPORT_FAILFAST in their terminate_rport_io callbacks or
the equivalent in iscsi if we ever implement more advanced recovery methods.
A LLD, like lpfc, could continue to return DID_ERROR and then it will hit
the normal failfast path, so drivers do not have to be fully ported to
work better. The point of the patches is that upper layers will
not see a failure that could be recovered from while the rport/session is
blocked until fast_io_fail_tmo/recovery_timeout fires.
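
The handling of the two new host bytes can be sketched in userspace as
below. The 0x0d/0x0e/0x0f values come from the include/scsi/scsi.h hunk of
this patch; the sk_ names, the disposition enum values, and the assumption
that the host byte sits in bits 16..23 of the result word are illustrative
and not taken from the patch.

```c
#include <assert.h>

/* Host byte values as defined in the include/scsi/scsi.h hunk. */
#define SK_DID_REQUEUE              0x0d
#define SK_DID_TRANSPORT_DISRUPTED  0x0e
#define SK_DID_TRANSPORT_FAILFAST   0x0f

/* A host byte has to fit in 8 bits. */
static_assert(SK_DID_TRANSPORT_FAILFAST <= 0xff, "host byte is 8 bits wide");

/* Hypothetical disposition codes; the kernel's numeric values differ. */
enum sk_disposition {
	SK_SUCCESS,
	SK_ADD_TO_MLQUEUE,
	SK_FAILED,
};

/* Assumption: the host byte occupies bits 16..23 of the result word. */
static inline int sk_host_byte(unsigned int result)
{
	return (int)((result >> 16) & 0xff);
}

static inline enum sk_disposition sk_transport_disposition(unsigned int result)
{
	switch (sk_host_byte(result)) {
	case SK_DID_TRANSPORT_DISRUPTED:
		/* Port/session is blocked: requeue until the class decides. */
		return SK_ADD_TO_MLQUEUE;
	case SK_DID_TRANSPORT_FAILFAST:
		/* Class gave up on the path: complete the IO upwards. */
		return SK_SUCCESS;
	default:
		/* Other host bytes are out of scope for this sketch;
		 * SK_FAILED is only a placeholder. */
		return SK_FAILED;
	}
}
```

Note that "SUCCESS" here means the command is finished and handed back to
the upper layer with its error status, not that the IO succeeded.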

V3
Remove some comments.
V2
Fixed patch/diff errors and renamed DID_TRANSPORT_BLOCKED to
DID_TRANSPORT_DISRUPTED.
V1
initial patch.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/constants.c  |    3 ++-
 drivers/scsi/scsi_error.c |   15 ++++++++++++++-
 include/scsi/scsi.h       |    5 +++++
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/constants.c b/drivers/scsi/constants.c
index 9785d73..4003dee 100644
--- a/drivers/scsi/constants.c
+++ b/drivers/scsi/constants.c
@@ -1364,7 +1364,8 @@ EXPORT_SYMBOL(scsi_print_sense);
 static const char * const hostbyte_table[]={
 "DID_OK", "DID_NO_CONNECT", "DID_BUS_BUSY", "DID_TIME_OUT", "DID_BAD_TARGET",
 "DID_ABORT", "DID_PARITY", "DID_ERROR", "DID_RESET", "DID_BAD_INTR",
-"DID_PASSTHROUGH", "DID_SOFT_ERROR", "DID_IMM_RETRY", "DID_REQUEUE"};
+"DID_PASSTHROUGH", "DID_SOFT_ERROR", "DID_IMM_RETRY", "DID_REQUEUE",
+"DID_TRANSPORT_DISRUPTED", "DID_TRANSPORT_FAILFAST" };
 #define NUM_HOSTBYTE_STRS ARRAY_SIZE(hostbyte_table)

 static const char * const driverbyte_table[]={
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 880051c..4e30343 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1351,7 +1351,20 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)

 	case DID_REQUEUE:
 		return ADD_TO_MLQUEUE;
-
+	case DID_TRANSPORT_DISRUPTED:
+		/*
+		 * LLD/transport was disrupted during processing of the IO.
+		 * The transport class is now blocked/blocking,
+		 * and the transport will decide what to do with the IO
+		 * based on its timers and recovery capablilities.
+		 */
+		return ADD_TO_MLQUEUE;
+	case DID_TRANSPORT_FAILFAST:
+		/*
+		 * The transport decided to failfast the IO (most likely
+		 * the fast io fail tmo fired), so send IO directly upwards.
+		 */
+		return SUCCESS;
 	case DID_ERROR:
 		if (msg_byte(scmd->result) == COMMAND_COMPLETE &&
 		    status_byte(scmd->result) == RESERVATION_CONFLICT)
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 5c40cc5..8740a16 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -367,6 +367,11 @@ struct scsi_lun {
 #define DID_IMM_RETRY   0x0c	/* Retry without decrementing retry count  */
 #define DID_REQUEUE	0x0d	/* Requeue command (no immediate retry) also
 				 * without decrementing the retry count	   */
+#define DID_TRANSPORT_DISRUPTED 0x0e /* Transport error disrupted execution
+				      * and the driver blocked the port to
+				      * recover the link. Transport class will
+				      * retry or fail IO */
+#define DID_TRANSPORT_FAILFAST	0x0f /* Transport class fastfailed the io */
 #define DRIVER_OK       0x00	/* Driver status                           */

 /*
-- 
1.5.5.1


* [PATCH 03/14] scsi: Move wait_for check
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
  2008-09-02 16:05 ` [PATCH 01/14] block: separate failfast into multiple bits Mike Anderson
  2008-09-02 16:05 ` [PATCH 02/14] scsi: add transport host byte errors (v3) Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:05 ` [PATCH 04/14] scsi: Move retries check Mike Anderson
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

Move wait_for check to scsi_queue_insert.
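
For readers unfamiliar with the check being moved, here is a minimal
userspace sketch of the arithmetic: the total budget is (allowed + 1)
timeouts counted from allocation, compared with a wrap-safe "before" test.
The sk_ names are hypothetical, and sk_time_before is an assumed
re-implementation of the kernel's time_before semantics, not the macro
itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Wrap-safe "a is earlier than b" for free-running tick counters.
 * Relies on the usual two's-complement conversion to signed long. */
static inline bool sk_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * True once a command allocated at tick 'alloc' has used up its whole
 * budget of (allowed + 1) * timeout_per_command ticks at tick 'now'.
 */
static inline bool sk_command_expired(unsigned long alloc, int allowed,
				      unsigned long timeout_per_command,
				      unsigned long now)
{
	unsigned long wait_for;

	assert(allowed >= 0);
	wait_for = ((unsigned long)allowed + 1) * timeout_per_command;
	return sk_time_before(alloc + wait_for, now);
}
```

With allowed = 5 and a 3000-tick timeout the budget is 18000 ticks, so a
command allocated at tick 1000 is still within budget at tick 19000 and
expired at tick 19001.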

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_lib.c |   17 +++++++++--------
 1 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ff5d56b..b340087 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -115,11 +115,20 @@ int scsi_queue_insert(struct scsi_cmnd *cmd, int reason)
 	struct Scsi_Host *host = cmd->device->host;
 	struct scsi_device *device = cmd->device;
 	struct request_queue *q = device->request_queue;
+	unsigned long wait_for = (cmd->allowed + 1) * cmd->timeout_per_command;
 	unsigned long flags;
 
 	SCSI_LOG_MLQUEUE(1,
 		 printk("Inserting command %p into mlqueue\n", cmd));
 
+	if (time_before(cmd->jiffies_at_alloc + wait_for, jiffies)) {
+		sdev_printk(KERN_ERR, cmd->device, "timing out command, "
+			    "waited %lus\n", wait_for/HZ);
+		set_driver_byte(cmd, DRIVER_TIMEOUT);
+		scsi_finish_command(cmd);
+		return 0;
+	}
+
 	/*
 	 * Set the appropriate busy bit for the device/host.
 	 *
@@ -1421,19 +1430,11 @@ static void scsi_kill_request(struct request *req, struct request_queue *q)
 static void scsi_softirq_done(struct request *rq)
 {
 	struct scsi_cmnd *cmd = rq->completion_data;
-	unsigned long wait_for = (cmd->allowed + 1) * cmd->timeout_per_command;
 	int disposition;
 
 	INIT_LIST_HEAD(&cmd->eh_entry);
 
 	disposition = scsi_decide_disposition(cmd);
-	if (disposition != SUCCESS &&
-	    time_before(cmd->jiffies_at_alloc + wait_for, jiffies)) {
-		sdev_printk(KERN_ERR, cmd->device,
-			    "timing out command, waited %lus\n",
-			    wait_for/HZ);
-		disposition = SUCCESS;
-	}
 			
 	scsi_log_completion(cmd, disposition);
 
-- 
1.5.5.1



* [PATCH 04/14] scsi: Move retries check
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (2 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 03/14] scsi: Move wait_for check Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-04 18:27   ` James Bottomley
  2008-09-02 16:05 ` [PATCH 05/14] scsi: Move blk_noretry_request Mike Anderson
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

Move retries check to scsi_queue_insert.
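
A minimal sketch of the retry accounting that ends up in the requeue path,
assuming only the two counters involved. struct sk_cmd and
sk_consume_retry are hypothetical names; the pre-increment comparison
follows the "++cmd->retries > cmd->allowed" test in the scsi_lib.c hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the two scsi_cmnd fields the check uses. */
struct sk_cmd {
	int retries;	/* retries consumed so far */
	int allowed;	/* retries permitted */
};

/*
 * Consume one retry. Returns true if the command may be requeued, false
 * if the budget is exhausted and the command should be finished instead.
 */
static inline bool sk_consume_retry(struct sk_cmd *cmd)
{
	assert(cmd != NULL);
	return ++cmd->retries <= cmd->allowed;
}
```

With allowed = 2 the first two calls permit a requeue and the third does
not, which is the point where the patch completes the command with
DRIVER_TIMEOUT.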

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_error.c |    8 ++------
 drivers/scsi/scsi_lib.c   |    7 +++++++
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 4e30343..eb4290a 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1347,8 +1347,6 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 	case DID_SOFT_ERROR:
 		goto maybe_retry;
 	case DID_IMM_RETRY:
-		return NEEDS_RETRY;
-
 	case DID_REQUEUE:
 		return ADD_TO_MLQUEUE;
 	case DID_TRANSPORT_DISRUPTED:
@@ -1456,8 +1454,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 	 * the request was not marked fast fail.  Note that above,
 	 * even if the request is marked fast fail, we still requeue
 	 * for queue congestion conditions (QUEUE_FULL or BUSY) */
-	if ((++scmd->retries) <= scmd->allowed
-	    && !blk_noretry_request(scmd->request)) {
+	if (!blk_noretry_request(scmd->request)) {
 		return NEEDS_RETRY;
 	} else {
 		/*
@@ -1582,8 +1579,7 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
 	list_for_each_entry_safe(scmd, next, done_q, eh_entry) {
 		list_del_init(&scmd->eh_entry);
 		if (scsi_device_online(scmd->device) &&
-		    !blk_noretry_request(scmd->request) &&
-		    (++scmd->retries <= scmd->allowed)) {
+		    !blk_noretry_request(scmd->request)) {
 			SCSI_LOG_ERROR_RECOVERY(3, printk("%s: flush"
 							  " retry cmd: %p\n",
 							  current->comm,
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b340087..afb4b33 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -146,6 +146,13 @@ int scsi_queue_insert(struct scsi_cmnd *cmd, int reason)
 		host->host_blocked = host->max_host_blocked;
 	else if (reason == SCSI_MLQUEUE_DEVICE_BUSY)
 		device->device_blocked = device->max_device_blocked;
+	else if (reason == SCSI_MLQUEUE_EH_RETRY) {
+		if (++cmd->retries > cmd->allowed) {
+			set_driver_byte(cmd, DRIVER_TIMEOUT);
+			scsi_finish_command(cmd);
+			return 0;
+		}
+	}
 
 	/*
 	 * Decrement the counters, since these commands are no longer
-- 
1.5.5.1



* [PATCH 05/14] scsi: Move blk_noretry_request
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (3 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 04/14] scsi: Move retries check Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:05 ` [PATCH 06/14] scsi: remove maybe_retry Mike Anderson
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

Move blk_noretry_request check to scsi_queue_insert.
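
One detail of the combined test in the scsi_lib.c hunk is that the
failfast check short-circuits, so a failfast command is finished without
its retry counter being touched. A hedged userspace sketch, with
hypothetical sk_ names and a plain bool standing in for
blk_noretry_request:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the command state the combined check looks at. */
struct sk_cmd {
	bool noretry;	/* stand-in for blk_noretry_request(cmd->request) */
	int retries;	/* retries consumed so far */
	int allowed;	/* retries permitted */
};

/*
 * True if the command should be finished rather than requeued. Because
 * of '||' short-circuiting, retries is only incremented when the command
 * is not marked failfast.
 */
static inline bool sk_should_finish(struct sk_cmd *cmd)
{
	assert(cmd != NULL);
	return cmd->noretry || ++cmd->retries > cmd->allowed;
}
```

A failfast command therefore reports "finish" on the first call with
retries still at 0, while a normal command with allowed = 1 is requeued
once and finished on the second call.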

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_error.c |   15 +--------------
 drivers/scsi/scsi_lib.c   |    3 ++-
 2 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index eb4290a..5c112e2 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1449,19 +1449,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 	return FAILED;
 
       maybe_retry:
-
-	/* we requeue for retry because the error was retryable, and
-	 * the request was not marked fast fail.  Note that above,
-	 * even if the request is marked fast fail, we still requeue
-	 * for queue congestion conditions (QUEUE_FULL or BUSY) */
-	if (!blk_noretry_request(scmd->request)) {
 		return NEEDS_RETRY;
-	} else {
-		/*
-		 * no more retries - report this one back to upper level.
-		 */
-		return SUCCESS;
-	}
 }
 
 /**
@@ -1578,8 +1566,7 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
 
 	list_for_each_entry_safe(scmd, next, done_q, eh_entry) {
 		list_del_init(&scmd->eh_entry);
-		if (scsi_device_online(scmd->device) &&
-		    !blk_noretry_request(scmd->request)) {
+		if (scsi_device_online(scmd->device)) {
 			SCSI_LOG_ERROR_RECOVERY(3, printk("%s: flush"
 							  " retry cmd: %p\n",
 							  current->comm,
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index afb4b33..a085973 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -147,7 +147,8 @@ int scsi_queue_insert(struct scsi_cmnd *cmd, int reason)
 	else if (reason == SCSI_MLQUEUE_DEVICE_BUSY)
 		device->device_blocked = device->max_device_blocked;
 	else if (reason == SCSI_MLQUEUE_EH_RETRY) {
-		if (++cmd->retries > cmd->allowed) {
+		if (blk_noretry_request(cmd->request) ||
+		    ++cmd->retries > cmd->allowed) {
 			set_driver_byte(cmd, DRIVER_TIMEOUT);
 			scsi_finish_command(cmd);
 			return 0;
-- 
1.5.5.1



* [PATCH 06/14] scsi: remove maybe_retry
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (4 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 05/14] scsi: Move blk_noretry_request Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:05 ` [PATCH 07/14] scsi: change return codes in scsi_decide_disposition Mike Anderson
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

Remove maybe_retry from scsi_decide_disposition.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_error.c |   16 +++-------------
 1 files changed, 3 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 5c112e2..ffe5e70 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1295,7 +1295,6 @@ static void scsi_eh_offline_sdevs(struct list_head *work_q,
  */
 int scsi_decide_disposition(struct scsi_cmnd *scmd)
 {
-	int rtn;
 
 	/*
 	 * if the device is offline, then we clearly just pass the result back
@@ -1345,7 +1344,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * and not get stuck in a loop.
 		 */
 	case DID_SOFT_ERROR:
-		goto maybe_retry;
+		return NEEDS_RETRY;
 	case DID_IMM_RETRY:
 	case DID_REQUEUE:
 		return ADD_TO_MLQUEUE;
@@ -1375,7 +1374,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 
 	case DID_BUS_BUSY:
 	case DID_PARITY:
-		goto maybe_retry;
+		return NEEDS_RETRY;
 	case DID_TIME_OUT:
 		/*
 		 * when we scan the bus, we get timeout messages for
@@ -1422,14 +1421,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 	case TASK_ABORTED:
 		return SUCCESS;
 	case CHECK_CONDITION:
-		rtn = scsi_check_sense(scmd);
-		if (rtn == NEEDS_RETRY)
-			goto maybe_retry;
-		/* if rtn == FAILED, we have no sense information;
-		 * returning FAILED will wake the error handler thread
-		 * to collect the sense and redo the decide
-		 * disposition */
-		return rtn;
+		return scsi_check_sense(scmd);
 	case CONDITION_GOOD:
 	case INTERMEDIATE_GOOD:
 	case INTERMEDIATE_C_GOOD:
@@ -1448,8 +1440,6 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 	}
 	return FAILED;
 
-      maybe_retry:
-		return NEEDS_RETRY;
 }
 
 /**
-- 
1.5.5.1



* [PATCH 07/14] scsi: change return codes in scsi_decide_disposition
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (5 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 06/14] scsi: remove maybe_retry Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:05 ` [PATCH 08/14] scsi: rename scsi_queue_insert to scsi_attempt_requeue_command Mike Anderson
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

Create new return codes for scsi_decide_disposition
and scsi_check_sense so that retry restrictions and
disposition can be implied directly from the return
code.
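
As a rough illustration of what "implied directly from the return code"
means, here is a small userspace sketch (not kernel code). The enum values
are copied from the scsi.h hunk below; the sketch_* names are made up for
this example and only mirror the three branches that scsi_softirq_done
takes after this patch.

```c
/* Userspace sketch; enum values copied from this patch's scsi.h hunk. */
enum {
	SCSI_IGN_ALLOWED	= 0x01,	/* skip the cmd->allowed check */
	SCSI_IGN_BLK_FAILFAST	= 0x02,	/* skip the block failfast check */

	SCSI_MLQUEUE_DIS_FINISH	= 0x10,
	SCSI_MLQUEUE_DIS_RETRY	= 0x20,
	SCSI_MLQUEUE_DIS_FAIL	= 0x40,

	SCSI_MLQUEUE_IMM_RETRY	= 0x102 | SCSI_MLQUEUE_DIS_RETRY |
		SCSI_IGN_BLK_FAILFAST | SCSI_IGN_ALLOWED,
};

/* Hypothetical labels for the three outcomes in scsi_softirq_done(). */
enum sketch_action {
	SKETCH_FINISH,		/* scsi_finish_command() */
	SKETCH_REQUEUE,		/* scsi_queue_insert(cmd, disposition) */
	SKETCH_ESCALATE_TO_EH,	/* scsi_eh_scmd_add() */
};

/* The disposition bits alone select the branch; no switch is needed. */
static enum sketch_action sketch_dispatch(int dis)
{
	if (dis & SCSI_MLQUEUE_DIS_FINISH)
		return SKETCH_FINISH;
	if (dis & SCSI_MLQUEUE_DIS_RETRY)
		return SKETCH_REQUEUE;
	return SKETCH_ESCALATE_TO_EH;
}
```

Because SCSI_MLQUEUE_IMM_RETRY carries the DIS_RETRY bit plus both IGN
bits, the same value can be passed on to the requeue function, which then
knows to skip the retry-count and failfast checks.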

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_error.c |   80 +++++++++++++++++++++++----------------------
 drivers/scsi/scsi_lib.c   |   33 +++++++++---------
 include/scsi/scsi.h       |   30 +++++++++++++++--
 3 files changed, 85 insertions(+), 58 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index ffe5e70..6b8cbb1 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -290,7 +290,9 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host *shost,
  * @scmd:	Cmd to have sense checked.
  *
  * Return value:
- * 	SUCCESS or FAILED or NEEDS_RETRY
+ * 	SCSI_MLQUEUE_DIS_FINISH
+ * 	SCSI_MLQUEUE_DIS_RETRY
+ * 	SCSI_MLQUEUE_DIS_FAIL
  *
  * Notes:
  *	When a deferred error is detected the current command has
@@ -302,10 +304,10 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 	struct scsi_sense_hdr sshdr;
 
 	if (! scsi_command_normalize_sense(scmd, &sshdr))
-		return FAILED;	/* no valid sense data */
+		return SCSI_MLQUEUE_DIS_FAIL;	/* no valid sense data */
 
 	if (scsi_sense_is_deferred(&sshdr))
-		return NEEDS_RETRY;
+		return SCSI_MLQUEUE_DIS_RETRY;
 
 	if (sdev->scsi_dh_data && sdev->scsi_dh_data->scsi_dh &&
 			sdev->scsi_dh_data->scsi_dh->check_sense) {
@@ -324,7 +326,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 	if (sshdr.response_code == 0x70) {
 		/* fixed format */
 		if (scmd->sense_buffer[2] & 0xe0)
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 	} else {
 		/*
 		 * descriptor format: look for "stream commands sense data
@@ -334,20 +336,20 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		if ((sshdr.additional_length > 3) &&
 		    (scmd->sense_buffer[8] == 0x4) &&
 		    (scmd->sense_buffer[11] & 0xe0))
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 	}
 
 	switch (sshdr.sense_key) {
 	case NO_SENSE:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case RECOVERED_ERROR:
-		return /* soft_error */ SUCCESS;
+		return /* soft_error */ SCSI_MLQUEUE_DIS_FINISH;
 
 	case ABORTED_COMMAND:
 		if (sshdr.asc == 0x10) /* DIF */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 
-		return NEEDS_RETRY;
+		return SCSI_MLQUEUE_DIS_RETRY;
 	case NOT_READY:
 	case UNIT_ATTENTION:
 		/*
@@ -358,48 +360,48 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		 */
 		if (scmd->device->expecting_cc_ua) {
 			scmd->device->expecting_cc_ua = 0;
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_DIS_RETRY;
 		}
 		/*
 		 * if the device is in the process of becoming ready, we 
 		 * should retry.
 		 */
 		if ((sshdr.asc == 0x04) && (sshdr.ascq == 0x01))
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_DIS_RETRY;
 		/*
 		 * if the device is not started, we need to wake
 		 * the error handler to start the motor
 		 */
 		if (scmd->device->allow_restart &&
 		    (sshdr.asc == 0x04) && (sshdr.ascq == 0x02))
-			return FAILED;
-		return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FAIL;
+		return SCSI_MLQUEUE_DIS_FINISH;
 
 		/* these three are not supported */
 	case COPY_ABORTED:
 	case VOLUME_OVERFLOW:
 	case MISCOMPARE:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 
 	case MEDIUM_ERROR:
 		if (sshdr.asc == 0x11 || /* UNRECOVERED READ ERR */
 		    sshdr.asc == 0x13 || /* AMNF DATA FIELD */
 		    sshdr.asc == 0x14) { /* RECORD NOT FOUND */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		}
-		return NEEDS_RETRY;
+		return SCSI_MLQUEUE_DIS_RETRY;
 
 	case HARDWARE_ERROR:
 		if (scmd->device->retry_hwerror)
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_DIS_RETRY;
 		else
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 
 	case ILLEGAL_REQUEST:
 	case BLANK_CHECK:
 	case DATA_PROTECT:
 	default:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	}
 }
 
@@ -1304,7 +1306,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		SCSI_LOG_ERROR_RECOVERY(5, printk("%s: device offline - report"
 						  " as SUCCESS\n",
 						  __func__));
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	}
 
 	/*
@@ -1319,7 +1321,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * did_ok.
 		 */
 		scmd->result &= 0xff00ffff;
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case DID_OK:
 		/*
 		 * looks good.  drop through, and check the next byte.
@@ -1333,7 +1335,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * to the top level driver, not that we actually think
 		 * that it indicates SUCCESS.
 		 */
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 		/*
 		 * when the low level driver returns did_soft_error,
 		 * it is responsible for keeping an internal retry counter 
@@ -1344,10 +1346,10 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * and not get stuck in a loop.
 		 */
 	case DID_SOFT_ERROR:
-		return NEEDS_RETRY;
+		return SCSI_MLQUEUE_DIS_RETRY;
 	case DID_IMM_RETRY:
 	case DID_REQUEUE:
-		return ADD_TO_MLQUEUE;
+		return SCSI_MLQUEUE_IMM_RETRY;
 	case DID_TRANSPORT_DISRUPTED:
 		/*
 		 * LLD/transport was disrupted during processing of the IO.
@@ -1355,13 +1357,13 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * and the transport will decide what to do with the IO
 		 * based on its timers and recovery capablilities.
 		 */
-		return ADD_TO_MLQUEUE;
+		return SCSI_MLQUEUE_IMM_RETRY;
 	case DID_TRANSPORT_FAILFAST:
 		/*
 		 * The transport decided to failfast the IO (most likely
 		 * the fast io fail tmo fired), so send IO directly upwards.
 		 */
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case DID_ERROR:
 		if (msg_byte(scmd->result) == COMMAND_COMPLETE &&
 		    status_byte(scmd->result) == RESERVATION_CONFLICT)
@@ -1374,7 +1376,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 
 	case DID_BUS_BUSY:
 	case DID_PARITY:
-		return NEEDS_RETRY;
+		return SCSI_MLQUEUE_DIS_RETRY;
 	case DID_TIME_OUT:
 		/*
 		 * when we scan the bus, we get timeout messages for
@@ -1383,21 +1385,21 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 */
 		if ((scmd->cmnd[0] == TEST_UNIT_READY ||
 		     scmd->cmnd[0] == INQUIRY)) {
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		} else {
-			return FAILED;
+			return SCSI_MLQUEUE_DIS_FAIL;
 		}
 	case DID_RESET:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	default:
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 	}
 
 	/*
 	 * next, check the message byte.
 	 */
 	if (msg_byte(scmd->result) != COMMAND_COMPLETE)
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 
 	/*
 	 * check the status byte to see if this indicates anything special.
@@ -1415,11 +1417,11 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * the empty queue handling to trigger a stall in the
 		 * device.
 		 */
-		return ADD_TO_MLQUEUE;
+		return SCSI_MLQUEUE_IMM_RETRY;
 	case GOOD:
 	case COMMAND_TERMINATED:
 	case TASK_ABORTED:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case CHECK_CONDITION:
 		return scsi_check_sense(scmd);
 	case CONDITION_GOOD:
@@ -1429,16 +1431,16 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		/*
 		 * who knows?  FIXME(eric)
 		 */
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 
 	case RESERVATION_CONFLICT:
 		sdev_printk(KERN_INFO, scmd->device,
 			    "reservation conflict\n");
-		return SUCCESS; /* causes immediate i/o error */
+		return SCSI_MLQUEUE_DIS_FINISH; /* causes immediate i/o error */
 	default:
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 	}
-	return FAILED;
+	return SCSI_MLQUEUE_DIS_FAIL;
 
 }
 
@@ -1561,7 +1563,7 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
 							  " retry cmd: %p\n",
 							  current->comm,
 							  scmd));
-				scsi_queue_insert(scmd, SCSI_MLQUEUE_EH_RETRY);
+			scsi_queue_insert(scmd, SCSI_MLQUEUE_DIS_RETRY);
 		} else {
 			/*
 			 * If just we got sense for the device (called
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index a085973..51a737f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -146,9 +146,17 @@ int scsi_queue_insert(struct scsi_cmnd *cmd, int reason)
 		host->host_blocked = host->max_host_blocked;
 	else if (reason == SCSI_MLQUEUE_DEVICE_BUSY)
 		device->device_blocked = device->max_device_blocked;
-	else if (reason == SCSI_MLQUEUE_EH_RETRY) {
-		if (blk_noretry_request(cmd->request) ||
-		    ++cmd->retries > cmd->allowed) {
+
+	if (!scsi_ign_failfast(reason)) {
+		if (blk_noretry_request(cmd->request)) {
+			set_driver_byte(cmd, DRIVER_TIMEOUT);
+			scsi_finish_command(cmd);
+			return 0;
+		}
+	}
+
+	if (!scsi_ign_cmd_retries(reason)) {
+		if (++cmd->retries > cmd->allowed) {
 			set_driver_byte(cmd, DRIVER_TIMEOUT);
 			scsi_finish_command(cmd);
 			return 0;
@@ -1446,20 +1454,13 @@ static void scsi_softirq_done(struct request *rq)
 			
 	scsi_log_completion(cmd, disposition);
 
-	switch (disposition) {
-		case SUCCESS:
+	if (scsi_disposition_finish(disposition))
+		scsi_finish_command(cmd);
+	else if (scsi_disposition_retry(disposition))
+		scsi_queue_insert(cmd, disposition);
+	else
+		if (!scsi_eh_scmd_add(cmd, 0))
 			scsi_finish_command(cmd);
-			break;
-		case NEEDS_RETRY:
-			scsi_queue_insert(cmd, SCSI_MLQUEUE_EH_RETRY);
-			break;
-		case ADD_TO_MLQUEUE:
-			scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
-			break;
-		default:
-			if (!scsi_eh_scmd_add(cmd, 0))
-				scsi_finish_command(cmd);
-	}
 }
 
 /*
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index 8740a16..a3083de 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -414,9 +414,33 @@ struct scsi_lun {
 /*
  * Midlevel queue return values.
  */
-#define SCSI_MLQUEUE_HOST_BUSY   0x1055
-#define SCSI_MLQUEUE_DEVICE_BUSY 0x1056
-#define SCSI_MLQUEUE_EH_RETRY    0x1057
+
+enum {
+	/*
+	 * Retry Constraints
+	 *
+	 * SCSI_IGN_ALLOWED		: Ignore cmd retries allowed check
+	 * SCSI_IGN_BLK_FAILFAST	: Ignore blk_failfast check.
+	 */
+	SCSI_IGN_ALLOWED	= 0x01,
+	SCSI_IGN_BLK_FAILFAST	= 0x02,
+
+	SCSI_MLQUEUE_DIS_FINISH	= 0x10,
+	SCSI_MLQUEUE_DIS_RETRY	= 0x20,
+	SCSI_MLQUEUE_DIS_FAIL	= 0x40,
+
+	SCSI_MLQUEUE_HOST_BUSY		= 0x100 | SCSI_MLQUEUE_DIS_RETRY |
+		SCSI_IGN_BLK_FAILFAST | SCSI_IGN_ALLOWED,
+	SCSI_MLQUEUE_DEVICE_BUSY	= 0x101 | SCSI_MLQUEUE_DIS_RETRY |
+		SCSI_IGN_BLK_FAILFAST | SCSI_IGN_ALLOWED,
+	SCSI_MLQUEUE_IMM_RETRY		= 0x102 | SCSI_MLQUEUE_DIS_RETRY |
+		SCSI_IGN_BLK_FAILFAST | SCSI_IGN_ALLOWED,
+};
+
+#define scsi_disposition_finish(dis) (dis & SCSI_MLQUEUE_DIS_FINISH)
+#define scsi_disposition_retry(dis) (dis & SCSI_MLQUEUE_DIS_RETRY)
+#define scsi_ign_cmd_retries(dis) (dis & SCSI_IGN_ALLOWED)
+#define scsi_ign_failfast(dis) (dis & SCSI_IGN_BLK_FAILFAST)
 
 /*
  *  Use these to separate status msg and our bytes
-- 
1.5.5.1



* [PATCH 08/14] scsi: rename scsi_queue_insert to scsi_attempt_requeue_command
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (6 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 07/14] scsi: change return codes in scsi_decide_disposition Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:05 ` [PATCH 09/14] scsi: have device handlers return SCSI_MLQUEUE error value Mike Anderson
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

From: Mike Christie <michaelc@cs.wisc.edu>

This renames scsi_queue_insert to scsi_attempt_requeue_command, because
it may not be able to requeue the command if there are no retries left
or if failfast is set. The naming is also meant to better match
the other requeue function, scsi_requeue_command, which will always
requeue a command.
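
To make the "attempt" part concrete, the following userspace sketch models
the two checks that can stop a requeue, as they stand after patch 07. It is
only a model: struct sketch_cmd and sketch_attempt_requeue are hypothetical
names, the noretry field stands in for blk_noretry_request(), and the real
function also sets DRIVER_TIMEOUT and finishes the command when it gives up.

```c
#include <stdbool.h>

/* Values copied from the scsi.h hunk in patch 07. */
enum {
	SCSI_IGN_ALLOWED	= 0x01,
	SCSI_IGN_BLK_FAILFAST	= 0x02,
	SCSI_MLQUEUE_DIS_RETRY	= 0x20,
};

/* Hypothetical stand-in for the fields of struct scsi_cmnd used here. */
struct sketch_cmd {
	int retries;	/* retries performed so far */
	int allowed;	/* retries permitted */
	bool noretry;	/* models blk_noretry_request(cmd->request) */
};

/* Returns true if the command would be requeued, false if it is finished. */
static bool sketch_attempt_requeue(struct sketch_cmd *cmd, int reason)
{
	/* Failfast requests are not retried unless the reason says to ignore it. */
	if (!(reason & SCSI_IGN_BLK_FAILFAST) && cmd->noretry)
		return false;

	/* The retry counter only advances when the reason respects the limit. */
	if (!(reason & SCSI_IGN_ALLOWED) && ++cmd->retries > cmd->allowed)
		return false;

	return true;
}
```

A busy-type reason that sets both IGN bits always requeues in this model and
leaves the retry counter untouched, while a plain SCSI_MLQUEUE_DIS_RETRY is
subject to both limits.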

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/scsi.c       |    4 ++--
 drivers/scsi/scsi_error.c |    3 ++-
 drivers/scsi/scsi_lib.c   |   22 ++++++++++++++--------
 drivers/scsi/scsi_priv.h  |    2 +-
 4 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index ee6be59..b87fbb2 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -672,7 +672,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
 		 * future requests should not occur until the device 
 		 * transitions out of the suspend state.
 		 */
-		scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
+		scsi_attempt_requeue_command(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
 
 		SCSI_LOG_MLQUEUE(3, printk("queuecommand : device blocked \n"));
 
@@ -756,7 +756,7 @@ int scsi_dispatch_cmd(struct scsi_cmnd *cmd)
 	if (rtn) {
 		if (scsi_delete_timer(cmd)) {
 			atomic_inc(&cmd->device->iodone_cnt);
-			scsi_queue_insert(cmd,
+			scsi_attempt_requeue_command(cmd,
 					  (rtn == SCSI_MLQUEUE_DEVICE_BUSY) ?
 					  rtn : SCSI_MLQUEUE_HOST_BUSY);
 		}
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 6b8cbb1..45c7d24 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1563,7 +1563,8 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
 							  " retry cmd: %p\n",
 							  current->comm,
 							  scmd));
-			scsi_queue_insert(scmd, SCSI_MLQUEUE_DIS_RETRY);
+			scsi_attempt_requeue_command(scmd,
+						     SCSI_MLQUEUE_DIS_RETRY);
 		} else {
 			/*
 			 * If just we got sense for the device (called
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 51a737f..9bbc11d 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -92,9 +92,9 @@ static void scsi_unprep_request(struct request *req)
 }
 
 /*
- * Function:    scsi_queue_insert()
+ * Function:    scsi_attempt_requeue_command()
  *
- * Purpose:     Insert a command in the midlevel queue.
+ * Purpose:     Attempt to insert a command in the midlevel queue.
  *
  * Arguments:   cmd    - command that we are adding to queue.
  *              reason - why we are inserting command to queue.
@@ -103,14 +103,20 @@ static void scsi_unprep_request(struct request *req)
  *
  * Returns:     Nothing.
  *
- * Notes:       We do this for one of two cases.  Either the host is busy
- *              and it cannot accept any more commands for the time being,
- *              or the device returned QUEUE_FULL and can accept no more
- *              commands.
+ * Notes:       We do this for multiple cases.
+ *
+ *		Host or device queueing:
+ *		Either the host or device is busy and it cannot accept any more
+ *		commands for the time being.
+ *
+ * 		SCSI error processing:
+ * 		The scsi-eh has decided to requeue a command after getting
+ * 		a command it believes is retryable.
+ *
  * Notes:       This could be called either from an interrupt context or a
  *              normal process context.
  */
-int scsi_queue_insert(struct scsi_cmnd *cmd, int reason)
+int scsi_attempt_requeue_command(struct scsi_cmnd *cmd, int reason)
 {
 	struct Scsi_Host *host = cmd->device->host;
 	struct scsi_device *device = cmd->device;
@@ -1457,7 +1463,7 @@ static void scsi_softirq_done(struct request *rq)
 	if (scsi_disposition_finish(disposition))
 		scsi_finish_command(cmd);
 	else if (scsi_disposition_retry(disposition))
-		scsi_queue_insert(cmd, disposition);
+		scsi_attempt_requeue_command(cmd, disposition);
 	else
 		if (!scsi_eh_scmd_add(cmd, 0))
 			scsi_finish_command(cmd);
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 79f0f75..e702513 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -66,7 +66,7 @@ int scsi_eh_get_sense(struct list_head *work_q,
 /* scsi_lib.c */
 extern int scsi_maybe_unblock_host(struct scsi_device *sdev);
 extern void scsi_device_unbusy(struct scsi_device *sdev);
-extern int scsi_queue_insert(struct scsi_cmnd *cmd, int reason);
+extern int scsi_attempt_requeue_command(struct scsi_cmnd *cmd, int reason);
 extern void scsi_next_command(struct scsi_cmnd *cmd);
 extern void scsi_io_completion(struct scsi_cmnd *, unsigned int);
 extern void scsi_run_host_queues(struct Scsi_Host *shost);
-- 
1.5.5.1



* [PATCH 09/14] scsi: have device handlers return SCSI_MLQUEUE error value
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (7 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 08/14] scsi: rename scsi_queue_insert to scsi_attempt_requeue_command Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:05 ` [PATCH 10/14] scsi: convert other scsi_check_sense users to new error codes Mike Anderson
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

From: Mike Christie <michaelc@cs.wisc.edu>

This has the scsi device handlers return SCSI_MLQUEUE error
values instead of the old values.

One change is that if the handler does not care we return 0
instead of SCSI_RETURN_NOT_HANDLED.
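
The calling convention can be sketched in userspace as follows. This is an
illustration only: sketch_handler and sketch_check_sense are hypothetical,
the enum values are taken from patch 07, and the default path is reduced to
a single return where the real scsi_check_sense goes on to examine the
sense key.

```c
#include <stddef.h>

/* Values as defined in patch 07's scsi.h hunk. */
enum {
	SCSI_IGN_ALLOWED	= 0x01,
	SCSI_IGN_BLK_FAILFAST	= 0x02,
	SCSI_MLQUEUE_DIS_FINISH	= 0x10,
	SCSI_MLQUEUE_DIS_RETRY	= 0x20,
	SCSI_MLQUEUE_IMM_RETRY	= 0x102 | SCSI_MLQUEUE_DIS_RETRY |
		SCSI_IGN_BLK_FAILFAST | SCSI_IGN_ALLOWED,
};

/* Hypothetical handler: retries on power on/reset (29/00), else no opinion. */
static int sketch_handler(int asc, int ascq)
{
	if (asc == 0x29 && ascq == 0x00)
		return SCSI_MLQUEUE_IMM_RETRY;
	return 0;	/* handler does not care */
}

/* Models the call site: a non-zero handler result wins, 0 falls through. */
static int sketch_check_sense(int (*handler)(int, int), int asc, int ascq)
{
	if (handler) {
		int rc = handler(asc, ascq);

		if (rc)
			return rc;
	}
	/* Stand-in for the default sense key handling. */
	return SCSI_MLQUEUE_DIS_FINISH;
}
```

Using 0 as the "not handled" value works because every SCSI_MLQUEUE code
shown in this series is non-zero.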

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/device_handler/scsi_dh_alua.c |   16 ++++++++--------
 drivers/scsi/device_handler/scsi_dh_emc.c  |    8 ++++----
 drivers/scsi/device_handler/scsi_dh_rdac.c |   10 +++++-----
 drivers/scsi/scsi_error.c                  |    2 +-
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
index 6bc55a6..b98d531 100644
--- a/drivers/scsi/device_handler/scsi_dh_alua.c
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c
@@ -426,45 +426,45 @@ static int alua_check_sense(struct scsi_device *sdev,
 			/*
 			 * LUN Not Accessible - ALUA state transition
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x0b)
 			/*
 			 * LUN Not Accessible -- Target port in standby state
 			 */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x0c)
 			/*
 			 * LUN Not Accessible -- Target port in unavailable state
 			 */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x12)
 			/*
 			 * LUN Not Ready -- Offline
 			 */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		break;
 	case UNIT_ATTENTION:
 		if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00)
 			/*
 			 * Power On, Reset, or Bus Device Reset, just retry.
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		if (sense_hdr->asc == 0x2a && sense_hdr->ascq == 0x06) {
 			/*
 			 * ALUA state changed
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		}
 		if (sense_hdr->asc == 0x2a && sense_hdr->ascq == 0x07) {
 			/*
 			 * Implicit ALUA state transition failed
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		}
 		break;
 	}
 
-	return SCSI_RETURN_NOT_HANDLED;
+	return 0;
 }
 
 /*
diff --git a/drivers/scsi/device_handler/scsi_dh_emc.c b/drivers/scsi/device_handler/scsi_dh_emc.c
index 64a56e5..4be7be3 100644
--- a/drivers/scsi/device_handler/scsi_dh_emc.c
+++ b/drivers/scsi/device_handler/scsi_dh_emc.c
@@ -418,7 +418,7 @@ static int clariion_check_sense(struct scsi_device *sdev,
 			 * Can return FAILED only when we want the error
 			 * recovery process to kick in.
 			 */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		break;
 	case ILLEGAL_REQUEST:
 		if (sense_hdr->asc == 0x25 && sense_hdr->ascq == 0x01)
@@ -432,7 +432,7 @@ static int clariion_check_sense(struct scsi_device *sdev,
 			 * Can return FAILED only when we want the error
 			 * recovery process to kick in.
 			 */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		break;
 	case UNIT_ATTENTION:
 		if (sense_hdr->asc == 0x29 && sense_hdr->ascq == 0x00)
@@ -440,11 +440,11 @@ static int clariion_check_sense(struct scsi_device *sdev,
 			 * Unit Attention Code. This is the first IO
 			 * to the new path, so just retry.
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		break;
 	}
 
-	return SCSI_RETURN_NOT_HANDLED;
+	return 0;
 }
 
 static int clariion_prep_fn(struct scsi_device *sdev, struct request *req)
diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c
index c504afe..9b1c2d5 100644
--- a/drivers/scsi/device_handler/scsi_dh_rdac.c
+++ b/drivers/scsi/device_handler/scsi_dh_rdac.c
@@ -546,13 +546,13 @@ static int rdac_check_sense(struct scsi_device *sdev,
 			 *
 			 * Nothing we can do here. Try to bypass the path.
 			 */
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		if (sense_hdr->asc == 0x04 && sense_hdr->ascq == 0xA1)
 			/* LUN Not Ready - Quiescense in progress
 			 *
 			 * Just retry and wait.
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		break;
 	case ILLEGAL_REQUEST:
 		if (sense_hdr->asc == 0x94 && sense_hdr->ascq == 0x01) {
@@ -561,7 +561,7 @@ static int rdac_check_sense(struct scsi_device *sdev,
 			 * Fail the path, so that the other path be used.
 			 */
 			h->state = RDAC_STATE_PASSIVE;
-			return SUCCESS;
+			return SCSI_MLQUEUE_DIS_FINISH;
 		}
 		break;
 	case UNIT_ATTENTION:
@@ -569,11 +569,11 @@ static int rdac_check_sense(struct scsi_device *sdev,
 			/*
 			 * Power On, Reset, or Bus Device Reset, just retry.
 			 */
-			return NEEDS_RETRY;
+			return SCSI_MLQUEUE_IMM_RETRY;
 		break;
 	}
 	/* success just means we do not care what scsi-ml does */
-	return SCSI_RETURN_NOT_HANDLED;
+	return 0;
 }
 
 static const struct scsi_dh_devlist rdac_dev_list[] = {
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 45c7d24..b0e4b42 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -314,7 +314,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		int rc;
 
 		rc = sdev->scsi_dh_data->scsi_dh->check_sense(sdev, &sshdr);
-		if (rc != SCSI_RETURN_NOT_HANDLED)
+		if (rc)
 			return rc;
 		/* handler does not care. Drop down to default handling */
 	}
-- 
1.5.5.1



* [PATCH 10/14] scsi: convert other scsi_check_sense users to new error codes
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (8 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 09/14] scsi: have device handlers return SCSI_MLQUEUE error value Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:05 ` [PATCH 11/14] scsi: fix up SCSI_MLQUEUE defintions and add driver, device and transport ones Mike Anderson
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

From: Mike Christie <michaelc@cs.wisc.edu>

scsi_eh_completed_normally and the other scsi_check_sense users expect an
old-style FAILED/SUCCESS type of error code, but scsi_check_sense now
returns the new SCSI_MLQUEUE ones. This converts those users to the new codes.
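
For the one place that still needs an old-style result (scsi_send_eh_cmnd),
the translation amounts to the mapping sketched below. This is a userspace
illustration: the SKETCH_* enumerators are placeholders for the kernel's
SUCCESS, NEEDS_RETRY and FAILED constants and do not carry their real
values, and sketch_to_old is a hypothetical helper name.

```c
/* Disposition bits as defined in patch 07. */
enum {
	SCSI_MLQUEUE_DIS_FINISH	= 0x10,
	SCSI_MLQUEUE_DIS_RETRY	= 0x20,
	SCSI_MLQUEUE_DIS_FAIL	= 0x40,
};

/* Placeholders for the old SUCCESS / NEEDS_RETRY / FAILED codes. */
enum sketch_old {
	SKETCH_SUCCESS,
	SKETCH_NEEDS_RETRY,
	SKETCH_FAILED,
};

/* Collapse a new-style code to an old-style one; unknown values fail. */
static enum sketch_old sketch_to_old(int rtn)
{
	if (rtn & SCSI_MLQUEUE_DIS_FINISH)
		return SKETCH_SUCCESS;
	if (rtn & SCSI_MLQUEUE_DIS_RETRY)
		return SKETCH_NEEDS_RETRY;
	return SKETCH_FAILED;
}
```

Anything without a finish or retry bit, including 0, ends up as a failure,
which matches the default case of the switch being replaced.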

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_error.c |   27 ++++++++++++---------------
 1 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index b0e4b42..16bca8b 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -431,13 +431,13 @@ static int scsi_eh_completed_normally(struct scsi_cmnd *scmd)
 		return scsi_check_sense(scmd);
 	}
 	if (host_byte(scmd->result) != DID_OK)
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 
 	/*
 	 * next, check the message byte.
 	 */
 	if (msg_byte(scmd->result) != COMMAND_COMPLETE)
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 
 	/*
 	 * now, check the status byte to see if this indicates
@@ -446,7 +446,7 @@ static int scsi_eh_completed_normally(struct scsi_cmnd *scmd)
 	switch (status_byte(scmd->result)) {
 	case GOOD:
 	case COMMAND_TERMINATED:
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case CHECK_CONDITION:
 		return scsi_check_sense(scmd);
 	case CONDITION_GOOD:
@@ -455,14 +455,14 @@ static int scsi_eh_completed_normally(struct scsi_cmnd *scmd)
 		/*
 		 * who knows?  FIXME(eric)
 		 */
-		return SUCCESS;
+		return SCSI_MLQUEUE_DIS_FINISH;
 	case BUSY:
 	case QUEUE_FULL:
 	case RESERVATION_CONFLICT:
 	default:
-		return FAILED;
+		return SCSI_MLQUEUE_DIS_FAIL;
 	}
-	return FAILED;
+	return SCSI_MLQUEUE_DIS_FAIL;
 }
 
 /**
@@ -790,15 +790,12 @@ static int scsi_send_eh_cmnd(struct scsi_cmnd *scmd, unsigned char *cmnd,
 			printk("%s: scsi_eh_completed_normally %x\n",
 			       __func__, rtn));
 
-		switch (rtn) {
-		case SUCCESS:
-		case NEEDS_RETRY:
-		case FAILED:
-			break;
-		default:
+		if (scsi_disposition_finish(rtn))
+			rtn = SUCCESS;
+		else if (scsi_disposition_retry(rtn))
+			rtn = NEEDS_RETRY;
+		else
 			rtn = FAILED;
-			break;
-		}
 	} else {
 		scsi_abort_eh_cmnd(scmd);
 		rtn = FAILED;
@@ -1026,7 +1023,7 @@ static int scsi_eh_stu(struct Scsi_Host *shost,
 		stu_scmd = NULL;
 		list_for_each_entry(scmd, work_q, eh_entry)
 			if (scmd->device == sdev && SCSI_SENSE_VALID(scmd) &&
-			    scsi_check_sense(scmd) == FAILED ) {
+			    scsi_check_sense(scmd) == SCSI_MLQUEUE_DIS_FAIL) {
 				stu_scmd = scmd;
 				break;
 			}
-- 
1.5.5.1



* [PATCH 11/14] scsi: fix up SCSI_MLQUEUE defintions and add driver, device and transport ones
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (9 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 10/14] scsi: convert other scsi_check_sense users to new error codes Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:05 ` [PATCH 12/14] scsi: move device online check to scsi_attempt_requeue_command Mike Anderson
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

From: Mike Christie <michaelc@cs.wisc.edu>

This adds new driver, device and transport errors and fixes up
the SCSI_MLQUEUE definitions so there are no collisions.
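
As far as I can tell from the arithmetic, the collision looks like this.
The userspace sketch below recomputes the patch 07 values (renamed with a
hypothetical OLD_ prefix) and the shifted busy identifiers from this patch
(hypothetical NEW_..._ID names, covering only the shifted field, not the
full definitions).

```c
enum {
	SCSI_IGN_ALLOWED	= 0x01,
	SCSI_IGN_BLK_FAILFAST	= 0x02,

	/* Patch 07 layout: low bits of 0x101/0x102 overlap the IGN flags. */
	OLD_DIS_RETRY		= 0x20,
	OLD_HOST_BUSY		= 0x100 | OLD_DIS_RETRY |
		SCSI_IGN_BLK_FAILFAST | SCSI_IGN_ALLOWED,
	OLD_DEVICE_BUSY		= 0x101 | OLD_DIS_RETRY |
		SCSI_IGN_BLK_FAILFAST | SCSI_IGN_ALLOWED,
	OLD_IMM_RETRY		= 0x102 | OLD_DIS_RETRY |
		SCSI_IGN_BLK_FAILFAST | SCSI_IGN_ALLOWED,

	/* This patch: the busy identifier gets its own field above bit 8. */
	SCSI_MLQUEUE_BUSY_SHIFT	= 8,
	NEW_HOST_BUSY_ID	= 0x01 << SCSI_MLQUEUE_BUSY_SHIFT,
	NEW_DEVICE_BUSY_ID	= 0x02 << SCSI_MLQUEUE_BUSY_SHIFT,
};

/* Extract the busy identifier field from a new-style code. */
static int sketch_busy_field(int code)
{
	return code >> SCSI_MLQUEUE_BUSY_SHIFT;
}
```

With the old layout all three OLD_ values evaluate to 0x123, so an equality
test against one of them cannot tell them apart. With the shifted layout the
identifiers stay out of the low byte used by the IGN and DIS bits.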

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_error.c |   22 +++++++++++-----------
 drivers/scsi/scsi_lib.c   |   26 ++++++++++++++++++++------
 include/scsi/scsi.h       |   34 +++++++++++++++++++++++-----------
 3 files changed, 54 insertions(+), 28 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 16bca8b..417f119 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -307,7 +307,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		return SCSI_MLQUEUE_DIS_FAIL;	/* no valid sense data */
 
 	if (scsi_sense_is_deferred(&sshdr))
-		return SCSI_MLQUEUE_DIS_RETRY;
+		return SCSI_MLQUEUE_DIS_DEV_RETRY;
 
 	if (sdev->scsi_dh_data && sdev->scsi_dh_data->scsi_dh &&
 			sdev->scsi_dh_data->scsi_dh->check_sense) {
@@ -349,7 +349,7 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		if (sshdr.asc == 0x10) /* DIF */
 			return SCSI_MLQUEUE_DIS_FINISH;
 
-		return SCSI_MLQUEUE_DIS_RETRY;
+		return SCSI_MLQUEUE_DIS_DEV_RETRY;
 	case NOT_READY:
 	case UNIT_ATTENTION:
 		/*
@@ -360,14 +360,14 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		 */
 		if (scmd->device->expecting_cc_ua) {
 			scmd->device->expecting_cc_ua = 0;
-			return SCSI_MLQUEUE_DIS_RETRY;
+			return SCSI_MLQUEUE_DIS_DEV_RETRY;
 		}
 		/*
 		 * if the device is in the process of becoming ready, we 
 		 * should retry.
 		 */
 		if ((sshdr.asc == 0x04) && (sshdr.ascq == 0x01))
-			return SCSI_MLQUEUE_DIS_RETRY;
+			return SCSI_MLQUEUE_DIS_DEV_RETRY;
 		/*
 		 * if the device is not started, we need to wake
 		 * the error handler to start the motor
@@ -389,11 +389,11 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
 		    sshdr.asc == 0x14) { /* RECORD NOT FOUND */
 			return SCSI_MLQUEUE_DIS_FINISH;
 		}
-		return SCSI_MLQUEUE_DIS_RETRY;
+		return SCSI_MLQUEUE_DIS_DEV_RETRY;
 
 	case HARDWARE_ERROR:
 		if (scmd->device->retry_hwerror)
-			return SCSI_MLQUEUE_DIS_RETRY;
+			return SCSI_MLQUEUE_DIS_DEV_RETRY;
 		else
 			return SCSI_MLQUEUE_DIS_FINISH;
 
@@ -1343,7 +1343,7 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 		 * and not get stuck in a loop.
 		 */
 	case DID_SOFT_ERROR:
-		return SCSI_MLQUEUE_DIS_RETRY;
+		return SCSI_MLQUEUE_DIS_DRV_RETRY;
 	case DID_IMM_RETRY:
 	case DID_REQUEUE:
 		return SCSI_MLQUEUE_IMM_RETRY;
@@ -1369,11 +1369,11 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 			 * lower down
 			 */
 			break;
-		/* fallthrough */
-
-	case DID_BUS_BUSY:
+		/* fall through */
 	case DID_PARITY:
-		return SCSI_MLQUEUE_DIS_RETRY;
+		return SCSI_MLQUEUE_DIS_DEV_RETRY;
+	case DID_BUS_BUSY:
+		return SCSI_MLQUEUE_DIS_XPT_RETRY;
 	case DID_TIME_OUT:
 		/*
 		 * when we scan the bus, we get timeout messages for
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9bbc11d..3f01015 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -153,14 +153,28 @@ int scsi_attempt_requeue_command(struct scsi_cmnd *cmd, int reason)
 	else if (reason == SCSI_MLQUEUE_DEVICE_BUSY)
 		device->device_blocked = device->max_device_blocked;
 
-	if (!scsi_ign_failfast(reason)) {
-		if (blk_noretry_request(cmd->request)) {
-			set_driver_byte(cmd, DRIVER_TIMEOUT);
-			scsi_finish_command(cmd);
-			return 0;
-		}
+	if (!scsi_ign_failfast(reason) && scsi_disposition_retry(reason)) {
+		if (reason & SCSI_MLQUEUE_DIS_XPT_RETRY) {
+			if (!blk_failfast_transport(cmd->request))
+				goto check_retries;
+		} else if (reason & SCSI_MLQUEUE_DIS_DEV_RETRY) {
+			if (!blk_failfast_dev(cmd->request))
+				goto check_retries;
+		} else if (reason & SCSI_MLQUEUE_DIS_DRV_RETRY) {
+			if (!blk_failfast_driver(cmd->request))
+				goto check_retries;
+		} else if (reason & SCSI_MLQUEUE_DIS_RETRY) {
+			if (!blk_noretry_request(cmd->request))
+				goto check_retries;
+		} else
+			goto check_retries;
+
+		set_driver_byte(cmd, DRIVER_TIMEOUT);
+		scsi_finish_command(cmd);
+		return 0;
 	}
 
+check_retries:
 	if (!scsi_ign_cmd_retries(reason)) {
 		if (++cmd->retries > cmd->allowed) {
 			set_driver_byte(cmd, DRIVER_TIMEOUT);
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index a3083de..cacb12c 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -425,20 +425,32 @@ enum {
 	SCSI_IGN_ALLOWED	= 0x01,
 	SCSI_IGN_BLK_FAILFAST	= 0x02,
 
-	SCSI_MLQUEUE_DIS_FINISH	= 0x10,
-	SCSI_MLQUEUE_DIS_RETRY	= 0x20,
-	SCSI_MLQUEUE_DIS_FAIL	= 0x40,
-
-	SCSI_MLQUEUE_HOST_BUSY		= 0x100 | SCSI_MLQUEUE_DIS_RETRY |
-		SCSI_IGN_BLK_FAILFAST | SCSI_IGN_ALLOWED,
-	SCSI_MLQUEUE_DEVICE_BUSY	= 0x101 | SCSI_MLQUEUE_DIS_RETRY |
-		SCSI_IGN_BLK_FAILFAST | SCSI_IGN_ALLOWED,
-	SCSI_MLQUEUE_IMM_RETRY		= 0x102 | SCSI_MLQUEUE_DIS_RETRY |
-		SCSI_IGN_BLK_FAILFAST | SCSI_IGN_ALLOWED,
+	SCSI_MLQUEUE_DIS_SHIFT		= 4,
+	SCSI_MLQUEUE_DIS_FINISH		= 0x01 << SCSI_MLQUEUE_DIS_SHIFT,
+	SCSI_MLQUEUE_DIS_RETRY		= 0x02 << SCSI_MLQUEUE_DIS_SHIFT,
+	SCSI_MLQUEUE_DIS_XPT_RETRY	= 0x04 << SCSI_MLQUEUE_DIS_SHIFT,
+	SCSI_MLQUEUE_DIS_DEV_RETRY	= 0x08 << SCSI_MLQUEUE_DIS_SHIFT,
+	SCSI_MLQUEUE_DIS_DRV_RETRY	= 0x10 << SCSI_MLQUEUE_DIS_SHIFT,
+	SCSI_MLQUEUE_DIS_FAIL		= 0x20 << SCSI_MLQUEUE_DIS_SHIFT,
+
+	SCSI_MLQUEUE_BUSY_SHIFT		= 8,
+	SCSI_MLQUEUE_HOST_BUSY		= (0x01 << SCSI_MLQUEUE_BUSY_SHIFT) |
+		SCSI_MLQUEUE_DIS_RETRY | SCSI_IGN_BLK_FAILFAST |
+		SCSI_IGN_ALLOWED,
+	SCSI_MLQUEUE_DEVICE_BUSY	= (0x02 << SCSI_MLQUEUE_BUSY_SHIFT) |
+		SCSI_MLQUEUE_DIS_RETRY | SCSI_IGN_BLK_FAILFAST |
+		SCSI_IGN_ALLOWED,
+	SCSI_MLQUEUE_IMM_RETRY		= (0x04 << SCSI_MLQUEUE_BUSY_SHIFT) |
+		SCSI_MLQUEUE_DIS_RETRY | SCSI_IGN_BLK_FAILFAST |
+		SCSI_IGN_ALLOWED,
 };
 
 #define scsi_disposition_finish(dis) (dis & SCSI_MLQUEUE_DIS_FINISH)
-#define scsi_disposition_retry(dis) (dis & SCSI_MLQUEUE_DIS_RETRY)
+#define scsi_disposition_retry(dis)			\
+	((dis & SCSI_MLQUEUE_DIS_RETRY)		||	\
+	 (dis & SCSI_MLQUEUE_DIS_XPT_RETRY)	||	\
+	 (dis & SCSI_MLQUEUE_DIS_DEV_RETRY)	||	\
+	 (dis & SCSI_MLQUEUE_DIS_DRV_RETRY))
 #define scsi_ign_cmd_retries(dis) (dis & SCSI_IGN_ALLOWED)
 #define scsi_ign_failfast(dis) (dis & SCSI_IGN_BLK_FAILFAST)
 
-- 
1.5.5.1
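
For reference, the bit layout from the scsi.h hunk above can be checked in user space. This is only a sketch: the enum values are copied from the hunk, while disposition_retry(), ign_failfast() and ign_cmd_retries() are stand-in functions for the macros and are not kernel symbols.

```c
/* Values copied from the include/scsi/scsi.h hunk in this patch. */
enum {
	SCSI_IGN_ALLOWED		= 0x01,
	SCSI_IGN_BLK_FAILFAST		= 0x02,

	SCSI_MLQUEUE_DIS_SHIFT		= 4,
	SCSI_MLQUEUE_DIS_FINISH		= 0x01 << SCSI_MLQUEUE_DIS_SHIFT,
	SCSI_MLQUEUE_DIS_RETRY		= 0x02 << SCSI_MLQUEUE_DIS_SHIFT,
	SCSI_MLQUEUE_DIS_XPT_RETRY	= 0x04 << SCSI_MLQUEUE_DIS_SHIFT,
	SCSI_MLQUEUE_DIS_DEV_RETRY	= 0x08 << SCSI_MLQUEUE_DIS_SHIFT,
	SCSI_MLQUEUE_DIS_DRV_RETRY	= 0x10 << SCSI_MLQUEUE_DIS_SHIFT,
	SCSI_MLQUEUE_DIS_FAIL		= 0x20 << SCSI_MLQUEUE_DIS_SHIFT,

	SCSI_MLQUEUE_BUSY_SHIFT		= 8,
	SCSI_MLQUEUE_HOST_BUSY		= (0x01 << SCSI_MLQUEUE_BUSY_SHIFT) |
		SCSI_MLQUEUE_DIS_RETRY | SCSI_IGN_BLK_FAILFAST |
		SCSI_IGN_ALLOWED,
	SCSI_MLQUEUE_DEVICE_BUSY	= (0x02 << SCSI_MLQUEUE_BUSY_SHIFT) |
		SCSI_MLQUEUE_DIS_RETRY | SCSI_IGN_BLK_FAILFAST |
		SCSI_IGN_ALLOWED,
	SCSI_MLQUEUE_IMM_RETRY		= (0x04 << SCSI_MLQUEUE_BUSY_SHIFT) |
		SCSI_MLQUEUE_DIS_RETRY | SCSI_IGN_BLK_FAILFAST |
		SCSI_IGN_ALLOWED,
};

/* Stand-in for scsi_disposition_retry(): true for any of the retry bits. */
int disposition_retry(int dis)
{
	return (dis & SCSI_MLQUEUE_DIS_RETRY) ||
	       (dis & SCSI_MLQUEUE_DIS_XPT_RETRY) ||
	       (dis & SCSI_MLQUEUE_DIS_DEV_RETRY) ||
	       (dis & SCSI_MLQUEUE_DIS_DRV_RETRY);
}

/* Stand-in for scsi_ign_failfast(): nonzero if failfast is to be ignored. */
int ign_failfast(int dis)
{
	return dis & SCSI_IGN_BLK_FAILFAST;
}

/* Stand-in for scsi_ign_cmd_retries(): nonzero if the retry limit is ignored. */
int ign_cmd_retries(int dis)
{
	return dis & SCSI_IGN_ALLOWED;
}
```

Evaluating the values as posted gives SCSI_MLQUEUE_DIS_DRV_RETRY = 0x100 and SCSI_MLQUEUE_HOST_BUSY = 0x123, so the host-busy identifier bit (0x01 << 8) lands on the same bit as DIS_DRV_RETRY, and the device-busy bit (0x02 << 8) on the same bit as DIS_FAIL. In scsi_attempt_requeue_command this looks harmless, because the busy codes carry SCSI_IGN_BLK_FAILFAST and skip the per-class test, but it may be worth keeping in mind for any code that tests those bits on their own.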



* [PATCH 12/14] scsi: move device online check to scsi_attempt_requeue_command
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (10 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 11/14] scsi: fix up SCSI_MLQUEUE defintions and add driver, device and transport ones Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:05 ` [PATCH 13/14] scsi: remove scsi_device_online from scsi_decide_disposition Mike Anderson
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

Move the device online check from scsi_eh_flush_done_q to
scsi_attempt_requeue_command so that other callers can share it, which
also simplifies scsi_eh_flush_done_q. To keep the behavior similar, also
move the setting of DRIVER_TIMEOUT for a zero result value to
scsi_eh_finish_cmd.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_error.c |   26 ++++++--------------------
 drivers/scsi/scsi_lib.c   |    6 ++++++
 2 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 417f119..c9b5598 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -835,6 +835,8 @@ void scsi_eh_finish_cmd(struct scsi_cmnd *scmd, struct list_head *done_q)
 {
 	scmd->device->host->host_failed--;
 	scmd->eh_eflags = 0;
+	if (!scmd->result)
+		scmd->result |= (DRIVER_TIMEOUT << 24);
 	list_move_tail(&scmd->eh_entry, done_q);
 }
 EXPORT_SYMBOL(scsi_eh_finish_cmd);
@@ -1555,26 +1557,10 @@ void scsi_eh_flush_done_q(struct list_head *done_q)
 
 	list_for_each_entry_safe(scmd, next, done_q, eh_entry) {
 		list_del_init(&scmd->eh_entry);
-		if (scsi_device_online(scmd->device)) {
-			SCSI_LOG_ERROR_RECOVERY(3, printk("%s: flush"
-							  " retry cmd: %p\n",
-							  current->comm,
-							  scmd));
-			scsi_attempt_requeue_command(scmd,
-						     SCSI_MLQUEUE_DIS_RETRY);
-		} else {
-			/*
-			 * If just we got sense for the device (called
-			 * scsi_eh_get_sense), scmd->result is already
-			 * set, do not set DRIVER_TIMEOUT.
-			 */
-			if (!scmd->result)
-				scmd->result |= (DRIVER_TIMEOUT << 24);
-			SCSI_LOG_ERROR_RECOVERY(3, printk("%s: flush finish"
-							" cmd: %p\n",
-							current->comm, scmd));
-			scsi_finish_command(scmd);
-		}
+		SCSI_LOG_ERROR_RECOVERY(3, printk("%s: flush"
+						  " attempt retry cmd: %p\n",
+						  current->comm, scmd));
+		scsi_attempt_requeue_command(scmd, SCSI_MLQUEUE_DIS_RETRY);
 	}
 }
 EXPORT_SYMBOL(scsi_eh_flush_done_q);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 3f01015..38c118f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -135,6 +135,12 @@ int scsi_attempt_requeue_command(struct scsi_cmnd *cmd, int reason)
 		return 0;
 	}
 
+	if (!scsi_device_online(cmd->device)) {
+		set_driver_byte(cmd, DRIVER_TIMEOUT);
+		scsi_finish_command(cmd);
+		return 0;
+	}
+
 	/*
 	 * Set the appropriate busy bit for the device/host.
 	 *
-- 
1.5.5.1
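
Taken together with the per-class failfast test from patch 11, the decision that scsi_attempt_requeue_command makes after this patch can be modeled in user space roughly as below. This is a sketch under assumptions: struct model_cmd, the FF_* flags, enum outcome and attempt_requeue() are hypothetical names, the DIS_* values mirror the SCSI_MLQUEUE_DIS_* values in the series, the busy-bit handling and the check that precedes the new online test are left out, and the "retries exhausted" path is assumed to finish the command (the hunk context ends before that is visible).

```c
#include <stdbool.h>

/* Disposition and ignore bits, values as in the series' scsi.h hunk. */
enum {
	IGN_ALLOWED	= 0x01,
	IGN_FAILFAST	= 0x02,
	DIS_RETRY	= 0x02 << 4,
	DIS_XPT_RETRY	= 0x04 << 4,
	DIS_DEV_RETRY	= 0x08 << 4,
	DIS_DRV_RETRY	= 0x10 << 4,
};

/* Hypothetical stand-ins for the REQ_FAILFAST_* request flags. */
enum { FF_DEV = 1, FF_TRANSPORT = 2, FF_DRIVER = 4 };

/* Hypothetical, reduced view of a command; not struct scsi_cmnd. */
struct model_cmd {
	bool online;		/* what scsi_device_online() would report */
	unsigned int failfast;	/* FF_* bits set on the request */
	int retries;		/* attempts so far */
	int allowed;		/* retry limit */
};

enum outcome { FINISH_TIMEOUT, REQUEUE };

/* Does the request ask to fail fast for this class of retry? */
static bool wants_failfast(const struct model_cmd *cmd, int reason)
{
	if (reason & DIS_XPT_RETRY)
		return cmd->failfast & FF_TRANSPORT;
	if (reason & DIS_DEV_RETRY)
		return cmd->failfast & FF_DEV;
	if (reason & DIS_DRV_RETRY)
		return cmd->failfast & FF_DRIVER;
	if (reason & DIS_RETRY)
		return cmd->failfast != 0;	/* like blk_noretry_request() */
	return false;
}

enum outcome attempt_requeue(struct model_cmd *cmd, int reason)
{
	bool is_retry = reason & (DIS_RETRY | DIS_XPT_RETRY |
				  DIS_DEV_RETRY | DIS_DRV_RETRY);

	/* Patch 12: an offline device never gets a retry. */
	if (!cmd->online)
		return FINISH_TIMEOUT;

	/* Patch 11: fail fast only for the matching error class. */
	if (!(reason & IGN_FAILFAST) && is_retry && wants_failfast(cmd, reason))
		return FINISH_TIMEOUT;

	/* Retry budget, unless the reason says to ignore it. */
	if (!(reason & IGN_ALLOWED) && ++cmd->retries > cmd->allowed)
		return FINISH_TIMEOUT;

	return REQUEUE;
}
```

With only FF_TRANSPORT set, as the multipath layers do, a DIS_XPT_RETRY reason finishes the command at once, while a DIS_DEV_RETRY reason is still requeued until the retry limit is reached.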



* [PATCH 13/14] scsi: remove scsi_device_online from scsi_decide_disposition
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (11 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 12/14] scsi: move device online check to scsi_attempt_requeue_command Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 16:05 ` [PATCH 14/14] scsi: update scsi_log_completion disposition decoding Mike Anderson
  2008-09-02 17:03 ` [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Christie
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

Remove the scsi_device_online check from scsi_decide_disposition, as
the check is now performed in scsi_attempt_requeue_command when a
retry is attempted.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_error.c |   11 -----------
 1 files changed, 0 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c9b5598..446352d 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1298,17 +1298,6 @@ int scsi_decide_disposition(struct scsi_cmnd *scmd)
 {
 
 	/*
-	 * if the device is offline, then we clearly just pass the result back
-	 * up to the top level.
-	 */
-	if (!scsi_device_online(scmd->device)) {
-		SCSI_LOG_ERROR_RECOVERY(5, printk("%s: device offline - report"
-						  " as SUCCESS\n",
-						  __func__));
-		return SCSI_MLQUEUE_DIS_FINISH;
-	}
-
-	/*
 	 * first check the host byte, to see if there is anything in there
 	 * that would indicate what we need to do.
 	 */
-- 
1.5.5.1



* [PATCH 14/14] scsi: update scsi_log_completion disposition decoding
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (12 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 13/14] scsi: remove scsi_device_online from scsi_decide_disposition Mike Anderson
@ 2008-09-02 16:05 ` Mike Anderson
  2008-09-02 17:03 ` [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Christie
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 16:05 UTC (permalink / raw)
  To: linux-scsi; +Cc: Mike Christie

Update scsi_log_completion to decode new disposition codes.

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
---
 drivers/scsi/scsi.c |   41 ++++++++++++++++++++++-------------------
 include/scsi/scsi.h |    1 +
 2 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index b87fbb2..a4f56af 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -578,37 +578,40 @@ void scsi_log_completion(struct scsi_cmnd *cmd, int disposition)
 	if (unlikely(scsi_logging_level)) {
 		level = SCSI_LOG_LEVEL(SCSI_LOG_MLCOMPLETE_SHIFT,
 				       SCSI_LOG_MLCOMPLETE_BITS);
-		if (((level > 0) && (cmd->result || disposition != SUCCESS)) ||
+		if (((level > 0) &&
+		    (cmd->result || !scsi_disposition_finish(disposition))) ||
 		    (level > 1)) {
 			scmd_printk(KERN_INFO, cmd, "Done: ");
 			if (level > 2)
 				printk("0x%p ", cmd);
+
 			/*
 			 * Dump truncated values, so we usually fit within
 			 * 80 chars.
 			 */
-			switch (disposition) {
-			case SUCCESS:
+			if (scsi_disposition_finish(disposition))
 				printk("SUCCESS\n");
-				break;
-			case NEEDS_RETRY:
+			else if (scsi_disposition_retry(disposition))
 				printk("RETRY\n");
-				break;
-			case ADD_TO_MLQUEUE:
-				printk("MLQUEUE\n");
-				break;
-			case FAILED:
+			else if (scsi_disposition_fail(disposition))
 				printk("FAILED\n");
-				break;
-			case TIMEOUT_ERROR:
-				/* 
-				 * If called via scsi_times_out.
-				 */
-				printk("TIMEOUT\n");
-				break;
-			default:
-				printk("UNKNOWN\n");
+			else {
+				switch (disposition) {
+				case SUCCESS:
+					printk("SUCCESS\n");
+					break;
+				case TIMEOUT_ERROR:
+					/*
+					 * If called via scsi_times_out.
+					 */
+					printk("TIMEOUT\n");
+					break;
+				default:
+					printk("UNKNOWN: 0x%x\n",
+					       disposition);
+				}
 			}
+
 			scsi_print_result(cmd);
 			scsi_print_command(cmd);
 			if (status_byte(cmd->result) & CHECK_CONDITION)
diff --git a/include/scsi/scsi.h b/include/scsi/scsi.h
index cacb12c..c7f317b 100644
--- a/include/scsi/scsi.h
+++ b/include/scsi/scsi.h
@@ -446,6 +446,7 @@ enum {
 };
 
 #define scsi_disposition_finish(dis) (dis & SCSI_MLQUEUE_DIS_FINISH)
+#define scsi_disposition_fail(dis) (dis & SCSI_MLQUEUE_DIS_FAIL)
 #define scsi_disposition_retry(dis)			\
 	((dis & SCSI_MLQUEUE_DIS_RETRY)		||	\
 	 (dis & SCSI_MLQUEUE_DIS_XPT_RETRY)	||	\
-- 
1.5.5.1
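
The decoding order used in scsi_log_completion after this patch can be sketched in user space as follows. Assumptions: disposition_name() is a hypothetical helper, the DIS_* values mirror SCSI_MLQUEUE_DIS_* from patch 11, and the LEGACY_* values are what I believe SUCCESS and TIMEOUT_ERROR are defined as in include/scsi/scsi.h, so they should be double-checked against the tree.

```c
/* Disposition bits, values as in the series' scsi.h hunk. */
enum {
	DIS_FINISH	= 0x01 << 4,
	DIS_RETRY	= 0x02 << 4,
	DIS_XPT_RETRY	= 0x04 << 4,
	DIS_DEV_RETRY	= 0x08 << 4,
	DIS_DRV_RETRY	= 0x10 << 4,
	DIS_FAIL	= 0x20 << 4,
	DIS_ANY_RETRY	= DIS_RETRY | DIS_XPT_RETRY |
			  DIS_DEV_RETRY | DIS_DRV_RETRY,
};

/* Old-style codes; the values are assumed, not taken from this series. */
enum { LEGACY_SUCCESS = 0x2002, LEGACY_TIMEOUT_ERROR = 0x2007 };

/*
 * Map a disposition to the word the log prints. The order matches the
 * if/else chain in the patch: finish, then retry, then fail, and only
 * then the old-style codes.
 */
const char *disposition_name(int disposition)
{
	if (disposition & DIS_FINISH)
		return "SUCCESS";
	if (disposition & DIS_ANY_RETRY)
		return "RETRY";
	if (disposition & DIS_FAIL)
		return "FAILED";

	switch (disposition) {
	case LEGACY_SUCCESS:
		return "SUCCESS";
	case LEGACY_TIMEOUT_ERROR:
		return "TIMEOUT";
	default:
		return "UNKNOWN";
	}
}
```

Testing retry before fail matters for the composite busy codes: SCSI_MLQUEUE_DEVICE_BUSY evaluates to 0x223, which contains both the DIS_RETRY bit and bit 0x200, and with this order it is reported as RETRY.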



* Re: [PATCH 01/14] block: separate failfast into multiple bits.
  2008-09-02 16:05 ` [PATCH 01/14] block: separate failfast into multiple bits Mike Anderson
@ 2008-09-02 16:35   ` Grant Grundler
  2008-09-02 16:59     ` Mike Christie
  0 siblings, 1 reply; 23+ messages in thread
From: Grant Grundler @ 2008-09-02 16:35 UTC (permalink / raw)
  To: Mike Anderson
  Cc: linux-scsi, Mike Christie, Jens Axboe, Alasdair G Kergon,
	Neil Brown, Martin Schwidefsky

On Tue, Sep 2, 2008 at 9:05 AM, Mike Anderson
<andmike@linux.vnet.ibm.com> wrote:
> From: Mike Christie <michaelc@cs.wisc.edu>
>
> Multipath is best at handling transport errors. If it gets a device
> error then there is not much the multipath layer can do. It will just
> access the same device but from a different path.
>
> This patch breaks up failfast into device, transport and driver errors.

Is there any document that describes what those errors are for each
class of transport?

This is great work though... I'm looking forward to a storage subsystem
where each level can cooperate with the ones above it.

> The multipath layers (md and dm multipath) only ask the lower levels to
> fast fail transport errors. The user of failfast, read ahead, will ask
> to fast fail on all errors.
>
> Note that blk_noretry_request will return true if any failfast bit
> is set. This allows drivers that do not support the multipath failfast
> bits to continue to fail on any failfast error like before. Drivers
> like scsi that are able to fail fast specific errors can check
> for the specific fail fast type. In the next patch I will convert
> scsi.
>
> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> Cc: Alasdair G Kergon <agk@redhat.com>
> Cc: Neil Brown <neilb@suse.de>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
> ---
>  block/blk-core.c                            |   11 +++++++++--
>  drivers/md/dm-mpath.c                       |    2 +-
>  drivers/md/multipath.c                      |    4 ++--
>  drivers/s390/block/dasd_diag.c              |    2 +-
>  drivers/s390/block/dasd_eckd.c              |    2 +-
>  drivers/s390/block/dasd_fba.c               |    2 +-
>  drivers/scsi/device_handler/scsi_dh_alua.c  |    3 ++-
>  drivers/scsi/device_handler/scsi_dh_emc.c   |    3 ++-
>  drivers/scsi/device_handler/scsi_dh_hp_sw.c |    6 ++++--
>  drivers/scsi/device_handler/scsi_dh_rdac.c  |    3 ++-
>  drivers/scsi/scsi_transport_spi.c           |    4 +++-
>  include/linux/bio.h                         |   26 +++++++++++++++++---------
>  include/linux/blkdev.h                      |   15 ++++++++++++---
>  13 files changed, 57 insertions(+), 26 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 4889eb8..f3c29d0 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1073,8 +1073,15 @@ void init_request_from_bio(struct request *req, struct bio *bio)
>        /*
>         * inherit FAILFAST from bio (for read-ahead, and explicit FAILFAST)
>         */
> -       if (bio_rw_ahead(bio) || bio_failfast(bio))
> -               req->cmd_flags |= REQ_FAILFAST;
> +       if (bio_rw_ahead(bio))
> +               req->cmd_flags |= (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
> +                                  REQ_FAILFAST_DRIVER);
> +       if (bio_failfast_dev(bio))
> +               req->cmd_flags |= REQ_FAILFAST_DEV;
> +       if (bio_failfast_transport(bio))
> +               req->cmd_flags |= REQ_FAILFAST_TRANSPORT;
> +       if (bio_failfast_driver(bio))
> +               req->cmd_flags |= REQ_FAILFAST_DRIVER;

This is open source.
Why can't something like this be done?
    req->cmd_flags |= bio_failfast_flags(bio);

I'm assuming the REQ_FAILFAST_* flags can be equated to the same bits
in the bio layer.
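
A minimal sketch of what such a helper might look like, for illustration only. bio_failfast_flags() is the hypothetical name from the suggestion above; here it takes the bi_rw word instead of a struct bio so that the example stands alone, and bio_failfast_flags_shift() is a second hypothetical variant. The bit numbers are the ones introduced by the quoted patch (bio bits 5..7, request bits 1..3), so the two sets of flags are not the same bits and need translating.

```c
/* Bit numbers in bio->bi_rw after the quoted patch (include/linux/bio.h). */
enum {
	BIO_RW_AHEAD			= 1,
	BIO_RW_FAILFAST_DEV		= 5,
	BIO_RW_FAILFAST_TRANSPORT	= 6,
	BIO_RW_FAILFAST_DRIVER		= 7,
};

/* Request flags after the quoted patch (enum rq_flag_bits, bits 1..3). */
enum {
	REQ_FAILFAST_DEV	= 1 << 1,
	REQ_FAILFAST_TRANSPORT	= 1 << 2,
	REQ_FAILFAST_DRIVER	= 1 << 3,
	REQ_FAILFAST_MASK	= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
				  REQ_FAILFAST_DRIVER,
};

/* Explicit mapping: one test per bit, same result as the patch's if chain. */
unsigned long bio_failfast_flags(unsigned long bi_rw)
{
	unsigned long flags = 0;

	if (bi_rw & (1UL << BIO_RW_AHEAD))	/* read-ahead: all classes */
		flags |= REQ_FAILFAST_MASK;
	if (bi_rw & (1UL << BIO_RW_FAILFAST_DEV))
		flags |= REQ_FAILFAST_DEV;
	if (bi_rw & (1UL << BIO_RW_FAILFAST_TRANSPORT))
		flags |= REQ_FAILFAST_TRANSPORT;
	if (bi_rw & (1UL << BIO_RW_FAILFAST_DRIVER))
		flags |= REQ_FAILFAST_DRIVER;
	return flags;
}

/*
 * Shift variant: relies on the three bits being contiguous and in the
 * same order in both layouts (bio bit 5 maps to request bit 1).
 */
unsigned long bio_failfast_flags_shift(unsigned long bi_rw)
{
	unsigned long flags = (bi_rw >> (BIO_RW_FAILFAST_DEV - 1)) &
			      REQ_FAILFAST_MASK;

	if (bi_rw & (1UL << BIO_RW_AHEAD))
		flags |= REQ_FAILFAST_MASK;
	return flags;
}
```

Both variants give the same result for the numbering above. The shift variant is shorter, but it silently depends on the relative order of the bits in two different headers.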


>
>        /*
>         * REQ_BARRIER implies no merging, but lets make it explicit
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 71dd65a..b48e201 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -827,7 +827,7 @@ static int multipath_map(struct dm_target *ti, struct bio *bio,
>        dm_bio_record(&mpio->details, bio);
>
>        map_context->ptr = mpio;
> -       bio->bi_rw |= (1 << BIO_RW_FAILFAST);
> +       bio->bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
>        r = map_io(m, bio, mpio, 0);
>        if (r < 0 || r == DM_MAPIO_REQUEUE)
>                mempool_free(mpio, m->mpio_pool);
> diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
> index c4779cc..2426201 100644
> --- a/drivers/md/multipath.c
> +++ b/drivers/md/multipath.c
> @@ -172,7 +172,7 @@ static int multipath_make_request (struct request_queue *q, struct bio * bio)
>        mp_bh->bio = *bio;
>        mp_bh->bio.bi_sector += multipath->rdev->data_offset;
>        mp_bh->bio.bi_bdev = multipath->rdev->bdev;
> -       mp_bh->bio.bi_rw |= (1 << BIO_RW_FAILFAST);
> +       mp_bh->bio.bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
>        mp_bh->bio.bi_end_io = multipath_end_request;
>        mp_bh->bio.bi_private = mp_bh;
>        generic_make_request(&mp_bh->bio);
> @@ -398,7 +398,7 @@ static void multipathd (mddev_t *mddev)
>                        *bio = *(mp_bh->master_bio);
>                        bio->bi_sector += conf->multipaths[mp_bh->path].rdev->data_offset;
>                        bio->bi_bdev = conf->multipaths[mp_bh->path].rdev->bdev;
> -                       bio->bi_rw |= (1 << BIO_RW_FAILFAST);
> +                       bio->bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
>                        bio->bi_end_io = multipath_end_request;
>                        bio->bi_private = mp_bh;
>                        generic_make_request(bio);
> diff --git a/drivers/s390/block/dasd_diag.c b/drivers/s390/block/dasd_diag.c
> index 85fcb43..7844461 100644
> --- a/drivers/s390/block/dasd_diag.c
> +++ b/drivers/s390/block/dasd_diag.c
> @@ -544,7 +544,7 @@ static struct dasd_ccw_req *dasd_diag_build_cp(struct dasd_device *memdev,
>        }
>        cqr->retries = DIAG_MAX_RETRIES;
>        cqr->buildclk = get_clock();
> -       if (req->cmd_flags & REQ_FAILFAST)
> +       if (blk_noretry_request(req))
>                set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
>        cqr->startdev = memdev;
>        cqr->memdev = memdev;
> diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
> index 773b3fe..b11a221 100644
> --- a/drivers/s390/block/dasd_eckd.c
> +++ b/drivers/s390/block/dasd_eckd.c
> @@ -1683,7 +1683,7 @@ static struct dasd_ccw_req *dasd_eckd_build_cp(struct dasd_device *startdev,
>                        recid++;
>                }
>        }
> -       if (req->cmd_flags & REQ_FAILFAST)
> +       if (blk_noretry_request(req))
>                set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
>        cqr->startdev = startdev;
>        cqr->memdev = startdev;
> diff --git a/drivers/s390/block/dasd_fba.c b/drivers/s390/block/dasd_fba.c
> index aa0c533..115e032 100644
> --- a/drivers/s390/block/dasd_fba.c
> +++ b/drivers/s390/block/dasd_fba.c
> @@ -355,7 +355,7 @@ static struct dasd_ccw_req *dasd_fba_build_cp(struct dasd_device * memdev,
>                        recid++;
>                }
>        }
> -       if (req->cmd_flags & REQ_FAILFAST)
> +       if (blk_noretry_request(req))
>                set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
>        cqr->startdev = memdev;
>        cqr->memdev = memdev;
> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
> index 994da56..6bc55a6 100644
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
> @@ -109,7 +109,8 @@ static struct request *get_alua_req(struct scsi_device *sdev,
>        }
>
>        rq->cmd_type = REQ_TYPE_BLOCK_PC;
> -       rq->cmd_flags |= REQ_FAILFAST | REQ_NOMERGE;
> +       rq->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
> +                        REQ_FAILFAST_DRIVER | REQ_NOMERGE;
>        rq->retries = ALUA_FAILOVER_RETRIES;
>        rq->timeout = ALUA_FAILOVER_TIMEOUT;
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_emc.c b/drivers/scsi/device_handler/scsi_dh_emc.c
> index b9d23e9..64a56e5 100644
> --- a/drivers/scsi/device_handler/scsi_dh_emc.c
> +++ b/drivers/scsi/device_handler/scsi_dh_emc.c
> @@ -304,7 +304,8 @@ static struct request *get_req(struct scsi_device *sdev, int cmd,
>
>        rq->cmd[4] = len;
>        rq->cmd_type = REQ_TYPE_BLOCK_PC;
> -       rq->cmd_flags |= REQ_FAILFAST;
> +       rq->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
> +                        REQ_FAILFAST_DRIVER;
>        rq->timeout = CLARIION_TIMEOUT;
>        rq->retries = CLARIION_RETRIES;
>
> diff --git a/drivers/scsi/device_handler/scsi_dh_hp_sw.c b/drivers/scsi/device_handler/scsi_dh_hp_sw.c
> index a6a4ef3..08ba1ce 100644
> --- a/drivers/scsi/device_handler/scsi_dh_hp_sw.c
> +++ b/drivers/scsi/device_handler/scsi_dh_hp_sw.c
> @@ -112,7 +112,8 @@ static int hp_sw_tur(struct scsi_device *sdev, struct hp_sw_dh_data *h)
>                return SCSI_DH_RES_TEMP_UNAVAIL;
>
>        req->cmd_type = REQ_TYPE_BLOCK_PC;
> -       req->cmd_flags |= REQ_FAILFAST;
> +       req->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
> +                         REQ_FAILFAST_DRIVER;
>        req->cmd_len = COMMAND_SIZE(TEST_UNIT_READY);
>        memset(req->cmd, 0, MAX_COMMAND_SIZE);
>        req->cmd[0] = TEST_UNIT_READY;
> @@ -205,7 +206,8 @@ static int hp_sw_start_stop(struct scsi_device *sdev, struct hp_sw_dh_data *h)
>                return SCSI_DH_RES_TEMP_UNAVAIL;
>
>        req->cmd_type = REQ_TYPE_BLOCK_PC;
> -       req->cmd_flags |= REQ_FAILFAST;
> +       req->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
> +                         REQ_FAILFAST_DRIVER;
>        req->cmd_len = COMMAND_SIZE(START_STOP);
>        memset(req->cmd, 0, MAX_COMMAND_SIZE);
>        req->cmd[0] = START_STOP;
> diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c
> index 2dee69d..c504afe 100644
> --- a/drivers/scsi/device_handler/scsi_dh_rdac.c
> +++ b/drivers/scsi/device_handler/scsi_dh_rdac.c
> @@ -228,7 +228,8 @@ static struct request *get_rdac_req(struct scsi_device *sdev,
>        memset(rq->cmd, 0, BLK_MAX_CDB);
>
>        rq->cmd_type = REQ_TYPE_BLOCK_PC;
> -       rq->cmd_flags |= REQ_FAILFAST | REQ_NOMERGE;
> +       rq->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
> +                        REQ_FAILFAST_DRIVER;

Was dropping the REQ_NOMERGE intentional?
Everywhere else it seems REQ_FAILFAST was simply replaced with the
three new flag bits.

thanks,
grant

>        rq->retries = RDAC_RETRIES;
>        rq->timeout = RDAC_TIMEOUT;
>
> diff --git a/drivers/scsi/scsi_transport_spi.c b/drivers/scsi/scsi_transport_spi.c
> index b29360e..7c2d289 100644
> --- a/drivers/scsi/scsi_transport_spi.c
> +++ b/drivers/scsi/scsi_transport_spi.c
> @@ -109,7 +109,9 @@ static int spi_execute(struct scsi_device *sdev, const void *cmd,
>        for(i = 0; i < DV_RETRIES; i++) {
>                result = scsi_execute(sdev, cmd, dir, buffer, bufflen,
>                                      sense, DV_TIMEOUT, /* retries */ 1,
> -                                     REQ_FAILFAST);
> +                                     REQ_FAILFAST_DEV |
> +                                     REQ_FAILFAST_TRANSPORT |
> +                                     REQ_FAILFAST_DRIVER);
>                if (result & DRIVER_SENSE) {
>                        struct scsi_sense_hdr sshdr_tmp;
>                        if (!sshdr)
> diff --git a/include/linux/bio.h b/include/linux/bio.h
> index 0933a14..425a4ec 100644
> --- a/include/linux/bio.h
> +++ b/include/linux/bio.h
> @@ -147,15 +147,20 @@ struct bio {
>  * bit 0 -- read (not set) or write (set)
>  * bit 1 -- rw-ahead when set
>  * bit 2 -- barrier
> - * bit 3 -- fail fast, don't want low level driver retries
> - * bit 4 -- synchronous I/O hint: the block layer will unplug immediately
> + * bit 3 -- synchronous I/O hint: the block layer will unplug immediately
> + * bit 4 -- meta data
> + * bit 5 -- fail fast device errors
> + * bit 6 -- fail fast transport errors
> + * bit 7 -- fail fast driver errors
>  */
> -#define BIO_RW         0
> -#define BIO_RW_AHEAD   1
> -#define BIO_RW_BARRIER 2
> -#define BIO_RW_FAILFAST        3
> -#define BIO_RW_SYNC    4
> -#define BIO_RW_META    5
> +#define BIO_RW                         0
> +#define BIO_RW_AHEAD                   1
> +#define BIO_RW_BARRIER                 2
> +#define BIO_RW_SYNC                    3
> +#define BIO_RW_META                    4
> +#define BIO_RW_FAILFAST_DEV            5
> +#define BIO_RW_FAILFAST_TRANSPORT      6
> +#define BIO_RW_FAILFAST_DRIVER         7
>
>  /*
>  * upper 16 bits of bi_rw define the io priority of this bio
> @@ -182,7 +187,10 @@ struct bio {
>  #define bio_sectors(bio)       ((bio)->bi_size >> 9)
>  #define bio_barrier(bio)       ((bio)->bi_rw & (1 << BIO_RW_BARRIER))
>  #define bio_sync(bio)          ((bio)->bi_rw & (1 << BIO_RW_SYNC))
> -#define bio_failfast(bio)      ((bio)->bi_rw & (1 << BIO_RW_FAILFAST))
> +#define bio_failfast_dev(bio)  ((bio)->bi_rw & (1 << BIO_RW_FAILFAST_DEV))
> +#define bio_failfast_transport(bio)    \
> +       ((bio)->bi_rw & (1 << BIO_RW_FAILFAST_TRANSPORT))
> +#define bio_failfast_driver(bio) ((bio)->bi_rw & (1 << BIO_RW_FAILFAST_DRIVER))
>  #define bio_rw_ahead(bio)      ((bio)->bi_rw & (1 << BIO_RW_AHEAD))
>  #define bio_rw_meta(bio)       ((bio)->bi_rw & (1 << BIO_RW_META))
>  #define bio_empty_barrier(bio) (bio_barrier(bio) && !(bio)->bi_size)
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index e61f22b..3f37fb6 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -88,7 +88,9 @@ enum {
>  */
>  enum rq_flag_bits {
>        __REQ_RW,               /* not set, read. set, write */
> -       __REQ_FAILFAST,         /* no low level driver retries */
> +       __REQ_FAILFAST_DEV,     /* no driver retries of device errors */
> +       __REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
> +       __REQ_FAILFAST_DRIVER,  /* no driver retries of driver errors */
>        __REQ_SORTED,           /* elevator knows about this request */
>        __REQ_SOFTBARRIER,      /* may not be passed by ioscheduler */
>        __REQ_HARDBARRIER,      /* may not be passed by drive either */
> @@ -111,7 +113,9 @@ enum rq_flag_bits {
>  };
>
>  #define REQ_RW         (1 << __REQ_RW)
> -#define REQ_FAILFAST   (1 << __REQ_FAILFAST)
> +#define REQ_FAILFAST_DEV       (1 << __REQ_FAILFAST_DEV)
> +#define REQ_FAILFAST_TRANSPORT (1 << __REQ_FAILFAST_TRANSPORT)
> +#define REQ_FAILFAST_DRIVER    (1 << __REQ_FAILFAST_DRIVER)
>  #define REQ_SORTED     (1 << __REQ_SORTED)
>  #define REQ_SOFTBARRIER        (1 << __REQ_SOFTBARRIER)
>  #define REQ_HARDBARRIER        (1 << __REQ_HARDBARRIER)
> @@ -523,7 +527,12 @@ enum {
>  #define blk_special_request(rq)        ((rq)->cmd_type == REQ_TYPE_SPECIAL)
>  #define blk_sense_request(rq)  ((rq)->cmd_type == REQ_TYPE_SENSE)
>
> -#define blk_noretry_request(rq)        ((rq)->cmd_flags & REQ_FAILFAST)
> +#define blk_failfast_dev(rq)   ((rq)->cmd_flags & REQ_FAILFAST_DEV)
> +#define blk_failfast_transport(rq) ((rq)->cmd_flags & REQ_FAILFAST_TRANSPORT)
> +#define blk_failfast_driver(rq)        ((rq)->cmd_flags & REQ_FAILFAST_DRIVER)
> +#define blk_noretry_request(rq)        (blk_failfast_dev(rq) ||        \
> +                                blk_failfast_transport(rq) ||  \
> +                                blk_failfast_driver(rq))
>  #define blk_rq_started(rq)     ((rq)->cmd_flags & REQ_STARTED)
>
>  #define blk_account_rq(rq)     (blk_rq_started(rq) && blk_fs_request(rq))
> --
> 1.5.5.1
>


* Re: [PATCH 01/14] block: separate failfast into multiple bits.
  2008-09-02 16:35   ` Grant Grundler
@ 2008-09-02 16:59     ` Mike Christie
  2008-09-02 17:31       ` Mike Anderson
  0 siblings, 1 reply; 23+ messages in thread
From: Mike Christie @ 2008-09-02 16:59 UTC (permalink / raw)
  To: Grant Grundler
  Cc: Mike Anderson, linux-scsi, Jens Axboe, Alasdair G Kergon,
	Neil Brown, Martin Schwidefsky

Grant Grundler wrote:
> On Tue, Sep 2, 2008 at 9:05 AM, Mike Anderson
> <andmike@linux.vnet.ibm.com> wrote:
>> From: Mike Christie <michaelc@cs.wisc.edu>
>>
>> Multipath is best at handling transport errors. If it gets a device
>> error then there is not much the multipath layer can do. It will just
>> access the same device but from a different path.
>>
>> This patch breaks up failfast into device, transport and driver errors.
> 
> Is there any document that describes what those errors are for each
> class of transport?



Not yet. For SCSI I was still trying to classify the host byte errors,
because drivers are using them differently. I had sent patches in the
thread here
http://marc.info/?l=linux-scsi&m=121918956332584&w=2
that just start syncing up the transport errors for SCSI by adding
some new transport host byte errors and converting drivers and transport
classes to them.



> 
> This is great work though... I'm looking forward to a storage subsystem
> where each level can cooperate with the ones above it.
> 
>> The multipath layers (md and dm multipath) only ask the lower levels to
>> fast fail transport errors. The user of failfast, read ahead, will ask
>> to fast fail on all errors.
>>
>> Note that blk_noretry_request will return true if any failfast bit
>> is set. This allows drivers that do not support the multipath failfast
>> bits to continue to fail on any failfast error like before. Drivers
>> like scsi that are able to fail fast specific errors can check
>> for the specific fail fast type. In the next patch I will convert
>> scsi.
>>
>> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
>> Cc: Jens Axboe <jens.axboe@oracle.com>
>> Cc: Alasdair G Kergon <agk@redhat.com>
>> Cc: Neil Brown <neilb@suse.de>
>> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
>> Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
>> ---
>>  block/blk-core.c                            |   11 +++++++++--
>>  drivers/md/dm-mpath.c                       |    2 +-
>>  drivers/md/multipath.c                      |    4 ++--
>>  drivers/s390/block/dasd_diag.c              |    2 +-
>>  drivers/s390/block/dasd_eckd.c              |    2 +-
>>  drivers/s390/block/dasd_fba.c               |    2 +-
>>  drivers/scsi/device_handler/scsi_dh_alua.c  |    3 ++-
>>  drivers/scsi/device_handler/scsi_dh_emc.c   |    3 ++-
>>  drivers/scsi/device_handler/scsi_dh_hp_sw.c |    6 ++++--
>>  drivers/scsi/device_handler/scsi_dh_rdac.c  |    3 ++-
>>  drivers/scsi/scsi_transport_spi.c           |    4 +++-
>>  include/linux/bio.h                         |   26 +++++++++++++++++---------
>>  include/linux/blkdev.h                      |   15 ++++++++++++---
>>  13 files changed, 57 insertions(+), 26 deletions(-)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 4889eb8..f3c29d0 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1073,8 +1073,15 @@ void init_request_from_bio(struct request *req, struct bio *bio)
>>        /*
>>         * inherit FAILFAST from bio (for read-ahead, and explicit FAILFAST)
>>         */
>> -       if (bio_rw_ahead(bio) || bio_failfast(bio))
>> -               req->cmd_flags |= REQ_FAILFAST;
>> +       if (bio_rw_ahead(bio))
>> +               req->cmd_flags |= (REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
>> +                                  REQ_FAILFAST_DRIVER);
>> +       if (bio_failfast_dev(bio))
>> +               req->cmd_flags |= REQ_FAILFAST_DEV;
>> +       if (bio_failfast_transport(bio))
>> +               req->cmd_flags |= REQ_FAILFAST_TRANSPORT;
>> +       if (bio_failfast_driver(bio))
>> +               req->cmd_flags |= REQ_FAILFAST_DRIVER;
> 
> This is open source.
> Why can't something like this be done?
>     req->cmd_flags |= bio_failfast_flags(bio);


We used to do that when it was just 1 bit for failfast, and it kept
getting messed up. I fixed it once before and Tomo or someone else had
to fix it again later when it got messed up again, so I thought this
would be clearer and would prevent future screw-ups.



> 
> I'm assuming the REQ_FAILFAST_* flags can be equated to the same bits
> in the bio layer.

> 
> 
>>        /*
>>         * REQ_BARRIER implies no merging, but lets make it explicit
>> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
>> index 71dd65a..b48e201 100644
>> --- a/drivers/md/dm-mpath.c
>> +++ b/drivers/md/dm-mpath.c
>> @@ -827,7 +827,7 @@ static int multipath_map(struct dm_target *ti, struct bio *bio,
>>        dm_bio_record(&mpio->details, bio);
>>
>>        map_context->ptr = mpio;
>> -       bio->bi_rw |= (1 << BIO_RW_FAILFAST);
>> +       bio->bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
>>        r = map_io(m, bio, mpio, 0);
>>        if (r < 0 || r == DM_MAPIO_REQUEUE)
>>                mempool_free(mpio, m->mpio_pool);
>> diff --git a/drivers/md/multipath.c b/drivers/md/multipath.c
>> index c4779cc..2426201 100644
>> --- a/drivers/md/multipath.c
>> +++ b/drivers/md/multipath.c
>> @@ -172,7 +172,7 @@ static int multipath_make_request (struct request_queue *q, struct bio * bio)
>>        mp_bh->bio = *bio;
>>        mp_bh->bio.bi_sector += multipath->rdev->data_offset;
>>        mp_bh->bio.bi_bdev = multipath->rdev->bdev;
>> -       mp_bh->bio.bi_rw |= (1 << BIO_RW_FAILFAST);
>> +       mp_bh->bio.bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
>>        mp_bh->bio.bi_end_io = multipath_end_request;
>>        mp_bh->bio.bi_private = mp_bh;
>>        generic_make_request(&mp_bh->bio);
>> @@ -398,7 +398,7 @@ static void multipathd (mddev_t *mddev)
>>                        *bio = *(mp_bh->master_bio);
>>                        bio->bi_sector += conf->multipaths[mp_bh->path].rdev->data_offset;
>>                        bio->bi_bdev = conf->multipaths[mp_bh->path].rdev->bdev;
>> -                       bio->bi_rw |= (1 << BIO_RW_FAILFAST);
>> +                       bio->bi_rw |= (1 << BIO_RW_FAILFAST_TRANSPORT);
>>                        bio->bi_end_io = multipath_end_request;
>>                        bio->bi_private = mp_bh;
>>                        generic_make_request(bio);
>> diff --git a/drivers/s390/block/dasd_diag.c b/drivers/s390/block/dasd_diag.c
>> index 85fcb43..7844461 100644
>> --- a/drivers/s390/block/dasd_diag.c
>> +++ b/drivers/s390/block/dasd_diag.c
>> @@ -544,7 +544,7 @@ static struct dasd_ccw_req *dasd_diag_build_cp(struct dasd_device *memdev,
>>        }
>>        cqr->retries = DIAG_MAX_RETRIES;
>>        cqr->buildclk = get_clock();
>> -       if (req->cmd_flags & REQ_FAILFAST)
>> +       if (blk_noretry_request(req))
>>                set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
>>        cqr->startdev = memdev;
>>        cqr->memdev = memdev;
>> diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
>> index 773b3fe..b11a221 100644
>> --- a/drivers/s390/block/dasd_eckd.c
>> +++ b/drivers/s390/block/dasd_eckd.c
>> @@ -1683,7 +1683,7 @@ static struct dasd_ccw_req *dasd_eckd_build_cp(struct dasd_device *startdev,
>>                        recid++;
>>                }
>>        }
>> -       if (req->cmd_flags & REQ_FAILFAST)
>> +       if (blk_noretry_request(req))
>>                set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
>>        cqr->startdev = startdev;
>>        cqr->memdev = startdev;
>> diff --git a/drivers/s390/block/dasd_fba.c b/drivers/s390/block/dasd_fba.c
>> index aa0c533..115e032 100644
>> --- a/drivers/s390/block/dasd_fba.c
>> +++ b/drivers/s390/block/dasd_fba.c
>> @@ -355,7 +355,7 @@ static struct dasd_ccw_req *dasd_fba_build_cp(struct dasd_device * memdev,
>>                        recid++;
>>                }
>>        }
>> -       if (req->cmd_flags & REQ_FAILFAST)
>> +       if (blk_noretry_request(req))
>>                set_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags);
>>        cqr->startdev = memdev;
>>        cqr->memdev = memdev;
>> diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c
>> index 994da56..6bc55a6 100644
>> --- a/drivers/scsi/device_handler/scsi_dh_alua.c
>> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c
>> @@ -109,7 +109,8 @@ static struct request *get_alua_req(struct scsi_device *sdev,
>>        }
>>
>>        rq->cmd_type = REQ_TYPE_BLOCK_PC;
>> -       rq->cmd_flags |= REQ_FAILFAST | REQ_NOMERGE;
>> +       rq->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
>> +                        REQ_FAILFAST_DRIVER | REQ_NOMERGE;
>>        rq->retries = ALUA_FAILOVER_RETRIES;
>>        rq->timeout = ALUA_FAILOVER_TIMEOUT;
>>
>> diff --git a/drivers/scsi/device_handler/scsi_dh_emc.c b/drivers/scsi/device_handler/scsi_dh_emc.c
>> index b9d23e9..64a56e5 100644
>> --- a/drivers/scsi/device_handler/scsi_dh_emc.c
>> +++ b/drivers/scsi/device_handler/scsi_dh_emc.c
>> @@ -304,7 +304,8 @@ static struct request *get_req(struct scsi_device *sdev, int cmd,
>>
>>        rq->cmd[4] = len;
>>        rq->cmd_type = REQ_TYPE_BLOCK_PC;
>> -       rq->cmd_flags |= REQ_FAILFAST;
>> +       rq->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
>> +                        REQ_FAILFAST_DRIVER;
>>        rq->timeout = CLARIION_TIMEOUT;
>>        rq->retries = CLARIION_RETRIES;
>>
>> diff --git a/drivers/scsi/device_handler/scsi_dh_hp_sw.c b/drivers/scsi/device_handler/scsi_dh_hp_sw.c
>> index a6a4ef3..08ba1ce 100644
>> --- a/drivers/scsi/device_handler/scsi_dh_hp_sw.c
>> +++ b/drivers/scsi/device_handler/scsi_dh_hp_sw.c
>> @@ -112,7 +112,8 @@ static int hp_sw_tur(struct scsi_device *sdev, struct hp_sw_dh_data *h)
>>                return SCSI_DH_RES_TEMP_UNAVAIL;
>>
>>        req->cmd_type = REQ_TYPE_BLOCK_PC;
>> -       req->cmd_flags |= REQ_FAILFAST;
>> +       req->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
>> +                         REQ_FAILFAST_DRIVER;
>>        req->cmd_len = COMMAND_SIZE(TEST_UNIT_READY);
>>        memset(req->cmd, 0, MAX_COMMAND_SIZE);
>>        req->cmd[0] = TEST_UNIT_READY;
>> @@ -205,7 +206,8 @@ static int hp_sw_start_stop(struct scsi_device *sdev, struct hp_sw_dh_data *h)
>>                return SCSI_DH_RES_TEMP_UNAVAIL;
>>
>>        req->cmd_type = REQ_TYPE_BLOCK_PC;
>> -       req->cmd_flags |= REQ_FAILFAST;
>> +       req->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
>> +                         REQ_FAILFAST_DRIVER;
>>        req->cmd_len = COMMAND_SIZE(START_STOP);
>>        memset(req->cmd, 0, MAX_COMMAND_SIZE);
>>        req->cmd[0] = START_STOP;
>> diff --git a/drivers/scsi/device_handler/scsi_dh_rdac.c b/drivers/scsi/device_handler/scsi_dh_rdac.c
>> index 2dee69d..c504afe 100644
>> --- a/drivers/scsi/device_handler/scsi_dh_rdac.c
>> +++ b/drivers/scsi/device_handler/scsi_dh_rdac.c
>> @@ -228,7 +228,8 @@ static struct request *get_rdac_req(struct scsi_device *sdev,
>>        memset(rq->cmd, 0, BLK_MAX_CDB);
>>
>>        rq->cmd_type = REQ_TYPE_BLOCK_PC;
>> -       rq->cmd_flags |= REQ_FAILFAST | REQ_NOMERGE;
>> +       rq->cmd_flags |= REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT |
>> +                        REQ_FAILFAST_DRIVER;
> 
> Was dropping the REQ_NOMERGE intentional?

Yes, and no.

> Everywhere else it seems REQ_FAILFAST was simply replaced with the
> three new flag bits.

We do not need it, because blk_execute_rq_nowait sets it for us, so I 
dropped it when I first made the patches a long time ago. It was a 
different patch for a while, but I think it got merged with this one a 
while back by accident (the alua NOMERGE setting did not get dropped 
when it could have, because it was added to scsi-misc at a different time).

Thanks for the review.


* Re: [PATCH 0/14] scsi: scsi_decide_dispostion update
  2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
                   ` (13 preceding siblings ...)
  2008-09-02 16:05 ` [PATCH 14/14] scsi: update scsi_log_completion disposition decoding Mike Anderson
@ 2008-09-02 17:03 ` Mike Christie
  14 siblings, 0 replies; 23+ messages in thread
From: Mike Christie @ 2008-09-02 17:03 UTC (permalink / raw)
  To: Mike Anderson; +Cc: linux-scsi

Mike Anderson wrote:
> This patch series is an update to a previous set of patches posted by Mike
> Christie in the thread referenced below.
> http://thread.gmane.org/gmane.linux.scsi/44058/focus=4405

Thanks for posting this.

MikeA's patches replace my last two patches:
0008-block-separate-failfast-into-multiple-bits.patch (this is actually 
the same patch, sent again to make it easier to compile, test, and 
review).
0009-scsi-modify-scsi-to-handle-new-fail-fast-flags.patch (MikeA's 
patches kill this patch).

MikeA also resent my
0003-scsi-add-transport-host-byte-errors-v3.patch so this could all 
compile and build and be tested together.


* Re: [PATCH 01/14] block: separate failfast into multiple bits.
  2008-09-02 16:59     ` Mike Christie
@ 2008-09-02 17:31       ` Mike Anderson
  2008-09-03  8:27         ` Boaz Harrosh
  0 siblings, 1 reply; 23+ messages in thread
From: Mike Anderson @ 2008-09-02 17:31 UTC (permalink / raw)
  To: Mike Christie
  Cc: Grant Grundler, linux-scsi, Jens Axboe, Alasdair G Kergon,
	Neil Brown, Martin Schwidefsky

Mike Christie <michaelc@cs.wisc.edu> wrote:
> Grant Grundler wrote:
>> On Tue, Sep 2, 2008 at 9:05 AM, Mike Anderson
>> <andmike@linux.vnet.ibm.com> wrote:
>>> From: Mike Christie <michaelc@cs.wisc.edu>
>>>
>>> Multipath is best at handling transport errors. If it gets a device
>>> error then there is not much the multipath layer can do. It will just
>>> access the same device but from a different path.
>>>
>>> This patch breaks up failfast into device, transport and driver errors.
>>
>> Is there any document that describes what those errors are for each
>> class of transport?
>
>
>
> Not yet. For SCSI I was still trying to classify the host byte errors,  
> because drivers are using them differently. I had sent patches in the  
> thread here
> http://marc.info/?l=linux-scsi&m=121918956332584&w=2
>  that just start syncing up the transport errors for SCSI by adding some 
> new transport host byte errors and converting drivers and transport  
> classes to them.
>

I can work on a patch to the scsi_mid_low_api.txt document that describes
the mapping / policy of these changes (i.e., DID_* / sense to
SCSI_MLQUEUE_DIS_* mapping to blk_failfast_* mapping ). 
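
As a rough idea of the policy such a document would describe, here is a
minimal sketch. It is not the code from this series: the err_class
enum, the flag values and may_retry() are hypothetical names used only
to show how an error class could be checked against the per-class
failfast bits.

```c
#include <stdbool.h>

/* Hypothetical flag values, chosen for illustration only. */
#define REQ_FAILFAST_DEV        (1u << 2)
#define REQ_FAILFAST_TRANSPORT  (1u << 3)
#define REQ_FAILFAST_DRIVER     (1u << 4)

/* Hypothetical error classes matching the three failfast bits. */
enum err_class {
	ERR_CLASS_DEVICE,
	ERR_CLASS_TRANSPORT,
	ERR_CLASS_DRIVER,
};

/*
 * Return true if a command that failed with an error of class 'cls'
 * may still be retried, i.e. the matching failfast bit is not set.
 */
static bool may_retry(unsigned int cmd_flags, enum err_class cls)
{
	switch (cls) {
	case ERR_CLASS_DEVICE:
		return !(cmd_flags & REQ_FAILFAST_DEV);
	case ERR_CLASS_TRANSPORT:
		return !(cmd_flags & REQ_FAILFAST_TRANSPORT);
	case ERR_CLASS_DRIVER:
		return !(cmd_flags & REQ_FAILFAST_DRIVER);
	}
	return true;
}
```

In this sketch a multipath request that sets only
REQ_FAILFAST_TRANSPORT fails fast on a transport error, so another path
can be tried, but may still be retried on a device error.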

-andmike
--
Michael Anderson
andmike@linux.vnet.ibm.com


* Re: [PATCH 01/14] block: separate failfast into multiple bits.
  2008-09-02 17:31       ` Mike Anderson
@ 2008-09-03  8:27         ` Boaz Harrosh
  0 siblings, 0 replies; 23+ messages in thread
From: Boaz Harrosh @ 2008-09-03  8:27 UTC (permalink / raw)
  To: Mike Anderson
  Cc: Mike Christie, Grant Grundler, linux-scsi, Jens Axboe,
	Alasdair G Kergon, Neil Brown, Martin Schwidefsky

Mike Anderson wrote:
> 
> I can work on a patch to the scsi_mid_low_api.txt document that describes
> the mapping / policy of these changes (i.e., DID_* / sense to
> SCSI_MLQUEUE_DIS_* mapping to blk_failfast_* mapping ). 
> 
> -andmike
> --
> Michael Anderson
> andmike@linux.vnet.ibm.com

This would be great, thanks. Please make sure it contains all the
information you posted in the [0/14] email.

Thanks again
Boaz


* Re: [PATCH 04/14] scsi: Move retries check
  2008-09-02 16:05 ` [PATCH 04/14] scsi: Move retries check Mike Anderson
@ 2008-09-04 18:27   ` James Bottomley
  2008-09-04 19:52     ` Mike Anderson
  0 siblings, 1 reply; 23+ messages in thread
From: James Bottomley @ 2008-09-04 18:27 UTC (permalink / raw)
  To: Mike Anderson; +Cc: linux-scsi, Mike Christie

On Tue, 2008-09-02 at 09:05 -0700, Mike Anderson wrote:
> Move retries check to scsi_queue_insert.
> 
> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
> Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>

Could you clarify the signoff chain on this, please?  It implies that
the patch originated with Mike, but there's no From: for him.

If it's his patch, it needs a From:

If it's your patch, your signoff needs to be first, and his needs to
become an acked-by.

James




* Re: [PATCH 04/14] scsi: Move retries check
  2008-09-04 18:27   ` James Bottomley
@ 2008-09-04 19:52     ` Mike Anderson
  2008-09-04 21:21       ` James Bottomley
  0 siblings, 1 reply; 23+ messages in thread
From: Mike Anderson @ 2008-09-04 19:52 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-scsi, Mike Christie

James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> On Tue, 2008-09-02 at 09:05 -0700, Mike Anderson wrote:
> > Move retries check to scsi_queue_insert.
> > 
> > Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
> > Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
> 
> Could you clarify the signoff chain on this, please?  It implies that
> the patch originated with Mike, but there's no From: for him.
> 
> If it's his patch, it needs a From:
> 
> If it's your patch, your signoff needs to be first, and his needs to
> become an acked-by.

The sign-off is incorrect. The patch originated with me. I can resend the
patch series to the list with the corrected order / info.

-andmike
--
Michael Anderson
andmike@linux.vnet.ibm.com


* Re: [PATCH 04/14] scsi: Move retries check
  2008-09-04 19:52     ` Mike Anderson
@ 2008-09-04 21:21       ` James Bottomley
  0 siblings, 0 replies; 23+ messages in thread
From: James Bottomley @ 2008-09-04 21:21 UTC (permalink / raw)
  To: Mike Anderson; +Cc: linux-scsi, Mike Christie

On Thu, 2008-09-04 at 12:52 -0700, Mike Anderson wrote:
> James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > On Tue, 2008-09-02 at 09:05 -0700, Mike Anderson wrote:
> > > Move retries check to scsi_queue_insert.
> > > 
> > > Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
> > > Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
> > 
> > Could you clarify the signoff chain on this, please?  It implies that
> > the patch originated with Mike, but there's no From: for him.
> > 
> > If it's his patch, it needs a From:
> > 
> > If it's your patch, your signoff needs to be first, and his needs to
> > become an acked-by.
> 
> The sign-off is incorrect. The patch originated with me. I can resend the
> patch series to the list with the corrected order / info.

OK, could you redo it against scsi-misc then, please?  I needed the
original patch series from Mike to pull in the lpfc changes, so if you
just submit your updates on top of that, it will make for a much cleaner
history.

Thanks,

James




Thread overview: 23+ messages
2008-09-02 16:05 [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Anderson
2008-09-02 16:05 ` [PATCH 01/14] block: separate failfast into multiple bits Mike Anderson
2008-09-02 16:35   ` Grant Grundler
2008-09-02 16:59     ` Mike Christie
2008-09-02 17:31       ` Mike Anderson
2008-09-03  8:27         ` Boaz Harrosh
2008-09-02 16:05 ` [PATCH 02/14] scsi: add transport host byte errors (v3) Mike Anderson
2008-09-02 16:05 ` [PATCH 03/14] scsi: Move wait_for check Mike Anderson
2008-09-02 16:05 ` [PATCH 04/14] scsi: Move retries check Mike Anderson
2008-09-04 18:27   ` James Bottomley
2008-09-04 19:52     ` Mike Anderson
2008-09-04 21:21       ` James Bottomley
2008-09-02 16:05 ` [PATCH 05/14] scsi: Move blk_noretry_request Mike Anderson
2008-09-02 16:05 ` [PATCH 06/14] scsi: remove maybe_retry Mike Anderson
2008-09-02 16:05 ` [PATCH 07/14] scsi: change return codes in scsi_decide_disposition Mike Anderson
2008-09-02 16:05 ` [PATCH 08/14] scsi: rename scsi_queue_insert to scsi_attempt_requeue_command Mike Anderson
2008-09-02 16:05 ` [PATCH 09/14] scsi: have device handlers return SCSI_MLQUEUE error value Mike Anderson
2008-09-02 16:05 ` [PATCH 10/14] scsi: convert other scsi_check_sense users to new error codes Mike Anderson
2008-09-02 16:05 ` [PATCH 11/14] scsi: fix up SCSI_MLQUEUE defintions and add driver, device and transport ones Mike Anderson
2008-09-02 16:05 ` [PATCH 12/14] scsi: move device online check to scsi_attempt_requeue_command Mike Anderson
2008-09-02 16:05 ` [PATCH 13/14] scsi: remove scsi_device_online from scsi_decide_disposition Mike Anderson
2008-09-02 16:05 ` [PATCH 14/14] scsi: update scsi_log_completion disposition decoding Mike Anderson
2008-09-02 17:03 ` [PATCH 0/14] scsi: scsi_decide_dispostion update Mike Christie
