[PATCH 0/5] SCSI EH cleanup

* [PATCH 0/5] SCSI EH cleanup
@ 2016-06-20  9:35 Hannes Reinecke
  2016-06-20  9:35 ` [PATCH 1/5] libsas: allow async aborts Hannes Reinecke
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Hannes Reinecke @ 2016-06-20  9:35 UTC
  To: Martin K. Petersen
  Cc: James Bottomley, Christoph Hellwig, linux-scsi, Hannes Reinecke

Hi all,

here's a patchset to clean up SCSI EH. The main point is that we should
finally drop the no_async_abort flag; up to now we haven't had any
issues with asynchronous aborts, and the flag was never used.

As usual, comments and reviews are welcome.

Christoph Hellwig (1):
  libsas: allow async aborts

Hannes Reinecke (4):
  scsi: make scsi_eh_scmd_add() always succeed
  scsi: make eh_eflags persistent
  scsi: make asynchronous aborts mandatory
  scsi: Do not escalate failed EH command

 Documentation/scsi/scsi_eh.txt      |  31 ++++-----
 drivers/scsi/libsas/sas_scsi_host.c |   3 -
 drivers/scsi/scsi_error.c           | 127 +++++-------------------------------
 drivers/scsi/scsi_lib.c             |   4 +-
 drivers/scsi/scsi_priv.h            |   3 +-
 include/scsi/scsi_eh.h              |   1 +
 include/scsi/scsi_host.h            |   5 --
 7 files changed, 35 insertions(+), 139 deletions(-)

-- 
1.8.5.6



* [PATCH 1/5] libsas: allow async aborts
  2016-06-20  9:35 [PATCH 0/5] SCSI EH cleanup Hannes Reinecke
@ 2016-06-20  9:35 ` Hannes Reinecke
  2016-06-20  9:35 ` [PATCH 2/5] scsi: make scsi_eh_scmd_add() always succeed Hannes Reinecke
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Hannes Reinecke @ 2016-06-20  9:35 UTC
  To: Martin K. Petersen; +Cc: James Bottomley, Christoph Hellwig, linux-scsi

From: Christoph Hellwig <hch@lst.de>

We now first try to call ->eh_abort_handler from a work queue, but libsas
was always failing that for no good reason.  Allow async aborts.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/scsi/libsas/sas_scsi_host.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 519dac4..37a2a84 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -491,9 +491,6 @@ int sas_eh_abort_handler(struct scsi_cmnd *cmd)
 	struct Scsi_Host *host = cmd->device->host;
 	struct sas_internal *i = to_sas_internal(host->transportt);
 
-	if (current != host->ehandler)
-		return FAILED;
-
 	if (!i->dft->lldd_abort_task)
 		return FAILED;
 
-- 
1.8.5.6



* [PATCH 2/5] scsi: make scsi_eh_scmd_add() always succeed
  2016-06-20  9:35 [PATCH 0/5] SCSI EH cleanup Hannes Reinecke
  2016-06-20  9:35 ` [PATCH 1/5] libsas: allow async aborts Hannes Reinecke
@ 2016-06-20  9:35 ` Hannes Reinecke
  2016-06-22 13:28   ` Christoph Hellwig
  2016-06-20  9:35 ` [PATCH 3/5] scsi: make eh_eflags persistent Hannes Reinecke
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Hannes Reinecke @ 2016-06-20  9:35 UTC
  To: Martin K. Petersen
  Cc: James Bottomley, Christoph Hellwig, linux-scsi, Hannes Reinecke

scsi_eh_scmd_add() currently fails only if no
error handler thread has been started (which will never be the
case) or if the state machine encounters an illegal transition.

But if we encounter an invalid state transition,
chances are we cannot fix things up with the error handler.
So better add a WARN_ON for illegal host states and
make scsi_eh_scmd_add() a void function.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/scsi_error.c | 39 +++++++++++++--------------------------
 drivers/scsi/scsi_lib.c   |  4 ++--
 drivers/scsi/scsi_priv.h  |  2 +-
 3 files changed, 16 insertions(+), 29 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 984ddcb..deb35737 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -162,13 +162,7 @@ scmd_eh_abort_handler(struct work_struct *work)
 		}
 	}
 
-	if (!scsi_eh_scmd_add(scmd, 0)) {
-		SCSI_LOG_ERROR_RECOVERY(3,
-			scmd_printk(KERN_WARNING, scmd,
-				    "terminate aborted command\n"));
-		set_host_byte(scmd, DID_TIME_OUT);
-		scsi_finish_command(scmd);
-	}
+	scsi_eh_scmd_add(scmd, 0);
 }
 
 /**
@@ -224,37 +218,32 @@ scsi_abort_command(struct scsi_cmnd *scmd)
  * scsi_eh_scmd_add - add scsi cmd to error handling.
  * @scmd:	scmd to run eh on.
  * @eh_flag:	optional SCSI_EH flag.
- *
- * Return value:
- *	0 on failure.
  */
-int scsi_eh_scmd_add(struct scsi_cmnd *scmd, int eh_flag)
+void scsi_eh_scmd_add(struct scsi_cmnd *scmd, int eh_flag)
 {
 	struct Scsi_Host *shost = scmd->device->host;
 	unsigned long flags;
-	int ret = 0;
 
-	if (!shost->ehandler)
-		return 0;
+	WARN_ON(!shost->ehandler);
 
 	spin_lock_irqsave(shost->host_lock, flags);
+	WARN_ON(shost->shost_state != SHOST_RUNNING &&
+		shost->shost_state != SHOST_CANCEL &&
+		shost->shost_state != SHOST_RECOVERY &&
+		shost->shost_state != SHOST_CANCEL_RECOVERY);
 	if (scsi_host_set_state(shost, SHOST_RECOVERY))
-		if (scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY))
-			goto out_unlock;
+		scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY);
 
 	if (shost->eh_deadline != -1 && !shost->last_reset)
 		shost->last_reset = jiffies;
 
-	ret = 1;
 	if (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED)
 		eh_flag &= ~SCSI_EH_CANCEL_CMD;
 	scmd->eh_eflags |= eh_flag;
 	list_add_tail(&scmd->eh_entry, &shost->eh_cmd_q);
 	shost->host_failed++;
 	scsi_eh_wakeup(shost);
- out_unlock:
 	spin_unlock_irqrestore(shost->host_lock, flags);
-	return ret;
 }
 
 /**
@@ -285,13 +274,11 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 		rtn = host->hostt->eh_timed_out(scmd);
 
 	if (rtn == BLK_EH_NOT_HANDLED) {
-		if (!host->hostt->no_async_abort &&
-		    scsi_abort_command(scmd) == SUCCESS)
-			return BLK_EH_NOT_HANDLED;
-
-		set_host_byte(scmd, DID_TIME_OUT);
-		if (!scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD))
-			rtn = BLK_EH_HANDLED;
+		if (host->hostt->no_async_abort ||
+		    scsi_abort_command(scmd) != SUCCESS) {
+			set_host_byte(scmd, DID_TIME_OUT);
+			scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD);
+		}
 	}
 
 	return rtn;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b2e332a..7a8c9ad 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1559,8 +1559,8 @@ static void scsi_softirq_done(struct request *rq)
 			scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
 			break;
 		default:
-			if (!scsi_eh_scmd_add(cmd, 0))
-				scsi_finish_command(cmd);
+			scsi_eh_scmd_add(cmd, 0);
+			break;
 	}
 }
 
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 57a4b99..de937ba 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -71,7 +71,7 @@ extern enum blk_eh_timer_return scsi_times_out(struct request *req);
 extern int scsi_error_handler(void *host);
 extern int scsi_decide_disposition(struct scsi_cmnd *cmd);
 extern void scsi_eh_wakeup(struct Scsi_Host *shost);
-extern int scsi_eh_scmd_add(struct scsi_cmnd *, int);
+extern void scsi_eh_scmd_add(struct scsi_cmnd *, int);
 void scsi_eh_ready_devs(struct Scsi_Host *shost,
 			struct list_head *work_q,
 			struct list_head *done_q);
-- 
1.8.5.6
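
As a rough illustration of the patch above (not part of the series), the
following user-space sketch models the new control flow of
scsi_eh_scmd_add(): warn if the host is in an unexpected state, try
RECOVERY, fall back to CANCEL_RECOVERY, and queue the command either way.
All model_* names are hypothetical, and the transition table is a
simplified assumption rather than a copy of scsi_host_set_state().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's SHOST_* states. */
enum model_state {
	MODEL_CREATED,
	MODEL_RUNNING,
	MODEL_CANCEL,
	MODEL_RECOVERY,
	MODEL_CANCEL_RECOVERY,
	MODEL_DEL
};

struct model_host {
	enum model_state state;
	int host_failed;	/* commands handed to EH */
	int warnings;		/* how often the WARN_ON stand-in fired */
};

/*
 * Assumed transition rules, reduced to the two targets used here:
 * RECOVERY is reachable only from RUNNING, CANCEL_RECOVERY only from
 * CANCEL or RECOVERY. Returns 0 on success, -1 on an illegal transition.
 */
static int model_set_state(struct model_host *h, enum model_state next)
{
	if (h->state == next)
		return 0;
	switch (next) {
	case MODEL_RECOVERY:
		if (h->state != MODEL_RUNNING)
			return -1;
		break;
	case MODEL_CANCEL_RECOVERY:
		if (h->state != MODEL_CANCEL && h->state != MODEL_RECOVERY)
			return -1;
		break;
	default:
		return -1;
	}
	h->state = next;
	return 0;
}

/* Returns nothing, like the patched function: warn, then queue regardless. */
static void model_eh_add(struct model_host *h)
{
	assert(h != NULL);
	if (h->state != MODEL_RUNNING && h->state != MODEL_CANCEL &&
	    h->state != MODEL_RECOVERY && h->state != MODEL_CANCEL_RECOVERY)
		h->warnings++;	/* stand-in for WARN_ON() */
	if (model_set_state(h, MODEL_RECOVERY))
		model_set_state(h, MODEL_CANCEL_RECOVERY);
	h->host_failed++;
}
```

In this model a host in an unexpected state still gets the command
queued; only the warning counter records that something was off, which
is the trade-off the commit message argues for.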



* [PATCH 3/5] scsi: make eh_eflags persistent
  2016-06-20  9:35 [PATCH 0/5] SCSI EH cleanup Hannes Reinecke
  2016-06-20  9:35 ` [PATCH 1/5] libsas: allow async aborts Hannes Reinecke
  2016-06-20  9:35 ` [PATCH 2/5] scsi: make scsi_eh_scmd_add() always succeed Hannes Reinecke
@ 2016-06-20  9:35 ` Hannes Reinecke
  2016-06-22 13:33   ` Christoph Hellwig
  2016-06-20  9:35 ` [PATCH 4/5] scsi: make asynchronous aborts mandatory Hannes Reinecke
  2016-06-20  9:35 ` [PATCH 5/5] scsi: Do not escalate failed EH command Hannes Reinecke
  4 siblings, 1 reply; 10+ messages in thread
From: Hannes Reinecke @ 2016-06-20  9:35 UTC
  To: Martin K. Petersen
  Cc: James Bottomley, Christoph Hellwig, linux-scsi, Hannes Reinecke

To detect if a failed command has been retried we must not
clear scmd->eh_eflags when EH finishes.
The flag should be persistent throughout the lifetime
of the command.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 Documentation/scsi/scsi_eh.txt | 3 ---
 drivers/scsi/scsi_error.c      | 4 ++--
 include/scsi/scsi_eh.h         | 1 +
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt
index 8638f61..745eed5 100644
--- a/Documentation/scsi/scsi_eh.txt
+++ b/Documentation/scsi/scsi_eh.txt
@@ -264,7 +264,6 @@ scmd->allowed.
  3. scmd recovered
     ACTION: scsi_eh_finish_cmd() is invoked to EH-finish scmd
 	- shost->host_failed--
-	- clear scmd->eh_eflags
 	- scsi_setup_cmd_retry()
 	- move from local eh_work_q to local eh_done_q
     LOCKING: none
@@ -452,8 +451,6 @@ except for #1 must be implemented by eh_strategy_handler().
 
  - shost->host_failed is zero.
 
- - Each scmd's eh_eflags field is cleared.
-
  - Each scmd is in such a state that scsi_setup_cmd_retry() on the
    scmd doesn't make any difference.
 
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index deb35737..eb0f19f 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -182,7 +182,6 @@ scsi_abort_command(struct scsi_cmnd *scmd)
 		/*
 		 * Retry after abort failed, escalate to next level.
 		 */
-		scmd->eh_eflags &= ~SCSI_EH_ABORT_SCHEDULED;
 		SCSI_LOG_ERROR_RECOVERY(3,
 			scmd_printk(KERN_INFO, scmd,
 				    "previous abort failed\n"));
@@ -919,6 +918,7 @@ void scsi_eh_prep_cmnd(struct scsi_cmnd *scmd, struct scsi_eh_save *ses,
 	ses->result = scmd->result;
 	ses->underflow = scmd->underflow;
 	ses->prot_op = scmd->prot_op;
+	ses->eh_eflags = scmd->eh_eflags;
 
 	scmd->prot_op = SCSI_PROT_NORMAL;
 	scmd->eh_eflags = 0;
@@ -982,6 +982,7 @@ void scsi_eh_restore_cmnd(struct scsi_cmnd* scmd, struct scsi_eh_save *ses)
 	scmd->result = ses->result;
 	scmd->underflow = ses->underflow;
 	scmd->prot_op = ses->prot_op;
+	scmd->eh_eflags = ses->eh_eflags;
 }
 EXPORT_SYMBOL(scsi_eh_restore_cmnd);
 
@@ -1115,7 +1116,6 @@ static int scsi_eh_action(struct scsi_cmnd *scmd, int rtn)
 void scsi_eh_finish_cmd(struct scsi_cmnd *scmd, struct list_head *done_q)
 {
 	scmd->device->host->host_failed--;
-	scmd->eh_eflags = 0;
 	list_move_tail(&scmd->eh_entry, done_q);
 }
 EXPORT_SYMBOL(scsi_eh_finish_cmd);
diff --git a/include/scsi/scsi_eh.h b/include/scsi/scsi_eh.h
index dbb8c64..f2f876c 100644
--- a/include/scsi/scsi_eh.h
+++ b/include/scsi/scsi_eh.h
@@ -30,6 +30,7 @@ extern int scsi_ioctl_reset(struct scsi_device *, int __user *);
 struct scsi_eh_save {
 	/* saved state */
 	int result;
+	int eh_eflags;
 	enum dma_data_direction data_direction;
 	unsigned underflow;
 	unsigned char cmd_len;
-- 
1.8.5.6
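
To make the save/restore part of the patch above concrete, here is a
minimal user-space sketch (not part of the series). It assumes a
stripped-down command with only two fields; the model_* names are
hypothetical. The point is that the flags are cleared while the command
is borrowed for an EH request and come back unchanged afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Same value as SCSI_EH_ABORT_SCHEDULED in the patch; the name is hypothetical. */
#define MODEL_EH_ABORT_SCHEDULED 0x0002

struct model_cmd {
	int result;
	int eh_eflags;
};

struct model_save {
	int result;
	int eh_eflags;	/* the field this patch adds to the saved state */
};

/* Save the fields EH is about to overwrite, then reset them. */
static void model_eh_prep(struct model_cmd *cmd, struct model_save *ses)
{
	assert(cmd != NULL && ses != NULL);
	ses->result = cmd->result;
	ses->eh_eflags = cmd->eh_eflags;
	cmd->result = 0;
	cmd->eh_eflags = 0;
}

/* Put the saved fields back once the EH request is done. */
static void model_eh_restore(struct model_cmd *cmd,
			     const struct model_save *ses)
{
	cmd->result = ses->result;
	cmd->eh_eflags = ses->eh_eflags;
}
```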



* [PATCH 4/5] scsi: make asynchronous aborts mandatory
  2016-06-20  9:35 [PATCH 0/5] SCSI EH cleanup Hannes Reinecke
                   ` (2 preceding siblings ...)
  2016-06-20  9:35 ` [PATCH 3/5] scsi: make eh_eflags persistent Hannes Reinecke
@ 2016-06-20  9:35 ` Hannes Reinecke
  2016-06-22 13:31   ` Christoph Hellwig
  2016-06-20  9:35 ` [PATCH 5/5] scsi: Do not escalate failed EH command Hannes Reinecke
  4 siblings, 1 reply; 10+ messages in thread
From: Hannes Reinecke @ 2016-06-20  9:35 UTC
  To: Martin K. Petersen
  Cc: James Bottomley, Christoph Hellwig, linux-scsi, Hannes Reinecke

There haven't been any reports of HBAs where asynchronous aborts
do not work, so we should make them mandatory and remove
the fallback.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 Documentation/scsi/scsi_eh.txt | 28 +++++++--------
 drivers/scsi/scsi_error.c      | 81 ++++--------------------------------------
 drivers/scsi/scsi_lib.c        |  2 +-
 drivers/scsi/scsi_priv.h       |  3 +-
 include/scsi/scsi_host.h       |  5 ---
 5 files changed, 22 insertions(+), 97 deletions(-)

diff --git a/Documentation/scsi/scsi_eh.txt b/Documentation/scsi/scsi_eh.txt
index 745eed5..6e07245fb 100644
--- a/Documentation/scsi/scsi_eh.txt
+++ b/Documentation/scsi/scsi_eh.txt
@@ -70,7 +70,7 @@ with the command.
 	scmd is requeued to blk queue.
 
  - otherwise
-	scsi_eh_scmd_add(scmd, 0) is invoked for the command.  See
+	scsi_eh_scmd_add(scmd) is invoked for the command.  See
 	[1-3] for details of this function.
 
 
@@ -103,13 +103,15 @@ function
         eh_timed_out() callback did not handle the command.
 	Step #2 is taken.
 
- 2. If the host supports asynchronous completion (as indicated by the
-    no_async_abort setting in the host template) scsi_abort_command()
-    is invoked to schedule an asynchrous abort. If that fails
-    Step #3 is taken.
+ 2. scsi_abort_command() is invoked to schedule an asynchronous abort
+    (see [1-3] for more information).
+    Asynchronous aborts are not invoked for commands which have
+    SCSI_EH_ABORT_SCHEDULED set (this indicates that the command
+    had already been aborted once, and this is a retry which failed),
+    or when the EH deadline has expired. In these cases Step #3 is taken.
 
- 2. scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD) is invoked for the
-    command.  See [1-3] for more information.
+ 3. scsi_eh_scmd_add(scmd) is invoked for the
+    command.  See [1-4] for more information.
 
 [1-3] Asynchronous command aborts
 
@@ -124,16 +126,13 @@ function
 
  scmds enter EH via scsi_eh_scmd_add(), which does the following.
 
- 1. Turns on scmd->eh_eflags as requested.  It's 0 for error
-    completions and SCSI_EH_CANCEL_CMD for timeouts.
+ 1. Links scmd->eh_entry to shost->eh_cmd_q
 
- 2. Links scmd->eh_entry to shost->eh_cmd_q
+ 2. Sets SHOST_RECOVERY bit in shost->shost_state
 
- 3. Sets SHOST_RECOVERY bit in shost->shost_state
+ 3. Increments shost->host_failed
 
- 4. Increments shost->host_failed
-
- 5. Wakes up SCSI EH thread if shost->host_busy == shost->host_failed
+ 4. Wakes up SCSI EH thread if shost->host_busy == shost->host_failed
 
  As can be seen above, once any scmd is added to shost->eh_cmd_q,
 SHOST_RECOVERY shost_state bit is turned on.  This prevents any new
@@ -249,7 +248,6 @@ scmd->allowed.
 
  1. Error completion / time out
     ACTION: scsi_eh_scmd_add() is invoked for scmd
-	- set scmd->eh_eflags
 	- add scmd to shost->eh_cmd_q
 	- set SHOST_RECOVERY
 	- shost->host_failed++
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index eb0f19f..cf47b81 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -162,7 +162,7 @@ scmd_eh_abort_handler(struct work_struct *work)
 		}
 	}
 
-	scsi_eh_scmd_add(scmd, 0);
+	scsi_eh_scmd_add(scmd);
 }
 
 /**
@@ -216,9 +216,8 @@ scsi_abort_command(struct scsi_cmnd *scmd)
 /**
  * scsi_eh_scmd_add - add scsi cmd to error handling.
  * @scmd:	scmd to run eh on.
- * @eh_flag:	optional SCSI_EH flag.
  */
-void scsi_eh_scmd_add(struct scsi_cmnd *scmd, int eh_flag)
+void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
 {
 	struct Scsi_Host *shost = scmd->device->host;
 	unsigned long flags;
@@ -236,9 +235,6 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd, int eh_flag)
 	if (shost->eh_deadline != -1 && !shost->last_reset)
 		shost->last_reset = jiffies;
 
-	if (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED)
-		eh_flag &= ~SCSI_EH_CANCEL_CMD;
-	scmd->eh_eflags |= eh_flag;
 	list_add_tail(&scmd->eh_entry, &shost->eh_cmd_q);
 	shost->host_failed++;
 	scsi_eh_wakeup(shost);
@@ -273,10 +269,9 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 		rtn = host->hostt->eh_timed_out(scmd);
 
 	if (rtn == BLK_EH_NOT_HANDLED) {
-		if (host->hostt->no_async_abort ||
-		    scsi_abort_command(scmd) != SUCCESS) {
+		if (scsi_abort_command(scmd) != SUCCESS) {
 			set_host_byte(scmd, DID_TIME_OUT);
-			scsi_eh_scmd_add(scmd, SCSI_EH_CANCEL_CMD);
+			scsi_eh_scmd_add(scmd);
 		}
 	}
 
@@ -329,7 +324,7 @@ static inline void scsi_eh_prt_fail_stats(struct Scsi_Host *shost,
 		list_for_each_entry(scmd, work_q, eh_entry) {
 			if (scmd->device == sdev) {
 				++total_failures;
-				if (scmd->eh_eflags & SCSI_EH_CANCEL_CMD)
+				if (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED)
 					++cmd_cancel;
 				else
 					++cmd_failed;
@@ -1152,8 +1147,7 @@ int scsi_eh_get_sense(struct list_head *work_q,
 	 * should not get sense.
 	 */
 	list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
-		if ((scmd->eh_eflags & SCSI_EH_CANCEL_CMD) ||
-		    (scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED) ||
+		if ((scmd->eh_eflags & SCSI_EH_ABORT_SCHEDULED) ||
 		    SCSI_SENSE_VALID(scmd))
 			continue;
 
@@ -1293,61 +1287,6 @@ static int scsi_eh_test_devices(struct list_head *cmd_list,
 	return list_empty(work_q);
 }
 
-
-/**
- * scsi_eh_abort_cmds - abort pending commands.
- * @work_q:	&list_head for pending commands.
- * @done_q:	&list_head for processed commands.
- *
- * Decription:
- *    Try and see whether or not it makes sense to try and abort the
- *    running command.  This only works out to be the case if we have one
- *    command that has timed out.  If the command simply failed, it makes
- *    no sense to try and abort the command, since as far as the shost
- *    adapter is concerned, it isn't running.
- */
-static int scsi_eh_abort_cmds(struct list_head *work_q,
-			      struct list_head *done_q)
-{
-	struct scsi_cmnd *scmd, *next;
-	LIST_HEAD(check_list);
-	int rtn;
-	struct Scsi_Host *shost;
-
-	list_for_each_entry_safe(scmd, next, work_q, eh_entry) {
-		if (!(scmd->eh_eflags & SCSI_EH_CANCEL_CMD))
-			continue;
-		shost = scmd->device->host;
-		if (scsi_host_eh_past_deadline(shost)) {
-			list_splice_init(&check_list, work_q);
-			SCSI_LOG_ERROR_RECOVERY(3,
-				scmd_printk(KERN_INFO, scmd,
-					    "%s: skip aborting cmd, past eh deadline\n",
-					    current->comm));
-			return list_empty(work_q);
-		}
-		SCSI_LOG_ERROR_RECOVERY(3,
-			scmd_printk(KERN_INFO, scmd,
-				     "%s: aborting cmd\n", current->comm));
-		rtn = scsi_try_to_abort_cmd(shost->hostt, scmd);
-		if (rtn == FAILED) {
-			SCSI_LOG_ERROR_RECOVERY(3,
-				scmd_printk(KERN_INFO, scmd,
-					    "%s: aborting cmd failed\n",
-					     current->comm));
-			list_splice_init(&check_list, work_q);
-			return list_empty(work_q);
-		}
-		scmd->eh_eflags &= ~SCSI_EH_CANCEL_CMD;
-		if (rtn == FAST_IO_FAIL)
-			scsi_eh_finish_cmd(scmd, done_q);
-		else
-			list_move_tail(&scmd->eh_entry, &check_list);
-	}
-
-	return scsi_eh_test_devices(&check_list, work_q, done_q, 0);
-}
-
 /**
  * scsi_eh_try_stu - Send START_UNIT to device.
  * @scmd:	&scsi_cmnd to send START_UNIT
@@ -1690,11 +1629,6 @@ static void scsi_eh_offline_sdevs(struct list_head *work_q,
 		sdev_printk(KERN_INFO, scmd->device, "Device offlined - "
 			    "not ready after error recovery\n");
 		scsi_device_set_state(scmd->device, SDEV_OFFLINE);
-		if (scmd->eh_eflags & SCSI_EH_CANCEL_CMD) {
-			/*
-			 * FIXME: Handle lost cmds.
-			 */
-		}
 		scsi_eh_finish_cmd(scmd, done_q);
 	}
 	return;
@@ -2138,8 +2072,7 @@ static void scsi_unjam_host(struct Scsi_Host *shost)
 	SCSI_LOG_ERROR_RECOVERY(1, scsi_eh_prt_fail_stats(shost, &eh_work_q));
 
 	if (!scsi_eh_get_sense(&eh_work_q, &eh_done_q))
-		if (!scsi_eh_abort_cmds(&eh_work_q, &eh_done_q))
-			scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q);
+		scsi_eh_ready_devs(shost, &eh_work_q, &eh_done_q);
 
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (shost->eh_deadline != -1)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 7a8c9ad..f9b858b 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1559,7 +1559,7 @@ static void scsi_softirq_done(struct request *rq)
 			scsi_queue_insert(cmd, SCSI_MLQUEUE_DEVICE_BUSY);
 			break;
 		default:
-			scsi_eh_scmd_add(cmd, 0);
+			scsi_eh_scmd_add(cmd);
 			break;
 	}
 }
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index de937ba..bc880ca 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -18,7 +18,6 @@ struct scsi_nl_hdr;
 /*
  * Scsi Error Handler Flags
  */
-#define SCSI_EH_CANCEL_CMD	0x0001	/* Cancel this cmd */
 #define SCSI_EH_ABORT_SCHEDULED	0x0002	/* Abort has been scheduled */
 
 #define SCSI_SENSE_VALID(scmd) \
@@ -71,7 +70,7 @@ extern enum blk_eh_timer_return scsi_times_out(struct request *req);
 extern int scsi_error_handler(void *host);
 extern int scsi_decide_disposition(struct scsi_cmnd *cmd);
 extern void scsi_eh_wakeup(struct Scsi_Host *shost);
-extern void scsi_eh_scmd_add(struct scsi_cmnd *, int);
+extern void scsi_eh_scmd_add(struct scsi_cmnd *);
 void scsi_eh_ready_devs(struct Scsi_Host *shost,
 			struct list_head *work_q,
 			struct list_head *done_q);
diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
index 76e9d27..cdc5a1f 100644
--- a/include/scsi/scsi_host.h
+++ b/include/scsi/scsi_host.h
@@ -444,11 +444,6 @@ struct scsi_host_template {
 	unsigned no_write_same:1;
 
 	/*
-	 * True if asynchronous aborts are not supported
-	 */
-	unsigned no_async_abort:1;
-
-	/*
 	 * Countdown for host blocking with no commands outstanding.
 	 */
 	unsigned int max_host_blocked;
-- 
1.8.5.6
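
The timeout path after the patch above can be sketched in user space as
follows (not part of the series). This is a simplified model under stated
assumptions: the model_* names and the boolean fields are hypothetical,
and the EH deadline check is reduced to a flag passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same value as SCSI_EH_ABORT_SCHEDULED in the patch; the name is hypothetical. */
#define MODEL_EH_ABORT_SCHEDULED 0x0002

enum model_rtn { MODEL_SUCCESS, MODEL_FAILED };

struct model_cmd {
	int eh_eflags;
	bool abort_queued;	/* an asynchronous abort was scheduled */
	bool timed_out;		/* stand-in for set_host_byte(DID_TIME_OUT) */
	bool in_eh;		/* stand-in for scsi_eh_scmd_add() */
};

/* Refuse a second abort for the same command, or any abort past the deadline. */
static enum model_rtn model_abort_command(struct model_cmd *cmd,
					  bool past_deadline)
{
	if (cmd->eh_eflags & MODEL_EH_ABORT_SCHEDULED)
		return MODEL_FAILED;
	if (past_deadline)
		return MODEL_FAILED;
	cmd->eh_eflags |= MODEL_EH_ABORT_SCHEDULED;
	cmd->abort_queued = true;
	return MODEL_SUCCESS;
}

/* Always try the asynchronous abort first; fall back to EH if it is refused. */
static void model_times_out(struct model_cmd *cmd, bool past_deadline)
{
	assert(cmd != NULL);
	if (model_abort_command(cmd, past_deadline) != MODEL_SUCCESS) {
		cmd->timed_out = true;
		cmd->in_eh = true;
	}
}
```

The first timeout of a command schedules an abort; a second timeout of
the same command finds the flag still set and goes straight to EH, which
only works if the flag survives between the two events.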



* [PATCH 5/5] scsi: Do not escalate failed EH command
  2016-06-20  9:35 [PATCH 0/5] SCSI EH cleanup Hannes Reinecke
                   ` (3 preceding siblings ...)
  2016-06-20  9:35 ` [PATCH 4/5] scsi: make asynchronous aborts mandatory Hannes Reinecke
@ 2016-06-20  9:35 ` Hannes Reinecke
  2016-06-22 13:36   ` Christoph Hellwig
  4 siblings, 1 reply; 10+ messages in thread
From: Hannes Reinecke @ 2016-06-20  9:35 UTC
  To: Martin K. Petersen
  Cc: James Bottomley, Christoph Hellwig, linux-scsi, Hannes Reinecke,
	Hannes Reinecke

If an EH command fails there is no need to escalate; we are already
in EH and the escalation will start anyway.

Signed-off-by: Hannes Reinecke <hare@suse.com>
---
 drivers/scsi/scsi_error.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index cf47b81..7df6818 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -870,15 +870,6 @@ static int scsi_try_to_abort_cmd(struct scsi_host_template *hostt,
 	return hostt->eh_abort_handler(scmd);
 }
 
-static void scsi_abort_eh_cmnd(struct scsi_cmnd *scmd)
-{
-	if (scsi_try_to_abort_cmd(scmd->device->host->hostt, scmd) != SUCCESS)
-		if (scsi_try_bus_device_reset(scmd) != SUCCESS)
-			if (scsi_try_target_reset(scmd) != SUCCESS)
-				if (scsi_try_bus_reset(scmd) != SUCCESS)
-					scsi_try_host_reset(scmd);
-}
-
 /**
  * scsi_eh_prep_cmnd  - Save a scsi command info as part of error recovery
  * @scmd:       SCSI command structure to hijack
@@ -1062,10 +1053,8 @@ retry:
 			rtn = FAILED;
 			break;
 		}
-	} else if (rtn != FAILED) {
-		scsi_abort_eh_cmnd(scmd);
+	} else if (rtn != FAILED)
 		rtn = FAILED;
-	}
 
 	scsi_eh_restore_cmnd(scmd, &ses);
 
-- 
1.8.5.6
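
For reference, the helper removed by the patch above walked a fixed
ladder of recovery steps and stopped at the first one that reported
success. A minimal user-space sketch of that pattern (not part of the
series; the model_* names and the stub steps are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

enum model_rtn { MODEL_SUCCESS, MODEL_FAILED };

typedef enum model_rtn (*model_step_fn)(void);

/* Stub recovery steps; the removed helper called abort, device reset, etc. */
static enum model_rtn model_step_ok(void)
{
	return MODEL_SUCCESS;
}

static enum model_rtn model_step_fail(void)
{
	return MODEL_FAILED;
}

/*
 * Try each step in order. Returns the index of the first step that
 * succeeded, or -1 if every step failed.
 */
static int model_escalate(const model_step_fn *steps, size_t n)
{
	size_t i;

	assert(steps != NULL);
	for (i = 0; i < n; i++)
		if (steps[i]() == MODEL_SUCCESS)
			return (int)i;
	return -1;
}
```

The patch drops this manual ladder for failed EH commands on the
argument that the regular EH escalation runs afterwards anyway.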



* Re: [PATCH 2/5] scsi: make scsi_eh_scmd_add() always succeed
  2016-06-20  9:35 ` [PATCH 2/5] scsi: make scsi_eh_scmd_add() always succeed Hannes Reinecke
@ 2016-06-22 13:28   ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2016-06-22 13:28 UTC
  To: Hannes Reinecke
  Cc: Martin K. Petersen, James Bottomley, Christoph Hellwig,
	linux-scsi

Agreed, I think trying to handle these sorts of errors isn't going
to be helpful, while the WARN_ON at least gives us a chance to
diagnose the issue if it ever happened.

> +	WARN_ON(!shost->ehandler);
>  
>  	spin_lock_irqsave(shost->host_lock, flags);
> +	WARN_ON(shost->shost_state != SHOST_RUNNING &&
> +		shost->shost_state != SHOST_CANCEL &&
> +		shost->shost_state != SHOST_RECOVERY &&
> +		shost->shost_state != SHOST_CANCEL_RECOVERY);

Use WARN_ON_ONCE to avoid repeated backtraces for the same condition.

>  	if (scsi_host_set_state(shost, SHOST_RECOVERY))
> -		if (scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY))
> -			goto out_unlock;
> +		scsi_host_set_state(shost, SHOST_CANCEL_RECOVERY);

No warn_on or early return here?
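
To illustrate the WARN_ON_ONCE suggestion above, here is a user-space
sketch of the difference between warning every time and warning once per
call site. The MODEL_* macros and counters are hypothetical stand-ins,
not the kernel's implementation; they only mimic the "fire once" behavior
with a static flag local to each expansion.

```c
#include <assert.h>
#include <stdio.h>

static int model_warn_total;		/* times the plain warning fired */
static int model_warn_once_total;	/* times the once-only warning fired */

/* Fires on every call where the condition holds. */
#define MODEL_WARN_ON(cond)					\
	do {							\
		if (cond) {					\
			model_warn_total++;			\
			fprintf(stderr, "WARN: %s\n", #cond);	\
		}						\
	} while (0)

/* Fires at most once per expansion site, tracked by a static flag. */
#define MODEL_WARN_ON_ONCE(cond)					\
	do {								\
		static int warned_;					\
		if ((cond) && !warned_) {				\
			warned_ = 1;					\
			model_warn_once_total++;			\
			fprintf(stderr, "WARN (once): %s\n", #cond);	\
		}							\
	} while (0)

/* One call site for each flavor, checked against the same condition. */
static void model_check_state(int state_is_legal)
{
	assert(state_is_legal == 0 || state_is_legal == 1);
	MODEL_WARN_ON(!state_is_legal);
	MODEL_WARN_ON_ONCE(!state_is_legal);
}
```

With a host stuck in a bad state, every command entering EH would hit
the check, so the once-only variant keeps the log to a single report.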


* Re: [PATCH 4/5] scsi: make asynchronous aborts mandatory
  2016-06-20  9:35 ` [PATCH 4/5] scsi: make asynchronous aborts mandatory Hannes Reinecke
@ 2016-06-22 13:31   ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2016-06-22 13:31 UTC
  To: Hannes Reinecke
  Cc: Martin K. Petersen, James Bottomley, Christoph Hellwig,
	linux-scsi

Looks fine and probably should move toward the beginning of
the series:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 3/5] scsi: make eh_eflags persistent
  2016-06-20  9:35 ` [PATCH 3/5] scsi: make eh_eflags persistent Hannes Reinecke
@ 2016-06-22 13:33   ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2016-06-22 13:33 UTC
  To: Hannes Reinecke
  Cc: Martin K. Petersen, James Bottomley, Christoph Hellwig,
	linux-scsi

On Mon, Jun 20, 2016 at 11:35:38AM +0200, Hannes Reinecke wrote:
> To detect if a failed command has been retried we must not
> clear scmd->eh_eflags when EH finishes.
> The flag should be persistent throughout the lifetime
> of the command.

Please explain what issue this solves - the behavior has been there
for a while and even documented, so explaining how this was wrong
would be very useful.


* Re: [PATCH 5/5] scsi: Do not escalate failed EH command
  2016-06-20  9:35 ` [PATCH 5/5] scsi: Do not escalate failed EH command Hannes Reinecke
@ 2016-06-22 13:36   ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2016-06-22 13:36 UTC
  To: Hannes Reinecke
  Cc: Martin K. Petersen, James Bottomley, Christoph Hellwig,
	linux-scsi, Hannes Reinecke

On Mon, Jun 20, 2016 at 11:35:40AM +0200, Hannes Reinecke wrote:
> If an EH command fails there is no need to escalate; we are already
> in EH and the escalation will start anyway.

I agree with this in principle, but is this really the case for all
callers?

E.g. the call to scsi_request_sense in scsi_eh_get_sense simply
skips to the next cmd on failure.  This could use a little more
description explaining how all callers of this are indeed fine
with not escalating manually.
