public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] scsi: handle special return codes for ABORTED COMMAND
@ 2018-01-29 23:30 Martin Wilck
  2018-01-29 23:39 ` Bart Van Assche
  0 siblings, 1 reply; 3+ messages in thread
From: Martin Wilck @ 2018-01-29 23:30 UTC (permalink / raw)
  To: Martin K. Petersen, Hannes Reinecke
  Cc: James Bottomley, Xose Vazquez Perez, Martin Wilck, linux-scsi

(Resending, because I forgot to cc linux-scsi. BIG SORRY!)

Introduce a new blist flag that indicates the device may return certain
sense code/ASC/ASCQ combinations that indicate different treatment than
normal. In particular, some devices need unconditional retry (aka
ADD_TO_MLQUEUE) under certain conditions; otherwise path failures may be
falsely detected.

Because different devices use different sense codes to indicate this
condition, a single bit is not enough. But we also can't use a bit for
every possible status code. Therefore the new flag BLIST_ABORTED_CMD_QUIRK
just indicates that this is a device for which the return code should be
checked. An extra "blacklist" in scsi_aborted_cmd_quirk() then checks for
known ASC/ASCQ combinations, and handles them.

When such a combination is encountered for the first time, a message is
printed. In systems that have several different peripherals using this
flag, that might lead to a wrong match without a warning. This small risk
is a compromise between exactness and the excessive resources that would be
required to check for matching device vendor and model name every time.

I chose to recycle devinfo flag (1<<11) (former BLIST_INQUIRY_58, free
since 496c91bbc910) for this purpose rather than extending blist_flags_t to
64 bit. This could be easily changed of course.

This patch replaces the previously submitted patches
 "scsi: handle ABORTED_COMMAND on Fujitsu ETERNUS" and
 "scsi: Always retry internal target error" (Hannes Reinecke)

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 drivers/scsi/scsi_devinfo.c |  4 ++-
 drivers/scsi/scsi_error.c   | 82 +++++++++++++++++++++++++++++++++++++++++++++
 include/scsi/scsi_devinfo.h |  6 ++++
 3 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_devinfo.c b/drivers/scsi/scsi_devinfo.c
index dfb8da83fa50..39455734d934 100644
--- a/drivers/scsi/scsi_devinfo.c
+++ b/drivers/scsi/scsi_devinfo.c
@@ -161,12 +161,14 @@ static struct {
 	{"DGC", "RAID", NULL, BLIST_SPARSELUN},	/* Dell PV 650F, storage on LUN 0 */
 	{"DGC", "DISK", NULL, BLIST_SPARSELUN},	/* Dell PV 650F, no storage on LUN 0 */
 	{"EMC",  "Invista", "*", BLIST_SPARSELUN | BLIST_LARGELUN},
-	{"EMC", "SYMMETRIX", NULL, BLIST_SPARSELUN | BLIST_LARGELUN | BLIST_REPORTLUN2},
+	{"EMC", "SYMMETRIX", NULL, BLIST_SPARSELUN | BLIST_LARGELUN
+	 | BLIST_REPORTLUN2 | BLIST_ABORTED_CMD_QUIRK},
 	{"EMULEX", "MD21/S2     ESDI", NULL, BLIST_SINGLELUN},
 	{"easyRAID", "16P", NULL, BLIST_NOREPORTLUN},
 	{"easyRAID", "X6P", NULL, BLIST_NOREPORTLUN},
 	{"easyRAID", "F8", NULL, BLIST_NOREPORTLUN},
 	{"FSC", "CentricStor", "*", BLIST_SPARSELUN | BLIST_LARGELUN},
+	{"FUJITSU", "ETERNUS_DXM", "*", BLIST_ABORTED_CMD_QUIRK},
 	{"Generic", "USB SD Reader", "1.00", BLIST_FORCELUN | BLIST_INQUIRY_36},
 	{"Generic", "USB Storage-SMC", "0180", BLIST_FORCELUN | BLIST_INQUIRY_36},
 	{"Generic", "USB Storage-SMC", "0207", BLIST_FORCELUN | BLIST_INQUIRY_36},
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 62b56de38ae8..91f5cdc85dfa 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -39,6 +39,7 @@
 #include <scsi/scsi_ioctl.h>
 #include <scsi/scsi_dh.h>
 #include <scsi/sg.h>
+#include <scsi/scsi_devinfo.h>
 
 #include "scsi_priv.h"
 #include "scsi_logging.h"
@@ -432,6 +433,84 @@ static void scsi_report_sense(struct scsi_device *sdev,
 	}
 }
 
+struct aborted_cmd_blist {
+	u8 asc;
+	u8 ascq;
+	int retval;
+	const char *vendor;
+	const char *model;
+	bool warned;
+};
+
+/**
+ * scsi_aborted_cmd_quirk - Handle special return codes for ABORTED COMMAND
+ * @sdev:	SCSI device that returned ABORTED COMMAND.
+ * @sshdr:	Sense data
+ *
+ * Return value:
+ *	SUCCESS or FAILED or NEEDS_RETRY or ADD_TO_MLQUEUE
+ *
+ * Notes:
+ *	This is only called for devices that have the blist flag
+ *      BLIST_ABORTED_CMD_QUIRK set.
+ */
+static int scsi_aborted_cmd_quirk(const struct scsi_device *sdev,
+				  const struct scsi_sense_hdr *sshdr)
+{
+	static struct aborted_cmd_blist blist[] = {
+		/*
+		 * 44/00: SYMMETRIX uses this code for a variety of internal
+		 * issues, all of which can be recovered by retry
+		 */
+		{ 0x44, 0x00, ADD_TO_MLQUEUE, "EMC", "SYMMETRIX", false },
+		/*
+		 * c1/01: This is used by ETERNUS to indicate the
+		 * command should be retried unconditionally
+		 */
+		{ 0xc1, 0x01, ADD_TO_MLQUEUE, "FUJITSU", "ETERNUS_DXM", false }
+	};
+	struct aborted_cmd_blist *found = NULL;
+	int ret = NEEDS_RETRY, i;
+
+	for (i = 0; i < sizeof(blist)/sizeof(struct aborted_cmd_blist); i++) {
+		if (sshdr->asc == blist[i].asc &&
+		    sshdr->ascq == blist[i].ascq) {
+			found = &blist[i];
+			ret = found->retval;
+			break;
+		}
+	}
+
+	if (found == NULL || found->warned)
+		return ret;
+
+	found->warned = true;
+
+	/*
+	 * When we encounter a known ASC/ASCQ combination, it may or may not
+	 * match the device for which this combination is known.
+	 * Warn only once for each known ASC/ASCQ combination.
+	 * We can't afford making a string comparison every time in the
+	 * SCSI command return path, and a wrong match here is expected to be
+	 * non-fatal.
+	 */
+	if (!strcmp(sdev->vendor, found->vendor) &&
+	    !strcmp(sdev->model, found->model)) {
+		SCSI_LOG_ERROR_RECOVERY(3,
+			sdev_printk(KERN_INFO, sdev,
+				    "special retcode %d for ABORTED COMMAND %02x/%02x on %s:%s (expected)",
+				    ret, sshdr->asc, sshdr->ascq,
+				    sdev->vendor, sdev->model));
+	} else {
+		SCSI_LOG_ERROR_RECOVERY(1,
+			sdev_printk(KERN_WARNING, sdev,
+				    "special retcode %d for ABORTED COMMAND %02x/%02x on %s:%s (UNEXPECTED, please inform linux-scsi@vger.kernel.org)",
+				    ret, sshdr->asc, sshdr->ascq,
+				    sdev->vendor, sdev->model));
+	}
+	return ret;
+}
+
 /**
  * scsi_check_sense - Examine scsi cmd sense
  * @scmd:	Cmd to have sense checked.
@@ -503,6 +582,9 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
 		if (sshdr.asc == 0x10) /* DIF */
 			return SUCCESS;
 
+		if (sdev->sdev_bflags & BLIST_ABORTED_CMD_QUIRK)
+			return scsi_aborted_cmd_quirk(sdev, &sshdr);
+
 		return NEEDS_RETRY;
 	case NOT_READY:
 	case UNIT_ATTENTION:
diff --git a/include/scsi/scsi_devinfo.h b/include/scsi/scsi_devinfo.h
index ea67c32e870e..1f5ed53040ab 100644
--- a/include/scsi/scsi_devinfo.h
+++ b/include/scsi/scsi_devinfo.h
@@ -28,6 +28,12 @@
 #define BLIST_LARGELUN		((__force blist_flags_t)(1 << 9))
 /* override additional length field */
 #define BLIST_INQUIRY_36	((__force blist_flags_t)(1 << 10))
+/*
+ * Device uses special return codes for ABORTED COMMAND
+ * This flag must go together with matching status handling code in
+ * scsi_aborted_cmd_quirk()
+ */
+#define BLIST_ABORTED_CMD_QUIRK	((__force blist_flags_t)(1 << 11))
 /* do not do automatic start on add */
 #define BLIST_NOSTARTONADD	((__force blist_flags_t)(1 << 12))
 /* try REPORT_LUNS even for SCSI-2 devs (if HBA supports more than 8 LUNs) */
-- 
2.16.1

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] scsi: handle special return codes for ABORTED COMMAND
  2018-01-29 23:30 [PATCH] scsi: handle special return codes for ABORTED COMMAND Martin Wilck
@ 2018-01-29 23:39 ` Bart Van Assche
  2018-01-30  0:01   ` Martin Wilck
  0 siblings, 1 reply; 3+ messages in thread
From: Bart Van Assche @ 2018-01-29 23:39 UTC (permalink / raw)
  To: mwilck@suse.com, hare@suse.de, martin.petersen@oracle.com
  Cc: jejb@linux.vnet.ibm.com, xose.vazquez@gmail.com,
	linux-scsi@vger.kernel.org

On Tue, 2018-01-30 at 00:30 +0100, Martin Wilck wrote:
> +	static struct aborted_cmd_blist blist[] = {

Please consider to declare this array const.

> +	for (i = 0; i < sizeof(blist)/sizeof(struct aborted_cmd_blist); i++) {

Have you considered to use ARRAY_SIZE()?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] scsi: handle special return codes for ABORTED COMMAND
  2018-01-29 23:39 ` Bart Van Assche
@ 2018-01-30  0:01   ` Martin Wilck
  0 siblings, 0 replies; 3+ messages in thread
From: Martin Wilck @ 2018-01-30  0:01 UTC (permalink / raw)
  To: Bart Van Assche, hare@suse.de, martin.petersen@oracle.com
  Cc: jejb@linux.vnet.ibm.com, xose.vazquez@gmail.com,
	linux-scsi@vger.kernel.org

On Mon, 2018-01-29 at 23:39 +0000, Bart Van Assche wrote:
> On Tue, 2018-01-30 at 00:30 +0100, Martin Wilck wrote:
> > +	static struct aborted_cmd_blist blist[] = {
> 
> Please consider to declare this array const.

That doesn't work because I want to store the "warned" flag in it. And
I think it'd be good to print one message per asc/ascq combination,
rather than just having a single static "warned" variable.

> > +	for (i = 0; i < sizeof(blist)/sizeof(struct
> > aborted_cmd_blist); i++) {
> 
> Have you considered to use ARRAY_SIZE()?

No. Thanks for the hint. I'll re-post.

Martin

-- 
Dr. Martin Wilck <mwilck@suse.com>, Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2018-01-30  0:02 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-29 23:30 [PATCH] scsi: handle special return codes for ABORTED COMMAND Martin Wilck
2018-01-29 23:39 ` Bart Van Assche
2018-01-30  0:01   ` Martin Wilck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox