public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [V2 PATCH 0/2] TUR path is down after adapter gets reset in multipath configuration
@ 2014-11-03 21:18 wenxiong
  2014-11-03 21:18 ` [V2 PATCH 1/2] scsi: call device handler for failed TUR command wenxiong
  2014-11-03 21:18 ` [V2 PATCH 2/2] scsi: TUR path is down after adapter gets reset with multipath wenxiong
  0 siblings, 2 replies; 7+ messages in thread
From: wenxiong @ 2014-11-03 21:18 UTC (permalink / raw)
  To: James.Bottomley; +Cc: hch, linux-scsi, brking

-- 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [V2 PATCH 1/2] scsi: call device handler for failed TUR command
  2014-11-03 21:18 [V2 PATCH 0/2] TUR path is down after adapter gets reset in multipath configuration wenxiong
@ 2014-11-03 21:18 ` wenxiong
  2014-11-03 21:18 ` [V2 PATCH 2/2] scsi: TUR path is down after adapter gets reset with multipath wenxiong
  1 sibling, 0 replies; 7+ messages in thread
From: wenxiong @ 2014-11-03 21:18 UTC (permalink / raw)
  To: James.Bottomley; +Cc: hch, linux-scsi, brking

[-- Attachment #1: allow_restart1 --]
[-- Type: text/plain, Size: 2092 bytes --]

Multipath devices using the TUR path checker need to see the sense
code for a failed TUR command in their device handler.  Since commit
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/
drivers/scsi/scsi_error.c?id=14216561e164671ce147458653b1fea06a
we always return success for mid layer issued TUR commands before
calling the device handler, which stopped the TUR path checker from working.

Move the call to the device handler check sense method before the early
return for TUR commands to give the device handler a chance to intercept
them.

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Tested-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
---
 drivers/scsi/scsi_error.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

Index: b/drivers/scsi/scsi_error.c
===================================================================
--- a/drivers/scsi/scsi_error.c	2014-10-23 12:54:16.000000000 -0500
+++ b/drivers/scsi/scsi_error.c	2014-10-23 12:57:44.642078988 -0500
@@ -459,14 +459,6 @@ static int scsi_check_sense(struct scsi_
 	if (! scsi_command_normalize_sense(scmd, &sshdr))
 		return FAILED;	/* no valid sense data */
 
-	if (scmd->cmnd[0] == TEST_UNIT_READY && scmd->scsi_done != scsi_eh_done)
-		/*
-		 * nasty: for mid-layer issued TURs, we need to return the
-		 * actual sense data without any recovery attempt.  For eh
-		 * issued ones, we need to try to recover and interpret
-		 */
-		return SUCCESS;
-
 	scsi_report_sense(sdev, &sshdr);
 
 	if (scsi_sense_is_deferred(&sshdr))
@@ -482,6 +474,14 @@ static int scsi_check_sense(struct scsi_
 		/* handler does not care. Drop down to default handling */
 	}
 
+	if (scmd->cmnd[0] == TEST_UNIT_READY && scmd->scsi_done != scsi_eh_done)
+		/*
+		 * nasty: for mid-layer issued TURs, we need to return the
+		 * actual sense data without any recovery attempt.  For eh
+		 * issued ones, we need to try to recover and interpret
+		 */
+		return SUCCESS;
+
 	/*
 	 * Previous logic looked for FILEMARK, EOM or ILI which are
 	 * mainly associated with tapes and returned SUCCESS.

-- 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [V2 PATCH 2/2] scsi: TUR path is down after adapter gets reset with multipath
  2014-11-03 21:18 [V2 PATCH 0/2] TUR path is down after adapter gets reset in multipath configuration wenxiong
  2014-11-03 21:18 ` [V2 PATCH 1/2] scsi: call device handler for failed TUR command wenxiong
@ 2014-11-03 21:18 ` wenxiong
  2014-11-04  7:21   ` Hannes Reinecke
  1 sibling, 1 reply; 7+ messages in thread
From: wenxiong @ 2014-11-03 21:18 UTC (permalink / raw)
  To: James.Bottomley; +Cc: hch, linux-scsi, brking

[-- Attachment #1: allow_restart2 --]
[-- Type: text/plain, Size: 2185 bytes --]

This patch fixes an issue with multipath ipr SAS devices which require a
start unit command to be issued following an adapter reset. Without this
patch, paths get marked failed following an adapter reset and since the
error handler never gets invoked to issue the start unit, the paths are
never recovered. Returning FAILED for this case ensures the error
handler wakes up to issue the start unit.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Teste-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
---
 drivers/scsi/device_handler/scsi_dh_alua.c |   20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

Index: b/drivers/scsi/device_handler/scsi_dh_alua.c
===================================================================
--- a/drivers/scsi/device_handler/scsi_dh_alua.c	2014-10-29 20:50:29.000000000 -0500
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c	2014-11-03 14:24:39.482078993 -0600
@@ -84,6 +84,7 @@ struct alua_dh_data {
 
 static char print_alua_state(int);
 static int alua_check_sense(struct scsi_device *, struct scsi_sense_hdr *);
+static int alua_check_sense_handler(struct scsi_device *, struct scsi_sense_hdr *);
 
 static inline struct alua_dh_data *get_alua_data(struct scsi_device *sdev)
 {
@@ -519,6 +520,23 @@ static int alua_check_sense(struct scsi_
 	return SCSI_RETURN_NOT_HANDLED;
 }
 
+static int alua_check_sense_handler(struct scsi_device *sdev,
+			    struct scsi_sense_hdr *sense_hdr)
+{
+	switch (sense_hdr->sense_key) {
+	case NOT_READY:
+		if (sdev->allow_restart &&
+		    sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x02)
+			/*
+			 * if the device is not started, we need to wake
+			 * the error handler to start the motor
+			 */
+			return FAILED;
+		break;
+	}
+	alua_check_sense(sdev, sense_hdr);
+}
+
 /*
  * alua_rtpg - Evaluate REPORT TARGET GROUP STATES
  * @sdev: the device to be evaluated.
@@ -827,7 +845,7 @@ static struct scsi_device_handler alua_d
 	.attach = alua_bus_attach,
 	.detach = alua_bus_detach,
 	.prep_fn = alua_prep_fn,
-	.check_sense = alua_check_sense,
+	.check_sense = alua_check_sense_handler,
 	.activate = alua_activate,
 	.set_params = alua_set_params,
 	.match = alua_match,

-- 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [ V2 PATCH 2/2] scsi: TUR path is down after adapter gets reset with multipath
  2014-11-03 21:28 [ V2 PATCH 0/2] TUR path is down after adapter gets reset in multipath configuration wenxiong
@ 2014-11-03 21:28 ` wenxiong
  0 siblings, 0 replies; 7+ messages in thread
From: wenxiong @ 2014-11-03 21:28 UTC (permalink / raw)
  To: James.Bottomley; +Cc: hch, linux-scsi, brking

[-- Attachment #1: allow_restart2 --]
[-- Type: text/plain, Size: 2185 bytes --]

This patch fixes an issue with multipath ipr SAS devices which require a
start unit command to be issued following an adapter reset. Without this
patch, paths get marked failed following an adapter reset and since the
error handler never gets invoked to issue the start unit, the paths are
never recovered. Returning FAILED for this case ensures the error
handler wakes up to issue the start unit.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Teste-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
---
 drivers/scsi/device_handler/scsi_dh_alua.c |   20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

Index: b/drivers/scsi/device_handler/scsi_dh_alua.c
===================================================================
--- a/drivers/scsi/device_handler/scsi_dh_alua.c	2014-10-29 20:50:29.000000000 -0500
+++ b/drivers/scsi/device_handler/scsi_dh_alua.c	2014-11-03 14:24:39.482078993 -0600
@@ -84,6 +84,7 @@ struct alua_dh_data {
 
 static char print_alua_state(int);
 static int alua_check_sense(struct scsi_device *, struct scsi_sense_hdr *);
+static int alua_check_sense_handler(struct scsi_device *, struct scsi_sense_hdr *);
 
 static inline struct alua_dh_data *get_alua_data(struct scsi_device *sdev)
 {
@@ -519,6 +520,23 @@ static int alua_check_sense(struct scsi_
 	return SCSI_RETURN_NOT_HANDLED;
 }
 
+static int alua_check_sense_handler(struct scsi_device *sdev,
+			    struct scsi_sense_hdr *sense_hdr)
+{
+	switch (sense_hdr->sense_key) {
+	case NOT_READY:
+		if (sdev->allow_restart &&
+		    sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x02)
+			/*
+			 * if the device is not started, we need to wake
+			 * the error handler to start the motor
+			 */
+			return FAILED;
+		break;
+	}
+	alua_check_sense(sdev, sense_hdr);
+}
+
 /*
  * alua_rtpg - Evaluate REPORT TARGET GROUP STATES
  * @sdev: the device to be evaluated.
@@ -827,7 +845,7 @@ static struct scsi_device_handler alua_d
 	.attach = alua_bus_attach,
 	.detach = alua_bus_detach,
 	.prep_fn = alua_prep_fn,
-	.check_sense = alua_check_sense,
+	.check_sense = alua_check_sense_handler,
 	.activate = alua_activate,
 	.set_params = alua_set_params,
 	.match = alua_match,

-- 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [V2 PATCH 2/2] scsi: TUR path is down after adapter gets reset with multipath
  2014-11-03 21:18 ` [V2 PATCH 2/2] scsi: TUR path is down after adapter gets reset with multipath wenxiong
@ 2014-11-04  7:21   ` Hannes Reinecke
  2014-11-04 15:01     ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Hannes Reinecke @ 2014-11-04  7:21 UTC (permalink / raw)
  To: wenxiong, James.Bottomley; +Cc: hch, linux-scsi, brking

On 11/03/2014 10:18 PM, wenxiong@linux.vnet.ibm.com wrote:
> This patch fixes an issue with multipath ipr SAS devices which require a
> start unit command to be issued following an adapter reset. Without this
> patch, paths get marked failed following an adapter reset and since the
> error handler never gets invoked to issue the start unit, the paths are
> never recovered. Returning FAILED for this case ensures the error
> handler wakes up to issue the start unit.
> 
> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
> Teste-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
> ---
>  drivers/scsi/device_handler/scsi_dh_alua.c |   20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> Index: b/drivers/scsi/device_handler/scsi_dh_alua.c
> ===================================================================
> --- a/drivers/scsi/device_handler/scsi_dh_alua.c	2014-10-29 20:50:29.000000000 -0500
> +++ b/drivers/scsi/device_handler/scsi_dh_alua.c	2014-11-03 14:24:39.482078993 -0600
> @@ -84,6 +84,7 @@ struct alua_dh_data {
>  
>  static char print_alua_state(int);
>  static int alua_check_sense(struct scsi_device *, struct scsi_sense_hdr *);
> +static int alua_check_sense_handler(struct scsi_device *, struct scsi_sense_hdr *);
>  
>  static inline struct alua_dh_data *get_alua_data(struct scsi_device *sdev)
>  {
> @@ -519,6 +520,23 @@ static int alua_check_sense(struct scsi_
>  	return SCSI_RETURN_NOT_HANDLED;
>  }
>  
> +static int alua_check_sense_handler(struct scsi_device *sdev,
> +			    struct scsi_sense_hdr *sense_hdr)
> +{
> +	switch (sense_hdr->sense_key) {
> +	case NOT_READY:
> +		if (sdev->allow_restart &&
> +		    sense_hdr->asc == 0x04 && sense_hdr->ascq == 0x02)
> +			/*
> +			 * if the device is not started, we need to wake
> +			 * the error handler to start the motor
> +			 */
> +			return FAILED;
> +		break;
> +	}
> +	alua_check_sense(sdev, sense_hdr);
> +}
> +
>  /*
>   * alua_rtpg - Evaluate REPORT TARGET GROUP STATES
>   * @sdev: the device to be evaluated.
> @@ -827,7 +845,7 @@ static struct scsi_device_handler alua_d
>  	.attach = alua_bus_attach,
>  	.detach = alua_bus_detach,
>  	.prep_fn = alua_prep_fn,
> -	.check_sense = alua_check_sense,
> +	.check_sense = alua_check_sense_handler,
>  	.activate = alua_activate,
>  	.set_params = alua_set_params,
>  	.match = alua_match,
> 
Why did you use a wrapper for an already existing function?
Please fold the logic into alua_check_sense().

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [V2 PATCH 2/2] scsi: TUR path is down after adapter gets reset with multipath
  2014-11-04  7:21   ` Hannes Reinecke
@ 2014-11-04 15:01     ` Christoph Hellwig
  2014-11-05  8:56       ` Hannes Reinecke
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2014-11-04 15:01 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: wenxiong, James.Bottomley, hch, linux-scsi, brking

On Tue, Nov 04, 2014 at 08:21:13AM +0100, Hannes Reinecke wrote:
> Why did you use a wrapper for an already existing function?
> Please fold the logic into alua_check_sense().

That's what the first version did.  See the response from Rob to it
on why it's done this way.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [V2 PATCH 2/2] scsi: TUR path is down after adapter gets reset with multipath
  2014-11-04 15:01     ` Christoph Hellwig
@ 2014-11-05  8:56       ` Hannes Reinecke
  0 siblings, 0 replies; 7+ messages in thread
From: Hannes Reinecke @ 2014-11-05  8:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: wenxiong, James.Bottomley, linux-scsi, brking,
	Elliott, Robert (Server Storage)

On 11/04/2014 04:01 PM, Christoph Hellwig wrote:
> On Tue, Nov 04, 2014 at 08:21:13AM +0100, Hannes Reinecke wrote:
>> Why did you use a wrapper for an already existing function?
>> Please fold the logic into alua_check_sense().
> 
> That's what the first version did.  See the response from Rob to it
> on why it's done this way.

Hmm. I see. But I still thing it should be wrapped into
alua_check_sense().
The main objection from Rob was that a sense code of 02/04/02 should
cause a START STOP UNIT to be sent, but the latter is might be
invalid for certain ALUA states.
Thing is, we're only sending START STOP UNIT as part of the main
SCSI EH routine, so that would work in either way.

And looking a scsi_dh_alua we're only checking a return value of
ADD_TO_MLQUEUE; any other value is treated as an I/O error.
So it doesn't really matter if alua_check_sense would return
FAILED here; it still would be folded into the final return value of
SCSI_DH_IO.

So I don't think we need to have a wrapper around alua_check_sense()
but can rather merge them both.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke		      zSeries & Storage
hare@suse.de			      +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 21284 (AG Nürnberg)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-11-05  8:56 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-03 21:18 [V2 PATCH 0/2] TUR path is down after adapter gets reset in multipath configuration wenxiong
2014-11-03 21:18 ` [V2 PATCH 1/2] scsi: call device handler for failed TUR command wenxiong
2014-11-03 21:18 ` [V2 PATCH 2/2] scsi: TUR path is down after adapter gets reset with multipath wenxiong
2014-11-04  7:21   ` Hannes Reinecke
2014-11-04 15:01     ` Christoph Hellwig
2014-11-05  8:56       ` Hannes Reinecke
  -- strict thread matches above, loose matches on Subject: below --
2014-11-03 21:28 [ V2 PATCH 0/2] TUR path is down after adapter gets reset in multipath configuration wenxiong
2014-11-03 21:28 ` [ V2 PATCH 2/2] scsi: TUR path is down after adapter gets reset with multipath wenxiong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox