public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Fix system resume for SCSI devices
@ 2024-08-09 19:31 Bart Van Assche
  2024-08-09 19:31 ` [PATCH v2 1/2] scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is set Bart Van Assche
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Bart Van Assche @ 2024-08-09 19:31 UTC (permalink / raw)
  To: Martin K . Petersen; +Cc: linux-scsi, Bart Van Assche

Hi Martin,

This patch series fixes a particular type of system resume failure for
SCSI devices. Please consider this patch series for the next merge window.

Thanks,

Bart.

Changes compared to v1:
 - Renamed SCMD_RETRY_PASST_ON_UA into SCMD_RETRY_PASSTHROUGH.

Bart Van Assche (2):
  scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is
    set
  sd: Retry START STOP UNIT commands

 drivers/scsi/scsi_error.c | 5 ++++-
 drivers/scsi/sd.c         | 1 +
 include/scsi/scsi_cmnd.h  | 5 ++++-
 3 files changed, 9 insertions(+), 2 deletions(-)


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/2] scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is set
  2024-08-09 19:31 [PATCH v2 0/2] Fix system resume for SCSI devices Bart Van Assche
@ 2024-08-09 19:31 ` Bart Van Assche
  2024-08-19 23:25   ` Damien Le Moal
  2024-08-29  2:46   ` Martin K. Petersen
  2024-08-09 19:31 ` [PATCH v2 2/2] sd: Retry START STOP UNIT commands Bart Van Assche
  2024-08-09 22:13 ` [PATCH v2 0/2] Fix system resume for SCSI devices Martin K. Petersen
  2 siblings, 2 replies; 8+ messages in thread
From: Bart Van Assche @ 2024-08-09 19:31 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, Bart Van Assche, Damien Le Moal, Mike Christie,
	James E.J. Bottomley

The SCSI core does not retry passthrough commands even if the SCSI device
reports a retryable unit attention condition. Support retrying in this case
by introducing the SCMD_RETRY_PASSTHROUGH flag.

Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_error.c | 5 ++++-
 include/scsi/scsi_cmnd.h  | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 612489afe8d2..9089d39c2170 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -1855,7 +1855,10 @@ bool scsi_noretry_cmd(struct scsi_cmnd *scmd)
 	 * assume caller has checked sense and determined
 	 * the check condition was retryable.
 	 */
-	if (req->cmd_flags & REQ_FAILFAST_DEV || blk_rq_is_passthrough(req))
+	if (req->cmd_flags & REQ_FAILFAST_DEV)
+		return true;
+	if (blk_rq_is_passthrough(req) &&
+	    !(scmd->flags & SCMD_RETRY_PASSTHROUGH))
 		return true;
 
 	return false;
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index 45c40d200154..19c57446ab23 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -58,8 +58,11 @@ struct scsi_pointer {
  */
 #define SCMD_FORCE_EH_SUCCESS	(1 << 3)
 #define SCMD_FAIL_IF_RECOVERING	(1 << 4)
+/* If set, retry a passthrough command in case of a unit attention. */
+#define SCMD_RETRY_PASSTHROUGH	(1 << 5)
 /* flags preserved across unprep / reprep */
-#define SCMD_PRESERVED_FLAGS	(SCMD_INITIALIZED | SCMD_FAIL_IF_RECOVERING)
+#define SCMD_PRESERVED_FLAGS	\
+	(SCMD_INITIALIZED | SCMD_FAIL_IF_RECOVERING | SCMD_RETRY_PASSTHROUGH)
 
 /* for scmd->state */
 #define SCMD_STATE_COMPLETE	0

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/2] sd: Retry START STOP UNIT commands
  2024-08-09 19:31 [PATCH v2 0/2] Fix system resume for SCSI devices Bart Van Assche
  2024-08-09 19:31 ` [PATCH v2 1/2] scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is set Bart Van Assche
@ 2024-08-09 19:31 ` Bart Van Assche
  2024-08-19 23:26   ` Damien Le Moal
  2024-08-09 22:13 ` [PATCH v2 0/2] Fix system resume for SCSI devices Martin K. Petersen
  2 siblings, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2024-08-09 19:31 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, Bart Van Assche, Damien Le Moal, Mike Christie,
	James E.J. Bottomley

During system resume, sd_start_stop_device() submits a START STOP UNIT
command to the SCSI device that is being resumed. That command is not
retried in case of a unit attention and hence may fail. An example:

[16575.983359] sd 0:0:0:3: [sdd] Starting disk
[16575.983693] sd 0:0:0:3: [sdd] Start/Stop Unit failed: Result: hostbyte=0x00 driverbyte=DRIVER_OK
[16575.983712] sd 0:0:0:3: [sdd] Sense Key : 0x6
[16575.983730] sd 0:0:0:3: [sdd] ASC=0x29 ASCQ=0x0
[16575.983738] sd 0:0:0:3: PM: dpm_run_callback(): scsi_bus_resume+0x0/0xa0 returns -5
[16575.983783] sd 0:0:0:3: PM: failed to resume async: error -5

Make the SCSI core retry the START STOP UNIT command if a retryable
error is encountered.

Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/sd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 718eb91ba9a5..4cbf3e5740e1 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -4093,6 +4093,7 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start)
 	const struct scsi_exec_args exec_args = {
 		.sshdr = &sshdr,
 		.req_flags = BLK_MQ_REQ_PM,
+		.scmd_flags = SCMD_RETRY_PASSTHROUGH,
 	};
 	struct scsi_device *sdp = sdkp->device;
 	int res;

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/2] Fix system resume for SCSI devices
  2024-08-09 19:31 [PATCH v2 0/2] Fix system resume for SCSI devices Bart Van Assche
  2024-08-09 19:31 ` [PATCH v2 1/2] scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is set Bart Van Assche
  2024-08-09 19:31 ` [PATCH v2 2/2] sd: Retry START STOP UNIT commands Bart Van Assche
@ 2024-08-09 22:13 ` Martin K. Petersen
  2 siblings, 0 replies; 8+ messages in thread
From: Martin K. Petersen @ 2024-08-09 22:13 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Martin K . Petersen, linux-scsi


Hi Bart!

> This patch series fixes a particular type of system resume failure for
> SCSI devices. Please consider this patch series for the next merge
> window.

I have been working on several tricky regressions where the remedy is
similar in nature to yours. I.e. retrying in some situations. I ended up
using the scsi_failure infrastructure to handle the cases I care about
but now wonder if a more generic approach is warranted. I'll contemplate
a bit and try to reconcile my changes with yours.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is set
  2024-08-09 19:31 ` [PATCH v2 1/2] scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is set Bart Van Assche
@ 2024-08-19 23:25   ` Damien Le Moal
  2024-08-19 23:29     ` Bart Van Assche
  2024-08-29  2:46   ` Martin K. Petersen
  1 sibling, 1 reply; 8+ messages in thread
From: Damien Le Moal @ 2024-08-19 23:25 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: linux-scsi, Mike Christie, James E.J. Bottomley

On 8/10/24 04:31, Bart Van Assche wrote:
> The SCSI core does not retry passthrough commands even if the SCSI device
> reports a retryable unit attention condition. Support retrying in this case
> by introducing the SCMD_RETRY_PASSTHROUGH flag.
> 
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Mike Christie <michael.christie@oracle.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  drivers/scsi/scsi_error.c | 5 ++++-
>  include/scsi/scsi_cmnd.h  | 5 ++++-
>  2 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 612489afe8d2..9089d39c2170 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -1855,7 +1855,10 @@ bool scsi_noretry_cmd(struct scsi_cmnd *scmd)
>  	 * assume caller has checked sense and determined
>  	 * the check condition was retryable.
>  	 */
> -	if (req->cmd_flags & REQ_FAILFAST_DEV || blk_rq_is_passthrough(req))
> +	if (req->cmd_flags & REQ_FAILFAST_DEV)
> +		return true;
> +	if (blk_rq_is_passthrough(req) &&
> +	    !(scmd->flags & SCMD_RETRY_PASSTHROUGH))
>  		return true;
>  
>  	return false;

	return (req->cmd_flags & REQ_FAILFAST_DEV) ||
		(blk_rq_is_passthrough(req) &&
		 !(scmd->flags & SCMD_RETRY_PASSTHROUGH));

maybe ? Not necessarily more readable though.

> diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
> index 45c40d200154..19c57446ab23 100644
> --- a/include/scsi/scsi_cmnd.h
> +++ b/include/scsi/scsi_cmnd.h
> @@ -58,8 +58,11 @@ struct scsi_pointer {
>   */
>  #define SCMD_FORCE_EH_SUCCESS	(1 << 3)
>  #define SCMD_FAIL_IF_RECOVERING	(1 << 4)
> +/* If set, retry a passthrough command in case of a unit attention. */

"...in case of error". There are other sense keys than unit attention and there
is no code checking that this flag applies only to unit attention.

With this comment fixed:

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

> +#define SCMD_RETRY_PASSTHROUGH	(1 << 5)
>  /* flags preserved across unprep / reprep */
> -#define SCMD_PRESERVED_FLAGS	(SCMD_INITIALIZED | SCMD_FAIL_IF_RECOVERING)
> +#define SCMD_PRESERVED_FLAGS	\
> +	(SCMD_INITIALIZED | SCMD_FAIL_IF_RECOVERING | SCMD_RETRY_PASSTHROUGH)
>  
>  /* for scmd->state */
>  #define SCMD_STATE_COMPLETE	0

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 2/2] sd: Retry START STOP UNIT commands
  2024-08-09 19:31 ` [PATCH v2 2/2] sd: Retry START STOP UNIT commands Bart Van Assche
@ 2024-08-19 23:26   ` Damien Le Moal
  0 siblings, 0 replies; 8+ messages in thread
From: Damien Le Moal @ 2024-08-19 23:26 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: linux-scsi, Mike Christie, James E.J. Bottomley

On 8/10/24 04:31, Bart Van Assche wrote:
> During system resume, sd_start_stop_device() submits a START STOP UNIT
> command to the SCSI device that is being resumed. That command is not
> retried in case of a unit attention and hence may fail. An example:
> 
> [16575.983359] sd 0:0:0:3: [sdd] Starting disk
> [16575.983693] sd 0:0:0:3: [sdd] Start/Stop Unit failed: Result: hostbyte=0x00 driverbyte=DRIVER_OK
> [16575.983712] sd 0:0:0:3: [sdd] Sense Key : 0x6
> [16575.983730] sd 0:0:0:3: [sdd] ASC=0x29 ASCQ=0x0
> [16575.983738] sd 0:0:0:3: PM: dpm_run_callback(): scsi_bus_resume+0x0/0xa0 returns -5
> [16575.983783] sd 0:0:0:3: PM: failed to resume async: error -5
> 
> Make the SCSI core retry the START STOP UNIT command if a retryable
> error is encountered.
> 
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Mike Christie <michael.christie@oracle.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Looks OK to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is set
  2024-08-19 23:25   ` Damien Le Moal
@ 2024-08-19 23:29     ` Bart Van Assche
  0 siblings, 0 replies; 8+ messages in thread
From: Bart Van Assche @ 2024-08-19 23:29 UTC (permalink / raw)
  To: Damien Le Moal, Martin K . Petersen
  Cc: linux-scsi, Mike Christie, James E.J. Bottomley

On 8/19/24 4:25 PM, Damien Le Moal wrote:
> On 8/10/24 04:31, Bart Van Assche wrote:
>> +	if (blk_rq_is_passthrough(req) &&
>> +	    !(scmd->flags & SCMD_RETRY_PASSTHROUGH))
>>   		return true;
>>   
>>   	return false;
> 
> 	return (req->cmd_flags & REQ_FAILFAST_DEV) ||
> 		(blk_rq_is_passthrough(req) &&
> 		 !(scmd->flags & SCMD_RETRY_PASSTHROUGH));
> 
> maybe ? Not necessarily more readable though.
> 
>> diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
>> index 45c40d200154..19c57446ab23 100644
>> --- a/include/scsi/scsi_cmnd.h
>> +++ b/include/scsi/scsi_cmnd.h
>> @@ -58,8 +58,11 @@ struct scsi_pointer {
>>    */
>>   #define SCMD_FORCE_EH_SUCCESS	(1 << 3)
>>   #define SCMD_FAIL_IF_RECOVERING	(1 << 4)
>> +/* If set, retry a passthrough command in case of a unit attention. */
> 
> "...in case of error". There are other sense keys than unit attention and there
> is no code checking that this flag applies only to unit attention.
Hi Damien,

Thanks for the detailed feedback. Since Martin replied to this series
that he will reconcile his changes with mine, I plan to wait until
either Martin posts his patches or until Martin asks me to repost my
patches.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/2] scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is set
  2024-08-09 19:31 ` [PATCH v2 1/2] scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is set Bart Van Assche
  2024-08-19 23:25   ` Damien Le Moal
@ 2024-08-29  2:46   ` Martin K. Petersen
  1 sibling, 0 replies; 8+ messages in thread
From: Martin K. Petersen @ 2024-08-29  2:46 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, Damien Le Moal, Mike Christie,
	James E.J. Bottomley


Hi Bart!

Sorry this took so long. I have made several attempts at automatically
retrying commands originating from within SCSI since the lack of retries
is a general problem.

However, in the end I prefer the approach of describing the specific
conditions we want handled using scsi_failure at each call site. Works
well.

> The SCSI core does not retry passthrough commands even if the SCSI
> device reports a retryable unit attention condition. Support retrying
> in this case by introducing the SCMD_RETRY_PASSTHROUGH flag.

FWIW, I find the use of "passthrough" to describe commands issued by the
SCSI layer quite confusing.

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-08-29  2:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-09 19:31 [PATCH v2 0/2] Fix system resume for SCSI devices Bart Van Assche
2024-08-09 19:31 ` [PATCH v2 1/2] scsi: core: Retry passthrough commands if SCMD_RETRY_PASSTHROUGH is set Bart Van Assche
2024-08-19 23:25   ` Damien Le Moal
2024-08-19 23:29     ` Bart Van Assche
2024-08-29  2:46   ` Martin K. Petersen
2024-08-09 19:31 ` [PATCH v2 2/2] sd: Retry START STOP UNIT commands Bart Van Assche
2024-08-19 23:26   ` Damien Le Moal
2024-08-09 22:13 ` [PATCH v2 0/2] Fix system resume for SCSI devices Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox