[PATCH] scsi_error: send device reset if asynchronous abort failed

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] scsi_error: send device reset if asynchronous abort failed
@ 2026-06-10 10:51 Hannes Reinecke
  2026-06-10 11:03 ` sashiko-bot
  0 siblings, 1 reply; 2+ messages in thread
From: Hannes Reinecke @ 2026-06-10 10:51 UTC (permalink / raw)
  To: James Bottomley; +Cc: Martin K. Petersen, linux-scsi, Hannes Reinecke

The current SCSI EH implementation requires us to enter the error
handler before a device reset can be send, requiring us to wait for
all outstanding commands on the _host_ before a device reset can be
issued. This is causing a noticeable I/O stall on SAS JBODs if an
individual device fails.

But a device reset has a very well defined scope, so there really should
not be any implications to _other_ devices on the same host when the
device reset is issued.
So it should be possible to issue a device reset directly from
scsi_eh_abort_handler() once a command abort fails without having to
wait for outstanding commands on any other devices on the same host.

Signed-off-by: Hannes Reinecke <hare@kernel.org>
---
 drivers/scsi/scsi_error.c | 17 ++++++++++++++++-
 drivers/scsi/scsi_priv.h  |  1 +
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 147127fb4db9..23dc50332f58 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -58,6 +58,7 @@
 #define HOST_RESET_SETTLE_TIME  (10)
 
 static int scsi_eh_try_stu(struct scsi_cmnd *scmd);
+static enum scsi_disposition scsi_try_bus_device_reset(struct scsi_cmnd *scmd);
 static enum scsi_disposition scsi_try_to_abort_cmd(const struct scsi_host_template *,
 						   struct scsi_cmnd *);
 
@@ -170,7 +171,16 @@ scmd_eh_abort_handler(struct work_struct *work)
 				    "cmd abort %s\n",
 				    (rtn == FAST_IO_FAIL) ?
 				    "not send" : "failed"));
-		goto out;
+		if (scsi_host_eh_past_deadline(shost))
+			goto out;
+		rtn = scsi_try_bus_device_reset(scmd);
+		if (rtn != SUCCESS) {
+			scmd_printk(KERN_INFO, scmd,
+				    "device reset %s\n",
+				    (rtn == FAST_IO_FAIL) ?
+				    "not send" : "failed");
+			goto out;
+		}
 	}
 	set_host_byte(scmd, DID_TIME_OUT);
 	if (scsi_host_eh_past_deadline(shost)) {
@@ -1013,6 +1023,11 @@ static enum scsi_disposition scsi_try_bus_device_reset(struct scsi_cmnd *scmd)
 	if (!hostt->eh_device_reset_handler)
 		return FAILED;
 
+	/* Device reset had already been issued, escalate to next level */
+	if (scmd->eh_eflags & SCSI_EH_RESET_SCHEDULED)
+		return FAILED;
+
+	scmd->eh_eflags |= SCSI_EH_RESET_SCHEDULED;
 	rtn = hostt->eh_device_reset_handler(scmd);
 	if (rtn == SUCCESS)
 		__scsi_report_device_reset(scmd->device, NULL);
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 7a193cc04e5b..be0570cea445 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -39,6 +39,7 @@ static inline u8 scsi_ml_byte(int result)
  * Scsi Error Handler Flags
  */
 #define SCSI_EH_ABORT_SCHEDULED	0x0002	/* Abort has been scheduled */
+#define SCSI_EH_RESET_SCHEDULED	0x0004  /* Reset has been scheduled */
 
 #define SCSI_SENSE_VALID(scmd) \
 	(((scmd)->sense_buffer[0] & 0x70) == 0x70)
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] scsi_error: send device reset if asynchronous abort failed
  2026-06-10 10:51 [PATCH] scsi_error: send device reset if asynchronous abort failed Hannes Reinecke
@ 2026-06-10 11:03 ` sashiko-bot
  0 siblings, 0 replies; 2+ messages in thread
From: sashiko-bot @ 2026-06-10 11:03 UTC (permalink / raw)
  To: Hannes Reinecke; +Cc: linux-scsi

Thank you for your contribution! Sashiko AI review found 6 potential issue(s) to consider:
- [High] Calling `eh_device_reset_handler` asynchronously outside the SCSI EH thread violates midlayer concurrency guarantees and breaks low-level drivers (LLDDs).
- [High] Lockless modification of bitfields in `__scsi_report_device_reset()` causes data races that silently corrupt `struct scsi_device` state.
- [High] Concurrent command timeouts on the same device trigger an unbounded cascade of redundant, sequential Bus Device Resets (BDRs).
- [High] Premature escalation to Target Reset due to improper handling of the `SCSI_EH_RESET_SCHEDULED` flag.
- [Medium] Missing Test Unit Ready (TUR) verification after asynchronous device reset leads to immediate I/O failures.
- [Medium] Asynchronous device reset bypasses post-recovery cleanup, leaving removable media physically unlocked but logically locked.
--

commit ba182d2d6163ba8d4c3c38885f49aacbb5f4c7fe
Author: Hannes Reinecke <hare@kernel.org>
Subject: scsi_error: send device reset if asynchronous abort failed

This commit allows a device reset to be issued directly from
scmd_eh_abort_handler() if a command abort fails. The change aims to prevent
noticeable I/O stalls on other devices sharing the same host by avoiding
the wait for all outstanding commands to finish before entering the error
handler.

> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 147127fb4db9c..23dc50332f582 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c

[ ... ]

> @@ -170,7 +171,16 @@ scmd_eh_abort_handler(struct work_struct *work)
>  				    "cmd abort %s\n",
>  				    (rtn == FAST_IO_FAIL) ?
>  				    "not send" : "failed"));
> -		goto out;
> +		if (scsi_host_eh_past_deadline(shost))
> +			goto out;
> +		rtn = scsi_try_bus_device_reset(scmd);

[Severity: High]
This invokes the device reset asynchronously from the abort workqueue, while
the host is still in the active SHOST_RUNNING state.

The SCSI midlayer documentation explicitly guarantees to low-level drivers
that strategy routines like eh_device_reset_handler() are executed exclusively
in the kernel EH thread. This ensures the host is in SHOST_RECOVERY where no
normal I/O is dispatched and no other commands are queued.

Will concurrent queuecommand() calls execute alongside the reset handler here?

Does this expose drivers to data races and hardware state corruption,
as their reset callbacks assume a quiescent state?

[Severity: Medium]
When a SCSI device is reset, removable media devices typically drop their
medium removal prevention state, physically unlocking the door.

The standard EH thread handles this by calling scsi_restart_operations()
post-recovery, which checks sdev->was_reset and sdev->locked to re-issue
ALLOW_MEDIUM_REMOVAL.

Because this async device reset never enters SHOST_RECOVERY, does it bypass
the scsi_restart_operations() cleanup entirely?

Could this leave the physical device unlocked while the kernel still considers
it locked, allowing users to unexpectedly eject mounted media?

> +		if (rtn != SUCCESS) {
> +			scmd_printk(KERN_INFO, scmd,
> +				    "device reset %s\n",
> +				    (rtn == FAST_IO_FAIL) ?
> +				    "not send" : "failed");
> +			goto out;
> +		}
>  	}
>  	set_host_byte(scmd, DID_TIME_OUT);
>  	if (scsi_host_eh_past_deadline(shost)) {

[Severity: Medium]
In the standard EH thread, a successful device reset is followed by sending
a Test Unit Ready (TUR) command via scsi_eh_tur() to ensure the device has
fully recovered.

This code proceeds to complete or retry the command immediately after a
successful reset without invoking scsi_eh_tur().

Could re-issuing commands or allowing new I/O immediately after a reset
without verifying device readiness lead to unpredictable device behavior or
immediate failures?

[ ... ]

> @@ -1013,6 +1023,11 @@ static enum scsi_disposition scsi_try_bus_device_reset(struct scsi_cmnd *scmd)
>  	if (!hostt->eh_device_reset_handler)
>  		return FAILED;
>  
> +	/* Device reset had already been issued, escalate to next level */
> +	if (scmd->eh_eflags & SCSI_EH_RESET_SCHEDULED)
> +		return FAILED;
> +
> +	scmd->eh_eflags |= SCSI_EH_RESET_SCHEDULED;

[Severity: High]
When multiple commands time out concurrently on the same device, each
schedules its own abort_work. If their aborts fail, they each independently
call scsi_try_bus_device_reset().

The deduplication flag SCSI_EH_RESET_SCHEDULED introduced here is tracked
per-command (scmd->eh_eflags), not per-device.

Does this fail to prevent BDR cascades?

Would every single timed-out command on the device sequentially trigger a full
Bus Device Reset, severely delaying recovery?

[Severity: High]
If the asynchronous scsi_try_bus_device_reset() fails, it marks the command
with SCSI_EH_RESET_SCHEDULED and delegates it to the main EH thread.

When the main EH thread attempts recovery via scsi_eh_bus_device_reset(), it
picks the first command for the device. If it picks this previously poisoned
command, it will hit the check above and immediately return FAILED.

Will this completely bypass the driver reset callback in the fully quiescent
EH thread, depriving the device of a valid reset attempt and prematurely
escalating to a Target Reset?

>  	rtn = hostt->eh_device_reset_handler(scmd);
>  	if (rtn == SUCCESS)
>  		__scsi_report_device_reset(scmd->device, NULL);

[Severity: High]
A successful asynchronous device reset calls __scsi_report_device_reset(),
which modifies bitfields on struct scsi_device:

scsi_error.c:__scsi_report_device_reset() {
	sdev->was_reset = 1;
	sdev->expecting_cc_ua = 1;
}

In C, bitfield assignments are non-atomic Read-Modify-Write (RMW) operations.
Because this executes outside of the serialized SHOST_RECOVERY state, normal
I/O completions can continue to process concurrently.

Can concurrent softirq paths, such as scsi_check_sense() clearing
expecting_cc_ua = 0, modify adjacent bitfields in the same machine word and
result in a data race that silently clobbers critical device flags?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260610105119.30903-1-hare@kernel.org?part=1

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-06-10 11:03 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-10 10:51 [PATCH] scsi_error: send device reset if asynchronous abort failed Hannes Reinecke
2026-06-10 11:03 ` sashiko-bot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.