The Linux Kernel Mailing List
* [RFC PATCH] usb: storage: uas: limit consecutive device resets in error handling
@ 2026-07-01  4:03 Sergey Senozhatsky
  2026-07-01  5:38 ` Greg KH
  2026-07-01  8:28 ` [usb-storage] " Oliver Neukum
  0 siblings, 2 replies; 5+ messages in thread
From: Sergey Senozhatsky @ 2026-07-01  4:03 UTC
  To: Oliver Neukum, Alan Stern
  Cc: linux-usb, linux-scsi, usb-storage, linux-kernel, Tomasz Figa,
	Sergey Senozhatsky

When a UAS storage device experiences persistent wire or hardware IO
failures, commands time out and the SCSI error handler thread invokes
uas_eh_device_reset_handler().  If usb_reset_device() succeeds at the
USB hub level but the underlying drive remains unresponsive, the reset
handler returns SUCCESS.  SCSI EH then requeues the pending commands
with DID_RESET (ACTION_RETRY), so they time out again 30 seconds later.
The cycle repeats forever and stalls the block layer queues
indefinitely:

[..]
 sd 0:0:0:0: [sda] tag#4 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD
 sd 0:0:0:0: [sda] tag#4 CDB: Write(10) 2a 00 00 d3 98 08 00 04 00 00
 sd 0:0:0:0: [sda] tag#0 uas_eh_abort_handler 0 uas-tag 2 inflight: CMD OUT
 sd 0:0:0:0: [sda] tag#0 CDB: Write(10) 2a 00 00 d3 9c 08 00 04 00 00
 scsi host0: uas_eh_device_reset_handler start
 usb 2-1.3: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
 scsi host0: uas_eh_device_reset_handler success
 sd 0:0:0:0: [sda] tag#3 uas_eh_abort_handler 0 uas-tag 3 inflight: CMD IN
 sd 0:0:0:0: [sda] tag#3 CDB: Read(10) 28 00 00 00 00 00 00 00 20 00
 scsi host0: uas_eh_device_reset_handler start
 sd 0:0:0:0: [sda] tag#1 uas_zap_pending 0 uas-tag 1 inflight: CMD
 sd 0:0:0:0: [sda] tag#1 CDB: Write(10) 2a 00 00 d3 98 08 00 04 00 00
 sd 0:0:0:0: [sda] tag#2 uas_zap_pending 0 uas-tag 2 inflight: CMD
 sd 0:0:0:0: [sda] tag#2 CDB: Write(10) 2a 00 00 d3 9c 08 00 04 00 00
 usb 2-1.3: reset SuperSpeed Plus Gen 2x1 USB device number 4 using xhci_hcd
 scsi host0: uas_eh_device_reset_handler success
[..]

Introduce a runtime-configurable module parameter 'reset_limit' (default
3) and track consecutive resets in devinfo->reset_cnt.  When a command
submitted by the block layer (SUBMITTED_BY_BLOCK_LAYER) completes
successfully, reset the counter to zero.  Once reset_cnt reaches
reset_limit, the next reset attempt breaks the loop by completing
pending commands with DID_NO_CONNECT and returning FAILED, which allows
SCSI EH to offline the unresponsive device.  Setting reset_limit to 0
disables the limit.

Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 drivers/usb/storage/uas.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
index 265162981269..a63c66c8bbad 100644
--- a/drivers/usb/storage/uas.c
+++ b/drivers/usb/storage/uas.c
@@ -32,6 +32,10 @@
 
 #define MAX_CMNDS 256
 
+static int uas_reset_limit = 3;
+module_param_named(reset_limit, uas_reset_limit, int, 0644);
+MODULE_PARM_DESC(reset_limit, "Maximum number of consecutive device resets during error handling before failing");
+
 struct uas_dev_info {
 	struct usb_interface *intf;
 	struct usb_device *udev;
@@ -40,6 +44,7 @@ struct uas_dev_info {
 	struct usb_anchor data_urbs;
 	u64 flags;
 	int qdepth, resetting;
+	int reset_cnt;
 	unsigned cmd_pipe, status_pipe, data_in_pipe, data_out_pipe;
 	unsigned use_streams:1;
 	unsigned shutdown:1;
@@ -255,6 +260,8 @@ static int uas_try_complete(struct scsi_cmnd *cmnd, const char *caller)
 		return -EBUSY;
 	devinfo->cmnd[cmdinfo->uas_tag - 1] = NULL;
 	uas_free_unsubmitted_urbs(cmnd);
+	if (cmnd->result == 0 && cmnd->submitter == SUBMITTED_BY_BLOCK_LAYER)
+		devinfo->reset_cnt = 0;
 	scsi_done(cmnd);
 	return 0;
 }
@@ -796,6 +803,21 @@ static int uas_eh_device_reset_handler(struct scsi_cmnd *cmnd)
 	usb_kill_anchored_urbs(&devinfo->cmd_urbs);
 	usb_kill_anchored_urbs(&devinfo->sense_urbs);
 	usb_kill_anchored_urbs(&devinfo->data_urbs);
+
+	spin_lock_irqsave(&devinfo->lock, flags);
+	if (uas_reset_limit > 0 && devinfo->reset_cnt >= uas_reset_limit) {
+		devinfo->resetting = 0;
+		spin_unlock_irqrestore(&devinfo->lock, flags);
+		uas_zap_pending(devinfo, DID_NO_CONNECT);
+		usb_unlock_device(udev);
+		shost_printk(KERN_ERR, sdev->host,
+			     "%s FAILED reset limit %d exceeded\n",
+			     __func__, uas_reset_limit);
+		return FAILED;
+	}
+	devinfo->reset_cnt++;
+	spin_unlock_irqrestore(&devinfo->lock, flags);
+
 	uas_zap_pending(devinfo, DID_RESET);
 
 	err = usb_reset_device(udev);
-- 
2.55.0.795.g602f6c329a-goog
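
For readers who want to reason about the counter outside the driver, the
following is a minimal user-space sketch of the logic described in the
commit message.  It is not kernel code: struct model_dev, the model_*()
helpers and the EH_* values are hypothetical names made up for this
illustration, and the sketch assumes that every reset is followed by
exactly one retried command that either completes or times out again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the SCSI EH SUCCESS/FAILED return codes. */
enum eh_result { EH_SUCCESS, EH_FAILED };

/* Hypothetical model of the per-device state that the patch touches. */
struct model_dev {
	int reset_cnt;   /* consecutive resets since the last good completion */
	bool responsive; /* does the simulated drive answer after a reset? */
};

/* Mirrors the check added to the reset handler: refuse once the limit is hit. */
static enum eh_result model_device_reset(struct model_dev *dev, int reset_limit)
{
	assert(dev != NULL);
	if (reset_limit > 0 && dev->reset_cnt >= reset_limit)
		return EH_FAILED;
	dev->reset_cnt++;
	return EH_SUCCESS;
}

/* Mirrors the uas_try_complete() hunk: a good block layer command clears the counter. */
static void model_complete(struct model_dev *dev, int result, bool from_block_layer)
{
	assert(dev != NULL);
	if (result == 0 && from_block_layer)
		dev->reset_cnt = 0;
}

/* Count the resets performed before the handler gives up, capped at max_rounds. */
static int model_resets_until_failed(struct model_dev *dev, int reset_limit,
				     int max_rounds)
{
	int resets = 0;

	for (int i = 0; i < max_rounds; i++) {
		if (model_device_reset(dev, reset_limit) == EH_FAILED)
			break;
		resets++;
		/* The retried command completes (0) or fails again (non-zero). */
		model_complete(dev, dev->responsive ? 0 : 1, true);
	}
	return resets;
}
```

Under these assumptions, an unresponsive device with reset_limit set to 3
is reset three times and the fourth attempt returns the failure code,
while a device that recovers after each reset never trips the limit
because every successful completion clears the counter.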


