* [PATCH] fix urb unmapping issue when the uas device is remove during ongoing data transfer @ 2025-09-30 4:53 guhuinan 2025-09-30 13:22 ` Oliver Neukum 2025-10-07 20:57 ` Michal Pecio 0 siblings, 2 replies; 4+ messages in thread From: guhuinan @ 2025-09-30 4:53 UTC (permalink / raw) To: Oliver Neukum, Alan Stern, Greg Kroah-Hartman Cc: linux-usb, linux-scsi, usb-storage, linux-kernel, Yu Chen, Owen Gu From: Owen Gu <guhuinan@xiaomi.com> When a UAS device is unplugged during data transfer, there is a probability of a system panic occurring. The root cause is an access to an invalid memory address during URB callback handling. Specifically, this happens when the dma_direct_unmap_sg() function is called within the usb_hcd_unmap_urb_for_dma() interface, but the sg->dma_address field is 0 and the sg data structure has already been freed. The SCSI driver sends transfer commands by invoking uas_queuecommand_lck() in uas.c, using the uas_submit_urbs() function to submit requests to USB. Within the uas_submit_urbs() implementation, three URBs (sense_urb, data_urb, and cmd_urb) are sequentially submitted. Device removal may occur at any point during uas_submit_urbs execution, which may result in URB submission failure. However, some URBs might have been successfully submitted before the failure, and uas_submit_urbs will return the -ENODEV error code in this case. The current error handling directly calls scsi_done(). In the SCSI driver, this eventually triggers scsi_complete() to invoke scsi_end_request() for releasing the sgtable. The successfully submitted URBs, when being completed (giveback), call usb_hcd_unmap_urb_for_dma() in hcd.c, leading to exceptions during sg unmapping operations since the sg data structure has already been freed. This patch modifies the error condition check in the uas_submit_urbs() function. When a UAS device is removed but one or more URBs have already been successfully submitted to USB, it avoids immediately invoking scsi_done(). Instead, it waits for the successfully submitted URBs to complete , and then triggers the scsi_done() function call within uas_try_complete() after all pending URB operations are finalized. Signed-off-by: Yu Chen <chenyu45@xiaomi.com> Signed-off-by: Owen Gu <guhuinan@xiaomi.com> --- drivers/usb/storage/uas.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c index 4ed0dc19afe0..6bfc7281f7ad 100644 --- a/drivers/usb/storage/uas.c +++ b/drivers/usb/storage/uas.c @@ -699,7 +699,9 @@ static int uas_queuecommand_lck(struct scsi_cmnd *cmnd) */ if (err == -ENODEV) { set_host_byte(cmnd, DID_NO_CONNECT); - scsi_done(cmnd); + if (!(cmdinfo->state & (COMMAND_INFLIGHT | DATA_IN_URB_INFLIGHT | + DATA_OUT_URB_INFLIGHT))) + scsi_done(cmnd); goto zombie; } if (err) { -- 2.43.0 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] fix urb unmapping issue when the uas device is remove during ongoing data transfer 2025-09-30 4:53 [PATCH] fix urb unmapping issue when the uas device is remove during ongoing data transfer guhuinan @ 2025-09-30 13:22 ` Oliver Neukum 2025-10-07 20:57 ` Michal Pecio 1 sibling, 0 replies; 4+ messages in thread From: Oliver Neukum @ 2025-09-30 13:22 UTC (permalink / raw) To: guhuinan, Alan Stern, Greg Kroah-Hartman Cc: linux-usb, linux-scsi, usb-storage, linux-kernel, Yu Chen Hi, On 30.09.25 06:53, guhuinan wrote: > From: Owen Gu <guhuinan@xiaomi.com> > > When a UAS device is unplugged during data transfer, there is > a probability of a system panic occurring. The root cause is > an access to an invalid memory address during URB callback handling. > Specifically, this happens when the dma_direct_unmap_sg() function > is called within the usb_hcd_unmap_urb_for_dma() interface, but the > sg->dma_address field is 0 and the sg data structure has already been > freed. > > The SCSI driver sends transfer commands by invoking uas_queuecommand_lck() > in uas.c, using the uas_submit_urbs() function to submit requests to USB. > Within the uas_submit_urbs() implementation, three URBs (sense_urb, > data_urb, and cmd_urb) are sequentially submitted. Device removal may > occur at any point during uas_submit_urbs execution, which may result > in URB submission failure. However, some URBs might have been successfully > submitted before the failure, and uas_submit_urbs will return the -ENODEV > error code in this case. The current error handling directly calls > scsi_done(). In the SCSI driver, this eventually triggers scsi_complete() > to invoke scsi_end_request() for releasing the sgtable. The successfully > submitted URBs, when being completed (giveback), call > usb_hcd_unmap_urb_for_dma() in hcd.c, leading to exceptions during sg > unmapping operations since the sg data structure has already been freed. > > This patch modifies the error condition check in the uas_submit_urbs() > function. When a UAS device is removed but one or more URBs have already > been successfully submitted to USB, it avoids immediately invoking > scsi_done(). Instead, it waits for the successfully submitted URBs to > complete , and then triggers the scsi_done() function call within > uas_try_complete() after all pending URB operations are finalized. > > Signed-off-by: Yu Chen <chenyu45@xiaomi.com> > Signed-off-by: Owen Gu <guhuinan@xiaomi.com> > --- > drivers/usb/storage/uas.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c > index 4ed0dc19afe0..6bfc7281f7ad 100644 > --- a/drivers/usb/storage/uas.c > +++ b/drivers/usb/storage/uas.c > @@ -699,7 +699,9 @@ static int uas_queuecommand_lck(struct scsi_cmnd *cmnd) > */ > if (err == -ENODEV) { > set_host_byte(cmnd, DID_NO_CONNECT); Why then set the host byte unconditionally? > - scsi_done(cmnd); > + if (!(cmdinfo->state & (COMMAND_INFLIGHT | DATA_IN_URB_INFLIGHT | > + DATA_OUT_URB_INFLIGHT))) > + scsi_done(cmnd); It would seem to me that in this case you need to make sure in the completion handlers that scsi_done() is always called, even if the resetting flag is set. Regards Oliver ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] fix urb unmapping issue when the uas device is remove during ongoing data transfer 2025-09-30 4:53 [PATCH] fix urb unmapping issue when the uas device is remove during ongoing data transfer guhuinan 2025-09-30 13:22 ` Oliver Neukum @ 2025-10-07 20:57 ` Michal Pecio 2025-10-11 13:36 ` Owen Gu 1 sibling, 1 reply; 4+ messages in thread From: Michal Pecio @ 2025-10-07 20:57 UTC (permalink / raw) To: guhuinan Cc: Oliver Neukum, Alan Stern, Greg Kroah-Hartman, linux-usb, linux-scsi, usb-storage, linux-kernel, Yu Chen On Tue, 30 Sep 2025 12:53:08 +0800, guhuinan wrote: > From: Owen Gu <guhuinan@xiaomi.com> > > When a UAS device is unplugged during data transfer, there is > a probability of a system panic occurring. The root cause is > an access to an invalid memory address during URB callback handling. > Specifically, this happens when the dma_direct_unmap_sg() function > is called within the usb_hcd_unmap_urb_for_dma() interface, but the > sg->dma_address field is 0 and the sg data structure has already been > freed. > > The SCSI driver sends transfer commands by invoking uas_queuecommand_lck() > in uas.c, using the uas_submit_urbs() function to submit requests to USB. > Within the uas_submit_urbs() implementation, three URBs (sense_urb, > data_urb, and cmd_urb) are sequentially submitted. Device removal may > occur at any point during uas_submit_urbs execution, which may result > in URB submission failure. However, some URBs might have been successfully > submitted before the failure, and uas_submit_urbs will return the -ENODEV > error code in this case. The current error handling directly calls > scsi_done(). In the SCSI driver, this eventually triggers scsi_complete() > to invoke scsi_end_request() for releasing the sgtable. The successfully > submitted URBs, when being completed (giveback), call > usb_hcd_unmap_urb_for_dma() in hcd.c, leading to exceptions during sg > unmapping operations since the sg data structure has already been freed. > > This patch modifies the error condition check in the uas_submit_urbs() > function. When a UAS device is removed but one or more URBs have already > been successfully submitted to USB, it avoids immediately invoking > scsi_done(). Instead, it waits for the successfully submitted URBs to > complete , and then triggers the scsi_done() function call within > uas_try_complete() after all pending URB operations are finalized. > > Signed-off-by: Yu Chen <chenyu45@xiaomi.com> > Signed-off-by: Owen Gu <guhuinan@xiaomi.com> Hi, Was this situation seen in the wild and/or reproduced, or only predicted theoretically? Was the patch tested? I wonder what happens to the submitted URBs when scsi_done() is not called. Since the command URB was not submitted (or else we wouldn't be here I guess?) the device shouldn't have selected this stream before disconnection and it seems that the xHC won't try to move data on those URBs, so they won't complete with -EPROTO. Will they sit there stuck until SCSI core times out and aborts the command? That's poor UX, speaking from experience here. Maybe it would make sense to unlink them? Unlinking Streams URBs is a sensitive topic because it's forbidden if they can become the Current Stream, but in this case it looks like they can't. Or am I missing something? Regards, Michal ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] fix urb unmapping issue when the uas device is remove during ongoing data transfer 2025-10-07 20:57 ` Michal Pecio @ 2025-10-11 13:36 ` Owen Gu 0 siblings, 0 replies; 4+ messages in thread From: Owen Gu @ 2025-10-11 13:36 UTC (permalink / raw) To: Michal Pecio Cc: Oliver Neukum, Alan Stern, Greg Kroah-Hartman, linux-usb, linux-scsi, usb-storage, linux-kernel, Yu Chen On Tue, Oct 07, 2025 at 10:57:18PM +0200, Michal Pecio wrote: > On Tue, 30 Sep 2025 12:53:08 +0800, guhuinan wrote: > > From: Owen Gu <guhuinan@xiaomi.com> > > > > When a UAS device is unplugged during data transfer, there is > > a probability of a system panic occurring. The root cause is > > an access to an invalid memory address during URB callback handling. > > Specifically, this happens when the dma_direct_unmap_sg() function > > is called within the usb_hcd_unmap_urb_for_dma() interface, but the > > sg->dma_address field is 0 and the sg data structure has already been > > freed. > > > > The SCSI driver sends transfer commands by invoking uas_queuecommand_lck() > > in uas.c, using the uas_submit_urbs() function to submit requests to USB. > > Within the uas_submit_urbs() implementation, three URBs (sense_urb, > > data_urb, and cmd_urb) are sequentially submitted. Device removal may > > occur at any point during uas_submit_urbs execution, which may result > > in URB submission failure. However, some URBs might have been successfully > > submitted before the failure, and uas_submit_urbs will return the -ENODEV > > error code in this case. The current error handling directly calls > > scsi_done(). In the SCSI driver, this eventually triggers scsi_complete() > > to invoke scsi_end_request() for releasing the sgtable. The successfully > > submitted URBs, when being completed (giveback), call > > usb_hcd_unmap_urb_for_dma() in hcd.c, leading to exceptions during sg > > unmapping operations since the sg data structure has already been freed. > > > > This patch modifies the error condition check in the uas_submit_urbs() > > function. When a UAS device is removed but one or more URBs have already > > been successfully submitted to USB, it avoids immediately invoking > > scsi_done(). Instead, it waits for the successfully submitted URBs to > > complete , and then triggers the scsi_done() function call within > > uas_try_complete() after all pending URB operations are finalized. > > > > Signed-off-by: Yu Chen <chenyu45@xiaomi.com> > > Signed-off-by: Owen Gu <guhuinan@xiaomi.com> > > Hi, > > Was this situation seen in the wild and/or reproduced, or only > predicted theoretically? Was the patch tested? > This problem occurs in real testing environments with low probability when the UAS device is unplugged during data transmission. Current patches have been tested and can resolve this issue. However, as Oliver mentioned in his response, if the resetting flag is set to 1 at this time, it might prevent scsi_done() from being called, which was not considered in Patch V1. > I wonder what happens to the submitted URBs when scsi_done() is > not called. Since the command URB was not submitted (or else we > wouldn't be here I guess?) the device shouldn't have selected this > stream before disconnection and it seems that the xHC won't try > to move data on those URBs, so they won't complete with -EPROTO. > Submitted URBs will be processed through usb_disable_endpoint() during device removal, eventually triggering usb_kill_urb() in the usb_disconnect() workflow. This will invoke the completion interface through usb_hcd_giveback_urb(), returning -ESHUTDOWN to the UAS driver. > Will they sit there stuck until SCSI core times out and aborts > the command? That's poor UX, speaking from experience here. > After applying the current patch, testing has shown that failed commands in the UAS driver for which scsi_done() was not called after submission will be completed 30 seconds after a SCSI command timeout occurs. We will optimize this issue in patch V2, which is currently under testing. > Maybe it would make sense to unlink them? Unlinking Streams URBs > is a sensitive topic because it's forbidden if they can become > the Current Stream, but in this case it looks like they can't. > > Or am I missing something? > > Regards, > Michal Dear, Sorry for the delayed response to this issue.We anticipate uploading an updated patch V2 next week. Regards ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2025-10-11 13:37 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-09-30 4:53 [PATCH] fix urb unmapping issue when the uas device is remove during ongoing data transfer guhuinan 2025-09-30 13:22 ` Oliver Neukum 2025-10-07 20:57 ` Michal Pecio 2025-10-11 13:36 ` Owen Gu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox