* USB disk disconnect problems @ 2022-08-21 11:17 James Dutton 2022-08-21 14:47 ` Alan Stern 2022-10-03 18:04 ` James Dutton 0 siblings, 2 replies; 15+ messages in thread From: James Dutton @ 2022-08-21 11:17 UTC (permalink / raw) To: linux-usb@vger.kernel.org Hi, Say I have mounted a usb disk. I then disconnect the usb device Linux complains about failed writes etc. I then plug the usb device back in Linux still complains about failed writes, and does not recover. How do I get Linux to recognise the reinserted usb disk and carry on as normal? I know my suggested behaviour might be detrimental for some users, in case one modifies the usb disk in another computer and then comes back, but I would like an option that assumes it has not been plugged into anything else. The reason being, I have a system that boots from a USB disk. Due to interference, the USB device disconnects for a second or two and then comes back, but Linux does not see it and I have to reboot Linux to recover. So, in this situation I wish Linux to be able to recover immediately, without needing a reboot. The physical USB device removal then reinserting reproduces the problem I am seeing, so I thought it would be a good example to get working, if we could. Can anyone give me any pointers as to where to start with fixing this? Kind Regards James ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: USB disk disconnect problems 2022-08-21 11:17 USB disk disconnect problems James Dutton @ 2022-08-21 14:47 ` Alan Stern 2022-08-21 16:36 ` James Dutton ` (2 more replies) 2022-10-03 18:04 ` James Dutton 1 sibling, 3 replies; 15+ messages in thread From: Alan Stern @ 2022-08-21 14:47 UTC (permalink / raw) To: James Dutton; +Cc: linux-usb@vger.kernel.org On Sun, Aug 21, 2022 at 12:17:30PM +0100, James Dutton wrote: > Hi, > > Say I have mounted a usb disk. > I then disconnect the usb device > Linux complains about failed writes etc. > I then plug the usb device back in > Linux still complains about failed writes, and does not recover. > > How do I get Linux to recognise the reinserted usb disk and carry on as normal? As far as I know, there's only one way to do it: Go into system suspend before disconnecting the USB drive, and plug the drive back in before waking the system up. > I know my suggested behaviour might be detrimental for some users, in > case one modifies the usb disk in another computer and then comes > back, but I would like an option that assumes it has not been plugged > into anything else. The resume procedure makes this assumption, if it finds that something has been disconnected and reconnected. > The reason being, I have a system that boots from a USB disk. > Due to interference, the USB device disconnects for a second or two > and then comes back, but Linux does not see it and I have to reboot > Linux to recover. So, in this situation I wish Linux to be able to > recover immediately, without needing a reboot. There is no way to do this. For example, consider all those failed writes that you get error messages about. Once they have failed, the system does not try to remember them in case there's a possibility of trying them again later. They're just lost. Similarly with failed reads. 
When a program tries to read something from a disk and the read fails, the program generally does not wait for a while and then retry the read, to see if the disk will magically start working again. > The physical USB device removal then reinserting reproduces the > problem I am seeing, so I thought it would be a good example to get > working, if we could. > > Can anyone give me any pointers as to where to start with fixing this? Sorry I can't be of any more help. Alan Stern ^ permalink raw reply [flat|nested] 15+ messages in thread
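For readers who want to check the suspend/resume behaviour described above: the kernel exposes a per-device `power/persist` attribute in sysfs that controls whether a device that was disconnected and reconnected during system suspend is treated as the same device on resume. The following is a minimal sketch, not something posted in the thread. The helper name `read_usb_persist` is made up for illustration, and the attribute only affects system suspend, not a disconnect while the system is running.

```python
from pathlib import Path


def read_usb_persist(device_dir):
    """Return True/False for a device's power/persist flag, or None if absent.

    device_dir is a sysfs device directory such as
    /sys/bus/usb/devices/4-2 (the name "4-2" is only an example).
    """
    attr = Path(device_dir) / "power" / "persist"
    try:
        text = attr.read_text().strip()
    except OSError:
        # Hubs and interface nodes do not carry this attribute.
        return None
    return text == "1"


if __name__ == "__main__":
    # List every USB device node that has a persist flag.
    for dev in sorted(Path("/sys/bus/usb/devices").glob("*")):
        flag = read_usb_persist(dev)
        if flag is not None:
            print(dev.name, "persist enabled" if flag else "persist disabled")
```

A value of 1 means the resume path will assume the reconnected device is unchanged, which is the assumption discussed in the message above.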
* Re: USB disk disconnect problems 2022-08-21 14:47 ` Alan Stern @ 2022-08-21 16:36 ` James Dutton 2022-08-21 16:40 ` James Dutton [not found] ` <CAA6KcBC2wEc78fgrMLBfbyEinR3rVUY6z8HeUbE=wtv0c4BP2Q@mail.gmail.com> 2022-08-21 20:03 ` Matthew Dharm 2 siblings, 1 reply; 15+ messages in thread From: James Dutton @ 2022-08-21 16:36 UTC (permalink / raw) To: Alan Stern; +Cc: linux-usb@vger.kernel.org On Sun, 21 Aug 2022 at 15:47, Alan Stern <stern@rowland.harvard.edu> wrote: > > > The reason being, I have a system that boots from a USB disk. > > Due to interference, the USB device disconnects for a second or two > > and then comes back, but Linux does not see it and I have to reboot > > Linux to recover. So, in this situation I wish Linux to be able to > > recover immediately, without needing a reboot. > > There is no way to do this. For example, consider all those failed > writes that you get error messages about. Once they have failed, the > system does not try to remember them in case there's a possibility of > trying them again later. They're just lost. I guess the solution would have to include a "retry in 1 second's time" type failure mode, instead of just lost. I.e. differentiate between the disk responding that the media failed, and the link being down to the disk so the write message could not be sent. For example, NFS waits around for the network to return, maybe we could add that functionality between a filesystem and usb storage. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: USB disk disconnect problems 2022-08-21 16:36 ` James Dutton @ 2022-08-21 16:40 ` James Dutton 2022-08-21 18:11 ` Alan Stern 0 siblings, 1 reply; 15+ messages in thread From: James Dutton @ 2022-08-21 16:40 UTC (permalink / raw) To: Alan Stern; +Cc: linux-usb@vger.kernel.org On Sun, 21 Aug 2022 at 17:36, James Dutton <james.dutton@gmail.com> wrote: > > On Sun, 21 Aug 2022 at 15:47, Alan Stern <stern@rowland.harvard.edu> wrote: > > > > > The reason being, I have a system that boots from a USB disk. > > > Due to interference, the USB device disconnects for a second or two > > > and then comes back, but Linux does not see it and I have to reboot > > > Linux to recover. So, in this situation I wish Linux to be able to > > > recover immediately, without needing a reboot. > > > > There is no way to do this. For example, consider all those failed > > writes that you get error messages about. Once they have failed, the > > system does not try to remember them in case there's a possibility of > > trying them again later. They're just lost. > I guess the solution would have to include a "retry in 1 second's > time" type failure mode, instead of just lost. > I.e. differentiate between the disk responding that the media failed, > and the link being down to the disk so the write message could not be > sent. > For example, NFS waits around for the network to return, maybe we > could add that functionality between a filesystem and usb storage. As a side note, I have seen USB links failing. Normally just to something like a keyboard or mouse, so it just comes back without the user knowing anything was wrong. The problem is USB links to disks don't recover currently. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: USB disk disconnect problems 2022-08-21 16:40 ` James Dutton @ 2022-08-21 18:11 ` Alan Stern 0 siblings, 0 replies; 15+ messages in thread From: Alan Stern @ 2022-08-21 18:11 UTC (permalink / raw) To: James Dutton; +Cc: linux-usb@vger.kernel.org On Sun, Aug 21, 2022 at 05:40:23PM +0100, James Dutton wrote: > On Sun, 21 Aug 2022 at 17:36, James Dutton <james.dutton@gmail.com> wrote: > > > > On Sun, 21 Aug 2022 at 15:47, Alan Stern <stern@rowland.harvard.edu> wrote: > > > > > > > The reason being, I have a system that boots from a USB disk. > > > > Due to interference, the USB device disconnects for a second or two > > > > and then comes back, but Linux does not see it and I have to reboot > > > > Linux to recover. So, in this situation I wish Linux to be able to > > > > recover immediately, without needing a reboot. > > > > > > There is no way to do this. For example, consider all those failed > > > writes that you get error messages about. Once they have failed, the > > > system does not try to remember them in case there's a possibility of > > > trying them again later. They're just lost. > > I guess the solution would have to include a "retry in 1 second's > > time" type failure mode, instead of just lost. Maybe, in theory. In your case, I think a better solution would be to eliminate the interference that causes the transient disconnects to occur in the first place. USB isn't designed to operate reliably in an environment filled with that much noise. > > I.e. differentiate between the disk responding that the media failed, > > and the link being down to the disk so the write message could not be > > sent. > > For example, NFS waits around for the network to return, maybe we > > could add that functionality between a filesystem and usb storage. In theory it could be done. I suspect the overall benefit would not be very large; I have not heard lots of reports from other people facing the problem you have. Consider that neither Windows nor Mac OS-X does this. 
Also, doing this would lead to other problems. For instance, I'm sure some people want to know that a device has stopped working as soon as the problem begins; they would get upset if the system kept trying to reconnect for tens of seconds before finally deciding the device was gone for good. (Consider the way people have complained a lot over the years about NFS and its extremely long uninterruptible waits.) > As a side note, I have seen USB links failing. Normally just to > something like a keyboard or mouse, so it just comes back without the > user knowing anything was wrong. That's different. When the link to a USB mouse fails and then starts working again, the system doesn't think the mouse has recovered; it regards what happened as a new mouse being plugged in. (Same with keyboards.) The user doesn't notice anything because the system treats all mice the same. In fact, you can even plug in two mice at the same time (that is, without bothering to wait for the first one to fail) and the system will accept input from both of them interchangeably. > The problem is USB links to disks don't recover currently. Well, you have to admit that treating disks like mice -- considering all of them to be the same -- would not be a good strategy. :-) (On the other hand, sometimes two disks really do get treated as though they are the same. That's what happens in a RAID-1 (mirroring) setup. If you have mirrored USB disks, you can unplug one of them and the system will continue working. And when you plug it back in later, the system will repair it as necessary and then go on using it normally without your noticing. But obviously this isn't what you have in mind.) Alan Stern ^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <CAA6KcBC2wEc78fgrMLBfbyEinR3rVUY6z8HeUbE=wtv0c4BP2Q@mail.gmail.com>]
* Re: USB disk disconnect problems [not found] ` <CAA6KcBC2wEc78fgrMLBfbyEinR3rVUY6z8HeUbE=wtv0c4BP2Q@mail.gmail.com> @ 2022-08-21 19:03 ` Alan Stern 0 siblings, 0 replies; 15+ messages in thread From: Alan Stern @ 2022-08-21 19:03 UTC (permalink / raw) To: Matthew Dharm; +Cc: James Dutton, linux-usb@vger.kernel.org On Sun, Aug 21, 2022 at 11:42:00AM -0700, Matthew Dharm wrote: > On Sun, Aug 21, 2022 at 7:47 AM Alan Stern <stern@rowland.harvard.edu> > wrote: > > > On Sun, Aug 21, 2022 at 12:17:30PM +0100, James Dutton wrote: > > > I know my suggested behaviour might be detrimental for some users, in > > > case one modifies the usb disk in another computer and then comes > > > back, but I would like an option that assumes it has not been plugged > > > into anything else. > > > In the “old days” (that is, my original design for use-storage) it used to > do exactly what you are looking for - based on VID, DID, and SerialNumber > it would “remember” devices. The SCSI host would never be destroyed, and > when a device re-appeared it would be re-connected to the existing host. Ah yes... I do remember those days, but not very often. :-) > That caused all sorts of problems. The SCSI and block layers just couldn’t > handle it well. A clean umount / mount cycle worked fine, but if you > unexpectedly disconnected the device all hell broke loose and there was no > way to recover. > > I did it this way because, way back when, there were issues dynamically > destroying SCSI hosts. The people who worked on those other layers found it > much, much easier to fix that problem than try to make it possible to > recover from an unexpected disconnect. > > Honestly, I’m not even sure where you would need to begin to make this > work. 
It would require pretty radical changes is the block I/O layers to > differentiate different failure modes, keep a lot more data around after > certain types of failures, allow for specifying which devices this new > policy (which is assuming reconnected devices really haven’t been altered) > applies to, etc — it’s a big lift. Provided you don't mind giving up after 30 seconds (the default SCSI timeout), you wouldn't need to change the block or other layers. All you would have to do is avoid reporting a command failure if the reason for the failure is disconnection, wait for the device to reappear, and then retry the command. (Yes, there would be a few extra complications but that's the basic idea.) As far as the SCSI or block layers are concerned, it would look like the I/O succeeded but took an unusually long time to complete. Alan Stern ^ permalink raw reply [flat|nested] 15+ messages in thread
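The idea in the reply above (do not report the failure, wait for the device to reappear, retry, and give up at the command timeout) can be illustrated with a small user-space simulation. This is only a sketch of the control flow, not kernel code and not an existing usb-storage feature. `LinkDown`, `submit_with_reconnect`, `send` and `device_present` are hypothetical names invented for the example.

```python
import time


class LinkDown(Exception):
    """Hypothetical error: the transport failed, not the media."""


def submit_with_reconnect(send, device_present, timeout=30.0, poll=0.1,
                          clock=time.monotonic, sleep=time.sleep):
    """Retry send() across a transient disconnect, up to `timeout` seconds.

    send() performs one command and raises LinkDown if the link dropped.
    device_present() reports whether the device has reappeared.
    To the caller, a successful retry looks like one slow command.
    """
    deadline = clock() + timeout
    while True:
        try:
            return send()
        except LinkDown:
            pass  # swallow the failure instead of reporting it upward
        # Wait for the device to come back, but never past the deadline.
        while not device_present():
            if clock() >= deadline:
                raise TimeoutError("device did not return before the timeout")
            sleep(poll)
        if clock() >= deadline:
            raise TimeoutError("command still failing at the timeout")
```

The 30 second default mirrors the default SCSI timeout mentioned in the message. The "few extra complications" it refers to, such as confirming that the device that came back is the same one, are deliberately left out here.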
* Re: USB disk disconnect problems 2022-08-21 14:47 ` Alan Stern 2022-08-21 16:36 ` James Dutton [not found] ` <CAA6KcBC2wEc78fgrMLBfbyEinR3rVUY6z8HeUbE=wtv0c4BP2Q@mail.gmail.com> @ 2022-08-21 20:03 ` Matthew Dharm 2022-08-21 20:59 ` James Dutton 2 siblings, 1 reply; 15+ messages in thread From: Matthew Dharm @ 2022-08-21 20:03 UTC (permalink / raw) To: Alan Stern; +Cc: James Dutton, linux-usb@vger.kernel.org (Re-sending, as the first one got blocked by the list for having an HTML part). On Sun, Aug 21, 2022 at 7:47 AM Alan Stern <stern@rowland.harvard.edu> wrote: > > On Sun, Aug 21, 2022 at 12:17:30PM +0100, James Dutton wrote: > > I know my suggested behaviour might be detrimental for some users, in > > case one modifies the usb disk in another computer and then comes > > back, but I would like an option that assumes it has not been plugged > > into anything else. In the “old days” (that is, my original design for use-storage) it used to do exactly what you are looking for - based on VID, DID, and SerialNumber it would “remember” devices. The SCSI host would never be destroyed, and when a device re-appeared it would be re-connected to the existing host. That caused all sorts of problems. The SCSI and block layers just couldn’t handle it well. A clean umount / mount cycle worked fine, but if you unexpectedly disconnected the device all hell broke loose and there was no way to recover. I did it this way because, way back when, there were issues dynamically destroying SCSI hosts. The people who worked on those other layers found it much, much easier to fix that problem than try to make it possible to recover from an unexpected disconnect. Honestly, I’m not even sure where you would need to begin to make this work. 
It would require pretty radical changes in the block I/O layers to differentiate different failure modes, keep a lot more data around after certain types of failures, allow for specifying which devices this new policy (which is assuming reconnected devices really haven’t been altered) applies to, etc. It’s a big lift. Matt aka “the guy who originally designed how this works” -- Matthew Dharm Former Maintainer, USB Mass Storage driver for Linux ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: USB disk disconnect problems 2022-08-21 20:03 ` Matthew Dharm @ 2022-08-21 20:59 ` James Dutton 2022-08-21 21:26 ` Matthew Dharm 2022-08-22 10:18 ` Oliver Neukum 0 siblings, 2 replies; 15+ messages in thread From: James Dutton @ 2022-08-21 20:59 UTC (permalink / raw) To: Matthew Dharm; +Cc: Alan Stern, linux-usb@vger.kernel.org On Sun, 21 Aug 2022 at 21:03, Matthew Dharm <mdharm-usb@one-eyed-alien.net> wrote: > > (Re-sending, as the first one got blocked by the list for having an HTML part). > > On Sun, Aug 21, 2022 at 7:47 AM Alan Stern <stern@rowland.harvard.edu> wrote: > > > > On Sun, Aug 21, 2022 at 12:17:30PM +0100, James Dutton wrote: > > > I know my suggested behaviour might be detrimental for some users, in > > > case one modifies the usb disk in another computer and then comes > > > back, but I would like an option that assumes it has not been plugged > > > into anything else. > > In the “old days” (that is, my original design for use-storage) it > used to do exactly what you are looking for - based on VID, DID, and > SerialNumber it would “remember” devices. The SCSI host would never be > destroyed, and when a device re-appeared it would be re-connected to > the existing host. > > That caused all sorts of problems. The SCSI and block layers just > couldn’t handle it well. A clean umount / mount cycle worked fine, but > if you unexpectedly disconnected the device all hell broke loose and > there was no way to recover. > > I did it this way because, way back when, there were issues > dynamically destroying SCSI hosts. The people who worked on those > other layers found it much, much easier to fix that problem than try > to make it possible to recover from an unexpected disconnect. > > Honestly, I’m not even sure where you would need to begin to make this > work. 
It would require pretty radical changes is the block I/O layers > to differentiate different failure modes, keep a lot more data around > after certain types of failures, allow for specifying which devices > this new policy (which is assuming reconnected devices really haven’t > been altered) applies to, etc — it’s a big lift. > Are there any situations where we should actually try to recover? What about: The OS has not needed to read/write to the disk in a while. The USB disk idles out and goes into a power save mode by itself. The OS then wishes to write something, but would need to go through some sort of wake up procedure first. I don't know if that is a state that is available for USB devices, but if it was, would it be fair to try and recover? ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: USB disk disconnect problems 2022-08-21 20:59 ` James Dutton @ 2022-08-21 21:26 ` Matthew Dharm 2022-08-21 22:56 ` James Dutton 2022-08-22 10:18 ` Oliver Neukum 1 sibling, 1 reply; 15+ messages in thread From: Matthew Dharm @ 2022-08-21 21:26 UTC (permalink / raw) To: James Dutton; +Cc: Alan Stern, linux-usb@vger.kernel.org On Sun, Aug 21, 2022 at 2:00 PM James Dutton <james.dutton@gmail.com> wrote: > > On Sun, 21 Aug 2022 at 21:03, Matthew Dharm > <mdharm-usb@one-eyed-alien.net> wrote: > > > > In the “old days” (that is, my original design for use-storage) it > > used to do exactly what you are looking for - based on VID, DID, and > > SerialNumber it would “remember” devices. The SCSI host would never be > > destroyed, and when a device re-appeared it would be re-connected to > > the existing host. > > > > That caused all sorts of problems. The SCSI and block layers just > > couldn’t handle it well. A clean umount / mount cycle worked fine, but > > if you unexpectedly disconnected the device all hell broke loose and > > there was no way to recover. > > Are there any situations where we should actually try to recover? > What about: > The OS has not needed to read/write to the disk in a while. The USB > disk idles out and goes into a power save mode by itself. > The OS then wishes to write something, but would need to go through > some sort of wake up procedure first. > > I don't know if that is a state that is available for USB devices, but > if it was, would it be fair to try and recover? That scenario already happens all the time; rotating disks often spin-down after an idle period and then automatically spin-up at the next media-access command. So long as they spin-up within the command timeout (typically 30 seconds), there is no issue. BUT, this is very different from what you originally asked about -- in a low-power spin-down state, the USB interface is still connected; only the rotating has stopped. 
From the computer's perspective, the device has always remained attached; the only anomaly is that a command takes longer-than-usual to complete. The next level of deeper power savings would be a system-wide suspend / resume, which we've already discussed and is a path which is already handled (and also different from the original scenario you described). Matt -- Matthew Dharm Former Maintainer, USB Mass Storage driver for Linux ^ permalink raw reply [flat|nested] 15+ messages in thread
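The command timeout mentioned above is visible per disk in sysfs, in the `device/timeout` attribute under `/sys/block/<disk>/`. A minimal sketch for reading it follows. The function name `scsi_command_timeout` and the `sysfs_root` parameter are illustrative choices (the parameter exists only so the helper can be pointed at a test directory), and `sda` is just an example disk name.

```python
from pathlib import Path


def scsi_command_timeout(disk, sysfs_root="/sys"):
    """Return the SCSI command timeout in seconds for `disk`, or None.

    Reads <sysfs_root>/block/<disk>/device/timeout, e.g.
    /sys/block/sda/device/timeout.
    """
    attr = Path(sysfs_root) / "block" / disk / "device" / "timeout"
    try:
        return int(attr.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("sda timeout:", scsi_command_timeout("sda"))
```

A disk that needs longer than this to spin up would hit the error-handling path rather than simply completing late.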
* Re: USB disk disconnect problems 2022-08-21 21:26 ` Matthew Dharm @ 2022-08-21 22:56 ` James Dutton 2022-08-22 10:03 ` Oliver Neukum 0 siblings, 1 reply; 15+ messages in thread From: James Dutton @ 2022-08-21 22:56 UTC (permalink / raw) To: Matthew Dharm; +Cc: Alan Stern, linux-usb@vger.kernel.org On Sun, 21 Aug 2022 at 22:26, Matthew Dharm <mdharm-usb@one-eyed-alien.net> wrote: > > The next level of deeper power savings would be a system-wide suspend > / resume, which we've already discussed and is a path which is already > handled (and also different from the original scenario you described). > I tried a suspend / resume cycle. 1) The laptop suspends in that the screen blanks and the power LED fades in/out as an indicator of suspend mode. 2) Power to the USB device is powered on while suspended. (LED light on USB device remains on during suspend.) 3) I can remove and reinsert the USB during suspend and it still resumes ok. 4) On exit from suspend everything looks to work ok. I see these messages in the syslog during the suspend/resume cycle: <6>1 2022-08-21T23:18:57+01:00 nvme2 kernel - - - [ 1127.688557] usb 4-2: reset SuperSpeed USB device number 2 using xhci_hcd <4>1 2022-08-21T23:18:57+01:00 nvme2 kernel - - - [ 1127.782252] usb 4-2: Enable of device-initiated U1 failed. <4>1 2022-08-21T23:18:57+01:00 nvme2 kernel - - - [ 1127.784263] usb 4-2: Enable of device-initiated U2 failed. Is U1/U2 failing a problem that could maybe be causing the problems I have seen? The error is in the logs, but the resume works, and the disk is accessible. 
When the real problem occurs (not during suspend/resume), an extract here:
<6>1 2022-05-04T14:32:53+01:00 nvme2 kernel - - - [20782.100705] sd 0:0:0:0: [sda] tag#8 uas_eh_abort_handler 0 uas-tag 2 inflight: CMD
<6>1 2022-05-04T14:32:53+01:00 nvme2 kernel - - - [20782.100707] sd 0:0:0:0: [sda] tag#8 CDB: Write(10) 2a 00 1c 51 11 20 00 00 20 00
<6>1 2022-05-04T14:32:53+01:00 nvme2 kernel - - - [20782.115321] scsi host0: uas_eh_device_reset_handler start
<6>1 2022-05-04T14:32:53+01:00 nvme2 kernel - - - [20782.248337] usb 4-1: reset SuperSpeed USB device number 2 using xhci_hcd
<4>1 2022-05-04T14:32:58+01:00 nvme2 kernel - - - [20787.463620] xhci_hcd 0000:00:14.0: Trying to add endpoint 0x83 without dropping it.
<3>1 2022-05-04T14:32:58+01:00 nvme2 kernel - - - [20787.463633] usb 4-1: failed to restore interface 0 altsetting 1 (error=-110)
<6>1 2022-05-04T14:32:58+01:00 nvme2 kernel - - - [20787.471524] scsi host0: uas_eh_device_reset_handler FAILED err -19
<6>1 2022-05-04T14:32:58+01:00 nvme2 kernel - - - [20787.471540] sd 0:0:0:0: Device offlined - not ready after error recovery

So, it is attempting to recover, but the recovery fails.
What is error -110 and err -19 ?
Are there any "quirks" that I could try enabling in relation to reset problems?

Kind Regards

James

^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: USB disk disconnect problems 2022-08-21 22:56 ` James Dutton @ 2022-08-22 10:03 ` Oliver Neukum 0 siblings, 0 replies; 15+ messages in thread From: Oliver Neukum @ 2022-08-22 10:03 UTC (permalink / raw) To: James Dutton, Matthew Dharm; +Cc: Alan Stern, linux-usb@vger.kernel.org On 22.08.22 00:56, James Dutton wrote: > I see these messages in the syslog during the suspend/resume cycle: > <6>1 2022-08-21T23:18:57+01:00 nvme2 kernel - - - [ 1127.688557] usb > 4-2: reset SuperSpeed USB device number 2 using xhci_hcd > <4>1 2022-08-21T23:18:57+01:00 nvme2 kernel - - - [ 1127.782252] usb > 4-2: Enable of device-initiated U1 failed. > <4>1 2022-08-21T23:18:57+01:00 nvme2 kernel - - - [ 1127.784263] usb > 4-2: Enable of device-initiated U2 failed. > > Is U1/U2 failing a problem that could maybe be causing the problems I have seen? > The error is in the logs, but the resume works, and the disk is accessible. That is power management. And for a disk to use only power management under the host's control is not a problem. > When the real problem occurs (not during suspend/resume), an extract here: > <6>1 2022-05-04T14:32:53+01:00 nvme2 kernel - - - [20782.100705] sd > 0:0:0:0: [sda] tag#8 uas_eh_abort_handler 0 uas-tag 2 inflight: CM A timeout has happened. > <6>1 2022-05-04T14:32:53+01:00 nvme2 kernel - - - [20782.100707] sd > 0:0:0:0: [sda] tag#8 CDB: Write(10) 2a 00 1c 51 11 20 00 00 20 00 > <6>1 2022-05-04T14:32:53+01:00 nvme2 kernel - - - [20782.115321] scsi > host0: uas_eh_device_reset_handler start At that time the SCSI layer does not know why a timeout has happened, so it starts generic error handling, involving a reset. > <6>1 2022-05-04T14:32:53+01:00 nvme2 kernel - - - [20782.248337] usb > 4-1: reset SuperSpeed USB device number 2 using xhci_hcd > <4>1 2022-05-04T14:32:58+01:00 nvme2 kernel - - - [20787.463620] > xhci_hcd 0000:00:14.0: Trying to add endpoint 0x83 without dropping > it. 
This should not happen. > <3>1 2022-05-04T14:32:58+01:00 nvme2 kernel - - - [20787.463633] usb > 4-1: failed to restore interface 0 altsetting 1 (error=-110) > <6>1 2022-05-04T14:32:58+01:00 nvme2 kernel - - - [20787.471524] scsi > host0: uas_eh_device_reset_handler FAILED err -19 > <6>1 2022-05-04T14:32:58+01:00 nvme2 kernel - - - [20787.471540] sd > 0:0:0:0: Device offlined - not ready after error recovery In this case the kernel does not think that your device has been disconnected. All error handling has failed. It gives up on the device but it is still known to the system. > So, it is attempting to recover, but the recovery fails. > What is error -110 and err -19 ? -19 is ENODEV and -110 is ETIMEDOUT. Those numbers are to be found in include/uapi/asm-generic/errno-base.h and include/uapi/asm-generic/errno.h > Are there any "quirks" that I could try enabling in relation to reset problems? Probably not. Is this log complete? Regards Oliver ^ permalink raw reply [flat|nested] 15+ messages in thread
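As a quick way to look up codes like the ones asked about above without opening the kernel headers, Python's standard `errno` module carries the same symbolic names. The sketch below pulls negative error codes out of kernel log lines and names them. The function names and the regular expression are illustrative, and the pattern only covers the two spellings seen in this thread (`error=-N` and `err -N`). The numeric values come from the platform's own errno table, which on Linux matches the values given in the reply.

```python
import errno
import re

# Matches "error=-110" and "err -19" as they appear in the quoted logs.
_ERR_RE = re.compile(r"\b(?:error=|err )(-\d+)")


def decode_kernel_error(code):
    """Map a (usually negative) kernel return code to its errno name."""
    return errno.errorcode.get(abs(code), "unknown")


def errors_in_log(line):
    """Return [(code, name), ...] for every error code found in `line`."""
    return [(int(raw), decode_kernel_error(int(raw)))
            for raw in _ERR_RE.findall(line)]


if __name__ == "__main__":
    sample = "scsi host0: uas_eh_device_reset_handler FAILED err -19"
    print(errors_in_log(sample))
```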
* Re: USB disk disconnect problems 2022-08-21 20:59 ` James Dutton 2022-08-21 21:26 ` Matthew Dharm @ 2022-08-22 10:18 ` Oliver Neukum 1 sibling, 0 replies; 15+ messages in thread From: Oliver Neukum @ 2022-08-22 10:18 UTC (permalink / raw) To: James Dutton, Matthew Dharm; +Cc: Alan Stern, linux-usb@vger.kernel.org On 21.08.22 22:59, James Dutton wrote: > On Sun, 21 Aug 2022 at 21:03, Matthew Dharm > <mdharm-usb@one-eyed-alien.net> wrote: >> In the “old days” (that is, my original design for use-storage) it >> used to do exactly what you are looking for - based on VID, DID, and >> SerialNumber it would “remember” devices. The SCSI host would never be >> destroyed, and when a device re-appeared it would be re-connected to >> the existing host. Arguably, in case ACPI tells us that the port is internal we ought to reintroduce that behavior. It is very much an edge case, though. >> Honestly, I’m not even sure where you would need to begin to make this >> work. It would require pretty radical changes is the block I/O layers >> to differentiate different failure modes, keep a lot more data around >> after certain types of failures, allow for specifying which devices >> this new policy (which is assuming reconnected devices really haven’t >> been altered) applies to, etc — it’s a big lift. Basically like failover with multi path I'd say. > Are there any situations where we should actually try to recover? > What about: > The OS has not needed to read/write to the disk in a while. The USB > disk idles out and goes into a power save mode by itself. > The OS then wishes to write something, but would need to go through > some sort of wake up procedure first. We have three issues 1) Is this the same device? 2) Has it been altered while it was disconnected? 3) What do we do in case of memory pressure causing pages to be laundered? 
In case of device persistence we ignore #1 and #2, and #3 does not exist. > I don't know if that is a state that is available for USB devices, but > if it was, would it be fair to try and recover? That is strictly speaking not a USB question. Every device has this issue. You just do not check on resumption from S3 or S4 whether somebody has replaced the SATA disk in your system. Regards Oliver ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: USB disk disconnect problems 2022-08-21 11:17 USB disk disconnect problems James Dutton 2022-08-21 14:47 ` Alan Stern @ 2022-10-03 18:04 ` James Dutton 2022-10-03 18:17 ` Alan Stern 1 sibling, 1 reply; 15+ messages in thread From: James Dutton @ 2022-10-03 18:04 UTC (permalink / raw) To: linux-usb@vger.kernel.org On Sun, 21 Aug 2022 at 12:17, James Dutton <james.dutton@gmail.com> wrote: > > Hi, > > Say I have mounted a usb disk. > I then disconnect the usb device > Linux complains about failed writes etc. > I then plug the usb device back in > Linux still complains about failed writes, and does not recover. > > How do I get Linux to recognise the reinserted usb disk and carry on as normal? > > I know my suggested behaviour might be detrimental for some users, in > case one modifies the usb disk in another computer and then comes > back, but I would like an option that assumes it has not been plugged > into anything else. > > The reason being, I have a system that boots from a USB disk. > Due to interference, the USB device disconnects for a second or two > and then comes back, but Linux does not see it and I have to reboot > Linux to recover. So, in this situation I wish Linux to be able to > recover immediately, without needing a reboot. > > The physical USB device removal then reinserting reproduces the > problem I am seeing, so I thought it would be a good example to get > working, if we could. > > Can anyone give me any pointers as to where to start with fixing this? > > Kind Regards > > James I have done some more tests. With the device plugged in, I manually sent a command to reset the USB device, using the instructions listed here: https://askubuntu.com/questions/645/how-do-you-reset-a-usb-device-from-the-command-line The reset fails. It never recovers. So, I think there is some problem relating to USB 3.x reset, and maybe just my specific device which is an NVME storage in a USB dock. 
I think the problem is more to do with the Linux kernel's USB 3.x reset procedure, rather than any other cause. Is there any quirk or test I can add, that would remove power from the USB port and return it, as part of the reset procedure? Or, is there any extra debug logging I can enable to help diagnose where the reset function is failing? ^ permalink raw reply [flat|nested] 15+ messages in thread
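For anyone trying to reproduce the manual reset described above, one common way to reset a USB device from user space is the `USBDEVFS_RESET` ioctl on the device's usbfs node. Whether that is exactly the method used in the message is an assumption. The sketch below is a minimal Python version. The helper names are made up for illustration, the ioctl number is computed with the common `_IO('U', 20)` encoding used on x86 and ARM, and running it needs root and will disrupt any mounted filesystem on the device.

```python
import fcntl
import os

# _IO('U', 20) from include/uapi/linux/usbdevice_fs.h. This value assumes
# the usual ioctl encoding (x86, ARM); a few architectures encode it
# differently.
USBDEVFS_RESET = (ord("U") << 8) | 20


def usbfs_path(bus, dev):
    """Build the usbfs node path, e.g. bus 4, device 2 -> /dev/bus/usb/004/002."""
    return f"/dev/bus/usb/{bus:03d}/{dev:03d}"


def reset_usb_device(bus, dev):
    """Ask the kernel to reset one USB device. Raises OSError on failure."""
    fd = os.open(usbfs_path(bus, dev), os.O_WRONLY)
    try:
        fcntl.ioctl(fd, USBDEVFS_RESET, 0)
    finally:
        os.close(fd)
```

The bus and device numbers are the ones shown by `lsusb`, for example `reset_usb_device(4, 2)` for "Bus 004 Device 002".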
* Re: USB disk disconnect problems 2022-10-03 18:04 ` James Dutton @ 2022-10-03 18:17 ` Alan Stern 2022-10-03 20:21 ` James Dutton 0 siblings, 1 reply; 15+ messages in thread From: Alan Stern @ 2022-10-03 18:17 UTC (permalink / raw) To: James Dutton; +Cc: linux-usb@vger.kernel.org On Mon, Oct 03, 2022 at 07:04:05PM +0100, James Dutton wrote: > I have done some more tests. > With the device plugged in, and me manually send a command to reset > the USB device. > Using instructions listed here: > https://askubuntu.com/questions/645/how-do-you-reset-a-usb-device-from-the-command-line > > The reset fails. > It never recovers. > So, I think there is some problem relating to USB 3.x reset, and maybe > just my specific device which is an NVME storage in a USB dock. > I think the problem is more to do with the Linux kernel's USB 3.x > reset procedure, rather than any other cause. > Is there any quirk or test I can add, that would remove power from the > USB port and return it, as part of the reset procedure? > Or, is there any extra debug logging I can enable to help diagnose > where the reset function is failing? You can try collecting a usbmon trace of the reset (instructions on the web or in Documentation/usb/usbmon.rst in the kernel source). That will provide some clues as to whether the problem lies in the reset itself or in the activities that follow the reset. Have you tried running a similar test using, say, a plain old USB thumb drive in place of the NVME storage device? Alan Stern ^ permalink raw reply [flat|nested] 15+ messages in thread
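A usbmon text trace covers every device on a bus, so it usually helps to filter it down to the one device being reset. The sketch below assumes the text interface described in Documentation/usb/usbmon.rst: debugfs mounted at /sys/kernel/debug, the usbmon module loaded, and an address word of the form `Ci:1:001:0` (type and direction, bus, device, endpoint) as the fourth field of each line. The function names are illustrative only.

```python
def usbmon_text_path(bus):
    """Path of the usbmon text stream for one bus, e.g. bus 4 -> .../4u."""
    return f"/sys/kernel/debug/usb/usbmon/{bus}u"


def lines_for_device(lines, bus, dev):
    """Yield only the trace lines whose address word names bus:dev."""
    wanted = f":{bus}:{dev:03d}:"
    for line in lines:
        fields = line.split()
        # Field 0: URB tag, 1: timestamp, 2: event type, 3: address word.
        if len(fields) >= 4 and wanted in fields[3]:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Bus 4, device 2 are the numbers reported by lsusb in this thread.
    # Reading the stream needs root and blocks until traffic arrives.
    with open(usbmon_text_path(4)) as trace:
        for line in lines_for_device(trace, 4, 2):
            print(line)
```

Capturing one filtered trace for a device that resets cleanly and one for the failing device gives two logs that can be compared side by side.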
* Re: USB disk disconnect problems
From: James Dutton @ 2022-10-03 20:21 UTC
To: Alan Stern; +Cc: linux-usb@vger.kernel.org

On Mon, 3 Oct 2022 at 19:17, Alan Stern <stern@rowland.harvard.edu> wrote:
>
> On Mon, Oct 03, 2022 at 07:04:05PM +0100, James Dutton wrote:
> > I have done some more tests.
> > With the device plugged in, I manually send a command to reset
> > the USB device.
> > Using instructions listed here:
> > https://askubuntu.com/questions/645/how-do-you-reset-a-usb-device-from-the-command-line
> >
> > The reset fails.
> > It never recovers.
> > So, I think there is some problem relating to USB 3.x reset, and maybe
> > just my specific device which is an NVME storage in a USB dock.
> > I think the problem is more to do with the Linux kernel's USB 3.x
> > reset procedure, rather than any other cause.
> > Is there any quirk or test I can add, that would remove power from the
> > USB port and return it, as part of the reset procedure?
> > Or, is there any extra debug logging I can enable to help diagnose
> > where the reset function is failing?
>
> You can try collecting a usbmon trace of the reset (instructions on the
> web or in Documentation/usb/usbmon.rst in the kernel source). That will
> provide some clues as to whether the problem lies in the reset itself or
> in the activities that follow the reset.
>
> Have you tried running a similar test using, say, a plain old USB thumb
> drive in place of the NVME storage device?
>

I have tried the reset command on USB 2.0 and USB 3.0 flash sticks,
and they reset OK.
So, it seems to be a problem with this specific NVME USB device.
This NVME USB device says it is USB 3.2 when I do lsusb.
I don't have a USB 3.2 flash stick to compare it with.

lsusb output:
Bus 004 Device 002: ID 0bda:9210 Realtek Semiconductor Corp. RTL9210 M.2 NVME Adapter
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               3.20
  bDeviceClass            0
  bDeviceSubClass         0
  bDeviceProtocol         0
  bMaxPacketSize0         9
  idVendor           0x0bda Realtek Semiconductor Corp.
  idProduct          0x9210 RTL9210 M.2 NVME Adapter
  bcdDevice           20.01

I will try to capture a usbmon trace and compare the reset of the flash
sticks with the reset of the NVME USB device.
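
When repeating the test across several devices, the bus and device numbers needed for the reset and for the usbmon capture can be taken from the first line of the lsusb output. The following Python sketch parses a line of the shape shown above. The function name and the returned keys are illustrative only, and the pattern assumes the usual "Bus NNN Device NNN: ID vvvv:pppp description" layout.

```python
import re

# Matches the summary line that lsusb prints for each device.
LSUSB_LINE = re.compile(
    r"Bus (\d{3}) Device (\d{3}): ID ([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\s*(.*)"
)


def parse_lsusb_line(line: str) -> dict:
    """Extract bus, device, IDs and description from one lsusb line."""
    match = LSUSB_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not an lsusb summary line: {line!r}")
    bus, device, vendor, product, description = match.groups()
    return {
        "bus": int(bus),
        "device": int(device),
        "vendor_id": vendor.lower(),
        "product_id": product.lower(),
        "description": description,
        # usbfs node used for ioctls such as a device reset
        "node": f"/dev/bus/usb/{bus}/{device}",
    }
```

For the adapter in this thread the result has bus 4 and device 2, so the matching usbmon text file would be the one for bus 4.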
Thread overview: 15+ messages
2022-08-21 11:17 USB disk disconnect problems James Dutton
2022-08-21 14:47 ` Alan Stern
2022-08-21 16:36 ` James Dutton
2022-08-21 16:40 ` James Dutton
2022-08-21 18:11 ` Alan Stern
[not found] ` <CAA6KcBC2wEc78fgrMLBfbyEinR3rVUY6z8HeUbE=wtv0c4BP2Q@mail.gmail.com>
2022-08-21 19:03 ` Alan Stern
2022-08-21 20:03 ` Matthew Dharm
2022-08-21 20:59 ` James Dutton
2022-08-21 21:26 ` Matthew Dharm
2022-08-21 22:56 ` James Dutton
2022-08-22 10:03 ` Oliver Neukum
2022-08-22 10:18 ` Oliver Neukum
2022-10-03 18:04 ` James Dutton
2022-10-03 18:17 ` Alan Stern
2022-10-03 20:21 ` James Dutton