From: Oliver Neukum <oneukum@suse.com>
To: Alan Stern <stern@rowland.harvard.edu>
Cc: EJ Hsu <ejh@nvidia.com>,
"linux-usb@vger.kernel.org" <linux-usb@vger.kernel.org>
Subject: Re: [PATCH] usb: uas: fix a plug & unplug racing
Date: Wed, 15 Jan 2020 10:31:23 +0100 [thread overview]
Message-ID: <1579080683.15925.24.camel@suse.com> (raw)
In-Reply-To: <Pine.LNX.4.44L0.2001140956040.1593-100000@iolanthe.rowland.org>
Am Dienstag, den 14.01.2020, 10:04 -0500 schrieb Alan Stern:
> On Tue, 14 Jan 2020, Oliver Neukum wrote:
>
> > Am Dienstag, den 14.01.2020, 03:28 +0000 schrieb EJ Hsu:
> > > Oliver Neukum wrote:
> > >
> > > > Am Sonntag, den 12.01.2020, 19:30 -0800 schrieb EJ Hsu:
> > > >
> > > > Isn't that the bug? A command to a detached device should fail.
> > > > Could you please elaborate? This issue would not be limited to uas.
> > > >
> > >
> > > In the case I mentioned, the hub thread of external hub running
> > > uas_probe() will get stuck waiting for the completion of scsi scan.
> > >
> > > The scsi scan will try to probe a single LUN using a SCSI INQUIRY.
> > > If the external hub has been unplugged before LUN probe, the device
> > > state of uas device will be set to USB_STATE_NOTATTACHED by the
> > > root hub thread. So, all the following calls to usb_submit_urb() in
> > > uas driver will return -NODEV, and accordingly uas_queuecommand_lck()
> > > will return SCSI_MLQUEUE_DEVICE_BUSY to scsi_request_fn().
> >
> > And that looks like the root cause. The queue isn't busy.
> > It is dead.
>
> No. The discussion has gotten a little confused. EJ's point is that
> if SCSI scanning takes place in the context of the hub worker thread,
> then that thread won't be available to process a disconnect
> notification. The device will be unplugged, but the kernel won't
> realize it until the SCSI scanning is finished.
OK, I think we have two possible code paths at least
First:
uas_queuecommand_lck() -> uas_submit_urbs() -> -ENODEV -> return
SCSI_MLQUEUE_DEVICE_BUSY
That looks wrong to me.
Second:
The submission actually works and we eventually terminate the URB
with an error. In that case nothing happens and eventually SCSI core
retries, times out and does error handling.
> > > scsi_request_fn() then puts this scsi command back into request queue.
> > > Because this scsi device is just created and during LUN probe process,
> > > this scsi command is the only one in the request queue. So, it will be picked
> > > up soon and dispatched to uas driver again. This cycle will continue until
> > > uas_disconnect() is called and its "resetting" flag is set. However, the
> > > hub thread of external hub still got stuck waiting for the completion of
> > > this scsi command, and may not be able to run uas_disconnect().
> > > A deadlock happened.
> >
> > I see. But we are working around insufficient error reporting in the
> > SCSI midlayer.
>
> No, the error reporting there is correct. URBs will complete with
> errors like -EPROTO but no other indication that the device is gone, so
> the midlayer believes that a retry is appropriate.
>
> Perhaps uas should treat -EPROTO, -EILSEQ, and -ETIME as fatal errors.
They could happen due to bad cables.
Now, another work queue would solve the second error case, but I think
the first error case still exists and we would paper over it.
Regards
Oliver
next prev parent reply other threads:[~2020-01-15 9:31 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-13 3:30 [PATCH] usb: uas: fix a plug & unplug racing EJ Hsu
2020-01-13 9:44 ` Oliver Neukum
2020-01-13 15:21 ` Alan Stern
2020-01-14 3:28 ` EJ Hsu
2020-01-14 14:41 ` Oliver Neukum
2020-01-14 15:04 ` Alan Stern
2020-01-15 9:31 ` Oliver Neukum [this message]
2020-01-15 15:17 ` Alan Stern
2020-01-15 15:54 ` EJ Hsu
2020-01-20 9:38 ` Oliver Neukum
2020-01-21 11:29 ` EJ Hsu
2020-01-22 9:52 ` EJ Hsu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1579080683.15925.24.camel@suse.com \
--to=oneukum@suse.com \
--cc=ejh@nvidia.com \
--cc=linux-usb@vger.kernel.org \
--cc=stern@rowland.harvard.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.