From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: Suwan Kim <suwan.kim027@gmail.com>
Cc: linux-usb@vger.kernel.org, Shuah Khan <skhan@linuxfoundation.org>,
Valentina Manea <valentina.manea.m@gmail.com>
Subject: Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock
Date: Tue, 10 Dec 2019 16:32:21 +0100 [thread overview]
Message-ID: <20191210153221.GG11116@mail-itl> (raw)
In-Reply-To: <20191210142535.GA4489@localhost.localdomain>
[-- Attachment #1: Type: text/plain, Size: 3683 bytes --]
On Tue, Dec 10, 2019 at 11:25:35PM +0900, Suwan Kim wrote:
> On Mon, Dec 09, 2019 at 03:19:59PM +0100, Marek Marczykowski-Górecki wrote:
> > On Mon, Dec 09, 2019 at 03:35:43PM +0900, Suwan Kim wrote:
> > > On Mon, Dec 09, 2019 at 04:37:40AM +0100, Marek Marczykowski-Górecki wrote:
> > > > On Mon, Dec 09, 2019 at 11:01:30AM +0900, Suwan Kim wrote:
> > > > > On Fri, Dec 06, 2019 at 09:57:42PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > [ 212.890519] usb 1-1: recv xbuf, 42
> > > > >
> > > > > This message is printed by receive error and before that, driver
> > > > > canceled URB transmission. we need to know the exact situation
> > > > > before this message.
> > > >
> > > > I've added some more messages and found recv_size is 0.
> > >
> > > That is the bug point. "size" is urb->actual_length that means
> > > amount of data actually received from device. And "copy" is
> > > amount of data received from usbip server. So, in this situation,
> > > vhci-hcd received all the data from usbip server even if there
> > > are more sg entries left. So, "copy == 0" means vhci-hcd receives
> > > all data from the server and we should check "if (copy == 0)" in
> > > for_each_sg() loop of usbip_recv_xbuff() to exit the loop and not
> > > to add error event.
> >
> > That makes sense. But I think there is also another issue here: hang in
> > case of an error. Here it was EINVAL, but there are probably other
> > reasons why usbip_recv can fail, like network error or misbehaving
> > server. This definitely should not cause the client to fail this way...
> > And also, the actual error code is lost.
>
> I agree. I have been taking a look at it and trying to reproduce
> the same issue on my machine. I guess race condition between hub
> irq thread and driver (vhci_urb_enqueue or hcd?)
> But I'm not sure...
I've tried it some more time and one time I've got a traceback pointing
at lock_release() called from vhci_urb_dequeue, not lock_acquire(). This
get me thinking it may not be deadlock on a spinlock, but some infinite
loop. Looking at the source, I think it's about usb_hcd_flush_endpoint
looping indefinitely because vhci_urb_dequeue() exit early on this:
spin_lock_irqsave(&vhci->lock, flags);
priv = urb->hcpriv;
if (!priv) {
/* URB was never linked! or will be soon given back by
* vhci_rx. */
spin_unlock_irqrestore(&vhci->lock, flags);
return -EIDRM;
}
Adding a print there confirms it.
And I think it's because of vhci_recv_ret_submit():
spin_lock_irqsave(&vdev->priv_lock, flags);
urb = pickup_urb_and_free_priv(vdev, pdu->base.seqnum);
// ****** priv freed here
spin_unlock_irqrestore(&vdev->priv_lock, flags);
if (!urb) {
pr_err("cannot find a urb of seqnum %u max seqnum %d\n",
pdu->base.seqnum,
atomic_read(&vhci_hcd->seqnum));
usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
return;
}
/* unpack the pdu to a urb */
usbip_pack_pdu(pdu, urb, USBIP_RET_SUBMIT, 0);
/* recv transfer buffer */
if (usbip_recv_xbuff(ud, urb) < 0)
// ***** exit early here
return;
/* recv iso_packet_descriptor */
if (usbip_recv_iso(ud, urb) < 0)
return;
I'm not really sure what should happen, but I think some cleanup in case
of usbip_recv_xbuff() failure is missing. And probably in case of
usbip_recv_iso() failure too.
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2019-12-10 15:32 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-06 3:24 "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock Marek Marczykowski-Górecki
2019-12-06 6:50 ` Suwan Kim
2019-12-06 20:57 ` Marek Marczykowski-Górecki
2019-12-06 21:12 ` Shuah Khan
2019-12-07 0:58 ` Marek Marczykowski-Górecki
2019-12-07 18:45 ` Marek Marczykowski-Górecki
2019-12-09 2:01 ` Suwan Kim
2019-12-09 3:37 ` Marek Marczykowski-Górecki
2019-12-09 6:35 ` Suwan Kim
2019-12-09 14:19 ` Marek Marczykowski-Górecki
2019-12-10 14:25 ` Suwan Kim
2019-12-10 15:32 ` Marek Marczykowski-Górecki [this message]
2019-12-11 3:07 ` Suwan Kim
2019-12-11 3:20 ` Marek Marczykowski-Górecki
2019-12-11 4:53 ` Suwan Kim
2019-12-11 6:27 ` Suwan Kim
2019-12-11 11:01 ` Marek Marczykowski-Górecki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191210153221.GG11116@mail-itl \
--to=marmarek@invisiblethingslab.com \
--cc=linux-usb@vger.kernel.org \
--cc=skhan@linuxfoundation.org \
--cc=suwan.kim027@gmail.com \
--cc=valentina.manea.m@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).