From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: Suwan Kim <suwan.kim027@gmail.com>
Cc: linux-usb@vger.kernel.org, Shuah Khan <skhan@linuxfoundation.org>,
Valentina Manea <valentina.manea.m@gmail.com>
Subject: Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock
Date: Tue, 10 Dec 2019 16:32:21 +0100 [thread overview]
Message-ID: <20191210153221.GG11116@mail-itl> (raw)
In-Reply-To: <20191210142535.GA4489@localhost.localdomain>
[-- Attachment #1: Type: text/plain, Size: 3683 bytes --]
On Tue, Dec 10, 2019 at 11:25:35PM +0900, Suwan Kim wrote:
> On Mon, Dec 09, 2019 at 03:19:59PM +0100, Marek Marczykowski-Górecki wrote:
> > On Mon, Dec 09, 2019 at 03:35:43PM +0900, Suwan Kim wrote:
> > > On Mon, Dec 09, 2019 at 04:37:40AM +0100, Marek Marczykowski-Górecki wrote:
> > > > On Mon, Dec 09, 2019 at 11:01:30AM +0900, Suwan Kim wrote:
> > > > > On Fri, Dec 06, 2019 at 09:57:42PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > [ 212.890519] usb 1-1: recv xbuf, 42
> > > > >
> > > > > This message is printed by receive error and before that, driver
> > > > > canceled URB transmission. we need to know the exact situation
> > > > > before this message.
> > > >
> > > > I've added some more messages and found recv_size is 0.
> > >
> > > That is the bug point. "size" is urb->actual_length that means
> > > amount of data actually received from device. And "copy" is
> > > amount of data received from usbip server. So, in this situation,
> > > vhci-hcd received all the data from usbip server even if there
> > > are more sg entries left. So, "copy == 0" means vhci-hcd receives
> > > all data from the server and we should check "if (copy == 0)" in
> > > for_each_sg() loop of usbip_recv_xbuff() to exit the loop and not
> > > to add error event.
> >
> > That makes sense. But I think there is also another issue here: hang in
> > case of an error. Here it was EINVAL, but there are probably other
> > reasons why usbip_recv can fail, like network error or misbehaving
> > server. This definitely should not cause the client to fail this way...
> > And also, the actual error code is lost.
>
> I agree. I have been taking a look at it and trying to reproduce
> the same issue on my machine. I guess race condition between hub
> irq thread and driver (vhci_urb_enqueue or hcd?)
> But I'm not sure...
I've tried it some more time and one time I've got a traceback pointing
at lock_release() called from vhci_urb_dequeue, not lock_acquire(). This
get me thinking it may not be deadlock on a spinlock, but some infinite
loop. Looking at the source, I think it's about usb_hcd_flush_endpoint
looping indefinitely because vhci_urb_dequeue() exit early on this:
spin_lock_irqsave(&vhci->lock, flags);
priv = urb->hcpriv;
if (!priv) {
/* URB was never linked! or will be soon given back by
* vhci_rx. */
spin_unlock_irqrestore(&vhci->lock, flags);
return -EIDRM;
}
Adding a print there confirms it.
And I think it's because of vhci_recv_ret_submit():
spin_lock_irqsave(&vdev->priv_lock, flags);
urb = pickup_urb_and_free_priv(vdev, pdu->base.seqnum);
// ****** priv freed here
spin_unlock_irqrestore(&vdev->priv_lock, flags);
if (!urb) {
pr_err("cannot find a urb of seqnum %u max seqnum %d\n",
pdu->base.seqnum,
atomic_read(&vhci_hcd->seqnum));
usbip_event_add(ud, VDEV_EVENT_ERROR_TCP);
return;
}
/* unpack the pdu to a urb */
usbip_pack_pdu(pdu, urb, USBIP_RET_SUBMIT, 0);
/* recv transfer buffer */
if (usbip_recv_xbuff(ud, urb) < 0)
// ***** exit early here
return;
/* recv iso_packet_descriptor */
if (usbip_recv_iso(ud, urb) < 0)
return;
I'm not really sure what should happen, but I think some cleanup in case
of usbip_recv_xbuff() failure is missing. And probably in case of
usbip_recv_iso() failure too.
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2019-12-10 15:32 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-06 3:24 "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock Marek Marczykowski-Górecki
2019-12-06 6:50 ` Suwan Kim
2019-12-06 20:57 ` Marek Marczykowski-Górecki
2019-12-06 21:12 ` Shuah Khan
2019-12-07 0:58 ` Marek Marczykowski-Górecki
2019-12-07 18:45 ` Marek Marczykowski-Górecki
2019-12-09 2:01 ` Suwan Kim
2019-12-09 3:37 ` Marek Marczykowski-Górecki
2019-12-09 6:35 ` Suwan Kim
2019-12-09 14:19 ` Marek Marczykowski-Górecki
2019-12-10 14:25 ` Suwan Kim
2019-12-10 15:32 ` Marek Marczykowski-Górecki [this message]
2019-12-11 3:07 ` Suwan Kim
2019-12-11 3:20 ` Marek Marczykowski-Górecki
2019-12-11 4:53 ` Suwan Kim
2019-12-11 6:27 ` Suwan Kim
2019-12-11 11:01 ` Marek Marczykowski-Górecki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191210153221.GG11116@mail-itl \
--to=marmarek@invisiblethingslab.com \
--cc=linux-usb@vger.kernel.org \
--cc=skhan@linuxfoundation.org \
--cc=suwan.kim027@gmail.com \
--cc=valentina.manea.m@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.