From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga14.intel.com ([192.55.52.115]:3560 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752815AbcJDOEZ (ORCPT ); Tue, 4 Oct 2016 10:04:25 -0400 From: Felipe Balbi To: Ville =?utf-8?B?U3lyasOkbMOk?= Cc: linux-usb@vger.kernel.org, stable@vger.kernel.org Subject: Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit" In-Reply-To: <20161004132059.GX4329@intel.com> References: <1475504282-13268-1-git-send-email-ville.syrjala@linux.intel.com> <87lgy5za7o.fsf@linux.intel.com> <20161004083855.GW4329@intel.com> <8760p8qp5b.fsf@linux.intel.com> <20161004132059.GX4329@intel.com> Date: Tue, 04 Oct 2016 17:03:58 +0300 Message-ID: <87mvikp5wx.fsf@linux.intel.com> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Sender: stable-owner@vger.kernel.org List-ID: --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Hi, Ville Syrj=C3=A4l=C3=A4 writes: > On Tue, Oct 04, 2016 at 03:23:12PM +0300, Felipe Balbi wrote: >>=20 >> Hi, >>=20 >> Ville Syrj=C3=A4l=C3=A4 writes: >> > On Mon, Oct 03, 2016 at 07:08:43PM +0300, Felipe Balbi wrote: >> >>=20 >> >> Hi, >> >>=20 >> >> ville.syrjala@linux.intel.com writes: >> >> > From: Ville Syrj=C3=A4l=C3=A4 >> >> > >> >> > This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c. >> >> > >> >> > commit 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs= for >> >> > LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp >> >> > is very sluggish and can even stall entirely. Revert cures it. >> >> > >> >> > Cc: Felipe Balbi >> >> > Cc: stable@vger.kernel.org >> >> > Signed-off-by: Ville Syrj=C3=A4l=C3=A4 >> >>=20 >> >> Reverting that commit is not the best idea. Let's start with the usua= l: >> >>=20 >> >> - Kernel version >> >> - dmesg on both sides (host and device) >> >> - dwc3 tracepoints: >> >>=20 >> >> # mkdir -p /t >> >> # mount -t tracefs none /t >> >> # echo 8192 > /t/buffer_size_kb >> >> # echo 1 > /t/events/dwc3/enable >> >> # echo 0 > /t/events/dwc3/dwc3_readl/enable >> >> # echo 0 > /t/events/dwc3/dwc3_writel/enable >> >>=20 >> >> This should be enough to tell me what's really going on. >> > >> > I've attached the (compressed due to size) dmesgs/traces from the devi= ce. >> > For my test I just did 'for i in `seq 1 5` ; dmesg ; done' in an ssh >> > session, and I did get the lag in the bad run, so presumably the trace >> > should have caught it. Both were on 4.8 (+ the revert for the good run= ). >> > >> > Host side dmesg just showes me: >> > usb 3-3.4: new high-speed USB device number 25 using xhci_hcd >> > cdc_ether 3-3.4:1.0 usb0: register 'cdc_ether' at usb-0000:00:14.0-3.= 4, CDC Ethernet Device, ... >> > when I boot the device. Nothing extra appears when things lag around. >> > I'm running 4.5.something on the host. >>=20 >> Looking at your logs what I can see is that "revert.trace" doesn't >> respect gadget driver's IRQ throttling, which means you get an interrupt >> for each and every request, while "bad.trace" shows dwc3 respecting >> "no_interrupt" flag passed in by g_ether. >>=20 >> The only explanation I have for this behavior when we start respecting >> gadget driver's flags, is that there's a possible race in u_ether.c >> where it can go for a long time without an interrupt due to that >> throttling. >>=20 >> This only happens with IN endpoints (the ones which are throttled) and >> the behavior can be seen in log snippet below: >>=20 >> 27782: irq/17-dwc3-2524 [000] d... 666.891969: tx_complete: req= ffff88016da1a6c0 length 1514/1514 flags ZsI tx_qlen 4 qmult 5 >> 27802: ksoftirqd/0-3 [000] .Ns1 666.892047: eth_start_xmit: = req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 5 qmult 5 >> 27808: ksoftirqd/0-3 [000] .Ns2 666.892062: eth_start_xmit: = req ffff88016da1a300 length 0/1514 flags ZsI tx_qlen 6 qmult 5 >> 27814: ksoftirqd/0-3 [000] .Ns2 666.892069: eth_start_xmit: = req ffff88016da1ab40 length 0/1514 flags Zsi tx_qlen 7 qmult 5 >> 27820: ksoftirqd/0-3 [000] .Ns2 666.892075: eth_start_xmit: = req ffff88016da1acc0 length 0/1514 flags Zsi tx_qlen 8 qmult 5 >> 27826: ksoftirqd/0-3 [000] .Ns2 666.892082: eth_start_xmit: = req ffff88016da1ae40 length 0/1514 flags Zsi tx_qlen 9 qmult 5 >> 27832: ksoftirqd/0-3 [000] .Ns2 666.892088: eth_start_xmit: = req ffff88016da1a780 length 0/1514 flags Zsi tx_qlen 10 qmult 5 >> 27884: irq/17-dwc3-2524 [000] d... 666.892213: tx_complete: req= ffff88016da1a840 length 1514/1514 flags Zsi tx_qlen 9 qmult 5 >> 27889: irq/17-dwc3-2524 [000] d... 666.892215: tx_complete: req= ffff88016da1a600 length 1514/1514 flags Zsi tx_qlen 8 qmult 5 >> 27897: sshd-2684 [002] ..s1 666.892223: eth_start_xmit: = req ffff88016da1a840 length 0/1514 flags Zsi tx_qlen 9 qmult 5 >> 27898: irq/17-dwc3-2524 [000] d... 666.892223: tx_complete: req= ffff88016da1a180 length 1514/1514 flags Zsi tx_qlen 8 qmult 5 >> 27903: irq/17-dwc3-2524 [000] d... 666.892225: tx_complete: req= ffff88016da1a240 length 1514/1514 flags Zsi tx_qlen 7 qmult 5 >> 27910: sshd-2684 [002] ..s1 666.892230: eth_start_xmit: = req ffff88016da1a180 length 0/1514 flags Zsi tx_qlen 8 qmult 5 >> 27913: irq/17-dwc3-2524 [000] d... 666.892231: tx_complete: req= ffff88016da1a6c0 length 1514/1514 flags Zsi tx_qlen 7 qmult 5 >> 27920: sshd-2684 [002] ..s1 666.892236: eth_start_xmit: = req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 8 qmult 5 >> 27923: irq/17-dwc3-2524 [000] d... 666.892238: tx_complete: req= ffff88016da1a300 length 1514/1514 flags ZsI tx_qlen 7 qmult 5 >> 27928: sshd-2684 [002] ..s1 666.892242: eth_start_xmit: = req ffff88016da1a240 length 0/1514 flags Zsi tx_qlen 8 qmult 5 >> 27944: sshd-2684 [002] ..s1 666.892253: eth_start_xmit: = req ffff88016da1a300 length 0/1514 flags Zsi tx_qlen 9 qmult 5 >> 27953: sshd-2684 [002] ..s1 666.892262: eth_start_xmit: = req ffff88016da1a600 length 0/1514 flags Zsi tx_qlen 10 qmult 5 >>=20 >> Note here that we reached tx_qlen =3D 10 without any interrupts being re= quested. >>=20 >> I patched u_ether with below to capture log above: >>=20 >> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/= function/u_ether.c >> index 5f562c1ec795..e34c26ffcdf4 100644 >> --- a/drivers/usb/gadget/function/u_ether.c >> +++ b/drivers/usb/gadget/function/u_ether.c >> @@ -465,6 +465,15 @@ static void tx_complete(struct usb_ep *ep, struct u= sb_request *req) >> dev_kfree_skb_any(skb); >>=20=20 >> atomic_dec(&dev->tx_qlen); >> + >> + trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult = %d\n", >> + req, req->actual, req->length, >> + req->zero ? 'Z' : 'z', >> + req->short_not_ok ? 'S' : 's', >> + req->no_interrupt ? 'i' : 'I', >> + atomic_read(&dev->tx_qlen), >> + dev->qmult); >> + >> if (netif_carrier_ok(dev->net)) >> netif_wake_queue(dev->net); >> } >> @@ -600,6 +609,12 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *s= kb, >> case 0: >> netif_trans_update(net); >> atomic_inc(&dev->tx_qlen); >> + trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %= d qmult %d\n", >> + req, req->actual, req->length, req->zero= ? 'Z' : 'z', >> + req->short_not_ok ? 'S' : 's', >> + req->no_interrupt ? 'i' : 'I', >> + atomic_read(&dev->tx_qlen), >> + dev->qmult); >> } >>=20=20 >> if (retval) { >>=20 >> The only way I can think of to avoid this problem is to make sure we >> don't starve our request queue with something like: >>=20 >> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/= function/u_ether.c >> index 5f562c1ec795..ddfa866c2cba 100644 >> --- a/drivers/usb/gadget/function/u_ether.c >> +++ b/drivers/usb/gadget/function/u_ether.c >> @@ -587,8 +596,9 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *sk= b, >>=20=20 >> /* throttle high/super speed IRQ rate back slightly */ >> if (gadget_is_dualspeed(dev->gadget)) >> - req->no_interrupt =3D (dev->gadget->speed =3D=3D USB_SPE= ED_HIGH || >> - dev->gadget->speed =3D=3D USB_SPEED= _SUPER) >> + req->no_interrupt =3D (((dev->gadget->speed =3D=3D USB_S= PEED_HIGH || >> + dev->gadget->speed =3D=3D USB_SPE= ED_SUPER)) && >> + !list_empty(&dev->tx_reqs)) >> ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != =3D 0) >> : 0; >>=20=20 >> I have been testing this for about half an hour and I can't see delays >> or link getting stuck. Care to test ? If it works for your setup, I'll >> send this with Cc: stable and your Reported-by and Tested-by tags. > > That doesn't appear to cure the problem unfortunately. I've attached anot= her > trace, with your u_ether trace_printks included. Now I only seem to find delays on ep1out. Here are a few instances: irq/22-dwc3-803 [001] d..1 19.446979: dwc3_gadget_ep_cmd: ep1out:= cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> statu= s: Successful irq/22-dwc3-803 [001] d..1 20.452232: dwc3_event: event (00006084= ): ep1out: Transfer In-Progress irq/22-dwc3-803 [001] d..1 20.761702: dwc3_gadget_ep_cmd: ep1out:= cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> statu= s: Successful irq/22-dwc3-803 [001] d..1 21.044345: dwc3_event: event (00006084= ): ep1out: Transfer In-Progress irq/22-dwc3-803 [001] d..1 21.999383: dwc3_gadget_ep_cmd: ep1out:= cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> statu= s: Successful [...] irq/22-dwc3-803 [001] d..1 22.239576: dwc3_event: event (00006084= ): ep1out: Transfer In-Progress irq/22-dwc3-803 [001] d..1 22.979703: dwc3_gadget_ep_cmd: ep1out:= cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> statu= s: Successful [...] irq/22-dwc3-803 [001] d..1 23.125096: dwc3_event: event (00006084= ): ep1out: Transfer In-Progress irq/22-dwc3-803 [001] d..1 23.820697: dwc3_gadget_ep_cmd: ep1out:= cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> statu= s: Successful irq/22-dwc3-803 [001] d..1 24.461117: dwc3_event: event (00006084= ): ep1out: Transfer In-Progress irq/22-dwc3-803 [001] d..1 24.882567: dwc3_gadget_ep_cmd: ep1out:= cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> statu= s: Successful irq/22-dwc3-803 [001] d..1 25.693330: dwc3_event: event (00006084= ): ep1out: Transfer In-Progress irq/22-dwc3-803 [001] d..1 25.702107: dwc3_gadget_ep_cmd: ep1out:= cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> statu= s: Successful irq/22-dwc3-803 [001] d..1 26.045495: dwc3_event: event (00006084= ): ep1out: Transfer In-Progress Based on this, it would seem that host side isn't sending the data; but that's quite odd. Let's put a sniffer here tomorrow and look at this more carefully, I couldn't reproduce this with SKL as I was testing this today. =2D-=20 balbi --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIcBAEBCAAGBQJX87bOAAoJEMy+uJnhGpkGXJgQAM61Tw9LXaCn0skTHa5uOKXi 82r+ji0OlwvEn5Okhm9gLtU+32A4KfAOE5VjJ5MqLs6jcJ3kj9kncRf2f/UfwPLo rF5xmVrwy0ZXzU1yBeY+TjXdhgbaMlbdhZNIzkLg7/umVGoBBqWansBmXhCkuHfi wbk+6xVwrXoZ3RG67eELsUEzQvOSA9r27LlMk8CAKyGGOZK5B09kaA7vXHcq60RH i0EJbaTfRSXn7J6N541b3VHhL8YmLDaofKcHYBrIcSPEUvX/VogS6DpSFs6XeOgt Xtl7fC34ag6OHHYSHC4MJ61XgBLZaj/Jh3qi9TtTSdCFfUCYtzVibJzPgwFMcxJD hFtEVTFV+GB5JwNuu58+xoXoKnSuPuTSMEN/iWZ3PV/d9ShrI/DWsPzNgymoQndC 0uY+4YpSodqk4KM1nYitZJFJl5PR6jsakGTxP8i8aqvBdXMCi1K2USv+iy6bBN/i vUPd5cPtR1U8WxsromwmPcLN5kvWCQ+sduLyjKQQEEb1LhWeDWg1whobzqkduBRN 86F2DO3/ocl+UwSRc7zTqF6B4wRZ2Mwab5E90s2IzJRn92loo4+8d67Jxtwi4IyK AQxTZEw0i8N2BHz1fUTRD8tveN+EmbmX+ADcRVLzi/HiPNfyZ96Lwhv6CBebhgTa qh6KfTnQYVM/jDcdUBv4 =PyxL -----END PGP SIGNATURE----- --=-=-=--