From: Felipe Balbi <felipe.balbi@linux.intel.com>
To: "Ville Syrjälä" <ville.syrjala@linux.intel.com>
Cc: linux-usb@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit"
Date: Tue, 04 Oct 2016 17:03:58 +0300 [thread overview]
Message-ID: <87mvikp5wx.fsf@linux.intel.com> (raw)
In-Reply-To: <20161004132059.GX4329@intel.com>
[-- Attachment #1: Type: text/plain, Size: 10808 bytes --]
Hi,
Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
> On Tue, Oct 04, 2016 at 03:23:12PM +0300, Felipe Balbi wrote:
>>
>> Hi,
>>
>> Ville Syrjälä <ville.syrjala@linux.intel.com> writes:
>> > On Mon, Oct 03, 2016 at 07:08:43PM +0300, Felipe Balbi wrote:
>> >>
>> >> Hi,
>> >>
>> >> ville.syrjala@linux.intel.com writes:
>> >> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> >> >
>> >> > This reverts commit 55a0237f8f47957163125e20ee9260538c5c341c.
>> >> >
>> >> > commit 55a0237f8f47 ("usb: dwc3: gadget: use allocated/queued reqs for
>> >> > LST bit") causes my BYT FFRD8 with g_ether to behave poorly. ssh/scp
>> >> > is very sluggish and can even stall entirely. Revert cures it.
>> >> >
>> >> > Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
>> >> > Cc: stable@vger.kernel.org
>> >> > Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> >>
>> >> Reverting that commit is not the best idea. Let's start with the usual:
>> >>
>> >> - Kernel version
>> >> - dmesg on both sides (host and device)
>> >> - dwc3 tracepoints:
>> >>
>> >> # mkdir -p /t
>> >> # mount -t tracefs none /t
>> >> # echo 8192 > /t/buffer_size_kb
>> >> # echo 1 > /t/events/dwc3/enable
>> >> # echo 0 > /t/events/dwc3/dwc3_readl/enable
>> >> # echo 0 > /t/events/dwc3/dwc3_writel/enable
>> >>
>> >> This should be enough to tell me what's really going on.
>> >
>> > I've attached the (compressed due to size) dmesgs/traces from the device.
>> > For my test I just did 'for i in `seq 1 5` ; dmesg ; done' in an ssh
>> > session, and I did get the lag in the bad run, so presumably the trace
>> > should have caught it. Both were on 4.8 (+ the revert for the good run).
>> >
>> > Host side dmesg just showes me:
>> > usb 3-3.4: new high-speed USB device number 25 using xhci_hcd
>> > cdc_ether 3-3.4:1.0 usb0: register 'cdc_ether' at usb-0000:00:14.0-3.4, CDC Ethernet Device, ...
>> > when I boot the device. Nothing extra appears when things lag around.
>> > I'm running 4.5.something on the host.
>>
>> Looking at your logs what I can see is that "revert.trace" doesn't
>> respect gadget driver's IRQ throttling, which means you get an interrupt
>> for each and every request, while "bad.trace" shows dwc3 respecting
>> "no_interrupt" flag passed in by g_ether.
>>
>> The only explanation I have for this behavior when we start respecting
>> gadget driver's flags, is that there's a possible race in u_ether.c
>> where it can go for a long time without an interrupt due to that
>> throttling.
>>
>> This only happens with IN endpoints (the ones which are throttled) and
>> the behavior can be seen in log snippet below:
>>
>> 27782: irq/17-dwc3-2524 [000] d... 666.891969: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags ZsI tx_qlen 4 qmult 5
>> 27802: ksoftirqd/0-3 [000] .Ns1 666.892047: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 5 qmult 5
>> 27808: ksoftirqd/0-3 [000] .Ns2 666.892062: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags ZsI tx_qlen 6 qmult 5
>> 27814: ksoftirqd/0-3 [000] .Ns2 666.892069: eth_start_xmit: req ffff88016da1ab40 length 0/1514 flags Zsi tx_qlen 7 qmult 5
>> 27820: ksoftirqd/0-3 [000] .Ns2 666.892075: eth_start_xmit: req ffff88016da1acc0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>> 27826: ksoftirqd/0-3 [000] .Ns2 666.892082: eth_start_xmit: req ffff88016da1ae40 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>> 27832: ksoftirqd/0-3 [000] .Ns2 666.892088: eth_start_xmit: req ffff88016da1a780 length 0/1514 flags Zsi tx_qlen 10 qmult 5
>> 27884: irq/17-dwc3-2524 [000] d... 666.892213: tx_complete: req ffff88016da1a840 length 1514/1514 flags Zsi tx_qlen 9 qmult 5
>> 27889: irq/17-dwc3-2524 [000] d... 666.892215: tx_complete: req ffff88016da1a600 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>> 27897: sshd-2684 [002] ..s1 666.892223: eth_start_xmit: req ffff88016da1a840 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>> 27898: irq/17-dwc3-2524 [000] d... 666.892223: tx_complete: req ffff88016da1a180 length 1514/1514 flags Zsi tx_qlen 8 qmult 5
>> 27903: irq/17-dwc3-2524 [000] d... 666.892225: tx_complete: req ffff88016da1a240 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
>> 27910: sshd-2684 [002] ..s1 666.892230: eth_start_xmit: req ffff88016da1a180 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>> 27913: irq/17-dwc3-2524 [000] d... 666.892231: tx_complete: req ffff88016da1a6c0 length 1514/1514 flags Zsi tx_qlen 7 qmult 5
>> 27920: sshd-2684 [002] ..s1 666.892236: eth_start_xmit: req ffff88016da1a6c0 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>> 27923: irq/17-dwc3-2524 [000] d... 666.892238: tx_complete: req ffff88016da1a300 length 1514/1514 flags ZsI tx_qlen 7 qmult 5
>> 27928: sshd-2684 [002] ..s1 666.892242: eth_start_xmit: req ffff88016da1a240 length 0/1514 flags Zsi tx_qlen 8 qmult 5
>> 27944: sshd-2684 [002] ..s1 666.892253: eth_start_xmit: req ffff88016da1a300 length 0/1514 flags Zsi tx_qlen 9 qmult 5
>> 27953: sshd-2684 [002] ..s1 666.892262: eth_start_xmit: req ffff88016da1a600 length 0/1514 flags Zsi tx_qlen 10 qmult 5
>>
>> Note here that we reached tx_qlen = 10 without any interrupts being requested.
>>
>> I patched u_ether with below to capture log above:
>>
>> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
>> index 5f562c1ec795..e34c26ffcdf4 100644
>> --- a/drivers/usb/gadget/function/u_ether.c
>> +++ b/drivers/usb/gadget/function/u_ether.c
>> @@ -465,6 +465,15 @@ static void tx_complete(struct usb_ep *ep, struct usb_request *req)
>> dev_kfree_skb_any(skb);
>>
>> atomic_dec(&dev->tx_qlen);
>> +
>> + trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
>> + req, req->actual, req->length,
>> + req->zero ? 'Z' : 'z',
>> + req->short_not_ok ? 'S' : 's',
>> + req->no_interrupt ? 'i' : 'I',
>> + atomic_read(&dev->tx_qlen),
>> + dev->qmult);
>> +
>> if (netif_carrier_ok(dev->net))
>> netif_wake_queue(dev->net);
>> }
>> @@ -600,6 +609,12 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>> case 0:
>> netif_trans_update(net);
>> atomic_inc(&dev->tx_qlen);
>> + trace_printk("req %p length %d/%d flags %c%c%c tx_qlen %d qmult %d\n",
>> + req, req->actual, req->length, req->zero ? 'Z' : 'z',
>> + req->short_not_ok ? 'S' : 's',
>> + req->no_interrupt ? 'i' : 'I',
>> + atomic_read(&dev->tx_qlen),
>> + dev->qmult);
>> }
>>
>> if (retval) {
>>
>> The only way I can think of to avoid this problem is to make sure we
>> don't starve our request queue with something like:
>>
>> diff --git a/drivers/usb/gadget/function/u_ether.c b/drivers/usb/gadget/function/u_ether.c
>> index 5f562c1ec795..ddfa866c2cba 100644
>> --- a/drivers/usb/gadget/function/u_ether.c
>> +++ b/drivers/usb/gadget/function/u_ether.c
>> @@ -587,8 +596,9 @@ static netdev_tx_t eth_start_xmit(struct sk_buff *skb,
>>
>> /* throttle high/super speed IRQ rate back slightly */
>> if (gadget_is_dualspeed(dev->gadget))
>> - req->no_interrupt = (dev->gadget->speed == USB_SPEED_HIGH ||
>> - dev->gadget->speed == USB_SPEED_SUPER)
>> + req->no_interrupt = (((dev->gadget->speed == USB_SPEED_HIGH ||
>> + dev->gadget->speed == USB_SPEED_SUPER)) &&
>> + !list_empty(&dev->tx_reqs))
>> ? ((atomic_read(&dev->tx_qlen) % dev->qmult) != 0)
>> : 0;
>>
>> I have been testing this for about half an hour and I can't see delays
>> or link getting stuck. Care to test ? If it works for your setup, I'll
>> send this with Cc: stable and your Reported-by and Tested-by tags.
>
> That doesn't appear to cure the problem unfortunately. I've attached another
> trace, with your u_ether trace_printks included.
Now I only seem to find delays on ep1out. Here are a few instances:
irq/22-dwc3-803 [001] d..1 19.446979: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
irq/22-dwc3-803 [001] d..1 20.452232: dwc3_event: event (00006084): ep1out: Transfer In-Progress
irq/22-dwc3-803 [001] d..1 20.761702: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
irq/22-dwc3-803 [001] d..1 21.044345: dwc3_event: event (00006084): ep1out: Transfer In-Progress
irq/22-dwc3-803 [001] d..1 21.999383: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
[...]
irq/22-dwc3-803 [001] d..1 22.239576: dwc3_event: event (00006084): ep1out: Transfer In-Progress
irq/22-dwc3-803 [001] d..1 22.979703: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
[...]
irq/22-dwc3-803 [001] d..1 23.125096: dwc3_event: event (00006084): ep1out: Transfer In-Progress
irq/22-dwc3-803 [001] d..1 23.820697: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
irq/22-dwc3-803 [001] d..1 24.461117: dwc3_event: event (00006084): ep1out: Transfer In-Progress
irq/22-dwc3-803 [001] d..1 24.882567: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
irq/22-dwc3-803 [001] d..1 25.693330: dwc3_event: event (00006084): ep1out: Transfer In-Progress
irq/22-dwc3-803 [001] d..1 25.702107: dwc3_gadget_ep_cmd: ep1out: cmd 'Update Transfer' [131079] params 00000000 00000000 00000000 --> status: Successful
irq/22-dwc3-803 [001] d..1 26.045495: dwc3_event: event (00006084): ep1out: Transfer In-Progress
Based on this, it would seem that host side isn't sending the data; but
that's quite odd. Let's put a sniffer here tomorrow and look at this
more carefully, I couldn't reproduce this with SKL as I was testing this
today.
--
balbi
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 800 bytes --]
next prev parent reply other threads:[~2016-10-04 14:04 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-10-03 14:18 [PATCH] Revert "usb: dwc3: gadget: use allocated/queued reqs for LST bit" ville.syrjala
2016-10-03 16:08 ` Felipe Balbi
2016-10-04 7:43 ` Felipe Balbi
2016-10-04 8:38 ` Ville Syrjälä
2016-10-04 12:23 ` Felipe Balbi
2016-10-04 13:20 ` Ville Syrjälä
2016-10-04 14:03 ` Felipe Balbi [this message]
2016-10-06 7:36 ` Felipe Balbi
2016-10-06 9:08 ` Ville Syrjälä
2016-10-27 15:40 ` Ville Syrjälä
2016-10-28 10:16 ` Felipe Balbi
2016-10-28 16:33 ` Ville Syrjälä
2016-10-31 18:51 ` David Miller
2016-11-01 11:30 ` Felipe Balbi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87mvikp5wx.fsf@linux.intel.com \
--to=felipe.balbi@linux.intel.com \
--cc=linux-usb@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=ville.syrjala@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).