From: Ferry Toth <fntoth@gmail.com>
To: Thinh Nguyen <Thinh.Nguyen@synopsys.com>,
Felipe Balbi <balbi@kernel.org>,
Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Alan Stern <stern@rowland.harvard.edu>, USB <linux-usb@vger.kernel.org>
Subject: Re: USB network gadget / DWC3 issue
Date: Sat, 3 Apr 2021 13:25:27 +0200
Message-ID: <4a0869c9-6b71-5acd-e670-e4c06b44d62d@gmail.com>
In-Reply-To: <6b3a28eb-7809-d319-d58d-520c1c7fa5d2@synopsys.com>
Hi,
On 03-04-2021 at 04:02, Thinh Nguyen wrote:
> Ferry Toth wrote:
>> Hi,
>>
>> On 02-04-2021 at 22:16, Thinh Nguyen wrote:
>>> Ferry Toth wrote:
>>>> Hi
>>>>
>>>> On 30-03-2021 at 23:57, Ferry Toth wrote:
>>>>> Hi
>>>>>
>>>>> On 30-03-2021 at 22:26, Ferry Toth wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 30-03-2021 at 18:17, Felipe Balbi wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> Andy Shevchenko <andy.shevchenko@gmail.com> writes:
>>>>>>>> Hi!
>>>>>>>>
>>>>>>>> I have a platform with DWC3 in Dual Role mode. Currently I'm
>>>>>>>> experimenting on v5.12-rc5 with a few patches (mostly configuration)
>>>>>>>> applied [1]. I'm using Debian Unstable on the host machine and
>>>>>>>> BuildRoot with the above mentioned kernel on the target.
>>>>>>>>
>>>>>>>> So, scenario 0:
>>>>>>>> 1. Run iperf3 -s on target
>>>>>>>> 2. Run iperf3 -c ... -t 0 on the host
>>>>>>>> 3. 0.00-10.36 sec 237 MBytes 192 Mbits/sec receiver
>>>>>>>>
>>>>>>>> Scenario 1:
>>>>>>>> 1. Now, detach USB cable, wait for several seconds, attach it back,
>>>>>>>> repeat above:
>>>>>>>> 0.00-9.94 sec 209 MBytes 176 Mbits/sec receiver
>>>>>>>>
>>>>>>>> Note the bandwidth drop (176 vs. 192).
>>>>>>>>
>>>>>>>> (Repeating scenario 1 now gives the same result)
>>>>>>>>
>>>>>>>> Scenario 2:
>>>>>>>> 1. Detach USB cable, attach a device, for example a USB stick.
>>>>>>>> 2. See it being enumerated and detach it.
>>>>>>>> 3. Attach cable from host.
>>>>>>>> 4. 0.00-19.36 sec 315 MBytes 136 Mbits/sec receiver
>>>>>>>>
>>>>>>>> Note even more bandwidth drop!
>>>>>>>>
>>>>>>>> (Repeating scenario 1 keeps the same lower bandwidth)
>>>>>>>>
>>>>>>>> NOTE: sometimes in this scenario, after several seconds, the target
>>>>>>>> simply reboots (w/o any logs [from kernel] printed)!
>>>>>>>>
>>>>>>>> So, any pointers on how to debug and what can be a smoking gun here?
>>>>>>>>
>>>>>>>> Ferry reported this in [2]. There are different kernel versions and
>>>>>>>> tools to establish the connection (like connman vs. none in my
>>>>>>>> case).
>>>>>>>>
>>>>>>>> [1]:
>>>>>>>> https://github.com/andy-shev/linux/
>>>>>>>>
>>>>>>>> [2]:
>>>>>>>> https://github.com/andy-shev/linux/issues/31
>>>>>>>>
>>>>>>> dwc3 tracepoints should give some initial hints. Look at packet sizes
>>>>>>> and period of transmission. From the dwc3 side, I can't think of anything
>>>>>>> we would do to throttle the transmission, but tracepoints should tell a
>>>>>>> clearer story.
>>>>>>>
>>>>>> My testing (but yes, with a different kernel and network managed by
>>>>>> connman) shows:
>>>>>>
>>>>>> 1) on cold boot eem network gadget works fine
>>>>>>
>>>>>> 2) after unplug or warm reboot (which is also an unplug) it's broken:
>>>>>> speed is lost (12.0 Mbits/sec, down from 200 Mb/s normally), packets are
>>>>>> lost, no configuration is received from dhcp, occasional reboot; the only
>>>>>> way to fix it is a cold boot
>>>>>>
>>>>>> 3) if I run `connmanctl disable gadget` before unplug, it works fine
>>>>>> after replugging and enabling again
>>>>>>
>>>>>> My theory is that some HW register is disturbed on a surprise unplug
>>>>>> and is not reset on plug or warm boot, but is cleared on cold boot.
>>>>>> Maybe that can help to narrow down tracepoints?
>>>>>>
>>>>> I captured a plug after warm boot and after cold boot. This includes
>>>>> network setup (dhcp). You can find it in [2] or directly here:
>>>>> https://github.com/andy-shev/linux/files/6232410/boot.zip
>>>>>
>>>>
>>>> While the above traces in boot.zip allow comparing which regs are not
>>>> correctly initialized on warm boot, I have now captured traces of
>>>> unplug/plug.
>>>>
>>>> Here the kernel is 5.10.27 (LTS), cold booted with the USB cable plugged
>>>> and the eem gadget network set up (dhcp). Then trace unplug. Then trace plug.
>>>>
>>>> After plug the eem connection is again broken.
>>>>
>>>> This might allow figuring out what goes wrong on unplug. Traces here:
>>>> https://github.com/andy-shev/linux/files/6250924/plug-unplug.zip
>>>>
>>> Hi,
>>>
>>> Were you able to narrow down the issue to only DWC3 device? (i.e. you
>>> tested with different hosts and different device controllers to confirm
>>> this)
>> I haven't tried with other devices. I have been forced to replace my
>> host mobo and nothing changed. But I didn't pay attention to the
>> particular host controller.
>>
> It'd be better if we can narrow down the culprit as this seems to me
> like a synchronization issue at the upper layer between the host and device.
>
>>> Did you see this issue previously? If not, is it possible to do git
>>> bisection?
>> This is with Intel Edison, where mainline usb gadget support appeared
>> around 4.19 iirc. I believed the problem appeared between 5.4 and 5.7
>> and tried to bisect but failed.
>>
>> I realize only now that I failed because:
>> 1) 5.4 already has this issue as I recently retested
> I'm confused, why do you believe the problem is between 5.4 and 5.7 if
> 5.4 already has this issue? So when did you start seeing this problem?
Because at the time of 5.4 I didn't notice the issue, as I normally did
cold boots due to other problems on warm boot (i.e. sdhc inaccessible).
I never knew that it works on a cold boot. Even during bisecting I didn't
know until the end, and then I found 5.4 has the same problem as all the
later kernels (tested up to 5.11).
> Also, these kernel versions are really old, there's been a lot of
> updates/fixes to dwc3 since then. Can we run tests on the latest kernel?
I have tested 5.10.27, 5.11.0 and 5.11.4-rt11.
But of course I am completely prepared to run Andy's latest (v5.12-rc5)
on the device.
>> 2) I didn't use a reproducible criterion. After warm reboot the eem
>> gadget fails, but you can flip the host/gadget switch back and forth and
>> have the illusion that the connection is restored.
>>
>> The scenario described here is reproducible: leaving the switch in
>> gadget mode eem works after cold boot only. And it likely breaks on unplug.
>>
>> A 2nd hint is that disabling the gadget (I used `connmanctl disable gadget`
>> but I believe that has the same effect as `ip link set dev usb0 down`)
>> before unplug prevents messing up the driver, so you can replug and
>> enable again.
> These data points are good. However, we'd need to know where to look
> first. The issue isn't obvious from the DWC3 controller or the DWC3 driver.
>
> Can you check a few things:
> 1) Any error/timeout messages from the host's dmesg? Or device side?
I'll add log from the host side.
For now I only see (on a warm plug):
kernel: usb 1-11: can't set config #1, error -110
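For reference, -110 is -ETIMEDOUT on Linux, so the host's attempt to set the
configuration timed out. A minimal Python sketch for decoding such log lines
follows. The helper name `decode_usb_error` and the small errno table are only
my illustration (a hand-picked subset of the Linux asm-generic errno values),
not part of any existing tool:

```python
import re

# Assumption: hand-picked subset of Linux asm-generic errno values
# that tend to show up in USB-related kernel messages.
LINUX_ERRNO = {
    19: "ENODEV",
    32: "EPIPE",
    71: "EPROTO",
    108: "ESHUTDOWN",
    110: "ETIMEDOUT",
}


def decode_usb_error(line):
    """Return (errno, name) for a log line containing 'error -N', else None."""
    match = re.search(r"error (-?\d+)", line)
    if match is None:
        return None
    code = abs(int(match.group(1)))
    # Codes missing from the table are reported as "unknown", not guessed.
    return code, LINUX_ERRNO.get(code, "unknown")


if __name__ == "__main__":
    print(decode_usb_error("kernel: usb 1-11: can't set config #1, error -110"))
```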
> 2) What kernel version is your host using? Can you use the latest for
> both host and device?
The host is ubuntu's amd64 5.8.0-48-generic.
I will test with v5.12-rc5 from ubuntu kernel ppa on the host. And
Andy's latest (v5.12-rc5) on the device.
I am expecting results this evening.
> 3) Snapshot of dwc3 tracepoints of active transfers between the normal
> vs throttled of the latest kernel
I don't know if the problem I see is really throttling.
I can trace an active transfer, but tracing itself throttles it from
200 Mb/s down to 139 Mb/s and produces a 53 MB trace (2 x 1 sec of iperf3).
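To compare normal and degraded runs on equal footing, the averages in the
iperf3 summary lines quoted earlier in this thread can be recomputed from the
transferred bytes and the interval. A rough Python sketch, assuming iperf3
reports MBytes as 2^20 bytes and Mbits as 10^6 bits (the function name
`recompute_mbits` is hypothetical):

```python
import re

# Matches e.g. "0.00-10.36 sec 237 MBytes 192 Mbits/sec receiver"
SUMMARY = re.compile(
    r"([\d.]+)-([\d.]+)\s+sec\s+([\d.]+)\s+MBytes\s+([\d.]+)\s+Mbits/sec"
)


def recompute_mbits(line):
    """Return (recomputed, reported) bandwidth in Mbits/sec for a summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not an iperf3 summary line: %r" % line)
    start, end, mbytes, reported = map(float, match.groups())
    # Assumption: 1 MByte = 2**20 bytes, 1 Mbit = 10**6 bits.
    bits = mbytes * 2**20 * 8
    return bits / (end - start) / 1e6, reported


if __name__ == "__main__":
    for line in (
        "0.00-10.36 sec 237 MBytes 192 Mbits/sec receiver",
        "0.00-9.94 sec 209 MBytes 176 Mbits/sec receiver",
        "0.00-19.36 sec 315 MBytes 136 Mbits/sec receiver",
    ):
        computed, reported = recompute_mbits(line)
        print("%.1f Mbits/sec recomputed, %.0f reported" % (computed, reported))
```

Under that unit assumption the three runs recompute to roughly 192, 176 and
136 Mbits/sec, matching what iperf3 printed.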
> BR,
> Thinh