public inbox for linux-usb@vger.kernel.org
* [BUG REPORT] usb: dwc3: Failure to enumerate from boot.
@ 2025-04-28 14:54 Jakob Trumpower
  2025-05-01  0:14 ` Thinh Nguyen
  0 siblings, 1 reply; 8+ messages in thread
From: Jakob Trumpower @ 2025-04-28 14:54 UTC (permalink / raw)
  To: balbi@kernel.org; +Cc: linux-usb@vger.kernel.org


[-- Attachment #1.1: Type: text/plain, Size: 4695 bytes --]

Hello Felipe (or whoever is the current dwc3 maintainer),

I hope you are doing well!

I have been dealing with a USB issue on a custom embedded Linux system. We currently have three revisions of the hardware: device mode, host mode, and now dual-role mode (which will hopefully replace the earlier two). Device mode has been working flawlessly. For a few years we have had issues in host mode, where devices (Ethernet adapters) don't enumerate unless the plug is cycled, and sometimes need a hard reboot before they enumerate.


  * Zynq UltraScale+ MPSoC running a custom Yocto-based OS built with the Xilinx layers.
  * Xilinx 6.6 kernel.
     * I know the documentation mentions using the latest mainline version, but it would be quite an effort to get that running on our FPGA.
     * I have tried porting USB-specific changes from mainline.
  * We are using AMD's Kria K26 SOM. Our motherboard routes out USB SS and carries the dual-role circuitry, ESD protection, and the physical connector.
     * The SuperSpeed traces are well laid out and go straight into the SOM.
     * The dual-role hardware looks good: I have attached a PD analyzer, and the mux and role determination seem to be working for host/device.
     * I do not think it is a hardware issue: we have had the layout reviewed, and the issue seems to vary with kernel version and log level, among other reasons given below.
  * The issue affects all SuperSpeed devices that I have access to; however, our only real application is USB 3.2 Gen 1x1 to Ethernet adapters.
  * USB 2.0 devices do not have the issue.
  * I recently updated our kernel (5.10 -> 6.6).
     * This came with some OS changes as well, but the results are much worse than before.
  * The issue only happens if the physical connection is made before the xhci/dwc3 driver is probed. Connecting any time after the probe works; connecting earlier always fails.


Things I have tried that seem to make it "better":

  * Lowering the kernel log level from 8 to 7.
     * I am not sure why the default log level was 8, as it isn't even listed as a valid level in most kernel documentation. It also doesn't seem to print anything different to our UART.
     * The older kernel worked fine on 8.
  * Adding a shutdown callback to dwc3 that just calls the remove function.
     * This helps with cold reboots.
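
On the log level point above: as far as I understand it, a message reaches the console only when its level is numerically lower than console_loglevel, so a value of 8 also lets KERN_DEBUG (level 7) through, while 7 suppresses it. Below is a minimal Python sketch of that rule, assuming the usual four-field layout of /proc/sys/kernel/printk. The function names are hypothetical helpers for illustration, not part of any kernel tool.

```python
# Sketch: interpret the four fields of /proc/sys/kernel/printk.
# Assumption: a message is printed to the console only if its level
# is numerically lower than console_loglevel.

KERN_LEVELS = ["emerg", "alert", "crit", "err",
               "warning", "notice", "info", "debug"]

FIELD_NAMES = ("console_loglevel", "default_message_loglevel",
               "minimum_console_loglevel", "default_console_loglevel")


def parse_printk(text):
    """Split the printk sysctl string into its four named fields."""
    values = [int(v) for v in text.split()]
    if len(values) != len(FIELD_NAMES):
        raise ValueError("expected four integers, got %r" % text)
    return dict(zip(FIELD_NAMES, values))


def levels_on_console(console_loglevel):
    """Return the names of the message levels printed at this setting."""
    return [name for level, name in enumerate(KERN_LEVELS)
            if level < console_loglevel]
```

Feeding it the contents of /proc/sys/kernel/printk on the target and passing the console_loglevel field to levels_on_console() shows which message levels actually go out over the UART at each setting.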

However, these hacks do not work on our dual-role-mode hardware, which still requires a plug cycle every time. Sometimes it won't enumerate at all; the usual dmesg failure is:
usb usb4-port1: Cannot enable. Maybe the USB cable is bad?
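
To compare how often this shows up across boots or kernel builds, a small script can count the failure line per port in a saved dmesg capture. This is only a sketch: the function name is hypothetical, and the pattern assumes the message format quoted above.

```python
import re

# Matches the failure line quoted above, with or without a leading
# dmesg timestamp, and captures the port name (e.g. "usb4-port1").
PORT_FAIL = re.compile(r"usb (usb\d+-port\d+): Cannot enable\.")


def failed_ports(dmesg_text):
    """Count 'Cannot enable' failures per port in a dmesg capture."""
    counts = {}
    for line in dmesg_text.splitlines():
        match = PORT_FAIL.search(line)
        if match:
            port = match.group(1)
            counts[port] = counts.get(port, 0) + 1
    return counts
```

Running it over one capture per boot gives a simple per-port failure count to set against log level or kernel version.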


My current suspicion is a very tight timing issue at boot. That would explain why the log level makes a difference, since the time spent writing to the UART is significant. This led me to investigate inside the USB controller and driver:

  * Probing the ULPI clock.
  * Changing to otg/host in the device tree.
  * Adding many print statements in xhci.c, hub.c, core.c, etc.
  * Adjusting the probe order.
  * Adjusting the delays in hub.c and core.c for initialization and port discovery.
  * Delaying the probe so that the USB stack comes up later.
  * Adding more warm resets in the hub initialization when a port fails to init.
  * Adding an extra Host Controller Reset (HCRST) in xhci.c.
  * Moving the entire USB stack into a kernel module.
  * Trying basically every Synopsys DWC3 quirk.

None of these changes seems to make a difference. I also got my hands on a Beagle 5000 SuperSpeed analyzer (oof, that was expensive). It has been useful for seeing what is on the wire, independent of what Linux and the drivers report. When I see the issue after a reboot, the only thing coming out of the host side is PHY errors (maybe noise?). Nothing I listed above seems to get it out of this state. The odd thing is that a lot of the DWC/XHCI registers seem to indicate that the controller is doing something, but nothing is actually happening electrically. It is also odd that cycling the connector gets the controller to start responding, while reset commands written to registers do not.
My backup plan is a hardware change that gives VBUS control to Linux so I can reset the connection after the driver is probed, but that is obviously a somewhat hacky solution.

I attached the traces that are recommended in the documentation, as well as the traces recommended on Xilinx's USB troubleshooting page:
https://xilinx-wiki.atlassian.net/wiki/spaces/A/pages/2046656520/USB+Debug+Guide+for+Zynq+UltraScale+and+Versal+Devices


Anyway, I know that was a lot of information, but I would appreciate any help. I cannot get the time of day from Synopsys or Xilinx on this, and I like to think I have done my due diligence after about a month on this issue.

Thanks

Jakob




[-- Attachment #1.2: Type: text/html, Size: 17474 bytes --]

[-- Attachment #2: jakob.tar.gz --]
[-- Type: application/gzip, Size: 1013760 bytes --]


end of thread, other threads:[~2025-06-17  2:39 UTC | newest]

Thread overview: 8+ messages
2025-04-28 14:54 [BUG REPORT] usb: dwc3: Failure to enumerate from boot Jakob Trumpower
2025-05-01  0:14 ` Thinh Nguyen
     [not found]   ` <PH0PR06MB7077C939CDCF50DADE0A70F6E889A@PH0PR06MB7077.namprd06.prod.outlook.com>
2025-05-08  0:32     ` Thinh Nguyen
2025-05-09 15:50       ` Jakob Trumpower
2025-05-14  0:48         ` Thinh Nguyen
2025-06-06 21:23           ` Jakob Trumpower
2025-06-10 21:36             ` Thinh Nguyen
2025-06-17  1:24               ` Thinh Nguyen
