* DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 @ 2025-04-14 16:54 Luca Ceresoli 2025-04-14 17:26 ` Alan Stern 0 siblings, 1 reply; 12+ messages in thread From: Luca Ceresoli @ 2025-04-14 16:54 UTC (permalink / raw) To: Minas Harutyunyan, linux-usb, Kever Yang Cc: Greg Kroah-Hartman, linux-kernel, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier Hello Minas, Kever, linux-usb, recent dwc2 driver contributors, I am facing an unexpected behavior (apparently a bug) with a dwc2 controller in gadget mode, using a mainline kernel: the gadget device is enumerated normally but then disappears and gets re-enumerated about 6 seconds after the initial enumeration, for no apparent reason. Here are the details. Testing setup: SoC: Rockchip RK3308 Board: Rock Pi S [1] USB controller: rockchip,rk3308-usb, snps,dwc2 [2] Controller mode: device only (dr_mode = "peripheral") [3] Tested kernels: - v6.15-rc2 - v6.14.1 - v6.12.20 - v6.6.87 - v6.1.134 - v5.15.180 Device tree: upstream Rock Pi S dts [4] Kernel config: ARM64 defconfig Hardware setup: USB A-C cable connected from PC A port to the C connector on the Rock Pi S board. This cable provides board power as well as the connection between the host and the gadget. Behavior: 1. boot board normally 2. optionally wait some time 3. run script to start a CDC serial gadget [5] 4. after about 0.6 seconds the ttyGS0 serial device is present and working, and so is ttyACM0 on the host: so far all good 5. after about 6 seconds the dwc2 controller receives some interrupts and starts a new enumeration sequence This is what the kernel logs: [ 20.105688] dwc2 ff400000.usb: bound driver configfs-gadget.g1 [ 20.285431] dwc2 ff400000.usb: new device is high-speed [ 20.373455] dwc2 ff400000.usb: new device is high-speed [ 20.426496] dwc2 ff400000.usb: new address 28 [ 26.688388] dwc2 ff400000.usb: new device is high-speed [ 26.775363] dwc2 ff400000.usb: new device is high-speed [ 26.836880] dwc2 ff400000.usb: new address 29 Here is a side-by-side log of host and device, synced manually using a video capture (sorry about the long lines, can't do without): *** HOST *** *** DEVICE *** <<< Last line of the script: 'echo ff400000.usb > UDC' >>> [ 14.281350] dwc2 ff400000.usb: bound driver configfs-gadget.g1 [ 14.482332] dwc2 ff400000.usb: new device is high-speed [108204.084049] usb 3-2: new high-speed USB device number 39 using xhci_hcd [ 14.675692] dwc2 ff400000.usb: new device is high-speed [108204.274639] usb 3-2: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 [ 14.737395] dwc2 ff400000.usb: new address 44 [108204.274652] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [108204.274656] usb 3-2: Product: ... [108204.274659] usb 3-2: Manufacturer: ... [108204.274662] usb 3-2: SerialNumber: 12345678 [108204.282555] cdc_acm 3-2:1.0: ttyACM0: USB ACM device (...nothing happens for about 6 seconds...) [108209.972180] usb 3-2: USB disconnect, device number 39 [ 20.766950] dwc2 ff400000.usb: new device is high-speed [108210.339297] usb 3-2: new high-speed USB device number 40 using xhci_hcd [ 20.960375] dwc2 ff400000.usb: new device is high-speed [108210.739738] usb 3-2: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 [ 21.200670] dwc2 ff400000.usb: new address 45 [108210.739750] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [108210.739753] usb 3-2: Product: ... [108210.739756] usb 3-2: Manufacturer: ... [108210.739758] usb 3-2: SerialNumber: 12345678 [108210.747084] cdc_acm 3-2:1.0: ttyACM0: USB ACM device Note: the device address is different on the host and the target. Is this expected? In the driver there are 2 interrupt handlers involved: - dwc2_handle_common_intr in core_intr.c for the common events - dwc2_hsotg_irq in gadget.c for gadget events They share the same interrupt number, which AFAICU is because they actually read different bits from the same GINTSTS register. I enabled DEBUG in the dwc2 driver and captured the initial events logged after the ~6 seconds pause, i.e. where the 2nd enumeration starts. Here they are with some annotations: 1. first interrupt after the ~6 s break: - dwc2_handle_common_intr finds no bits high - dwc2_hsotg_irq finds one (early suspend bit): [ 46.203094] dwc2 ff400000.usb: dwc2_hsotg_irq: 04008428 00000400 (d88c3cc4) retry 8 [ 46.204060] dwc2 ff400000.usb: GINTSTS_ErlySusp 2. second interrupt - dwc2_handle_common_intr finds one bits high (suspend): [ 46.206807] dwc2 ff400000.usb: USB SUSPEND [ 46.206824] dwc2 ff400000.usb: dwc2_handle_usb_suspend_intr: DSTS=0x502a01 [ 46.206842] dwc2 ff400000.usb: DSTS.Suspend Status=1 HWCFG4.Power Optimize=1 HWCFG4.Hibernation=0 [ 46.206872] dwc2 ff400000.usb: dwc2_hsotg_irq: 04008028 00000000 (d88c3cc4) retry 8 - dwc2_hsotg_irq finds no bits high 3. third interrupt - dwc2_handle_common_intr finds no bits high - dwc2_hsotg_irq finds two (reset detected + USB reset): [ 46.437109] dwc2 ff400000.usb: dwc2_hsotg_irq: 04809028 00801000 (d88c3cc4) retry 8 [ 46.437607] dwc2 ff400000.usb: dwc2_hsotg_irq: USBRstDet [ 46.437630] dwc2 ff400000.usb: dwc2_hsotg_irq: USBRst [ 46.437649] dwc2 ff400000.usb: GNPTXSTS=00080010 [ 46.437673] dwc2 ff400000.usb: complete: ep 00000000dab859c8 ep0, req 000000009cb97255, -108 => 00000000acdb2ee9 [ 46.437719] dwc2 ff400000.usb: dwc2_hsotg_complete_setup: failed -108 [ 46.437765] dwc2 ff400000.usb: dwc2_hsotg_ep_disable(ep 00000000cf8cf06f) [ 46.437790] dwc2 ff400000.usb: dwc2_hsotg_ep_disable: DxEPCTL=0x08080200 ... From now on the log appears as a normal enumeration process. I'm stuck at a dead end, trying to understand what may be triggering the second enumeration. Some more facts: * the 2nd enumeration happens always * there is never a 3rd enumeration * the ~6 seconds delay is always between 5 and 6.5 seconds * no relevant kernel activity is logged during the 6 seconds, except for some OPP changes; disabling CONFIG_CPU_IDLE and CONFIG_CPU_FREQ the OPP changes disappear but USB behaves like before * happens (with same delay) if after the 1st enumeration the USB serial is opened and kept in use * happens even if using a different device class (tried 0x8, 0x2) * happens even using g_mass_storage or g_zero instead of libcomposite (but with g_zero it happens when the g_zero module is loaded, without any configfs configuration) * tried different cables, no change * there is no evidence of power glitches * happens also on a custom hardware which is self-powered * happens with different hosts: two different PCs, one running Linux and one running Windows * to be double checked: does not happen if the host is an Android phone (but I haven't gone into the details of what happens with that setup) So I'm looking for any hints or directions for further investigation. Any input would be very appreciated. Thanks in advance! Luca [1] https://wiki.radxa.com/RockpiS [2] https://elixir.bootlin.com/linux/v6.13.7/source/arch/arm64/boot/dts/rockchip/rk3308.dtsi#L696-L710 [3] https://elixir.bootlin.com/linux/v6.13.7/source/arch/arm64/boot/dts/rockchip/rk3308-rock-pi-s.dts#L383 [4] https://elixir.bootlin.com/linux/v6.13.7/source/arch/arm64/boot/dts/rockchip/rk3308-rock-pi-s.dts [5] Script used to configure the gadget serial: ------------------------8<------------------------ #!/bin/sh set -eu modprobe libcomposite mount -t configfs none /sys/kernel/config mkdir -p "/sys/kernel/config/usb_gadget/g1" cd "/sys/kernel/config/usb_gadget/g1" echo 0x0200 > bcdUSB echo 0x0100 > bcdDevice echo 0x1209 > idVendor echo 0x0001 > idProduct echo 0x02 > bDeviceClass echo 0x00 > bDeviceSubClass echo 0x00 > bDeviceProtocol mkdir -p strings/0x409 echo 12345678 > strings/0x409/serialnumber echo "ACME" > strings/0x409/manufacturer echo "foobar" > strings/0x409/product # create the configuration mkdir -p configs/c.1 mkdir -p configs/c.1/strings/0x409 echo "foobar Config" > configs/c.1/strings/0x409/configuration echo 500 > configs/c.1/MaxPower # create the function mkdir functions/acm.0 # associate the function with the configuration ln -s functions/acm.0 configs/c.1 # enable the gadget using rock pi s UDC controller name (from /sys/class/udc/) echo ff400000.usb > UDC ------------------------8<------------------------ -- Luca Ceresoli, Bootlin Embedded Linux and Kernel engineering https://bootlin.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 2025-04-14 16:54 DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 Luca Ceresoli @ 2025-04-14 17:26 ` Alan Stern 2025-04-15 11:26 ` Minas Harutyunyan 0 siblings, 1 reply; 12+ messages in thread From: Alan Stern @ 2025-04-14 17:26 UTC (permalink / raw) To: Luca Ceresoli Cc: Minas Harutyunyan, linux-usb, Kever Yang, Greg Kroah-Hartman, linux-kernel, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier On Mon, Apr 14, 2025 at 06:54:58PM +0200, Luca Ceresoli wrote: > Hello Minas, Kever, linux-usb, recent dwc2 driver contributors, > > I am facing an unexpected behavior (apparently a bug) with a dwc2 > controller in gadget mode, using a mainline kernel: the gadget device is > enumerated normally but then disappears and gets re-enumerated about 6 > seconds after the initial enumeration, for no apparent reason. Here are > the details. > > Testing setup: > > SoC: Rockchip RK3308 > Board: Rock Pi S [1] > USB controller: rockchip,rk3308-usb, snps,dwc2 [2] > Controller mode: device only (dr_mode = "peripheral") [3] > Tested kernels: > - v6.15-rc2 > - v6.14.1 > - v6.12.20 > - v6.6.87 > - v6.1.134 > - v5.15.180 > Device tree: upstream Rock Pi S dts [4] > Kernel config: ARM64 defconfig > > Hardware setup: USB A-C cable connected from PC A port to the C > connector on the Rock Pi S board. This cable provides board power as > well as the connection between the host and the gadget. > > Behavior: > 1. boot board normally > 2. optionally wait some time > 3. run script to start a CDC serial gadget [5] > 4. after about 0.6 seconds the ttyGS0 serial device is present and > working, and so is ttyACM0 on the host: so far all good > 5. after about 6 seconds the dwc2 controller receives some > interrupts and starts a new enumeration sequence > > This is what the kernel logs: > > [ 20.105688] dwc2 ff400000.usb: bound driver configfs-gadget.g1 > [ 20.285431] dwc2 ff400000.usb: new device is high-speed > [ 20.373455] dwc2 ff400000.usb: new device is high-speed > [ 20.426496] dwc2 ff400000.usb: new address 28 > [ 26.688388] dwc2 ff400000.usb: new device is high-speed > [ 26.775363] dwc2 ff400000.usb: new device is high-speed > [ 26.836880] dwc2 ff400000.usb: new address 29 > > Here is a side-by-side log of host and device, synced manually using > a video capture (sorry about the long lines, can't do without): > > *** HOST *** *** DEVICE *** > <<< Last line of the script: 'echo ff400000.usb > UDC' >>> > [ 14.281350] dwc2 ff400000.usb: bound driver configfs-gadget.g1 > [ 14.482332] dwc2 ff400000.usb: new device is high-speed > [108204.084049] usb 3-2: new high-speed USB device number 39 using xhci_hcd > [ 14.675692] dwc2 ff400000.usb: new device is high-speed > [108204.274639] usb 3-2: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 [ 14.737395] dwc2 ff400000.usb: new address 44 > [108204.274652] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 > [108204.274656] usb 3-2: Product: ... > [108204.274659] usb 3-2: Manufacturer: ... > [108204.274662] usb 3-2: SerialNumber: 12345678 > [108204.282555] cdc_acm 3-2:1.0: ttyACM0: USB ACM device > (...nothing happens for about 6 seconds...) > [108209.972180] usb 3-2: USB disconnect, device number 39 > [ 20.766950] dwc2 ff400000.usb: new device is high-speed > [108210.339297] usb 3-2: new high-speed USB device number 40 using xhci_hcd > [ 20.960375] dwc2 ff400000.usb: new device is high-speed > [108210.739738] usb 3-2: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 [ 21.200670] dwc2 ff400000.usb: new address 45 > [108210.739750] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 > [108210.739753] usb 3-2: Product: ... > [108210.739756] usb 3-2: Manufacturer: ... > [108210.739758] usb 3-2: SerialNumber: 12345678 > [108210.747084] cdc_acm 3-2:1.0: ttyACM0: USB ACM device > > Note: the device address is different on the host and the target. Is > this expected? The value on the host side is a device number, not a device address. Device numbers are assigned sequentially by the kernel, whereas addresses are assigned automatically by the xHCI hardware without software intervention. For non-xHCI controllers the two are the same, but not with xHCI. > In the driver there are 2 interrupt handlers involved: > - dwc2_handle_common_intr in core_intr.c for the common events > - dwc2_hsotg_irq in gadget.c for gadget events > > They share the same interrupt number, which AFAICU is because they > actually read different bits from the same GINTSTS register. > > I enabled DEBUG in the dwc2 driver and captured the initial events > logged after the ~6 seconds pause, i.e. where the 2nd enumeration > starts. Here they are with some annotations: > > 1. first interrupt after the ~6 s break: > - dwc2_handle_common_intr finds no bits high > - dwc2_hsotg_irq finds one (early suspend bit): > [ 46.203094] dwc2 ff400000.usb: dwc2_hsotg_irq: 04008428 00000400 (d88c3cc4) retry 8 > [ 46.204060] dwc2 ff400000.usb: GINTSTS_ErlySusp > > 2. second interrupt > - dwc2_handle_common_intr finds one bits high (suspend): > [ 46.206807] dwc2 ff400000.usb: USB SUSPEND > [ 46.206824] dwc2 ff400000.usb: dwc2_handle_usb_suspend_intr: DSTS=0x502a01 > [ 46.206842] dwc2 ff400000.usb: DSTS.Suspend Status=1 HWCFG4.Power Optimize=1 HWCFG4.Hibernation=0 > [ 46.206872] dwc2 ff400000.usb: dwc2_hsotg_irq: 04008028 00000000 (d88c3cc4) retry 8 > - dwc2_hsotg_irq finds no bits high > > 3. third interrupt > - dwc2_handle_common_intr finds no bits high > - dwc2_hsotg_irq finds two (reset detected + USB reset): > [ 46.437109] dwc2 ff400000.usb: dwc2_hsotg_irq: 04809028 00801000 (d88c3cc4) retry 8 > [ 46.437607] dwc2 ff400000.usb: dwc2_hsotg_irq: USBRstDet > [ 46.437630] dwc2 ff400000.usb: dwc2_hsotg_irq: USBRst > [ 46.437649] dwc2 ff400000.usb: GNPTXSTS=00080010 > [ 46.437673] dwc2 ff400000.usb: complete: ep 00000000dab859c8 ep0, req 000000009cb97255, -108 => 00000000acdb2ee9 > [ 46.437719] dwc2 ff400000.usb: dwc2_hsotg_complete_setup: failed -108 > [ 46.437765] dwc2 ff400000.usb: dwc2_hsotg_ep_disable(ep 00000000cf8cf06f) > [ 46.437790] dwc2 ff400000.usb: dwc2_hsotg_ep_disable: DxEPCTL=0x08080200 > ... > > From now on the log appears as a normal enumeration process. > > I'm stuck at a dead end, trying to understand what may be triggering the > second enumeration. You should enable host-side debugging before connecting the cable: echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control It's quite possible that you're getting messed up by link power management (LPM). But that's just a guess. Alan Stern ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 2025-04-14 17:26 ` Alan Stern @ 2025-04-15 11:26 ` Minas Harutyunyan 2025-04-15 14:28 ` Luca Ceresoli 0 siblings, 1 reply; 12+ messages in thread From: Minas Harutyunyan @ 2025-04-15 11:26 UTC (permalink / raw) To: Alan Stern, Luca Ceresoli Cc: Minas Harutyunyan, linux-usb@vger.kernel.org, Kever Yang, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier Hi Luca, On 4/14/25 21:26, Alan Stern wrote: > On Mon, Apr 14, 2025 at 06:54:58PM +0200, Luca Ceresoli wrote: >> Hello Minas, Kever, linux-usb, recent dwc2 driver contributors, >> >> I am facing an unexpected behavior (apparently a bug) with a dwc2 >> controller in gadget mode, using a mainline kernel: the gadget device is >> enumerated normally but then disappears and gets re-enumerated about 6 >> seconds after the initial enumeration, for no apparent reason. Here are >> the details. >> >> Testing setup: >> >> SoC: Rockchip RK3308 >> Board: Rock Pi S [1] >> USB controller: rockchip,rk3308-usb, snps,dwc2 [2] >> Controller mode: device only (dr_mode = "peripheral") [3] >> Tested kernels: >> - v6.15-rc2 >> - v6.14.1 >> - v6.12.20 >> - v6.6.87 >> - v6.1.134 >> - v5.15.180 >> Device tree: upstream Rock Pi S dts [4] >> Kernel config: ARM64 defconfig >> >> Hardware setup: USB A-C cable connected from PC A port to the C >> connector on the Rock Pi S board. This cable provides board power as >> well as the connection between the host and the gadget. >> >> Behavior: >> 1. boot board normally >> 2. optionally wait some time >> 3. run script to start a CDC serial gadget [5] >> 4. after about 0.6 seconds the ttyGS0 serial device is present and >> working, and so is ttyACM0 on the host: so far all good >> 5. after about 6 seconds the dwc2 controller receives some >> interrupts and starts a new enumeration sequence >> >> This is what the kernel logs: >> >> [ 20.105688] dwc2 ff400000.usb: bound driver configfs-gadget.g1 >> [ 20.285431] dwc2 ff400000.usb: new device is high-speed >> [ 20.373455] dwc2 ff400000.usb: new device is high-speed >> [ 20.426496] dwc2 ff400000.usb: new address 28 >> [ 26.688388] dwc2 ff400000.usb: new device is high-speed >> [ 26.775363] dwc2 ff400000.usb: new device is high-speed >> [ 26.836880] dwc2 ff400000.usb: new address 29 >> >> Here is a side-by-side log of host and device, synced manually using >> a video capture (sorry about the long lines, can't do without): >> >> *** HOST *** *** DEVICE *** >> <<< Last line of the script: 'echo ff400000.usb > UDC' >>> >> [ 14.281350] dwc2 ff400000.usb: bound driver configfs-gadget.g1 >> [ 14.482332] dwc2 ff400000.usb: new device is high-speed >> [108204.084049] usb 3-2: new high-speed USB device number 39 using xhci_hcd >> [ 14.675692] dwc2 ff400000.usb: new device is high-speed >> [108204.274639] usb 3-2: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 [ 14.737395] dwc2 ff400000.usb: new address 44 >> [108204.274652] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 >> [108204.274656] usb 3-2: Product: ... >> [108204.274659] usb 3-2: Manufacturer: ... >> [108204.274662] usb 3-2: SerialNumber: 12345678 >> [108204.282555] cdc_acm 3-2:1.0: ttyACM0: USB ACM device >> (...nothing happens for about 6 seconds...) >> [108209.972180] usb 3-2: USB disconnect, device number 39 >> [ 20.766950] dwc2 ff400000.usb: new device is high-speed >> [108210.339297] usb 3-2: new high-speed USB device number 40 using xhci_hcd >> [ 20.960375] dwc2 ff400000.usb: new device is high-speed >> [108210.739738] usb 3-2: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 [ 21.200670] dwc2 ff400000.usb: new address 45 >> [108210.739750] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 >> [108210.739753] usb 3-2: Product: ... >> [108210.739756] usb 3-2: Manufacturer: ... >> [108210.739758] usb 3-2: SerialNumber: 12345678 >> [108210.747084] cdc_acm 3-2:1.0: ttyACM0: USB ACM device >> >> Note: the device address is different on the host and the target. Is >> this expected? > > The value on the host side is a device number, not a device address. > Device numbers are assigned sequentially by the kernel, whereas > addresses are assigned automatically by the xHCI hardware without > software intervention. > > For non-xHCI controllers the two are the same, but not with xHCI. > >> In the driver there are 2 interrupt handlers involved: >> - dwc2_handle_common_intr in core_intr.c for the common events >> - dwc2_hsotg_irq in gadget.c for gadget events >> >> They share the same interrupt number, which AFAICU is because they >> actually read different bits from the same GINTSTS register. >> >> I enabled DEBUG in the dwc2 driver and captured the initial events >> logged after the ~6 seconds pause, i.e. where the 2nd enumeration >> starts. Here they are with some annotations: >> >> 1. first interrupt after the ~6 s break: >> - dwc2_handle_common_intr finds no bits high >> - dwc2_hsotg_irq finds one (early suspend bit): >> [ 46.203094] dwc2 ff400000.usb: dwc2_hsotg_irq: 04008428 00000400 (d88c3cc4) retry 8 >> [ 46.204060] dwc2 ff400000.usb: GINTSTS_ErlySusp >> >> 2. second interrupt >> - dwc2_handle_common_intr finds one bits high (suspend): >> [ 46.206807] dwc2 ff400000.usb: USB SUSPEND >> [ 46.206824] dwc2 ff400000.usb: dwc2_handle_usb_suspend_intr: DSTS=0x502a01 >> [ 46.206842] dwc2 ff400000.usb: DSTS.Suspend Status=1 HWCFG4.Power Optimize=1 HWCFG4.Hibernation=0 >> [ 46.206872] dwc2 ff400000.usb: dwc2_hsotg_irq: 04008028 00000000 (d88c3cc4) retry 8 >> - dwc2_hsotg_irq finds no bits high >> >> 3. third interrupt >> - dwc2_handle_common_intr finds no bits high >> - dwc2_hsotg_irq finds two (reset detected + USB reset): >> [ 46.437109] dwc2 ff400000.usb: dwc2_hsotg_irq: 04809028 00801000 (d88c3cc4) retry 8 >> [ 46.437607] dwc2 ff400000.usb: dwc2_hsotg_irq: USBRstDet >> [ 46.437630] dwc2 ff400000.usb: dwc2_hsotg_irq: USBRst >> [ 46.437649] dwc2 ff400000.usb: GNPTXSTS=00080010 >> [ 46.437673] dwc2 ff400000.usb: complete: ep 00000000dab859c8 ep0, req 000000009cb97255, -108 => 00000000acdb2ee9 >> [ 46.437719] dwc2 ff400000.usb: dwc2_hsotg_complete_setup: failed -108 >> [ 46.437765] dwc2 ff400000.usb: dwc2_hsotg_ep_disable(ep 00000000cf8cf06f) >> [ 46.437790] dwc2 ff400000.usb: dwc2_hsotg_ep_disable: DxEPCTL=0x08080200 >> ... >> >> From now on the log appears as a normal enumeration process. >> >> I'm stuck at a dead end, trying to understand what may be triggering the >> second enumeration. > > You should enable host-side debugging before connecting the cable: > > echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control > > It's quite possible that you're getting messed up by link power > management (LPM). But that's just a guess. > > Alan Stern I don't see, from logs, any wrong behavior from dwc2 device side. dwc2 driver correctly handle Early Suspend, Suspend, USB Reset detect and finally USB Reset which all initiating from Host side. You need to investigate why Host after 6 second initiated above signaling sequence. Maybe after 5-6 sec host initiate autosuspend? But in this case not clear why then it perform USB reset? Did you test with different PC's with different EHCI/XHCI hosts? Thanks, Minas ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 2025-04-15 11:26 ` Minas Harutyunyan @ 2025-04-15 14:28 ` Luca Ceresoli 2025-04-15 15:58 ` Luca Ceresoli 2025-04-15 16:14 ` Alan Stern 0 siblings, 2 replies; 12+ messages in thread From: Luca Ceresoli @ 2025-04-15 14:28 UTC (permalink / raw) To: Minas Harutyunyan Cc: Alan Stern, linux-usb@vger.kernel.org, Kever Yang, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier Hello Alan, Minas, thanks for the prompt feedback! See below my comments. On Tue, 15 Apr 2025 11:26:55 +0000 Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> wrote: > Hi Luca, > > On 4/14/25 21:26, Alan Stern wrote: > > On Mon, Apr 14, 2025 at 06:54:58PM +0200, Luca Ceresoli wrote: > >> Hello Minas, Kever, linux-usb, recent dwc2 driver contributors, > >> > >> I am facing an unexpected behavior (apparently a bug) with a dwc2 > >> controller in gadget mode, using a mainline kernel: the gadget device is > >> enumerated normally but then disappears and gets re-enumerated about 6 > >> seconds after the initial enumeration, for no apparent reason. Here are > >> the details. > >> > >> Testing setup: > >> > >> SoC: Rockchip RK3308 > >> Board: Rock Pi S [1] > >> USB controller: rockchip,rk3308-usb, snps,dwc2 [2] > >> Controller mode: device only (dr_mode = "peripheral") [3] > >> Tested kernels: > >> - v6.15-rc2 > >> - v6.14.1 > >> - v6.12.20 > >> - v6.6.87 > >> - v6.1.134 > >> - v5.15.180 > >> Device tree: upstream Rock Pi S dts [4] > >> Kernel config: ARM64 defconfig > >> > >> Hardware setup: USB A-C cable connected from PC A port to the C > >> connector on the Rock Pi S board. This cable provides board power as > >> well as the connection between the host and the gadget. > >> > >> Behavior: > >> 1. boot board normally > >> 2. optionally wait some time > >> 3. run script to start a CDC serial gadget [5] > >> 4. after about 0.6 seconds the ttyGS0 serial device is present and > >> working, and so is ttyACM0 on the host: so far all good > >> 5. after about 6 seconds the dwc2 controller receives some > >> interrupts and starts a new enumeration sequence > >> > >> This is what the kernel logs: > >> > >> [ 20.105688] dwc2 ff400000.usb: bound driver configfs-gadget.g1 > >> [ 20.285431] dwc2 ff400000.usb: new device is high-speed > >> [ 20.373455] dwc2 ff400000.usb: new device is high-speed > >> [ 20.426496] dwc2 ff400000.usb: new address 28 > >> [ 26.688388] dwc2 ff400000.usb: new device is high-speed > >> [ 26.775363] dwc2 ff400000.usb: new device is high-speed > >> [ 26.836880] dwc2 ff400000.usb: new address 29 > >> > >> Here is a side-by-side log of host and device, synced manually using > >> a video capture (sorry about the long lines, can't do without): > >> > >> *** HOST *** *** DEVICE *** > >> <<< Last line of the script: 'echo ff400000.usb > UDC' >>> > >> [ 14.281350] dwc2 ff400000.usb: bound driver configfs-gadget.g1 > >> [ 14.482332] dwc2 ff400000.usb: new device is high-speed > >> [108204.084049] usb 3-2: new high-speed USB device number 39 using xhci_hcd > >> [ 14.675692] dwc2 ff400000.usb: new device is high-speed > >> [108204.274639] usb 3-2: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 [ 14.737395] dwc2 ff400000.usb: new address 44 > >> [108204.274652] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 > >> [108204.274656] usb 3-2: Product: ... > >> [108204.274659] usb 3-2: Manufacturer: ... > >> [108204.274662] usb 3-2: SerialNumber: 12345678 > >> [108204.282555] cdc_acm 3-2:1.0: ttyACM0: USB ACM device > >> (...nothing happens for about 6 seconds...) > >> [108209.972180] usb 3-2: USB disconnect, device number 39 > >> [ 20.766950] dwc2 ff400000.usb: new device is high-speed > >> [108210.339297] usb 3-2: new high-speed USB device number 40 using xhci_hcd > >> [ 20.960375] dwc2 ff400000.usb: new device is high-speed > >> [108210.739738] usb 3-2: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 [ 21.200670] dwc2 ff400000.usb: new address 45 > >> [108210.739750] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 > >> [108210.739753] usb 3-2: Product: ... > >> [108210.739756] usb 3-2: Manufacturer: ... > >> [108210.739758] usb 3-2: SerialNumber: 12345678 > >> [108210.747084] cdc_acm 3-2:1.0: ttyACM0: USB ACM device > >> > >> Note: the device address is different on the host and the target. Is > >> this expected? > > > > The value on the host side is a device number, not a device address. > > Device numbers are assigned sequentially by the kernel, whereas > > addresses are assigned automatically by the xHCI hardware without > > software intervention. > > > > For non-xHCI controllers the two are the same, but not with xHCI. I see, that clarifies! > >> In the driver there are 2 interrupt handlers involved: > >> - dwc2_handle_common_intr in core_intr.c for the common events > >> - dwc2_hsotg_irq in gadget.c for gadget events > >> > >> They share the same interrupt number, which AFAICU is because they > >> actually read different bits from the same GINTSTS register. > >> > >> I enabled DEBUG in the dwc2 driver and captured the initial events > >> logged after the ~6 seconds pause, i.e. where the 2nd enumeration > >> starts. Here they are with some annotations: > >> > >> 1. first interrupt after the ~6 s break: > >> - dwc2_handle_common_intr finds no bits high > >> - dwc2_hsotg_irq finds one (early suspend bit): > >> [ 46.203094] dwc2 ff400000.usb: dwc2_hsotg_irq: 04008428 00000400 (d88c3cc4) retry 8 > >> [ 46.204060] dwc2 ff400000.usb: GINTSTS_ErlySusp > >> > >> 2. second interrupt > >> - dwc2_handle_common_intr finds one bits high (suspend): > >> [ 46.206807] dwc2 ff400000.usb: USB SUSPEND > >> [ 46.206824] dwc2 ff400000.usb: dwc2_handle_usb_suspend_intr: DSTS=0x502a01 > >> [ 46.206842] dwc2 ff400000.usb: DSTS.Suspend Status=1 HWCFG4.Power Optimize=1 HWCFG4.Hibernation=0 > >> [ 46.206872] dwc2 ff400000.usb: dwc2_hsotg_irq: 04008028 00000000 (d88c3cc4) retry 8 > >> - dwc2_hsotg_irq finds no bits high > >> > >> 3. third interrupt > >> - dwc2_handle_common_intr finds no bits high > >> - dwc2_hsotg_irq finds two (reset detected + USB reset): > >> [ 46.437109] dwc2 ff400000.usb: dwc2_hsotg_irq: 04809028 00801000 (d88c3cc4) retry 8 > >> [ 46.437607] dwc2 ff400000.usb: dwc2_hsotg_irq: USBRstDet > >> [ 46.437630] dwc2 ff400000.usb: dwc2_hsotg_irq: USBRst > >> [ 46.437649] dwc2 ff400000.usb: GNPTXSTS=00080010 > >> [ 46.437673] dwc2 ff400000.usb: complete: ep 00000000dab859c8 ep0, req 000000009cb97255, -108 => 00000000acdb2ee9 > >> [ 46.437719] dwc2 ff400000.usb: dwc2_hsotg_complete_setup: failed -108 > >> [ 46.437765] dwc2 ff400000.usb: dwc2_hsotg_ep_disable(ep 00000000cf8cf06f) > >> [ 46.437790] dwc2 ff400000.usb: dwc2_hsotg_ep_disable: DxEPCTL=0x08080200 > >> ... > >> > >> From now on the log appears as a normal enumeration process. > >> > >> I'm stuck at a dead end, trying to understand what may be triggering the > >> second enumeration. > > > > You should enable host-side debugging before connecting the cable: > > > > echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control Did that, and here is my host log: ### First enumeration starts [105021.069029] hub 3-3:1.0: state 7 ports 4 chg 0000 evt 0010 [105021.076825] usb 3-3-port4: status 0101, change 0001, 12 Mb/s [105021.080244] usb 3-3-port4: indicator auto status 0 [105021.190710] usb 3-3-port4: debounce total 100ms stable 100ms status 0x101 [105021.286322] usb 3-3.4: new high-speed USB device number 87 using xhci_hcd [105021.396995] usb 3-3.4: skipped 1 descriptor after configuration [105021.397009] usb 3-3.4: skipped 4 descriptors after interface [105021.398516] usb 3-3.4: default language 0x0409 [105021.403247] usb 3-3.4: udev 87, busnum 3, minor = 342 [105021.403257] usb 3-3.4: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 [105021.403262] usb 3-3.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [105021.403265] usb 3-3.4: Product: foobar [105021.403267] usb 3-3.4: Manufacturer: ACME [105021.403269] usb 3-3.4: SerialNumber: 12345678 [105021.403762] usb 3-3.4: usb_probe_device [105021.403768] usb 3-3.4: configuration #1 chosen from 1 choice [105021.410537] usb 3-3.4: adding 3-3.4:1.0 (config #1, interface 0) [105021.412201] cdc_acm 3-3.4:1.0: usb_probe_interface [105021.412207] cdc_acm 3-3.4:1.0: usb_probe_interface - got id [105021.413917] cdc_acm 3-3.4:1.0: ttyACM0: USB ACM device [105021.413966] usb 3-3.4: adding 3-3.4:1.1 (config #1, interface 1) ### 6 seconds delay, second enumeration starts [105027.468951] hub 3-3:1.0: state 7 ports 4 chg 0000 evt 0010 [105027.477656] usb 3-3-port4: status 0101, change 0001, 12 Mb/s [105027.481091] usb 3-3-port4: indicator auto status 0 [105027.481116] usb 3-3.4: USB disconnect, device number 87 [105027.481123] usb 3-3.4: unregistering device [105027.481128] usb 3-3.4: unregistering interface 3-3.4:1.0 [105027.481803] usb 3-3.4: unregistering interface 3-3.4:1.1 [105027.481899] usb 3-3.4: usb_disable_device nuking all URBs [105027.591528] usb 3-3-port4: debounce total 100ms stable 100ms status 0x101 [105027.686604] usb 3-3.4: new high-speed USB device number 88 using xhci_hcd [105027.797310] usb 3-3.4: skipped 1 descriptor after configuration [105027.797328] usb 3-3.4: skipped 4 descriptors after interface [105027.798834] usb 3-3.4: default language 0x0409 [105027.803564] usb 3-3.4: udev 88, busnum 3, minor = 343 [105027.803574] usb 3-3.4: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 [105027.803578] usb 3-3.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [105027.803582] usb 3-3.4: Product: foobar [105027.803584] usb 3-3.4: Manufacturer: ACME [105027.803586] usb 3-3.4: SerialNumber: 12345678 [105027.803919] usb 3-3.4: usb_probe_device [105027.803926] usb 3-3.4: configuration #1 chosen from 1 choice [105027.810831] usb 3-3.4: adding 3-3.4:1.0 (config #1, interface 0) [105027.812617] cdc_acm 3-3.4:1.0: usb_probe_interface [105027.812627] cdc_acm 3-3.4:1.0: usb_probe_interface - got id [105027.814357] cdc_acm 3-3.4:1.0: ttyACM0: USB ACM device [105027.814424] usb 3-3.4: adding 3-3.4:1.1 (config #1, interface 1) ### Second enumeration finished Do you find useful hints in this log? To me it doesn't, I'm afraid. > > It's quite possible that you're getting messed up by link power > > management (LPM). But that's just a guess. What would be a symptom, if that happened? > I don't see, from logs, any wrong behavior from dwc2 device side. dwc2 > driver correctly handle Early Suspend, Suspend, USB Reset detect and > finally USB Reset which all initiating from Host side. That was my impression, thanks for confirming that. Clearly somethins goes wrong before that. In case it were helpful, here are the logged lines on the dwc2 device from just _before_ up to just after the 6 seconds delay: [ 40.515896] dwc2 ff400000.usb: dwc2_hsotg_irq: 04088028 00080000 (d88c3cc4) retry 8 [ 40.516402] dwc2 ff400000.usb: dwc2_hsotg_irq: daint=00010000 [ 40.516426] dwc2 ff400000.usb: dwc2_hsotg_epint: ep0(out) DxEPINT=0x00000009 [ 40.516450] dwc2 ff400000.usb: dwc2_hsotg_epint: Setup/Timeout [ 40.516469] dwc2 ff400000.usb: complete: ep 00000000dab859c8 ep0, req 000000009cb97255, 0 => 00000000acdb2ee9 [ 40.516507] dwc2 ff400000.usb: ctrl Type=80, Req=06, V=0306, I=0409, L=00ff [ 40.516542] dwc2 ff400000.usb: ep0: req 00000000326c54ef: 26@0000000010e15f7f, noi=0, zero=1, snok=0 [ 40.516577] dwc2 ff400000.usb: dwc2_hsotg_start_req: DxEPCTL=0x00028000, ep 0, dir in [ 40.516604] dwc2 ff400000.usb: ureq->length:26 ureq->actual:0 [ 40.516627] dwc2 ff400000.usb: dwc2_hsotg_start_req: 1@26/26, 0x0008001a => 0x00000910 [ 40.516656] dwc2 ff400000.usb: dwc2_hsotg_start_req: 042fa000 pad => 0x00000914 [ 40.516678] dwc2 ff400000.usb: ep0 state:1 [ 40.516696] dwc2 ff400000.usb: dwc2_hsotg_start_req: DxEPCTL=0x84028000 [ 40.516717] dwc2 ff400000.usb: dwc2_hsotg_start_req: DXEPCTL=0x80008000 [ 40.517286] dwc2 ff400000.usb: dwc2_hsotg_irq: 04048028 00040000 (d88c3cc4) retry 8 [ 40.517783] dwc2 ff400000.usb: dwc2_hsotg_irq: daint=00000001 [ 40.517805] dwc2 ff400000.usb: dwc2_hsotg_epint: ep0(in) DxEPINT=0x00000001 [ 40.517832] dwc2 ff400000.usb: dwc2_hsotg_epint: XferCompl: DxEPCTL=0x00008000, DXEPTSIZ=00000062 [ 40.517859] dwc2 ff400000.usb: dwc2_hsotg_complete_in: adjusting size done 0 => 26 [ 40.517884] dwc2 ff400000.usb: req->length:26 req->actual:26 req->zero:1 [ 40.517909] dwc2 ff400000.usb: Receiving zero-length packet on ep0 [ 40.518514] dwc2 ff400000.usb: dwc2_hsotg_irq: 04088028 00080000 (d88c3cc4) retry 8 [ 40.519011] dwc2 ff400000.usb: dwc2_hsotg_irq: daint=00010000 [ 40.519035] dwc2 ff400000.usb: dwc2_hsotg_epint: ep0(out) DxEPINT=0x00000001 [ 40.519062] dwc2 ff400000.usb: dwc2_hsotg_epint: XferCompl: DxEPCTL=0x00028000, DXEPTSIZ=60000011 [ 40.519087] dwc2 ff400000.usb: zlp packet received [ 40.519106] dwc2 ff400000.usb: complete: ep 00000000dab859c8 ep0, req 00000000326c54ef, 0 => 000000002dd872a5 [ 40.519143] dwc2 ff400000.usb: dwc2_hsotg_enqueue_setup: queueing setup request [ 40.519164] dwc2 ff400000.usb: ep0: req 000000009cb97255: 8@00000000e42d6d89, noi=0, zero=0, snok=0 [ 40.519214] dwc2 ff400000.usb: dwc2_hsotg_start_req: DxEPCTL=0x00028000, ep 0, dir out [ 40.519242] dwc2 ff400000.usb: ureq->length:8 ureq->actual:0 [ 40.519264] dwc2 ff400000.usb: dwc2_hsotg_start_req: 1@8/8, 0x00080008 => 0x00000b10 [ 40.519297] dwc2 ff400000.usb: dwc2_hsotg_start_req: 042f8000 pad => 0x00000b14 [ 40.519321] dwc2 ff400000.usb: ep0 state:0 [ 40.519340] dwc2 ff400000.usb: dwc2_hsotg_start_req: DxEPCTL=0x80028000 [ 40.519364] dwc2 ff400000.usb: dwc2_hsotg_start_req: DXEPCTL=0x80028000 [ 46.203094] dwc2 ff400000.usb: dwc2_hsotg_irq: 04008428 00000400 (d88c3cc4) retry 8 [ 46.204060] dwc2 ff400000.usb: GINTSTS_ErlySusp [ 46.205565] dwc2 ff400000.usb: gintsts=04008828 gintmsk=d88c3cc4 [ 46.206807] dwc2 ff400000.usb: USB SUSPEND [ 46.206824] dwc2 ff400000.usb: dwc2_handle_usb_suspend_intr: DSTS=0x502a01 [ 46.206842] dwc2 ff400000.usb: DSTS.Suspend Status=1 HWCFG4.Power Optimize=1 HWCFG4.Hibernation=0 [ 46.206872] dwc2 ff400000.usb: dwc2_hsotg_irq: 04008028 00000000 (d88c3cc4) retry 8 > You need to > investigate why Host after 6 second initiated above signaling sequence. > Maybe after 5-6 sec host initiate autosuspend? But in this case not > clear why then it perform USB reset? See the host logs above. > Did you test with different PC's with different EHCI/XHCI hosts? Both the PCs I tested for my initial report (including the one I tested with Windows) only have xHCI controllers (I user lsusb to find out). I now managed to find an old laptop with an EHCI controller, and connecting to that host gives the same result: re-enumeration after 6 seconds. There are two more facts that may hint at something: * using the Rockchip 4.4 vendor kernel the problem does never happen * using a custom board with the RK3308 the problem happens if the USB-C cable is connected before booting; it does not happen if the connectror is connected after the boot has completed These two have been reported to me and I haven't reproduced them (yet) so I cannot provide logs at the moment. I can prioritize working on them, or the 4.4 vendor kernel at least, if it is deemed useful. Best regards, Luca -- Luca Ceresoli, Bootlin Embedded Linux and Kernel engineering https://bootlin.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 2025-04-15 14:28 ` Luca Ceresoli @ 2025-04-15 15:58 ` Luca Ceresoli 2025-04-15 16:14 ` Alan Stern 1 sibling, 0 replies; 12+ messages in thread From: Luca Ceresoli @ 2025-04-15 15:58 UTC (permalink / raw) To: Minas Harutyunyan Cc: Alan Stern, linux-usb@vger.kernel.org, Kever Yang, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier Hello again, On Tue, 15 Apr 2025 16:28:25 +0200 Luca Ceresoli <luca.ceresoli@bootlin.com> wrote: [...] > There are two more facts that may hint at something: > > * using the Rockchip 4.4 vendor kernel the problem does never happen > * using a custom board with the RK3308 the problem happens if the > USB-C cable is connected before booting; it does not happen if the > connectror is connected after the boot has completed > > These two have been reported to me and I haven't reproduced them (yet) > so I cannot provide logs at the moment. I can prioritize working on > them, or the 4.4 vendor kernel at least, if it is deemed useful. FWIW I managed to try the 4.4 vendor kernel: git repo: https://github.com/radxa/kernel.git commit: 4.4.143-23-rockchip-796-g09ee299143e4 As said, there is only one enumerations, not two. Here's a log of the host during said enumeration (same host as before, no changes on the host side): [110685.837058] hub 3-3:1.0: state 7 ports 4 chg 0000 evt 0010 [110685.844824] usb 3-3-port4: status 0101, change 0001, 12 Mb/s [110685.848278] usb 3-3-port4: indicator auto status 0 [110685.958750] usb 3-3-port4: debounce total 100ms stable 100ms status 0x101 [110686.053638] usb 3-3.4: new high-speed USB device number 116 using xhci_hcd [110686.163756] usb 3-3.4: skipped 2 descriptors after configuration [110686.163775] usb 3-3.4: skipped 4 descriptors after interface [110686.163996] usb 3-3.4: default language 0x0409 [110686.164557] usb 3-3.4: udev 116, busnum 3, minor = 371 [110686.164567] usb 3-3.4: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 [110686.164574] usb 3-3.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [110686.164579] usb 3-3.4: Product: foobar [110686.164583] usb 3-3.4: Manufacturer: ACME [110686.164586] usb 3-3.4: SerialNumber: 12345678 [110686.165279] usb 3-3.4: usb_probe_device [110686.165290] usb 3-3.4: configuration #1 chosen from 1 choice [110686.169116] usb 3-3.4: adding 3-3.4:1.0 (config #1, interface 0) [110686.169384] cdc_acm 3-3.4:1.0: usb_probe_interface [110686.169389] cdc_acm 3-3.4:1.0: usb_probe_interface - got id [110686.169827] cdc_acm 3-3.4:1.0: ttyACM0: USB ACM device [110686.169868] usb 3-3.4: adding 3-3.4:1.1 (config #1, interface 1) The only potentially significant difference I could spot from the previous host log is: mainline: [105021.396995] usb 3-3.4: skipped 1 descriptor after configuration 4.4: [110686.163756] usb 3-3.4: skipped 2 descriptors after configuration Luca -- Luca Ceresoli, Bootlin Embedded Linux and Kernel engineering https://bootlin.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 2025-04-15 14:28 ` Luca Ceresoli 2025-04-15 15:58 ` Luca Ceresoli @ 2025-04-15 16:14 ` Alan Stern 2025-05-02 13:53 ` Luca Ceresoli 1 sibling, 1 reply; 12+ messages in thread From: Alan Stern @ 2025-04-15 16:14 UTC (permalink / raw) To: Luca Ceresoli Cc: Minas Harutyunyan, linux-usb@vger.kernel.org, Kever Yang, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier On Tue, Apr 15, 2025 at 04:28:25PM +0200, Luca Ceresoli wrote: > > > You should enable host-side debugging before connecting the cable: > > > > > > echo 'module usbcore =p' >/sys/kernel/debug/dynamic_debug/control > > Did that, and here is my host log: > > ### First enumeration starts > > [105021.069029] hub 3-3:1.0: state 7 ports 4 chg 0000 evt 0010 > [105021.076825] usb 3-3-port4: status 0101, change 0001, 12 Mb/s > [105021.080244] usb 3-3-port4: indicator auto status 0 > [105021.190710] usb 3-3-port4: debounce total 100ms stable 100ms status 0x101 > [105021.286322] usb 3-3.4: new high-speed USB device number 87 using xhci_hcd > [105021.396995] usb 3-3.4: skipped 1 descriptor after configuration > [105021.397009] usb 3-3.4: skipped 4 descriptors after interface > [105021.398516] usb 3-3.4: default language 0x0409 > [105021.403247] usb 3-3.4: udev 87, busnum 3, minor = 342 > [105021.403257] usb 3-3.4: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 > [105021.403262] usb 3-3.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3 > [105021.403265] usb 3-3.4: Product: foobar > [105021.403267] usb 3-3.4: Manufacturer: ACME > [105021.403269] usb 3-3.4: SerialNumber: 12345678 > [105021.403762] usb 3-3.4: usb_probe_device > [105021.403768] usb 3-3.4: configuration #1 chosen from 1 choice > [105021.410537] usb 3-3.4: adding 3-3.4:1.0 (config #1, interface 0) > [105021.412201] cdc_acm 3-3.4:1.0: usb_probe_interface > [105021.412207] cdc_acm 3-3.4:1.0: usb_probe_interface - got id > [105021.413917] cdc_acm 3-3.4:1.0: ttyACM0: USB ACM device > [105021.413966] usb 3-3.4: adding 3-3.4:1.1 (config #1, interface 1) > > ### 6 seconds delay, second enumeration starts > > [105027.468951] hub 3-3:1.0: state 7 ports 4 chg 0000 evt 0010 > [105027.477656] usb 3-3-port4: status 0101, change 0001, 12 Mb/s > [105027.481091] usb 3-3-port4: indicator auto status 0 > [105027.481116] usb 3-3.4: USB disconnect, device number 87 > [105027.481123] usb 3-3.4: unregistering device > [105027.481128] usb 3-3.4: unregistering interface 3-3.4:1.0 > [105027.481803] usb 3-3.4: unregistering interface 3-3.4:1.1 > [105027.481899] usb 3-3.4: usb_disable_device nuking all URBs This shows that as far as the host is concerned, the device was disconnected from the USB bus... > [105027.591528] usb 3-3-port4: debounce total 100ms stable 100ms status 0x101 > [105027.686604] usb 3-3.4: new high-speed USB device number 88 using xhci_hcd > [105027.797310] usb 3-3.4: skipped 1 descriptor after configuration > [105027.797328] usb 3-3.4: skipped 4 descriptors after interface > [105027.798834] usb 3-3.4: default language 0x0409 > [105027.803564] usb 3-3.4: udev 88, busnum 3, minor = 343 > [105027.803574] usb 3-3.4: New USB device found, idVendor=1209, idProduct=0001, bcdDevice= 1.00 > [105027.803578] usb 3-3.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3 > [105027.803582] usb 3-3.4: Product: foobar > [105027.803584] usb 3-3.4: Manufacturer: ACME > [105027.803586] usb 3-3.4: SerialNumber: 12345678 > [105027.803919] usb 3-3.4: usb_probe_device > [105027.803926] usb 3-3.4: configuration #1 chosen from 1 choice > [105027.810831] usb 3-3.4: adding 3-3.4:1.0 (config #1, interface 0) > [105027.812617] cdc_acm 3-3.4:1.0: usb_probe_interface > [105027.812627] cdc_acm 3-3.4:1.0: usb_probe_interface - got id > [105027.814357] cdc_acm 3-3.4:1.0: ttyACM0: USB ACM device > [105027.814424] usb 3-3.4: adding 3-3.4:1.1 (config #1, interface 1) and then reconnected. > ### Second enumeration finished > > Do you find useful hints in this log? > > To me it doesn't, I'm afraid. Not much, no. There's no way to tell the reason for the disconnection. It could be something that the device did, or a problem with the cable (unlikely since it occurs regularly at 6-second intervals). > > > It's quite possible that you're getting messed up by link power > > > management (LPM). But that's just a guess. > > What would be a symptom, if that happened? The debugging log wouldn't show much unless something went wrong. You could see if there are any files containing "lpm" in their names in the /sys/bus/usb/devices/3-3.4/ directory (while the device is connected) and what they contain. Also, there's a way to disable LPM on the host by setting a usbcore quirks module parameter: echo 1209:0001:k >/sys/module/usbcore/parameters/quirks (1209 and 0001 are the device's vendor and product IDs, and k is the code for disabling LPM.) You can do this before connecting the device and it will remain in effect until you reboot. You could also try connecting a usbmon trace for bus 3, showing what happens during the initial connection and ensuing disconnection. Any LPM transitions would show up in the trace. Alan Stern ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 2025-04-15 16:14 ` Alan Stern @ 2025-05-02 13:53 ` Luca Ceresoli 2025-05-02 17:56 ` Alan Stern 0 siblings, 1 reply; 12+ messages in thread From: Luca Ceresoli @ 2025-05-02 13:53 UTC (permalink / raw) To: Alan Stern Cc: Minas Harutyunyan, linux-usb@vger.kernel.org, Kever Yang, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier Hello Alan, thanks for your continued support! On Tue, 15 Apr 2025 12:14:58 -0400 Alan Stern <stern@rowland.harvard.edu> wrote: [...] > > > > It's quite possible that you're getting messed up by link power > > > > management (LPM). But that's just a guess. > > > > What would be a symptom, if that happened? > > The debugging log wouldn't show much unless something went wrong. You > could see if there are any files containing "lpm" in their names in the > /sys/bus/usb/devices/3-3.4/ directory (while the device is connected) > and what they contain. Also, there's a way to disable LPM on the host > by setting a usbcore quirks module parameter: > > echo 1209:0001:k >/sys/module/usbcore/parameters/quirks Tried this. There is no file with 'lpm' in the name in /sys/bus/usb/devices/3-3.4/, and adding the quirk did not change the result: still a disconnect and reconnect in ~6 seconds. > You could also try connecting a usbmon trace for bus 3, showing what > happens during the initial connection and ensuing disconnection. Any > LPM transitions would show up in the trace. Tried this, and here are the few lines before and after the 5~6 seconds delay. ffff99621e768840 4009009102 C Bi:1:009:3 0 2 = 696e ffff99621e768840 4009009104 S Bi:1:009:3 -115 256 < ffff99621e768300 4009009115 S Bi:1:009:3 -115 256 < ffff99621e768840 4009009144 C Bi:1:009:3 0 6 = 3a383534 2033 ffff99621e768300 4009009155 C Bi:1:009:3 0 1 = 37 ffff99621e768840 4009009178 C Bi:1:009:3 0 2 = 0d0a ffff99621e768840 4009009180 S Bi:1:009:3 -115 256 < ffff996080f11900 4009009361 C Ci:1:014:0 0 26 = 1a034300 44004300 20004100 43004d00 20004400 61007400 6100 ffff99621e768300 4009009615 C Bi:1:009:3 0 3 = 5b2020 ffff99621e768300 4009009624 S Bi:1:009:3 -115 256 < ffff99621e768840 4009009645 C Bi:1:009:3 0 3 = 203233 ffff99621e768840 4009009646 S Bi:1:009:3 -115 256 < ffff99621e768300 4009009692 C Bi:1:009:3 0 4 = 2e383738 ffff99621e768300 4009009694 S Bi:1:009:3 -115 256 < ffff99621e768840 4009009703 C Bi:1:009:3 0 2 = 3731 ffff99621e768840 4009009722 S Bi:1:009:3 -115 256 < ffff99621e768840 4009009933 C Bi:1:009:3 0 2 = 7472 <<< 6 seconds delay >>> ffff9960828e9540 4014796128 C Ii:1:001:1 0:2048 2 = 1000 ffff9960828e9540 4014796145 S Ii:1:001:1 -115:2048 4 < ffff996080f11900 4014796162 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < ffff996080f11900 4014796189 C Ci:1:001:0 0 4 = 00010100 ffff996080f11900 4014796201 S Co:1:001:0 s 23 01 0010 0004 0000 0 ffff996080f11900 4014796219 C Co:1:001:0 0 0 ffff996080f11000 4014799627 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < ffff996080f11000 4014799679 C Ci:1:001:0 0 4 = 00010000 ffff996080f11000 4014826132 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < ffff996080f11000 4014826166 C Ci:1:001:0 0 4 = 00010000 ffff996080f11000 4014852075 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < ffff996080f11000 4014852122 C Ci:1:001:0 0 4 = 00010000 ffff996080f11000 4014878210 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < ffff996080f11000 4014878253 C Ci:1:001:0 0 4 = 00010000 ffff996080f11000 4014904049 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < ffff996080f11000 4014904088 C Ci:1:001:0 0 4 = 00010000 ffff9960828e9540 4014948427 C Ii:1:001:1 0:2048 2 = 1000 ffff9960828e9540 4014948456 S Ii:1:001:1 -115:2048 4 < ffff99621e768300 4014948461 C Bi:1:009:3 0 2 = 5b20 ffff99621e768300 4014948472 S Bi:1:009:3 -115 256 < ffff99621e768840 4014948488 C Bi:1:009:3 0 2 = 2020 ffff99621e768840 4014948489 S Bi:1:009:3 -115 256 < ffff996080f11000 4014948522 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < ffff99621e768300 4014948545 C Bi:1:009:3 0 58 = 32392e38 31373337 325d203e 3e3e2064 7763325f 68616e64 6c655f63 6f6d6d6f ffff99621e768300 4014948560 S Bi:1:009:3 -115 256 < ffff996080f11000 4014948607 C Ci:1:001:0 0 4 = 01010100 ffff99621e768840 4014948639 C Bi:1:009:3 0 10 = 37395d20 3e3e3e20 6477 ffff99621e768840 4014948644 S Bi:1:009:3 -115 256 < ffff99621e768300 4014948657 C Bi:1:009:3 0 3 = 63325f ffff99621e768300 4014948663 S Bi:1:009:3 -115 256 < ffff99621e768840 4014948689 C Bi:1:009:3 0 5 = 68736f74 67 ffff99621e768840 4014948693 S Bi:1:009:3 -115 256 < ffff99621e768300 4014948718 C Bi:1:009:3 0 2 = 5f69 ffff99621e768300 4014948720 S Bi:1:009:3 -115 256 < ffff99621e768840 4014948759 C Bi:1:009:3 0 4 = 72713a33 Does this give you any hints? I'm afraid it's going to take time before I'm able to decipher these hieroglyphs. :-| Full log is available, if needed. However I suspect using Wireshark to capture the USB traffic should produce the same content. If it is the case, I have available a Wireshark capture as well. The first logged event I see in Wireshark after the delay is a "URB_INTERRUPT in", which is possibly matching the "Ii" in the log above. However IIUC both the usbmon debugfs interface and Wireshark are unable to capture disconnection events because that's handled by the hardware. Correct? I hope useful hints can be found here. Otherwise I guess the only way out will be comparing the behaviour of the 4.4 Rockchip kernel (which works correctly) against mainline. I expect this to be a long and painful process, though. Best regards, Luca -- Luca Ceresoli, Bootlin Embedded Linux and Kernel engineering https://bootlin.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 2025-05-02 13:53 ` Luca Ceresoli @ 2025-05-02 17:56 ` Alan Stern 2025-05-09 7:17 ` Luca Ceresoli 0 siblings, 1 reply; 12+ messages in thread From: Alan Stern @ 2025-05-02 17:56 UTC (permalink / raw) To: Luca Ceresoli Cc: Minas Harutyunyan, linux-usb@vger.kernel.org, Kever Yang, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier On Fri, May 02, 2025 at 03:53:08PM +0200, Luca Ceresoli wrote: > Hello Alan, > > thanks for your continued support! > > On Tue, 15 Apr 2025 12:14:58 -0400 > Alan Stern <stern@rowland.harvard.edu> wrote: > > [...] > > > > > > It's quite possible that you're getting messed up by link power > > > > > management (LPM). But that's just a guess. > > > > > > What would be a symptom, if that happened? > > > > The debugging log wouldn't show much unless something went wrong. You > > could see if there are any files containing "lpm" in their names in the > > /sys/bus/usb/devices/3-3.4/ directory (while the device is connected) > > and what they contain. Also, there's a way to disable LPM on the host > > by setting a usbcore quirks module parameter: > > > > echo 1209:0001:k >/sys/module/usbcore/parameters/quirks > > Tried this. There is no file with 'lpm' in the name in > /sys/bus/usb/devices/3-3.4/, and adding the quirk did not change the > result: still a disconnect and reconnect in ~6 seconds. Okay, so LPM doesn't seem to be the reason. > > You could also try connecting a usbmon trace for bus 3, showing what > > happens during the initial connection and ensuing disconnection. Any > > LPM transitions would show up in the trace. > > Tried this, and here are the few lines before and after the 5~6 seconds > delay. > > ffff99621e768840 4009009102 C Bi:1:009:3 0 2 = 696e > ffff99621e768840 4009009104 S Bi:1:009:3 -115 256 < > ffff99621e768300 4009009115 S Bi:1:009:3 -115 256 < > ffff99621e768840 4009009144 C Bi:1:009:3 0 6 = 3a383534 2033 > ffff99621e768300 4009009155 C Bi:1:009:3 0 1 = 37 > ffff99621e768840 4009009178 C Bi:1:009:3 0 2 = 0d0a > ffff99621e768840 4009009180 S Bi:1:009:3 -115 256 < > ffff996080f11900 4009009361 C Ci:1:014:0 0 26 = 1a034300 44004300 20004100 43004d00 20004400 61007400 6100 > ffff99621e768300 4009009615 C Bi:1:009:3 0 3 = 5b2020 > ffff99621e768300 4009009624 S Bi:1:009:3 -115 256 < > ffff99621e768840 4009009645 C Bi:1:009:3 0 3 = 203233 > ffff99621e768840 4009009646 S Bi:1:009:3 -115 256 < > ffff99621e768300 4009009692 C Bi:1:009:3 0 4 = 2e383738 > ffff99621e768300 4009009694 S Bi:1:009:3 -115 256 < > ffff99621e768840 4009009703 C Bi:1:009:3 0 2 = 3731 > ffff99621e768840 4009009722 S Bi:1:009:3 -115 256 < > ffff99621e768840 4009009933 C Bi:1:009:3 0 2 = 7472 It looks like device 9 (the lines containing :1:009:3) and device 14 are unrelated to the problem; neither of them is your DWC2 device. > > <<< 6 seconds delay >>> > > ffff9960828e9540 4014796128 C Ii:1:001:1 0:2048 2 = 1000 > ffff9960828e9540 4014796145 S Ii:1:001:1 -115:2048 4 < > ffff996080f11900 4014796162 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > ffff996080f11900 4014796189 C Ci:1:001:0 0 4 = 00010100 This shows the host system receiving a disconnect notification (for port 4) from the hardware. Which is odd, because earlier you said the device was 3-3.4, indicating that it was plugged into a hub, not directly into the host controller. But the notification here comes from the host controller. On the other hand, an even earlier email said that the device was 3-2, indicating it _was_ plugged directly into the host controller Which means you've been changing your setup while running these tests. Not a good idea. > ffff996080f11900 4014796201 S Co:1:001:0 s 23 01 0010 0004 0000 0 > ffff996080f11900 4014796219 C Co:1:001:0 0 0 > ffff996080f11000 4014799627 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > ffff996080f11000 4014799679 C Ci:1:001:0 0 4 = 00010000 > ffff996080f11000 4014826132 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > ffff996080f11000 4014826166 C Ci:1:001:0 0 4 = 00010000 > ffff996080f11000 4014852075 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > ffff996080f11000 4014852122 C Ci:1:001:0 0 4 = 00010000 > ffff996080f11000 4014878210 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > ffff996080f11000 4014878253 C Ci:1:001:0 0 4 = 00010000 > ffff996080f11000 4014904049 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > ffff996080f11000 4014904088 C Ci:1:001:0 0 4 = 00010000 > ffff9960828e9540 4014948427 C Ii:1:001:1 0:2048 2 = 1000 > ffff9960828e9540 4014948456 S Ii:1:001:1 -115:2048 4 < > ffff99621e768300 4014948461 C Bi:1:009:3 0 2 = 5b20 > ffff99621e768300 4014948472 S Bi:1:009:3 -115 256 < > ffff99621e768840 4014948488 C Bi:1:009:3 0 2 = 2020 > ffff99621e768840 4014948489 S Bi:1:009:3 -115 256 < > ffff996080f11000 4014948522 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > ffff99621e768300 4014948545 C Bi:1:009:3 0 58 = 32392e38 31373337 325d203e 3e3e2064 7763325f 68616e64 6c655f63 6f6d6d6f > ffff99621e768300 4014948560 S Bi:1:009:3 -115 256 < > ffff996080f11000 4014948607 C Ci:1:001:0 0 4 = 01010100 And then about 150 ms later (the second column of the log is a timestamp, in microseconds), a connection notification. Nothing preceding the disconnect to explain what caused it. > ffff99621e768840 4014948639 C Bi:1:009:3 0 10 = 37395d20 3e3e3e20 6477 > ffff99621e768840 4014948644 S Bi:1:009:3 -115 256 < > ffff99621e768300 4014948657 C Bi:1:009:3 0 3 = 63325f > ffff99621e768300 4014948663 S Bi:1:009:3 -115 256 < > ffff99621e768840 4014948689 C Bi:1:009:3 0 5 = 68736f74 67 > ffff99621e768840 4014948693 S Bi:1:009:3 -115 256 < > ffff99621e768300 4014948718 C Bi:1:009:3 0 2 = 5f69 > ffff99621e768300 4014948720 S Bi:1:009:3 -115 256 < > ffff99621e768840 4014948759 C Bi:1:009:3 0 4 = 72713a33 Unrelated material. Evidently device 9 is running some sort of serial connection, because everything it sends looks like ASCII characters. > Does this give you any hints? Afraid not. > I'm afraid it's going to take time before I'm able to decipher these > hieroglyphs. :-| > > Full log is available, if needed. It wouldn't hurt to see exactly what happens when the device is first plugged in. It's possible, though unlikely, that something at that time causes trouble later on. > However I suspect using Wireshark to capture the USB traffic should > produce the same content. If it is the case, I have available a > Wireshark capture as well. The first logged event I see in Wireshark > after the delay is a "URB_INTERRUPT in", which is possibly matching the > "Ii" in the log above. Yes; usbmon and Wireshark capture basically the same information. > However IIUC both the usbmon debugfs interface and Wireshark are unable > to capture disconnection events because that's handled by the hardware. > Correct? I'm not quite sure how to answer. Yes, the hardware handles disconnections -- because the hardware handles _everything_ that happens on the USB bus. And one of the things the hardware does when handling disconnections is to tell the driver that one occurred; that's why the report shows up in the usbmon (or Wireshark) trace. A USB analyzer could tell you exactly what's happening on the wire, but they are expensive. And in this case, I think all it would tell you is what we already know: that a disconnect happened. The fact that the disconnects don't happen with the vendor kernel indicates that they aren't caused by a hardware problem, such as a bad cable link, but rather by something in the device's software, i.e., the dwc2 driver. I don't know anything about that driver, though. Minas is the expert. You really need his advice. > I hope useful hints can be found here. Otherwise I guess the only way > out will be comparing the behaviour of the 4.4 Rockchip kernel (which > works correctly) against mainline. I expect this to be a long and > painful process, though. Is there any way to compare directly the driver used by the vendor kernel with the vanilla driver? Such as porting one of the drivers to run in the other kernel? Alternatively, can one get additional debugging information from the dwc2 driver in its disconnect pathway? I don't know what would be expected to show up in the log if the driver deliberately dropped the connection. Alan Stern ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 2025-05-02 17:56 ` Alan Stern @ 2025-05-09 7:17 ` Luca Ceresoli 2025-05-13 7:35 ` Minas Harutyunyan 0 siblings, 1 reply; 12+ messages in thread From: Luca Ceresoli @ 2025-05-09 7:17 UTC (permalink / raw) To: Alan Stern, Minas Harutyunyan Cc: linux-usb@vger.kernel.org, Kever Yang, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier Hello Alan, Minas, Minas: I am reporting new relevant findings in this e-mail and have questions for you below. On Fri, 2 May 2025 13:56:01 -0400 Alan Stern <stern@rowland.harvard.edu> wrote: > On Fri, May 02, 2025 at 03:53:08PM +0200, Luca Ceresoli wrote: > > Hello Alan, > > > > thanks for your continued support! > > > > On Tue, 15 Apr 2025 12:14:58 -0400 > > Alan Stern <stern@rowland.harvard.edu> wrote: > > > > [...] > > > > > > > > It's quite possible that you're getting messed up by link power > > > > > > management (LPM). But that's just a guess. > > > > > > > > What would be a symptom, if that happened? > > > > > > The debugging log wouldn't show much unless something went wrong. You > > > could see if there are any files containing "lpm" in their names in the > > > /sys/bus/usb/devices/3-3.4/ directory (while the device is connected) > > > and what they contain. Also, there's a way to disable LPM on the host > > > by setting a usbcore quirks module parameter: > > > > > > echo 1209:0001:k >/sys/module/usbcore/parameters/quirks > > > > Tried this. There is no file with 'lpm' in the name in > > /sys/bus/usb/devices/3-3.4/, and adding the quirk did not change the > > result: still a disconnect and reconnect in ~6 seconds. > > Okay, so LPM doesn't seem to be the reason. I see, thanks for checking. > > > You could also try connecting a usbmon trace for bus 3, showing what > > > happens during the initial connection and ensuing disconnection. Any > > > LPM transitions would show up in the trace. > > > > Tried this, and here are the few lines before and after the 5~6 seconds > > delay. > > > > ffff99621e768840 4009009102 C Bi:1:009:3 0 2 = 696e > > ffff99621e768840 4009009104 S Bi:1:009:3 -115 256 < > > ffff99621e768300 4009009115 S Bi:1:009:3 -115 256 < > > ffff99621e768840 4009009144 C Bi:1:009:3 0 6 = 3a383534 2033 > > ffff99621e768300 4009009155 C Bi:1:009:3 0 1 = 37 > > ffff99621e768840 4009009178 C Bi:1:009:3 0 2 = 0d0a > > ffff99621e768840 4009009180 S Bi:1:009:3 -115 256 < > > ffff996080f11900 4009009361 C Ci:1:014:0 0 26 = 1a034300 44004300 20004100 43004d00 20004400 61007400 6100 > > ffff99621e768300 4009009615 C Bi:1:009:3 0 3 = 5b2020 > > ffff99621e768300 4009009624 S Bi:1:009:3 -115 256 < > > ffff99621e768840 4009009645 C Bi:1:009:3 0 3 = 203233 > > ffff99621e768840 4009009646 S Bi:1:009:3 -115 256 < > > ffff99621e768300 4009009692 C Bi:1:009:3 0 4 = 2e383738 > > ffff99621e768300 4009009694 S Bi:1:009:3 -115 256 < > > ffff99621e768840 4009009703 C Bi:1:009:3 0 2 = 3731 > > ffff99621e768840 4009009722 S Bi:1:009:3 -115 256 < > > ffff99621e768840 4009009933 C Bi:1:009:3 0 2 = 7472 > > It looks like device 9 (the lines containing :1:009:3) and device 14 are > unrelated to the problem; neither of them is your DWC2 device. That's probably because I ha connected an entire USB HUB to the laptop, which had in turn a USB-serial adapter to access the console on the board headers. I understand this creates more noise, so I changed my setup later on to only connect the relevant cable. > > <<< 6 seconds delay >>> > > > > ffff9960828e9540 4014796128 C Ii:1:001:1 0:2048 2 = 1000 > > ffff9960828e9540 4014796145 S Ii:1:001:1 -115:2048 4 < > > ffff996080f11900 4014796162 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > > ffff996080f11900 4014796189 C Ci:1:001:0 0 4 = 00010100 > > This shows the host system receiving a disconnect notification (for port > 4) from the hardware. Which is odd, because earlier you said the device > was 3-3.4, indicating that it was plugged into a hub, not directly into > the host controller. But the notification here comes from the host > controller. > > On the other hand, an even earlier email said that the device was 3-2, > indicating it _was_ plugged directly into the host controller > > Which means you've been changing your setup while running these tests. > Not a good idea. I had to change laptop because reading usbmon debugfs files is not working on my main laptop. I still haven't figured out the reason, but on the other laptop it works, but unavoidably it changes the bus number. Sorry about that. > > ffff996080f11900 4014796201 S Co:1:001:0 s 23 01 0010 0004 0000 0 > > ffff996080f11900 4014796219 C Co:1:001:0 0 0 > > ffff996080f11000 4014799627 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > > ffff996080f11000 4014799679 C Ci:1:001:0 0 4 = 00010000 > > ffff996080f11000 4014826132 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > > ffff996080f11000 4014826166 C Ci:1:001:0 0 4 = 00010000 > > ffff996080f11000 4014852075 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > > ffff996080f11000 4014852122 C Ci:1:001:0 0 4 = 00010000 > > ffff996080f11000 4014878210 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > > ffff996080f11000 4014878253 C Ci:1:001:0 0 4 = 00010000 > > ffff996080f11000 4014904049 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > > ffff996080f11000 4014904088 C Ci:1:001:0 0 4 = 00010000 > > ffff9960828e9540 4014948427 C Ii:1:001:1 0:2048 2 = 1000 > > ffff9960828e9540 4014948456 S Ii:1:001:1 -115:2048 4 < > > ffff99621e768300 4014948461 C Bi:1:009:3 0 2 = 5b20 > > ffff99621e768300 4014948472 S Bi:1:009:3 -115 256 < > > ffff99621e768840 4014948488 C Bi:1:009:3 0 2 = 2020 > > ffff99621e768840 4014948489 S Bi:1:009:3 -115 256 < > > ffff996080f11000 4014948522 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < > > ffff99621e768300 4014948545 C Bi:1:009:3 0 58 = 32392e38 31373337 325d203e 3e3e2064 7763325f 68616e64 6c655f63 6f6d6d6f > > ffff99621e768300 4014948560 S Bi:1:009:3 -115 256 < > > ffff996080f11000 4014948607 C Ci:1:001:0 0 4 = 01010100 > > And then about 150 ms later (the second column of the log is a > timestamp, in microseconds), a connection notification. Nothing > preceding the disconnect to explain what caused it. > > > ffff99621e768840 4014948639 C Bi:1:009:3 0 10 = 37395d20 3e3e3e20 6477 > > ffff99621e768840 4014948644 S Bi:1:009:3 -115 256 < > > ffff99621e768300 4014948657 C Bi:1:009:3 0 3 = 63325f > > ffff99621e768300 4014948663 S Bi:1:009:3 -115 256 < > > ffff99621e768840 4014948689 C Bi:1:009:3 0 5 = 68736f74 67 > > ffff99621e768840 4014948693 S Bi:1:009:3 -115 256 < > > ffff99621e768300 4014948718 C Bi:1:009:3 0 2 = 5f69 > > ffff99621e768300 4014948720 S Bi:1:009:3 -115 256 < > > ffff99621e768840 4014948759 C Bi:1:009:3 0 4 = 72713a33 > > Unrelated material. Evidently device 9 is running some sort of > serial connection, because everything it sends looks like ASCII > characters. Perhaps the USB-serial I mentioned above, to access the board console. > > However IIUC both the usbmon debugfs interface and Wireshark are unable > > to capture disconnection events because that's handled by the hardware. > > Correct? > > I'm not quite sure how to answer. Yes, the hardware handles > disconnections -- because the hardware handles _everything_ that happens > on the USB bus. And one of the things the hardware does when handling > disconnections is to tell the driver that one occurred; that's why the > report shows up in the usbmon (or Wireshark) trace. > > A USB analyzer could tell you exactly what's happening on the wire, but > they are expensive. And in this case, I think all it would tell you is > what we already know: that a disconnect happened. > > The fact that the disconnects don't happen with the vendor kernel > indicates that they aren't caused by a hardware problem, such as a bad > cable link, but rather by something in the device's software, i.e., the > dwc2 driver. > > I don't know anything about that driver, though. Minas is the expert. > You really need his advice. In the meanwhile I did two event captures, one with the mainline kernel and one with the vendor kernel, using the same laptop setup and no hub in between, and for each test I captured both the usbmon log and a wireshark file. Both are available if needed. By analyzing those captures I found that the communication between host and gadget is almost identical. The only differenceis the get configuration descriptor response has one more descriptor in the vendor case (the working one). Here it is: OTG Descriptor: bLength 3 bDescriptorType 9 bmAttributes 0x03 SRP (Session Request Protocol) HNP (Host Negotiation Protocol) I don't know exacty what that implies, but for a quick test I went in the mainline kernel and found that it can add the same descriptor if both of these is true: * dr_mode = "otg" in device tree * "DWC2 Mode Selection" is "Dual role mode" in kconfig (i.e. CONFIG_USB_DWC2_DUAL_ROLE=y) While I had: * dr_mode = "peripheral" * "DWC2 Mode Selection" = "Gadget only mode" (i.e. CONFIG_USB_DWC2_PERIPHERAL=y) With those two changes the mainline kernel now behaves correctly, just like the vendor kernel. No more disconnection after 5-6 seconds. For the records, the vendor kernel already had dr_mode = "otg" and CONFIG_USB_DWC2_DUAL_ROLE=y. Based on my very limited knowledge of USB, intuitively it looks that: * in peripheral-only mode the OTG Descriptor should not be sent * in peripheral-only mode SRP does not make sense * in peripheral-only mode HNP does not make sense Are the above correct? Whether the answer, I think these new findings do not yet explain the problem nor point to a correct solution. Apart from the added descriptor, all of the initial enumeration events seen by usbmon is identical in the two cases. Minas, were you able to have a look at the info I collected? Do they suggesting you anything about the dwc2 driver? Best regards, Luca -- Luca Ceresoli, Bootlin Embedded Linux and Kernel engineering https://bootlin.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 2025-05-09 7:17 ` Luca Ceresoli @ 2025-05-13 7:35 ` Minas Harutyunyan 2025-05-20 12:09 ` Luca Ceresoli 0 siblings, 1 reply; 12+ messages in thread From: Minas Harutyunyan @ 2025-05-13 7:35 UTC (permalink / raw) To: Luca Ceresoli, Alan Stern Cc: linux-usb@vger.kernel.org, Kever Yang, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier Hi Luca, On 5/9/25 11:17, Luca Ceresoli wrote: > Hello Alan, Minas, > > Minas: I am reporting new relevant findings in this e-mail and have > questions for you below. > > On Fri, 2 May 2025 13:56:01 -0400 > Alan Stern <stern@rowland.harvard.edu> wrote: > >> On Fri, May 02, 2025 at 03:53:08PM +0200, Luca Ceresoli wrote: >>> Hello Alan, >>> >>> thanks for your continued support! >>> >>> On Tue, 15 Apr 2025 12:14:58 -0400 >>> Alan Stern <stern@rowland.harvard.edu> wrote: >>> >>> [...] >>> >>>>>>> It's quite possible that you're getting messed up by link power >>>>>>> management (LPM). But that's just a guess. >>>>> >>>>> What would be a symptom, if that happened? >>>> >>>> The debugging log wouldn't show much unless something went wrong. You >>>> could see if there are any files containing "lpm" in their names in the >>>> /sys/bus/usb/devices/3-3.4/ directory (while the device is connected) >>>> and what they contain. Also, there's a way to disable LPM on the host >>>> by setting a usbcore quirks module parameter: >>>> >>>> echo 1209:0001:k >/sys/module/usbcore/parameters/quirks >>> >>> Tried this. There is no file with 'lpm' in the name in >>> /sys/bus/usb/devices/3-3.4/, and adding the quirk did not change the >>> result: still a disconnect and reconnect in ~6 seconds. >> >> Okay, so LPM doesn't seem to be the reason. > > I see, thanks for checking. > >>>> You could also try connecting a usbmon trace for bus 3, showing what >>>> happens during the initial connection and ensuing disconnection. Any >>>> LPM transitions would show up in the trace. >>> >>> Tried this, and here are the few lines before and after the 5~6 seconds >>> delay. >>> >>> ffff99621e768840 4009009102 C Bi:1:009:3 0 2 = 696e >>> ffff99621e768840 4009009104 S Bi:1:009:3 -115 256 < >>> ffff99621e768300 4009009115 S Bi:1:009:3 -115 256 < >>> ffff99621e768840 4009009144 C Bi:1:009:3 0 6 = 3a383534 2033 >>> ffff99621e768300 4009009155 C Bi:1:009:3 0 1 = 37 >>> ffff99621e768840 4009009178 C Bi:1:009:3 0 2 = 0d0a >>> ffff99621e768840 4009009180 S Bi:1:009:3 -115 256 < >>> ffff996080f11900 4009009361 C Ci:1:014:0 0 26 = 1a034300 44004300 20004100 43004d00 20004400 61007400 6100 >>> ffff99621e768300 4009009615 C Bi:1:009:3 0 3 = 5b2020 >>> ffff99621e768300 4009009624 S Bi:1:009:3 -115 256 < >>> ffff99621e768840 4009009645 C Bi:1:009:3 0 3 = 203233 >>> ffff99621e768840 4009009646 S Bi:1:009:3 -115 256 < >>> ffff99621e768300 4009009692 C Bi:1:009:3 0 4 = 2e383738 >>> ffff99621e768300 4009009694 S Bi:1:009:3 -115 256 < >>> ffff99621e768840 4009009703 C Bi:1:009:3 0 2 = 3731 >>> ffff99621e768840 4009009722 S Bi:1:009:3 -115 256 < >>> ffff99621e768840 4009009933 C Bi:1:009:3 0 2 = 7472 >> >> It looks like device 9 (the lines containing :1:009:3) and device 14 are >> unrelated to the problem; neither of them is your DWC2 device. > > That's probably because I ha connected an entire USB HUB to the laptop, > which had in turn a USB-serial adapter to access the console on the > board headers. I understand this creates more noise, so I changed my > setup later on to only connect the relevant cable. > >>> <<< 6 seconds delay >>> >>> >>> ffff9960828e9540 4014796128 C Ii:1:001:1 0:2048 2 = 1000 >>> ffff9960828e9540 4014796145 S Ii:1:001:1 -115:2048 4 < >>> ffff996080f11900 4014796162 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < >>> ffff996080f11900 4014796189 C Ci:1:001:0 0 4 = 00010100 >> >> This shows the host system receiving a disconnect notification (for port >> 4) from the hardware. Which is odd, because earlier you said the device >> was 3-3.4, indicating that it was plugged into a hub, not directly into >> the host controller. But the notification here comes from the host >> controller. >> >> On the other hand, an even earlier email said that the device was 3-2, >> indicating it _was_ plugged directly into the host controller >> >> Which means you've been changing your setup while running these tests. >> Not a good idea. > > I had to change laptop because reading usbmon debugfs files is not > working on my main laptop. I still haven't figured out the reason, but > on the other laptop it works, but unavoidably it changes the bus > number. Sorry about that. > >>> ffff996080f11900 4014796201 S Co:1:001:0 s 23 01 0010 0004 0000 0 >>> ffff996080f11900 4014796219 C Co:1:001:0 0 0 >>> ffff996080f11000 4014799627 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < >>> ffff996080f11000 4014799679 C Ci:1:001:0 0 4 = 00010000 >>> ffff996080f11000 4014826132 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < >>> ffff996080f11000 4014826166 C Ci:1:001:0 0 4 = 00010000 >>> ffff996080f11000 4014852075 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < >>> ffff996080f11000 4014852122 C Ci:1:001:0 0 4 = 00010000 >>> ffff996080f11000 4014878210 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < >>> ffff996080f11000 4014878253 C Ci:1:001:0 0 4 = 00010000 >>> ffff996080f11000 4014904049 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < >>> ffff996080f11000 4014904088 C Ci:1:001:0 0 4 = 00010000 >>> ffff9960828e9540 4014948427 C Ii:1:001:1 0:2048 2 = 1000 >>> ffff9960828e9540 4014948456 S Ii:1:001:1 -115:2048 4 < >>> ffff99621e768300 4014948461 C Bi:1:009:3 0 2 = 5b20 >>> ffff99621e768300 4014948472 S Bi:1:009:3 -115 256 < >>> ffff99621e768840 4014948488 C Bi:1:009:3 0 2 = 2020 >>> ffff99621e768840 4014948489 S Bi:1:009:3 -115 256 < >>> ffff996080f11000 4014948522 S Ci:1:001:0 s a3 00 0000 0004 0004 4 < >>> ffff99621e768300 4014948545 C Bi:1:009:3 0 58 = 32392e38 31373337 325d203e 3e3e2064 7763325f 68616e64 6c655f63 6f6d6d6f >>> ffff99621e768300 4014948560 S Bi:1:009:3 -115 256 < >>> ffff996080f11000 4014948607 C Ci:1:001:0 0 4 = 01010100 >> >> And then about 150 ms later (the second column of the log is a >> timestamp, in microseconds), a connection notification. Nothing >> preceding the disconnect to explain what caused it. >> >>> ffff99621e768840 4014948639 C Bi:1:009:3 0 10 = 37395d20 3e3e3e20 6477 >>> ffff99621e768840 4014948644 S Bi:1:009:3 -115 256 < >>> ffff99621e768300 4014948657 C Bi:1:009:3 0 3 = 63325f >>> ffff99621e768300 4014948663 S Bi:1:009:3 -115 256 < >>> ffff99621e768840 4014948689 C Bi:1:009:3 0 5 = 68736f74 67 >>> ffff99621e768840 4014948693 S Bi:1:009:3 -115 256 < >>> ffff99621e768300 4014948718 C Bi:1:009:3 0 2 = 5f69 >>> ffff99621e768300 4014948720 S Bi:1:009:3 -115 256 < >>> ffff99621e768840 4014948759 C Bi:1:009:3 0 4 = 72713a33 >> >> Unrelated material. Evidently device 9 is running some sort of >> serial connection, because everything it sends looks like ASCII >> characters. > > Perhaps the USB-serial I mentioned above, to access the board console. > >>> However IIUC both the usbmon debugfs interface and Wireshark are unable >>> to capture disconnection events because that's handled by the hardware. >>> Correct? >> >> I'm not quite sure how to answer. Yes, the hardware handles >> disconnections -- because the hardware handles _everything_ that happens >> on the USB bus. And one of the things the hardware does when handling >> disconnections is to tell the driver that one occurred; that's why the >> report shows up in the usbmon (or Wireshark) trace. >> >> A USB analyzer could tell you exactly what's happening on the wire, but >> they are expensive. And in this case, I think all it would tell you is >> what we already know: that a disconnect happened. >> >> The fact that the disconnects don't happen with the vendor kernel >> indicates that they aren't caused by a hardware problem, such as a bad >> cable link, but rather by something in the device's software, i.e., the >> dwc2 driver. >> >> I don't know anything about that driver, though. Minas is the expert. >> You really need his advice. > > In the meanwhile I did two event captures, one with the mainline kernel > and one with the vendor kernel, using the same laptop setup and no hub > in between, and for each test I captured both the usbmon log and a > wireshark file. Both are available if needed. > > By analyzing those captures I found that the communication between host > and gadget is almost identical. The only differenceis the get > configuration descriptor response has one more descriptor in the vendor > case (the working one). Here it is: > > OTG Descriptor: > bLength 3 > bDescriptorType 9 > bmAttributes 0x03 > SRP (Session Request Protocol) > HNP (Host Negotiation Protocol) > > I don't know exacty what that implies, but for a quick test I went in > the mainline kernel and found that it can add the same descriptor if > both of these is true: > > * dr_mode = "otg" in device tree > * "DWC2 Mode Selection" is "Dual role mode" in kconfig > (i.e. CONFIG_USB_DWC2_DUAL_ROLE=y) > > While I had: > > * dr_mode = "peripheral" > * "DWC2 Mode Selection" = "Gadget only mode" > (i.e. CONFIG_USB_DWC2_PERIPHERAL=y) > > With those two changes the mainline kernel now behaves correctly, just > like the vendor kernel. No more disconnection after 5-6 seconds. > > For the records, the vendor kernel already had dr_mode = "otg" and > CONFIG_USB_DWC2_DUAL_ROLE=y. > > Based on my very limited knowledge of USB, intuitively it looks that: > > * in peripheral-only mode the OTG Descriptor should not be sent > * in peripheral-only mode SRP does not make sense > * in peripheral-only mode HNP does not make sense > > Are the above correct? > > Whether the answer, I think these new findings do not yet explain the > problem nor point to a correct solution. Apart from the added > descriptor, all of the initial enumeration events seen by usbmon is > identical in the two cases. > > Minas, were you able to have a look at the info I collected? > Do they suggesting you anything about the dwc2 driver? > Configuration parameters: CONFIG_USB_DWC2_HOST, CONFIG_USB_DWC2_PERIPHERAL and CONFIG_USB_DWC2_DUAL_ROLE have impact only on build process. Based on these parameters driver can build as host only, device only or host + device. OTG functionality of depend on: 1. On core configuration - GHWCFG2 bits 0:2: Mode of Operation (OtgMode) 3'b000: HNP- and SRP-Capable OTG (Host & Device) 3'b001: SRP-Capable OTG (Host & Device) 3'b010: Non-HNP and Non-SRP Capable OTG (Host and Device) 3'b011: SRP-Capable Device 3'b100: Non-OTG Device 3'b101: SRP-Capable Host 3'b110: Non-OTG Host Others: Reserved As you can see above, device only mode can support OTG, i.e. "SRP-capable device". Based on provided OTG descriptor your core's OTG mode is equal to 0, which means "HNP- and SRP-Capable OTG (Host & Device)". 2. Depend on platform (see dwc2/param.c) OTG functionality can be updated, if it allowed by above core configuration OTG parameter. 3. OTG functionality can updated also through devicetree parameters settings. Thanks, Minas > Best regards, > Luca > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 2025-05-13 7:35 ` Minas Harutyunyan @ 2025-05-20 12:09 ` Luca Ceresoli 2025-06-12 13:23 ` Louis Chauvet 0 siblings, 1 reply; 12+ messages in thread From: Luca Ceresoli @ 2025-05-20 12:09 UTC (permalink / raw) To: Minas Harutyunyan Cc: Alan Stern, linux-usb@vger.kernel.org, Kever Yang, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier Hello Minas, On Tue, 13 May 2025 07:35:40 +0000 Minas Harutyunyan <Minas.Harutyunyan@synopsys.com> wrote: > >> I don't know anything about that driver, though. Minas is the expert. > >> You really need his advice. > > > > In the meanwhile I did two event captures, one with the mainline kernel > > and one with the vendor kernel, using the same laptop setup and no hub > > in between, and for each test I captured both the usbmon log and a > > wireshark file. Both are available if needed. > > > > By analyzing those captures I found that the communication between host > > and gadget is almost identical. The only differenceis the get > > configuration descriptor response has one more descriptor in the vendor > > case (the working one). Here it is: > > > > OTG Descriptor: > > bLength 3 > > bDescriptorType 9 > > bmAttributes 0x03 > > SRP (Session Request Protocol) > > HNP (Host Negotiation Protocol) > > > > I don't know exacty what that implies, but for a quick test I went in > > the mainline kernel and found that it can add the same descriptor if > > both of these is true: > > > > * dr_mode = "otg" in device tree > > * "DWC2 Mode Selection" is "Dual role mode" in kconfig > > (i.e. CONFIG_USB_DWC2_DUAL_ROLE=y) > > > > While I had: > > > > * dr_mode = "peripheral" > > * "DWC2 Mode Selection" = "Gadget only mode" > > (i.e. CONFIG_USB_DWC2_PERIPHERAL=y) > > > > With those two changes the mainline kernel now behaves correctly, just > > like the vendor kernel. No more disconnection after 5-6 seconds. > > > > For the records, the vendor kernel already had dr_mode = "otg" and > > CONFIG_USB_DWC2_DUAL_ROLE=y. > > > > Based on my very limited knowledge of USB, intuitively it looks that: > > > > * in peripheral-only mode the OTG Descriptor should not be sent > > * in peripheral-only mode SRP does not make sense > > * in peripheral-only mode HNP does not make sense > > > > Are the above correct? > > > > Whether the answer, I think these new findings do not yet explain the > > problem nor point to a correct solution. Apart from the added > > descriptor, all of the initial enumeration events seen by usbmon is > > identical in the two cases. > > > > Minas, were you able to have a look at the info I collected? > > Do they suggesting you anything about the dwc2 driver? > > > Configuration parameters: CONFIG_USB_DWC2_HOST, > CONFIG_USB_DWC2_PERIPHERAL and CONFIG_USB_DWC2_DUAL_ROLE have impact > only on build process. Based on these parameters driver can build as > host only, device only or host + device. > > OTG functionality of depend on: > 1. On core configuration - GHWCFG2 bits 0:2: > Mode of Operation (OtgMode) > 3'b000: HNP- and SRP-Capable OTG (Host & Device) > 3'b001: SRP-Capable OTG (Host & Device) > 3'b010: Non-HNP and Non-SRP Capable OTG (Host and Device) > 3'b011: SRP-Capable Device > 3'b100: Non-OTG Device > 3'b101: SRP-Capable Host > 3'b110: Non-OTG Host > Others: Reserved > As you can see above, device only mode can support OTG, i.e. > "SRP-capable device". > Based on provided OTG descriptor your core's OTG mode is equal to 0, > which means "HNP- and SRP-Capable OTG (Host & Device)". > 2. Depend on platform (see dwc2/param.c) OTG functionality can be > updated, if it allowed by above core configuration OTG parameter. > 3. OTG functionality can updated also through devicetree parameters > settings. Thanks for the clarification. FYI the GHWCFG2 value is 0x228e2450 on the RK3308, so OtgMode = "3'b000: HNP- and SRP-Capable OTG (Host & Device)". And I confirm the outcome of my tests: A) if dr_mode = "otg" in DT AND CONFIG_USB_DWC2_DUAL_ROLE=y: - OTG descriptor is sent - no disconnection, no re-enumeration B) if dr_mode = "peripheral" in DT OR CONFIG_USB_DWC2_PERIPHERAL=y: - OTG descriptor is not sent - disconnection+enumeration after ~6 seconds The disconnection in case B should _not_ happen. The presence/absence of the OTG descriptor is not wrong AFAICU. I'm mentioning it just because it might give some clues. I did a comparison of /sys/kernel/debug/usb/ff400000.usb/regdump in cases A and B. The only relevant difference is that bit USBOTG_GUSBCFG.ForceDevMode is 1 in case B. Based on the TRM, this seems correct. Doing other checks on registers and adding some logging to the code showed everything appears to be configured correctly. So, nothing explains why after about 6 seconds there is a disconnect. My tests are done on mainline Linux v6.15-rc2. Minas, do you have any hints or advice to understand why there is a disconnect about ~6 seconds after a successful enumeration in gadget mode? Luca -- Luca Ceresoli, Bootlin Embedded Linux and Kernel engineering https://bootlin.com ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 2025-05-20 12:09 ` Luca Ceresoli @ 2025-06-12 13:23 ` Louis Chauvet 0 siblings, 0 replies; 12+ messages in thread From: Louis Chauvet @ 2025-06-12 13:23 UTC (permalink / raw) To: Luca Ceresoli, Minas Harutyunyan Cc: Alan Stern, linux-usb@vger.kernel.org, Kever Yang, Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Hervé Codina, Thomas Petazzoni, Stefan Wahren, Fabrice Gasnier, linux-arm-kernel, linux-phy, heiko, vkoul, kishon, linux-rockchip Le 20/05/2025 à 14:09, Luca Ceresoli a écrit : > Hello Minas, [...] > Minas, do you have any hints or advice to understand why there is a > disconnect about ~6 seconds after a successful enumeration in gadget > mode? +CC: vkoul@kernel.org, kishon@kernel.org, linux-rockchip@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-phy@lists.infradead.org, heiko@sntech.de Hello, I am a colleague of Luca and have been working on an issue related to USB peripheral disconnections over the past few days. I believe I have found some interesting insights that explain the disconnection. To provide some context, I have added traces in relevant functions to understand the sequence of events (filtered on interesting logs, see [1] for full logs): [... boot proccess ...] [ 0.628744] dwc2 ff400000.usb: mapped PA ff400000 to VA (____ptrval____) [ 0.629358] dwc2 ff400000.usb: supply vusb_d not found, using dummy regulator [ 0.630478] dwc2 ff400000.usb: supply vusb_a not found, using dummy regulator [ 0.631297] phy phy-ff008000.syscon:usb2phy@100.0: drivers/usb/dwc2/platform.c:157:dwc2_lowlevel_hw_enable [ 0.632406] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/phy-core.c:230:phy_init [ 0.633233] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:491:rockchip_usb2phy_init [ 0.634262] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:539:rockchip_usb2phy_init (before schedule_delayed_work otg_sm_work) [ 0.635602] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/phy-core.c:309:phy_power_on [ 0.636379] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:577:rockchip_usb2phy_power_on [ 0.637496] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:606:rockchip_usb2phy_power_on (before rockchip_usb2phy_reset) [... bunch of dwc2 ff400000.usb logs ...] [ 0.688708] phy phy-ff008000.syscon:usb2phy@100.0: drivers/usb/dwc2/platform.c:200:dwc2_lowlevel_hw_disable [ 0.689661] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/phy-core.c:360:phy_power_off [ 0.690442] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:620:rockchip_usb2phy_power_off [ 0.691522] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/phy-core.c:271:phy_exit [ 0.692261] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:648:rockchip_usb2phy_exit (before cancel_delayed_work_sync otg_sm_work/chg_work) [ 0.698909] dwmmc_rockchip ff490000.mmc: IDMAC supports 32-bit address mode. [... boot finished ...] [... call to my peripheral setup script ...] [ 44.493316] phy phy-ff008000.syscon:usb2phy@100.0: drivers/usb/dwc2/platform.c:157:dwc2_lowlevel_hw_enable [ 44.495078] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/phy-core.c:230:phy_init [ 44.495837] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:491:rockchip_usb2phy_init [ 44.496960] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:539:rockchip_usb2phy_init (before schedule_delayed_work otg_sm_work) [ 44.498317] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/phy-core.c:309:phy_power_on [ 44.499107] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:577:rockchip_usb2phy_power_on [ 44.500160] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:606:rockchip_usb2phy_power_on (before rockchip_usb2phy_reset) [... bunch of dwc2 ff400000.usb logs ...] [ 44.976346] dwc2 ff400000.usb: dwc2_hsotg_start_req: DXEPCTL=0x80028000 [... ~5 seconds delay ...] [ 50.660998] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:670:rockchip_usb2phy_otg_sm_work [ 50.662128] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:685:rockchip_usb2phy_otg_sm_work (rport->state=0 vbus_attach=1 rphy->chg_state=0 rphy->chg_type=0 extcon_get_state=0) [ 50.663862] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:831:rockchip_chg_detect_work [ 50.664973] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:839:rockchip_chg_detect_work (rphy->chg_state=0 rport->suspended=0) [ 50.666316] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:620:rockchip_usb2phy_power_off [ 50.769027] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:831:rockchip_chg_detect_work [ 50.770093] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:839:rockchip_chg_detect_work (rphy->chg_state=1 rport->suspended=1) [ 50.813030] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:831:rockchip_chg_detect_work [ 50.814083] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:839:rockchip_chg_detect_work (rphy->chg_state=2 rport->suspended=1) [ 50.857013] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:831:rockchip_chg_detect_work [ 50.858067] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:839:rockchip_chg_detect_work (rphy->chg_state=3 rport->suspended=1) [ 50.859409] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:670:rockchip_usb2phy_otg_sm_work [ 50.860475] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:685:rockchip_usb2phy_otg_sm_work (rport->state=1 vbus_attach=1 rphy->chg_state=5 rphy->chg_type=6 extcon_get_state=0) [ 50.862267] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:577:rockchip_usb2phy_power_on [ 50.863326] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:606:rockchip_usb2phy_power_on (before rockchip_usb2phy_reset) [... bunch of dwc2 ff400000.usb logs ...] [ 52.868992] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:670:rockchip_usb2phy_otg_sm_work [ 52.870108] phy phy-ff008000.syscon:usb2phy@100.0: drivers/phy/rockchip/phy-rockchip-inno-usb2.c:685:rockchip_usb2phy_otg_sm_work (rport->state=3 vbus_attach=1 rphy->chg_state=5 rphy->chg_type=6 extcon_get_state=0) [... repeated every 2 seconds ...] I found two potential issues that may be relevant: 1 - The workers rockchip_usb2phy_otg_sm_work and rockchip_chg_detect_work call rockchip_usb2phy_power_off/on directly, bypassing phy_power_off/on. This means the state of the PHY after a call to phy_power_on may not be the one expected by users of the struct phy api. 2 - The worker rockchip_chg_detect_work always disables the PHY at its first call, which creates the disconnection. For issue 1, here is the phy_power_on/off reference count: [ 0.635602] phy_power_on count = 1 [ 0.689661] phy_power_off count = 0 [ 44.498317] phy_power_on count = 1 // ./setup-cdc-serial [ 50.666316] rockchip_usb2phy_power_off => direct call to power_off! This causes issues in the dwc2 driver because the PHY is powered off even if the phy_power_on counter is not zero, leading to the disconnection [ 50.862267] rockchip_usb2phy_power_on => direct call to power_on and reset! This will reset the PHY even if the phy_power_on counter is not zero. This is my first time digging into the USB/PHY subsystem, so I might be wrong, but this behavior seems very strange. Once phy_power_on is called, I expect the PHY to never go off until phy_power_off is called. For issue 2, this is also a strange behavior. I understand that resetting the PHY before its first use might be necessary, but I don't understand why this must be delayed by 6 seconds. In my opinion, this reset sequence should occur synchronously during the phy_init call. I quickly looked at other PHY drivers, and none of them seem to have asynchronous behavior or any internal calls to power_on/power_off. Could you confirm if what I found is indeed the issue and if the rockchip_usb2phy driver needs to be fixed? Does anyone know why rockchip_usb2phy needs to perform these asynchronous tasks and these power_on/off calls? Let me know if you need more logs to understand the issue. Thanks a lot, Louis Chauvet [1]:https://paste.sr.ht/~fomys/26e0d03d03537a6eb3757fb709b6d5a81484eee3 -- Louis Chauvet, Bootlin Embedded Linux and Kernel engineering https://bootlin.com ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-06-12 13:24 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-04-14 16:54 DWC2 gadget: unexpected device reenumeration on Rockchip RK3308 Luca Ceresoli 2025-04-14 17:26 ` Alan Stern 2025-04-15 11:26 ` Minas Harutyunyan 2025-04-15 14:28 ` Luca Ceresoli 2025-04-15 15:58 ` Luca Ceresoli 2025-04-15 16:14 ` Alan Stern 2025-05-02 13:53 ` Luca Ceresoli 2025-05-02 17:56 ` Alan Stern 2025-05-09 7:17 ` Luca Ceresoli 2025-05-13 7:35 ` Minas Harutyunyan 2025-05-20 12:09 ` Luca Ceresoli 2025-06-12 13:23 ` Louis Chauvet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).