* dwc3 inconsistent gadget connection state?
From: John Stultz @ 2020-07-02 21:44 UTC
To: Felipe Balbi
Cc: Tejas Joglekar, Yang Fei, Anurag Kumar Vulisha, YongQin Liu,
Andrzej Pietrasiewicz, Thinh Nguyen, Linux USB List
Hey Felipe,
I've been tripping over an issue on my HiKey960 where with the usb-c
gadget cable connected, the gadget code doesn't consistently seem to
initialize properly. I had rarely seen this behavior previously, but
more recently it has become more frequent and annoying.
Usually, unplugging and replugging the USB-C cable would get things
working again (but that's not helpful in test labs).
I annotated a bunch of code trying to understand what was going on and
I narrowed down the difference in the good and bad case to dwc3
reset interrupts happening after usb_gadget_probe_driver() completes.
In the good case, we see the reset interrupts, and in the failed case
we don't.
[ 16.491953] JDB: usb_gadget_probe_driver
[ 16.495938] JDB: udc_bind_to_driver
[ 16.499555] JDB: dwc3_gadget_start irq: 65 revision: 1429417994
[ 16.503803] JDB: __dwc3_gadget_ep_enable
[ 16.507791] JDB: __dwc3_gadget_ep_enable
[ 16.511715] JDB: dwc3_gadget_enable_irq
[ 16.515582] JDB: usb_udc_connect_control
[ 16.519510] JDB: usb_gadget_connect
<in the bad case, this is all we see, the gadget device doesn't come up>
[ 16.811010] JDB: dwc3_gadget_interrupt
[ 16.814783] JDB: dwc3_gadget_reset_interrupt
[ 16.819047] JDB: dwc3_reset_gadget
[ 16.823935] JDB: dwc3_gadget_interrupt
[ 16.827686] JDB: __dwc3_gadget_ep_enable
[ 16.831611] JDB: __dwc3_gadget_ep_enable
[ 16.994477] JDB: dwc3_gadget_interrupt
[ 16.998246] JDB: dwc3_gadget_reset_interrupt
[ 17.002519] JDB: dwc3_reset_gadget
[ 17.005922] JDB: usb_gadget_udc_reset
[ 17.062422] JDB: usb_gadget_set_state state: 5
[ 17.067069] JDB: dwc3_gadget_interrupt
[ 17.070823] JDB: __dwc3_gadget_ep_enable
[ 17.074745] JDB: __dwc3_gadget_ep_enable
[ 17.170898] JDB: usb_gadget_set_state state: 6
[ 17.195605] JDB: usb_gadget_set_state state: 7
[ 17.200179] JDB: __dwc3_gadget_ep_enable
[ 17.204118] JDB: __dwc3_gadget_ep_enable
[ 17.208057] JDB: usb_gadget_vbus_draw
[ 17.211721] JDB: usb_gadget_set_state state: 7
<in the good case everything is happy here>
This sounds a bit like the issue in the comment here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/dwc3/gadget.c?h=v5.8-rc3#n3143
However, I've tried calling dwc3_gadget_reset_interrupt() and
dwc3_reset_gadget() at the tail end of dwc3_gadget_start() but that
doesn't seem to help.
I was curious if you or anyone else had any thoughts on how to debug
this further?
thanks
-john
* Re: dwc3 inconsistent gadget connection state?
From: Jun Li @ 2020-07-03 2:55 UTC
To: John Stultz
Cc: Felipe Balbi, Tejas Joglekar, Yang Fei, Anurag Kumar Vulisha, YongQin Liu,
Andrzej Pietrasiewicz, Thinh Nguyen, Linux USB List

On Fri, Jul 3, 2020 at 5:46 AM John Stultz <john.stultz@linaro.org> wrote:
>
> Hey Felipe,
>
> I've been tripping over an issue on my HiKey960 where with the usb-c
> gadget cable connected, the gadget code doesn't consistently seem to
> initialize properly. I had rarely seen this behavior previously, but
> more recently it has become more frequent and annoying.
>
> Usually, unplugging and replugging the USB-C cable would get things
> working again (but that's not helpful in test labs).
>
> I annotated a bunch of code trying to understand what was going on and
> I narrowed down the difference in the good and bad case to dwc3
> reset interrupts happening after usb_gadget_probe_driver() completes.
> In the good case, we see the reset interrupts, and in the failed case
> we don't.
>
> [ 16.491953] JDB: usb_gadget_probe_driver
> [ 16.495938] JDB: udc_bind_to_driver
> [ 16.499555] JDB: dwc3_gadget_start irq: 65 revision: 1429417994
> [ 16.503803] JDB: __dwc3_gadget_ep_enable
> [ 16.507791] JDB: __dwc3_gadget_ep_enable
> [ 16.511715] JDB: dwc3_gadget_enable_irq
> [ 16.515582] JDB: usb_udc_connect_control
> [ 16.519510] JDB: usb_gadget_connect
> <in the bad case, this is all we see, the gadget device doesn't come up>
> [ 16.811010] JDB: dwc3_gadget_interrupt
> [ 16.814783] JDB: dwc3_gadget_reset_interrupt
> [ 16.819047] JDB: dwc3_reset_gadget
> [ 16.823935] JDB: dwc3_gadget_interrupt
> [ 16.827686] JDB: __dwc3_gadget_ep_enable
> [ 16.831611] JDB: __dwc3_gadget_ep_enable
> [ 16.994477] JDB: dwc3_gadget_interrupt
> [ 16.998246] JDB: dwc3_gadget_reset_interrupt
> [ 17.002519] JDB: dwc3_reset_gadget
> [ 17.005922] JDB: usb_gadget_udc_reset
> [ 17.062422] JDB: usb_gadget_set_state state: 5
> [ 17.067069] JDB: dwc3_gadget_interrupt
> [ 17.070823] JDB: __dwc3_gadget_ep_enable
> [ 17.074745] JDB: __dwc3_gadget_ep_enable
> [ 17.170898] JDB: usb_gadget_set_state state: 6
> [ 17.195605] JDB: usb_gadget_set_state state: 7
> [ 17.200179] JDB: __dwc3_gadget_ep_enable
> [ 17.204118] JDB: __dwc3_gadget_ep_enable
> [ 17.208057] JDB: usb_gadget_vbus_draw
> [ 17.211721] JDB: usb_gadget_set_state state: 7
> <in the good case everything is happy here>
>
>
> This sounds a bit like the issue in the comment here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/dwc3/gadget.c?h=v5.8-rc3#n3143
>
> However, I've tried calling dwc3_gadget_reset_interrupt() and
> dwc3_reset_gadget() at the tail end of dwc3_gadget_start() but that
> doesn't seem to help.
>
> I was curious if you or anyone else had any thoughts on how to debug
> this further?

If you force your gadget to be USB2 (e.g. in dts)

+ maximum-speed = "high-speed";

will you still reproduce this issue?

Does your gadget connect to host super speed port directly via a C-to-A cable
in your test labs? or is there something in between?

Li Jun
>
> thanks
> -john
* Re: dwc3 inconsistent gadget connection state?
From: John Stultz @ 2020-07-03 3:08 UTC
To: Jun Li
Cc: Felipe Balbi, Tejas Joglekar, Yang Fei, Anurag Kumar Vulisha, YongQin Liu,
Andrzej Pietrasiewicz, Thinh Nguyen, Linux USB List

On Thu, Jul 2, 2020 at 7:55 PM Jun Li <lijun.kernel@gmail.com> wrote:
> On Fri, Jul 3, 2020 at 5:46 AM John Stultz <john.stultz@linaro.org> wrote:
> > I was curious if you or anyone else had any thoughts on how to debug
> > this further?
>
> If you force your gadget to be USB2 (e.g. in dts)
>
> + maximum-speed = "high-speed";
>
> will you still reproduce this issue?

Thanks for the suggestion! Unfortunately, I gave that a try, but still
reproduced the same issue with this setting.

Curious, what issue were you thinking this would help with?

> Does your gadget connect to host super speed port directly via a C-to-A cable
> in your test labs? or is there something in between?

I'm not sure of the details in the lab, however I can reproduce this
on my desk with a Host machine <-> USB hub <-> USB-C port.

Additionally, the board itself is a little complicated, in that the
USB-C port is USB2 only (however, it does have two USB-A USB3 ports
behind an on-board hub and a switch to decide if the USB-C or hub
ports are enabled since there is only one usb controller).

thanks
-john
* Re: dwc3 inconsistent gadget connection state?
From: Jun Li @ 2020-07-03 7:46 UTC
To: John Stultz
Cc: Felipe Balbi, Tejas Joglekar, Yang Fei, Anurag Kumar Vulisha, YongQin Liu,
Andrzej Pietrasiewicz, Thinh Nguyen, Linux USB List

On Fri, Jul 3, 2020 at 11:08 AM John Stultz <john.stultz@linaro.org> wrote:
>
> On Thu, Jul 2, 2020 at 7:55 PM Jun Li <lijun.kernel@gmail.com> wrote:
> > On Fri, Jul 3, 2020 at 5:46 AM John Stultz <john.stultz@linaro.org> wrote:
> > > I was curious if you or anyone else had any thoughts on how to debug
> > > this further?
> >
> > If you force your gadget to be USB2 (e.g. in dts)
> >
> > + maximum-speed = "high-speed";
> >
> > will you still reproduce this issue?
>
> Thanks for the suggestion! Unfortunately, I gave that a try, but still
> reproduced the same issue with this setting.
>
> Curious, what issue were you thinking this would help with?

I have seen device mode have problems on the super speed channel when
there is some switch device between the host and the Type-C port. In that
case it does not downgrade and enable the USB2 termination, so the host
can't detect my board's Type-C port.

>
> > Does your gadget connect to host super speed port directly via a C-to-A cable
> > in your test labs? or is there something in between?
>
> I'm not sure of the details in the lab, however I can reproduce this
> on my desk with a Host machine <-> USB hub <-> USB-C port.
>
> Additionally, the board itself is a little complicated, in that the
> USB-C port is USB2 only (however, it does have two USB-A USB3 ports
> behind an on-board hub and a switch to decide if the USB-C or hub
> ports are enabled since there is only one usb controller).

So actually you should limit the gadget speed to high speed for
your USB2-only Type-C port.

Can the host machine detect the connection when you plug in?

Li Jun
>
> thanks
> -john
* Re: dwc3 inconsistent gadget connection state?
From: John Stultz @ 2020-07-03 6:15 UTC
To: Felipe Balbi
Cc: Tejas Joglekar, Yang Fei, Anurag Kumar Vulisha, YongQin Liu,
Andrzej Pietrasiewicz, Thinh Nguyen, Linux USB List, Jun Li

On Thu, Jul 2, 2020 at 2:44 PM John Stultz <john.stultz@linaro.org> wrote:
>
> I've been tripping over an issue on my HiKey960 where with the usb-c
> gadget cable connected, the gadget code doesn't consistently seem to
> initialize properly. I had rarely seen this behavior previously, but
> more recently it has become more frequent and annoying.
>
> Usually, unplugging and replugging the USB-C cable would get things
> working again (but that's not helpful in test labs).
>
> I annotated a bunch of code trying to understand what was going on and
> I narrowed down the difference in the good and bad case to dwc3
> reset interrupts happening after usb_gadget_probe_driver() completes.
> In the good case, we see the reset interrupts, and in the failed case
> we don't.

So I've kept digging around on this, and started dumping registers at
the end of dwc3_gadget_start() and then dwc3_gadget_pullup() as that
still is called shortly after in both cases.

The one consistent difference between the working and not working case
I saw was the DWC3_DSTS_COREIDLE bit in the DWC3_DSTS register.

It seems when we get to gadget_start()/pullup() if the DSTS_COREIDLE
bit isn't on we won't get the reset irq.

I added a simple timeout loop to pullup() similar to the
DSTS_DEVCTRLHLT loop, but in the failure mode it always times out with
COREIDLE not being set.

Searching around hasn't provided any info on what COREIDLE actually
means, so I'm a bit in the dark. Any clues?

thanks
-john
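The timeout loop described in this message could look roughly like the following. This is a minimal, self-contained sketch and not John's actual patch: the bit positions follow drivers/usb/dwc3/core.h, but `wait_for_dsts_bit()`, `read_dsts_fn`, and the fake register are hypothetical names introduced here so the logic can run outside the kernel. In the driver the read would be dwc3_readl(dwc->regs, DWC3_DSTS) with a short delay between polls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in drivers/usb/dwc3/core.h. */
#define DSTS_COREIDLE   (UINT32_C(1) << 23)
#define DSTS_DEVCTRLHLT (UINT32_C(1) << 22)

/* Hypothetical accessor type standing in for a DSTS register read. */
typedef uint32_t (*read_dsts_fn)(void *ctx);

/*
 * Poll DSTS until any bit in `mask` is set, giving up after `retries`
 * reads. Returns true if the bit was seen, false on timeout.
 */
static bool wait_for_dsts_bit(read_dsts_fn read_dsts, void *ctx,
                              uint32_t mask, unsigned int retries)
{
    while (retries--) {
        if (read_dsts(ctx) & mask)
            return true;
        /* A real driver would delay here between polls. */
    }
    return false;
}

/* Fake register for illustration: reports COREIDLE after N reads. */
struct fake_dsts {
    unsigned int reads_until_idle;
};

static uint32_t fake_read_dsts(void *ctx)
{
    struct fake_dsts *f = ctx;

    if (f->reads_until_idle == 0)
        return DSTS_COREIDLE;
    f->reads_until_idle--;
    return 0;
}
```

Calling `wait_for_dsts_bit(fake_read_dsts, &state, DSTS_COREIDLE, 10)` returns true when the fake core goes idle within ten reads and false otherwise, which mirrors the "always times out" behaviour reported for the failing case.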
* RE: dwc3 inconsistent gadget connection state?
From: Anurag Kumar Vulisha @ 2020-07-03 7:57 UTC
To: John Stultz, Felipe Balbi
Cc: Tejas Joglekar, Yang Fei, YongQin Liu, Andrzej Pietrasiewicz,
Thinh Nguyen, Linux USB List, Jun Li

Hi John,

>-----Original Message-----
>From: John Stultz <john.stultz@linaro.org>
>Sent: Friday, July 3, 2020 11:46 AM
>To: Felipe Balbi <balbi@kernel.org>
>Cc: Tejas Joglekar <tejas.joglekar@synopsys.com>; Yang Fei
><fei.yang@intel.com>; Anurag Kumar Vulisha <anuragku@xilinx.com>;
>YongQin Liu <yongqin.liu@linaro.org>; Andrzej Pietrasiewicz
><andrzej.p@collabora.com>; Thinh Nguyen <thinhn@synopsys.com>; Linux
>USB List <linux-usb@vger.kernel.org>; Jun Li <lijun.kernel@gmail.com>
>Subject: Re: dwc3 inconsistent gadget connection state?
>
>On Thu, Jul 2, 2020 at 2:44 PM John Stultz <john.stultz@linaro.org> wrote:
>>
>> I've been tripping over an issue on my HiKey960 where with the usb-c
>> gadget cable connected, the gadget code doesn't consistently seem to
>> initialize properly. I had rarely seen this behavior previously, but
>> more recently it has become more frequent and annoying.
>>
>> Usually, unplugging and replugging the USB-C cable would get things
>> working again (but that's not helpful in test labs).
>>
>> I annotated a bunch of code trying to understand what was going on and
>> I narrowed down the difference in the good and bad case to dwc3
>> reset interrupts happening after usb_gadget_probe_driver() completes.
>> In the good case, we see the reset interrupts, and in the failed case
>> we don't.
>
>So I've kept digging around on this, and started dumping registers at the end
>of dwc3_gadget_start() and then dwc3_gadget_pullup() as that still is called
>shortly after in both cases.
>
>The one consistent difference between the working and not working case I
>saw was the DWC3_DSTS_COREIDLE bit in the DWC3_DSTS register.
>
>It seems when we get to gadget_start()/pullup() if the DSTS_COREIDLE bit
>isn't on we won't get the reset irq.
>
>I added a simple timeout loop to pullup() similar to the DSTS_DEVCTRLHLT
>loop, but in the failure mode it always times out with COREIDLE not being set.
>
>Searching around hasn't provided any info on what COREIDLE actually means,
>so I'm a bit in the dark. Any clues?
>

DSTS.CoreIdle bit indicates that the core processed all the RXFIFO data, updated the
Descriptors and is in idle state.
From your previous mail I understood that the USB-C connection is configured for
USB 2.0 only. Since you are facing an issue with reset, can you please try setting the
USB2PHYCFG.XCVRDLY bit. Enabling this bit adds an extra 2.5us delay between the
controller sending the command to configure the ULPI transceiver to HS mode and the
controller driving TxValid to 0 for sending a HS chirp signal. Please check if this
workaround works for you.

Thanks,
Anurag Kumar Vulisha
* Re: dwc3 inconsistent gadget connection state?
From: John Stultz @ 2020-08-05 5:32 UTC
To: Anurag Kumar Vulisha
Cc: Felipe Balbi, Tejas Joglekar, Yang Fei, YongQin Liu,
Andrzej Pietrasiewicz, Thinh Nguyen, Linux USB List, Jun Li

On Fri, Jul 3, 2020 at 12:57 AM Anurag Kumar Vulisha
<anuragku@xilinx.com> wrote:
> >On Thu, Jul 2, 2020 at 2:44 PM John Stultz <john.stultz@linaro.org> wrote:
> >The one consistent difference between the working and not working case I
> >saw was the DWC3_DSTS_COREIDLE bit in the DWC3_DSTS register.
> >
> >It seems when we get to gadget_start()/pullup() if the DSTS_COREIDLE bit
> >isn't on we won't get the reset irq.
> >
> >I added a simple timeout loop to pullup() similar to the DSTS_DEVCTRLHLT
> >loop, but in the failure mode it always times out with COREIDLE not being set.
> >
> >Searching around hasn't provided any info on what COREIDLE actually means,
> >so I'm a bit in the dark. Any clues?
> >
>
> DSTS.CoreIdle bit indicates that the core processed all the RXFIFO data, updated the
> Descriptors and is in idle state.
> From your previous mail I understood that the USB-C connection is configured for
> USB 2.0 only. Since you are facing an issue with reset, can you please try setting the
> USB2PHYCFG.XCVRDLY bit. Enabling this bit adds an extra 2.5us delay between the
> controller sending the command to configure the ULPI transceiver to HS mode and the
> controller driving TxValid to 0 for sending a HS chirp signal. Please check if this
> workaround works for you.

Hey Anurag!
Sorry for the slow response! I finally took a bit more time to
chase this issue today, and tried your suggestion above. Unfortunately
adding the XCVRDLY bit to the USB2PHYCFG register doesn't seem to
help. I see the same behavior either way. Thanks for the suggestion
though!

I can consistently detect the problem when the COREIDLE bit isn't set
after the dwc3_ep0_out_start() call in __dwc3_gadget_start():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/dwc3/gadget.c?h=v5.8#n2130

When it gets stuck off, the COREIDLE bit doesn't seem to ever come
back while the cable is plugged in.

Since unplugging and replugging the cable does seem to unstick this,
and since I can consistently detect when the problem has occurred, I
tweaked the code so we would return an error (and that error would be
handled in the calling dwc3_gadget_start() code). However, the device
then tries to initialize over and over, but the COREIDLE is still
stuck off.

So I tried a few times to see if I could reset via
dwc3_reset_gadget(), but that doesn't seem to actually do anything
that unsticks the core. Then I tried to mimic something similar to the
softreset code but that just ends up getting the code stuck elsewhere
(I see hard hangs and rcu warnings, but not sure where it goes awry).

So not much luck... Is there some recommendation for how to best reset
the hardware from the gadget.c code? Or is there a better place to try
to detect this COREIDLE stuck-off state and do something about it?

thanks
-john
* Re: dwc3 inconsistent gadget connection state?
From: Felipe Balbi @ 2020-07-03 9:54 UTC
To: John Stultz
Cc: Tejas Joglekar, Yang Fei, Anurag Kumar Vulisha, YongQin Liu,
Andrzej Pietrasiewicz, Thinh Nguyen, Linux USB List

Hi,

John Stultz <john.stultz@linaro.org> writes:
> I've been tripping over an issue on my HiKey960 where with the usb-c
> gadget cable connected, the gadget code doesn't consistently seem to
> initialize properly. I had rarely seen this behavior previously, but
> more recently it has become more frequent and annoying.
>
> Usually, unplugging and replugging the USB-C cable would get things
> working again (but that's not helpful in test labs).
>
> I annotated a bunch of code trying to understand what was going on and
> I narrowed down the difference in the good and bad case to dwc3
> reset interrupts happening after usb_gadget_probe_driver() completes.
> In the good case, we see the reset interrupts, and in the failed case
> we don't.
>
> [ 16.491953] JDB: usb_gadget_probe_driver
> [ 16.495938] JDB: udc_bind_to_driver
> [ 16.499555] JDB: dwc3_gadget_start irq: 65 revision: 1429417994
> [ 16.503803] JDB: __dwc3_gadget_ep_enable
> [ 16.507791] JDB: __dwc3_gadget_ep_enable
> [ 16.511715] JDB: dwc3_gadget_enable_irq
> [ 16.515582] JDB: usb_udc_connect_control
> [ 16.519510] JDB: usb_gadget_connect
> <in the bad case, this is all we see, the gadget device doesn't come up>
> [ 16.811010] JDB: dwc3_gadget_interrupt
> [ 16.814783] JDB: dwc3_gadget_reset_interrupt
> [ 16.819047] JDB: dwc3_reset_gadget
> [ 16.823935] JDB: dwc3_gadget_interrupt
> [ 16.827686] JDB: __dwc3_gadget_ep_enable
> [ 16.831611] JDB: __dwc3_gadget_ep_enable
> [ 16.994477] JDB: dwc3_gadget_interrupt
> [ 16.998246] JDB: dwc3_gadget_reset_interrupt
> [ 17.002519] JDB: dwc3_reset_gadget
> [ 17.005922] JDB: usb_gadget_udc_reset
> [ 17.062422] JDB: usb_gadget_set_state state: 5
> [ 17.067069] JDB: dwc3_gadget_interrupt
> [ 17.070823] JDB: __dwc3_gadget_ep_enable
> [ 17.074745] JDB: __dwc3_gadget_ep_enable
> [ 17.170898] JDB: usb_gadget_set_state state: 6
> [ 17.195605] JDB: usb_gadget_set_state state: 7
> [ 17.200179] JDB: __dwc3_gadget_ep_enable
> [ 17.204118] JDB: __dwc3_gadget_ep_enable
> [ 17.208057] JDB: usb_gadget_vbus_draw
> [ 17.211721] JDB: usb_gadget_set_state state: 7
> <in the good case everything is happy here>
>
>
> This sounds a bit like the issue in the comment here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/usb/dwc3/gadget.c?h=v5.8-rc3#n3143
>
> However, I've tried calling dwc3_gadget_reset_interrupt() and
> dwc3_reset_gadget() at the tail end of dwc3_gadget_start() but that
> doesn't seem to help.
>
> I was curious if you or anyone else had any thoughts on how to debug
> this further?

Try enabling dwc3 tracepoints and collecting working and failing
cases. If I were to guess, I would say there's a small race condition
between setting pullup and the transceiver sending the VBUS_VALID signal
to dwc3.

--
balbi
* Re: dwc3 inconsistent gadget connection state?
From: John Stultz @ 2020-07-04 5:51 UTC
To: Felipe Balbi
Cc: Tejas Joglekar, Yang Fei, Anurag Kumar Vulisha, YongQin Liu,
Andrzej Pietrasiewicz, Thinh Nguyen, Linux USB List

On Fri, Jul 3, 2020 at 2:54 AM Felipe Balbi <balbi@kernel.org> wrote:
> John Stultz <john.stultz@linaro.org> writes:
> > I was curious if you or anyone else had any thoughts on how to debug
> > this further?
>
> Try enabling dwc3 tracepoints and collecting working and failing
> cases. If I were to guess, I would say there's a small race condition
> between setting pullup and the transceiver sending the VBUS_VALID signal
> to dwc3.

Trace logs attached. Let me know if you have any further ideas!

thanks
-john

[-- Attachment #2: hikey960.tar.xz --]
[-- Type: application/octet-stream, Size: 10284 bytes --]
* Re: dwc3 inconsistent gadget connection state?
From: Felipe Balbi @ 2020-07-04 14:38 UTC
To: John Stultz
Cc: Tejas Joglekar, Yang Fei, Anurag Kumar Vulisha, YongQin Liu,
Andrzej Pietrasiewicz, Thinh Nguyen, Linux USB List

Hi,

John Stultz <john.stultz@linaro.org> writes:
> On Fri, Jul 3, 2020 at 2:54 AM Felipe Balbi <balbi@kernel.org> wrote:
>> John Stultz <john.stultz@linaro.org> writes:
>> > I was curious if you or anyone else had any thoughts on how to debug
>> > this further?
>>
>> Try enabling dwc3 tracepoints and collecting working and failing
>> cases. If I were to guess, I would say there's a small race condition
>> between setting pullup and the transceiver sending the VBUS_VALID signal
>> to dwc3.
>
> Trace logs attached. Let me know if you have any further ideas!

You can see from failure case that we never got a Reset event. This
happens, for instance, when dwc3 doesn't know that VBUS is above
VBUS_VALID threshold (4.4V). When the problem happens, I'm assuming USB
is completely dead, meaning that keeping the cable connected for longer
won't change anything, right?

In that case, could you dump DWC3 registers (there's a debugfs interface
for that)? I'm mostly interested in the PHY registers, both USB2 and
USB3. Check if the PHYs are suspended in the error case.

If they are, try enabling the quirk flags that disable suspend for the
PHYs (check binding documentation). If that helps, then discuss with
your Silicon Validation guys what are the requirements when it comes to
suspend. Some PHYs are inherently quirky and need some of the quirky
flags dwc3 provides.

Note that disabling suspend completely is a pretty large hammer that
should only be used if nothing else helps. Some PHYs are happy with a
simple delay of U1/U2/U3 entry but, again, check with your Silicon
Validation folks, likely they have already gone through this during chip
characterization.

cheers

--
balbi
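As a rough aid for the PHY check suggested above, the sketch below tests the suspend-enable bits in values copied from the debugfs regdump lines GUSB2PHYCFG(0) and GUSB3PIPECTL(0). The bit positions follow DWC3_GUSB2PHYCFG_SUSPHY and DWC3_GUSB3PIPECTL_SUSPHY in drivers/usb/dwc3/core.h; the helper names are hypothetical. Note the assumption being made: these bits say whether PHY suspend is enabled in the controller configuration, which is not the same as proving the PHY is suspended at that instant. The device-tree properties that clear them are snps,dis_u2_susphy_quirk and snps,dis_u3_susphy_quirk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Suspend-enable bits, as in drivers/usb/dwc3/core.h. */
#define GUSB2PHYCFG_SUSPHY  (UINT32_C(1) << 6)
#define GUSB3PIPECTL_SUSPHY (UINT32_C(1) << 17)

/* True if USB2 PHY suspend is enabled in a GUSB2PHYCFG(0) value. */
static bool usb2_susphy_enabled(uint32_t gusb2phycfg)
{
    return (gusb2phycfg & GUSB2PHYCFG_SUSPHY) != 0;
}

/* True if USB3 PHY suspend is enabled in a GUSB3PIPECTL(0) value. */
static bool usb3_susphy_enabled(uint32_t gusb3pipectl)
{
    return (gusb3pipectl & GUSB3PIPECTL_SUSPHY) != 0;
}
```

Feeding both helpers the values from a failing and a working dump would show whether the suspend configuration differs between the two cases; the thread does not include those two register values, so none are assumed here.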
* Re: dwc3 inconsistent gadget connection state?
From: John Stultz @ 2020-07-07 3:56 UTC
To: Felipe Balbi
Cc: Tejas Joglekar, Yang Fei, Anurag Kumar Vulisha, YongQin Liu,
Andrzej Pietrasiewicz, Thinh Nguyen, Linux USB List

On Sat, Jul 4, 2020 at 7:38 AM Felipe Balbi <balbi@kernel.org> wrote:
> John Stultz <john.stultz@linaro.org> writes:
> > On Fri, Jul 3, 2020 at 2:54 AM Felipe Balbi <balbi@kernel.org> wrote:
> >> John Stultz <john.stultz@linaro.org> writes:
> >> > I was curious if you or anyone else had any thoughts on how to debug
> >> > this further?
> >>
> >> Try enabling dwc3 tracepoints and collecting working and failing
> >> cases. If I were to guess, I would say there's a small race condition
> >> between setting pullup and the transceiver sending the VBUS_VALID signal
> >> to dwc3.
> >
> > Trace logs attached. Let me know if you have any further ideas!
>
> You can see from failure case that we never got a Reset event. This
> happens, for instance, when dwc3 doesn't know that VBUS is above
> VBUS_VALID threshold (4.4V). When the problem happens, I'm assuming USB
> is completely dead, meaning that keeping the cable connected for longer
> won't change anything, right?

Correct. The only way to get it working is to unplug and replug the
cable (sometimes more than once).

> In that case, could you dump DWC3 registers (there's a debugfs interface
> for that)? I'm mostly interested in the PHY registers, both USB2 and
> USB3. Check if the PHYs are suspended in the error case.

Here's a diff of the regdump in bad and good cases:
--- regdump.bad 2020-07-07 03:44:46.799514793 +0000
+++ regdump.good 2020-07-07 03:44:44.723534198 +0000
@@ -24,7 +24,7 @@
 GHWPARAMS7 = 0x04881e8d
 GDBGFIFOSPACE = 0x00420000
 GDBGLTSSM = 0x41090440
-GDBGBMU = 0xa0b08000
+GDBGBMU = 0x20300000
 GPRTBIMAP_HS0 = 0x00000000
 GPRTBIMAP_HS1 = 0x00000000
 GPRTBIMAP_FS0 = 0x00000000
@@ -162,29 +162,29 @@
 GEVNTSIZ(0) = 0x00001000
 GEVNTCOUNT(0) = 0x00000000
 GHWPARAMS8 = 0x00000fea
-DCFG = 0x00120804
-DCTL = 0x80f00000
+DCFG = 0x0052082c
+DCTL = 0x8cf00a00
 DEVTEN = 0x00001217
-DSTS = 0x00000000
+DSTS = 0x00820000
 DGCMDPAR = 0x00000000
 DGCMD = 0x00000000
-DALEPENA = 0x00000003
+DALEPENA = 0x0000000f
 DEPCMDPAR2(0) = 0x00000000
-DEPCMDPAR1(0) = 0x17a8e000
+DEPCMDPAR1(0) = 0x15935000
 DEPCMDPAR0(0) = 0x00000002
 DEPCMD(0) = 0x00000006
 DEPCMDPAR2(1) = 0x00000000
-DEPCMDPAR1(1) = 0x02000500
-DEPCMDPAR0(1) = 0x00001000
-DEPCMD(1) = 0x00000001
+DEPCMDPAR1(1) = 0x15935000
+DEPCMDPAR0(1) = 0x00000002
+DEPCMD(1) = 0x00010006
 DEPCMDPAR2(2) = 0x00000000
 DEPCMDPAR1(2) = 0x00000000
-DEPCMDPAR0(2) = 0x00000001
-DEPCMD(2) = 0x00030002
+DEPCMDPAR0(2) = 0x00000000
+DEPCMD(2) = 0x00020007
 DEPCMDPAR2(3) = 0x00000000
 DEPCMDPAR1(3) = 0x00000000
-DEPCMDPAR0(3) = 0x00000001
-DEPCMD(3) = 0x00040002
+DEPCMDPAR0(3) = 0x00000000
+DEPCMD(3) = 0x00030007
 DEPCMDPAR2(4) = 0x00000000
 DEPCMDPAR1(4) = 0x00000000
 DEPCMDPAR0(4) = 0x00000001

> If they are, try enabling the quirk flags that disable suspend for the
> PHYs (check binding documentation). If that helps, then discuss with
> your Silicon Validation guys what are the requirements when it comes to
> suspend. Some PHYs are inherently quirky and need some of the quirky
> flags dwc3 provides.
>
> Note that disabling suspend completely is a pretty large hammer that
> should only be used if nothing else helps. Some PHYs are happy with a
> simple delay of U1/U2/U3 entry but, again, check with your Silicon
> Validation folks, likely they have already gone through this during chip
> characterization.

Unfortunately I don't have any access to silicon validation folks.

There is already a number of the quirk bindings in use, but I'll
tinker around with them a bit to see if it causes any behavior change.

Thanks so much for the ideas and feedback! Much appreciated!
-john
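The two DSTS values in the dump above can be decoded with a few shifts and masks. The sketch below is illustrative only: the field layout follows the DWC3_DSTS_* definitions in drivers/usb/dwc3/core.h, while `struct dsts_fields` and `decode_dsts()` are hypothetical names. Under that layout, the good value 0x00820000 has COREIDLE (bit 23) and RXFIFOEMPTY (bit 17) set, and the bad value 0x00000000 has neither, which matches the earlier COREIDLE observation in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* DSTS field layout, following drivers/usb/dwc3/core.h. */
#define DSTS_CONNECTSPD(v) ((v) & 0x7u)
#define DSTS_SOFFN(v)      (((v) >> 3) & 0x3fffu)
#define DSTS_RXFIFOEMPTY   (UINT32_C(1) << 17)
#define DSTS_USBLNKST(v)   (((v) >> 18) & 0xfu)
#define DSTS_DEVCTRLHLT    (UINT32_C(1) << 22)
#define DSTS_COREIDLE      (UINT32_C(1) << 23)

/* Hypothetical container for the decoded fields. */
struct dsts_fields {
    unsigned int connect_speed;
    unsigned int frame_number;
    unsigned int link_state;
    bool rx_fifo_empty;
    bool dev_ctrl_halted;
    bool core_idle;
};

/* Split a raw DSTS register value into its fields. */
static struct dsts_fields decode_dsts(uint32_t v)
{
    struct dsts_fields f;

    f.connect_speed = DSTS_CONNECTSPD(v);
    f.frame_number = DSTS_SOFFN(v);
    f.link_state = DSTS_USBLNKST(v);
    f.rx_fifo_empty = (v & DSTS_RXFIFOEMPTY) != 0;
    f.dev_ctrl_halted = (v & DSTS_DEVCTRLHLT) != 0;
    f.core_idle = (v & DSTS_COREIDLE) != 0;
    return f;
}
```

The same approach applies to any other register in the dump, provided the field layout is taken from core.h or the controller databook rather than guessed.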
* Re: dwc3 inconsistent gadget connection state?
From: Felipe Balbi @ 2020-07-07 10:43 UTC
To: John Stultz
Cc: Tejas Joglekar, Yang Fei, Anurag Kumar Vulisha, YongQin Liu,
Andrzej Pietrasiewicz, Thinh Nguyen, Linux USB List

Hi,

John Stultz <john.stultz@linaro.org> writes:
> On Sat, Jul 4, 2020 at 7:38 AM Felipe Balbi <balbi@kernel.org> wrote:
>> John Stultz <john.stultz@linaro.org> writes:
>> > On Fri, Jul 3, 2020 at 2:54 AM Felipe Balbi <balbi@kernel.org> wrote:
>> >> John Stultz <john.stultz@linaro.org> writes:
>> >> > I was curious if you or anyone else had any thoughts on how to debug
>> >> > this further?
>> >>
>> >> Try enabling dwc3 tracepoints and collecting working and failing
>> >> cases. If I were to guess, I would say there's a small race condition
>> >> between setting pullup and the transceiver sending the VBUS_VALID signal
>> >> to dwc3.
>> >
>> > Trace logs attached. Let me know if you have any further ideas!
>>
>> You can see from failure case that we never got a Reset event. This
>> happens, for instance, when dwc3 doesn't know that VBUS is above
>> VBUS_VALID threshold (4.4V). When the problem happens, I'm assuming USB
>> is completely dead, meaning that keeping the cable connected for longer
>> won't change anything, right?
>
> Correct. The only way to get it working is to unplug and replug the
> cable (sometimes more than once).
>
>> In that case, could you dump DWC3 registers (there's a debugfs interface
>> for that)? I'm mostly interested in the PHY registers, both USB2 and
>> USB3. Check if the PHYs are suspended in the error case.
>
> Here's a diff of the regdump in bad and good cases:
> --- regdump.bad 2020-07-07 03:44:46.799514793 +0000
> +++ regdump.good 2020-07-07 03:44:44.723534198 +0000
> @@ -162,29 +162,29 @@
> GEVNTSIZ(0) = 0x00001000
> GEVNTCOUNT(0) = 0x00000000
> GHWPARAMS8 = 0x00000fea
> -DCFG = 0x00120804
> -DCTL = 0x80f00000
> +DCFG = 0x0052082c

the only interesting thing here is DCFG. Can you decode it?

> +DCTL = 0x8cf00a00

IIRC, this is only telling you that your controller is in U0 or
something like that. Not interesting.

>> If they are, try enabling the quirk flags that disable suspend for the
>> PHYs (check binding documentation). If that helps, then discuss with
>> your Silicon Validation guys what are the requirements when it comes to
>> suspend. Some PHYs are inherently quirky and need some of the quirky
>> flags dwc3 provides.
>>
>> Note that disabling suspend completely is a pretty large hammer that
>> should only be used if nothing else helps. Some PHYs are happy with a
>> simple delay of U1/U2/U3 entry but, again, check with your Silicon
>> Validation folks, likely they have already gone through this during chip
>> characterization.
>
> Unfortunately I don't have any access to silicon validation folks.

no publicly available Errata List either? Do you know which PHY IP this
platform uses?

> There is already a number of the quirk bindings in use, but I'll
> tinker around with them a bit to see if it causes any behavior change.

Would be great to review those with people who were involved with the
actual Silicon development, but if you don't have access to them, the
discussion is moot :-s

> Thanks so much for the ideas and feedback! Much appreciated!

no worries ;-)

--
balbi
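For the DCFG decode requested above, a small sketch along the following lines may help. It is not taken from the thread: the masks follow the DWC3_DCFG_* definitions in drivers/usb/dwc3/core.h, and `struct dcfg_fields` and `decode_dcfg()` are hypothetical names. With this layout both dumped values carry speed field 4 (DWC3_DCFG_SUPERSPEED in core.h); the good value 0x0052082c additionally shows device address 5 and the LPM capable bit set, while the bad value 0x00120804 still has address 0, consistent with the host never having enumerated the gadget.

```c
#include <stdbool.h>
#include <stdint.h>

/* DCFG field layout, following drivers/usb/dwc3/core.h. */
#define DCFG_SPEED(v)   ((v) & 0x7u)
#define DCFG_DEVADDR(v) (((v) >> 3) & 0x7fu)
#define DCFG_NUMP(v)    (((v) >> 17) & 0x1fu)
#define DCFG_LPM_CAP    (UINT32_C(1) << 22)

/* Hypothetical container for the decoded fields. */
struct dcfg_fields {
    unsigned int speed;
    unsigned int dev_addr;
    unsigned int nump;
    bool lpm_capable;
};

/* Split a raw DCFG register value into its fields. */
static struct dcfg_fields decode_dcfg(uint32_t v)
{
    struct dcfg_fields f;

    f.speed = DCFG_SPEED(v);
    f.dev_addr = DCFG_DEVADDR(v);
    f.nump = DCFG_NUMP(v);
    f.lpm_capable = (v & DCFG_LPM_CAP) != 0;
    return f;
}
```

Bits of DCFG not covered by these masks are left undecoded here; their meaning would need to be checked against the controller databook.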