* dwc2: irq 66: nobody cared triggered on resume
@ 2024-06-22 12:23 Stefan Wahren
[not found] ` <ZnccVLINb_xQcmZG@wunner.de>
0 siblings, 1 reply; 4+ messages in thread
From: Stefan Wahren @ 2024-06-22 12:23 UTC (permalink / raw)
To: Minas Harutyunyan, Florian Fainelli, Thomas Gleixner
Cc: linux-usb@vger.kernel.org, Linux ARM, bcm-kernel-feedback-list,
Ray Jui, Scott Branden, Marc Zyngier, Lukas Wunner
Hi,
i currently experiment with suspend to idle on the Raspberry Pi 3 A+.
Supend & resume works expected as long as no USB device is connected to
the board. If i connect a USB hub to the Pi, the resume phase is
significantly delayed and the kernel disabled IRQ 66 which belongs to DWC2.
Here are the steps to trigger this issue:
- build kernel with multi_v7_defconfig
- make sure necessary kernel options are enabled ( CONFIG_PM_DEBUG,
CONFIG_PM_ADVANCED_DEBUG )
- make sure a USB hub is connected to Raspberry Pi 3 A Plus
- Add "no_console_suspend" to cmdline.txt and reboot
- Connect via Debug UART:
echo 1 > /sys/power/pm_debug_messages
echo platform > /sys/power/pm_test
echo freeze > /sys/power/state
Here is the relevant kernel log:
[ 1108.402522] usb 1-1: new high-speed USB device number 2 using dwc2
[ 1108.655864] hub 1-1:1.0: USB hub found
[ 1108.656029] hub 1-1:1.0: 4 ports detected
[ 1117.932573] rpi_firmware_set_power: Set HDMI to 1
[ 1117.932709] rpi_firmware_set_power: Set HDMI to 0
[ 1123.282825] PM: suspend entry (s2idle)
[ 1124.429959] Filesystems sync: 1.147 seconds
[ 1124.442720] Freezing user space processes
[ 1124.444205] Freezing user space processes completed (elapsed 0.001
seconds)
[ 1124.444227] OOM killer disabled.
[ 1124.444236] Freezing remaining freezable tasks
[ 1124.445409] Freezing remaining freezable tasks completed (elapsed
0.001 seconds)
[ 1125.487258] ieee80211 phy1: brcmf_fil_cmd_data: bus is down. we have
nothing to do.
[ 1125.487284] ieee80211 phy1: brcmf_cfg80211_get_tx_power: error (-5)
[ 1126.105119] PM: suspend of devices complete after 1658.658 msecs
[ 1126.105147] PM: start suspend of devices complete after 1659.719 msecs
[ 1126.106814] PM: late suspend of devices complete after 1.650 msecs
[ 1126.107207] rpi_firmware_set_power: Set VEC to 0
[ 1126.107279] rpi_firmware_set_power: Set V3D to 0
[ 1126.108303] PM: noirq suspend of devices complete after 1.340 msecs
[ 1126.108321] PM: suspend debug: Waiting for 5 second(s).
[ 1131.109573] rpi_firmware_set_power: Set V3D to 1
[ 1131.109636] rpi_firmware_set_power: Set HDMI to 1
[ 1131.109692] rpi_firmware_set_power: Set VEC to 1
[ 1131.109745] rpi_firmware_set_power: Set USB to 1
[ 1131.109996] PM: noirq resume of devices complete after 1.273 msecs
[ 1131.111208] PM: early resume of devices complete after 1.051 msecs
[ 1131.230277] brcmfmac: brcmf_fw_alloc_request: using
brcm/brcmfmac43455-sdio for chip BCM4345/6
[ 1131.458687] irq 66: nobody cared (try booting with the "irqpoll" option)
[ 1131.458714] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W
6.10.0-rc3-g7fd4227d1bd5-dirty #49
[ 1131.458734] Hardware name: BCM2835
[ 1131.458744] Call trace:
[ 1131.458757] unwind_backtrace from show_stack+0x10/0x14
[ 1131.458796] show_stack from dump_stack_lvl+0x50/0x64
[ 1131.458822] dump_stack_lvl from __report_bad_irq+0x38/0xc0
[ 1131.458848] __report_bad_irq from note_interrupt+0x2ac/0x2f4
[ 1131.458877] note_interrupt from handle_irq_event+0x88/0x8c
[ 1131.458900] handle_irq_event from handle_level_irq+0xb4/0x1ac
[ 1131.458923] handle_level_irq from generic_handle_domain_irq+0x24/0x34
[ 1131.458957] generic_handle_domain_irq from
bcm2836_chained_handle_irq+0x24/0x28
[ 1131.458992] bcm2836_chained_handle_irq from
generic_handle_domain_irq+0x24/0x34
[ 1131.459024] generic_handle_domain_irq from
generic_handle_arch_irq+0x34/0x44
[ 1131.459056] generic_handle_arch_irq from __irq_svc+0x88/0xb0
[ 1131.459079] Exception stack(0xc1b01f20 to 0xc1b01f68)
[ 1131.459100] 1f20: 0005c0d4 00000001 00000000 00000000 c1b09780
c1d6b32c c1b04e54 c1a5eae8
[ 1131.459118] 1f40: c1b04e90 00000000 00000000 00000000 c1d6a8a0
c1b01f70 c11d2da8 c11d4160
[ 1131.459130] 1f60: 60000013 ffffffff
[ 1131.459142] __irq_svc from default_idle_call+0x1c/0xb0
[ 1131.459167] default_idle_call from do_idle+0x21c/0x284
[ 1131.459202] do_idle from cpu_startup_entry+0x28/0x2c
[ 1131.459239] cpu_startup_entry from kernel_init+0x0/0x12c
[ 1131.459271] handlers:
[ 1131.459279] [<f539e0f4>] dwc2_handle_common_intr
[ 1131.459308] [<75cd278b>] usb_hcd_irq
[ 1131.459329] Disabling IRQ #66
[ 1131.487617] brcmfmac: brcmf_c_process_txcap_blob: no txcap_blob
available (err=-2)
[ 1131.488107] brcmfmac: brcmf_c_preinit_dcmds: Firmware: BCM4345/6 wl0:
Nov 1 2021 00:37:25 version 7.45.241 (1a2f2fa CY) FWID 01-703fd60
[ 1136.552601] usb 1-1: reset high-speed USB device number 2 using dwc2
[ 1141.832522] usb 1-1: device descriptor read/64, error -110
[ 1157.272527] usb 1-1: device descriptor read/64, error -110
[ 1157.612516] usb 1-1: reset high-speed USB device number 2 using dwc2
[ 1162.872518] usb 1-1: device descriptor read/64, error -110
[ 1178.312517] usb 1-1: device descriptor read/64, error -110
[ 1178.652516] usb 1-1: reset high-speed USB device number 2 using dwc2
[ 1189.192506] usb 1-1: device not accepting address 2, error -110
[ 1189.412512] usb 1-1: reset high-speed USB device number 2 using dwc2
[ 1199.992511] usb 1-1: device not accepting address 2, error -110
[ 1199.992768] PM: resume of devices complete after 68881.537 msecs
[ 1199.994145] OOM killer enabled.
[ 1199.994155] Restarting tasks ...
[ 1199.994833] rpi_firmware_set_power: Set HDMI to 0
[ 1199.994892] usb 1-1: USB disconnect, device number 2
[ 1199.996456] done.
[ 1199.996518] random: crng reseeded on system resumption
[ 1200.014632] PM: suspend exit
An ideas what causing this issue?
^ permalink raw reply [flat|nested] 4+ messages in thread[parent not found: <ZnccVLINb_xQcmZG@wunner.de>]
* Re: dwc2: irq 66: nobody cared triggered on resume [not found] ` <ZnccVLINb_xQcmZG@wunner.de> @ 2024-06-23 13:27 ` Stefan Wahren 2024-06-24 17:02 ` Stefan Wahren 0 siblings, 1 reply; 4+ messages in thread From: Stefan Wahren @ 2024-06-23 13:27 UTC (permalink / raw) To: Lukas Wunner Cc: Minas Harutyunyan, Florian Fainelli, Thomas Gleixner, linux-usb@vger.kernel.org, Linux ARM, bcm-kernel-feedback-list, Ray Jui, Scott Branden, Marc Zyngier Hello Lukas, Am 22.06.24 um 20:47 schrieb Lukas Wunner: > On Sat, Jun 22, 2024 at 02:23:33PM +0200, Stefan Wahren wrote: >> i currently experiment with suspend to idle on the Raspberry Pi 3 A+. >> Supend & resume works expected as long as no USB device is connected to >> the board. If i connect a USB hub to the Pi, the resume phase is >> significantly delayed and the kernel disabled IRQ 66 which belongs to DWC2. > [...] >> [ 1131.109996] PM: noirq resume of devices complete after 1.273 msecs >> [ 1131.111208] PM: early resume of devices complete after 1.051 msecs >> [ 1131.230277] brcmfmac: brcmf_fw_alloc_request: using >> brcm/brcmfmac43455-sdio for chip BCM4345/6 >> [ 1131.458687] irq 66: nobody cared (try booting with the "irqpoll" option) >> [ 1131.458714] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.10.0-rc3-g7fd4227d1bd5-dirty #49 >> [ 1131.458734] Hardware name: BCM2835 >> [ 1131.458744] Call trace: > [...] >> [ 1131.458877] note_interrupt from handle_irq_event+0x88/0x8c >> [ 1131.458900] handle_irq_event from handle_level_irq+0xb4/0x1ac >> [ 1131.458923] handle_level_irq from generic_handle_domain_irq+0x24/0x34 >> [ 1131.458957] generic_handle_domain_irq from bcm2836_chained_handle_irq+0x24/0x28 >> [ 1131.458992] bcm2836_chained_handle_irq from generic_handle_domain_irq+0x24/0x34 >> [ 1131.459024] generic_handle_domain_irq from generic_handle_arch_irq+0x34/0x44 >> [ 1131.459056] generic_handle_arch_irq from __irq_svc+0x88/0xb0 >> [ 1131.459079] Exception stack(0xc1b01f20 to 0xc1b01f68) >> [ 1131.459142] __irq_svc from default_idle_call+0x1c/0xb0 >> [ 1131.459167] default_idle_call from do_idle+0x21c/0x284 >> [ 1131.459202] do_idle from cpu_startup_entry+0x28/0x2c >> [ 1131.459239] cpu_startup_entry from kernel_init+0x0/0x12c >> [ 1131.459271] handlers: >> [ 1131.459279] [<f539e0f4>] dwc2_handle_common_intr >> [ 1131.459308] [<75cd278b>] usb_hcd_irq >> [ 1131.459329] Disabling IRQ #66 > [...] >> An ideas what causing this issue? > Interrupts are re-enabled after the resume_noirq phase. Looks like > the chip signals an interrupt right afterwards but the two hardirq > handlers do not feel responsible. > > The only option might be to add a few printk() in dwc2_handle_common_intr(), > usb_hcd_irq() and dwc2_handle_hcd_intr() (called from usb_hcd_irq()) > to see why they're all returning IRQ_NONE without clearing the source > of the interrupt. The chip just keeps signaling interrupts because > the driver doesn't handle them, hence the IRQ storm which the IRQ core > eventually stops by outright disabling the interrupt. thanks for your suggestion. Unfortunately placing printk in those busy interrupt handler is futile, so i switched to debugfs. This issue would be much easier in case the interrupt wouldn't be shared. But first let me share some outputs before i start to extend debugfs further: 1. No hub connected to Rpi 3 A+ root@raspberrypi:/sys/kernel/debug/usb/3f980000.usb# cat state DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007ff02 DIEPMSK=0x00000000, DOEPMASK=0x00000000 GINTMSK=0xf3000806, GINTSTS=0x04000023 DAINTMSK=0x00000000, DAINT=0x00000000 GNPTXSTS=0x00080100, GRXSTSR=3f83bbfe 2. Hub connected before suspend / irq issue DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007a202 DIEPMSK=0x00000000, DOEPMASK=0x00000000 GINTMSK=0xf300080e, GINTSTS=0x04000023 DAINTMSK=0x00000000, DAINT=0x00000000 GNPTXSTS=0x08080100, GRXSTSR=789a460a 3. Hub connected after suspend / irq issue DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007ff02 DIEPMSK=0x00000000, DOEPMASK=0x00000000 GINTMSK=0xf1000806, GINTSTS=0x0500002b DAINTMSK=0x000000ff, DAINT=0x00000000 GNPTXSTS=0x29080100, GRXSTSR=befdf595 Based on my limited knowledge and observations the issue seems related to GINTMSK/GINTSTS and a outstanding GINTSTS_PRTINT. Regards ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: dwc2: irq 66: nobody cared triggered on resume 2024-06-23 13:27 ` Stefan Wahren @ 2024-06-24 17:02 ` Stefan Wahren 2024-06-24 18:17 ` Thomas Gleixner 0 siblings, 1 reply; 4+ messages in thread From: Stefan Wahren @ 2024-06-24 17:02 UTC (permalink / raw) To: Lukas Wunner, Minas Harutyunyan Cc: Florian Fainelli, Thomas Gleixner, linux-usb@vger.kernel.org, Linux ARM, bcm-kernel-feedback-list, Ray Jui, Scott Branden, Marc Zyngier Hi, Am 23.06.24 um 15:27 schrieb Stefan Wahren: > Hello Lukas, > > Am 22.06.24 um 20:47 schrieb Lukas Wunner: >> On Sat, Jun 22, 2024 at 02:23:33PM +0200, Stefan Wahren wrote: >>> i currently experiment with suspend to idle on the Raspberry Pi 3 A+. >>> Supend & resume works expected as long as no USB device is connected to >>> the board. If i connect a USB hub to the Pi, the resume phase is >>> significantly delayed and the kernel disabled IRQ 66 which belongs >>> to DWC2. >> [...] >>> [ 1131.109996] PM: noirq resume of devices complete after 1.273 msecs >>> [ 1131.111208] PM: early resume of devices complete after 1.051 msecs >>> [ 1131.230277] brcmfmac: brcmf_fw_alloc_request: using >>> brcm/brcmfmac43455-sdio for chip BCM4345/6 >>> [ 1131.458687] irq 66: nobody cared (try booting with the "irqpoll" >>> option) >>> [ 1131.458714] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W >>> 6.10.0-rc3-g7fd4227d1bd5-dirty #49 >>> [ 1131.458734] Hardware name: BCM2835 >>> [ 1131.458744] Call trace: >> [...] >>> [ 1131.458877] note_interrupt from handle_irq_event+0x88/0x8c >>> [ 1131.458900] handle_irq_event from handle_level_irq+0xb4/0x1ac >>> [ 1131.458923] handle_level_irq from >>> generic_handle_domain_irq+0x24/0x34 >>> [ 1131.458957] generic_handle_domain_irq from >>> bcm2836_chained_handle_irq+0x24/0x28 >>> [ 1131.458992] bcm2836_chained_handle_irq from >>> generic_handle_domain_irq+0x24/0x34 >>> [ 1131.459024] generic_handle_domain_irq from >>> generic_handle_arch_irq+0x34/0x44 >>> [ 1131.459056] generic_handle_arch_irq from __irq_svc+0x88/0xb0 >>> [ 1131.459079] Exception stack(0xc1b01f20 to 0xc1b01f68) >>> [ 1131.459142] __irq_svc from default_idle_call+0x1c/0xb0 >>> [ 1131.459167] default_idle_call from do_idle+0x21c/0x284 >>> [ 1131.459202] do_idle from cpu_startup_entry+0x28/0x2c >>> [ 1131.459239] cpu_startup_entry from kernel_init+0x0/0x12c >>> [ 1131.459271] handlers: >>> [ 1131.459279] [<f539e0f4>] dwc2_handle_common_intr >>> [ 1131.459308] [<75cd278b>] usb_hcd_irq >>> [ 1131.459329] Disabling IRQ #66 >> [...] >>> An ideas what causing this issue? >> Interrupts are re-enabled after the resume_noirq phase. Looks like >> the chip signals an interrupt right afterwards but the two hardirq >> handlers do not feel responsible. >> >> The only option might be to add a few printk() in >> dwc2_handle_common_intr(), >> usb_hcd_irq() and dwc2_handle_hcd_intr() (called from usb_hcd_irq()) >> to see why they're all returning IRQ_NONE without clearing the source >> of the interrupt. The chip just keeps signaling interrupts because >> the driver doesn't handle them, hence the IRQ storm which the IRQ core >> eventually stops by outright disabling the interrupt. > thanks for your suggestion. Unfortunately placing printk in those busy > interrupt handler is futile, so i switched to debugfs. This issue > would be much easier in case the interrupt wouldn't be shared. But > first let me share some outputs before i start to extend debugfs further: > > 1. No hub connected to Rpi 3 A+ > > root@raspberrypi:/sys/kernel/debug/usb/3f980000.usb# cat state > DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007ff02 > DIEPMSK=0x00000000, DOEPMASK=0x00000000 > GINTMSK=0xf3000806, GINTSTS=0x04000023 > DAINTMSK=0x00000000, DAINT=0x00000000 > GNPTXSTS=0x00080100, GRXSTSR=3f83bbfe > > 2. Hub connected before suspend / irq issue > > DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007a202 > DIEPMSK=0x00000000, DOEPMASK=0x00000000 > GINTMSK=0xf300080e, GINTSTS=0x04000023 > DAINTMSK=0x00000000, DAINT=0x00000000 > GNPTXSTS=0x08080100, GRXSTSR=789a460a > > 3. Hub connected after suspend / irq issue > > DCFG=0x00000000, DCTL=0x00000000, DSTS=0x0007ff02 > DIEPMSK=0x00000000, DOEPMASK=0x00000000 > GINTMSK=0xf1000806, GINTSTS=0x0500002b > DAINTMSK=0x000000ff, DAINT=0x00000000 > GNPTXSTS=0x29080100, GRXSTSR=befdf595 > > Based on my limited knowledge and observations the issue seems related > to GINTMSK/GINTSTS and a outstanding GINTSTS_PRTINT. i narrowed this a little bit further. At least i know the reason for the "nobody cared". It's clear that the issue is triggered by GINTSTS_PRTINT. The DWC2 controller is in host mode so dwc2_handle_common_intr() ignores the interrupt and returns IRQ_NONE. But usb_hcd_irq() also cannot handle it because HCD_FLAG_HW_ACCESSIBLE is still clear, so the handler also returns IRQ_NONE :-( Is disabling the IRQ via the upper layers an expected behavior instead of letting the DWC2 controller driver resolve the situation? But back to the root cause. I followed the suspend/resume path, why the HCD_FLAG_HW_ACCESSIBLE is not cleared. Suspend path: The power down is DWC2_POWER_DOWN_PARAM_NONE so the HCD_FLAG_HW_ACCESSIBLE is cleared ( https://elixir.bootlin.com/linux/v6.10-rc3/source/drivers/usb/dwc2/hcd.c#L4385 ). Resume path: During resume the HPRT0_CONNSTS flag is set, so the HCD_FLAG_HW_ACCESSIBLE is not set ( https://elixir.bootlin.com/linux/v6.10-rc3/source/drivers/usb/dwc2/hcd.c#L4435 ). Is the reason for this behavior the lack of clock gating support on BCM283x or is it a driver bug? How can i figure out clock gating is supported? Regards > > Regards ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: dwc2: irq 66: nobody cared triggered on resume 2024-06-24 17:02 ` Stefan Wahren @ 2024-06-24 18:17 ` Thomas Gleixner 0 siblings, 0 replies; 4+ messages in thread From: Thomas Gleixner @ 2024-06-24 18:17 UTC (permalink / raw) To: Stefan Wahren, Lukas Wunner, Minas Harutyunyan Cc: Florian Fainelli, linux-usb@vger.kernel.org, Linux ARM, bcm-kernel-feedback-list, Ray Jui, Scott Branden, Marc Zyngier On Mon, Jun 24 2024 at 19:02, Stefan Wahren wrote: > > Is disabling the IRQ via the upper layers an expected behavior instead > of letting the DWC2 controller driver resolve the situation? It can't because it's an interrupt storm, i.e. 100k unhandled interrupts in a row. So the system would just stay there forever. So one way to solve this is to disable the interrupt accross suspend resume and only reenable it once everything is in functional state again. Thanks, tglx ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-06-24 18:17 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-22 12:23 dwc2: irq 66: nobody cared triggered on resume Stefan Wahren
[not found] ` <ZnccVLINb_xQcmZG@wunner.de>
2024-06-23 13:27 ` Stefan Wahren
2024-06-24 17:02 ` Stefan Wahren
2024-06-24 18:17 ` Thomas Gleixner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox