public inbox for linux-usb@vger.kernel.org
From: bugzilla-daemon@bugzilla.kernel.org
To: linux-usb@vger.kernel.org
Subject: [Bug 214789] ehci-hcd.c ISR
Date: Tue, 26 Oct 2021 21:49:06 +0000
Message-ID: <bug-214789-208809-FKmMCc1yRI@https.bugzilla.kernel.org/>
In-Reply-To: <bug-214789-208809@https.bugzilla.kernel.org/>

https://bugzilla.kernel.org/show_bug.cgi?id=214789

--- Comment #12 from Scott Arnold (scott.c.arnold@nasa.gov) ---
Hello,
Sorry for the late reply; Outlook sometimes puts emails in "Other" for some
reason.
I just reverted that machine back to the 5.3.6 kernel.
Now IRQ 16 has:
16: IO-APIC   16-fasteoi   ehci_hcd:usb1, uhci_hcd:usb3, hpilo, rt_pcclk
"uhci_hcd:usb3" does not appear with the 5.11+ kernels (with or without
rt_pcclk), so maybe the issue is with uhci_hcd.
The timer card works fine in this config.
Thanks
Scott
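
A minimal sketch of the comparison described above, for anyone checking their
own machine: it pulls the handler names out of a /proc/interrupts row and
reports which ones are missing on another kernel. The 5.3.6 row is the one
quoted in this comment. The second row is a hypothetical stand-in for a 5.11+
kernel (not captured output), and the function names are illustrative.

```python
import re

# Matches "<irq>: [per-CPU counts] <chip> <hwirq>-<flow>  <handler list>".
# The per-CPU count columns are optional here because the row quoted in
# this thread has them trimmed.
ROW = re.compile(r"^\s*(\d+):.*?\s\d+-\w+\s+(.*)$")


def parse_irq_row(row):
    """Return (irq number, [handler names]) for one /proc/interrupts row."""
    m = ROW.match(row)
    if m is None:
        raise ValueError(f"unrecognised /proc/interrupts row: {row!r}")
    handlers = [name.strip() for name in m.group(2).split(",") if name.strip()]
    return int(m.group(1)), handlers


def missing_handlers(old_row, new_row):
    """Handlers registered on the IRQ in old_row but absent from new_row."""
    _, old = parse_irq_row(old_row)
    _, new = parse_irq_row(new_row)
    return sorted(set(old) - set(new))


# Row quoted above (5.3.6 kernel).
ROW_5_3_6 = "16: IO-APIC   16-fasteoi   ehci_hcd:usb1, uhci_hcd:usb3, hpilo, rt_pcclk"
# Hypothetical 5.11+ row, written to match the description above only.
ROW_NEWER = "16: IO-APIC   16-fasteoi   ehci_hcd:usb1, hpilo, rt_pcclk"

print(missing_handlers(ROW_5_3_6, ROW_NEWER))
```

Rows for non-numeric interrupts (NMI, LOC and so on) are deliberately rejected,
since they carry no handler list in this form.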

-----Original Message-----
From: bugzilla-daemon@bugzilla.kernel.org <bugzilla-daemon@bugzilla.kernel.org> 
Sent: Monday, October 25, 2021 8:20 AM
To: Arnold, Scott C. (JSC-CD13)[SGT, INC] <scott.c.arnold@nasa.gov>
Subject: [EXTERNAL] [Bug 214789] ehci-hcd.c ISR

https://bugzilla.kernel.org/show_bug.cgi?id=214789

--- Comment #11 from Johan Hovold (johan@kernel.org) ---
[ Adding back bugzilla and linux-usb on CC. ]

On Fri, Oct 22, 2021 at 07:43:04PM +0000, Arnold, Scott C. (JSC-CD13)[SGT, INC]
wrote:
> I added WARN_ON_ONCE(!irqs_disabled()); at the beginning of
> ehci_irq(), before the lock, and did not notice anything.

Ok, so interrupts are already disabled as they should be.

> However after looking at the logs I discovered:
> 
> [    5.189043] irq 16: nobody cared (try booting with the "irqpoll" option)
> [    5.189112] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.14.13_OBCS_1 #4
> [    5.189180] Hardware name: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 01/22/2018
> [    5.189261] Call Trace:
> [    5.189324]  <IRQ>
> [    5.189381]  ? dump_stack_lvl+0x33/0x42
> [    5.189445]  ? __report_bad_irq+0x32/0xac
> [    5.189505]  ? note_interrupt.cold.11+0xa/0x63
> [    5.189562]  ? handle_irq_event_percpu+0x65/0x70
> [    5.189623]  ? handle_irq_event+0x32/0x50
> [    5.189681]  ? handle_fasteoi_irq+0xa1/0x160
> [    5.189740]  ? __common_interrupt+0x3c/0xa0
> [    5.189798]  ? common_interrupt+0x7a/0xa0
> [    5.189859]  </IRQ>
> [    5.189913]  ? asm_common_interrupt+0x1b/0x40
> [    5.189973]  ? mwait_idle+0x50/0x70
> [    5.190031]  ? default_idle+0x10/0x10
> [    5.190088]  ? default_idle_call+0x1f/0x30
> [    5.190147]  ? do_idle+0x1df/0x1f0
> [    5.190207]  ? cpu_startup_entry+0x14/0x20
> [    5.190267]  ? start_kernel+0x616/0x63d
> [    5.190328]  ? secondary_startup_64_no_verify+0xb0/0xbb
> [    5.190388] handlers:
> [    5.190442] [<00000000da7aaaea>] usb_hcd_irq
> [    5.190504] Disabling IRQ #16
> 
> [    5.201827] irq 23: nobody cared (try booting with the "irqpoll" option)
> [    5.201885] CPU: 1 PID: 8 Comm: kworker/u145:0 Not tainted 5.14.13_OBCS_1 #4
> [    5.201942] Hardware name: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 01/22/2018
> [    5.202010] Workqueue: events_unbound async_run_entry_fn
> [    5.202069] Call Trace:
> [    5.202119]  <IRQ>
> [    5.202168]  ? dump_stack_lvl+0x33/0x42
> [    5.202223]  ? __report_bad_irq+0x32/0xac
> [    5.202277]  ? note_interrupt.cold.11+0xa/0x63
> [    5.202332]  ? handle_irq_event_percpu+0x65/0x70
> [    5.202386]  ? handle_irq_event+0x32/0x50
> [    5.202441]  ? handle_fasteoi_irq+0xa1/0x160
> [    5.202495]  ? __common_interrupt+0x3c/0xa0
> [    5.202548]  ? common_interrupt+0x7a/0xa0
> [    5.202603]  </IRQ>
> [    5.202652]  ? asm_common_interrupt+0x1b/0x40
> [    5.202707]  ? inflate_fast+0x118/0x5e0
> [    5.202764]  ? zlib_inflate+0x3d1/0x1770
> [    5.202817]  ? do_copy+0xed/0x109
> [    5.202869]  ? write_buffer+0x22/0x32
> [    5.202921]  ? initrd_load+0x268/0x268
> [    5.202975]  ? write_buffer+0x32/0x32
> [    5.203026]  ? __gunzip+0x244/0x310
> [    5.203083]  ? decompress_method+0x3c/0x3c
> [    5.203137]  ? initrd_load+0x268/0x268
> [    5.203190]  ? gunzip+0xe/0x11
> [    5.203243]  ? initrd_load+0x268/0x268
> [    5.203296]  ? unpack_to_rootfs+0x14f/0x285
> [    5.203349]  ? initrd_load+0x268/0x268
> [    5.203402]  ? do_populate_rootfs+0x6c/0x160
> [    5.203455]  ? async_run_entry_fn+0x1b/0xa0
> [    5.203508]  ? process_one_work+0x1d1/0x330
> [    5.203563]  ? worker_thread+0x28/0x3d0
> [    5.203615]  ? mod_delayed_work_on+0x90/0x90
> [    5.203668]  ? kthread+0x120/0x150
> [    5.203720]  ? set_kthread_struct+0x30/0x30
> [    5.203773]  ? ret_from_fork+0x22/0x30
> [    5.203826] handlers:
> [    5.203875] [<00000000da7aaaea>] usb_hcd_irq
> [    5.203930] Disabling IRQ #23

So this also happens for another EHCI bus IRQ. Is this IRQ also shared with
something?
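
One way to approach that question from the logs alone is to collect the
"handlers:" list from each "nobody cared" report. The sketch below does that
for an abbreviated copy of the reports quoted in this message; the function
and variable names are illustrative. Note that the list only shows what was
registered at the time of the report, so a single entry does not prove the
line is unshared (IRQ 16 shows one handler at 5.19 s and two at 62.4 s).

```python
import re

NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
# Handler lines look like "[<00000000da7aaaea>] usb_hcd_irq".
HANDLER = re.compile(r"\[<[0-9a-f]+>\]\s+(\S+)")


def unhandled_irq_reports(log_text):
    """Return [(irq, [handler, ...]), ...], one entry per 'nobody cared' report."""
    reports = []
    current = None
    in_handlers = False
    for line in log_text.splitlines():
        m = NOBODY_CARED.search(line)
        if m:
            current = (int(m.group(1)), [])
            reports.append(current)
            in_handlers = False
            continue
        if current is None:
            continue
        if line.rstrip().endswith("handlers:"):
            in_handlers = True
            continue
        if in_handlers:
            h = HANDLER.search(line)
            if h:
                current[1].append(h.group(1))
            else:
                # First non-handler line (e.g. "Disabling IRQ #23") ends the list.
                in_handlers = False
    return reports


# Abbreviated from the reports quoted in this message (call traces omitted).
SAMPLE = """\
[    5.201827] irq 23: nobody cared (try booting with the "irqpoll" option)
[    5.203826] handlers:
[    5.203875] [<00000000da7aaaea>] usb_hcd_irq
[    5.203930] Disabling IRQ #23
[   62.407444] irq 16: nobody cared (try booting with the "irqpoll" option)
[   62.414218] handlers:
[   62.414734] [<00000000da7aaaea>] usb_hcd_irq
[   62.415257] [<000000008857253d>] ilo_isr [hpilo]
[   62.415775] Disabling IRQ #16
"""

reports = unhandled_irq_reports(SAMPLE)
for irq, handlers in reports:
    print(irq, handlers)
```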

> [   62.407444] irq 16: nobody cared (try booting with the "irqpoll" option)
> [   62.407474] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.14.13_OBCS_1 #4
> [   62.407499] Hardware name: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 01/22/2018
> [   62.407527] Call Trace:
> [   62.407538]  <IRQ>
> [   62.407547]  ? dump_stack_lvl+0x33/0x42
> [   62.407569]  ? __report_bad_irq+0x32/0xac
> [   62.407588]  ? note_interrupt.cold.11+0xa/0x63
> [   62.407606]  ? handle_irq_event_percpu+0x65/0x70
> [   62.407626]  ? handle_irq_event+0x32/0x50
> [   62.407642]  ? handle_fasteoi_irq+0xa1/0x160
> [   62.408250]  ? __common_interrupt+0x3c/0xa0
> [   62.408820]  ? common_interrupt+0x7a/0xa0
> [   62.409386]  </IRQ>
> [   62.409934]  ? asm_common_interrupt+0x1b/0x40
> [   62.410483]  ? mwait_idle+0x50/0x70
> [   62.411026]  ? default_idle+0x10/0x10
> [   62.411565]  ? default_idle_call+0x1f/0x30
> [   62.412102]  ? do_idle+0x1df/0x1f0
> [   62.412634]  ? cpu_startup_entry+0x14/0x20
> [   62.413164]  ? start_kernel+0x616/0x63d
> [   62.413694]  ? secondary_startup_64_no_verify+0xb0/0xbb
> [   62.414218] handlers:
> [   62.414734] [<00000000da7aaaea>] usb_hcd_irq
> [   62.415257] [<000000008857253d>] ilo_isr [hpilo]
> [   62.415775] Disabling IRQ #16
> 
> There is one USB device, "Bus 001 Device 003: ID 14dd:1007 Raritan
> Computer, Inc. D2CIM-VUSB KVM connector", and it has disappeared
> (i.e. it is not working).

Thanks for confirming.

> This does not occur without the irqsave/restore in ehci-hcd.

Now why would simply saving the interrupt state in ehci_irq() prevent these
spurious IRQs? There's something fishy going on alright.

> My timercard driver is not loaded. This is with a 5.14.13 kernel.

Are you able to reproduce this on a machine without the timer card present at
all?

> Lsusb -s1:
> Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> 
> Lsusb -s2 and -s3 are blank.

Looks like you forgot the colon in "lsusb -s1:" so this lists the devices with
number 1 instead of the devices connected to bus 1.
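
The difference comes from the documented form of the lsusb filter,
"-s [[bus]:][devnum]". A minimal sketch of that selection rule follows; it is
not lsusb's actual implementation, the function names are illustrative, and
the device list is the (bus, device) pairs from the 5.3.6 machine's output
quoted below.

```python
def parse_s_filter(arg):
    """Interpret an lsusb -s argument of the form [[bus]:][devnum].

    Returns (bus, devnum); None means "match any".
    """
    if ":" in arg:
        bus, dev = arg.split(":", 1)
    else:
        # No colon: the whole argument is a device number.
        bus, dev = "", arg
    return (int(bus) if bus else None, int(dev) if dev else None)


def select(devices, arg):
    """Keep the (bus, devnum) pairs that the filter would show."""
    bus, dev = parse_s_filter(arg)
    return [
        (b, d)
        for (b, d) in devices
        if (bus is None or b == bus) and (dev is None or d == dev)
    ]


# (bus, device) pairs taken from the 5.3.6 machine's listings in this thread.
DEVICES = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1)]

print(select(DEVICES, "1"))   # device number 1 on every bus
print(select(DEVICES, "1:"))  # every device on bus 1
```

So "-s1" returns only the three root hubs, while "-s1:" would include the KVM
connector at bus 1, device 3.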

> On another almost identical machine (92 cores instead of 72) running the
> 5.3.6 kernel:
> Lsusb -s1:
> Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
> 
> Lsusb -s2:
> Bus 002 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
> Bus 001 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
> 
> Lsusb -s3:
> Bus 002 Device 003: ID 0424:2660 Microchip Technology, Inc. (formerly SMSC) Hub
> Bus 001 Device 003: ID 14dd:1007 Raritan Computer, Inc. D2CIM-VUSB KVM connector

Ok, but there is a device connected to bus 1 as you also mentioned above.

On Fri, Oct 22, 2021 at 09:38:28PM +0000, Arnold, Scott C. (JSC-CD13)[SGT, INC]
wrote:
> Just as another data point, I put the ehci-hcd.c file from the 5.3.6
> kernel into the 5.14.13 kernel.
> No more "nobody cared" messages, but neither the timer card nor USB is
> working now.

Yeah, that's probably not going to work.

> [    6.798509] usb 2-1: new high-speed USB device number 4 using ehci-pci
> [    6.798586] usb 1-1: new high-speed USB device number 4 using ehci-pci
> [    7.238498] usb 1-1: device not accepting address 4, error -32
> [    7.238562] usb 2-1: device not accepting address 4, error -32
> [    7.388499] usb 1-1: new high-speed USB device number 5 using ehci-pci
> [    7.388561] usb 2-1: new high-speed USB device number 5 using ehci-pci
> [    7.828496] usb 1-1: device not accepting address 5, error -32
> [    7.828557] usb 2-1: device not accepting address 5, error -32
> [    7.828618] usb usb1-port1: unable to enumerate USB device
> [    7.828678] usb usb2-port1: unable to enumerate USB device
> 
> /proc/interrupts for IRQ 16 is stuck at 50.
> 
> Some combination of these two may solve the problem.

It would be good if we could rule out the timer card being involved in this
(e.g. since the driver is out of tree).

Johan

--
You may reply to this email to add a comment.

You are receiving this mail because:
You reported the bug.


