public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors
@ 2017-05-19  3:42 Steven Rostedt
  2017-05-19  5:42 ` Greg Kroah-Hartman
  2017-05-19  8:20 ` Christoph Hellwig
  0 siblings, 2 replies; 10+ messages in thread
From: Steven Rostedt @ 2017-05-19  3:42 UTC (permalink / raw)
  To: LKML; +Cc: Linus Torvalds, Christoph Hellwig, Mathias Nyman,
	Greg Kroah-Hartman

[-- Attachment #1: Type: text/plain, Size: 9399 bytes --]


One of my the configs I use to test ftrace with (configs that have
caused failures in the past), has lots of irq issues and fails to
initialize the network of my box. I bisected the problem down to a
single commit, and when I revert that commit, my box boots without any
network or irq issues.

Note, my other configs work fine on this box. I haven't investigated
which config is also the culprit. But since it use to work with this
config, I want to report it.

Here's the dmesg where things start to get messy:

 md: Waiting for all devices to be available before autodetect
 md: If you don't use raid, use raid=noautodetect
 md: Autodetecting RAID arrays.
 md: autorun ...
 md: ... autorun DONE.
 EXT4-fs (sda7): couldn't mount as ext3 due to feature incompatibilities
 EXT4-fs (sda7): mounted filesystem with ordered data mode. Opts: (null)
 VFS: Mounted root (ext4 filesystem) readonly on device 8:7.
 irq 16: nobody cared (try booting with the "irqpoll" option)
 CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.12.0-rc1-test-dirty #24
 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
 Call Trace:
  <IRQ>
 devtmpfs: mounted
  dump_stack+0x9a/0xd6
  __report_bad_irq+0x35/0xc0
  note_interrupt+0x234/0x270
  handle_irq_event_percpu+0x45/0x60
  handle_irq_event+0x39/0x60
  handle_fasteoi_irq+0x8f/0x160
  handle_irq+0x6f/0x110
  do_IRQ+0x46/0xd0
  common_interrupt+0x93/0x93
 RIP: 0010:native_safe_halt+0x6/0x10
 RSP: 0000:ffffb54240cd7e90 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff7e
 RAX: 0000000000000000 RBX: ffff8ea214498040 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
 RBP: ffffb54240cd7e90 R08: 0000000000000001 R09: 0000000041129b0c
 R10: ffffb54240cd7d68 R11: 0000000000000001 R12: 0000000000000002
 R13: ffff8ea214498040 R14: 0000000000000000 R15: ffff8ea214498040
  </IRQ>
  default_idle+0x38/0x160
  arch_cpu_idle+0xf/0x20
  default_idle_call+0x28/0x50
  do_idle+0x182/0x220
  cpu_startup_entry+0x1d/0x20
  start_secondary+0x132/0x160
  secondary_startup_64+0x9f/0x9f
 handlers:
 [<ffffffff9a6421a0>] xhci_msi_irq
 Disabling IRQ #16
 Freeing unused kernel memory: 3548K
 Write protecting the kernel read-only data: 16384k
 Freeing unused kernel memory: 1116K
 Freeing unused kernel memory: 1840K
 readahead: starting
nt /proc/bus/usb does not exist
		Welcome to Fedora 
		Press 'I' to enter interactive startup.
 udev: starting version 153
 udevd (244): /proc/244/oom_adj is deprecated, please use /proc/244/oom_score_adj instead.
: error opening ATTR{/sys/devices/system/cpu/cpu0/online} for writing: Permission denied

 ehci_hcd: module verification failed: signature and/or required key missing - tainting kernel
 ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
 ehci-pci: EHCI PCI platform driver
 ehci-pci 0000:00:1a.0: EHCI Host Controller
 ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 3
 ehci-pci 0000:00:1a.0: debug port 2
 ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
 genirq: Flags mismatch irq 16. 00000080 (ehci_hcd:usb3) vs. 00000000 (xhci_hcd)
 CPU: 0 PID: 307 Comm: modprobe Tainted: G            E   4.12.0-rc1-test-dirty #24
 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
 Call Trace:
  dump_stack+0x9a/0xd6
  __setup_irq+0x5d4/0x630
  request_threaded_irq+0x10d/0x190
  usb_add_hcd+0x658/0x970
  ? for_each_companion+0x3e/0xb0
  usb_hcd_pci_probe+0x3e4/0x490
  ehci_pci_probe+0x36/0x40 [ehci_pci]
  local_pci_probe+0x45/0xa0
  ? pci_match_device+0xca/0x110
  pci_device_probe+0xdb/0x130
  driver_probe_device+0x2ed/0x480
  __driver_attach+0xd5/0x100
  ? driver_probe_device+0x480/0x480
  bus_for_each_dev+0x62/0xa0
  driver_attach+0x1e/0x20
  bus_add_driver+0x1c6/0x290
  driver_register+0x60/0xe0
  __pci_register_driver+0x60/0x70
  ? 0xffffffffc0346000
  ehci_pci_init+0x6a/0x1000 [ehci_pci]
  do_one_initcall+0x43/0x190
  ? kmem_cache_alloc_trace+0x1be/0x200
  do_init_module+0x7d/0x210
  load_module+0x1891/0x1eb0
  ? vmap_page_range_noflush+0x29b/0x370
  ? show_coresize+0x30/0x30
  SYSC_init_module+0x143/0x180
  ? load_module+0x5/0x1eb0
  ? SYSC_init_module+0x143/0x180
  SyS_init_module+0xe/0x10
  entry_SYSCALL_64_fastpath+0x23/0xc2
 RIP: 0033:0x3b918e0ffa
 RSP: 002b:00007ffd11d575c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
 RAX: ffffffffffffffda RBX: 000000000061f950 RCX: 0000003b918e0ffa
 RDX: 000000000061f7d0 RSI: 00000000000036b0 RDI: 000000000062c9e0
 RBP: 0000000000000000 R08: 0000000000630090 R09: 00007f019c07c700
 R10: 00007ffd11d574f0 R11: 0000000000000246 R12: 0000000000626200
 R13: 000000000061f930 R14: 0000000000000000 R15: 000000000061f420
 ehci-pci 0000:00:1a.0: request interrupt 16 failed
 ehci-pci 0000:00:1a.0: USB bus 3 deregistered
 ehci-pci 0000:00:1a.0: init 0000:00:1a.0 fail, -16
 ehci-pci: probe of 0000:00:1a.0 failed with error -16
 ehci-pci 0000:00:1d.0: EHCI Host Controller
 ehci-pci 0000:00:1d.0: new USB bus registered, assigned bus number 3
 ehci-pci 0000:00:1d.0: debug port 2
 ehci-pci 0000:00:1d.0: cache line size of 64 is not supported
 ehci-pci 0000:00:1d.0: irq 23, io mem 0xf7d1b000
 ehci-pci 0000:00:1d.0: USB 2.0 started, EHCI 1.00
 hub 3-0:1.0: USB hub found
 hub 3-0:1.0: 2 ports detected
 sr 0:0:0:0: Attached scsi generic sg0 type 5
 sd 1:0:0:0: Attached scsi generic sg1 type 0
 r8169 0000:02:00.0 eth2: renamed from eth0
 usb 3-1: new high-speed USB device number 2 using ehci-pci
 hub 3-1:1.0: ignoring external hub
 xhci_hcd 0000:00:14.0: Error while assigning device slot ID
 xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 32.
 usb usb1-port5: couldn't allocate usb_device


Setting up Logical Volume Management:   1 logical volume(s) in volume group "vg00" now active
  3 logical volume(s) in volume group "fedora" now active

Checking filesystems
Checking all file systems.
 fsck.ext4 -a /dev/sda7 
/dev/sda7: clean, 210030/1313760 files, 1911684/5245214 blocks

 EXT4-fs (sda7): re-mounted. Opts: (null)

 EXT4-fs (dm-0): mounting ext3 file system using the ext4 subsystem
 EXT4-fs (dm-0): warning: maximal mount count reached, running e2fsck is recommended
 EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
 EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)



Entering non-interactive startup
Starting monitoring for VG fedora:   3 logical volume(s) in volume group "fedora" monitored

Starting monitoring for VG vg00:   1 logical volume(s) in volume group "vg00" monitored

ip6tables: Applying firewall rules: WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
FATAL: Module ip6_tables not found.
ip6tables-restore v1.4.7: ip6tables-restore: unable to initialize table 'filter'

Error occurred at line: 3
Try `ip6tables-restore -h' or 'ip6tables-restore --help' for more information.

iptables: Applying firewall rules: WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
FATAL: Module ip_tables not found.
iptables-restore v1.4.7: iptables-restore: unable to initialize table 'filter'

Error occurred at line: 3
Try `iptables-restore -h' or 'iptables-restore --help' for more information.







Tried to locate /boot/vmlinuz-4.12.0-rc1-test-dirty
Cannot open `/boot/vmlinuz-4.12.0-rc1-test-dirty': stream error




 r8169 0000:02:00.0: Direct firmware load for rtl_nic/rtl8168g-2.fw failed with error -2
 r8169 0000:02:00.0 eth2: unable to load firmware patch rtl_nic/rtl8168g-2.fw (-2)
 genirq: Flags mismatch irq 16. 00000080 (eth2) vs. 00000000 (xhci_hcd)
 CPU: 1 PID: 934 Comm: NetworkManager Tainted: G            E   4.12.0-rc1-test-dirty #24
 Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
 Call Trace:
  dump_stack+0x9a/0xd6
  __setup_irq+0x5d4/0x630
  request_threaded_irq+0x10d/0x190
  rtl_open+0x4af/0xc80
  __dev_open+0x95/0xf0
  __dev_change_flags+0x9d/0x160
  dev_change_flags+0x29/0x60
  do_setlink+0x306/0xc60
  ? sched_clock+0x13/0x20
  ? sched_clock+0x13/0x20
  ? sched_clock_local+0x17/0x90
  ? deactivate_slab+0x592/0x600
  ? nla_parse+0x35/0x140
  rtnl_setlink+0xe2/0x130
  rtnetlink_rcv_msg+0x8a/0x1f0
  ? rtnetlink_rcv+0x1b/0x40
  ? rtnl_newlink+0x910/0x910
  netlink_rcv_skb+0xd3/0xf0
  rtnetlink_rcv+0x2a/0x40
  netlink_unicast+0x165/0x1e0
  netlink_sendmsg+0x26b/0x340
  ___sys_sendmsg+0x2e9/0x300
  ? ___sys_recvmsg+0x174/0x200
  ? SYSC_sendto+0x102/0x180
  __sys_sendmsg+0x45/0x80
  ? ___sys_sendmsg+0x5/0x300
  ? __sys_sendmsg+0x45/0x80
  SyS_sendmsg+0x12/0x20
  entry_SYSCALL_64_fastpath+0x23/0xc2
 RIP: 0033:0x3b91c0eac0
 RSP: 002b:00007ffe79384be8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
 RAX: ffffffffffffffda RBX: 0000000000713400 RCX: 0000003b91c0eac0
 RDX: 0000000000000000 RSI: 00007ffe79384c30 RDI: 0000000000000009
 RBP: 00007ffe79384ac8 R08: 0000000000001003 R09: 0000000000000004
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
 R13: 0000000000000002 R14: ffffffffffffff58 R15: 00007ffe79384b10


Starting RPC idmapd: WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
FATAL: Module sunrpc not found.
FATAL: Error running install command for sunrpc
Error: RPC MTAB does not exist.


I attached the full dmesg as well as my .config.

-- Steve

[-- Attachment #2: dmesg.gz --]
[-- Type: application/gzip, Size: 16553 bytes --]

[-- Attachment #3: config.gz --]
[-- Type: application/gzip, Size: 32947 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors
  2017-05-19  3:42 [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors Steven Rostedt
@ 2017-05-19  5:42 ` Greg Kroah-Hartman
  2017-05-19 10:08   ` Steven Rostedt
  2017-05-19  8:20 ` Christoph Hellwig
  1 sibling, 1 reply; 10+ messages in thread
From: Greg Kroah-Hartman @ 2017-05-19  5:42 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, linux-usb, Linus Torvalds, Christoph Hellwig, Mathias Nyman

On Thu, May 18, 2017 at 11:42:34PM -0400, Steven Rostedt wrote:
> 
> One of my the configs I use to test ftrace with (configs that have
> caused failures in the past), has lots of irq issues and fails to
> initialize the network of my box. I bisected the problem down to a
> single commit, and when I revert that commit, my box boots without any
> network or irq issues.
> 
> Note, my other configs work fine on this box. I haven't investigated
> which config is also the culprit. But since it use to work with this
> config, I want to report it.

So what commit is causing the problem?

It looks like the ehci driver is having problems, but first, your
interrupts are whack:

>  irq 16: nobody cared (try booting with the "irqpoll" option)
>  CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.12.0-rc1-test-dirty #24
>  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>  Call Trace:
>   <IRQ>
>  devtmpfs: mounted
>   dump_stack+0x9a/0xd6
>   __report_bad_irq+0x35/0xc0
>   note_interrupt+0x234/0x270
>   handle_irq_event_percpu+0x45/0x60
>   handle_irq_event+0x39/0x60
>   handle_fasteoi_irq+0x8f/0x160
>   handle_irq+0x6f/0x110
>   do_IRQ+0x46/0xd0
>   common_interrupt+0x93/0x93
>  RIP: 0010:native_safe_halt+0x6/0x10
>  RSP: 0000:ffffb54240cd7e90 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff7e
>  RAX: 0000000000000000 RBX: ffff8ea214498040 RCX: 0000000000000000
>  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
>  RBP: ffffb54240cd7e90 R08: 0000000000000001 R09: 0000000041129b0c
>  R10: ffffb54240cd7d68 R11: 0000000000000001 R12: 0000000000000002
>  R13: ffff8ea214498040 R14: 0000000000000000 R15: ffff8ea214498040
>   </IRQ>
>   default_idle+0x38/0x160
>   arch_cpu_idle+0xf/0x20
>   default_idle_call+0x28/0x50
>   do_idle+0x182/0x220
>   cpu_startup_entry+0x1d/0x20
>   start_secondary+0x132/0x160
>   secondary_startup_64+0x9f/0x9f
>  handlers:
>  [<ffffffff9a6421a0>] xhci_msi_irq
>  Disabling IRQ #16

Have you tried taking the kernel's advice?  :)

>  ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 3
>  ehci-pci 0000:00:1a.0: debug port 2
>  ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
>  genirq: Flags mismatch irq 16. 00000080 (ehci_hcd:usb3) vs. 00000000 (xhci_hcd)

What does that mean?

>  CPU: 0 PID: 307 Comm: modprobe Tainted: G            E   4.12.0-rc1-test-dirty #24
>  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>  Call Trace:
>   dump_stack+0x9a/0xd6
>   __setup_irq+0x5d4/0x630
>   request_threaded_irq+0x10d/0x190
>   usb_add_hcd+0x658/0x970
>   ? for_each_companion+0x3e/0xb0
>   usb_hcd_pci_probe+0x3e4/0x490
>   ehci_pci_probe+0x36/0x40 [ehci_pci]
>   local_pci_probe+0x45/0xa0
>   ? pci_match_device+0xca/0x110
>   pci_device_probe+0xdb/0x130
>   driver_probe_device+0x2ed/0x480
>   __driver_attach+0xd5/0x100
>   ? driver_probe_device+0x480/0x480
>   bus_for_each_dev+0x62/0xa0
>   driver_attach+0x1e/0x20
>   bus_add_driver+0x1c6/0x290
>   driver_register+0x60/0xe0
>   __pci_register_driver+0x60/0x70
>   ? 0xffffffffc0346000
>   ehci_pci_init+0x6a/0x1000 [ehci_pci]
>   do_one_initcall+0x43/0x190
>   ? kmem_cache_alloc_trace+0x1be/0x200
>   do_init_module+0x7d/0x210
>   load_module+0x1891/0x1eb0
>   ? vmap_page_range_noflush+0x29b/0x370
>   ? show_coresize+0x30/0x30
>   SYSC_init_module+0x143/0x180
>   ? load_module+0x5/0x1eb0
>   ? SYSC_init_module+0x143/0x180
>   SyS_init_module+0xe/0x10
>   entry_SYSCALL_64_fastpath+0x23/0xc2
>  RIP: 0033:0x3b918e0ffa
>  RSP: 002b:00007ffd11d575c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
>  RAX: ffffffffffffffda RBX: 000000000061f950 RCX: 0000003b918e0ffa
>  RDX: 000000000061f7d0 RSI: 00000000000036b0 RDI: 000000000062c9e0
>  RBP: 0000000000000000 R08: 0000000000630090 R09: 00007f019c07c700
>  R10: 00007ffd11d574f0 R11: 0000000000000246 R12: 0000000000626200
>  R13: 000000000061f930 R14: 0000000000000000 R15: 000000000061f420
>  ehci-pci 0000:00:1a.0: request interrupt 16 failed

So ehci can't use the same irq line as xhci?  No sharing allowed?

But other configs on this same hardware work, can you do a diff of a
working vs. not working?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors
  2017-05-19  3:42 [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors Steven Rostedt
  2017-05-19  5:42 ` Greg Kroah-Hartman
@ 2017-05-19  8:20 ` Christoph Hellwig
  2017-05-19 10:14   ` Steven Rostedt
  1 sibling, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2017-05-19  8:20 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Linus Torvalds, Christoph Hellwig, Mathias Nyman,
	Greg Kroah-Hartman

On Thu, May 18, 2017 at 11:42:34PM -0400, Steven Rostedt wrote:
> 
> One of my the configs I use to test ftrace with (configs that have
> caused failures in the past), has lots of irq issues and fails to
> initialize the network of my box. I bisected the problem down to a
> single commit, and when I revert that commit, my box boots without any
> network or irq issues.

Can send me the /proc/interrupts output for the system?  The commit
should not have changed behavior and that will narrow down where
I need to look for an issue.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors
  2017-05-19  5:42 ` Greg Kroah-Hartman
@ 2017-05-19 10:08   ` Steven Rostedt
  2017-05-19 12:37     ` Steven Rostedt
  0 siblings, 1 reply; 10+ messages in thread
From: Steven Rostedt @ 2017-05-19 10:08 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: LKML, linux-usb, Linus Torvalds, Christoph Hellwig, Mathias Nyman

On Fri, 19 May 2017 07:42:23 +0200
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> On Thu, May 18, 2017 at 11:42:34PM -0400, Steven Rostedt wrote:
> > 
> > One of my the configs I use to test ftrace with (configs that have
> > caused failures in the past), has lots of irq issues and fails to
> > initialize the network of my box. I bisected the problem down to a
> > single commit, and when I revert that commit, my box boots without any
> > network or irq issues.
> > 
> > Note, my other configs work fine on this box. I haven't investigated
> > which config is also the culprit. But since it use to work with this
> > config, I want to report it.  
> 
> So what commit is causing the problem?

Ug, I forgot to cut and paste the sha1. I thought I did, but I only cut
and pasted the subject into the subject of this email.

commit 77d45b4500967de674b8f75a9a91f58d57d5704d

> 
> It looks like the ehci driver is having problems, but first, your
> interrupts are whack:

Could be. It's an old board.

> 
> >  irq 16: nobody cared (try booting with the "irqpoll" option)
> >  CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.12.0-rc1-test-dirty #24
> >  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
> >  Call Trace:
> >   <IRQ>
> >  devtmpfs: mounted
> >   dump_stack+0x9a/0xd6
> >   __report_bad_irq+0x35/0xc0
> >   note_interrupt+0x234/0x270
> >   handle_irq_event_percpu+0x45/0x60
> >   handle_irq_event+0x39/0x60
> >   handle_fasteoi_irq+0x8f/0x160
> >   handle_irq+0x6f/0x110
> >   do_IRQ+0x46/0xd0
> >   common_interrupt+0x93/0x93
> >  RIP: 0010:native_safe_halt+0x6/0x10
> >  RSP: 0000:ffffb54240cd7e90 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff7e
> >  RAX: 0000000000000000 RBX: ffff8ea214498040 RCX: 0000000000000000
> >  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> >  RBP: ffffb54240cd7e90 R08: 0000000000000001 R09: 0000000041129b0c
> >  R10: ffffb54240cd7d68 R11: 0000000000000001 R12: 0000000000000002
> >  R13: ffff8ea214498040 R14: 0000000000000000 R15: ffff8ea214498040
> >   </IRQ>
> >   default_idle+0x38/0x160
> >   arch_cpu_idle+0xf/0x20
> >   default_idle_call+0x28/0x50
> >   do_idle+0x182/0x220
> >   cpu_startup_entry+0x1d/0x20
> >   start_secondary+0x132/0x160
> >   secondary_startup_64+0x9f/0x9f
> >  handlers:
> >  [<ffffffff9a6421a0>] xhci_msi_irq
> >  Disabling IRQ #16  
> 
> Have you tried taking the kernel's advice?  :)

You mean the "irqpoll"?  No. This works fine without that commit. Why
should I have to change?

> 
> >  ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 3
> >  ehci-pci 0000:00:1a.0: debug port 2
> >  ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
> >  genirq: Flags mismatch irq 16. 00000080 (ehci_hcd:usb3) vs. 00000000 (xhci_hcd)  
> 
> What does that mean?

No idea ;-)

> 
> >  CPU: 0 PID: 307 Comm: modprobe Tainted: G            E   4.12.0-rc1-test-dirty #24
> >  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
> >  Call Trace:
> >   dump_stack+0x9a/0xd6
> >   __setup_irq+0x5d4/0x630
> >   request_threaded_irq+0x10d/0x190
> >   usb_add_hcd+0x658/0x970
> >   ? for_each_companion+0x3e/0xb0
> >   usb_hcd_pci_probe+0x3e4/0x490
> >   ehci_pci_probe+0x36/0x40 [ehci_pci]
> >   local_pci_probe+0x45/0xa0
> >   ? pci_match_device+0xca/0x110
> >   pci_device_probe+0xdb/0x130
> >   driver_probe_device+0x2ed/0x480
> >   __driver_attach+0xd5/0x100
> >   ? driver_probe_device+0x480/0x480
> >   bus_for_each_dev+0x62/0xa0
> >   driver_attach+0x1e/0x20
> >   bus_add_driver+0x1c6/0x290
> >   driver_register+0x60/0xe0
> >   __pci_register_driver+0x60/0x70
> >   ? 0xffffffffc0346000
> >   ehci_pci_init+0x6a/0x1000 [ehci_pci]
> >   do_one_initcall+0x43/0x190
> >   ? kmem_cache_alloc_trace+0x1be/0x200
> >   do_init_module+0x7d/0x210
> >   load_module+0x1891/0x1eb0
> >   ? vmap_page_range_noflush+0x29b/0x370
> >   ? show_coresize+0x30/0x30
> >   SYSC_init_module+0x143/0x180
> >   ? load_module+0x5/0x1eb0
> >   ? SYSC_init_module+0x143/0x180
> >   SyS_init_module+0xe/0x10
> >   entry_SYSCALL_64_fastpath+0x23/0xc2
> >  RIP: 0033:0x3b918e0ffa
> >  RSP: 002b:00007ffd11d575c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
> >  RAX: ffffffffffffffda RBX: 000000000061f950 RCX: 0000003b918e0ffa
> >  RDX: 000000000061f7d0 RSI: 00000000000036b0 RDI: 000000000062c9e0
> >  RBP: 0000000000000000 R08: 0000000000630090 R09: 00007f019c07c700
> >  R10: 00007ffd11d574f0 R11: 0000000000000246 R12: 0000000000626200
> >  R13: 000000000061f930 R14: 0000000000000000 R15: 000000000061f420
> >  ehci-pci 0000:00:1a.0: request interrupt 16 failed  
> 
> So ehci can't use the same irq line as xhci?  No sharing allowed?
> 
> But other configs on this same hardware work, can you do a diff of a
> working vs. not working?

I could probably run my config-bisect and see what it comes up with.

-- Steve

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors
  2017-05-19  8:20 ` Christoph Hellwig
@ 2017-05-19 10:14   ` Steven Rostedt
  0 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2017-05-19 10:14 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: LKML, Linus Torvalds, Mathias Nyman, Greg Kroah-Hartman

On Fri, 19 May 2017 10:20:03 +0200
Christoph Hellwig <hch@lst.de> wrote:

> On Thu, May 18, 2017 at 11:42:34PM -0400, Steven Rostedt wrote:
> > 
> > One of my the configs I use to test ftrace with (configs that have
> > caused failures in the past), has lots of irq issues and fails to
> > initialize the network of my box. I bisected the problem down to a
> > single commit, and when I revert that commit, my box boots without any
> > network or irq issues.  
> 
> Can send me the /proc/interrupts output for the system?  The commit
> should not have changed behavior and that will narrow down where
> I need to look for an issue.

Sure, here:

# cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       
  0:        140          0          0          0   IO-APIC   2-edge      timer
  4:         91          0        190          0   IO-APIC   4-edge      ttyS0
  8:          0          0          1          0   IO-APIC   8-edge      rtc0
  9:          0          0          0          0   IO-APIC   9-fasteoi   acpi
 16:         13        116         42          0   IO-APIC  16-fasteoi   xhci-hcd:usb1, ehci_hcd:usb3, eth2
 19:        329          0       6008        714   IO-APIC  19-fasteoi   ahci[0000:00:1f.2]
 23:          1          0         12          0   IO-APIC  23-fasteoi   ehci_hcd:usb4
NMI:          0          0          0          0   Non-maskable interrupts
LOC:      10443       7996      17600       9457   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0   Performance monitoring interrupts
IWI:          0          0          0          0   IRQ work interrupts
RTR:          1          0          0          0   APIC ICR read retries
RES:       3665       2733       2160       2566   Rescheduling interrupts
CAL:       1796       1806        258       2079   Function call interrupts
TLB:         14         15         30         10   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0   Threshold APIC interrupts
DFR:          0          0          0          0   Deferred Error APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:          0          0          0          0   Machine check polls
ERR:        130
MIS:          0
PIN:          0          0          0          0   Posted-interrupt notification event
PIW:          0          0          0          0   Posted-interrupt wakeup event

Just in case this makes a difference:

# cat /proc/cmdline 
ro root=/dev/sda7 LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us console=ttyS0,115200 nmi_watchdog=panic earlyprintk=ttyS0,115200 ignore_loglevel selinux=0 crashkernel=128M@256M isocpus=3

-- Steve

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors
  2017-05-19 10:08   ` Steven Rostedt
@ 2017-05-19 12:37     ` Steven Rostedt
  2017-05-19 12:46       ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Steven Rostedt @ 2017-05-19 12:37 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: LKML, linux-usb, Linus Torvalds, Christoph Hellwig, Mathias Nyman

On Fri, 19 May 2017 06:08:56 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> > But other configs on this same hardware work, can you do a diff of a
> > working vs. not working?  
> 
> I could probably run my config-bisect and see what it comes up with.

ktest config bisect ended with:

***************************************
Found bad config: CONFIG_PCI_MSI
***************************************

diffconfig good_config bad_config
-ENA_ETHERNET n
-FM10K n
-GENERIC_MSI_IRQ y
-GENERIC_MSI_IRQ_DOMAIN y
-I40EVF n
-INTEL_IOMMU n
-IRQ_REMAP n
-IXGBEVF n
-LIQUIDIO_VF n
-NFP n
-PCIE_DW_PLAT n
-PCI_MSI_IRQ_DOMAIN y
-VMD n
 PCI_MSI y -> n


When that is not set, it fails to boot. It boots fine if I enable it.

-- Steve

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors
  2017-05-19 12:37     ` Steven Rostedt
@ 2017-05-19 12:46       ` Christoph Hellwig
  2017-05-19 13:45         ` Steven Rostedt
  2017-05-20 16:49         ` Linus Torvalds
  0 siblings, 2 replies; 10+ messages in thread
From: Christoph Hellwig @ 2017-05-19 12:46 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Greg Kroah-Hartman, LKML, linux-usb, Linus Torvalds,
	Christoph Hellwig, Mathias Nyman

On Fri, May 19, 2017 at 08:37:21AM -0400, Steven Rostedt wrote:
> ktest config bisect ended with:
> 
> ***************************************
> Found bad config: CONFIG_PCI_MSI
> ***************************************

Oh, that's interesting.  I think there's been a bug in the !CONFIG_PCI_MSI
fallback for pci_alloc_irq_vectors since the very beginning.  And it
didn't matter for any driver so far, but xhci has a very odd way
to set MSI(-X) vs legacy interrupts.

Can you try the patch below?


diff --git a/include/linux/pci.h b/include/linux/pci.h
index 33c2b0b77429..5a7fd3b6a7b9 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1342,7 +1342,7 @@ pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
 			       unsigned int max_vecs, unsigned int flags,
 			       const struct irq_affinity *aff_desc)
 {
-	if (min_vecs > 1)
+	if (min_vecs > 1 || !(flags & PCI_IRQ_LEGACY))
 		return -EINVAL;
 	return 1;
 }

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors
  2017-05-19 12:46       ` Christoph Hellwig
@ 2017-05-19 13:45         ` Steven Rostedt
  2017-05-20 16:49         ` Linus Torvalds
  1 sibling, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2017-05-19 13:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Greg Kroah-Hartman, LKML, linux-usb, Linus Torvalds,
	Mathias Nyman

On Fri, 19 May 2017 14:46:25 +0200
Christoph Hellwig <hch@lst.de> wrote:

> On Fri, May 19, 2017 at 08:37:21AM -0400, Steven Rostedt wrote:
> > ktest config bisect ended with:
> > 
> > ***************************************
> > Found bad config: CONFIG_PCI_MSI
> > ***************************************  
> 
> Oh, that's interesting.  I think there's been a bug in the !CONFIG_PCI_MSI
> fallback for pci_alloc_irq_vectors since the very beginning.  And it
> didn't matter for any driver so far, but xhci has a very odd way
> to set MSI(-X) vs legacy interrupts.
> 
> Can you try the patch below?

Works. Thanks!

Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve

> 
> 
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 33c2b0b77429..5a7fd3b6a7b9 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1342,7 +1342,7 @@ pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
>  			       unsigned int max_vecs, unsigned int flags,
>  			       const struct irq_affinity *aff_desc)
>  {
> -	if (min_vecs > 1)
> +	if (min_vecs > 1 || !(flags & PCI_IRQ_LEGACY))
>  		return -EINVAL;
>  	return 1;
>  }

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors
  2017-05-19 12:46       ` Christoph Hellwig
  2017-05-19 13:45         ` Steven Rostedt
@ 2017-05-20 16:49         ` Linus Torvalds
  2017-05-20 16:53           ` Christoph Hellwig
  1 sibling, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2017-05-20 16:49 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Steven Rostedt, Greg Kroah-Hartman, LKML, USB list, Mathias Nyman

On Fri, May 19, 2017 at 5:46 AM, Christoph Hellwig <hch@lst.de> wrote:
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 33c2b0b77429..5a7fd3b6a7b9 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1342,7 +1342,7 @@ pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
>                                unsigned int max_vecs, unsigned int flags,
>                                const struct irq_affinity *aff_desc)
>  {
> -       if (min_vecs > 1)
> +       if (min_vecs > 1 || !(flags & PCI_IRQ_LEGACY))
>                 return -EINVAL;
>         return 1;
>  }

Side note: why is it doing that " > 1" check, when any value _other_
than 1 is wrong?

Also, to match the non-MSI implementation, wouldn't it be nicer to
just write it that same way (and also verify "dev->irq"):

        if (flags & PCI_IRQ_LEGACY) {
                if (min_vecs == 1 && dev->irq)
                        return 1;
        }
        return -ENOSPC;

(the exact error value probably doesn't matter in practice, but the
CONFIG_MSI case returns ENOSPC by default and that's what
Documentation/PCI/MSI-HOWTO.txt says too).

Hmm?

             Linus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors
  2017-05-20 16:49         ` Linus Torvalds
@ 2017-05-20 16:53           ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2017-05-20 16:53 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Steven Rostedt, Greg Kroah-Hartman, LKML,
	USB list, Mathias Nyman

On Sat, May 20, 2017 at 09:49:56AM -0700, Linus Torvalds wrote:
> Side note: why is it doing that " > 1" check, when any value _other_
> than 1 is wrong?

It's the same effect, so either one is fine with me.

> Also, to match the non-MSI implementation, wouldn't it be nicer to
> just write it that same way (and also verify "dev->irq"):
> 
>         if (flags & PCI_IRQ_LEGACY) {
>                 if (min_vecs == 1 && dev->irq)
>                         return 1;
>         }
>         return -ENOSPC;
> 
> (the exact error value probably doesn't matter in practice, but the
> CONFIG_MSI case returns ENOSPC by default and that's what
> Documentation/PCI/MSI-HOWTO.txt says too).

Sure.  Just sent the previous version to Bjorn so that he could maybe
make it for -rc2, but I'll respin it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-05-20 16:53 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-05-19  3:42 [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors Steven Rostedt
2017-05-19  5:42 ` Greg Kroah-Hartman
2017-05-19 10:08   ` Steven Rostedt
2017-05-19 12:37     ` Steven Rostedt
2017-05-19 12:46       ` Christoph Hellwig
2017-05-19 13:45         ` Steven Rostedt
2017-05-20 16:49         ` Linus Torvalds
2017-05-20 16:53           ` Christoph Hellwig
2017-05-19  8:20 ` Christoph Hellwig
2017-05-19 10:14   ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox