* 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors @ 2017-08-15 14:54 Meelis Roos 2017-08-15 18:44 ` Bjorn Helgaas 0 siblings, 1 reply; 19+ messages in thread From: Meelis Roos @ 2017-08-15 14:54 UTC (permalink / raw) To: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 t5120 server: can't allocate MSI-X affinity masks. [ 30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k. [ 30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000. [ 30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors [ 30.816882] scsi host1: qla2xxx [ 30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22 [ 30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000. [ 30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors [ 31.367083] scsi host1: qla2xxx [ 31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22 I do not know if the driver works since nothing is attached to the FC HBA at the moment, but from the error messages it looks like the driver fails to load. I booted 4.12 and 4.11 - the red error is not there but the failure seems to be the same error -22: [2478900.385223] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 9.00.00.00-k. [2478900.385610] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000. [2478900.930517] scsi host1: qla2xxx [2478900.990939] qla2xxx: probe of 0000:10:00.0 failed with error -22 [2478900.991222] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000. [2478901.510715] scsi host1: qla2xxx [2478901.581106] qla2xxx: probe of 0000:10:00.1 failed with error -22 Will try older kernels too if it is useful for bisection. 
On an older sparc64 (t1-200) with 4.13.0-rc4, qla2xxx loads fine (nothing is attached there either): [ 30.590064] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k. [ 30.699053] PCI: Enabling device: (0000:02:05.0), cmd 3 [ 30.699122] qla2xxx [0000:02:05.0]-001d: : Found an ISP2200 irq 12 iobase 0x000001ff0000a000. [ 52.463403] scsi host2: qla2xxx [ 52.545973] qla2xxx [0000:02:05.0]-4800:2: DPC handler sleeping. [ 52.627163] qla2xxx [0000:02:05.0]-00fb:2: QLogic QLA22xx - . [ 52.705428] qla2xxx [0000:02:05.0]-00fc:2: ISP2200: PCI (33 MHz) @ 0000:02:05.0 hdma- host#=2 fw=2.02.08 TP. [ 53.503221] qla2xxx [0000:02:05.0]-480f:2: Loop resync scheduled. [ 73.796964] qla2xxx [0000:02:05.0]-8038:2: Cable is unplugged... [ 73.876036] qla2xxx [0000:02:05.0]-883a:2: fw_state=4 (ffff, ffff, ffff, ffff ffff) curr time=ffffa61d. [ 73.999845] qla2xxx [0000:02:05.0]-286c:2: qla2x00_loop_resync *** FAILED ***. [ 74.094861] qla2xxx [0000:02:05.0]-4810:2: Loop resync end. [ 74.168188] qla2xxx [0000:02:05.0]-4800:2: DPC handler sleeping. -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-15 14:54 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors Meelis Roos @ 2017-08-15 18:44 ` Bjorn Helgaas 2017-08-15 20:24 ` Meelis Roos 0 siblings, 1 reply; 19+ messages in thread From: Bjorn Helgaas @ 2017-08-15 18:44 UTC (permalink / raw) To: Meelis Roos Cc: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list, Christoph Hellwig [+cc Christoph] On Tue, Aug 15, 2017 at 05:54:27PM +0300, Meelis Roos wrote: > I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 > t5120 server: can't allocate MSI-X affinity masks. > > [ 30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k. > [ 30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000. > [ 30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors > [ 30.816882] scsi host1: qla2xxx > [ 30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22 > [ 30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000. > [ 30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors > [ 31.367083] scsi host1: qla2xxx > [ 31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22 > > I do not know if the driver works since nothing is attached to the FC > HBA at the moment, but from the error messages it looks like the driver > fails to load. > > I booted 4.12 and 4.11 - the red error is not there but the failure > seems to be the same error -22: -22 is -EINVAL, so not very specific. Many failures probably use this code. There were several IRQ affinity changes between v4.12 and v4.13; it'll probably be obvious to Christoph. > [2478900.385223] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 9.00.00.00-k. > [2478900.385610] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000. 
> [2478900.930517] scsi host1: qla2xxx > [2478900.990939] qla2xxx: probe of 0000:10:00.0 failed with error -22 > [2478900.991222] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000. > [2478901.510715] scsi host1: qla2xxx > [2478901.581106] qla2xxx: probe of 0000:10:00.1 failed with error -22 > > Will try older kernels too if it is useful for bisection. > > On an older sparc64 (t1-200) with 4.13.0-rc4, qla2xxx loads fine (nothing is attached there either): > > [ 30.590064] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k. > [ 30.699053] PCI: Enabling device: (0000:02:05.0), cmd 3 > [ 30.699122] qla2xxx [0000:02:05.0]-001d: : Found an ISP2200 irq 12 iobase 0x000001ff0000a000. > [ 52.463403] scsi host2: qla2xxx > [ 52.545973] qla2xxx [0000:02:05.0]-4800:2: DPC handler sleeping. > [ 52.627163] qla2xxx [0000:02:05.0]-00fb:2: QLogic QLA22xx - . > [ 52.705428] qla2xxx [0000:02:05.0]-00fc:2: ISP2200: PCI (33 MHz) @ 0000:02:05.0 hdma- host#=2 fw=2.02.08 TP. > [ 53.503221] qla2xxx [0000:02:05.0]-480f:2: Loop resync scheduled. > [ 73.796964] qla2xxx [0000:02:05.0]-8038:2: Cable is unplugged... > [ 73.876036] qla2xxx [0000:02:05.0]-883a:2: fw_state=4 (ffff, ffff, ffff, ffff ffff) curr time=ffffa61d. > [ 73.999845] qla2xxx [0000:02:05.0]-286c:2: qla2x00_loop_resync *** FAILED ***. > [ 74.094861] qla2xxx [0000:02:05.0]-4810:2: Loop resync end. > [ 74.168188] qla2xxx [0000:02:05.0]-4800:2: DPC handler sleeping. > > > -- > Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 19+ messages in thread
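Bjorn's remark that -22 is -EINVAL can be checked from user space. The following is a minimal sketch, assuming Linux errno numbering (EINVAL is 22); the helper names `is_einval` and `probe_error_text` are hypothetical and do not come from the kernel or the qla2xxx driver.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: true when a kernel-style negative return code
 * (such as the -22 in "probe ... failed with error -22") is -EINVAL. */
int is_einval(int ret)
{
	return ret == -EINVAL;
}

/* Hypothetical helper: human-readable text for a negative return code.
 * The kernel reports errors as negated errno values, so negate again
 * before handing the value to strerror(). */
const char *probe_error_text(int ret)
{
	return strerror(-ret);
}
```

Calling `is_einval(-22)` should return nonzero on Linux, which matches the reading that the probe failed with a generic "invalid argument" code rather than anything specific to MSI-X.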
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-15 18:44 ` Bjorn Helgaas @ 2017-08-15 20:24 ` Meelis Roos 2017-08-16 18:39 ` Meelis Roos 0 siblings, 1 reply; 19+ messages in thread From: Meelis Roos @ 2017-08-15 20:24 UTC (permalink / raw) To: Bjorn Helgaas Cc: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list, Christoph Hellwig > On Tue, Aug 15, 2017 at 05:54:27PM +0300, Meelis Roos wrote: > > I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 > > t5120 server: can't allocate MSI-X affinity masks. > > > > [ 30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k. > > [ 30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000. > > [ 30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors > > [ 30.816882] scsi host1: qla2xxx > > [ 30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22 > > [ 30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000. > > [ 30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors > > [ 31.367083] scsi host1: qla2xxx > > [ 31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22 > > > > I do not know if the driver works since nothing is attached to the FC > > HBA at the moment, but from the error messages it looks like the driver > > fails to load. > > > > I booted 4.12 and 4.11 - the red error is not there but the failure > > seems to be the same error -22: 4.10.0 works, 4.11.0 errors out with EINVAL and 4.13-rc4 errors out with more verbose MSI messages. So something between 4.10 and 4.11 has broken it. Also, 4.13-rc4 is broken on another sun4v here (T1000). So it seems to be sun4v interrupt related. -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-15 20:24 ` Meelis Roos @ 2017-08-16 18:39 ` Meelis Roos 2017-08-16 19:02 ` Bjorn Helgaas 2017-08-17 10:09 ` Christoph Hellwig 0 siblings, 2 replies; 19+ messages in thread From: Meelis Roos @ 2017-08-16 18:39 UTC (permalink / raw) To: Bjorn Helgaas Cc: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list, Christoph Hellwig > > > I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 > > > t5120 server: can't allocate MSI-X affinity masks. > > > > > > [ 30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k. > > > [ 30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000. > > > [ 30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors > > > [ 30.816882] scsi host1: qla2xxx > > > [ 30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22 > > > [ 30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000. > > > [ 30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors > > > [ 31.367083] scsi host1: qla2xxx > > > [ 31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22 > > > > > > I do not know if the driver works since nothing is attached to the FC > > > HBA at the moment, but from the error messages it looks like the driver > > > fails to load. > > > > > > I booted 4.12 and 4.11 - the red error is not there but the failure > > > seems to be the same error -22: > > 4.10.0 works, 4.11.0 errors out with EINVAL and 4.13-rc4 errors out > with more verbose MSI messages. So something between 4.10 and 4.11 has > broken it. I cannot reproduce the misbehaviour on the older kernels. I checked out earlier kernels and recompiled them (old config lost, nothing changed AFAIK), everything works up to 4.12 inclusive. > Also, 4.13-rc4 is broken on another sun4v here (T1000).
So it seems to > be sun4v interrupt related. This still holds - 4.13-rc4 has MSI trouble on at least 2 of my sun4v machines. -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-16 18:39 ` Meelis Roos @ 2017-08-16 19:02 ` Bjorn Helgaas 2017-08-17 14:47 ` Meelis Roos 2017-08-21 18:27 ` David Miller 2017-08-17 10:09 ` Christoph Hellwig 1 sibling, 2 replies; 19+ messages in thread From: Bjorn Helgaas @ 2017-08-16 19:02 UTC (permalink / raw) To: Meelis Roos Cc: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list, Christoph Hellwig On Wed, Aug 16, 2017 at 09:39:08PM +0300, Meelis Roos wrote: > > > > I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 > > > > t5120 server: can't allocate MSI-X affinity masks. > > > > > > > > [ 30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k. > > > > [ 30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000. > > > > [ 30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors > > > > [ 30.816882] scsi host1: qla2xxx > > > > [ 30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22 > > > > [ 30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000. > > > > [ 30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors > > > > [ 31.367083] scsi host1: qla2xxx > > > > [ 31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22 > > > > > > > > I do not know if the driver works since nothing is attached to the FC > > > > HBA at the moment, but from the error messages it looks like the driver > > > > fails to load. > > > > > > > > I booted 4.12 and 4.11 - the red error is not there but the failure > > > > seems to be the same error -22: > > > > 4.10.0 works, 4.11.0 errors out with EINVAL and 4.13-rc4 errorr sout > > with more verbose MSI messages. So something between 4.10 and 4.11 has > > broken it. > > I can not reproduice the older kernels that misbehave. 
I checked out > earlier kernels and recompiled them (old config lost, nothing changed > AFAIK), everything works up to 4.12 inclusive. > > > Also, 4.13-rc4 is broken on another sun4v here (T1000). So it seems to > > be sun4v interrupt related. > > This still holds - 4.13-rc4 has MSI trouble on at least 2 of my sun4v > machines. IIUC, that means v4.12 works and v4.13-rc4 does not, so this is a regression we introduced this cycle. If nobody steps up with a theory, bisecting might be the easiest path forward. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-16 19:02 ` Bjorn Helgaas @ 2017-08-17 14:47 ` Meelis Roos 2017-08-21 18:27 ` David Miller 1 sibling, 0 replies; 19+ messages in thread From: Meelis Roos @ 2017-08-17 14:47 UTC (permalink / raw) To: Bjorn Helgaas Cc: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list, Christoph Hellwig > On Wed, Aug 16, 2017 at 09:39:08PM +0300, Meelis Roos wrote: > > > > > I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 > > > > > t5120 server: can't allocate MSI-X affinity masks. > > > > > > > > > > [ 30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k. > > > > > [ 30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000. > > > > > [ 30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors > > > > > [ 30.816882] scsi host1: qla2xxx > > > > > [ 30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22 > > > > > [ 30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000. > > > > > [ 30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors > > > > > [ 31.367083] scsi host1: qla2xxx > > > > > [ 31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22 > > > > > > IIUC, that means v4.12 works and v4.13-rc4 does not, so this is a > regression we introduced this cycle. Yes, that is my understanding too. But under some circumstances/configs it has been problematic before too. I could not reproduce those circumstances. > If nobody steps up with a theory, bisecting might be the easiest path > forward. I finished bisecting but was not successful. The pattern was strange: good good skip good skip good skip .... bad bad bad bad bad bad. The first bad commit was an unrelated Xen merge. Reverting this commit does not fix the problem. It looks as if at some point it got broken by side effects (code size or whatever).
The skips were mostly because of repeated on on sparc cpuidle code, and initially some in IOMMU-related code. This might skew the results, since some commits were not tested.

git bisect start
# good: [6f7da290413ba713f0cdd9ff1a2a9bb129ef4f6c] Linux 4.12
git bisect good 6f7da290413ba713f0cdd9ff1a2a9bb129ef4f6c
# bad: [aae4e7a8bc44722fe70d58920a36916b1043195e] Linux 4.13-rc4
git bisect bad aae4e7a8bc44722fe70d58920a36916b1043195e
# good: [920f2ecdf6c3b3526f60fbd38c68597953cad3ee] Merge tag 'sound-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good 920f2ecdf6c3b3526f60fbd38c68597953cad3ee
# skip: [af3c8d98508d37541d4bf57f13a984a7f73a328c] Merge tag 'drm-for-v4.13' of git://people.freedesktop.org/~airlied/linux
git bisect skip af3c8d98508d37541d4bf57f13a984a7f73a328c
# good: [d29cb3e45e923715f74d8a08d5c1ea996dce5a59] xfs: make _bmap_count_blocks consistent wrt delalloc extent behavior
git bisect good d29cb3e45e923715f74d8a08d5c1ea996dce5a59
# good: [fa6d095eb23a8b1aae78d221879032497f6e457f] drm/tegra: Add driver documentation
git bisect good fa6d095eb23a8b1aae78d221879032497f6e457f
# good: [37e51a7640c275999ea0c35410c42e6d896ff7fa] mm: clean up error handling in write_one_page
git bisect good 37e51a7640c275999ea0c35410c42e6d896ff7fa
# good: [4b9cdd96e7ea3dc2cd0edac67835f6f38c4f14c9] drm/omap: remove CLUT
git bisect good 4b9cdd96e7ea3dc2cd0edac67835f6f38c4f14c9
# good: [7f56c30bd0a232822aca38d288da475613bdff9b] vfio: Remove unnecessary uses of vfio_container.group_lock
git bisect good 7f56c30bd0a232822aca38d288da475613bdff9b
# good: [d7631e30434e7fcf025dd2a7cba879f203f7849b] switch compat_drm_getsareactx() to drm_ioctl_kernel()
git bisect good d7631e30434e7fcf025dd2a7cba879f203f7849b
# skip: [f991af3daabaecff34684fd51fac80319d1baad1] mqueue: fix a use-after-free in sys_mq_notify()
git bisect skip f991af3daabaecff34684fd51fac80319d1baad1
# good: [ecbb903c56745d59c301db26dd7d8b74b520eb84] NFS: Be more careful about mapping file permissions
git bisect good ecbb903c56745d59c301db26dd7d8b74b520eb84
# skip: [b49defe83659cefbb1763d541e779da32594ab10] kvm: avoid unused variable warning for UP builds
git bisect skip b49defe83659cefbb1763d541e779da32594ab10
# good: [b5ab16bf64347ebc9dbdc51a4f603511babda1e6] drm/amdgpu: properly byteswap gpu_info firmware
git bisect good b5ab16bf64347ebc9dbdc51a4f603511babda1e6
# good: [3941dae15ed90437396389e8bb7d2d5b3e63ba4a] drm_dp_aux_dev: switch to read_iter/write_iter
git bisect good 3941dae15ed90437396389e8bb7d2d5b3e63ba4a
# good: [f0d9c8924e2c33764dca0c3a4f693a345ecf6579] [media] media: imx: Add IC subdev drivers
git bisect good f0d9c8924e2c33764dca0c3a4f693a345ecf6579
# skip: [101dd590a7fa37954540cf3149a1c502c0acc524] powerpc/perf: Avoid spurious PMU interrupts after idle
git bisect skip 101dd590a7fa37954540cf3149a1c502c0acc524
# good: [eb0f0373e575822cf35949627b92533c7c41629c] drm/amdgpu: fix a typo in comment
git bisect good eb0f0373e575822cf35949627b92533c7c41629c
# skip: [3f0bd8dad0db73f5d71b355aec5ab33b374260ba] powerpc/perf: Add POWER9 alternate PM_RUN_CYC and PM_RUN_INST_CMPL events
git bisect skip 3f0bd8dad0db73f5d71b355aec5ab33b374260ba
# good: [96edd61dcf44362d3ef0bed1a5361e0ac7886a63] xen/balloon: don't online new memory initially
git bisect good 96edd61dcf44362d3ef0bed1a5361e0ac7886a63
# bad: [bc78d646e708dabd1744ca98744dea316f459497] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
git bisect bad bc78d646e708dabd1744ca98744dea316f459497
# bad: [0a2a1330d2621c7f963d9f55bb094811cc1c06b9] Merge branch 'for-4.13-part3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
git bisect bad 0a2a1330d2621c7f963d9f55bb094811cc1c06b9
# bad: [0ce2f385119344dc620ec635e355008a9d6f8401] Merge tag 'acpi-4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect bad 0ce2f385119344dc620ec635e355008a9d6f8401
# bad: [da08f35b0f82b0a7a79f518faf8d0c0b477f91bc] Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
git bisect bad da08f35b0f82b0a7a79f518faf8d0c0b477f91bc
# bad: [25f6a53799d667283d3bee29a6ac75ae3dae38dc] Merge tag 'jfs-4.13' of git://github.com/kleikamp/linux-shaggy
git bisect bad 25f6a53799d667283d3bee29a6ac75ae3dae38dc
# bad: [eeb7c41d9d7c0902accb1d481fe78d84d30c69cc] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
git bisect bad eeb7c41d9d7c0902accb1d481fe78d84d30c69cc
# bad: [520eccdfe187591a51ea9ab4c1a024ae4d0f68d9] Linux 4.13-rc2
git bisect bad 520eccdfe187591a51ea9ab4c1a024ae4d0f68d9
# bad: [f47e07bc5f1a5c48ed60a8ee55352cb4b2bf4d51] Fix up MAINTAINERS file problems
git bisect bad f47e07bc5f1a5c48ed60a8ee55352cb4b2bf4d51
# bad: [a56e88ec05df50110f2bf578b6e17128f37111ed] Merge tag 'for-linus-4.13b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
git bisect bad a56e88ec05df50110f2bf578b6e17128f37111ed
# first bad commit: [a56e88ec05df50110f2bf578b6e17128f37111ed] Merge tag 'for-linus-4.13b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

-- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-16 19:02 ` Bjorn Helgaas 2017-08-17 14:47 ` Meelis Roos @ 2017-08-21 18:27 ` David Miller 2017-08-21 18:34 ` Christoph Hellwig 1 sibling, 1 reply; 19+ messages in thread From: David Miller @ 2017-08-21 18:27 UTC (permalink / raw) To: helgaas; +Cc: mroos, sparclinux, linux-pci, qla2xxx-upstream, linux-kernel, hch From: Bjorn Helgaas <helgaas@kernel.org> Date: Wed, 16 Aug 2017 14:02:41 -0500 > On Wed, Aug 16, 2017 at 09:39:08PM +0300, Meelis Roos wrote: >> > > > I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 >> > > > t5120 server: can't allocate MSI-X affinity masks. >> > > > >> > > > [ 30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k. >> > > > [ 30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000. >> > > > [ 30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors >> > > > [ 30.816882] scsi host1: qla2xxx >> > > > [ 30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22 >> > > > [ 30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000. >> > > > [ 30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors >> > > > [ 31.367083] scsi host1: qla2xxx >> > > > [ 31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22 >> > > > >> > > > I do not know if the driver works since nothing is attached to the FC >> > > > HBA at the moment, but from the error messages it looks like the driver >> > > > fails to load. >> > > > >> > > > I booted 4.12 and 4.11 - the red error is not there but the failure >> > > > seems to be the same error -22: >> > >> > 4.10.0 works, 4.11.0 errors out with EINVAL and 4.13-rc4 errorr sout >> > with more verbose MSI messages. So something between 4.10 and 4.11 has >> > broken it. >> >> I can not reproduice the older kernels that misbehave. 
I checked out >> earlier kernels and recompiled them (old config lost, nothing changed >> AFAIK), everything works up to 4.12 inclusive. >> >> > Also, 4.13-rc4 is broken on another sun4v here (T1000). So it seems to >> > be sun4v interrupt related. >> >> This still holds - 4.13-rc4 has MSI trouble on at least 2 of my sun4v >> machines. > > IIUC, that means v4.12 works and v4.13-rc4 does not, so this is a > regression we introduced this cycle. > > If nobody steps up with a theory, bisecting might be the easiest path > forward.

I suspect the test added by:

commit 6f9a22bc5775d231ab8fbe2c2f3c88e45e3e7c28
Author: Michael Hernandez <michael.hernandez@cavium.com>
Date: Thu May 18 10:47:47 2017 -0700

    PCI/MSI: Ignore affinity if pre/post vector count is more than min_vecs

is triggering. The rest of the failure cases are memory allocation failures which should not be happening here.

There have only been 5 commits to kernel/irq/affinity.c since v4.10. I suppose we have been getting away with something that has silently been allowed in the past, or something like that.

Meelis, can you run with the following debugging patch?

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index d69bd77252a7..d16c6326000a 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -110,6 +110,9 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	struct cpumask *masks;
 	cpumask_var_t nmsk, *node_to_present_cpumask;
 
+	pr_info("irq_create_affinity_masks: nvecs[%d] affd->pre_vectors[%d] "
+		"affd->post_vectors[%d]\n",
+		nvecs, affd->pre_vectors, affd->post_vectors);
 	/*
 	 * If there aren't any vectors left after applying the pre/post
 	 * vectors don't bother with assigning affinity.

^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-21 18:27 ` David Miller @ 2017-08-21 18:34 ` Christoph Hellwig 2017-08-21 19:20 ` mroos 0 siblings, 1 reply; 19+ messages in thread From: Christoph Hellwig @ 2017-08-21 18:34 UTC (permalink / raw) To: David Miller Cc: helgaas, mroos, sparclinux, linux-pci, qla2xxx-upstream, linux-kernel, hch I think with this patch from -rc6 the symptoms should be cured: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c005390374957baacbc38eef96ea360559510aa7 if that theory is right. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-21 18:34 ` Christoph Hellwig @ 2017-08-21 19:20 ` mroos 2017-08-21 20:35 ` David Miller 0 siblings, 1 reply; 19+ messages in thread From: mroos @ 2017-08-21 19:20 UTC (permalink / raw) To: Christoph Hellwig Cc: David Miller, helgaas, sparclinux, linux-pci, qla2xxx-upstream, linux-kernel > I think with this patch from -rc6 the symptoms should be cured: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c005390374957baacbc38eef96ea360559510aa7 > > if that theory is right. The result with 4.13-rc6 is positive but mixed: the messages about MSI-X affinity masks are still there, but the rest of the detection works and the driver is loaded successfully: [ 29.924282] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k. [ 29.924710] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000. [ 29.925581] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors [ 30.483422] scsi host1: qla2xxx [ 35.495031] qla2xxx [0000:10:00.0]-00fb:1: QLogic QLE2462 - SG-(X)PCIE2FC-QF4, Sun StorageTek 4 Gb FC Enterprise PCI-Express Dual Channel H. [ 35.495274] qla2xxx [0000:10:00.0]-00fc:1: ISP2432: PCIe (2.5GT/s x4) @ 0000:10:00.0 hdma- host#=1 fw=7.03.00 (9496). [ 35.495615] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000. [ 35.496409] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors [ 35.985355] scsi host2: qla2xxx [ 40.996991] qla2xxx [0000:10:00.1]-00fb:2: QLogic QLE2462 - SG-(X)PCIE2FC-QF4, Sun StorageTek 4 Gb FC Enterprise PCI-Express Dual Channel H. [ 40.997251] qla2xxx [0000:10:00.1]-00fc:2: ISP2432: PCIe (2.5GT/s x4) @ 0000:10:00.1 hdma- host#=2 fw=7.03.00 (9496). [ 51.880945] qla2xxx [0000:10:00.0]-8038:1: Cable is unplugged... [ 57.402900] qla2xxx [0000:10:00.1]-8038:2: Cable is unplugged...
With Dave Miller's patch on top of 4.13-rc6, I see the following before both MSI-X messages: irq_create_affinity_masks: nvecs[2] affd->pre_vectors[2] affd->post_vectors[0] -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-21 19:20 ` mroos @ 2017-08-21 20:35 ` David Miller 2017-08-22 5:02 ` Meelis Roos 2017-08-22 6:35 ` Christoph Hellwig 0 siblings, 2 replies; 19+ messages in thread From: David Miller @ 2017-08-21 20:35 UTC (permalink / raw) To: mroos; +Cc: hch, helgaas, sparclinux, linux-pci, qla2xxx-upstream, linux-kernel From: mroos@linux.ee Date: Mon, 21 Aug 2017 22:20:22 +0300 (EEST) >> I think with this patch from -rc6 the symptoms should be cured: >> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c005390374957baacbc38eef96ea360559510aa7 >> >> if that theory is right. > > The result with 4.13-rc6 is positive but mixed: the messages about MSI-X > affinity masks are still there, but the rest of the detection works and the > driver is loaded successfully: Is this an SMP system? I ask because the commit log message indicates that this failure is not expected to ever happen on SMP. We really need to root cause this. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-21 20:35 ` David Miller @ 2017-08-22 5:02 ` Meelis Roos 2017-08-22 6:35 ` Christoph Hellwig 1 sibling, 0 replies; 19+ messages in thread From: Meelis Roos @ 2017-08-22 5:02 UTC (permalink / raw) To: David Miller Cc: hch, helgaas, sparclinux, linux-pci, qla2xxx-upstream, linux-kernel > > >> I think with this patch from -rc6 the symptoms should be cured: > >> > >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c005390374957baacbc38eef96ea360559510aa7 > >> > >> if that theory is right. > > > > The result with 4.13-rc6 is positive but mixed: the messages about MSI-X > > affinity masks are still there, but the rest of the detection works and the > > driver is loaded successfully: > > Is this an SMP system? Yes, T5120. -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-21 20:35 ` David Miller 2017-08-22 5:02 ` Meelis Roos @ 2017-08-22 6:35 ` Christoph Hellwig 2017-08-22 16:31 ` David Miller 1 sibling, 1 reply; 19+ messages in thread From: Christoph Hellwig @ 2017-08-22 6:35 UTC (permalink / raw) To: David Miller Cc: mroos, hch, helgaas, sparclinux, linux-pci, qla2xxx-upstream, linux-kernel On Mon, Aug 21, 2017 at 01:35:49PM -0700, David Miller wrote: > I ask because the commit log message indicates that this failure is > not expected to ever happen on SMP. I fear my commit message (but not the code) might be wrong. irq_create_affinity_masks can return NULL any time we don't have any affinity masks. I've already had a discussion about this elsewhere with Bjorn, and I suspect we need to kill the warning or move it to irq_create_affinity_masks only for genuine failure cases. > > We really need to root cause this. ---end quoted text--- ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-22 6:35 ` Christoph Hellwig @ 2017-08-22 16:31 ` David Miller 2017-08-22 16:33 ` Meelis Roos 2017-08-22 16:39 ` Christoph Hellwig 0 siblings, 2 replies; 19+ messages in thread From: David Miller @ 2017-08-22 16:31 UTC (permalink / raw) To: hch; +Cc: mroos, helgaas, sparclinux, linux-pci, qla2xxx-upstream, linux-kernel From: Christoph Hellwig <hch@lst.de> Date: Tue, 22 Aug 2017 08:35:05 +0200 > On Mon, Aug 21, 2017 at 01:35:49PM -0700, David Miller wrote: >> I ask because the commit log message indicates that this failure is >> not expected to ever happen on SMP. > > I fear my commit message (but not the code) might be wrong. > irq_create_affinity_masks can return NULL any time we don't have any > affinity masks. I've already had a discussion about this elsewhere > with Bjorn, and I suspect we need to kill the warning or move it > to irq_create_affinity_masks only for genuine failure cases. This is a rather large machine with 64 or more cpus and several NUMA nodes. Why wouldn't there be any affinity masks available? That's why I want to root cause this. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors 2017-08-22 16:31 ` David Miller @ 2017-08-22 16:33 ` Meelis Roos 2017-08-22 16:45 ` David Miller 2017-08-22 16:39 ` Christoph Hellwig 1 sibling, 1 reply; 19+ messages in thread From: Meelis Roos @ 2017-08-22 16:33 UTC (permalink / raw) To: David Miller Cc: hch, helgaas, sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list > > On Mon, Aug 21, 2017 at 01:35:49PM -0700, David Miller wrote: > >> I ask because the commit log message indicates that this failure is > >> not expected to ever happen on SMP. > > > > I fear my commit message (but not the code) might be wrong. > > irq_create_affinity_masks can return NULL any time we don't have any > > affinity masks. I've already had a discussion about this elsewhere > > with Bjorn, and I suspect we need to kill the warning or move it > > to irq_create_affinity_masks only for genuine failure cases. > > This is a rather large machine with 64 or more cpus and several NUMA > nodes. Why wouldn't there be any affinity masks available? T5120 with 1 slot and 32 threads total. I have not configured any NUMA on it; is there any reason for that? -- Meelis Roos (mroos@linux.ee) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-22 16:33 ` Meelis Roos
@ 2017-08-22 16:45 ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2017-08-22 16:45 UTC
To: mroos; +Cc: hch, helgaas, sparclinux, linux-pci, qla2xxx-upstream, linux-kernel

From: Meelis Roos <mroos@linux.ee>
Date: Tue, 22 Aug 2017 19:33:55 +0300 (EEST)

>> > On Mon, Aug 21, 2017 at 01:35:49PM -0700, David Miller wrote:
>> >> I ask because the commit log message indicates that this failure is
>> >> not expected to ever happen on SMP.
>> >
>> > I fear my commit message (but not the code) might be wrong.
>> > irq_create_affinity_masks can return NULL any time we don't have any
>> > affinity masks. I've already had a discussion about this elsewhere
>> > with Bjorn, and I suspect we need to kill the warning or move it
>> > to irq_create_affinity_masks only for genuine failure cases.
>>
>> This is a rather large machine with 64 or more cpus and several NUMA
>> nodes. Why wouldn't there be any affinity masks available?
>
> T5120 with 1 slot and 32 threads total. I have not configured any NUMA
> on it. Is there any reason for that?

Ok, 32 cpus and 1 NUMA node, my bad :-)
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-22 16:31 ` David Miller
  2017-08-22 16:33 ` Meelis Roos
@ 2017-08-22 16:39 ` Christoph Hellwig
  2017-08-22 16:52 ` David Miller
  1 sibling, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2017-08-22 16:39 UTC
To: David Miller
Cc: hch, mroos, helgaas, sparclinux, linux-pci, qla2xxx-upstream, linux-kernel

On Tue, Aug 22, 2017 at 09:31:39AM -0700, David Miller wrote:
> > I fear my commit message (but not the code) might be wrong.
> > irq_create_affinity_masks can return NULL any time we don't have any
> > affinity masks. I've already had a discussion about this elsewhere
> > with Bjorn, and I suspect we need to kill the warning or move it
> > to irq_create_affinity_masks only for genuine failure cases.
>
> This is a rather large machine with 64 or more cpus and several NUMA
> nodes. Why wouldn't there be any affinity masks available?

The drivers only asked for two MSI-X vectors, and marked both of them
as pre-vectors that should not be spread. So there is no actual
vector left that we want to spread.
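The arithmetic behind the explanation above can be sketched in a few lines of C. This is only a userspace model under stated assumptions, not the kernel's implementation: `struct affinity_request`, `spreadable_vectors()` and `null_masks_expected()` are hypothetical names made up for illustration. The model assumes that only vectors which are neither pre-vectors nor post-vectors are candidates for spreading across CPUs.

```c
#include <stddef.h>

/* Hypothetical stand-in for the pre/post-vector counts a driver passes in. */
struct affinity_request {
    int pre_vectors;   /* leading vectors that must not be spread */
    int post_vectors;  /* trailing vectors that must not be spread */
};

/* Number of vectors left over for spreading across CPUs (never negative). */
static int spreadable_vectors(int nvecs, const struct affinity_request *req)
{
    int left = nvecs - req->pre_vectors - req->post_vectors;

    return left > 0 ? left : 0;
}

/*
 * Returns 1 when there is nothing to spread, i.e. when getting no affinity
 * masks back is the expected outcome rather than an allocation failure.
 */
static int null_masks_expected(int nvecs, const struct affinity_request *req)
{
    return spreadable_vectors(nvecs, req) == 0;
}
```

With the numbers from this thread (2 vectors requested, both marked as pre-vectors), `spreadable_vectors()` yields 0 and `null_masks_expected()` yields 1, which is why the message is not a sign of a genuine failure regardless of how many CPUs or NUMA nodes the machine has.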
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-22 16:39 ` Christoph Hellwig
@ 2017-08-22 16:52 ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2017-08-22 16:52 UTC
To: hch; +Cc: mroos, helgaas, sparclinux, linux-pci, qla2xxx-upstream, linux-kernel

From: Christoph Hellwig <hch@lst.de>
Date: Tue, 22 Aug 2017 18:39:16 +0200

> On Tue, Aug 22, 2017 at 09:31:39AM -0700, David Miller wrote:
>> > I fear my commit message (but not the code) might be wrong.
>> > irq_create_affinity_masks can return NULL any time we don't have any
>> > affinity masks. I've already had a discussion about this elsewhere
>> > with Bjorn, and I suspect we need to kill the warning or move it
>> > to irq_create_affinity_masks only for genuine failure cases.
>>
>> This is a rather large machine with 64 or more cpus and several NUMA
>> nodes. Why wouldn't there be any affinity masks available?
>
> The drivers only asked for two MSI-X vectors, and marked both of them
> as pre-vectors that should not be spread. So there is no actual
> vector left that we want to spread.

Ok, now it makes more sense, and yes the warning should be removed.
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-16 18:39 ` Meelis Roos
  2017-08-16 19:02 ` Bjorn Helgaas
@ 2017-08-17 10:09 ` Christoph Hellwig
  2017-08-17 10:17 ` Meelis Roos
  1 sibling, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2017-08-17 10:09 UTC
To: Meelis Roos
Cc: Bjorn Helgaas, sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list, Christoph Hellwig

Just curious: these are all SMP builds, right? Just got burnt again
by an UP kernel issue in that area that I sent a patch for (to Jens)
a long time ago, but that didn't get fixed.
* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-17 10:09 ` Christoph Hellwig
@ 2017-08-17 10:17 ` Meelis Roos
  0 siblings, 0 replies; 19+ messages in thread
From: Meelis Roos @ 2017-08-17 10:17 UTC
To: Christoph Hellwig
Cc: Bjorn Helgaas, sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list

> Just curious: these are all SMP builds, right?

Yes. 32 threads on that CPU.

I am bisecting it slowly. Some steps crash on boot for seemingly
different reasons, and skipping them does not advance quickly.

--
Meelis Roos (mroos@linux.ee)
end of thread, other threads:[~2017-08-22 16:52 UTC]

Thread overview: 19+ messages
2017-08-15 14:54 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors Meelis Roos
2017-08-15 18:44 ` Bjorn Helgaas
2017-08-15 20:24 ` Meelis Roos
2017-08-16 18:39 ` Meelis Roos
2017-08-16 19:02 ` Bjorn Helgaas
2017-08-17 14:47 ` Meelis Roos
2017-08-21 18:27 ` David Miller
2017-08-21 18:34 ` Christoph Hellwig
2017-08-21 19:20 ` mroos
2017-08-21 20:35 ` David Miller
2017-08-22  5:02 ` Meelis Roos
2017-08-22  6:35 ` Christoph Hellwig
2017-08-22 16:31 ` David Miller
2017-08-22 16:33 ` Meelis Roos
2017-08-22 16:45 ` David Miller
2017-08-22 16:39 ` Christoph Hellwig
2017-08-22 16:52 ` David Miller
2017-08-17 10:09 ` Christoph Hellwig
2017-08-17 10:17 ` Meelis Roos