linux-pci.vger.kernel.org archive mirror
* 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
@ 2017-08-15 14:54 Meelis Roos
  2017-08-15 18:44 ` Bjorn Helgaas
  0 siblings, 1 reply; 19+ messages in thread
From: Meelis Roos @ 2017-08-15 14:54 UTC (permalink / raw)
  To: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list

I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 
t5120 server: can't allocate MSI-X affinity masks.

[   30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k.
[   30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000.
[   30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors
[   30.816882] scsi host1: qla2xxx
[   30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22
[   30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000.
[   30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors
[   31.367083] scsi host1: qla2xxx
[   31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22

I do not know if the driver works since nothing is attached to the FC 
HBA at the moment, but from the error messages it looks like the driver 
fails to load.

I booted 4.12 and 4.11 - the red error is not there but the failure 
seems to be the same error -22:

[2478900.385223] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 9.00.00.00-k.
[2478900.385610] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000.
[2478900.930517] scsi host1: qla2xxx
[2478900.990939] qla2xxx: probe of 0000:10:00.0 failed with error -22
[2478900.991222] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000.
[2478901.510715] scsi host1: qla2xxx
[2478901.581106] qla2xxx: probe of 0000:10:00.1 failed with error -22

Will try older kernels too if it is useful for bisection.

On an older sparc64 (t1-200) with 4.13.0-rc4, qla2xxx loads fine (nothing is attached there either):

[   30.590064] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k.
[   30.699053] PCI: Enabling device: (0000:02:05.0), cmd 3
[   30.699122] qla2xxx [0000:02:05.0]-001d: : Found an ISP2200 irq 12 iobase 0x000001ff0000a000.
[   52.463403] scsi host2: qla2xxx
[   52.545973] qla2xxx [0000:02:05.0]-4800:2: DPC handler sleeping.
[   52.627163] qla2xxx [0000:02:05.0]-00fb:2: QLogic QLA22xx - .
[   52.705428] qla2xxx [0000:02:05.0]-00fc:2: ISP2200: PCI (33 MHz) @ 0000:02:05.0 hdma- host#=2 fw=2.02.08 TP.
[   53.503221] qla2xxx [0000:02:05.0]-480f:2: Loop resync scheduled.
[   73.796964] qla2xxx [0000:02:05.0]-8038:2: Cable is unplugged...
[   73.876036] qla2xxx [0000:02:05.0]-883a:2: fw_state=4 (ffff, ffff, ffff, ffff ffff) curr time=ffffa61d.
[   73.999845] qla2xxx [0000:02:05.0]-286c:2: qla2x00_loop_resync *** FAILED ***.
[   74.094861] qla2xxx [0000:02:05.0]-4810:2: Loop resync end.
[   74.168188] qla2xxx [0000:02:05.0]-4800:2: DPC handler sleeping.


-- 
Meelis Roos (mroos@linux.ee)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-15 14:54 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors Meelis Roos
@ 2017-08-15 18:44 ` Bjorn Helgaas
  2017-08-15 20:24   ` Meelis Roos
  0 siblings, 1 reply; 19+ messages in thread
From: Bjorn Helgaas @ 2017-08-15 18:44 UTC (permalink / raw)
  To: Meelis Roos
  Cc: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list,
	Christoph Hellwig

[+cc Christoph]

On Tue, Aug 15, 2017 at 05:54:27PM +0300, Meelis Roos wrote:
> I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 
> t5120 server: can't allocate MSI-X affinity masks.
> 
> [   30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k.
> [   30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000.
> [   30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors
> [   30.816882] scsi host1: qla2xxx
> [   30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22
> [   30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000.
> [   30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors
> [   31.367083] scsi host1: qla2xxx
> [   31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22
> 
> I do not know if the driver works since nothing is attached to the FC 
> HBA at the moment, but from the error messages it looks like the driver 
> fails to load.
> 
> I booted 4.12 and 4.11 - the red error is not there but the failure 
> seems to be the same error -22:

-22 is -EINVAL, so not very specific.  Many failures probably use this
code.
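
For reference, a minimal userspace sketch of how a negative return value
such as -22 maps to a symbolic errno name. The helper errno_name() is a
made-up name for illustration, not a kernel or libc function, and it
covers only a handful of codes:

```c
#include <errno.h>

/* Hypothetical helper: translate a kernel-style negative return value
 * into a symbolic name for a few common error codes. */
const char *errno_name(int ret)
{
	switch (-ret) {
	case EINVAL: return "EINVAL";	/* invalid argument */
	case ENOMEM: return "ENOMEM";	/* out of memory */
	case ENODEV: return "ENODEV";	/* no such device */
	case EIO:    return "EIO";	/* I/O error */
	default:     return "unknown";
	}
}
```

The name alone does not say which check rejected the probe, which is
why the extra MSI-X message is the more useful lead.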

There were several IRQ affinity changes between v4.12 and v4.13; it'll
probably be obvious to Christoph.

> [2478900.385223] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 9.00.00.00-k.
> [2478900.385610] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000.
> [2478900.930517] scsi host1: qla2xxx
> [2478900.990939] qla2xxx: probe of 0000:10:00.0 failed with error -22
> [2478900.991222] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000.
> [2478901.510715] scsi host1: qla2xxx
> [2478901.581106] qla2xxx: probe of 0000:10:00.1 failed with error -22
> 
> Will try older kernels too if it is useful for bisection.
> 
> On an older sparc64 (t1-200) with 4.13.0-rc4, qla2xxx loads fine (nothing is attached there either):
> 
> [   30.590064] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k.
> [   30.699053] PCI: Enabling device: (0000:02:05.0), cmd 3
> [   30.699122] qla2xxx [0000:02:05.0]-001d: : Found an ISP2200 irq 12 iobase 0x000001ff0000a000.
> [   52.463403] scsi host2: qla2xxx
> [   52.545973] qla2xxx [0000:02:05.0]-4800:2: DPC handler sleeping.
> [   52.627163] qla2xxx [0000:02:05.0]-00fb:2: QLogic QLA22xx - .
> [   52.705428] qla2xxx [0000:02:05.0]-00fc:2: ISP2200: PCI (33 MHz) @ 0000:02:05.0 hdma- host#=2 fw=2.02.08 TP.
> [   53.503221] qla2xxx [0000:02:05.0]-480f:2: Loop resync scheduled.
> [   73.796964] qla2xxx [0000:02:05.0]-8038:2: Cable is unplugged...
> [   73.876036] qla2xxx [0000:02:05.0]-883a:2: fw_state=4 (ffff, ffff, ffff, ffff ffff) curr time=ffffa61d.
> [   73.999845] qla2xxx [0000:02:05.0]-286c:2: qla2x00_loop_resync *** FAILED ***.
> [   74.094861] qla2xxx [0000:02:05.0]-4810:2: Loop resync end.
> [   74.168188] qla2xxx [0000:02:05.0]-4800:2: DPC handler sleeping.
> 
> 
> -- 
> Meelis Roos (mroos@linux.ee)


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-15 18:44 ` Bjorn Helgaas
@ 2017-08-15 20:24   ` Meelis Roos
  2017-08-16 18:39     ` Meelis Roos
  0 siblings, 1 reply; 19+ messages in thread
From: Meelis Roos @ 2017-08-15 20:24 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list,
	Christoph Hellwig

> On Tue, Aug 15, 2017 at 05:54:27PM +0300, Meelis Roos wrote:
> > I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 
> > t5120 server: can't allocate MSI-X affinity masks.
> > 
> > [   30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k.
> > [   30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000.
> > [   30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors
> > [   30.816882] scsi host1: qla2xxx
> > [   30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22
> > [   30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000.
> > [   30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors
> > [   31.367083] scsi host1: qla2xxx
> > [   31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22
> > 
> > I do not know if the driver works since nothing is attached to the FC 
> > HBA at the moment, but from the error messages it looks like the driver 
> > fails to load.
> > 
> > I booted 4.12 and 4.11 - the red error is not there but the failure 
> > seems to be the same error -22:

4.10.0 works, 4.11.0 errors out with EINVAL and 4.13-rc4 errors out 
with more verbose MSI messages. So something between 4.10 and 4.11 has 
broken it.

Also, 4.13-rc4 is broken on another sun4v here (T1000). So it seems to 
be sun4v interrupt related.

-- 
Meelis Roos (mroos@linux.ee)


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-15 20:24   ` Meelis Roos
@ 2017-08-16 18:39     ` Meelis Roos
  2017-08-16 19:02       ` Bjorn Helgaas
  2017-08-17 10:09       ` Christoph Hellwig
  0 siblings, 2 replies; 19+ messages in thread
From: Meelis Roos @ 2017-08-16 18:39 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list,
	Christoph Hellwig

> > > I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 
> > > t5120 server: can't allocate MSI-X affinity masks.
> > > 
> > > [   30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k.
> > > [   30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000.
> > > [   30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors
> > > [   30.816882] scsi host1: qla2xxx
> > > [   30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22
> > > [   30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000.
> > > [   30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors
> > > [   31.367083] scsi host1: qla2xxx
> > > [   31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22
> > > 
> > > I do not know if the driver works since nothing is attached to the FC 
> > > HBA at the moment, but from the error messages it looks like the driver 
> > > fails to load.
> > > 
> > > I booted 4.12 and 4.11 - the red error is not there but the failure 
> > > seems to be the same error -22:
> 
> 4.10.0 works, 4.11.0 errors out with EINVAL and 4.13-rc4 errors out 
> with more verbose MSI messages. So something between 4.10 and 4.11 has 
> broken it.

I cannot reproduce the misbehavior of the older kernels. I checked out 
earlier kernels and recompiled them (old config lost, nothing changed 
AFAIK), everything works up to 4.12 inclusive.

> Also, 4.13-rc4 is broken on another sun4v here (T1000). So it seems to 
> be sun4v interrupt related.

This still holds - 4.13-rc4 has MSI trouble on at least 2 of my sun4v 
machines.

-- 
Meelis Roos (mroos@linux.ee)


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-16 18:39     ` Meelis Roos
@ 2017-08-16 19:02       ` Bjorn Helgaas
  2017-08-17 14:47         ` Meelis Roos
  2017-08-21 18:27         ` David Miller
  2017-08-17 10:09       ` Christoph Hellwig
  1 sibling, 2 replies; 19+ messages in thread
From: Bjorn Helgaas @ 2017-08-16 19:02 UTC (permalink / raw)
  To: Meelis Roos
  Cc: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list,
	Christoph Hellwig

On Wed, Aug 16, 2017 at 09:39:08PM +0300, Meelis Roos wrote:
> > > > I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 
> > > > t5120 server: can't allocate MSI-X affinity masks.
> > > > 
> > > > [   30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k.
> > > > [   30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000.
> > > > [   30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors
> > > > [   30.816882] scsi host1: qla2xxx
> > > > [   30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22
> > > > [   30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000.
> > > > [   30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors
> > > > [   31.367083] scsi host1: qla2xxx
> > > > [   31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22
> > > > 
> > > > I do not know if the driver works since nothing is attached to the FC 
> > > > HBA at the moment, but from the error messages it looks like the driver 
> > > > fails to load.
> > > > 
> > > > I booted 4.12 and 4.11 - the red error is not there but the failure 
> > > > seems to be the same error -22:
> > 
> > 4.10.0 works, 4.11.0 errors out with EINVAL and 4.13-rc4 errors out 
> > with more verbose MSI messages. So something between 4.10 and 4.11 has 
> > broken it.
> 
> I cannot reproduce the misbehavior of the older kernels. I checked out 
> earlier kernels and recompiled them (old config lost, nothing changed 
> AFAIK), everything works up to 4.12 inclusive.
> 
> > Also, 4.13-rc4 is broken on another sun4v here (T1000). So it seems to 
> > be sun4v interrupt related.
> 
> This still holds - 4.13-rc4 has MSI trouble on at least 2 of my sun4v 
> machines.

IIUC, that means v4.12 works and v4.13-rc4 does not, so this is a
regression we introduced this cycle.

If nobody steps up with a theory, bisecting might be the easiest path
forward.


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-16 18:39     ` Meelis Roos
  2017-08-16 19:02       ` Bjorn Helgaas
@ 2017-08-17 10:09       ` Christoph Hellwig
  2017-08-17 10:17         ` Meelis Roos
  1 sibling, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2017-08-17 10:09 UTC (permalink / raw)
  To: Meelis Roos
  Cc: Bjorn Helgaas, sparclinux, linux-pci, qla2xxx-upstream,
	Linux Kernel list, Christoph Hellwig

Just curious:  these are all SMP builds, right?

Just got burnt again by an UP kernel issue in that area that I sent
a patch for (to Jens) a long time ago, but that didn't get fixed.


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-17 10:09       ` Christoph Hellwig
@ 2017-08-17 10:17         ` Meelis Roos
  0 siblings, 0 replies; 19+ messages in thread
From: Meelis Roos @ 2017-08-17 10:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Bjorn Helgaas, sparclinux, linux-pci, qla2xxx-upstream,
	Linux Kernel list

> Just curious:  these are all SMP builds, right?

Yes. 32 threads on that CPU.

I am bisecting it slowly: some steps crash on boot for seemingly 
different reasons, and skipping them does not advance quickly.

-- 
Meelis Roos (mroos@linux.ee)


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-16 19:02       ` Bjorn Helgaas
@ 2017-08-17 14:47         ` Meelis Roos
  2017-08-21 18:27         ` David Miller
  1 sibling, 0 replies; 19+ messages in thread
From: Meelis Roos @ 2017-08-17 14:47 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: sparclinux, linux-pci, qla2xxx-upstream, Linux Kernel list,
	Christoph Hellwig

> On Wed, Aug 16, 2017 at 09:39:08PM +0300, Meelis Roos wrote:
> > > > > I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 
> > > > > t5120 server: can't allocate MSI-X affinity masks.
> > > > > 
> > > > > [   30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k.
> > > > > [   30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000.
> > > > > [   30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors
> > > > > [   30.816882] scsi host1: qla2xxx
> > > > > [   30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22
> > > > > [   30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000.
> > > > > [   30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors
> > > > > [   31.367083] scsi host1: qla2xxx
> > > > > [   31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22
> > > > > 

> IIUC, that means v4.12 works and v4.13-rc4 does not, so this is a
> regression we introduced this cycle.

Yes, I understand the same.

But under some circumstances/configs it has been problematic before too. 
I could not reproduce those circumstances.

> If nobody steps up with a theory, bisecting might be the easiest path
> forward.

I finished bisecting but was not successful. The pattern was strange:
good good skip good skip good skip .... bad bad bad bad bad bad.

The first bad commit was an unrelated Xen merge. Reverting this commit 
does not fix the problem.

It looks as if at some point it got broken by side effects (code size or 
whatever). The skips were mostly because of repeated problems in sparc 
cpuidle code, and initially some in iommu related code. This might skew 
the results, since some commits were not tested.

git bisect start
# good: [6f7da290413ba713f0cdd9ff1a2a9bb129ef4f6c] Linux 4.12
git bisect good 6f7da290413ba713f0cdd9ff1a2a9bb129ef4f6c
# bad: [aae4e7a8bc44722fe70d58920a36916b1043195e] Linux 4.13-rc4
git bisect bad aae4e7a8bc44722fe70d58920a36916b1043195e
# good: [920f2ecdf6c3b3526f60fbd38c68597953cad3ee] Merge tag 'sound-4.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good 920f2ecdf6c3b3526f60fbd38c68597953cad3ee
# skip: [af3c8d98508d37541d4bf57f13a984a7f73a328c] Merge tag 'drm-for-v4.13' of git://people.freedesktop.org/~airlied/linux
git bisect skip af3c8d98508d37541d4bf57f13a984a7f73a328c
# good: [d29cb3e45e923715f74d8a08d5c1ea996dce5a59] xfs: make _bmap_count_blocks consistent wrt delalloc extent behavior
git bisect good d29cb3e45e923715f74d8a08d5c1ea996dce5a59
# good: [fa6d095eb23a8b1aae78d221879032497f6e457f] drm/tegra: Add driver documentation
git bisect good fa6d095eb23a8b1aae78d221879032497f6e457f
# good: [37e51a7640c275999ea0c35410c42e6d896ff7fa] mm: clean up error handling in write_one_page
git bisect good 37e51a7640c275999ea0c35410c42e6d896ff7fa
# good: [4b9cdd96e7ea3dc2cd0edac67835f6f38c4f14c9] drm/omap: remove CLUT
git bisect good 4b9cdd96e7ea3dc2cd0edac67835f6f38c4f14c9
# good: [7f56c30bd0a232822aca38d288da475613bdff9b] vfio: Remove unnecessary uses of vfio_container.group_lock
git bisect good 7f56c30bd0a232822aca38d288da475613bdff9b
# good: [d7631e30434e7fcf025dd2a7cba879f203f7849b] switch compat_drm_getsareactx() to drm_ioctl_kernel()
git bisect good d7631e30434e7fcf025dd2a7cba879f203f7849b
# skip: [f991af3daabaecff34684fd51fac80319d1baad1] mqueue: fix a use-after-free in sys_mq_notify()
git bisect skip f991af3daabaecff34684fd51fac80319d1baad1
# good: [ecbb903c56745d59c301db26dd7d8b74b520eb84] NFS: Be more careful about mapping file permissions
git bisect good ecbb903c56745d59c301db26dd7d8b74b520eb84
# skip: [b49defe83659cefbb1763d541e779da32594ab10] kvm: avoid unused variable warning for UP builds
git bisect skip b49defe83659cefbb1763d541e779da32594ab10
# good: [b5ab16bf64347ebc9dbdc51a4f603511babda1e6] drm/amdgpu: properly byteswap gpu_info firmware
git bisect good b5ab16bf64347ebc9dbdc51a4f603511babda1e6
# good: [3941dae15ed90437396389e8bb7d2d5b3e63ba4a] drm_dp_aux_dev: switch to read_iter/write_iter
git bisect good 3941dae15ed90437396389e8bb7d2d5b3e63ba4a
# good: [f0d9c8924e2c33764dca0c3a4f693a345ecf6579] [media] media: imx: Add IC subdev drivers
git bisect good f0d9c8924e2c33764dca0c3a4f693a345ecf6579
# skip: [101dd590a7fa37954540cf3149a1c502c0acc524] powerpc/perf: Avoid spurious PMU interrupts after idle
git bisect skip 101dd590a7fa37954540cf3149a1c502c0acc524
# good: [eb0f0373e575822cf35949627b92533c7c41629c] drm/amdgpu: fix a typo in comment
git bisect good eb0f0373e575822cf35949627b92533c7c41629c
# skip: [3f0bd8dad0db73f5d71b355aec5ab33b374260ba] powerpc/perf: Add POWER9 alternate PM_RUN_CYC and PM_RUN_INST_CMPL events
git bisect skip 3f0bd8dad0db73f5d71b355aec5ab33b374260ba
# good: [96edd61dcf44362d3ef0bed1a5361e0ac7886a63] xen/balloon: don't online new memory initially
git bisect good 96edd61dcf44362d3ef0bed1a5361e0ac7886a63
# bad: [bc78d646e708dabd1744ca98744dea316f459497] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
git bisect bad bc78d646e708dabd1744ca98744dea316f459497
# bad: [0a2a1330d2621c7f963d9f55bb094811cc1c06b9] Merge branch 'for-4.13-part3' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
git bisect bad 0a2a1330d2621c7f963d9f55bb094811cc1c06b9
# bad: [0ce2f385119344dc620ec635e355008a9d6f8401] Merge tag 'acpi-4.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect bad 0ce2f385119344dc620ec635e355008a9d6f8401
# bad: [da08f35b0f82b0a7a79f518faf8d0c0b477f91bc] Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
git bisect bad da08f35b0f82b0a7a79f518faf8d0c0b477f91bc
# bad: [25f6a53799d667283d3bee29a6ac75ae3dae38dc] Merge tag 'jfs-4.13' of git://github.com/kleikamp/linux-shaggy
git bisect bad 25f6a53799d667283d3bee29a6ac75ae3dae38dc
# bad: [eeb7c41d9d7c0902accb1d481fe78d84d30c69cc] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
git bisect bad eeb7c41d9d7c0902accb1d481fe78d84d30c69cc
# bad: [520eccdfe187591a51ea9ab4c1a024ae4d0f68d9] Linux 4.13-rc2
git bisect bad 520eccdfe187591a51ea9ab4c1a024ae4d0f68d9
# bad: [f47e07bc5f1a5c48ed60a8ee55352cb4b2bf4d51] Fix up MAINTAINERS file problems
git bisect bad f47e07bc5f1a5c48ed60a8ee55352cb4b2bf4d51
# bad: [a56e88ec05df50110f2bf578b6e17128f37111ed] Merge tag 'for-linus-4.13b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
git bisect bad a56e88ec05df50110f2bf578b6e17128f37111ed
# first bad commit: [a56e88ec05df50110f2bf578b6e17128f37111ed] Merge tag 'for-linus-4.13b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip



-- 
Meelis Roos (mroos@linux.ee)


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-16 19:02       ` Bjorn Helgaas
  2017-08-17 14:47         ` Meelis Roos
@ 2017-08-21 18:27         ` David Miller
  2017-08-21 18:34           ` Christoph Hellwig
  1 sibling, 1 reply; 19+ messages in thread
From: David Miller @ 2017-08-21 18:27 UTC (permalink / raw)
  To: helgaas; +Cc: mroos, sparclinux, linux-pci, qla2xxx-upstream, linux-kernel, hch

From: Bjorn Helgaas <helgaas@kernel.org>
Date: Wed, 16 Aug 2017 14:02:41 -0500

> On Wed, Aug 16, 2017 at 09:39:08PM +0300, Meelis Roos wrote:
>> > > > I noticed that in 4.13.0-rc4 there is a new error in dmesg on my sparc64 
>> > > > t5120 server: can't allocate MSI-X affinity masks.
>> > > > 
>> > > > [   30.274284] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k.
>> > > > [   30.274648] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000.
>> > > > [   30.275447] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors
>> > > > [   30.816882] scsi host1: qla2xxx
>> > > > [   30.877294] qla2xxx: probe of 0000:10:00.0 failed with error -22
>> > > > [   30.877578] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000.
>> > > > [   30.878387] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors
>> > > > [   31.367083] scsi host1: qla2xxx
>> > > > [   31.427500] qla2xxx: probe of 0000:10:00.1 failed with error -22
>> > > > 
>> > > > I do not know if the driver works since nothing is attached to the FC 
>> > > > HBA at the moment, but from the error messages it looks like the driver 
>> > > > fails to load.
>> > > > 
>> > > > I booted 4.12 and 4.11 - the red error is not there but the failure 
>> > > > seems to be the same error -22:
>> > 
>> > 4.10.0 works, 4.11.0 errors out with EINVAL and 4.13-rc4 errors out 
>> > with more verbose MSI messages. So something between 4.10 and 4.11 has 
>> > broken it.
>> 
>> I cannot reproduce the misbehavior of the older kernels. I checked out 
>> earlier kernels and recompiled them (old config lost, nothing changed 
>> AFAIK), everything works up to 4.12 inclusive.
>> 
>> > Also, 4.13-rc4 is broken on another sun4v here (T1000). So it seems to 
>> > be sun4v interrupt related.
>> 
>> This still holds - 4.13-rc4 has MSI trouble on at least 2 of my sun4v 
>> machines.
> 
> IIUC, that means v4.12 works and v4.13-rc4 does not, so this is a
> regression we introduced this cycle.
> 
> If nobody steps up with a theory, bisecting might be the easiest path
> forward.

I suspect the test added by:

commit 6f9a22bc5775d231ab8fbe2c2f3c88e45e3e7c28
Author: Michael Hernandez <michael.hernandez@cavium.com>
Date:   Thu May 18 10:47:47 2017 -0700

    PCI/MSI: Ignore affinity if pre/post vector count is more than min_vecs

is triggering.

The rest of the failure cases are memory allocation failures which should
not be happening here.

There have only been 5 commits to kernel/irq/affinity.c since v4.10.

I suppose we have been getting away with something that has silently
been allowed in the past, or something like that.

Meelis, can you run with the following debugging patch?

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index d69bd77252a7..d16c6326000a 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -110,6 +110,9 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	struct cpumask *masks;
 	cpumask_var_t nmsk, *node_to_present_cpumask;
 
+	pr_info("irq_create_affinity_masks: nvecs[%d] affd->pre_vectors[%d] "
+		"affd->post_vectors[%d]\n",
+		nvecs, affd->pre_vectors, affd->post_vectors);
 	/*
 	 * If there aren't any vectors left after applying the pre/post
 	 * vectors don't bother with assigning affinity.


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-21 18:27         ` David Miller
@ 2017-08-21 18:34           ` Christoph Hellwig
  2017-08-21 19:20             ` mroos
  0 siblings, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2017-08-21 18:34 UTC (permalink / raw)
  To: David Miller
  Cc: helgaas, mroos, sparclinux, linux-pci, qla2xxx-upstream,
	linux-kernel, hch

I think with this patch from -rc6 the symptoms should be cured:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c005390374957baacbc38eef96ea360559510aa7

if that theory is right.


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-21 18:34           ` Christoph Hellwig
@ 2017-08-21 19:20             ` mroos
  2017-08-21 20:35               ` David Miller
  0 siblings, 1 reply; 19+ messages in thread
From: mroos @ 2017-08-21 19:20 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: David Miller, helgaas, sparclinux, linux-pci, qla2xxx-upstream,
	linux-kernel

> I think with this patch from -rc6 the symptoms should be cured:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c005390374957baacbc38eef96ea360559510aa7
> 
> if that theory is right.

The result with 4.13-rc6 is positive but mixed: the messages about MSI-X 
affinity masks are still there, but the rest of the detection works and the 
driver is loaded successfully:

[   29.924282] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.00-k.
[   29.924710] qla2xxx [0000:10:00.0]-001d: : Found an ISP2432 irq 21 iobase 0x000000c100d00000.
[   29.925581] qla2xxx 0000:10:00.0: can't allocate MSI-X affinity masks for 2 vectors
[   30.483422] scsi host1: qla2xxx
[   35.495031] qla2xxx [0000:10:00.0]-00fb:1: QLogic QLE2462 - SG-(X)PCIE2FC-QF4, Sun StorageTek 4 Gb FC Enterprise PCI-Express Dual Channel H.
[   35.495274] qla2xxx [0000:10:00.0]-00fc:1: ISP2432: PCIe (2.5GT/s x4) @ 0000:10:00.0 hdma- host#=1 fw=7.03.00 (9496).
[   35.495615] qla2xxx [0000:10:00.1]-001d: : Found an ISP2432 irq 22 iobase 0x000000c100d04000.
[   35.496409] qla2xxx 0000:10:00.1: can't allocate MSI-X affinity masks for 2 vectors
[   35.985355] scsi host2: qla2xxx
[   40.996991] qla2xxx [0000:10:00.1]-00fb:2: QLogic QLE2462 - SG-(X)PCIE2FC-QF4, Sun StorageTek 4 Gb FC Enterprise PCI-Express Dual Channel H.
[   40.997251] qla2xxx [0000:10:00.1]-00fc:2: ISP2432: PCIe (2.5GT/s x4) @ 0000:10:00.1 hdma- host#=2 fw=7.03.00 (9496).
[   51.880945] qla2xxx [0000:10:00.0]-8038:1: Cable is unplugged...
[   57.402900] qla2xxx [0000:10:00.1]-8038:2: Cable is unplugged...

With Dave Miller's patch on top of 4.13-rc6, I see the following before 
both MSI-X messages:

irq_create_affinity_masks: nvecs[2] affd->pre_vectors[2] affd->post_vectors[0]
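
Read against the comment visible in the patch context ("If there aren't
any vectors left after applying the pre/post vectors don't bother with
assigning affinity"), this line would mean that nothing is left to
spread: 2 vectors requested, 2 of them reserved as pre_vectors. A
minimal userspace sketch of that arithmetic, assuming the check is a
plain subtraction; struct irq_affinity_model and spreadable_vectors()
are made-up names for illustration, not kernel API:

```c
/* Userspace model only; field names follow the debug output above. */
struct irq_affinity_model {
	int pre_vectors;	/* vectors reserved before the spread set */
	int post_vectors;	/* vectors reserved after the spread set */
};

/* Number of vectors left over for affinity spreading. */
int spreadable_vectors(int nvecs, const struct irq_affinity_model *affd)
{
	return nvecs - affd->pre_vectors - affd->post_vectors;
}
```

With nvecs = 2, pre_vectors = 2 and post_vectors = 0 the result is 0,
so a NULL return here would not imply a memory allocation failure.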

-- 
Meelis Roos (mroos@linux.ee)


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-21 19:20             ` mroos
@ 2017-08-21 20:35               ` David Miller
  2017-08-22  5:02                 ` Meelis Roos
  2017-08-22  6:35                 ` Christoph Hellwig
  0 siblings, 2 replies; 19+ messages in thread
From: David Miller @ 2017-08-21 20:35 UTC (permalink / raw)
  To: mroos; +Cc: hch, helgaas, sparclinux, linux-pci, qla2xxx-upstream,
	linux-kernel

From: mroos@linux.ee
Date: Mon, 21 Aug 2017 22:20:22 +0300 (EEST)

>> I think with this patch from -rc6 the symptoms should be cured:
>> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c005390374957baacbc38eef96ea360559510aa7
>> 
>> if that theory is right.
> 
> The result with 4.13-rc6 is positive but mixed: the messages about MSI-X 
> affinity masks are still there, but the rest of the detection works and the 
> driver is loaded successfully:

Is this an SMP system?

I ask because the commit log message indicates that this failure is
not expected to ever happen on SMP.

We really need to root cause this.


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-21 20:35               ` David Miller
@ 2017-08-22  5:02                 ` Meelis Roos
  2017-08-22  6:35                 ` Christoph Hellwig
  1 sibling, 0 replies; 19+ messages in thread
From: Meelis Roos @ 2017-08-22  5:02 UTC (permalink / raw)
  To: David Miller
  Cc: hch, helgaas, sparclinux, linux-pci, qla2xxx-upstream,
	linux-kernel

> 
> >> I think with this patch from -rc6 the symptoms should be cured:
> >> 
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c005390374957baacbc38eef96ea360559510aa7
> >> 
> >> if that theory is right.
> > 
> > The result with 4.13-rc6 is positive but mixed: the messages about MSI-X 
> > affinity masks are still there, but the rest of the detection works and the 
> > driver is loaded successfully:
> 
> Is this an SMP system?

Yes, T5120.

-- 
Meelis Roos (mroos@linux.ee)


* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-21 20:35               ` David Miller
  2017-08-22  5:02                 ` Meelis Roos
@ 2017-08-22  6:35                 ` Christoph Hellwig
  2017-08-22 16:31                   ` David Miller
  1 sibling, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2017-08-22  6:35 UTC (permalink / raw)
  To: David Miller
  Cc: mroos, hch, helgaas, sparclinux, linux-pci, qla2xxx-upstream,
	linux-kernel

On Mon, Aug 21, 2017 at 01:35:49PM -0700, David Miller wrote:
> I ask because the commit log message indicates that this failure is
> not expected to ever happen on SMP.

I fear my commit message (but not the code) might be wrong.
irq_create_affinity_masks can return NULL any time we don't have any
affinity masks.  I've already had a discussion about this elsewhere
with Bjorn, and I suspect we need to kill the warning or move it
to irq_create_affinity_masks only for genuine failure cases.
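
To illustrate the distinction drawn above between "nothing to spread" and a
genuine failure, here is a minimal user-space sketch. It is not the kernel
code: build_masks(), should_warn() and the spread_status codes are
hypothetical names invented for this example, and the only assumption taken
from the thread is that a NULL result can have more than one cause.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status codes; not part of any kernel API. */
enum spread_status {
	SPREAD_OK,	/* masks were built */
	SPREAD_NOTHING,	/* no vectors to spread: not an error */
	SPREAD_NOMEM	/* allocation failed: a genuine failure */
};

/*
 * Toy stand-in for an affinity-mask builder.  'spreadable' is the number
 * of vectors that remain to be spread across CPUs.  NULL is returned both
 * when there is nothing to do and when allocation fails; 'status' tells
 * the two cases apart.
 */
static unsigned long *build_masks(int spreadable, enum spread_status *status)
{
	unsigned long *masks;

	if (spreadable <= 0) {
		*status = SPREAD_NOTHING;
		return NULL;
	}

	masks = calloc((size_t)spreadable, sizeof(*masks));
	*status = masks ? SPREAD_OK : SPREAD_NOMEM;
	return masks;
}

/* Returns 1 only when a warning would be warranted. */
static int should_warn(enum spread_status status)
{
	return status == SPREAD_NOMEM;
}
```

With this shape a caller would stay silent for SPREAD_NOTHING and warn only
for SPREAD_NOMEM, which is one way to limit the message to genuine failure
cases.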

> 
> We really need to root cause this.
---end quoted text---

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-22  6:35                 ` Christoph Hellwig
@ 2017-08-22 16:31                   ` David Miller
  2017-08-22 16:33                     ` Meelis Roos
  2017-08-22 16:39                     ` Christoph Hellwig
  0 siblings, 2 replies; 19+ messages in thread
From: David Miller @ 2017-08-22 16:31 UTC (permalink / raw)
  To: hch; +Cc: mroos, helgaas, sparclinux, linux-pci, qla2xxx-upstream,
	linux-kernel

From: Christoph Hellwig <hch@lst.de>
Date: Tue, 22 Aug 2017 08:35:05 +0200

> On Mon, Aug 21, 2017 at 01:35:49PM -0700, David Miller wrote:
>> I ask because the commit log message indicates that this failure is
>> not expected to ever happen on SMP.
> 
> I fear my commit message (but not the code) might be wrong.
> irq_create_affinity_masks can return NULL any time we don't have any
> affinity masks.  I've already had a discussion about this elsewhere
> with Bjorn, and I suspect we need to kill the warning or move it
> to irq_create_affinity_masks only for genuine failure cases.

This is a rather large machine with 64 or more cpus and several NUMA
nodes.  Why wouldn't there be any affinity masks available?

That's why I want to root cause this.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-22 16:31                   ` David Miller
@ 2017-08-22 16:33                     ` Meelis Roos
  2017-08-22 16:45                       ` David Miller
  2017-08-22 16:39                     ` Christoph Hellwig
  1 sibling, 1 reply; 19+ messages in thread
From: Meelis Roos @ 2017-08-22 16:33 UTC (permalink / raw)
  To: David Miller
  Cc: hch, helgaas, sparclinux, linux-pci, qla2xxx-upstream,
	Linux Kernel list

> > On Mon, Aug 21, 2017 at 01:35:49PM -0700, David Miller wrote:
> >> I ask because the commit log message indicates that this failure is
> >> not expected to ever happen on SMP.
> > 
> > I fear my commit message (but not the code) might be wrong.
> > irq_create_affinity_masks can return NULL any time we don't have any
> > affinity masks.  I've already had a discussion about this elsewhere
> > with Bjorn, and I suspect we need to kill the warning or move it
> > to irq_create_affinity_masks only for genuine failure cases.
> 
> This is a rather large machine with 64 or more cpus and several NUMA
> nodes.  Why wouldn't there be any affinity masks available?

T5120 with 1 slot and 32 threads total. I have not configured any NUMA on 
it; is there any reason for that?

-- 
Meelis Roos (mroos@linux.ee)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-22 16:31                   ` David Miller
  2017-08-22 16:33                     ` Meelis Roos
@ 2017-08-22 16:39                     ` Christoph Hellwig
  2017-08-22 16:52                       ` David Miller
  1 sibling, 1 reply; 19+ messages in thread
From: Christoph Hellwig @ 2017-08-22 16:39 UTC (permalink / raw)
  To: David Miller
  Cc: hch, mroos, helgaas, sparclinux, linux-pci, qla2xxx-upstream,
	linux-kernel

On Tue, Aug 22, 2017 at 09:31:39AM -0700, David Miller wrote:
> > I fear my commit message (but not the code) might be wrong.
> > irq_create_affinity_masks can return NULL any time we don't have any
> > affinity masks.  I've already had a discussion about this elsewhere
> > with Bjorn, and I suspect we need to kill the warning or move it
> > to irq_create_affinity_masks only for genuine failure cases.
> 
> This is a rather large machine with 64 or more cpus and several NUMA
> nodes.  Why wouldn't there be any affinity masks available?

The drivers only asked for two MSI-X vectors, and marked both of them
as pre-vectors that should not be spread.  So there is no actual
vector left that we want to actually spread.
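
The arithmetic behind this can be sketched as follows. This is a simplified
user-space model, not the kernel implementation: struct vec_request and
spreadable_vectors() are hypothetical names, and the field names are chosen
only to match the "pre-vectors" wording used above.

```c
/* Hypothetical model of a driver's vector request. */
struct vec_request {
	int nvecs;		/* total MSI-X vectors requested */
	int pre_vectors;	/* leading vectors excluded from spreading */
	int post_vectors;	/* trailing vectors excluded from spreading */
};

/* Number of vectors that are left to be spread across CPUs. */
static int spreadable_vectors(const struct vec_request *req)
{
	return req->nvecs - req->pre_vectors - req->post_vectors;
}
```

For the case in this thread, a request with nvecs = 2 and pre_vectors = 2
leaves 2 - 2 - 0 = 0 vectors to spread, so no affinity masks are needed
regardless of how many CPUs the machine has.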

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-22 16:33                     ` Meelis Roos
@ 2017-08-22 16:45                       ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2017-08-22 16:45 UTC (permalink / raw)
  To: mroos; +Cc: hch, helgaas, sparclinux, linux-pci, qla2xxx-upstream,
	linux-kernel

From: Meelis Roos <mroos@linux.ee>
Date: Tue, 22 Aug 2017 19:33:55 +0300 (EEST)

>> > On Mon, Aug 21, 2017 at 01:35:49PM -0700, David Miller wrote:
>> >> I ask because the commit log message indicates that this failure is
>> >> not expected to ever happen on SMP.
>> > 
>> > I fear my commit message (but not the code) might be wrong.
>> > irq_create_affinity_masks can return NULL any time we don't have any
>> > affinity masks.  I've already had a discussion about this elsewhere
>> > with Bjorn, and I suspect we need to kill the warning or move it
>> > to irq_create_affinity_masks only for genuine failure cases.
>> 
>> This is a rather large machine with 64 or more cpus and several NUMA
>> nodes.  Why wouldn't there be any affinity masks available?
> 
> T5120 with 1 slot and 32 threads total. I have not configured any NUMA on 
> it; is there any reason for that?

Ok 32 cpus and 1 NUMA node, my bad :-)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors
  2017-08-22 16:39                     ` Christoph Hellwig
@ 2017-08-22 16:52                       ` David Miller
  0 siblings, 0 replies; 19+ messages in thread
From: David Miller @ 2017-08-22 16:52 UTC (permalink / raw)
  To: hch; +Cc: mroos, helgaas, sparclinux, linux-pci, qla2xxx-upstream,
	linux-kernel

From: Christoph Hellwig <hch@lst.de>
Date: Tue, 22 Aug 2017 18:39:16 +0200

> On Tue, Aug 22, 2017 at 09:31:39AM -0700, David Miller wrote:
>> > I fear my commit message (but not the code) might be wrong.
>> > irq_create_affinity_masks can return NULL any time we don't have any
>> > affinity masks.  I've already had a discussion about this elsewhere
>> > with Bjorn, and I suspect we need to kill the warning or move it
>> > to irq_create_affinity_masks only for genuine failure cases.
>> 
>> This is a rather large machine with 64 or more cpus and several NUMA
>> nodes.  Why wouldn't there be any affinity masks available?
> 
> The drivers only asked for two MSI-X vectors, and marked both of them
> as pre-vectors that should not be spread.  So there is no actual
> vector left that we want to actually spread.

Ok, now it makes more sense, and yes the warning should be removed.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-08-22 16:52 UTC | newest]

Thread overview: 19+ messages
2017-08-15 14:54 4.13.0-rc4 sparc64: can't allocate MSI-X affinity masks for 2 vectors Meelis Roos
2017-08-15 18:44 ` Bjorn Helgaas
2017-08-15 20:24   ` Meelis Roos
2017-08-16 18:39     ` Meelis Roos
2017-08-16 19:02       ` Bjorn Helgaas
2017-08-17 14:47         ` Meelis Roos
2017-08-21 18:27         ` David Miller
2017-08-21 18:34           ` Christoph Hellwig
2017-08-21 19:20             ` mroos
2017-08-21 20:35               ` David Miller
2017-08-22  5:02                 ` Meelis Roos
2017-08-22  6:35                 ` Christoph Hellwig
2017-08-22 16:31                   ` David Miller
2017-08-22 16:33                     ` Meelis Roos
2017-08-22 16:45                       ` David Miller
2017-08-22 16:39                     ` Christoph Hellwig
2017-08-22 16:52                       ` David Miller
2017-08-17 10:09       ` Christoph Hellwig
2017-08-17 10:17         ` Meelis Roos
