From: Lyude Paul <lyude@redhat.com>
To: "Ghannam, Yazen" <Yazen.Ghannam@amd.com>,
Thomas Gleixner <tglx@linutronix.de>
Cc: "hpa@zytor.com" <hpa@zytor.com>,
"keith.busch@intel.com" <keith.busch@intel.com>,
"mingo@kernel.org" <mingo@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Borislav Petkov <bp@alien8.de>
Subject: Re: "irq/matrix: Spread interrupts on allocation" breaks nouveau in mainline kernel
Date: Wed, 24 Jan 2018 15:02:01 -0500
Message-ID: <1516824121.4109.28.camel@redhat.com>
In-Reply-To: <1516823810.4109.26.camel@redhat.com>
On Wed, 2018-01-24 at 14:56 -0500, Lyude Paul wrote:
> On Wed, 2018-01-24 at 19:13 +0000, Ghannam, Yazen wrote:
> > > -----Original Message-----
> > > From: linux-kernel-owner@vger.kernel.org
> > > [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Lyude Paul
> > > Sent: Wednesday, January 24, 2018 12:49 PM
> > > To: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: hpa@zytor.com; keith.busch@intel.com; mingo@kernel.org;
> > > linux-kernel@vger.kernel.org
> > > Subject: Re: "irq/matrix: Spread interrupts on allocation" breaks nouveau
> > > in mainline kernel
> > >
> > > Hi, please ignore the warning: it happens both before and after the
> > > regressing commit (I didn't actually mean to include it in the log I gave
> > > here, whoops). As for how I determined nouveau is getting assigned the
> > > same IRQ vector as another device, I checked using /sys/kernel/debug/irq.
> > > Additionally, when nouveau does initialize properly after resume (e.g.
> > > after reverting this patch) I see it get assigned a separate vector from
> > > the other devices.
> > >
> >
> > +Boris. This thread seems to have split.
> >
> > Lyude,
> > Does the warning show on mainline or does it only show when bisecting?
> >
> > Sorry, I'm not sure what you mean by "it happens before and after the
> > regressing commit".
>
> Sorry about that! Let me clarify a little bit: this is a problem that shows
> up on mainline. Normally when we suspend the GPU in nouveau, we free the IRQs
> it's using before going into suspend
> (drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c:88), then reserve IRQs again
> on resume (drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c:134). Since this
> patch got pushed to mainline, the IRQ we get from request_irq() ends up
> having the same MSI vector as another device on the system:
>
> Before suspend, nouveau's IRQ allocation:
>
> handler: handle_edge_irq
> device: 0000:22:00.0
> status: 0x00000000
> istate: 0x00000000
> ddepth: 0
> wdepth: 0
> dstate: 0x01400200
> IRQD_ACTIVATED
> IRQD_IRQ_STARTED
> IRQD_SINGLE_TARGET
> node: 0
> affinity: 0-7
> effectiv: 1
> pending:
> domain: PCI-MSI-2
> hwirq: 0x1100000
> chip: PCI-MSI
> flags: 0x10
> IRQCHIP_SKIP_SET_WAKE
> parent:
> domain: VECTOR
> hwirq: 0x2f
> chip: APIC
> flags: 0x0
> Vector: 35
> Target: 1
>
> After resume and allocating the interrupt for nouveau again, we get a
> message from the kernel saying:
>
> [ 217.150787] do_IRQ: 1.35 No irq handler for vector
>
> In addition, nouveau ends up getting no interrupts from the card and as a
> result fails to come back up:
>
> [ 219.153049] nouveau 0000:22:00.0: DRM: EVO timeout
> [ 220.226254] r8169 0000:1e:00.0 enp30s0: link up
> [ 221.153054] nouveau 0000:22:00.0: DRM: base-0: timeout
> [ 223.153528] nouveau 0000:22:00.0: DRM: base-0: timeout
>
> If we look through all of the other IRQ allocations, we'll find that two
> devices now have the MSI vector 35:
>
> nouveau:
> handler: handle_edge_irq
> device: 0000:22:00.0
> status: 0x00000000
> istate: 0x00000000
> ddepth: 0
> wdepth: 0
> dstate: 0x01400200
> IRQD_ACTIVATED
> IRQD_IRQ_STARTED
> IRQD_SINGLE_TARGET
> node: 0
> affinity: 0-7
> effectiv: 1
> pending:
> domain: PCI-MSI-2
> hwirq: 0x1100000
> chip: PCI-MSI
> flags: 0x10
> IRQCHIP_SKIP_SET_WAKE
> parent:
> domain: VECTOR
> hwirq: 0x2f
> chip: APIC
> flags: 0x0
> Vector: 35
> Target: 1
>
> and the PCI bridge (00:01.3 PCI bridge: Advanced Micro Devices, Inc. [AMD]
> Family 17h (Models 00h-0fh) PCIe GPP Bridge):
>
> handler: handle_edge_irq
> device: 0000:00:01.3
> status: 0x00000000
> istate: 0x00000000
> ddepth: 0
> wdepth: 0
> dstate: 0x03400200
> IRQD_ACTIVATED
> IRQD_IRQ_STARTED
> IRQD_SINGLE_TARGET
> node: 0
> affinity: 0-7
> effectiv: 0
> pending:
> domain: PCI-MSI-2
> hwirq: 0x5800
> chip: PCI-MSI
> flags: 0x10
> IRQCHIP_SKIP_SET_WAKE
> parent:
> domain: VECTOR
> hwirq: 0x19
> chip: APIC
> flags: 0x0
> Vector: 35
> Target: 0
>
> Hope this helps clarify. I will keep looking at this from my end as well.
> >
Almost forgot to mention: I came across this patch because reverting it
locally on the mainline kernel makes request_irq() behave normally (it no
longer attempts to allocate the same vector twice) and nouveau starts doing
suspend/resume correctly again.
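
For anyone who wants to repeat the comparison of the debugfs dumps quoted
above without reading them by eye, here is a rough sketch in Python. It is
only an illustration: the function names (parse_irq_dump, find_shared_vectors)
are made up for this example, and it assumes each dump is plain text in the
"key: value" layout shown above, already read from under
/sys/kernel/debug/irq by the caller. It groups entries by the "Vector:" field
and keeps the "Target:" CPU next to each device so both can be compared.

```python
import re
from collections import defaultdict

# Matches "key: value" lines such as "device: 0000:22:00.0" or "Vector: 35".
FIELD_RE = re.compile(r"^(\w+):\s*(.*)$")


def parse_irq_dump(text):
    """Pull device, vector and target CPU out of one debugfs IRQ dump.

    Leading "> " quote markers are tolerated so that dumps pasted from an
    email can be fed in unchanged. Missing fields come back as None.
    """
    wanted = {"device", "vector", "target"}
    fields = {}
    for raw in text.splitlines():
        line = raw.lstrip("> \t").rstrip()
        match = FIELD_RE.match(line)
        if match and match.group(1).lower() in wanted:
            fields[match.group(1).lower()] = match.group(2)
    return {
        "device": fields.get("device"),
        "vector": int(fields["vector"]) if "vector" in fields else None,
        "target": int(fields["target"]) if "target" in fields else None,
    }


def find_shared_vectors(dumps):
    """Return {vector: [(device, target_cpu), ...]} for vectors seen twice or more."""
    by_vector = defaultdict(list)
    for text in dumps:
        entry = parse_irq_dump(text)
        if entry["vector"] is not None:
            by_vector[entry["vector"]].append((entry["device"], entry["target"]))
    return {vec: devs for vec, devs in by_vector.items() if len(devs) > 1}


# Abbreviated copies of the two dumps quoted in this message.
NOUVEAU_DUMP = """\
> handler: handle_edge_irq
> device: 0000:22:00.0
> domain: VECTOR
> hwirq: 0x2f
> Vector: 35
> Target: 1
"""

BRIDGE_DUMP = """\
> handler: handle_edge_irq
> device: 0000:00:01.3
> domain: VECTOR
> hwirq: 0x19
> Vector: 35
> Target: 0
"""

if __name__ == "__main__":
    for vector, devices in find_shared_vectors([NOUVEAU_DUMP, BRIDGE_DUMP]).items():
        for device, target in devices:
            print(f"vector {vector}: device {device} on CPU {target}")
```

The sketch only compares what the dumps report; it does not say anything
about how the vector allocator arrived at those values.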
> >
> > Boris,
> > In any case, I like your idea on saving the block addresses. I can look
> > into this.
> >
> > Thanks,
> > Yazen
--
Cheers,
Lyude Paul