* eeh-powernv.c: Unbalanced IRQ warning
From: Daniel Axtens @ 2015-07-27 7:37 UTC
To: Alistair Popple; +Cc: linuxppc-dev

Hi Alistair,

I've just rebased some CAPI patches on top of 4.2-rc4 and I'm getting a
new WARN relating to IRQs in EEH, which I believe is related to your
patch 79231448c929 ("powernv/eeh: Update the EEH code to use the opal
irq domain").

This is what I see after injecting a PHB fence on a CAPI card.

[ 126.022390] EEH: Notify device driver to resume
[ 126.022421] Unbalanced enable for IRQ 17
[ 126.022432] ------------[ cut here ]------------
[ 126.022440] WARNING: at /scratch/dja/linux-capi/kernel/irq/manage.c:511
[ 126.022451] Modules linked in: cxl
[ 126.022465] CPU: 3 PID: 123 Comm: eehd Not tainted 4.2.0-rc4-00013-g86caa74-dirty #86
[ 126.022479] task: c000000751b0af50 ti: c000000751b94000 task.ti: c000000751b94000
[ 126.022493] NIP: c0000000000f1760 LR: c0000000000f175c CTR: c0000000006000c0
[ 126.022509] REGS: c000000751b97710 TRAP: 0700 Not tainted (4.2.0-rc4-00013-g86caa74-dirty)
[ 126.022522] MSR: 9000000100029032 <SF,HV,EE,ME,IR,DR,RI> CR: 22008022 XER: 20000000
[ 126.022560] CFAR: c0000000008a8680 SOFTE: 0
GPR00: c0000000000f175c c000000751b97990 c000000000e80c00 000000000000001c
GPR04: 0000000000000000 000000000000002c 00000000000000ff 000000000000001f
GPR08: c000000000d86cc0 c000000000d86cb8 c000000000d86cc0 0000000000000000
GPR12: 0000000042008028 c00000000fdc0d80 c0000000000bb460 c000000758162580
GPR16: 0000000000000000 0000000000000000 c00000074d3a1000 c000000000b35240
GPR20: c000000000b35210 c000000000b35278 c000000000b352e8 c000000000b2e2a8
GPR24: c0000000008d35b8 c0000000008d3510 c000000000efa408 c000000751b97c10
GPR28: 0000000000000000 c000000000d7a330 0000000000000011 c000000751eaec00
[ 126.022735] NIP [c0000000000f1760] .__enable_irq+0x30/0xd0
[ 126.022747] LR [c0000000000f175c] .__enable_irq+0x2c/0xd0
[ 126.022756] Call Trace:
[ 126.022764] [c000000751b97990] [c0000000000f175c] .__enable_irq+0x2c/0xd0 (unreliable)
[ 126.022780] [c000000751b97a20] [c0000000000f1848] .enable_irq+0x48/0x90
[ 126.022796] [c000000751b97ab0] [c00000000006ab00] .pnv_eeh_next_error+0x1f0/0x6f0
[ 126.022812] [c000000751b97ba0] [c000000000035908] .eeh_handle_event+0xb8/0x2f0
[ 126.022827] [c000000751b97c70] [c000000000035cf8] .eeh_event_handler+0x1b8/0x1c0
[ 126.022844] [c000000751b97d30] [c0000000000bb564] .kthread+0x104/0x130
[ 126.022860] [c000000751b97e30] [c0000000000095a4] .ret_from_kernel_thread+0x58/0xb4
[ 126.022874] Instruction dump:
[ 126.022882] 7c0802a6 fbe1fff8 7c7f1b78 f8010010 f821ff71 81230170 2f890000 409e0034
[ 126.022915] 3c62ffcd 3863a730 487b6ec9 60000000 <0fe00000> 38210090 e8010010 ebe1fff8
[ 126.022935] ---[ end trace 26e6323a0534e98d ]---

manage.c:511 suggests that this is probably the result of the IRQ being
enabled when it's already enabled.

Do you know what might be causing this and how it might be fixed?
Thanks in advance!

--
Regards,
Daniel

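Editorial sketch of why the warning fires. The kernel's generic IRQ code keeps a disable depth per interrupt, so disable and enable calls must pair up, and an enable that arrives when the depth is already zero is reported as unbalanced. The userspace model below only illustrates that counting; `struct irq_model`, `model_disable_irq()` and `model_enable_irq()` are hypothetical names and not kernel APIs.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace model of one interrupt line; not kernel code. */
struct irq_model {
	int depth;      /* 0 = enabled, N > 0 = disabled N times */
	int unbalanced; /* enable calls that arrived while depth was 0 */
};

/* Each disable nests one level deeper. */
static void model_disable_irq(struct irq_model *irq)
{
	irq->depth++;
}

/* Each enable undoes one disable; an extra one is reported, not applied. */
static void model_enable_irq(struct irq_model *irq)
{
	assert(irq->depth >= 0);
	if (irq->depth == 0) {
		/* The real kernel prints "Unbalanced enable for IRQ n" here. */
		irq->unbalanced++;
		fprintf(stderr, "model: unbalanced enable\n");
		return;
	}
	irq->depth--;
}
```

In this model, one `model_disable_irq()` followed by two `model_enable_irq()` calls leaves `depth` at 0 and `unbalanced` at 1, which is the pattern the log above points to.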
* Re: eeh-powernv.c: Unbalanced IRQ warning
From: Gavin Shan @ 2015-07-28 1:11 UTC
To: Daniel Axtens; +Cc: Alistair Popple, linuxppc-dev

On Mon, Jul 27, 2015 at 05:37:03PM +1000, Daniel Axtens wrote:
>Do you know what might be causing this and how it might be fixed?
>Thanks in advance!
>

Daniel, could you check if the attached patch fixes the issue? If it
helps, I'll clean it up and send it out for review together with other
cleanup patches.

Thanks,
Gavin

[-- Attachment #2: 0001-powerpc-powernv-Reenable-EEH-IRQ-if-necessary.patch --]
[-- Type: text/x-diff, Size: 3021 bytes --]

From 64484296abf5a6419e9c31d7b394f92e541d73d3 Mon Sep 17 00:00:00 2001
From: Gavin Shan <gwshan@linux.vnet.ibm.com>
Date: Tue, 28 Jul 2015 10:58:29 +1000
Subject: [PATCH] powerpc/powernv: Reenable EEH IRQ if necessary

pnv_eeh_next_error() is called to handle EEH special event. The
function can be called for multiple times for one EEH special event.
So we can't enable the EEH IRQ without limitation. Otherwise, the
following warning would be seen because of attempt to enable IRQ,
which has been enabled. The patch introduces another flag to track
the EEH IRQ enablement state and doesn't enable it if it's already
enabled.

EEH: Notify device driver to resume
Unbalanced enable for IRQ 17
------------[ cut here ]------------
WARNING: at /scratch/dja/linux-capi/kernel/irq/manage.c:511
Modules linked in: cxl
:
NIP [c0000000000f1760] .__enable_irq+0x30/0xd0
LR [c0000000000f175c] .__enable_irq+0x2c/0xd0
Call Trace:
[c000000751b97990] [c0000000000f175c] .__enable_irq+0x2c/0xd0 (unreliable)
[c000000751b97a20] [c0000000000f1848] .enable_irq+0x48/0x90
[c000000751b97ab0] [c00000000006ab00] .pnv_eeh_next_error+0x1f0/0x6f0
[c000000751b97ba0] [c000000000035908] .eeh_handle_event+0xb8/0x2f0
[c000000751b97c70] [c000000000035cf8] .eeh_event_handler+0x1b8/0x1c0
[c000000751b97d30] [c0000000000bb564] .kthread+0x104/0x130
[c000000751b97e30] [c0000000000095a4] .ret_from_kernel_thread+0x58/0xb4

Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index 5cf5e6e..28ac8d1 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -41,6 +41,7 @@
 #include "pci.h"
 
 static bool pnv_eeh_nb_init = false;
+static bool pnv_eeh_irq_enabled = false;
 static int eeh_event_irq = -EINVAL;
 
 /**
@@ -98,7 +99,10 @@ static irqreturn_t pnv_eeh_event(int irq, void *data)
 	 * finished processing the outstanding ones. Event processing
 	 * gets unmasked in next_error() if EEH is enabled.
 	 */
-	disable_irq_nosync(irq);
+	if (pnv_eeh_irq_enabled) {
+		disable_irq_nosync(irq);
+		pnv_eeh_irq_enabled = false;
+	}
 
 	if (eeh_enabled())
 		eeh_send_failure_event(NULL);
@@ -243,11 +247,14 @@ static int pnv_eeh_post_init(void)
 			return ret;
 		}
 
+		pnv_eeh_irq_enabled = true;
 		pnv_eeh_nb_init = true;
 	}
 
-	if (!eeh_enabled())
+	if (!eeh_enabled() && pnv_eeh_irq_enabled) {
 		disable_irq(eeh_event_irq);
+		pnv_eeh_irq_enabled = false;
+	}
 
 	list_for_each_entry(hose, &hose_list, list_node) {
 		phb = hose->private_data;
@@ -1478,8 +1485,10 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
 	}
 
 	/* Unmask the event */
-	if (eeh_enabled())
+	if (eeh_enabled() && !pnv_eeh_irq_enabled) {
 		enable_irq(eeh_event_irq);
+		pnv_eeh_irq_enabled = true;
+	}
 
 	return ret;
 }
--
2.1.0

* Re: eeh-powernv.c: Unbalanced IRQ warning
From: Alistair Popple @ 2015-07-28 1:14 UTC
To: Daniel Axtens; +Cc: linuxppc-dev, Gavin Shan

Hi Daniel,

I see the problem - pnv_eeh_next_error() re-enables the interrupt but it gets
called from a loop if there are more outstanding events to process. The most
obvious solution would be to do this check before enabling interrupts:

	if (ret == EEH_NEXT_ERR_NONE && eeh_enabled())

instead of:

	if (eeh_enabled())

This should work fine so long as pnv_eeh_next_error() is called continuously
until it returns either EEH_NEXT_ERR_NONE or another value which signals that
pnv_eeh_next_error() should never be called again. As far as I can tell this
looks to be true (perhaps Gavin can confirm?)

Would you mind trying the below patch and seeing if it fixes the problem?
Thanks!

-- >8 --
From 6eeed1d6dd25e8cf6bfe3423dc50ff855d1cbc42 Mon Sep 17 00:00:00 2001
From: Alistair Popple <alistair@popple.id.au>

---
 arch/powerpc/platforms/powernv/eeh-powernv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c
index ca825ec..ff41c03 100644
--- a/arch/powerpc/platforms/powernv/eeh-powernv.c
+++ b/arch/powerpc/platforms/powernv/eeh-powernv.c
@@ -1478,7 +1478,7 @@ static int pnv_eeh_next_error(struct eeh_pe **pe)
 	}
 
 	/* Unmask the event */
-	if (eeh_enabled())
+	if (ret == EEH_NEXT_ERR_NONE && eeh_enabled())
 		enable_irq(eeh_event_irq);
 
 	return ret;
--
1.8.3.2

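Editorial sketch of the loop behaviour described in this message. It is a userspace model, not the kernel code: it assumes the interrupt handler masks the event IRQ once, and that the caller keeps asking for the next error until it gets the "none" value. The names `struct eeh_model`, `model_next_error()`, `run_model()` and the two-value enum are hypothetical stand-ins. The flag `only_when_drained` selects between the old condition (unmask on every call) and the proposed one (unmask only when no error is left).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the EEH_NEXT_ERR_* return values. */
enum model_next_err { MODEL_ERR_NONE = 0, MODEL_ERR_FROZEN_PE };

struct eeh_model {
	int pending;    /* outstanding errors still to be reported */
	int irq_depth;  /* 0 = event IRQ enabled, > 0 = masked */
	int unbalanced; /* unmask attempts made while already enabled */
};

static void model_enable_irq(struct eeh_model *m)
{
	if (m->irq_depth == 0)
		m->unbalanced++; /* where the kernel would WARN */
	else
		m->irq_depth--;
}

/* Report one pending error, then unmask according to the chosen policy. */
static enum model_next_err model_next_error(struct eeh_model *m,
					    bool only_when_drained)
{
	enum model_next_err ret = MODEL_ERR_NONE;

	if (m->pending > 0) {
		m->pending--;
		ret = MODEL_ERR_FROZEN_PE;
	}

	if (!only_when_drained || ret == MODEL_ERR_NONE)
		model_enable_irq(m);

	return ret;
}

/* Returns how many unbalanced unmask attempts one handling pass causes. */
static int run_model(int events, bool only_when_drained)
{
	struct eeh_model m = { .pending = events, .irq_depth = 0, .unbalanced = 0 };

	assert(events >= 0);
	m.irq_depth++; /* the interrupt handler masks the IRQ once */

	while (model_next_error(&m, only_when_drained) != MODEL_ERR_NONE)
		; /* caller loops until nothing is left to process */

	return m.unbalanced;
}
```

With two pending events, `run_model(2, false)` counts two unbalanced unmask attempts, while `run_model(2, true)` counts none, because the single unmask happens on the final call that finds nothing left.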
* Re: eeh-powernv.c: Unbalanced IRQ warning
From: Gavin Shan @ 2015-07-28 1:28 UTC
To: Alistair Popple; +Cc: Daniel Axtens, linuxppc-dev, Gavin Shan

On Tue, Jul 28, 2015 at 11:14:51AM +1000, Alistair Popple wrote:
>Hi Daniel,
>
>I see the problem - pnv_eeh_next_error() re-enables the interrupt but it gets
>called from a loop if there are more outstanding events to process. The most
>obvious solution would be to do this check before enabling interrupts:
>
>	if (ret == EEH_NEXT_ERR_NONE && eeh_enabled())
>
>instead of:
>
>	if (eeh_enabled())
>
>This should work fine so long as pnv_eeh_next_error() is called continuously
>until it returns either EEH_NEXT_ERR_NONE or another value which signals that
>pnv_eeh_next_error() should never be called again. As far as I can tell this
>looks to be true (perhaps Gavin can confirm?)
>

Yeah, I confirmed. The way that Alistair fixes the issue is simple and more
precise. Please try his fix and ignore the one I sent a couple of minutes
before.

Thanks,
Gavin

* Re: eeh-powernv.c: Unbalanced IRQ warning
From: Daniel Axtens @ 2015-07-28 1:56 UTC
To: Gavin Shan; +Cc: Alistair Popple, linuxppc-dev

Hi Alistair and Gavin,

The patch from Alistair fixes my issue. Thanks heaps!

Alistair, are you right to post that formally?

--
Regards,
Daniel

* Re: eeh-powernv.c: Unbalanced IRQ warning
From: Alistair Popple @ 2015-07-28 5:39 UTC
To: Daniel Axtens; +Cc: Gavin Shan, linuxppc-dev

On Tue, 28 Jul 2015 11:56:31 Daniel Axtens wrote:
> Hi Alistair and Gavin,
>
> The patch from Alistair fixes my issue. Thanks heaps!
>
> Alistair, are you right to post that formally?

Yep. I'll post it tomorrow morning.
