* Pending interrupts not always replayed @ 2010-04-18 3:34 Guillaume Knispel 2010-04-18 14:46 ` Guillaume Knispel 0 siblings, 1 reply; 4+ messages in thread From: Guillaume Knispel @ 2010-04-18 3:34 UTC (permalink / raw) To: linux-kernel, Linuxppc-dev Cc: Lars-Peter Clausen, Bartlomiej Zolnierkiewicz, Peter Zijlstra, Michael Buesch, Ingo Molnar, Linus Torvalds, Thomas Gleixner Hi, =46rom reading the code (kernel/irq stuffs), it seems that at least some handle_edge_irq based interrupts are not replayed when enabling if desc->chip->retrigger =3D=3D NULL and on a platform where CONFIG_HARDIRQS_SW_RESEND is not set (which for now is only defined for (some?) arm and avr32). Depending on the pic this problem might be masked if multiple interrupts are received between disable_irq and enable_irq (maybe other conditions exist with controllers i don't know). I happen to program a PPC SoC (MPC8555E to be precise), using some edge interrupts on the PortC of the CPM2 pic (arch/powerpc/sysdev/cpm2_pic.c) in a driver that does some enable_irq/disable_irq while various patterns of interrupts are received and i think i've hit the problem. Is the behavior intended (not always replaying edge triggered interrupts) or is it just a bug to fix? Anybody knows if it's safe to activate CONFIG_HARDIRQS_SW_RESEND for a powerpc or are there some known dangerous interactions? (I'll give it a try on my side in the meantime anyway) Cheers! --=20 Guillaume Knispel ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Pending interrupts not always replayed 2010-04-18 3:34 Pending interrupts not always replayed Guillaume Knispel @ 2010-04-18 14:46 ` Guillaume Knispel 2010-04-18 19:14 ` Thomas Gleixner 0 siblings, 1 reply; 4+ messages in thread From: Guillaume Knispel @ 2010-04-18 14:46 UTC (permalink / raw) To: linux-kernel, Linuxppc-dev Cc: Lars-Peter Clausen, Bartlomiej Zolnierkiewicz, Peter Zijlstra, Michael Buesch, Ingo Molnar, Linus Torvalds, Thomas Gleixner On Sun, 18 Apr 2010 05:34:39 +0200 Guillaume Knispel <gknispel@proformatique.com> wrote: > From reading the code (kernel/irq stuffs), it seems that at least some > handle_edge_irq based interrupts are not replayed when enabling if > desc->chip->retrigger == NULL and on a platform where > CONFIG_HARDIRQS_SW_RESEND is not set (which for now is only defined for > (some?) arm and avr32). Depending on the pic this problem might > be masked if multiple interrupts are received between disable_irq and > enable_irq (maybe other conditions exist with controllers i don't know). > > I happen to program a PPC SoC (MPC8555E to be precise), using some edge > interrupts on the PortC of the CPM2 pic (arch/powerpc/sysdev/cpm2_pic.c) > in a driver that does some enable_irq/disable_irq while various > patterns of interrupts are received and i think i've hit the problem. > > Is the behavior intended (not always replaying edge triggered interrupts) > or is it just a bug to fix? > > Anybody knows if it's safe to activate CONFIG_HARDIRQS_SW_RESEND for a > powerpc or are there some known dangerous interactions? > (I'll give it a try on my side in the meantime anyway) A follow up on this issue and on my CONFIG_HARDIRQS_SW_RESEND tests: I've basically done this: Index: linux-2.6.22.18/arch/powerpc/Kconfig =================================================================== --- linux-2.6.22.18.orig/arch/powerpc/Kconfig 2010-04-18 15:22:24.000000000 +0200 +++ linux-2.6.22.18/arch/powerpc/Kconfig 2010-04-18 15:23:21.000000000 +0200 @@ -35,6 +35,10 @@ bool default y +config HARDIRQS_SW_RESEND + bool + default y + config IRQ_PER_CPU bool default y (yeah, I know, old kernel, but the code involved hasn't change much since) and of course +CONFIG_HARDIRQS_SW_RESEND=y in my .config Now everything seems to work fine: my device was not previously not interrupting anymore after typically 1 or 2 minutes (because the interrupt signal stays at level low until the device is served, so if one falling edge is missed no more interruption will ever be seen), and right now it has been running for 1 hour without any observable problem. Cheers! -- Guillaume Knispel ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Pending interrupts not always replayed 2010-04-18 14:46 ` Guillaume Knispel @ 2010-04-18 19:14 ` Thomas Gleixner 2010-04-19 13:09 ` Guillaume Knispel 0 siblings, 1 reply; 4+ messages in thread From: Thomas Gleixner @ 2010-04-18 19:14 UTC (permalink / raw) To: Guillaume Knispel Cc: Lars-Peter Clausen, Bartlomiej Zolnierkiewicz, Peter Zijlstra, linux-kernel, Linus Torvalds, Michael Buesch, Ingo Molnar, Linuxppc-dev On Sun, 18 Apr 2010, Guillaume Knispel wrote: > Now everything seems to work fine: my device was not previously not > interrupting anymore after typically 1 or 2 minutes (because the > interrupt signal stays at level low until the device is served, so So you are having a level interrupt device and the irq line is handled by an edge type handler ? Why not configuring the irq line for level and in the first place ? Thanks, tglx ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Pending interrupts not always replayed 2010-04-18 19:14 ` Thomas Gleixner @ 2010-04-19 13:09 ` Guillaume Knispel 0 siblings, 0 replies; 4+ messages in thread From: Guillaume Knispel @ 2010-04-19 13:09 UTC (permalink / raw) To: Thomas Gleixner Cc: Lars-Peter Clausen, Bartlomiej Zolnierkiewicz, Peter Zijlstra, linux-kernel, Linus Torvalds, Michael Buesch, Ingo Molnar, Linuxppc-dev On Sun, 18 Apr 2010 21:14:12 +0200 (CEST) Thomas Gleixner <tglx@linutronix.de> wrote: > On Sun, 18 Apr 2010, Guillaume Knispel wrote: > > Now everything seems to work fine: my device was not previously not > > interrupting anymore after typically 1 or 2 minutes (because the > > interrupt signal stays at level low until the device is served, so > > So you are having a level interrupt device and the irq line is handled > by an edge type handler ? Why not configuring the irq line for level > and in the first place ? Not really the problem here, it indeed helped me do discover the real bug (because I would not have seen that an interrupt went missing if the device have generated another one all by itself soon after that). I would of course prefer to have this line in level trigger, but the pic just can't do it for this signal, so I play with masking and unmasking interrupts on the device side to regenerate edges when needed. When interrupts are enabled, I do that in the isr, so if it is not called after a falling edge it won't be anymore in the future. What is a real bug IMHO is that when in falling edge trigger mode, the sequence: - disable_irq(), - 1 falling edge, - enable_irq() doesn't lead to the isr being called just after enable_irq(). This is because: * __disable_irq basically do (when really disabling) desc->status |= IRQ_DISABLED; desc->chip->disable(irq); <= noop or racy see under * arch/powerpc/sysdev/cpm2_pic.c does not define a disable() in struct irq_chip cpm2_pic. So default_disable is used, and it's a noop. For pics where this is not a noop, there can be a race if the interrupt triggers before the disable(): the flow handler will still be executed just after desc->lock is unlocked in disable_irq_nosync(). So it does not really matters whether disable() is implemented or is a noop. * When the flow handler is executed after disable_irq_nosync(), handle_edge_irq() does: desc->status &= ~(IRQ_REPLAY | IRQ_WAITING); desc->status |= (IRQ_PENDING | IRQ_MASKED); mask_ack_irq(desc, irq); The responsibility of triggering the interrupt has now been transferred from the pic to the kernel (the interrupt has been acked at pic level), so I would say that if the kernel does not run replay it later, it will be a bug. * When __enable_irq() is called, it does: unsigned int status = desc->status & ~IRQ_DISABLED; desc->status = status | IRQ_NOPROBE; check_irq_resend(desc, irq); * if it's an edge triggered IRQ (and in my case it is) check_irq_resend() sometimes, if the planets are aligned, resend the IRQ. Planet alignement is defined by retrigger being provided for the irq_chip and retrigger succeed, or CONFIG_HARDIRQS_SW_RESEND is defined (in which case and if retrigger is absent or has failed, the hardirq is indeed executed in a tasklet -- which might maybe cause obscure problems?). My guess is that CONFIG_HARDIRQS_SW_RESEND was meant to be defined on any platform where a pic can exist and can do edge trigger but does not implement retrigger or implements a retrigger that can fail (because otherwise interrupts can be missed). This seems to be a quite unknown fact that might have been missed by multiple irq_chips (and could be missed by more in the future), and I would propose to just unconditionally build the resend_irqs() tasklet and its scheduling code, and completely remove CONFIG_HARDIRQS_SW_RESEND, at least if it is sure that executing a flow handler in a tasklet can never cause obscure problems. Cheers! -- Guillaume Knispel ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2010-04-19 13:09 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-04-18 3:34 Pending interrupts not always replayed Guillaume Knispel 2010-04-18 14:46 ` Guillaume Knispel 2010-04-18 19:14 ` Thomas Gleixner 2010-04-19 13:09 ` Guillaume Knispel
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).