* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
From: Will Deacon @ 2012-01-18 15:01 UTC
To: linux-arm-kernel

The kexec machine crash code can be called in interrupt context via a
sysrq trigger made using the magic key combination. If the irq chip
dealing with the serial interrupt is using the fasteoi flow handler,
then we will never EOI the interrupt because the interrupt handler will
be fatal. In the case of a GIC, this results in the crash kernel not
receiving interrupts on that CPU interface.

This patch adds code (based on the PowerPC implementation) to EOI any
pending interrupts on the crash CPU before masking and disabling all
interrupts. Secondary cores are not a problem since they are placed into
a cpu_relax() loop via an IPI.

Reported-by: Lei Wen <leiwen@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 arch/arm/kernel/machine_kexec.c |   24 ++++++++++++++++++++++++
 1 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index 764bd45..f007c54 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -7,6 +7,7 @@
 #include <linux/delay.h>
 #include <linux/reboot.h>
 #include <linux/io.h>
+#include <linux/irq.h>
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
 #include <asm/mmu_context.h>
@@ -53,6 +54,28 @@ void machine_crash_nonpanic_core(void *unused)
 		cpu_relax();
 }
 
+static void machine_kexec_mask_interrupts(void) {
+	unsigned int i;
+	struct irq_desc *desc;
+
+	for_each_irq_desc(i, desc) {
+		struct irq_chip *chip;
+
+		chip = irq_desc_get_chip(desc);
+		if (!chip)
+			continue;
+
+		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
+			chip->irq_eoi(&desc->irq_data);
+
+		if (chip->irq_mask)
+			chip->irq_mask(&desc->irq_data);
+
+		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
+			chip->irq_disable(&desc->irq_data);
+	}
+}
+
 void machine_crash_shutdown(struct pt_regs *regs)
 {
 	unsigned long msecs;
@@ -70,6 +93,7 @@ void machine_crash_shutdown(struct pt_regs *regs)
 		printk(KERN_WARNING "Non-crashing CPUs did not react to IPI\n");
 
 	crash_save_cpu(regs, smp_processor_id());
+	machine_kexec_mask_interrupts();
 
 	printk(KERN_INFO "Loading crashdump kernel...\n");
 }
-- 
1.7.4.1

* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
From: Russell King - ARM Linux @ 2012-01-18 15:07 UTC
To: linux-arm-kernel

On Wed, Jan 18, 2012 at 03:01:25PM +0000, Will Deacon wrote:
> The kexec machine crash code can be called in interrupt context via a
> sysrq trigger made using the magic key combination. If the irq chip
> dealing with the serial interrupt is using the fasteoi flow handler,
> then we will never EOI the interrupt because the interrupt handler will
> be fatal. In the case of a GIC, this results in the crash kernel not
> receiving interrupts on that CPU interface.
>
> This patch adds code (based on the PowerPC implementation) to EOI any
> pending interrupts on the crash CPU before masking and disabling all
> interrupts. Secondary cores are not a problem since they are placed into
> a cpu_relax() loop via an IPI.

So, what happens if we fault in an interrupt handler, we have
panic_on_oops set, and we have panic configured to automatically
reboot after a period?

I think we actually want this to happen at boot to make sure that the
CPU interfaces are properly initialized each time the kernel is brought
up.

> @@ -53,6 +54,28 @@ void machine_crash_nonpanic_core(void *unused)
>  		cpu_relax();
>  }
>
> +static void machine_kexec_mask_interrupts(void) {

Coding style error.

* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
From: Will Deacon @ 2012-01-18 15:31 UTC
To: linux-arm-kernel

Hi Russell,

On Wed, Jan 18, 2012 at 03:07:24PM +0000, Russell King - ARM Linux wrote:
> On Wed, Jan 18, 2012 at 03:01:25PM +0000, Will Deacon wrote:
> > The kexec machine crash code can be called in interrupt context via a
> > sysrq trigger made using the magic key combination. If the irq chip
> > dealing with the serial interrupt is using the fasteoi flow handler,
> > then we will never EOI the interrupt because the interrupt handler will
> > be fatal. In the case of a GIC, this results in the crash kernel not
> > receiving interrupts on that CPU interface.
> >
> > This patch adds code (based on the PowerPC implementation) to EOI any
> > pending interrupts on the crash CPU before masking and disabling all
> > interrupts. Secondary cores are not a problem since they are placed into
> > a cpu_relax() loop via an IPI.
>
> So, what happens if we fault in an interrupt handler, we have
> panic_on_oops set, and we have panic configured to automatically
> reboot after a period?

If we fault in an interrupt handler, we'll end up with the faulting CPU in
machine_crash_shutdown, with the secondaries getting put into
machine_crash_nonpanic_core. Then the faulting CPU will EOI the interrupt it
was previously handling, before masking it.

The interrupt may of course remain asserted at the distributor, but it will
be masked, so it means the new crash kernel might receive a spurious
interrupt when it unmasks it via request_irq.

Or have you identified an issue that I'm missing?

> I think we actually want this to happen at boot to make sure that the
> CPU interfaces are properly initialized each time the kernel is brought
> up.

The problem with that is working out which interrupts to EOI, on which
CPU interfaces and in which order. The GIC manual states that EOIing an
interrupt which hasn't been previously acked on that interface is
UNPREDICTABLE.

> > @@ -53,6 +54,28 @@ void machine_crash_nonpanic_core(void *unused)
> >  		cpu_relax();
> >  }
> >
> > +static void machine_kexec_mask_interrupts(void) {
>
> Coding style error.

Will fix.

Cheers,

Will

* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
From: Russell King - ARM Linux @ 2012-01-18 15:40 UTC
To: linux-arm-kernel

On Wed, Jan 18, 2012 at 03:31:20PM +0000, Will Deacon wrote:
> Hi Russell,
>
> On Wed, Jan 18, 2012 at 03:07:24PM +0000, Russell King - ARM Linux wrote:
> > On Wed, Jan 18, 2012 at 03:01:25PM +0000, Will Deacon wrote:
> > > The kexec machine crash code can be called in interrupt context via a
> > > sysrq trigger made using the magic key combination. If the irq chip
> > > dealing with the serial interrupt is using the fasteoi flow handler,
> > > then we will never EOI the interrupt because the interrupt handler will
> > > be fatal. In the case of a GIC, this results in the crash kernel not
> > > receiving interrupts on that CPU interface.
> > >
> > > This patch adds code (based on the PowerPC implementation) to EOI any
> > > pending interrupts on the crash CPU before masking and disabling all
> > > interrupts. Secondary cores are not a problem since they are placed into
> > > a cpu_relax() loop via an IPI.
> >
> > So, what happens if we fault in an interrupt handler, we have
> > panic_on_oops set, and we have panic configured to automatically
> > reboot after a period?
>
> If we fault in an interrupt handler, we'll end up with the faulting CPU in
> machine_crash_shutdown, with the secondaries getting put into
> machine_crash_nonpanic_core. Then the faulting CPU will EOI the interrupt it
> was previously handling, before masking it.

How does the faulting interrupt get EOI'd? If we've faulted in an
interrupt handler, we don't return to the interrupt handler to complete
the handling.

> The problem with that is working out which interrupts to EOI, on which
> CPU interfaces and in which order. The GIC manual states that EOIing an
> interrupt which hasn't been previously acked on that interface is
> UNPREDICTABLE.

Is there no way to re-initialize the gic after an interrupt has begun
to be processed, but not EOI'd ?

* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
From: Will Deacon @ 2012-01-18 15:48 UTC
To: linux-arm-kernel

On Wed, Jan 18, 2012 at 03:40:44PM +0000, Russell King - ARM Linux wrote:
> On Wed, Jan 18, 2012 at 03:31:20PM +0000, Will Deacon wrote:
> > On Wed, Jan 18, 2012 at 03:07:24PM +0000, Russell King - ARM Linux wrote:
> > > On Wed, Jan 18, 2012 at 03:01:25PM +0000, Will Deacon wrote:
> > > > The kexec machine crash code can be called in interrupt context via a
> > > > sysrq trigger made using the magic key combination. If the irq chip
> > > > dealing with the serial interrupt is using the fasteoi flow handler,
> > > > then we will never EOI the interrupt because the interrupt handler will
> > > > be fatal. In the case of a GIC, this results in the crash kernel not
> > > > receiving interrupts on that CPU interface.
> > > >
> > > > This patch adds code (based on the PowerPC implementation) to EOI any
> > > > pending interrupts on the crash CPU before masking and disabling all
> > > > interrupts. Secondary cores are not a problem since they are placed into
> > > > a cpu_relax() loop via an IPI.
> > >
> > > So, what happens if we fault in an interrupt handler, we have
> > > panic_on_oops set, and we have panic configured to automatically
> > > reboot after a period?
> >
> > If we fault in an interrupt handler, we'll end up with the faulting CPU in
> > machine_crash_shutdown, with the secondaries getting put into
> > machine_crash_nonpanic_core. Then the faulting CPU will EOI the interrupt it
> > was previously handling, before masking it.
>
> How does the faulting interrupt get EOI'd? If we've faulted in an
> interrupt handler, we don't return to the interrupt handler to complete
> the handling.

That's what the new machine_kexec_mask_interrupts is supposed to do. It
iterates over all of the irq_descs and does:

	if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
		chip->irq_eoi(&desc->irq_data);

This is called from machine_crash_shutdown, so we should pick up the
previous interrupt there without having to return to the handler.

> > The problem with that is working out which interrupts to EOI, on which
> > CPU interfaces and in which order. The GIC manual states that EOIing an
> > interrupt which hasn't been previously acked on that interface is
> > UNPREDICTABLE.
>
> Is there no way to re-initialize the gic after an interrupt has begun
> to be processed, but not EOI'd ?

Unfortunately, I'm not aware of anything like this (I was hoping for some
reset functionality). I'll double check that I didn't miss anything, but it
seems that you can't reset the GIC state machine without manually putting it
all back.

Will

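The loop described in the message above can be sketched in userspace as
follows. This is only an illustrative model, not kernel code: struct
mock_irq and mock_crash_mask_interrupts() are hypothetical names invented
for this sketch, and the boolean fields stand in for the state that the
patch queries through irq_desc_get_chip(), irqd_irq_inprogress() and
irqd_irq_disabled(). Unlike the patch, the model assumes every chip
provides all three callbacks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one interrupt line; not a kernel type. */
struct mock_irq {
	bool has_chip;		/* stands in for irq_desc_get_chip() != NULL */
	bool in_progress;	/* acked but not yet EOI'd */
	bool masked;
	bool disabled;
	unsigned int eoi_count;	/* number of EOIs issued for this line */
};

/*
 * Same order of operations as the patch: EOI lines that are in progress,
 * then mask every line, then disable any line not already disabled.
 */
void mock_crash_mask_interrupts(struct mock_irq *irqs, size_t n)
{
	assert(irqs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct mock_irq *irq = &irqs[i];

		if (!irq->has_chip)
			continue;

		/* Only lines that were acked get an EOI. */
		if (irq->in_progress) {
			irq->in_progress = false;
			irq->eoi_count++;
		}

		irq->masked = true;

		if (!irq->disabled)
			irq->disabled = true;
	}
}
```

Restricting the EOI to in-progress lines reflects the constraint raised in
the thread: on a GIC, EOIing an interrupt that was never acked on that CPU
interface is UNPREDICTABLE, so the loop cannot simply EOI everything.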
* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
From: Will Deacon @ 2012-01-19 13:33 UTC
To: linux-arm-kernel

On Wed, Jan 18, 2012 at 03:48:09PM +0000, Will Deacon wrote:
> On Wed, Jan 18, 2012 at 03:40:44PM +0000, Russell King - ARM Linux wrote:
> >
> > Is there no way to re-initialize the gic after an interrupt has begun
> > to be processed, but not EOI'd ?
>
> Unfortunately, I'm not aware of anything like this (I was hoping for some
> reset functionality). I'll double check that I didn't miss anything, but it
> seems that you can't reset the GIC state machine without manually putting it
> all back.

Ok, I re-read the docs and also spoke to some of the hardware guys and it
seems that you can only kill off all the active interrupts from GICv2 (i.e.
Cortex-A15 onwards).

So I reckon the best way for the time being is to do this on the crash path.

Will