* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
@ 2012-01-18 15:01 Will Deacon
2012-01-18 15:07 ` Russell King - ARM Linux
0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2012-01-18 15:01 UTC
To: linux-arm-kernel
The kexec machine crash code can be called in interrupt context via a
sysrq trigger made using the magic key combination. If the irq chip
dealing with the serial interrupt is using the fasteoi flow handler,
then we will never EOI the interrupt because the interrupt handler will
be fatal. In the case of a GIC, this results in the crash kernel not
receiving interrupts on that CPU interface.
This patch adds code (based on the PowerPC implementation) to EOI any
pending interrupts on the crash CPU before masking and disabling all
interrupts. Secondary cores are not a problem since they are placed into
a cpu_relax() loop via an IPI.
Reported-by: Lei Wen <leiwen@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm/kernel/machine_kexec.c | 24 ++++++++++++++++++++++++
1 files changed, 24 insertions(+), 0 deletions(-)
diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c
index 764bd45..f007c54 100644
--- a/arch/arm/kernel/machine_kexec.c
+++ b/arch/arm/kernel/machine_kexec.c
@@ -7,6 +7,7 @@
 #include <linux/delay.h>
 #include <linux/reboot.h>
 #include <linux/io.h>
+#include <linux/irq.h>
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
 #include <asm/mmu_context.h>
@@ -53,6 +54,28 @@ void machine_crash_nonpanic_core(void *unused)
 		cpu_relax();
 }
 
+static void machine_kexec_mask_interrupts(void) {
+	unsigned int i;
+	struct irq_desc *desc;
+
+	for_each_irq_desc(i, desc) {
+		struct irq_chip *chip;
+
+		chip = irq_desc_get_chip(desc);
+		if (!chip)
+			continue;
+
+		if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
+			chip->irq_eoi(&desc->irq_data);
+
+		if (chip->irq_mask)
+			chip->irq_mask(&desc->irq_data);
+
+		if (chip->irq_disable && !irqd_irq_disabled(&desc->irq_data))
+			chip->irq_disable(&desc->irq_data);
+	}
+}
+
 void machine_crash_shutdown(struct pt_regs *regs)
 {
 	unsigned long msecs;
@@ -70,6 +93,7 @@ void machine_crash_shutdown(struct pt_regs *regs)
 		printk(KERN_WARNING "Non-crashing CPUs did not react to IPI\n");
 
 	crash_save_cpu(regs, smp_processor_id());
+	machine_kexec_mask_interrupts();
 
 	printk(KERN_INFO "Loading crashdump kernel...\n");
 }
--
1.7.4.1
* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
2012-01-18 15:01 [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path Will Deacon
@ 2012-01-18 15:07 ` Russell King - ARM Linux
2012-01-18 15:31 ` Will Deacon
0 siblings, 1 reply; 6+ messages in thread
From: Russell King - ARM Linux @ 2012-01-18 15:07 UTC
To: linux-arm-kernel
On Wed, Jan 18, 2012 at 03:01:25PM +0000, Will Deacon wrote:
> The kexec machine crash code can be called in interrupt context via a
> sysrq trigger made using the magic key combination. If the irq chip
> dealing with the serial interrupt is using the fasteoi flow handler,
> then we will never EOI the interrupt because the interrupt handler will
> be fatal. In the case of a GIC, this results in the crash kernel not
> receiving interrupts on that CPU interface.
>
> This patch adds code (based on the PowerPC implementation) to EOI any
> pending interrupts on the crash CPU before masking and disabling all
> interrupts. Secondary cores are not a problem since they are placed into
> a cpu_relax() loop via an IPI.
So, what happens if we fault in an interrupt handler, we have
panic_on_oops set, and we have panic configured to automatically
reboot after a period?
I think we actually want this to happen at boot to make sure that the
CPU interfaces are properly initialized each time the kernel is brought
up.
> @@ -53,6 +54,28 @@ void machine_crash_nonpanic_core(void *unused)
> cpu_relax();
> }
>
> +static void machine_kexec_mask_interrupts(void) {
Coding style error.
* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
2012-01-18 15:07 ` Russell King - ARM Linux
@ 2012-01-18 15:31 ` Will Deacon
2012-01-18 15:40 ` Russell King - ARM Linux
0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2012-01-18 15:31 UTC
To: linux-arm-kernel
Hi Russell,
On Wed, Jan 18, 2012 at 03:07:24PM +0000, Russell King - ARM Linux wrote:
> On Wed, Jan 18, 2012 at 03:01:25PM +0000, Will Deacon wrote:
> > The kexec machine crash code can be called in interrupt context via a
> > sysrq trigger made using the magic key combination. If the irq chip
> > dealing with the serial interrupt is using the fasteoi flow handler,
> > then we will never EOI the interrupt because the interrupt handler will
> > be fatal. In the case of a GIC, this results in the crash kernel not
> > receiving interrupts on that CPU interface.
> >
> > This patch adds code (based on the PowerPC implementation) to EOI any
> > pending interrupts on the crash CPU before masking and disabling all
> > interrupts. Secondary cores are not a problem since they are placed into
> > a cpu_relax() loop via an IPI.
>
> So, what happens if we fault in an interrupt handler, we have
> panic_on_oops set, and we have panic configured to automatically
> reboot after a period?
If we fault in an interrupt handler, we'll end up with the faulting CPU in
machine_crash_shutdown, with the secondaries getting put into
machine_crash_nonpanic_core. Then the faulting CPU will EOI the interrupt it
was previously handling, before masking it. The interrupt may of course
remain asserted at the distributor, but it will be masked, so it means the
new crash kernel might receive a spurious interrupt when it unmasks it via
request_irq.
Or have you identified an issue that I'm missing?
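The sequence described above can be walked through with a toy model. This is only a sketch under simplifying assumptions: the struct and the toy_* helpers are hypothetical names, not kernel or GIC interfaces, and a single level-sensitive line is reduced to three flags (asserted by the device, enabled at the distributor, active on the CPU interface).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified view of one level-sensitive interrupt. */
struct toy_irq_line {
	bool asserted;	/* device is still driving the line */
	bool enabled;	/* not masked at the distributor */
	bool active;	/* acked on the CPU interface, no EOI yet */
};

/* In this model the line is signalled to the CPU only if all three agree. */
bool toy_would_fire(const struct toy_irq_line *l)
{
	return l->asserted && l->enabled && !l->active;
}

/* Crash path as described in the mail: EOI first, then mask. */
void toy_crash_shutdown(struct toy_irq_line *l)
{
	assert(l->active);	/* model assumes the crash hit inside the handler */
	l->active = false;	/* EOI */
	l->enabled = false;	/* mask */
}

/* Stand-in for the crash kernel unmasking the line via request_irq(). */
void toy_request_irq(struct toy_irq_line *l)
{
	l->enabled = true;
}
```

Starting from a line that is asserted, enabled and active, the model stays quiet through toy_crash_shutdown() and only reports a deliverable interrupt once toy_request_irq() unmasks it, which corresponds to the spurious interrupt mentioned above.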
> I think we actually want this to happen at boot to make sure that the
> CPU interfaces are properly initialized each time the kernel is brought
> up.
The problem with that is working out which interrupts to EOI, on which
CPU interfaces and in which order. The GIC manual states that EOIing an
interrupt which hasn't been previously acked on that interface is
UNPREDICTABLE.
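To make the bookkeeping problem concrete, here is a minimal sketch of per-CPU-interface ack/EOI tracking. Everything in it is an assumption made for illustration: the model_* names are hypothetical, and real hardware does not return an error for a stray EOI. The model merely flags the case that the GIC manual calls UNPREDICTABLE.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_IRQS 64

/* One CPU interface: which interrupt IDs were acked here and not yet EOI'd. */
struct model_cpu_iface {
	bool active[MODEL_NR_IRQS];
};

enum model_eoi_result {
	MODEL_EOI_OK,
	MODEL_EOI_UNPREDICTABLE,	/* EOI without a prior ack on this interface */
};

void model_ack(struct model_cpu_iface *c, unsigned int irq)
{
	assert(irq < MODEL_NR_IRQS);
	c->active[irq] = true;
}

enum model_eoi_result model_eoi(struct model_cpu_iface *c, unsigned int irq)
{
	assert(irq < MODEL_NR_IRQS);
	if (!c->active[irq])
		return MODEL_EOI_UNPREDICTABLE;
	c->active[irq] = false;
	return MODEL_EOI_OK;
}
```

A freshly booted kernel has no equivalent of the active[] array for the previous kernel, so it cannot tell which (interface, interrupt) pairs are safe to EOI. The crashing kernel still has its irq descriptors and can.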
> > @@ -53,6 +54,28 @@ void machine_crash_nonpanic_core(void *unused)
> > cpu_relax();
> > }
> >
> > +static void machine_kexec_mask_interrupts(void) {
>
> Coding style error.
Will fix.
Cheers,
Will
* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
2012-01-18 15:31 ` Will Deacon
@ 2012-01-18 15:40 ` Russell King - ARM Linux
2012-01-18 15:48 ` Will Deacon
0 siblings, 1 reply; 6+ messages in thread
From: Russell King - ARM Linux @ 2012-01-18 15:40 UTC
To: linux-arm-kernel
On Wed, Jan 18, 2012 at 03:31:20PM +0000, Will Deacon wrote:
> Hi Russell,
>
> On Wed, Jan 18, 2012 at 03:07:24PM +0000, Russell King - ARM Linux wrote:
> > On Wed, Jan 18, 2012 at 03:01:25PM +0000, Will Deacon wrote:
> > > The kexec machine crash code can be called in interrupt context via a
> > > sysrq trigger made using the magic key combination. If the irq chip
> > > dealing with the serial interrupt is using the fasteoi flow handler,
> > > then we will never EOI the interrupt because the interrupt handler will
> > > be fatal. In the case of a GIC, this results in the crash kernel not
> > > receiving interrupts on that CPU interface.
> > >
> > > This patch adds code (based on the PowerPC implementation) to EOI any
> > > pending interrupts on the crash CPU before masking and disabling all
> > > interrupts. Secondary cores are not a problem since they are placed into
> > > a cpu_relax() loop via an IPI.
> >
> > So, what happens if we fault in an interrupt handler, we have
> > panic_on_oops set, and we have panic configured to automatically
> > reboot after a period?
>
> If we fault in an interrupt handler, we'll end up with the faulting CPU in
> machine_crash_shutdown, with the secondaries getting put into
> machine_crash_nonpanic_core. Then the faulting CPU will EOI the interrupt it
> was previously handling, before masking it.
How does the faulting interrupt get EOI'd? If we've faulted in an
interrupt handler, we don't return to the interrupt handler to complete
the handling.
> The problem with that is working out which interrupts to EOI, on which
> CPU interfaces and in which order. The GIC manual states that EOIing an
> interrupt which hasn't been previously acked on that interface is
> UNPREDICTABLE.
Is there no way to re-initialize the gic after an interrupt has begun
to be processed, but not EOI'd ?
* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
2012-01-18 15:40 ` Russell King - ARM Linux
@ 2012-01-18 15:48 ` Will Deacon
2012-01-19 13:33 ` Will Deacon
0 siblings, 1 reply; 6+ messages in thread
From: Will Deacon @ 2012-01-18 15:48 UTC
To: linux-arm-kernel
On Wed, Jan 18, 2012 at 03:40:44PM +0000, Russell King - ARM Linux wrote:
> On Wed, Jan 18, 2012 at 03:31:20PM +0000, Will Deacon wrote:
> > On Wed, Jan 18, 2012 at 03:07:24PM +0000, Russell King - ARM Linux wrote:
> > > On Wed, Jan 18, 2012 at 03:01:25PM +0000, Will Deacon wrote:
> > > > The kexec machine crash code can be called in interrupt context via a
> > > > sysrq trigger made using the magic key combination. If the irq chip
> > > > dealing with the serial interrupt is using the fasteoi flow handler,
> > > > then we will never EOI the interrupt because the interrupt handler will
> > > > be fatal. In the case of a GIC, this results in the crash kernel not
> > > > receiving interrupts on that CPU interface.
> > > >
> > > > This patch adds code (based on the PowerPC implementation) to EOI any
> > > > pending interrupts on the crash CPU before masking and disabling all
> > > > interrupts. Secondary cores are not a problem since they are placed into
> > > > a cpu_relax() loop via an IPI.
> > >
> > > So, what happens if we fault in an interrupt handler, we have
> > > panic_on_oops set, and we have panic configured to automatically
> > > reboot after a period?
> >
> > If we fault in an interrupt handler, we'll end up with the faulting CPU in
> > machine_crash_shutdown, with the secondaries getting put into
> > machine_crash_nonpanic_core. Then the faulting CPU will EOI the interrupt it
> > was previously handling, before masking it.
>
> How does the faulting interrupt get EOI'd? If we've faulted in an
> interrupt handler, we don't return to the interrupt handler to complete
> the handling.
That's what the new machine_kexec_mask_interrupts is supposed to do. It
iterates over all of the irq_descs and does:
	if (chip->irq_eoi && irqd_irq_inprogress(&desc->irq_data))
		chip->irq_eoi(&desc->irq_data);
This is called from machine_crash_shutdown, so we should pick up the
previous interrupt there without having to return to the handler.
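The loop can be exercised outside the kernel with mock types. The sketch below is a userspace approximation, not the patch itself: the mock_* structs and callbacks are hypothetical stand-ins for struct irq_desc, struct irq_chip and struct irq_data, and the in-progress and disabled states are plain booleans rather than the real irqd_* accessors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct irq_data plus the state bits we care about. */
struct mock_irq_data {
	bool inprogress;	/* handler was running when the crash hit */
	bool disabled;
	bool masked;
	unsigned int eoi_count;
};

/* Hypothetical stand-in for struct irq_chip; any callback may be NULL. */
struct mock_irq_chip {
	void (*irq_eoi)(struct mock_irq_data *d);
	void (*irq_mask)(struct mock_irq_data *d);
	void (*irq_disable)(struct mock_irq_data *d);
};

struct mock_irq_desc {
	struct mock_irq_chip *chip;	/* NULL when no chip is attached */
	struct mock_irq_data irq_data;
};

static void mock_eoi(struct mock_irq_data *d) { d->eoi_count++; }
static void mock_mask(struct mock_irq_data *d) { d->masked = true; }
static void mock_disable(struct mock_irq_data *d) { d->disabled = true; }

/* A chip with every callback populated, similar in spirit to a fasteoi chip. */
struct mock_irq_chip mock_full_chip = {
	.irq_eoi = mock_eoi,
	.irq_mask = mock_mask,
	.irq_disable = mock_disable,
};

/* Same three steps as the patch: EOI if in progress, mask, then disable. */
void mock_mask_interrupts(struct mock_irq_desc *descs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct mock_irq_chip *chip = descs[i].chip;
		struct mock_irq_data *d = &descs[i].irq_data;

		if (!chip)
			continue;

		if (chip->irq_eoi && d->inprogress)
			chip->irq_eoi(d);

		if (chip->irq_mask)
			chip->irq_mask(d);

		if (chip->irq_disable && !d->disabled)
			chip->irq_disable(d);
	}
}
```

Only the descriptor that was in progress receives an EOI. Every descriptor with a chip ends up masked and disabled, and chip-less descriptors are skipped.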
> > The problem with that is working out which interrupts to EOI, on which
> > CPU interfaces and in which order. The GIC manual states that EOIing an
> > interrupt which hasn't been previously acked on that interface is
> > UNPREDICTABLE.
>
> Is there no way to re-initialize the gic after an interrupt has begun
> to be processed, but not EOI'd ?
Unfortunately, I'm not aware of anything like this (I was hoping for some
reset functionality). I'll double check that I didn't miss anything, but it
seems that you can't reset the GIC state machine without manually putting it
all back.
Will
* [PATCH] ARM: kexec: EOI active and mask all interrupts in kexec crash path
2012-01-18 15:48 ` Will Deacon
@ 2012-01-19 13:33 ` Will Deacon
0 siblings, 0 replies; 6+ messages in thread
From: Will Deacon @ 2012-01-19 13:33 UTC
To: linux-arm-kernel
On Wed, Jan 18, 2012 at 03:48:09PM +0000, Will Deacon wrote:
> On Wed, Jan 18, 2012 at 03:40:44PM +0000, Russell King - ARM Linux wrote:
> >
> > Is there no way to re-initialize the gic after an interrupt has begun
> > to be processed, but not EOI'd ?
>
> Unfortunately, I'm not aware of anything like this (I was hoping for some
> reset functionality). I'll double check that I didn't miss anything, but it
> seems that you can't reset the GIC state machine without manually putting it
> all back.
Ok, I re-read the docs and also spoke to some of the hardware guys and it
seems that you can only kill off all the active interrupts from GICv2 (i.e.
Cortex-A15 onwards). So I reckon the best way for the time being is to do
this on the crash path.
Will
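The mail above does not name the GICv2 mechanism. One plausible reading, and it is only an assumption here, is the distributor's Interrupt Clear-Active Registers (GICD_ICACTIVERn), which start at offset 0x380 in the GICv2 distributor map and carry one bit per interrupt ID. The sketch below only computes where a given ID lives. The struct and function names are hypothetical, and no hardware is touched.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of GICD_ICACTIVER0 within the GICv2 distributor register map. */
#define GICD_ICACTIVER_BASE 0x380u

/* Hypothetical helper type: register offset and bit mask for one interrupt ID. */
struct icactiver_loc {
	uint32_t offset;	/* byte offset from the distributor base */
	uint32_t mask;		/* bit to write to clear the active state */
};

struct icactiver_loc icactiver_for_irq(unsigned int irq)
{
	struct icactiver_loc loc;

	assert(irq < 1020);	/* IDs 1020-1023 are reserved for special purposes */

	loc.offset = GICD_ICACTIVER_BASE + (irq / 32u) * 4u;	/* 32 IDs per register */
	loc.mask = (uint32_t)1 << (irq % 32u);
	return loc;
}
```

For example, interrupt ID 33 maps to offset 0x384 with mask 0x2. Whether a boot-time sweep of these registers would be acceptable is exactly what the thread leaves open, so this is not a proposal, just an illustration of the register layout.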