* powerpc/cell/axon-msi: fix MSI after kexec
From: Arnd Bergmann @ 2008-12-12 19:19 UTC (permalink / raw)
To: paulus, linuxppc-dev, cbe-oss-dev, Michael Ellerman, benh
Commit d015fe995 'powerpc/cell/axon-msi: Retry on missing interrupt'
has turned a rare failure to kexec on QS22 into a reproducible
error, which we have now analysed.
The problem is that after a kexec, the MSIC hardware still points
into the middle of the old ring buffer. We set up the ring buffer
during reboot, but not the offset into it. On older kernels, this
would cause a storm of thousands of spurious interrupts after a
kexec, which would most of the time get dropped silently.
With the new code, we time out on each interrupt, waiting for
it to become valid. If more interrupts come in that we time
out on, this goes on indefinitely, which eventually leads to
a hard crash.
The solution in this patch is to read the current offset from
the MSIC when reinitializing it. This now works correctly, as
expected.
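
A minimal sketch of the offset mismatch described above, written as a user-space simulation rather than driver code. The `sim_*` names, the 4 KiB ring size, the 16-byte entry size, and the assumption that the unpatched probe starts reading at offset 0 are all illustrative and not taken from this patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the driver's real constants. */
#define SIM_FIFO_SIZE       0x1000u               /* ring size in bytes, power of two */
#define SIM_FIFO_SIZE_MASK  (SIM_FIFO_SIZE - 1u)
#define SIM_ENTRY_SIZE      0x10u                 /* bytes per FIFO entry */

struct sim_msic {
	uint32_t hw_write_offset;  /* owned by the device, survives a kexec */
	uint32_t read_offset;      /* owned by the driver, reset on every probe */
};

/* Number of FIFO entries the driver believes are waiting to be handled. */
static uint32_t sim_pending(const struct sim_msic *m)
{
	uint32_t bytes;

	/* The mask arithmetic only works for power-of-two ring sizes. */
	assert((SIM_FIFO_SIZE & SIM_FIFO_SIZE_MASK) == 0);

	bytes = (m->hw_write_offset - m->read_offset) & SIM_FIFO_SIZE_MASK;
	return bytes / SIM_ENTRY_SIZE;
}

/* Assumed pre-patch behaviour: the driver starts reading at offset 0. */
static void sim_probe_old(struct sim_msic *m)
{
	m->read_offset = 0;
}

/* Behaviour modelled on the patch: start where the device writes next. */
static void sim_probe_fixed(struct sim_msic *m)
{
	m->read_offset = m->hw_write_offset & SIM_FIFO_SIZE_MASK;
}
```

With a stale device write offset of 0x340, the old probe in this model reports 0x34 pending entries that were never written for the new kernel, while the fixed probe reports none.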
Reported-by: Dirk Herrendoerfer <d.herrendoerfer@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
Please apply when Dirk and Michael have given their Ack.
Should we have it in 2.6.28? Not sure if going from 'works sometimes'
to 'works never' counts as a regression. Most users won't be impacted,
because they don't use kexec on QS22.
diff --git a/arch/powerpc/platforms/cell/axon_msi.c b/arch/powerpc/platforms/cell/axon_msi.c
index 442cf36..548fa4e 100644
--- a/arch/powerpc/platforms/cell/axon_msi.c
+++ b/arch/powerpc/platforms/cell/axon_msi.c
@@ -413,6 +422,9 @@ static int axon_msi_probe(struct of_device *device,
 		  MSIC_CTRL_IRQ_ENABLE | MSIC_CTRL_ENABLE |
 		  MSIC_CTRL_FIFO_SIZE);
+	msic->read_offset = dcr_read(msic->dcr_host, MSIC_WRITE_OFFSET_REG)
+				& MSIC_FIFO_SIZE_MASK;
+
 	device->dev.platform_data = msic;
 	ppc_md.setup_msi_irqs = axon_msi_setup_msi_irqs;
* Re: powerpc/cell/axon-msi: fix MSI after kexec
From: Michael Ellerman @ 2008-12-15 4:30 UTC (permalink / raw)
To: Arnd Bergmann; +Cc: linuxppc-dev, paulus, cbe-oss-dev
On Fri, 2008-12-12 at 20:19 +0100, Arnd Bergmann wrote:
> Commit d015fe995 'powerpc/cell/axon-msi: Retry on missing interrupt'
> has turned a rare failure to kexec on QS22 into a reproducible
> error, which we have now analysed.
>
> The problem is that after a kexec, the MSIC hardware still points
> into the middle of the old ring buffer. We set up the ring buffer
> during reboot, but not the offset into it. On older kernels, this
> would cause a storm of thousands of spurious interrupts after a
> kexec, which would most of the time get dropped silently.
>
> With the new code, we time out on each interrupt, waiting for
> it to become valid. If more interrupts come in that we time
> out on, this goes on indefinitely, which eventually leads to
> a hard crash.
>
> The solution in this patch is to read the current offset from
> the MSIC when reinitializing it. This now works correctly, as
> expected.
>
> Reported-by: Dirk Herrendoerfer <d.herrendoerfer@de.ibm.com>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>
> Please apply when Dirk and Michael have given their Ack.
> Should we have it in 2.6.28? Not sure if going from 'works sometimes'
> to 'works never' counts as a regression. Most users won't be impacted,
> because they don't use kexec on QS22.
I think it does count, it's a pretty small fix.
> diff --git a/arch/powerpc/platforms/cell/axon_msi.c b/arch/powerpc/platforms/cell/axon_msi.c
> index 442cf36..548fa4e 100644
> --- a/arch/powerpc/platforms/cell/axon_msi.c
> +++ b/arch/powerpc/platforms/cell/axon_msi.c
> @@ -413,6 +422,9 @@ static int axon_msi_probe(struct of_device *device,
> MSIC_CTRL_IRQ_ENABLE | MSIC_CTRL_ENABLE |
> MSIC_CTRL_FIFO_SIZE);
>
> + msic->read_offset = dcr_read(msic->dcr_host, MSIC_WRITE_OFFSET_REG)
> + & MSIC_FIFO_SIZE_MASK;
> +
Acked-by: Michael Ellerman <michael@ellerman.id.au>
cheers
--
Michael Ellerman
OzLabs, IBM Australia Development Lab
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)
We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person