ppc826x BAD interrupts

* ppc826x BAD interrupts
@ 2004-01-16 16:02 Jeff Angielski
  2004-01-16 16:35 ` Rob Baxter
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff Angielski @ 2004-01-16 16:02 UTC (permalink / raw)
  To: linuxppc-dev

Looking at /proc/interrupts, I see a large number of "BAD" interrupts on
both my MPC8260 reference board (2.4.21) and my PPC8266 custom board
(2.4.23).  Both use u-boot as the bootloader.

bash-2.05# cat /proc/interrupts
           CPU0
 24:          0   8260 SIU   Edge      PCI IRQ demux
 33: 2658326944   8260 SIU   Edge      fenet
 40:      32524   8260 SIU   Edge      uart
 41:          0   8260 SIU   Edge      uart
BAD:    8862006  <<====== this is the problem

The source of this count is ppc_spurious_interrupts, which is incremented
in arch/ppc/kernel/irq.c if:

	1) there is no interrupt handler installed

	2) SIVEC is showing zero (no interrupts pending)

Looking into it, the problem appears to be the latter case: the get_irq()
function in ppc8260_pic.c is indeed reading a zero from the SIVEC.
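
The two cases can be modelled in plain C on a host machine. This is only a minimal sketch, not the kernel source: fake_sivec, handlers, note_irq(), model_get_irq() and model_do_irq() are hypothetical stand-ins, and it assumes the interrupt code occupies the top six bits of SIVEC, with zero meaning "nothing pending".

```c
#include <stddef.h>
#include <stdint.h>

#define NR_IRQS 64

typedef void (*irq_handler_t)(int irq);

static uint32_t fake_sivec;              /* stand-in for the SIVEC register */
static irq_handler_t handlers[NR_IRQS];  /* NULL means no handler installed */
static unsigned long spurious_count;     /* plays the role of ppc_spurious_interrupts */
static int last_handled = -1;            /* records the last dispatched IRQ */

/* Example handler used to show a successful dispatch. */
static void note_irq(int irq)
{
    last_handled = irq;
}

/* Assumed layout: interrupt code in the top 6 bits; 0 means nothing pending. */
static int model_get_irq(void)
{
    int irq = (int)(fake_sivec >> 26);
    return irq == 0 ? -1 : irq;
}

static void model_do_irq(void)
{
    int irq = model_get_irq();

    if (irq < 0) {                  /* case 2: SIVEC read back as zero */
        spurious_count++;
        return;
    }
    if (handlers[irq] == NULL) {    /* case 1: no handler installed */
        spurious_count++;
        return;
    }
    handlers[irq](irq);
}
```

Setting fake_sivec to 0 and calling model_do_irq() bumps the counter the same way a zero SIVEC read would; installing note_irq for a vector makes that vector dispatch instead of counting as BAD.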

The questions I have are:

1) Has anybody seen this behavior on their PowerPC platform?
2) Does anybody know why the SIVEC would be showing a zero?

TIA,
Jeff Angielski
The PTR Group

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

* RE: ppc826x BAD interrupts
@ 2004-01-16 16:29 Muhammad Sarwar
  2004-01-16 18:47 ` Jeff Angielski
  2004-01-17  3:22 ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 6+ messages in thread
From: Muhammad Sarwar @ 2004-01-16 16:29 UTC (permalink / raw)
  To: Jeff Angielski, linuxppc-dev

This problem has been discussed on the mailing list before. You can eliminate it by inserting a sync instruction at a certain place in the 8260 interrupt handling code. See, for example, http://www.geocrawler.com/archives/3/8358/2002/11/100/10173445/

Add a __asm__ volatile("sync"); at the end of the m8260_mask_and_ack function in arch/ppc/kernel/ppc8260_pic.c to fix it.
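
As a rough illustration of where the barrier lands, here is a simplified host-side model. It is not the actual ppc8260_pic.c code: model_simr, model_sipnr, cached_mask, full_barrier() and model_mask_and_ack() are hypothetical names, the registers are ordinary variables, and on non-PowerPC hosts the sketch falls back to the GCC/Clang builtin __sync_synchronize() so that it still builds.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the memory-mapped SIU mask and pending registers. */
static volatile uint32_t model_simr = 0xffffffffu;
static volatile uint32_t model_sipnr;
static uint32_t cached_mask = 0xffffffffu;

static inline void full_barrier(void)
{
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ volatile("sync" : : : "memory");
#else
    __sync_synchronize();   /* host fallback so the sketch builds anywhere */
#endif
}

/* bit uses PowerPC numbering: bit 0 is the most significant bit. */
static void model_mask_and_ack(unsigned int bit)
{
    uint32_t m = UINT32_C(1) << (31 - bit);

    cached_mask &= ~m;
    model_simr = cached_mask;   /* mask the interrupt source */
    model_sipnr = m;            /* write 1 to acknowledge the pending bit */
    full_barrier();             /* suggested fix: wait for both stores before returning */
}
```

The only change the suggestion makes is the final full_barrier() call; everything before it is the usual mask-then-acknowledge sequence.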

Regards,

Muhammad Sarwar
Mangrove Systems Inc.


* Re: ppc826x BAD interrupts
  2004-01-16 16:02 ppc826x BAD interrupts Jeff Angielski
@ 2004-01-16 16:35 ` Rob Baxter
  2004-01-16 20:18   ` Randy Vinson
  0 siblings, 1 reply; 6+ messages in thread
From: Rob Baxter @ 2004-01-16 16:35 UTC (permalink / raw)
  To: Jeff Angielski; +Cc: linuxppc-dev


Hi Jeff,

Yes, I have seen this on a PowerPC platform.  What I have noticed is
that the faster the processor (i.e., its internal core frequency), the more
likely this is to happen.  However, it is highly dependent upon the
platform (e.g., interrupting devices, interrupt controller).

A good example is an interrupt request from a PCI bus device.  Many device
driver interrupt handlers will clear the source of the interrupt by either
reading or writing some register within the device, perform some necessary
actions, and return from the handler.  If the PCI device is slow to negate its
interrupt request, the interrupt controller sees the request from the
device again after the handler returns.  On the platforms that I work with,
I've seen BAD interrupts happen more frequently as processor internal core
frequencies increase, especially with the 7457.

--
Rob Baxter


* RE: ppc826x BAD interrupts
  2004-01-16 16:29 Muhammad Sarwar
@ 2004-01-16 18:47 ` Jeff Angielski
  2004-01-17  3:22 ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff Angielski @ 2004-01-16 18:47 UTC (permalink / raw)
  To: Muhammad Sarwar; +Cc: linuxppc-dev


Nice catch.  Adding the sync does, in fact, fix the problem.

I guess this means I need to go back to Google-school since I scoured
the net looking for this very information...  ;)

Jeff Angielski
The PTR Group



* Re: ppc826x BAD interrupts
  2004-01-16 16:35 ` Rob Baxter
@ 2004-01-16 20:18   ` Randy Vinson
  0 siblings, 0 replies; 6+ messages in thread
From: Randy Vinson @ 2004-01-16 20:18 UTC (permalink / raw)
  To: Jeff Angielski, linuxppc-dev

Jeff,
   Like Rob, I've seen the same behavior, especially if the interrupting
device is located behind a PCI to PCI bridge. However, I think the true
cause is actually propagation delays through the bridge(s). I've seen a
recommendation that interrupt service routines should always perform a
read as the last transaction with the device to ensure that the
interrupt clearing write has been flushed through the bridge and has
reached the device. I think this comes from the PCI spec or the PCI
bridge spec, but I'm not certain. Most of the interrupt service routines
that I've seen have some sort of loop that reads the interrupt status
register from the device, services the events indicated and then loops
around to check for additional events that have occurred during the
interrupt service routine. This approach satisfies the recommendation,
because the read that shows no outstanding events will flush the bridge(s).
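
The loop-and-read-back pattern described above can be sketched against a toy device model. Everything here is an assumption made for illustration: struct model_dev, dev_write_clear(), dev_read_events(), dev_irq_asserted() and model_isr() are hypothetical, and the "bridge" is reduced to a single buffered clear-write that a later read pushes through.

```c
#include <stdbool.h>
#include <stdint.h>

struct model_dev {
    uint16_t events;        /* event bits latched in the device */
    uint16_t posted_clear;  /* clear-write still sitting in the bridge */
    bool     has_posted;
};

/* A write is posted: it is buffered and has not reached the device yet. */
static void dev_write_clear(struct model_dev *d, uint16_t bits)
{
    d->posted_clear |= bits;
    d->has_posted = true;
}

/* A read pushes earlier posted writes through before it returns data. */
static uint16_t dev_read_events(struct model_dev *d)
{
    if (d->has_posted) {
        d->events &= (uint16_t)~d->posted_clear;
        d->posted_clear = 0;
        d->has_posted = false;
    }
    return d->events;
}

/* The interrupt line stays asserted while any event bit is still latched. */
static bool dev_irq_asserted(const struct model_dev *d)
{
    return d->events != 0;
}

/* Service loop: the read that finally returns 0 also flushes the last clear. */
static unsigned model_isr(struct model_dev *d)
{
    unsigned serviced = 0;
    uint16_t ev;

    while ((ev = dev_read_events(d)) != 0) {
        dev_write_clear(d, ev);   /* write-1-to-clear, posted */
        serviced++;               /* real code would handle each event here */
    }
    return serviced;
}
```

In this model a lone dev_write_clear() leaves dev_irq_asserted() true, which is the spurious-interrupt window; the loop in model_isr() always ends on a read, so the line is down by the time it returns.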

   The addition of a "sync" instruction in the interrupt controller
routine is simply adding a delay. This was noted in the post referenced
by Muhammad Sarwar
(http://www.geocrawler.com/archives/3/8358/2002/11/100/10173445/).

  I took a look at the interrupt service routine (fcc_enet_interrupt in
arch/ppc/8260_io/fcc_enet.c) and it does not perform a read-back before
exiting. It would be interesting to add a readback operation in
the interrupt routine and see if that also fixes the problem. If it
works, I think it would be better than adding a sync delay to every
interrupt.

   A temporary hack would be something like:

	.
	.
	.
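	/* Get the interrupt events that caused us to be here.
	 */
	int_events = cep->fccp->fcc_fcce;
	cep->fccp->fcc_fcce = int_events;
	{
		/* Read the event register back so that the clearing
		 * write above has reached the FCC before we go on.
		 */
		ushort temp;
		temp = cep->fccp->fcc_fcce;
	}
	must_restart = 0;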
	.
	.
	.

			Randy Vinson
			MontaVist Software


* RE: ppc826x BAD interrupts
  2004-01-16 16:29 Muhammad Sarwar
  2004-01-16 18:47 ` Jeff Angielski
@ 2004-01-17  3:22 ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 6+ messages in thread
From: Benjamin Herrenschmidt @ 2004-01-17  3:22 UTC (permalink / raw)
  To: Muhammad Sarwar; +Cc: Jeff Angielski, linuxppc-dev list


On Sat, 2004-01-17 at 03:29, Muhammad Sarwar wrote:
> This problem was discussed on mailing list before also and you can eliminate this problem by inserting a sync instruction at a certain place in the 8260 interrupt handling code. See, for example, http://www.geocrawler.com/archives/3/8358/2002/11/100/10173445/
>
> Add a __asm__ volatile("sync"); at the end of the m8260_mask_and_ack  function in arch/ppc/kernel/ppc8260_pic.c to fix it.

The code looks like crap... do we have any guarantee that those accesses
are done in order and actually reached the controller?

I'd rather add eieios and read back the value to enforce ordering...
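
One way to read that suggestion, as a hedged host-side sketch only: order the two stores with eieio and finish with a read from the controller. The names model_simr, model_sipnr, io_order() and ordered_mask_and_ack() are hypothetical, the registers are ordinary variables, and on non-PowerPC hosts io_order() degrades to a compiler barrier (GCC/Clang inline asm syntax is assumed).

```c
#include <stdint.h>

/* Hypothetical stand-ins for the memory-mapped SIU mask and pending registers. */
static volatile uint32_t model_simr = 0xffffffffu;
static volatile uint32_t model_sipnr;

static inline void io_order(void)
{
#if defined(__powerpc__) || defined(__PPC__)
    __asm__ volatile("eieio" : : : "memory");
#else
    __asm__ volatile("" : : : "memory");  /* compiler barrier only on other hosts */
#endif
}

/* Returns the mask register as read back after both stores were issued. */
static uint32_t ordered_mask_and_ack(uint32_t mask_value, uint32_t ack_bit)
{
    model_simr = mask_value;   /* mask the source */
    io_order();                /* keep the two stores in program order */
    model_sipnr = ack_bit;     /* acknowledge the pending bit */
    io_order();                /* order the stores ahead of the read-back */
    return model_simr;         /* read-back: cannot complete before the stores land */
}
```

The sketch reads back the mask register rather than the pending register because, on real hardware, a write-1-to-clear register would not return the value just written.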

Ben.

