From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4008471A.5070903@mvista.com>
Date: Fri, 16 Jan 2004 13:18:34 -0700
From: Randy Vinson
MIME-Version: 1.0
To: Jeff Angielski, linuxppc-dev@lists.linuxppc.org
Subject: Re: ppc826x BAD interrupts
References: <1074268973.4323.13.camel@localhost.localdomain> <20040116163545.GA8577@synergymicro.com>
In-Reply-To: <20040116163545.GA8577@synergymicro.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Sender: owner-linuxppc-dev@lists.linuxppc.org
List-Id:

Jeff,

Like Rob, I've seen the same behavior, especially if the interrupting
device is located behind a PCI to PCI bridge. However, I think the true
cause is actually propagation delays through the bridge(s). I've seen a
recommendation that interrupt service routines should always perform a
read as the last transaction with the device to ensure that the
interrupt clearing write has been flushed through the bridge and has
reached the device. I think this comes from the PCI spec or the PCI
bridge spec, but I'm not certain.

Most of the interrupt service routines that I've seen have some sort of
loop that reads the interrupt status register from the device, services
the events indicated and then loops around to check for additional
events that have occurred during the interrupt service routine. This
approach satisfies the recommendation, because the read that shows no
outstanding events will flush the bridge(s).

The addition of a "sync" instruction in the interrupt controller routine
is simply adding a delay. This was noted in the post referenced by
Muhammad Sarwar
(http://www.geocrawler.com/archives/3/8358/2002/11/100/10173445/).

I took a look at the interrupt service routine (fcc_enet_interrupt in
arch/ppc/8260_io/fcc_enet.c) and it does not perform a read-back before
exiting. It would be interesting to add a read-back operation in the
interrupt routine and see if that also fixes the problem.
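
To make the read-back reasoning concrete, here is a small user-space
model, not kernel code. It assumes a bridge that buffers (posts) writes
and delivers them before any later read is allowed to complete, and a
device whose event register is write-1-to-clear. The names model_dev,
model_bridge and all helper functions are hypothetical and exist only
for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define POSTED_MAX 4

/* Hypothetical device: asserts its interrupt while any event bit is set. */
struct model_dev {
    uint16_t events;
};

/* Hypothetical bridge: accepts writes immediately, delivers them later. */
struct model_bridge {
    struct model_dev *dev;
    uint16_t posted[POSTED_MAX]; /* writes buffered but not yet delivered */
    int n_posted;
};

static bool dev_irq_asserted(const struct model_dev *d)
{
    return d->events != 0;
}

/* The CPU sees the write complete as soon as the bridge has buffered it. */
static void bridge_write_events(struct model_bridge *b, uint16_t val)
{
    assert(b->n_posted < POSTED_MAX);
    b->posted[b->n_posted++] = val;
}

/* Deliver buffered writes in order; the register is write-1-to-clear. */
static void bridge_flush(struct model_bridge *b)
{
    for (int i = 0; i < b->n_posted; i++)
        b->dev->events &= (uint16_t)~b->posted[i];
    b->n_posted = 0;
}

/* Assumption of this model: a read forces earlier posted writes out first. */
static uint16_t bridge_read_events(struct model_bridge *b)
{
    bridge_flush(b);
    return b->dev->events;
}

/* Acknowledge and return. Result: is the IRQ still asserted on exit? */
static bool isr_ack_only(struct model_bridge *b)
{
    uint16_t seen = bridge_read_events(b);
    bridge_write_events(b, seen);      /* may still be sitting in the bridge */
    return dev_irq_asserted(b->dev);
}

/* Acknowledge, then read back so the write reaches the device before exit. */
static bool isr_ack_then_readback(struct model_bridge *b)
{
    uint16_t seen = bridge_read_events(b);
    bridge_write_events(b, seen);
    (void)bridge_read_events(b);       /* read-back flushes the posted write */
    return dev_irq_asserted(b->dev);
}
```

In this model the first handler returns while the device is still
asserting its interrupt, which is the situation that would show up as a
spurious or BAD interrupt, while the second returns only after the
acknowledging write has landed.
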
If it works, I think it would be better than adding a sync delay to
every interrupt. A temporary hack would be something like:

	.
	.
	.
	/* Get the interrupt events that caused us to be here. */
	int_events = cep->fccp->fcc_fcce;
	cep->fccp->fcc_fcce = int_events;
	{
		/* Read the register back so the clearing write is flushed. */
		ushort temp;
		temp = cep->fccp->fcc_fcce;
	}
	must_restart = 0;
	.
	.
	.

Randy Vinson
MontaVista Software

Rob Baxter wrote:
> Hi Jeff,
>
> Yes, I have seen this on a PowerPC platform. And what I have noticed is
> that faster (i.e., internal core frequency) the processor the more
> likelihood of this happening. However, it is highly dependent upon the
> platform (e.g., interrupting devices, interrupt controller).
>
> A good example is an interrupt request from a PCI bus device. Many device
> driver interrupt handlers will clear the source of the interrupt by either
> reading or writing some register within the device, perform some necessary
> actions, and return from the handler. The PCI device is slow to negate its
> interrupt request and the interrupt controller sees the interrupt request
> from the device again. With the platforms that I'm associated with I've
> seen this happen more frequently (i.e., BAD interrupts) as processor
> internal core frequencies increase, especially with the 7457.
>
> --
> Rob Baxter
>
>
>
> Muhammad Sarwar wrote:
> This problem was discussed on mailing list before also and you can
> eliminate this problem by inserting a sync instruction at a certain
> place in the 8260 interrupt handling code. See, for example,
> http://www.geocrawler.com/archives/3/8358/2002/11/100/10173445/
>
> Add a __asm__ volatile("sync"); at the end of the m8260_mask_and_ack
> function in arch/ppc/kernel/ppc8260_pic.c to fix it.
>
>
> Regards,
>
> Muhammad Sarwar
> Mangrove Systems Inc.

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/