linuxppc-dev.lists.ozlabs.org archive mirror
* Disabling interrupts on a SMP system
@ 2004-10-28 21:45 Arrigo Benedetti
  2004-10-28 23:39 ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 12+ messages in thread
From: Arrigo Benedetti @ 2004-10-28 21:45 UTC (permalink / raw)
  To: linuxppc-dev

Dear all,

How can I (temporarily) disable all or some specific interrupts on a
specific CPU in an SMP system from user space code? In my case this is
an Apple dual G5 system.

thanks
-Arrigo


* Re: Disabling interrupts on a SMP system
  2004-10-28 21:45 Disabling interrupts on a SMP system Arrigo Benedetti
@ 2004-10-28 23:39 ` Benjamin Herrenschmidt
  2004-10-28 23:58   ` Arrigo Benedetti
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2004-10-28 23:39 UTC (permalink / raw)
  To: Arrigo Benedetti; +Cc: linuxppc-dev list

On Thu, 2004-10-28 at 14:45 -0700, Arrigo Benedetti wrote:
> Dear all,
> 
> how can I (temporarily) disable all or some specific interrupts on a 
> specific CPU in an SMP system
> from user space code? In my case this is an Apple dual G5 system.

You can't ... why do you want to do that ?

Ben.


* Re: Disabling interrupts on a SMP system
  2004-10-28 23:39 ` Benjamin Herrenschmidt
@ 2004-10-28 23:58   ` Arrigo Benedetti
  2004-10-29  0:51     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 12+ messages in thread
From: Arrigo Benedetti @ 2004-10-28 23:58 UTC (permalink / raw)
  To: linuxppc-dev list

Benjamin Herrenschmidt wrote:

>On Thu, 2004-10-28 at 14:45 -0700, Arrigo Benedetti wrote:
>
>>Dear all,
>>
>>how can I (temporarily) disable all or some specific interrupts on a 
>>specific CPU in an SMP system
>>from user space code? In my case this is an Apple dual G5 system.
>>
>
>You can't ... why do you want to do that ?
>
>

To achieve real-time performance in a very critical section of code.
Even after moving all the interrupts to CPU0, there are still two
interrupts running on CPU1 that are disturbing the execution of the
time-critical code:

           CPU0       CPU1      
  0:      45127          0   OpenPIC   Level     libata
 25:        225          0   OpenPIC   Level     VIA-PMU
 26:          0          0   OpenPIC   Level     keywest i2c
 27:          0          0   OpenPIC   Level     ohci_hcd
 28:          0          0   OpenPIC   Level     ohci_hcd
 39:     189380          0   OpenPIC   Level     ide0
 40:        304          0   OpenPIC   Level     ohci1394
 41:    1288195          0   OpenPIC   Level     eth0
 47:          0          0   OpenPIC   Level     GPIO1/ADB
 55:          0          0   OpenPIC   Edge      NMI - XMON
 56:          1          0   OpenPIC   Edge      U3->K2 Cascade
 63:      15212          0   OpenPIC   Level     ehci_hcd, ohci_hcd, ohci_hcd
118:         15      21134   OpenPIC   Level     IPI0 (call function)
119:        888        904   OpenPIC   Level     IPI1 (reschedule)
120:          0          0   OpenPIC   Edge      IPI2 (invalidate tlb)
121:          0          0   OpenPIC   Edge      IPI3 (xmon break)
128:          0          0   OpenPIC2  Level     keywest i2c
IPI (recv/sent):      22941/22941
BAD:          1


I agree that this is not an elegant solution, but I would like to give 
it a try anyway...

Thanks

-Arrigo
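
A listing like the one above can be checked mechanically for interrupts
that still reach CPU1. The following is a minimal sketch, assuming the
two-CPU column layout shown in this message; the helper names
parse_irq_row() and hits_cpu1() are hypothetical and not part of any
kernel or libc interface.

```c
#include <stdio.h>

/*
 * Parse one row of a two-CPU /proc/interrupts listing such as
 *   "118:         15      21134   OpenPIC   Level     IPI0 (call function)"
 * and store the IRQ number and the two per-CPU counts.
 * Returns 1 on success, 0 if the row is not a numbered two-CPU IRQ row
 * (for example the header line, "IPI (recv/sent):" or "BAD:").
 */
static int parse_irq_row(const char *row, int *irq,
                         unsigned long *cpu0, unsigned long *cpu1)
{
    return sscanf(row, " %d: %lu %lu", irq, cpu0, cpu1) == 3;
}

/* Returns 1 if the row is an IRQ row with a non-zero CPU1 count. */
static int hits_cpu1(const char *row)
{
    int irq;
    unsigned long cpu0, cpu1;

    return parse_irq_row(row, &irq, &cpu0, &cpu1) && cpu1 > 0;
}
```

Fed the rows above line by line, hits_cpu1() would flag only vectors
118 and 119, the two IPIs.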


* Re: Disabling interrupts on a SMP system
  2004-10-28 23:58   ` Arrigo Benedetti
@ 2004-10-29  0:51     ` Benjamin Herrenschmidt
  2004-10-29 10:10       ` Gabriel Paubert
  2004-10-29 17:32       ` Arrigo Benedetti
  0 siblings, 2 replies; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2004-10-29  0:51 UTC (permalink / raw)
  To: Arrigo Benedetti; +Cc: linuxppc-dev list

On Thu, 2004-10-28 at 16:58 -0700, Arrigo Benedetti wrote:

> To achieve real-time performance in a very critical section of code. 
> Even after moving all the
> interrupts to CPU0, there are still two interrupts running on CPU1 that 
> are disturbing the
> execution of the time-critical code:

> 118:         15      21134   OpenPIC   Level     IPI0 (call function)
> 119:        888        904   OpenPIC   Level     IPI1 (reschedule)

Those are normal, they are cross-CPU interrupts used internally by the
kernel. Not visible in that list are the timer interrupts on both CPUs.
You just can't do anything about these.

Ben.


* Re: Disabling interrupts on a SMP system
  2004-10-29  0:51     ` Benjamin Herrenschmidt
@ 2004-10-29 10:10       ` Gabriel Paubert
  2004-10-29 23:00         ` Benjamin Herrenschmidt
  2004-10-29 17:32       ` Arrigo Benedetti
  1 sibling, 1 reply; 12+ messages in thread
From: Gabriel Paubert @ 2004-10-29 10:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Arrigo Benedetti, linuxppc-dev list

On Fri, Oct 29, 2004 at 10:51:30AM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2004-10-28 at 16:58 -0700, Arrigo Benedetti wrote:
> 
> > To achieve real-time performance in a very critical section of code. 
> > Even after moving all the
> > interrupts to CPU0, there are still two interrupts running on CPU1 that 
> > are disturbing the
> > execution of the time-critical code:
> 
> > 118:         15      21134   OpenPIC   Level     IPI0 (call function)
> > 119:        888        904   OpenPIC   Level     IPI1 (reschedule)
> 
> Those are normal, they are cross-CPU interrupts used internally by the
> kernel. There are also non-visible in that list the timer interrupts on
> both CPUs. You just can't do anything against these.

I always wondered why the decrementer interrupts are not listed,
actually. Perhaps even with a count of the decrementer interrupts
which result in multiple updates of jiffies, because they indicate
that something has a very high latency.
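
The counter suggested here could look roughly like the sketch below.
It is only an illustration under stated assumptions: struct tick_stats,
account_ticks() and the tb_per_jiffy parameter are hypothetical names,
and this is not the kernel's actual timer interrupt code.

```c
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping for the tick interrupt. */
struct tick_stats {
    uint64_t last_tb;          /* timebase value at the last accounted tick */
    unsigned long interrupts;  /* decrementer interrupts seen */
    unsigned long late;        /* interrupts that accounted more than one tick */
};

/*
 * Account the ticks that elapsed since the previous interrupt.
 * Returns the number of jiffies to add. A value above 1 means the
 * interrupt was delayed by at least one full tick period, which is
 * the high-latency case the extra counter would expose.
 */
static unsigned long account_ticks(struct tick_stats *s, uint64_t now_tb,
                                   uint32_t tb_per_jiffy)
{
    unsigned long n = (unsigned long)((now_tb - s->last_tb) / tb_per_jiffy);

    s->interrupts++;
    if (n > 1)
        s->late++;
    /* Advance by whole ticks only, so the remainder is not lost. */
    s->last_tb += (uint64_t)n * tb_per_jiffy;
    return n;
}
```

Comparing the late counter with the interrupts counter would show how
often a decrementer interrupt was held off for more than one tick.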

BTW, on my Pismo, the number of bad interrupts is amazing:


           CPU0       
  9:          0   OpenPIC   Edge      Built-in Sound out
 10:          0   OpenPIC   Edge      Built-in Sound in
 19:     616569   OpenPIC   Level     ide0
 24:         23   OpenPIC   Level     Built-in Sound misc
 25:   12784655   OpenPIC   Level     VIA-PMU
 26:          2   OpenPIC   Level     keywest i2c
 27:          0   OpenPIC   Level     ohci_hcd
 28:          0   OpenPIC   Level     ohci_hcd
 40:          3   OpenPIC   Level     ohci1394
 41:    1334956   OpenPIC   Level     eth0
 42:          4   OpenPIC   Level     keywest i2c
 47:     503221   OpenPIC   Level     GPIO1/ADB
BAD:   21458276

in about one week uptime, but over half the time sleeping.

I have a fix for that, but it's not yet ready for submission. 
I might find time over the week-end.

	Regards,
	Gabriel


* Re: Disabling interrupts on a SMP system
  2004-10-29  0:51     ` Benjamin Herrenschmidt
  2004-10-29 10:10       ` Gabriel Paubert
@ 2004-10-29 17:32       ` Arrigo Benedetti
  2004-10-29 23:11         ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 12+ messages in thread
From: Arrigo Benedetti @ 2004-10-29 17:32 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev list

Benjamin Herrenschmidt wrote:

>On Thu, 2004-10-28 at 16:58 -0700, Arrigo Benedetti wrote:
>
>  
>
>>To achieve real-time performance in a very critical section of code. 
>>Even after moving all the
>>interrupts to CPU0, there are still two interrupts running on CPU1 that 
>>are disturbing the
>>execution of the time-critical code:
>>    
>>
>
>  
>
>>118:         15      21134   OpenPIC   Level     IPI0 (call function)
>>119:        888        904   OpenPIC   Level     IPI1 (reschedule)
>>    
>>
>
>Those are normal, they are cross-CPU interrupts used internally by the
>kernel. There are also non-visible in that list the timer interrupts on
>both CPUs. You just can't do anything against these.
>
>  
>

Do these interrupts have anything to do with the load balancer? I have
disabled the load balancer code in linux/sched.c (just commented out all
the code in load_balance()). Maybe the only solution is to write a
kernel module?

-Arrigo


* Re: Disabling interrupts on a SMP system
  2004-10-29 10:10       ` Gabriel Paubert
@ 2004-10-29 23:00         ` Benjamin Herrenschmidt
  2004-11-03 12:30           ` Gabriel Paubert
  0 siblings, 1 reply; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2004-10-29 23:00 UTC (permalink / raw)
  To: Gabriel Paubert; +Cc: Arrigo Benedetti, linuxppc-dev list


> I always wondered why the decrementer interrupts are not listed,
> actually. Perhaps even with a count of the decrementer interrupts
> which result in multiple updates of jiffies, because they indicate
> that something has a very high latency.
> 
> BTW, on my Pismo, the number of bad interrupts is amazing:
>
> .../...
>
> BAD:   21458276
> 
> in about one week uptime, but over half the time sleeping.
> 
> I have a fix for that, but it's not yet ready for submission. 
> I might find time over the week-end.

Ah ok, what is it ? Those seem to be "short" interrupts, they don't
happen on my tipb but they do happen on paul's older one (same mobo as
Pismo).

Looks like between clearing the irq source and exiting the handler, the
IRQ line stays asserted a bit longer or so ...

BTW, We should remove the cruft of early/late eoi too while we are at
it. A single "late" EOI is all we need. The MPIC will latch an edge irq
coming in between the ACK and the EOI and we don't want the CPU priority
to drop too early.

Ben.
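
The effect of a single late EOI can be illustrated with a toy model.
This is a sketch only: struct toy_pic and its functions are invented
for illustration and are not the MPIC register interface. It assumes
that ACK raises the CPU priority to that of the acknowledged source,
that EOI restores the previous priority, and that a request is only
delivered if its priority is strictly above the current CPU priority.

```c
#include <stdbool.h>

#define NUM_SOURCES 8

/* Toy interrupt controller, NOT the real MPIC programming model. */
struct toy_pic {
    int  src_prio[NUM_SOURCES];   /* priority per source (0 = never delivered) */
    bool pending[NUM_SOURCES];    /* latched requests */
    int  cpu_prio;                /* current CPU priority, raised by ACK */
    int  saved_prio[NUM_SOURCES]; /* priorities to restore on EOI */
    int  depth;                   /* nesting depth of un-EOI'd interrupts */
};

/* A device raises its line; the request is latched until acknowledged. */
static void pic_raise(struct toy_pic *p, int src)
{
    p->pending[src] = true;
}

/* ACK: return the highest-priority deliverable source, or -1 if none. */
static int pic_ack(struct toy_pic *p)
{
    int best = -1;

    for (int i = 0; i < NUM_SOURCES; i++)
        if (p->pending[i] && p->src_prio[i] > p->cpu_prio &&
            (best < 0 || p->src_prio[i] > p->src_prio[best]))
            best = i;
    if (best < 0)
        return -1;
    p->pending[best] = false;
    p->saved_prio[p->depth++] = p->cpu_prio;
    p->cpu_prio = p->src_prio[best];
    return best;
}

/* EOI: drop the CPU priority back to its value before the matching ACK. */
static void pic_eoi(struct toy_pic *p)
{
    if (p->depth > 0)
        p->cpu_prio = p->saved_prio[--p->depth];
}
```

In this model a second interrupt at the same priority that arrives
between ACK and EOI stays latched and is only delivered after the EOI,
while a higher-priority source (an IPI, say) can still nest.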
 


* Re: Disabling interrupts on a SMP system
  2004-10-29 17:32       ` Arrigo Benedetti
@ 2004-10-29 23:11         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2004-10-29 23:11 UTC (permalink / raw)
  To: Arrigo Benedetti; +Cc: linuxppc-dev list

On Fri, 2004-10-29 at 10:32 -0700, Arrigo Benedetti wrote:

> Have these interrupts anything to do with the load balancer? I have 
> disabled the load balancer code
> in linux/sched.c (just commented out all the code in load_balance()).
> Maybe the only solution is to write a kernel module?

Yes, a kernel module is probably your best solution here

Ben.


* Re: Disabling interrupts on a SMP system
  2004-10-29 23:00         ` Benjamin Herrenschmidt
@ 2004-11-03 12:30           ` Gabriel Paubert
  2004-11-03 22:11             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 12+ messages in thread
From: Gabriel Paubert @ 2004-11-03 12:30 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Arrigo Benedetti, linuxppc-dev list

On Sat, Oct 30, 2004 at 09:00:06AM +1000, Benjamin Herrenschmidt wrote:
> 
> > I always wondered why the decrementer interrupts are not listed,
> > actually. Perhaps even with a count of the decrementer interrupts
> > which result in multiple updates of jiffies, because they indicate
> > that something has a very high latency.
> > 
> > BTW, on my Pismo, the number of bad interrupts is amazing:
> >
> > .../...
> >
> > BAD:   21458276
> > 
> > in about one week uptime, but over half the time sleeping.
> > 
> > I have a fix for that, but it's not yet ready for submission. 
> > I might find time over the week-end.
> 
> Ah ok, what is it ? Those seem to be "short" interrupts, they don't
> happen on my tipb but they do happen on paul's older one (same mobo as
> Pismo).

Well, actually I no longer have a fix, sorry for that. I believed so but I
was mistaken, unless you consider correct a fix which would say that
all interrupts have the SA_INTERRUPT flag set.

> 
> Looks like between clearing the irq source and exiting the handler, the
> IRQ line stays asserted a bit longer or so ...

Not quite, what happens is that we have "shadow interrupts" from the
OpenPIC. The sequence should be:
1) read the interrupt vector
2) the interrupt request is released at the hardware interrupt
   pin of the processor
3) we can now enable interrupts if we want...

Actually 2) is a bit slow, I suspect that the signal from the chipset
is an open-drain with a passive pull-up (there might even be a bit
in the chipset to control whether the deactivation of the interrupt
output pin is active or not, I've seen it in other cases) and this
results in a spurious interrupt taken at step 3.

I have also seen a few spurious interrupts of the type you
suspect, from the serial and adb drivers, but they were
very few in comparison (0.01% or so, let's tackle the bulk
of them first).

> 
> BTW, We should remove the cruft of early/late eoi too while we are at
> it. A single "late" EOI is all we need. The MPIC will latch an edge irq
> coming in between the ACK and the EOI and we don't want the CPU priority
> to drop too early.

Ok, this changes everything, this means that since all hardware interrupts 
are set at the same priority, they are effectively serialized in hardware, 
so why reenable interrupts in the case the SA_INTERRUPT flag is not set?

I'm speaking only for UP, I don't have any SMP machine (and my laptop
which shows the problem obviously is not) and I don't know if priorities
are used for IPI. I believe that OpenPIC timers are never used, but I
might be wrong...

Actually I never understood very well the goal of the SA_INTERRUPT flag,
since on shared interrupts it will depend on whoever is first on the
list, which is more or less equivalent to determining it from the phase
of the moon. The only way it could make sense would be to insert 
SA_INTERRUPT handlers at the head of the queue, and non SA_INTERRUPT 
ones at the tail, dividing the handlers into 2 categories (or alternatively 
to have two handler lists per vector). But most of these issues do not 
affect PPC machines to my knowledge. 

In short I believe that this is a historical artifact from when 
interrupts could not be shared (edge-triggered only on ISA bus);  
at that time it did make sense to perform this kind of distinction 
between slow and fast interrupts.

I have appended the patch that I'm currently running. It shows the
behaviour and adds one timebase tick to the delay of a wait loop every
time the spurious interrupt is taken on re-enabling interrupts in the
interrupt dispatcher. On my G3/400, the delay converges to 4 ticks
rapidly during boot and increases to 5 when I start the modem, that's
about 200ns. The patch is horrible, unsafe and disgusting, but still
usable as a tool to locate the source of spurious interrupts.
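
The convergence described above can be mimicked with a small
user-space simulation. This is a simplified sketch, not the patch
itself: release_latency stands for the unknown number of timebase ticks
the interrupt pin needs to deassert, wait_ticks plays the role of
hardirq_shadow_ticks, and the rate limiting done in the patch (at most
one increment per second, and only after more than two spurious
interrupts) is left out.

```c
/*
 * Returns 1 if re-enabling interrupts after waiting `wait_ticks`
 * would still see the pin asserted, i.e. a shadow interrupt is taken.
 */
static int shadow_taken(int wait_ticks, int release_latency)
{
    return wait_ticks < release_latency;
}

/*
 * Simulate `rounds` interrupts, lengthening the wait by one tick each
 * time a shadow interrupt is observed. Returns the final wait and
 * stores the number of shadow interrupts seen in *shadows.
 */
static int converge_wait(int release_latency, int rounds, int *shadows)
{
    int wait_ticks = 0;

    *shadows = 0;
    for (int i = 0; i < rounds; i++) {
        if (shadow_taken(wait_ticks, release_latency)) {
            (*shadows)++;
            wait_ticks++;
        }
    }
    return wait_ticks;
}
```

With enough rounds the wait settles at the release latency and no
further shadow interrupts occur, which matches the observation that
the delay stops growing once it is long enough.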

	Regards,
	Gabriel

===== arch/ppc/kernel/irq.c 1.44 vs edited =====
--- 1.44/arch/ppc/kernel/irq.c	2004-08-31 17:27:26 +02:00
+++ edited/arch/ppc/kernel/irq.c	2004-10-16 20:07:19 +02:00
@@ -411,18 +411,35 @@
 	return 0;
 }
 
+
+static int hardirq_shadow_ticks=0;
+extern unsigned long __hardirq_shadow_nip;
 static inline void
 handle_irq_event(int irq, struct pt_regs *regs, struct irqaction *action)
 {
 	int status = 0;
 	int ret;
 
-	if (!(action->flags & SA_INTERRUPT))
+	if (!(action->flags & SA_INTERRUPT)) {
+		int start, now;
+		asm volatile(	"	isync\n"
+				"	mftb %0\n"
+				"1:	mftb %1\n"
+				"	sub %1,%1,%0\n"
+				"	cmplw %1,%2\n"
+				"	blt 1b\n"
+				"	isync\n"
+				: "=&r" (start), "=&r" (now)
+				: "r" (hardirq_shadow_ticks)
+				: "cr0");
 		local_irq_enable();
+		/* FIXME: happens to work */
+		asm volatile("\n__hardirq_shadow_nip: isync\n");
+	}
 
 	do {
 		ret = action->handler(irq, action->dev_id, regs);
-		if (ret == IRQ_HANDLED)
+		if (likely(ret == IRQ_HANDLED))
 			status |= action->flags;
 		action = action->next;
 	} while (action);
@@ -523,6 +540,9 @@
 void do_IRQ(struct pt_regs *regs)
 {
 	int irq, first = 1;
+	static int prev_irq=-3, prev_prev_irq=-4;
+	static unsigned spurious_jiffies;
+	static int prev_spurious;
         irq_enter();
 
 	/*
@@ -536,10 +556,28 @@
 	while ((irq = ppc_md.get_irq(regs)) >= 0) {
 		ppc_irq_dispatch_handler(regs, irq);
 		first = 0;
+		prev_prev_irq = prev_irq;
+		prev_irq = irq;
 	}
-	if (irq != -2 && first)
+	if (irq != -2 && first) {
 		/* That's not SMP safe ... but who cares ? */
 		ppc_spurious_interrupts++;
+		if (jiffies-spurious_jiffies>HZ) {
+		  	if ((ppc_spurious_interrupts-prev_spurious) > 2 &&
+			    (regs->nip - (unsigned long)&__hardirq_shadow_nip)
+			    <=4) {
+				printk("Interrupt hardirq_shadow_ticks " 
+				       "set to %d.\n",
+				       ++hardirq_shadow_ticks);
+			}
+			prev_spurious = ppc_spurious_interrupts;
+			spurious_jiffies=jiffies;
+			printk(KERN_NOTICE
+			       "Spurious interrupt, last vectors %d:%d, "
+			       "NIP=0x%lx\n",
+			       prev_prev_irq, prev_irq, regs->nip);
+		}
+	}
         irq_exit();
 }
 


* Re: Disabling interrupts on a SMP system
  2004-11-03 12:30           ` Gabriel Paubert
@ 2004-11-03 22:11             ` Benjamin Herrenschmidt
  2004-11-04 12:57               ` Gabriel Paubert
  2004-11-15 11:55               ` Gabriel Paubert
  0 siblings, 2 replies; 12+ messages in thread
From: Benjamin Herrenschmidt @ 2004-11-03 22:11 UTC (permalink / raw)
  To: Gabriel Paubert; +Cc: Arrigo Benedetti, linuxppc-dev list

On Wed, 2004-11-03 at 13:30 +0100, Gabriel Paubert wrote:

> Well, actually I no longer have a fix, sorry for that. I believed so but I
> was mistaken, unless you consider correct a fix which would say that 
> all interrupts have the SA_INTERRUPT flag set.

Gack... that would be bad.

> > Looks like between clearing the irq source and exiting the handler, the
> > IRQ line stays asserted a bit longer or so ...
> 
> Not quite, what happens is that we have "shadow interrupts" from the
> OpenPIC. The sequence should be:
> 1) read the interrupt vector
> 2) the interrupt request is released at the hardware interrupt
>    pin of the processor
> 3) we can now enable interrupts if we want...
> 
> Actually 2) is a bit slow, I suspect that the signal from the chipset
> is an open-drain with a passive pull-up (there might even be a bit
> in the chipset to control whether the deactivation of the interrupt
> output pin is active or not, I've seen it in other cases) and this
> results in a spurious interrupt taken at step 3.

Ah... so that would explain why newer machines don't show it ? the
openpic is faster or such ? I'll test your theory by adding a small delay
after reading the ack (just to test)...

> I have also seen a few spurious interrupts of the type you
> suspect, from the serial and adb drivers, but they  were 
> very few in comparison (0.01% or so, let's tackle the bulk
> of them first).

Right, that's what I would expect.
> >
> Ok, this changes everything, this means that since all hardware interrupts 
> are set at the same priority, they are effectively serialized in hardware, 
> so why reenable interrupts in the case the SA_INTERRUPT flag is not set?

Well, we may want to play with priority later, and there are the DEC
interrupts that I want to still take while processing HW ones. But this
serialisation is a "good thing", I think, to avoid possible kernel stack
overflows. We have very few edge interrupts so I suppose that
serialisation is almost what happens today already.

> I'm speaking only for UP, I don't have any SMP machine (and my laptop
> which shows the problem obviously is not) and I don't know if priorities
> are used for IPI. I believe that OpenPIC timers are never used, but I
> might be wrong...

We don't use the timers (and Apple removed them in latest chipsets) but
I think we raise the IPI priority above normal IRQs yes.

> Actually I never understood very well the goal of the SA_INTERRUPT flag,
> since on shared interrupts it will depend on whoever is first on the
> list, which is more or less equivalent to determine it from the phase
> of the moon. The only way it could make sense would be to insert 
> SA_INTERRUPT handlers at the head of the queue, and non SA_INTERRUPT 
> ones at the tail, dividing the handlers into 2 categories (or alternatively 
> to have two handler lists per vector). But most of these issues do not 
> affect PPC machines to my knowledge. 

It's for legacy ISA cruft I'd say :)

> In short I believe that this is a historical artifact from when 
> interrupts could not be shared (edge-triggered only on ISA bus);  
> at that time it did make sense to perform this kind of distinction 
> between slow and fast interrupts.

Yes.

> I have appended the patch that I'm currently running that shows the
> behaviour and adds a timebase tick delay to a wait loop every time
> the spurious interrupt on reenabling interrupt in interrupt dispatcher 
> is taken. On my G3/400, the delay converges to 4 ticks rapidly during
> boot and increases to 5 when I start the modem, that's about 200ns.
> The patch is horrible, unsafe and disgusting, but still usable as a 
> tool to locate the source of spurious interrupts.

Ah cool, a patch...  :)

It's strange tho... the interrupt ACK being a read, I would have
expected it to be rather synchronous with the bus, unless the MPIC
itself completes the read transaction before actually getting rid of the
IRQ signal (or maybe the CPU itself is latching it a bit too long because
of a crappy pull up as you mentioned).


* Re: Disabling interrupts on a SMP system
  2004-11-03 22:11             ` Benjamin Herrenschmidt
@ 2004-11-04 12:57               ` Gabriel Paubert
  2004-11-15 11:55               ` Gabriel Paubert
  1 sibling, 0 replies; 12+ messages in thread
From: Gabriel Paubert @ 2004-11-04 12:57 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Arrigo Benedetti, linuxppc-dev list

On Thu, Nov 04, 2004 at 09:11:41AM +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2004-11-03 at 13:30 +0100, Gabriel Paubert wrote:
> 
> > Well, actually I no longer have a fix, sorry for that. I believed so but I
> > was mistaken, unless you consider correct a fix which would say that 
> > all interrupts have the SA_INTERRUPT flag set.
> 
> Gack... that would be bad.

Well, not that bad if the interrupt handlers are _really_ short and just
postpone the job to a tasklet/bottom-half or however it is called these
days. A series of short interrupts that are serialized and reuse
the same part of the stack, avoiding stack overflows and thrashing
less cache.

Now there are interrupt handlers that can't be as lightweight as I would
like, for example the one I wrote for VME busses. Essentially they are 
another level of dispatch through a cascaded interrupt controller, and 
that's pretty much impossible to avoid.

> 
> > > Looks like between clearing the irq source and exiting the handler, the
> > > IRQ line stays asserted a bit longer or so ...
> > 
> > Not quite, what happens is that we have "shadow interrupts" from the
> > OpenPIC. The sequence should be:
> > 1) read the interrupt vector
> > 2) the interrupt request is released at the hardware interrupt
> >    pin of the processor
> > 3) we can now enable interrupts if we want...
> > 
> > Actually 2) is a bit slow, I suspect that the signal from the chipset
> > is an open-drain with a passive pull-up (there might even be a bit
> > in the chipset to control whether the deactivation of the interrupt
> > output pin is active or not, I've seen it in other cases) and this
> > results in a spurious interrupt taken at step 3.
> 
> Ah... so that would explain why newer machines don't show it ? the
> openpic is faster or such ? I'll test your theory by adding a small delay
> after reading the ack (just to test)...

Actually, as you see on the patch, the delay is only useful
on non SA_INTERRUPT handlers. I don't see them on my PM 466,
but it has a UniNorth 1.5 and it does not really actively use 
the same interrupts.

> 
> > I have also seen a few spurious interrupts of the type you
> > suspect, from the serial and adb drivers, but they  were 
> > very few in comparison (0.01% or so, let's tackle the bulk
> > of them first).
> 
> Right, that's what I would expect.

I forgot the PMU in the list.


> Well, we may want to play with priority later, and there are the DEC
> interrupts that I want to still take while processing HW ones. But this
> serialisation is a "good thing", I think, to avoid possible kernel stack
> overflows. We have very few edge interrupts so I suppose that
> serialisation is almost what happens today already.

Indeed the DEC interrupts are the problem, BTW I have to clean up
some timekeeping patch because there are other problems in this
area right now.

> 
> > I'm speaking only for UP, I don't have any SMP machine (and my laptop
> > which shows the problem obviously is not) and I don't know if priorities
> > are used for IPI. I believe that OpenPIC timers are never used, but I
> > might be wrong...
> 
> We don't use the timers (and Apple removed them in latest chipsets) but
> I think we raise the IPI priority above normal IRQs yes.

This makes sense, besides this some OpenPIC documentation claims that
every IPI (and timer) should have a different priority level.

> 
> > Actually I never understood very well the goal of the SA_INTERRUPT flag,
> > since on shared interrupts it will depend on whoever is first on the
> > list, which is more or less equivalent to determine it from the phase
> > of the moon. The only way it could make sense would be to insert 
> > SA_INTERRUPT handlers at the head of the queue, and non SA_INTERRUPT 
> > ones at the tail, dividing the handlers into 2 categories (or alternatively 
> > to have two handler lists per vector). But most of these issues do not 
> > affect PPC machines to my knowledge. 
> 
> It's for legacy ISA cruft I'd say :)
> 
> > In short I believe that this is a historical artifact from when 
> > interrupts could not be shared (edge-triggered only on ISA bus);  
> > at that time it did make sense to perform this kind of distinction 
> > between slow and fast interrupts.
> 
> Yes.

But with shared interrupts as seen on PCI on i386 (especially 
notebooks where you often see almost all interrupts sharing the
same PIC input), the fact that an interrupt is classified as fast 
or slow depends on who is first in the list of handlers. This 
does not make _any_ sense.
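
The order dependence can be shown with a toy model of a shared handler
chain. This is a sketch under stated assumptions: struct toy_action,
TOY_SA_INTERRUPT and both functions are stand-ins invented for
illustration, handlers are assumed to be appended in registration
order, and the fast/slow decision is taken from the first entry only,
as in the handle_irq_event() code quoted in the patch.

```c
#include <stddef.h>

#define TOY_SA_INTERRUPT 0x1   /* stand-in for the kernel's SA_INTERRUPT */

/* Simplified stand-in for struct irqaction. */
struct toy_action {
    const char *name;
    unsigned flags;
    struct toy_action *next;
};

/* Append a handler at the tail of the shared chain; returns the head. */
static struct toy_action *chain_append(struct toy_action *head,
                                       struct toy_action *a)
{
    struct toy_action **p = &head;

    while (*p)
        p = &(*p)->next;
    a->next = NULL;
    *p = a;
    return head;
}

/* Only the first action's flags decide whether interrupts get re-enabled. */
static int runs_with_irqs_enabled(const struct toy_action *head)
{
    return !(head->flags & TOY_SA_INTERRUPT);
}
```

Registering the same two drivers in the opposite order flips the
result of runs_with_irqs_enabled() for the whole chain, which is the
arbitrariness complained about above.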

> 
> > I have appended the patch that I'm currently running that shows the
> > behaviour and adds a timebase tick delay to a wait loop every time
> > the spurious interrupt on reenabling interrupt in interrupt dispatcher 
> > is taken. On my G3/400, the delay converges to 4 ticks rapidly during
> > boot and increases to 5 when I start the modem, that's about 200ns.
> > The patch is horrible, unsafe and disgusting, but still usable as a 
> > tool to locate the source of spurious interrupts.
> 
> Ah cool, a patch...  :)
> 
> It's strange tho... the interrupt ACK being a read, I would have
> expected it to be rather synchronous with the bus, unless the MPIC
> itself completes the read transaction before actually getting rid of the
> IRQ signal (or maybe the CPU itself is latching it a bit too long because
> of a crappy pull up as you mentioned).

Well, it takes about 2-3 bus cycles for the signal to reach the 
core from the pin due to resynchronization/metastability avoidance 
flip-flops (I read it somewhere in a PPC doc), so it's more or
less guaranteed that:
	
	read the vector (lwz or lwbrx)
	ensure that the read is performed (tw+isync, or sync)
	mtmsr with EE set
	
will result in a "shadow" interrupt, whatever processor in the G3/G4
series you use, but the internal hardware delay is less than a timebase 
tick and I need 5 (that's at least 16 bus clocks) to be safe.
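
The figures quoted in this thread (5 ticks being about 200ns, and at
least 16 bus clocks) are consistent with a timebase that advances once
every four bus clocks on a 100 MHz bus, i.e. a 25 MHz timebase. Both
numbers are assumptions here, not something stated in the thread. A
minimal sketch of the arithmetic, with hypothetical helper names:

```c
#include <stdint.h>

/* Convert timebase ticks to nanoseconds for a timebase running at tb_hz. */
static uint64_t tb_ticks_to_ns(uint64_t ticks, uint64_t tb_hz)
{
    return ticks * 1000000000ULL / tb_hz;
}

/* Bus clocks spanned by `ticks` if one tick lasts `bus_per_tick` bus clocks. */
static uint64_t tb_ticks_to_bus_clocks(uint64_t ticks, unsigned bus_per_tick)
{
    return ticks * bus_per_tick;
}
```

Under those assumptions 5 ticks come to 200 ns. A measured difference
of 5 ticks guarantees that at least 4 full tick periods have elapsed,
which would be the 16 bus clocks mentioned above.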

Now I don't know the exact reason, it may be an open-drain output 
because some version was designed to share the processor interrupt 
request signal with another chip. Maybe there is a bit to control 
this (open-drain or not) but you'd need the docs, or maybe the 
internal logic is slow and takes time to react and remove the 
interrupt request to the processor.  

But all of this is speculation, the fact is that I get these "shadow" 
interrupts and my scary patch proves this.

	Regards,
	Gabriel


* Re: Disabling interrupts on a SMP system
  2004-11-03 22:11             ` Benjamin Herrenschmidt
  2004-11-04 12:57               ` Gabriel Paubert
@ 2004-11-15 11:55               ` Gabriel Paubert
  1 sibling, 0 replies; 12+ messages in thread
From: Gabriel Paubert @ 2004-11-15 11:55 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Arrigo Benedetti, linuxppc-dev list


	Hi Ben,

back to an old subject.


On Thu, Nov 04, 2004 at 09:11:41AM +1100, Benjamin Herrenschmidt wrote:
> Ah... so that would explain why newer machines don't show it ? the
> openpic is faster or such ? I'll test your theory by adding a small delay
> after reading the ack (just to test)...

Well, the new irq consolidation code has rendered my test patch 
absolutely useless. I just send you what I have now, it's not
but is minimally invasive in the generic code.

The nested_irq_enable function is not really nice: you'll get
a link time error if you want to use it more than once. But
the only workaround would be very ugly macros.

	Regards,
	Gabriel

===== arch/ppc/kernel/irq.c 1.47 vs edited =====
--- 1.47/arch/ppc/kernel/irq.c	2004-10-22 04:41:23 +02:00
+++ edited/arch/ppc/kernel/irq.c	2004-11-15 00:51:49 +01:00
@@ -76,6 +76,9 @@
 extern int tau_interrupts(int);
 #endif
 
+int hardirq_shadow_ticks=0;
+extern unsigned long __hardirq_shadow_nip;
+
 int show_interrupts(struct seq_file *p, void *v)
 {
 	int i = *(loff_t *) v, j;
@@ -138,6 +141,9 @@
 void do_IRQ(struct pt_regs *regs)
 {
 	int irq, first = 1;
+	static int prev_irq=-3, prev_prev_irq=-4;
+	static unsigned spurious_jiffies;
+	static int prev_spurious;
         irq_enter();
 
 	/*
@@ -151,10 +157,28 @@
 	while ((irq = ppc_md.get_irq(regs)) >= 0) {
 		__do_IRQ(irq, regs);
 		first = 0;
+		prev_prev_irq = prev_irq;
+		prev_irq = irq;
 	}
-	if (irq != -2 && first)
+	if (irq != -2 && first) {
 		/* That's not SMP safe ... but who cares ? */
 		ppc_spurious_interrupts++;
+		if (jiffies-spurious_jiffies>HZ) {
+			if ((ppc_spurious_interrupts-prev_spurious) > 2 &&
+			    (regs->nip - (unsigned long)&__hardirq_shadow_nip)
+			    <=4) {
+				printk("Interrupt hardirq_shadow_ticks "
+				       "set to %d.\n",
+				       ++hardirq_shadow_ticks);
+			}
+			prev_spurious = ppc_spurious_interrupts;
+			spurious_jiffies=jiffies;
+			printk(KERN_NOTICE
+			       "Spurious interrupt, last vectors %d:%d, "
+			       "NIP=0x%lx\n",
+			       prev_prev_irq, prev_irq, regs->nip);
+		}
+	}
         irq_exit();
 }
 
===== include/asm-ppc/hw_irq.h 1.10 vs edited =====
--- 1.10/include/asm-ppc/hw_irq.h	2004-10-19 07:26:40 +02:00
+++ edited/include/asm-ppc/hw_irq.h	2004-11-15 11:38:45 +01:00
@@ -32,6 +32,29 @@
 	mtmsr(msr | MSR_EE);
 }
 
+#define ARCH_SHADOW_IRQS
+
+static inline void nested_irq_enable(void)
+{
+	extern int hardirq_shadow_ticks;
+	int start, now;
+	asm volatile(	"	mftb %0\n"
+			"1:	mftb %1\n"
+			"	sub %1,%1,%0\n"
+			"	cmplw %1,%2\n"
+			"	blt 1b\n"
+			"	isync\n"
+			"	mfmsr %0\n"
+			"	ori %0,%0,0x8000\n"
+			"	mtmsr %0\n"
+			"	.globl __hardirq_shadow_nip\n"
+			"__hardirq_shadow_nip: isync\n"
+			: "=&r" (start), "=&r" (now)
+			: "r" (hardirq_shadow_ticks)
+			: "cr0");
+	__asm__ __volatile__("": : :"memory");
+}
+
 static inline void local_irq_save_ptr(unsigned long *flags)
 {
 	unsigned long msr;
===== kernel/irq/handle.c 1.3 vs edited =====
--- 1.3/kernel/irq/handle.c	2004-11-04 20:13:19 +01:00
+++ edited/kernel/irq/handle.c	2004-11-15 11:37:48 +01:00
@@ -35,6 +35,12 @@
 	}
 };
 
+#ifndef ARCH_SHADOW_IRQS
+#define nested_irq_enable local_irq_enable
+#endif
+
+
+
 /*
  * Generic 'no controller' code
  */
@@ -91,9 +97,9 @@
 {
 	int ret, retval = 0, status = 0;
 
-	if (!(action->flags & SA_INTERRUPT))
-		local_irq_enable();
-
+	if (!(action->flags & SA_INTERRUPT)) {
+		nested_irq_enable();
+	}
 	do {
 		ret = action->handler(irq, action->dev_id, regs);
 		if (ret == IRQ_HANDLED)

