public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
@ 2006-05-31 20:02 Ingo Molnar
  2006-05-31 20:31 ` Arjan van de Ven
  0 siblings, 1 reply; 32+ messages in thread
From: Ingo Molnar @ 2006-05-31 20:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Arjan van de Ven, linux-kernel

untested on 8390 hardware, but ought to solve the lockdep false 
positive.

-----------------
Subject: locking validator: special rule: 8390.c disable_irq()
From: Ingo Molnar <mingo@elte.hu>

8390.c knows that ei_local->page_lock can only be used by an irq
context that it disabled - and can hence take the ->page_lock
without disabling hardirqs. Teach lockdep about this.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 drivers/net/8390.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux/drivers/net/8390.c
===================================================================
--- linux.orig/drivers/net/8390.c
+++ linux/drivers/net/8390.c
@@ -249,7 +249,7 @@ void ei_tx_timeout(struct net_device *de
 
 	/* Ugly but a reset can be slow, yet must be protected */
 		
-	disable_irq_nosync(dev->irq);
+	disable_irq_nosync_lockdep(dev->irq);
 	spin_lock(&ei_local->page_lock);
 		
 	/* Try to restart the card.  Perhaps the user has fixed something. */
@@ -257,7 +257,7 @@ void ei_tx_timeout(struct net_device *de
 	NS8390_init(dev, 1);
 		
 	spin_unlock(&ei_local->page_lock);
-	enable_irq(dev->irq);
+	enable_irq_lockdep(dev->irq);
 	netif_wake_queue(dev);
 }
     

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-05-31 20:02 [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq() Ingo Molnar
@ 2006-05-31 20:31 ` Arjan van de Ven
  2006-05-31 21:27   ` Ingo Molnar
  2006-05-31 21:41   ` Alan Cox
  0 siblings, 2 replies; 32+ messages in thread
From: Arjan van de Ven @ 2006-05-31 20:31 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: alan, Andrew Morton, linux-kernel

On Wed, 2006-05-31 at 22:02 +0200, Ingo Molnar wrote:
> untested on 8390 hardware, but ought to solve the lockdep false 
> positive.
> 
> -----------------
> Subject: locking validator: special rule: 8390.c disable_irq()
> From: Ingo Molnar <mingo@elte.hu>
> 
> 8390.c knows that ei_local->page_lock can only be used by an irq
> context that it disabled -

btw I think this is no longer correct with the irq polling stuff Alan
added to the kernel recently...


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-05-31 20:31 ` Arjan van de Ven
@ 2006-05-31 21:27   ` Ingo Molnar
  2006-05-31 21:41   ` Alan Cox
  1 sibling, 0 replies; 32+ messages in thread
From: Ingo Molnar @ 2006-05-31 21:27 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: alan, Andrew Morton, linux-kernel


* Arjan van de Ven <arjan@infradead.org> wrote:

> On Wed, 2006-05-31 at 22:02 +0200, Ingo Molnar wrote:
> > untested on 8390 hardware, but ought to solve the lockdep false 
> > positive.
> > 
> > -----------------
> > Subject: locking validator: special rule: 8390.c disable_irq()
> > From: Ingo Molnar <mingo@elte.hu>
> > 
> > 8390.c knows that ei_local->page_lock can only be used by an irq
> > context that it disabled -
> 
> btw I think this is no longer correct with the irq polling stuff Alan 
> added to the kernel recently...

hm, indeed. misrouted_irq() goes through all irq descriptors and ignores 
IRQ_DISABLED flag - rendering disable_irq() useless in essence, and 
introducing the kind of deadlocks that lockdep warned about.

Andrew, as far as i can see with irqfixup this isnt a lockdep false 
positive but a real deadlock scenario - a spurious IRQ might arrive 
during vortex_timer() execution and might cause the execution of 
misrouted_irq(), which could execute vortex_interrupt() => deadlock.

Alan, is this a necessary property of irqpoll/irqfixup? Shouldnt 
irqfixup leave irq descriptors alone that are disabled?

	Ingo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-05-31 20:31 ` Arjan van de Ven
  2006-05-31 21:27   ` Ingo Molnar
@ 2006-05-31 21:41   ` Alan Cox
  2006-05-31 21:43     ` Arjan van de Ven
  1 sibling, 1 reply; 32+ messages in thread
From: Alan Cox @ 2006-05-31 21:41 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Ingo Molnar, alan, Andrew Morton, linux-kernel

On Wed, May 31, 2006 at 10:31:40PM +0200, Arjan van de Ven wrote:
> > 8390.c knows that ei_local->page_lock can only be used by an irq
> > context that it disabled -
> 
> btw I think this is no longer correct with the irq polling stuff Alan
> added to the kernel recently...

We could make the misrouted IRQ logic skip all handlers on a disabled IRQ
but that might actually be worse than the disease we are trying to cure by
doing so.

Alan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-05-31 21:41   ` Alan Cox
@ 2006-05-31 21:43     ` Arjan van de Ven
  2006-05-31 21:47       ` Ingo Molnar
  0 siblings, 1 reply; 32+ messages in thread
From: Arjan van de Ven @ 2006-05-31 21:43 UTC (permalink / raw)
  To: Alan Cox; +Cc: Ingo Molnar, Andrew Morton, linux-kernel

On Wed, 2006-05-31 at 17:41 -0400, Alan Cox wrote:
> On Wed, May 31, 2006 at 10:31:40PM +0200, Arjan van de Ven wrote:
> > > 8390.c knows that ei_local->page_lock can only be used by an irq
> > > context that it disabled -
> > 
> > btw I think this is no longer correct with the irq polling stuff Alan
> > added to the kernel recently...
> 
> We could make the misrouted IRQ logic skip all handlers on a disabled IRQ
> but that might actually be worse than the disease we are trying to cure by
> doing so.

yeah since misrouted irqs will cause the kernel do disable irqs 'at
random' more or less .. for which the handlers now would become
unreachable which isn't good.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-05-31 21:43     ` Arjan van de Ven
@ 2006-05-31 21:47       ` Ingo Molnar
  2006-05-31 21:56         ` Arjan van de Ven
  2006-06-01  9:46         ` [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq() Alan Cox
  0 siblings, 2 replies; 32+ messages in thread
From: Ingo Molnar @ 2006-05-31 21:47 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Alan Cox, Andrew Morton, linux-kernel


* Arjan van de Ven <arjan@infradead.org> wrote:

> On Wed, 2006-05-31 at 17:41 -0400, Alan Cox wrote:
> > On Wed, May 31, 2006 at 10:31:40PM +0200, Arjan van de Ven wrote:
> > > > 8390.c knows that ei_local->page_lock can only be used by an irq
> > > > context that it disabled -
> > > 
> > > btw I think this is no longer correct with the irq polling stuff Alan
> > > added to the kernel recently...
> > 
> > We could make the misrouted IRQ logic skip all handlers on a disabled IRQ
> > but that might actually be worse than the disease we are trying to cure by
> > doing so.
> 
> yeah since misrouted irqs will cause the kernel do disable irqs 'at 
> random' more or less .. for which the handlers now would become 
> unreachable which isn't good.

couldnt most of these problems be avoided by tracking whether a handler 
_ever_ returned a success status? That means that irqpoll could safely 
poll handlers for which we know that they somehow arent yet matched up 
to any IRQ line?

	Ingo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-05-31 21:47       ` Ingo Molnar
@ 2006-05-31 21:56         ` Arjan van de Ven
  2006-05-31 22:00           ` Ingo Molnar
  2006-06-03 14:37           ` Steven Rostedt
  2006-06-01  9:46         ` [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq() Alan Cox
  1 sibling, 2 replies; 32+ messages in thread
From: Arjan van de Ven @ 2006-05-31 21:56 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Alan Cox, Andrew Morton, linux-kernel

On Wed, 2006-05-31 at 23:47 +0200, Ingo Molnar wrote:
> * Arjan van de Ven <arjan@infradead.org> wrote:
> 
> > On Wed, 2006-05-31 at 17:41 -0400, Alan Cox wrote:
> > > On Wed, May 31, 2006 at 10:31:40PM +0200, Arjan van de Ven wrote:
> > > > > 8390.c knows that ei_local->page_lock can only be used by an irq
> > > > > context that it disabled -
> > > > 
> > > > btw I think this is no longer correct with the irq polling stuff Alan
> > > > added to the kernel recently...
> > > 
> > > We could make the misrouted IRQ logic skip all handlers on a disabled IRQ
> > > but that might actually be worse than the disease we are trying to cure by
> > > doing so.
> > 
> > yeah since misrouted irqs will cause the kernel do disable irqs 'at 
> > random' more or less .. for which the handlers now would become 
> > unreachable which isn't good.
> 
> couldnt most of these problems be avoided by tracking whether a handler 
> _ever_ returned a success status? That means that irqpoll could safely 
> poll handlers for which we know that they somehow arent yet matched up 
> to any IRQ line?

I suspect the real solution is to have a

disable_irq_handler(irq, handler) 

function which does 2 things
1) disable the irq at the hardware level
2) mark the handler as "don't call me"

it matches the semantics here; what these drivers want is 1) not get an
irq handler called and 2) not get an irq flood



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-05-31 21:56         ` Arjan van de Ven
@ 2006-05-31 22:00           ` Ingo Molnar
  2006-05-31 22:02             ` Arjan van de Ven
  2006-06-03 14:37           ` Steven Rostedt
  1 sibling, 1 reply; 32+ messages in thread
From: Ingo Molnar @ 2006-05-31 22:00 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Alan Cox, Andrew Morton, linux-kernel


* Arjan van de Ven <arjan@infradead.org> wrote:

> > couldnt most of these problems be avoided by tracking whether a handler 
> > _ever_ returned a success status? That means that irqpoll could safely 
> > poll handlers for which we know that they somehow arent yet matched up 
> > to any IRQ line?
> 
> I suspect the real solution is to have a
> 
> disable_irq_handler(irq, handler) 
> 
> function which does 2 things
> 1) disable the irq at the hardware level
> 2) mark the handler as "don't call me"
> 
> it matches the semantics here; what these drivers want is 1) not get 
> an irq handler called and 2) not get an irq flood

ok, this would work. But there is a practical problem: only in drivers/* 
there's 310 disable_irq() calls - each would have to be changed to 
disable_irq_handler() [and i dont see any good way to automate that 
conversion] ...

	Ingo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-05-31 22:00           ` Ingo Molnar
@ 2006-05-31 22:02             ` Arjan van de Ven
  0 siblings, 0 replies; 32+ messages in thread
From: Arjan van de Ven @ 2006-05-31 22:02 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Alan Cox, Andrew Morton, linux-kernel

On Thu, 2006-06-01 at 00:00 +0200, Ingo Molnar wrote:
> * Arjan van de Ven <arjan@infradead.org> wrote:
> 
> > > couldnt most of these problems be avoided by tracking whether a handler 
> > > _ever_ returned a success status? That means that irqpoll could safely 
> > > poll handlers for which we know that they somehow arent yet matched up 
> > > to any IRQ line?
> > 
> > I suspect the real solution is to have a
> > 
> > disable_irq_handler(irq, handler) 
> > 
> > function which does 2 things
> > 1) disable the irq at the hardware level
> > 2) mark the handler as "don't call me"
> > 
> > it matches the semantics here; what these drivers want is 1) not get 
> > an irq handler called and 2) not get an irq flood
> 
> ok, this would work. But there is a practical problem: only in drivers/* 
> there's 310 disable_irq() calls - each would have to be changed to 
> disable_irq_handler() [and i dont see any good way to automate that 
> conversion] ...

want to take a bet on the number of those 310 that are just totally
bogus ?



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-05-31 21:47       ` Ingo Molnar
  2006-05-31 21:56         ` Arjan van de Ven
@ 2006-06-01  9:46         ` Alan Cox
  2006-06-01 10:02           ` Ingo Molnar
  1 sibling, 1 reply; 32+ messages in thread
From: Alan Cox @ 2006-06-01  9:46 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Arjan van de Ven, Alan Cox, Andrew Morton, linux-kernel

On Wed, May 31, 2006 at 11:47:29PM +0200, Ingo Molnar wrote:
> couldnt most of these problems be avoided by tracking whether a handler 
> _ever_ returned a success status? That means that irqpoll could safely 
> poll handlers for which we know that they somehow arent yet matched up 
> to any IRQ line?

But you may get random positive hits from this when a real IRQ for an
unrelated device happens to get delivered. We could poll enabled IRQs first
then disabled ones ?


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-01  9:46         ` [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq() Alan Cox
@ 2006-06-01 10:02           ` Ingo Molnar
  0 siblings, 0 replies; 32+ messages in thread
From: Ingo Molnar @ 2006-06-01 10:02 UTC (permalink / raw)
  To: Alan Cox; +Cc: Arjan van de Ven, Andrew Morton, linux-kernel


* Alan Cox <alan@redhat.com> wrote:

> On Wed, May 31, 2006 at 11:47:29PM +0200, Ingo Molnar wrote:
> > couldnt most of these problems be avoided by tracking whether a handler 
> > _ever_ returned a success status? That means that irqpoll could safely 
> > poll handlers for which we know that they somehow arent yet matched up 
> > to any IRQ line?
> 
> But you may get random positive hits from this when a real IRQ for an 
> unrelated device happens to get delivered. We could poll enabled IRQs 
> first then disabled ones ?

hm, you are right. Actions that are registered to the wrong IRQ might 
still appear to 'work' by pure accident, if they also share the IRQ with 
another (correctly routed) action.

the basic problem isnt really the polling method that irqpoll does - it 
is the insensitivity of the IRQ_DISABLED flag: we dont know whether it's 
disabled because the driver wants it, or because it was screaming 
before. Maybe we could (ab-)use irq_desc->depth for that - if that is 0 
but IRQ_DISABLED is set then you may ignore IRQ_DISABLED. Ok?

The patch below implements this logic, ontop of -rc5-mm2. Can you see 
any hole in it? (It built and booted up fine on x86_64 but i dont have 
any misrouted irqs.)

	Ingo

------------------
Subject: fix irqpoll to honor disable_irq()
From: Ingo Molnar <mingo@elte.hu>

irqpoll/irqfixup ignored IRQ_DISABLED but that could cause lockups. So 
listen to desc->depth to correctly honor disable_irq(). Also, when an 
interrupt it screaming, set IRQ_DISABLED but do not touch ->depth.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Index: linux/kernel/irq/spurious.c
===================================================================
--- linux.orig/kernel/irq/spurious.c
+++ linux/kernel/irq/spurious.c
@@ -56,6 +56,15 @@ static int misrouted_irq(int irq, struct
 		local_irq_disable();
 		/* Now clean up the flags */
 		spin_lock(&desc->lock);
+		/*
+		 * NOTE: we only listen to desc->depth here, not to
+		 * IRQ_DISABLED - which might have been set due to
+		 * a screaming interrupt.
+		 */
+		if (desc->depth) {
+			spin_unlock(&desc->lock);
+			continue;
+		}
 		action = desc->action;
 
 		/*
@@ -163,10 +172,12 @@ void note_interrupt(unsigned int irq, st
 		__report_bad_irq(irq, desc, action_ret);
 		/*
 		 * Now kill the IRQ
+		 *
+		 * We keep desc->depth unchanged - so that irqpoll can
+		 * honor driver IRQ-disabling.
 		 */
 		printk(KERN_EMERG "Disabling IRQ #%d\n", irq);
 		desc->status |= IRQ_DISABLED;
-		desc->depth = 1;
 		desc->chip->disable(irq);
 	}
 	desc->irqs_unhandled = 0;

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-05-31 21:56         ` Arjan van de Ven
  2006-05-31 22:00           ` Ingo Molnar
@ 2006-06-03 14:37           ` Steven Rostedt
  2006-06-03 21:53             ` Alan Cox
  1 sibling, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2006-06-03 14:37 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Ingo Molnar, Alan Cox, Andrew Morton, linux-kernel

On Wed, 2006-05-31 at 23:56 +0200, Arjan van de Ven wrote:
> On Wed, 2006-05-31 at 23:47 +0200, Ingo Molnar wrote:
> > * Arjan van de Ven <arjan@infradead.org> wrote:
> > > yeah since misrouted irqs will cause the kernel do disable irqs 'at 
> > > random' more or less .. for which the handlers now would become 
> > > unreachable which isn't good.
> > 
> > couldnt most of these problems be avoided by tracking whether a handler 
> > _ever_ returned a success status? That means that irqpoll could safely 
> > poll handlers for which we know that they somehow arent yet matched up 
> > to any IRQ line?
> 
> I suspect the real solution is to have a
> 
> disable_irq_handler(irq, handler) 
> 
> function which does 2 things
> 1) disable the irq at the hardware level
> 2) mark the handler as "don't call me"
> 
> it matches the semantics here; what these drivers want is 1) not get an
> irq handler called and 2) not get an irq flood

(Sorry for coming in late, but I'm slowly catching up with LKML)

Couldn't it be possible to have the misrouted irq function mark the
DISABLED_IRQ handlers as IRQ_PENDING?  Then have the enable_irq that
actually enables the irq to call the handlers with interrupts disabled
if the IRQ_PENDING is set?

-- Steve



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-03 14:37           ` Steven Rostedt
@ 2006-06-03 21:53             ` Alan Cox
  2006-06-03 22:34               ` Steven Rostedt
  0 siblings, 1 reply; 32+ messages in thread
From: Alan Cox @ 2006-06-03 21:53 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Arjan van de Ven, Ingo Molnar, Alan Cox, Andrew Morton,
	linux-kernel

On Sat, Jun 03, 2006 at 10:37:01AM -0400, Steven Rostedt wrote:
> Couldn't it be possible to have the misrouted irq function mark the
> DISABLED_IRQ handlers as IRQ_PENDING?  Then have the enable_irq that
> actually enables the irq to call the handlers with interrupts disabled
> if the IRQ_PENDING is set?

We still have the ambiguity with disable_irq. Really we need to have
disable_irq_handler(irq, handler)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-03 21:53             ` Alan Cox
@ 2006-06-03 22:34               ` Steven Rostedt
  2006-06-04  9:34                 ` Arjan van de Ven
  0 siblings, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2006-06-03 22:34 UTC (permalink / raw)
  To: Alan Cox; +Cc: Arjan van de Ven, Ingo Molnar, Andrew Morton, linux-kernel

On Sat, 2006-06-03 at 17:53 -0400, Alan Cox wrote:
> On Sat, Jun 03, 2006 at 10:37:01AM -0400, Steven Rostedt wrote:
> > Couldn't it be possible to have the misrouted irq function mark the
> > DISABLED_IRQ handlers as IRQ_PENDING?  Then have the enable_irq that
> > actually enables the irq to call the handlers with interrupts disabled
> > if the IRQ_PENDING is set?
> 
> We still have the ambiguity with disable_irq. Really we need to have
> disable_irq_handler(irq, handler)

Yeah, that does make sense, but I think the IRQ_PENDING idea works too.
This way disable_irq_handler doesn't need to mask the interrupt even
without the irqpoll and irqfixup.  Let the interrupt happen and just
skip those handlers that that are disabled.  Then when the handler is
re-enabled, then we can call the handler. Of course we would need a
enable_irq_handler too.

This would make the vortex card's disable_irq not hurt all the other
devices that share the irq with it.

-- Steve



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-03 22:34               ` Steven Rostedt
@ 2006-06-04  9:34                 ` Arjan van de Ven
  2006-06-04 13:16                   ` Steven Rostedt
  0 siblings, 1 reply; 32+ messages in thread
From: Arjan van de Ven @ 2006-06-04  9:34 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Alan Cox, Ingo Molnar, Andrew Morton, linux-kernel

On Sat, 2006-06-03 at 18:34 -0400, Steven Rostedt wrote:
> On Sat, 2006-06-03 at 17:53 -0400, Alan Cox wrote:
> > On Sat, Jun 03, 2006 at 10:37:01AM -0400, Steven Rostedt wrote:
> > > Couldn't it be possible to have the misrouted irq function mark the
> > > DISABLED_IRQ handlers as IRQ_PENDING?  Then have the enable_irq that
> > > actually enables the irq to call the handlers with interrupts disabled
> > > if the IRQ_PENDING is set?
> > 
> > We still have the ambiguity with disable_irq. Really we need to have
> > disable_irq_handler(irq, handler)
> 
> Yeah, that does make sense, but I think the IRQ_PENDING idea works too.
> This way disable_irq_handler doesn't need to mask the interrupt even
> without the irqpoll and irqfixup.  Let the interrupt happen and just
> skip those handlers that that are disabled.  Then when the handler is
> re-enabled, then we can call the handler. Of course we would need a
> enable_irq_handler too.
> 
> This would make the vortex card's disable_irq not hurt all the other
> devices that share the irq with it.
can't do that; if you get an irq anyway from the hardware you now have a
screamer...... which is why vortex really needs to disable the irq at
the hardware level.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-04  9:34                 ` Arjan van de Ven
@ 2006-06-04 13:16                   ` Steven Rostedt
  2006-06-04 15:38                     ` Alan Cox
  2006-06-04 16:10                     ` Alan Cox
  0 siblings, 2 replies; 32+ messages in thread
From: Steven Rostedt @ 2006-06-04 13:16 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Alan Cox, Ingo Molnar, Andrew Morton, linux-kernel

On Sun, 2006-06-04 at 11:34 +0200, Arjan van de Ven wrote:
> On Sat, 2006-06-03 at 18:34 -0400, Steven Rostedt wrote:
> > On Sat, 2006-06-03 at 17:53 -0400, Alan Cox wrote:
> > > On Sat, Jun 03, 2006 at 10:37:01AM -0400, Steven Rostedt wrote:
> > > > Couldn't it be possible to have the misrouted irq function mark the
> > > > DISABLED_IRQ handlers as IRQ_PENDING?  Then have the enable_irq that
> > > > actually enables the irq to call the handlers with interrupts disabled
> > > > if the IRQ_PENDING is set?
> > > 
> > > We still have the ambiguity with disable_irq. Really we need to have
> > > disable_irq_handler(irq, handler)
> > 
> > Yeah, that does make sense, but I think the IRQ_PENDING idea works too.
> > This way disable_irq_handler doesn't need to mask the interrupt even
> > without the irqpoll and irqfixup.  Let the interrupt happen and just
> > skip those handlers that that are disabled.  Then when the handler is
> > re-enabled, then we can call the handler. Of course we would need a
> > enable_irq_handler too.
> > 
> > This would make the vortex card's disable_irq not hurt all the other
> > devices that share the irq with it.
> can't do that; if you get an irq anyway from the hardware you now have a
> screamer...... which is why vortex really needs to disable the irq at
> the hardware level.
> 

I woke up in the middle of the night last night, thinking about this. I
suddenly realized that this cant work with level interrupts.  Oh well,
if we lived in a world of edge only, then it could work. :-/

But still for the fixirq case, it might still be an option to delay the
handler.

But I'm not sure if I fully understand this misrouting irq.  Is it to
fix broken machines that trigger interrupts on the wrong line?  Or is
this some strange case that happens on normal setups?  If an irq does
get misrouted, and we happen to be in the disable_irq of that device
that had its irq misrouted, couldn't it cause a storm if we don't call
the handler for it?  So really disable_irq is broken for the misrouting
case, and perhaps needs to be replaced with a spin_lock_irqsave?

-- Steve

-- Steve


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-04 13:16                   ` Steven Rostedt
@ 2006-06-04 15:38                     ` Alan Cox
  2006-06-04 15:44                       ` Steven Rostedt
  2006-06-04 16:10                     ` Alan Cox
  1 sibling, 1 reply; 32+ messages in thread
From: Alan Cox @ 2006-06-04 15:38 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Arjan van de Ven, Alan Cox, Ingo Molnar, Andrew Morton,
	linux-kernel

On Sun, Jun 04, 2006 at 09:16:01AM -0400, Steven Rostedt wrote:
> that had its irq misrouted, couldn't it cause a storm if we don't call
> the handler for it?  So really disable_irq is broken for the misrouting
> case, and perhaps needs to be replaced with a spin_lock_irqsave?

For the ne2k at least that simply is not possible, the latencies are so
bad that you start dropping serial characters and the like if you do.

The disease is not as bad as the cure..


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-04 15:38                     ` Alan Cox
@ 2006-06-04 15:44                       ` Steven Rostedt
  0 siblings, 0 replies; 32+ messages in thread
From: Steven Rostedt @ 2006-06-04 15:44 UTC (permalink / raw)
  To: Alan Cox; +Cc: Arjan van de Ven, Ingo Molnar, Andrew Morton, linux-kernel

On Sun, 2006-06-04 at 11:38 -0400, Alan Cox wrote:
> On Sun, Jun 04, 2006 at 09:16:01AM -0400, Steven Rostedt wrote:
> > that had its irq misrouted, couldn't it cause a storm if we don't call
> > the handler for it?  So really disable_irq is broken for the misrouting
> > case, and perhaps needs to be replaced with a spin_lock_irqsave?
> 
> For the ne2k at least that simply is not possible, the latencies are so
> bad that you start dropping serial characters and the like if you do.
> 
> The disease is not as bad as the cure..
> 

Forgive me on my ignorance of misrouted irqs.  I really don't understand
when and why they happen.

But my question still stands (maybe because I don't understand). If we
don't call the handler of the misrouted irq because if disable_irq,
can't we still get an interrupt storm?

-- Steve



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-04 13:16                   ` Steven Rostedt
  2006-06-04 15:38                     ` Alan Cox
@ 2006-06-04 16:10                     ` Alan Cox
  2006-06-04 16:22                       ` Steven Rostedt
  1 sibling, 1 reply; 32+ messages in thread
From: Alan Cox @ 2006-06-04 16:10 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Arjan van de Ven, Alan Cox, Ingo Molnar, Andrew Morton,
	linux-kernel

Ar Sul, 2006-06-04 am 09:16 -0400, ysgrifennodd Steven Rostedt:
> But I'm not sure if I fully understand this misrouting irq.  Is it to
> fix broken machines that trigger interrupts on the wrong line?  Or is

It is solely to deal with machines where IRQs turn up on the wrong line,
generally meaning broken ACPI IRQ tables. It has to be enabled as a boot
option.

Alan


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-04 16:10                     ` Alan Cox
@ 2006-06-04 16:22                       ` Steven Rostedt
  2006-06-04 21:26                         ` Alan Cox
  0 siblings, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2006-06-04 16:22 UTC (permalink / raw)
  To: Alan Cox
  Cc: Arjan van de Ven, Alan Cox, Ingo Molnar, Andrew Morton,
	linux-kernel

On Sun, 2006-06-04 at 17:10 +0100, Alan Cox wrote:
> Ar Sul, 2006-06-04 am 09:16 -0400, ysgrifennodd Steven Rostedt:
> > But I'm not sure if I fully understand this misrouting irq.  Is it to
> > fix broken machines that trigger interrupts on the wrong line?  Or is
> 
> It is solely to deal with machines where IRQs turn up on the wrong line,
> generally meaning broken ACPI IRQ tables. It has to be enabled as a boot
> option.

Thanks for the answer Alan.

But can't this machine still cause an interrupt storm if the interrupt
comes on a wrong line, and we don't call the handler for the interrupt
source because we are now honoring disable_irq?

-- Steve



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-04 16:22                       ` Steven Rostedt
@ 2006-06-04 21:26                         ` Alan Cox
  2006-06-04 21:28                           ` Steven Rostedt
  0 siblings, 1 reply; 32+ messages in thread
From: Alan Cox @ 2006-06-04 21:26 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Arjan van de Ven, Alan Cox, Ingo Molnar, Andrew Morton,
	linux-kernel

Ar Sul, 2006-06-04 am 12:22 -0400, ysgrifennodd Steven Rostedt:
> But can't this machine still cause an interrupt storm if the interrupt
> comes on a wrong line, and we don't call the handler for the interrupt
> source because we are now honoring disable_irq?

Yes - that is why we can't honour disable_irq in this case but have to
hope 8)


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-04 21:26                         ` Alan Cox
@ 2006-06-04 21:28                           ` Steven Rostedt
  2006-06-04 21:44                             ` Ingo Molnar
  0 siblings, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2006-06-04 21:28 UTC (permalink / raw)
  To: Alan Cox
  Cc: Arjan van de Ven, Alan Cox, Ingo Molnar, Andrew Morton,
	linux-kernel

On Sun, 2006-06-04 at 22:26 +0100, Alan Cox wrote:
> Ar Sul, 2006-06-04 am 12:22 -0400, ysgrifennodd Steven Rostedt:
> > But can't this machine still cause an interrupt storm if the interrupt
> > comes on a wrong line, and we don't call the handler for the interrupt
> > source because we are now honoring disable_irq?
> 
> Yes - that is why we can't honour disable_irq in this case but have to
> hope 8)
> 

Hmm, maybe this can be solved with something like what the -rt patch
does with threading interrupts and the interrupt mask.  I'm not
suggesting threading interrupts.  But, if the misrouted irq comes across
a disabled_irq, that it sets some flag, and doesn't unmask the interrupt
when finished.  Have enable_irq see the flag and have it unmask the
interrupt if it is safe to do so.

This all may be pretty hacky, but it's trying to fix code for hardware
that is already hacky.  Note, that this would need to be compiled in as
on option to actually implement any of this crap.

-- Steve



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-04 21:28                           ` Steven Rostedt
@ 2006-06-04 21:44                             ` Ingo Molnar
  2006-06-05  1:04                               ` Steven Rostedt
                                                 ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Ingo Molnar @ 2006-06-04 21:44 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Alan Cox, Arjan van de Ven, Alan Cox, Andrew Morton, linux-kernel


* Steven Rostedt <rostedt@goodmis.org> wrote:

> On Sun, 2006-06-04 at 22:26 +0100, Alan Cox wrote:
> > Ar Sul, 2006-06-04 am 12:22 -0400, ysgrifennodd Steven Rostedt:
> > > But can't this machine still cause an interrupt storm if the interrupt
> > > comes on a wrong line, and we don't call the handler for the interrupt
> > > source because we are now honoring disable_irq?
> > 
> > Yes - that is why we can't honour disable_irq in this case but have to
> > hope 8)
> > 
> 
> Hmm, maybe this can be solved with something like what the -rt patch 
> does with threading interrupts and the interrupt mask.  I'm not 
> suggesting threading interrupts.  But, if the misrouted irq comes 
> across a disabled_irq, that it sets some flag, and doesn't unmask the 
> interrupt when finished.  Have enable_irq see the flag and have it 
> unmask the interrupt if it is safe to do so.
> 
> This all may be pretty hacky, but it's trying to fix code for hardware 
> that is already hacky.  Note, that this would need to be compiled in 
> as on option to actually implement any of this crap.

no ... lets not mix threaded IRQs in here. The model of executing an 
interrupt handler has nothing to do with the irq flow itself. Whatever 
can be done with a threaded handler can be done via atomic ISRs too.

pretty much the only correct solution seems to be to go with Arjan's 
suggestion and make the 'disabled' property per-action, instead of the 
current per-desc thing. (obviously the physical act of masking an 
interrupt line is fundamentally per-desc, but the act of running an 
action "behind the back" of a masked line is still OK.) Unfortunately 
this would also mean the manual conversion of 300+ places that use 
disable_irq()/enable_irq() currently ... so it's no small work. (and the 
hardest part of the work is to find a safe method to convert them 
without introducing bugs)

	Ingo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq()
  2006-06-04 21:44                             ` Ingo Molnar
@ 2006-06-05  1:04                               ` Steven Rostedt
  2006-06-06  3:33                               ` [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts Steven Rostedt
  2006-06-06  3:49                               ` [RFC][PATCH -mm] postpone misrouted irqs when disabled Steven Rostedt
  2 siblings, 0 replies; 32+ messages in thread
From: Steven Rostedt @ 2006-06-05  1:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alan Cox, Arjan van de Ven, Alan Cox, Andrew Morton, linux-kernel

On Sun, 2006-06-04 at 23:44 +0200, Ingo Molnar wrote:
> * Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Sun, 2006-06-04 at 22:26 +0100, Alan Cox wrote:
> > > Ar Sul, 2006-06-04 am 12:22 -0400, ysgrifennodd Steven Rostedt:
> > > > But can't this machine still cause an interrupt storm if the interrupt
> > > > comes on a wrong line, and we don't call the handler for the interrupt
> > > > source because we are now honoring disable_irq?
> > > 
> > > Yes - that is why we can't honour disable_irq in this case but have to
> > > hope 8)
> > > 
> > 
> > Hmm, maybe this can be solved with something like what the -rt patch 
> > does with threading interrupts and the interrupt mask.  I'm not 
> > suggesting threading interrupts.  But, if the misrouted irq comes 
> > across a disabled_irq, that it sets some flag, and doesn't unmask the 
> > interrupt when finished.  Have enable_irq see the flag and have it 
> > unmask the interrupt if it is safe to do so.
> > 
> > This all may be pretty hacky, but it's trying to fix code for hardware 
> > that is already hacky.  Note, that this would need to be compiled in 
> > as on option to actually implement any of this crap.
> 
> no ... lets not mix threaded IRQs in here. The model of executing an 
> interrupt handler has nothing to do with the irq flow itself. Whatever 
> can be done with a threaded handler can be done via atomic ISRs too.

I wasn't suggesting threading the handler.  But to use the technique
that threaded handlers use.  That is to postpone the unmasking of the
interrupt until a later time. Sorry to not make that point clearer.

> 
> pretty much the only correct solution seems to be to go with Arjan's 
> suggestion and make the 'disabled' property per-action, instead of the 
> current per-desc thing. (obviously the physical act of masking an 
> interrupt line is fundamentally per-desc, but the act of running an 
> action "behind the back" of a masked line is still OK.) Unfortunately 
> this would also mean the manual conversion of 300+ places that use 
> disable_irq()/enable_irq() currently ... so it's no small work. (and the 
> hardest part of the work is to find a safe method to convert them 
> without introducing bugs)

I don't think that solves the problem.  Even if you disable at the
handler level, and that handler happened to cause an interrupt that is
level sensitive and needs the driver to disarm it, the problem returns.
So if disable_irq_handler disabled the handler and the interrupt, but
then this interrupt happened to come in on another line what happens?
If you decide to skip this handler because it was labeled "don't touch
me", we will again get the interrupt storm because as soon as you return
from the interrupt handler from the misrouted irq, it will be triggered
again!

So what I'm suggesting, is that we flag the case that a handler (or the
irq in general) is disabled when a misrouted irq is performed.  If the
irq was not covered by an interrupt handler, and there was a skipped
handler due to disable_irq, we then don't unmask the irq on return from
the interrupt but instead flag it.  Now when the irq is reenabled, the
flag is checked and if it wasn't handled already, we call the handler
then.

Actually this means that we really don't need the use of the
disable_irq_handler. We keep all the code with the interrupt part, and
we can do the accounting there.

Understand where I'm getting at?  By skipping a handler you risk having
an interrupt storm anyway.  But my idea is to have accounting to handle
disabled_irqs and only unmask the irq if it was handled and we didn't
skip any handlers due to disable_irq.  That way we can push the work
into enable_irq and avoid modifying 300+ calls to disable_irq not to
mention enable_irq as well.

I'd do a patch to show what I mean, but you will have to wait a week and
a half. I'll be traveling again, and wont be able to work on it until
then (unless I can squeeze some time tomorrow ;)

-- Steve


^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts
  2006-06-04 21:44                             ` Ingo Molnar
  2006-06-05  1:04                               ` Steven Rostedt
@ 2006-06-06  3:33                               ` Steven Rostedt
  2006-06-06  4:20                                 ` Andrew Morton
  2006-06-06  8:01                                 ` Ingo Molnar
  2006-06-06  3:49                               ` [RFC][PATCH -mm] postpone misrouted irqs when disabled Steven Rostedt
  2 siblings, 2 replies; 32+ messages in thread
From: Steven Rostedt @ 2006-06-06  3:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alan Cox, Arjan van de Ven, Alan Cox, Andrew Morton, linux-kernel


Hit the following BUG with irqpoll.  The below patch fixes it.

hda: WDC WD2000BB-00GUA0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14<1>BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
00000000
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP
last sysfs file:
Modules linked in:
CPU:    0
EIP:    0060:[<00000000>]    Not tainted VLI
EFLAGS: 00010002   (2.6.17-rc5-mm3 #23)
EIP is at 0x0
eax: c03f2540   ebx: c277dcc0   ecx: c03ea9c4   edx: 00000001
esi: 0000000e   edi: c03ea9e4   ebp: c03f5f08   esp: c03f5ed8
ds: 007b   es: 007b   ss: 0068
Process idle (pid: 0, threadinfo=c03f4000 task=c036d540)
Stack: c01488c5 0000000e c03f5f60 c277dcc0 00000000 00000001 c03ea9c4 c03ea9d0
       c03ea9d4 c03ea2c0 00000000 c03ea2e4 c03f5f38 c014904d 00000000 c03ea2c0
       00000001 c03f5f60 c03f5f60 00000000 c0370c00 c0148f30 00000000 c03f5f60
Call Trace:
 [<c0103db7>] show_stack_log_lvl+0xa7/0xf0
 [<c0103fc0>] show_registers+0x1c0/0x260
 [<c0104535>] die+0x135/0x2d0
 [<c0114be2>] do_page_fault+0x6b2/0x79c
 [<c01035f5>] error_code+0x39/0x40
 [<c014904d>] handle_edge_irq+0x11d/0x150
 [<c01055de>] do_IRQ+0x5e/0xb0
 [<c010348d>] common_interrupt+0x25/0x2c
 [<c010162d>] cpu_idle+0x4d/0xb0
 [<c01002e5>] rest_init+0x45/0x50
 [<c03fa81a>] start_kernel+0x32a/0x460
 [<c0100210>] 0xc0100210
Code:  Bad EIP value.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>


Index: linux-2.6.17-rc5-mm3/kernel/irq/spurious.c
===================================================================
--- linux-2.6.17-rc5-mm3.orig/kernel/irq/spurious.c	2006-06-05 17:26:15.000000000 -0400
+++ linux-2.6.17-rc5-mm3/kernel/irq/spurious.c	2006-06-05 19:47:58.000000000 -0400
@@ -77,7 +77,7 @@ static int misrouted_irq(int irq, struct
 		 * If we did actual work for the real IRQ line we must let the
 		 * IRQ controller clean up too
 		 */
-		if (work)
+		if (work && disc->chip && desc->chip->end)
 			desc->chip->end(i);
 		spin_unlock(&desc->lock);
 	}



^ permalink raw reply	[flat|nested] 32+ messages in thread

* [RFC][PATCH -mm] postpone misrouted irqs when disabled
  2006-06-04 21:44                             ` Ingo Molnar
  2006-06-05  1:04                               ` Steven Rostedt
  2006-06-06  3:33                               ` [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts Steven Rostedt
@ 2006-06-06  3:49                               ` Steven Rostedt
  2 siblings, 0 replies; 32+ messages in thread
From: Steven Rostedt @ 2006-06-06  3:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alan Cox, Arjan van de Ven, Alan Cox, Andrew Morton, linux-kernel

On Sun, 2006-06-04 at 23:44 +0200, Ingo Molnar wrote:

> pretty much the only correct solution seems to be to go with Arjan's 
> suggestion and make the 'disabled' property per-action, instead of the 
> current per-desc thing. (obviously the physical act of masking an 
> interrupt line is fundamentally per-desc, but the act of running an 
> action "behind the back" of a masked line is still OK.) Unfortunately 
> this would also mean the manual conversion of 300+ places that use 
> disable_irq()/enable_irq() currently ... so it's no small work. (and the 
> hardest part of the work is to find a safe method to convert them 
> without introducing bugs)

OK, I got some time to play.

This is what I was talking about before.  Here I honour the IRQ_DISABLED
in misrouted_irq, but I have some accounting that will not enable the
irq on exit of the interrupt handler if the misrouted irq encountered a
disabled interrupt.  What happens is that the IRQ stays masked until who
ever disabled the irq enables it.

On enable_irq a check is made and if a misrouted irq was done while the
irq was disabled, it then calls the interrupt handler, and checks to see
if it should unmask the irq in question.

Disclaimer: I did this patch in between compiles of doing real work, so
it is really a big hack.  But it helps to explain what I'm getting at.
BTW, my machine with the vortex card deadlocks without this patch
(happens in the disabled_irq with spin_lock-irqs-enabled section).  But
with this patch it actual runs, and I put in my internal logging to see
if the disable_irq + misrouted_irq that deadlocks normal happens.  It
did happen and the machine kept on going :)

And Ingo, this was inspired by the way -rt does hard irq threading.
That is to keep the irq unmasked until the thread takes care of it.
That's all, I wasn't suggesting that we add irq threads here.

Another positive with this approach: you don't need to update 300+ calls
to disable_irq and enable_irq and make sure they are right.

This patch goes against -mm3 + the patch I sent earlier to fix the
desc->chip->end() bug.

-- Steve

(Signing off, but this isn't a real patch - needs more work)

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>


Index: linux-2.6.17-rc5-mm3/include/linux/irq.h
===================================================================
--- linux-2.6.17-rc5-mm3.orig/include/linux/irq.h	2006-06-05 17:26:15.000000000 -0400
+++ linux-2.6.17-rc5-mm3/include/linux/irq.h	2006-06-05 23:08:07.000000000 -0400
@@ -165,6 +165,8 @@ struct irq_desc {
 
 extern struct irq_desc irq_desc[NR_IRQS];
 
+extern void misroute_enable_irq(unsigned int irq, struct irq_desc *desc);
+
 /*
  * Migration helpers for obsolete names, they will go away:
  */
@@ -332,8 +334,8 @@ static inline void generic_handle_irq(un
 }
 
 /* Handling of unhandled and spurious interrupts: */
-extern void note_interrupt(unsigned int irq, struct irq_desc *desc,
-			   int action_ret, struct pt_regs *regs);
+extern int note_interrupt(unsigned int irq, struct irq_desc *desc,
+			  int action_ret, struct pt_regs *regs);
 
 /* Resending of interrupts :*/
 void check_irq_resend(struct irq_desc *desc, unsigned int irq);
Index: linux-2.6.17-rc5-mm3/kernel/irq/handle.c
===================================================================
--- linux-2.6.17-rc5-mm3.orig/kernel/irq/handle.c	2006-06-05 22:34:55.000000000 -0400
+++ linux-2.6.17-rc5-mm3/kernel/irq/handle.c	2006-06-05 23:07:33.000000000 -0400
@@ -153,6 +153,7 @@ fastcall unsigned int __do_IRQ(unsigned 
 	struct irq_desc *desc = irq_desc + irq;
 	struct irqaction *action;
 	unsigned int status;
+	int misrouted_pending = 0;
 
 	spin_lock(&sdr_lock);
 	if (!irq) {
@@ -237,19 +238,23 @@ fastcall unsigned int __do_IRQ(unsigned 
 
 		spin_lock(&desc->lock);
 		if (!noirqdebug)
-			note_interrupt(irq, desc, action_ret, regs);
+			misrouted_pending = note_interrupt(irq, desc, action_ret, regs);
 		if (likely(!(desc->status & IRQ_PENDING)))
 			break;
+		if (unlikely(misrouted_pending))
+			break;
 		desc->status &= ~IRQ_PENDING;
 	}
 	desc->status &= ~IRQ_INPROGRESS;
 
 out:
-	/*
-	 * The ->end() handler has to deal with interrupts which got
-	 * disabled while the handler was running.
-	 */
-	desc->chip->end(irq);
+	if (likely(!misrouted_pending)) {
+		/*
+		 * The ->end() handler has to deal with interrupts which got
+		 * disabled while the handler was running.
+		 */
+		desc->chip->end(irq);
+	}
 	spin_unlock(&desc->lock);
 
 	return 1;
Index: linux-2.6.17-rc5-mm3/kernel/irq/manage.c
===================================================================
--- linux-2.6.17-rc5-mm3.orig/kernel/irq/manage.c	2006-06-05 17:26:15.000000000 -0400
+++ linux-2.6.17-rc5-mm3/kernel/irq/manage.c	2006-06-05 23:05:24.000000000 -0400
@@ -123,6 +123,16 @@ void enable_irq(unsigned int irq)
 
 		/* Prevent probing on this irq: */
 		desc->status = status | IRQ_NOPROBE;
+		/*
+		 * Here we check to see if we need to handle
+		 * a misrouted irq, and we were disabled when
+		 * it happened.
+		 * Really, the check_irq_resend could handle this
+		 * but that code is dependent on the chip having
+		 * a retrigger operation and the misrouted irq
+		 * can't depend on that.
+		 */
+		misroute_enable_irq(irq, desc);
 		check_irq_resend(desc, irq);
 		/* fall-through */
 	}
Index: linux-2.6.17-rc5-mm3/kernel/irq/spurious.c
===================================================================
--- linux-2.6.17-rc5-mm3.orig/kernel/irq/spurious.c	2006-06-05 19:53:27.000000000 -0400
+++ linux-2.6.17-rc5-mm3/kernel/irq/spurious.c	2006-06-05 23:29:42.000000000 -0400
@@ -10,15 +10,20 @@
 #include <linux/module.h>
 #include <linux/kallsyms.h>
 #include <linux/interrupt.h>
+#include <linux/bitops.h>
 
 static int irqfixup __read_mostly;
 
+DEFINE_SPINLOCK(misroute_lock);
+DECLARE_BITMAP(misroute_irq_disabled, NR_IRQS);
+DECLARE_BITMAP(misroute_irq_pending, NR_IRQS);
+
 /*
  * Recovery handler for misrouted interrupts.
  */
 static int misrouted_irq(int irq, struct pt_regs *regs)
 {
-	int i, ok = 0, work = 0;
+	int i, ok = 0, work = 0, disabled = 0;
 
 	for (i = 1; i < NR_IRQS; i++) {
 		struct irq_desc *desc = irq_desc + i;
@@ -39,8 +44,28 @@ static int misrouted_irq(int irq, struct
 			spin_unlock(&desc->lock);
 			continue;
 		}
+		/* Honour disable irq */
+		if (desc->status & IRQ_DISABLED) {
+			/*
+			 * Disabled: We set the pending bit to let the
+			 * enabled_irq know that it should check to
+			 * run again, and that we might not have unmasked
+			 * the interrupt yet.
+			 */
+			if (desc->action && (desc->action->flags & SA_SHIRQ)) {
+				desc->status |= IRQ_PENDING;
+				spin_lock(&misroute_lock);
+				__set_bit(i, misroute_irq_disabled);
+				spin_unlock(&misroute_lock);
+				disabled = 1;
+			}
+			spin_unlock(&desc->lock);
+			continue;
+		}
+
 		/* Honour the normal IRQ locking */
 		desc->status |= IRQ_INPROGRESS;
+
 		action = desc->action;
 		spin_unlock(&desc->lock);
 
@@ -133,9 +158,12 @@ report_bad_irq(unsigned int irq, struct 
 	}
 }
 
-void note_interrupt(unsigned int irq, struct irq_desc *desc,
-		    irqreturn_t action_ret, struct pt_regs *regs)
+int note_interrupt(unsigned int irq, struct irq_desc *desc,
+		   irqreturn_t action_ret, struct pt_regs *regs)
 {
+	int ok = -1;
+	int ret = 0;
+
 	if (unlikely(action_ret != IRQ_HANDLED)) {
 		desc->irqs_unhandled++;
 		if (unlikely(action_ret != IRQ_NONE))
@@ -145,15 +173,26 @@ void note_interrupt(unsigned int irq, st
 	if (unlikely(irqfixup)) {
 		/* Don't punish working computers */
 		if ((irqfixup == 2 && irq == 0) || action_ret == IRQ_NONE) {
-			int ok = misrouted_irq(irq, regs);
+			ok = misrouted_irq(irq, regs);
 			if (action_ret == IRQ_NONE)
 				desc->irqs_unhandled -= ok;
 		}
 	}
 
 	desc->irq_count++;
+	spin_lock(&misroute_lock);
+	/*
+	 * If we didn't handle an interrupt, and we have disabled
+	 * interrupts, then don't unmask this when we leave
+	 * the interrupt handler.
+	 */
+	if (!ok && find_first_bit(misroute_irq_disabled, NR_IRQS) != NR_IRQS) {
+		__set_bit(irq, misroute_irq_pending);
+		ret = 1;
+	}
+	spin_unlock(&misroute_lock);
 	if (likely(desc->irq_count < 100000))
-		return;
+		return ret;
 
 	desc->irq_count = 0;
 	if (unlikely(desc->irqs_unhandled > 99900)) {
@@ -170,6 +209,114 @@ void note_interrupt(unsigned int irq, st
 		desc->chip->disable(irq);
 	}
 	desc->irqs_unhandled = 0;
+	return ret;
+}
+
+/*
+ * When called, the desc->lock is held.
+ */
+void misroute_enable_irq(unsigned int irq, struct irq_desc *desc)
+{
+	struct irqaction *action;
+	int ok = 0;
+	int i;
+
+	/*
+	 * Perhaps a new flag would be good so that systems
+	 * without misroute irq enabled would skip this every time.
+	 */
+	if ((desc->status & IRQ_PENDING) == 0)
+		return;
+
+	spin_lock(&misroute_lock);
+	if (!test_and_clear_bit(irq, misroute_irq_disabled)) {
+		spin_unlock(&misroute_lock);
+		return;
+	}
+
+	spin_unlock(&misroute_lock);
+
+	/*
+	 * Run our handler.  This really needs to be wrapped up
+	 * in another function.
+	 */
+	desc->status |= IRQ_INPROGRESS;
+	desc->status &= ~IRQ_PENDING;
+
+	action = desc->action;
+	spin_unlock(&desc->lock);
+
+	while (action) {
+		/* Only shared IRQ handlers are safe to call */
+		if (action->flags & SA_SHIRQ) {
+			/*
+			 * Really no handler looks at regs.
+			 */
+			if (action->handler(irq, action->dev_id, NULL) ==
+			    IRQ_HANDLED)
+				ok = 1;
+		}
+		action = action->next;
+	}
+	/* Now clean up the flags */
+	spin_lock(&desc->lock);
+	action = desc->action;
+
+	/*
+	 * While we were looking for a fixup someone queued a real
+	 * IRQ clashing with our walk:
+	 */
+	while ((desc->status & IRQ_PENDING) && action) {
+		/*
+		 * Perform real IRQ processing for the IRQ we deferred
+		 */
+		ok = 1;
+		spin_unlock(&desc->lock);
+		handle_IRQ_event(irq, NULL, action);
+		spin_lock(&desc->lock);
+		desc->status &= ~IRQ_PENDING;
+	}
+	desc->status &= ~IRQ_INPROGRESS;
+	spin_unlock(&desc->lock);
+
+	spin_lock(&misroute_lock);
+	/*
+	 * If we did something or there are no more misrouted irqs then
+	 * unmask all the irqs that were waiting on us.
+	 */
+	while ((i = find_first_bit(misroute_irq_pending, NR_IRQS)) != NR_IRQS) {
+		struct irq_desc *d = irq_desc + i;
+		int m;
+
+		spin_unlock(&misroute_lock);
+
+		spin_lock(&d->lock);
+		spin_lock(&misroute_lock);
+		/*
+		 * Need to make sure that we suddenly didn't have another misroute.
+		 */
+		if ((m=find_first_bit(misroute_irq_disabled, NR_IRQS)) != NR_IRQS) {
+			spin_unlock(&d->lock);
+			break;
+		}
+
+		if (desc->chip && desc->chip->end)
+			desc->chip->end(i);
+		clear_bit(i, misroute_irq_pending);
+		spin_lock(&misroute_lock);
+		spin_unlock(&d->lock);
+	}
+	{
+		static int once;
+		if (!once) {
+			once = 1;
+			LOGDUMP();
+		}
+	}
+	spin_unlock(&misroute_lock);
+
+	/* leave as we came */
+	spin_lock(&desc->lock);
 }
 
 int noirqdebug __read_mostly;



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts
  2006-06-06  3:33                               ` [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts Steven Rostedt
@ 2006-06-06  4:20                                 ` Andrew Morton
  2006-06-06 10:46                                   ` Steven Rostedt
  2006-06-06  8:01                                 ` Ingo Molnar
  1 sibling, 1 reply; 32+ messages in thread
From: Andrew Morton @ 2006-06-06  4:20 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: mingo, alan, arjan, alan, linux-kernel

On Mon, 05 Jun 2006 23:33:50 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> 
> Hit the following BUG with irqpoll.  The below patch fixes it.
> 

Call me a cynic, but

> +		if (work && disc->chip && desc->chip->end)

that doesn't look super-tested to me.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts
  2006-06-06  3:33                               ` [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts Steven Rostedt
  2006-06-06  4:20                                 ` Andrew Morton
@ 2006-06-06  8:01                                 ` Ingo Molnar
  2006-06-06 10:50                                   ` Steven Rostedt
  1 sibling, 1 reply; 32+ messages in thread
From: Ingo Molnar @ 2006-06-06  8:01 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Alan Cox, Arjan van de Ven, Alan Cox, Andrew Morton, linux-kernel


* Steven Rostedt <rostedt@goodmis.org> wrote:

> Hit the following BUG with irqpoll.  The below patch fixes it.

> -		if (work)
> +		if (work && disc->chip && desc->chip->end)
>  			desc->chip->end(i);

typo - plus shouldnt this call ->eoi() as well if available?

	Ingo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts
  2006-06-06  4:20                                 ` Andrew Morton
@ 2006-06-06 10:46                                   ` Steven Rostedt
  0 siblings, 0 replies; 32+ messages in thread
From: Steven Rostedt @ 2006-06-06 10:46 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mingo, alan, arjan, alan, linux-kernel

On Mon, 2006-06-05 at 21:20 -0700, Andrew Morton wrote:
> On Mon, 05 Jun 2006 23:33:50 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > 
> > Hit the following BUG with irqpoll.  The below patch fixes it.
> > 
> 
> Call me a cynic, but
> 
> > +		if (work && disc->chip && desc->chip->end)
> 
> that doesn't look super-tested to me.

No, it hasn't been refreshed! Damn quilt!

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>


Index: linux-2.6.17-rc5-mm3/kernel/irq/spurious.c
===================================================================
--- linux-2.6.17-rc5-mm3.orig/kernel/irq/spurious.c	2006-06-05 17:26:15.000000000 -0400
+++ linux-2.6.17-rc5-mm3/kernel/irq/spurious.c	2006-06-06 06:43:43.000000000 -0400
@@ -77,7 +77,7 @@ static int misrouted_irq(int irq, struct
 		 * If we did actual work for the real IRQ line we must let the
 		 * IRQ controller clean up too
 		 */
-		if (work)
+		if (work && desc->chip && desc->chip->end)
 			desc->chip->end(i);
 		spin_unlock(&desc->lock);
 	}




^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts
  2006-06-06  8:01                                 ` Ingo Molnar
@ 2006-06-06 10:50                                   ` Steven Rostedt
  2006-06-06 21:48                                     ` Andrew Morton
  0 siblings, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2006-06-06 10:50 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alan Cox, Arjan van de Ven, Alan Cox, Andrew Morton, linux-kernel

On Tue, 2006-06-06 at 10:01 +0200, Ingo Molnar wrote:
> * Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > Hit the following BUG with irqpoll.  The below patch fixes it.
> 
> > -		if (work)
> > +		if (work && disc->chip && desc->chip->end)
> >  			desc->chip->end(i);
> 
> typo - plus shouldnt this call ->eoi() as well if available?
> 

I saw Andrews replay and said, WTF I already fixed that. But then
noticed that it wasn't refreshed.

As for eio, could be.  This was part of my whole misroute thing which I
didn't have time to get to deep in.  I'm leaving today, where I won't
have any computers or Internet til next Wednesday or Thursday, so I'm
not going to be able to work on this further till then.

-- Steve


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts
  2006-06-06 10:50                                   ` Steven Rostedt
@ 2006-06-06 21:48                                     ` Andrew Morton
  2006-06-06 21:50                                       ` Ingo Molnar
  0 siblings, 1 reply; 32+ messages in thread
From: Andrew Morton @ 2006-06-06 21:48 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: mingo, alan, arjan, alan, linux-kernel

On Tue, 06 Jun 2006 06:50:19 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 2006-06-06 at 10:01 +0200, Ingo Molnar wrote:
> > * Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > Hit the following BUG with irqpoll.  The below patch fixes it.
> > 
> > > -		if (work)
> > > +		if (work && disc->chip && desc->chip->end)
> > >  			desc->chip->end(i);
> > 

Why is this change necessary?  2.6.17-rc6 doesn't have it, and it doesn't
oops.  So somebody changed something.  What?  Why?  Was it intentional?  Was
it correct?

> 
> As for eio, could be.  This was part of my whole misroute thing which I
> didn't have time to get to deep in.  I'm leaving today, where I won't
> have any computers or Internet til next Wednesday or Thursday, so I'm
> not going to be able to work on this further till then.

If "could be" == "yes" then what would be the consequences of this
bug-which-were-not-sure-is-there-yet?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts
  2006-06-06 21:48                                     ` Andrew Morton
@ 2006-06-06 21:50                                       ` Ingo Molnar
  0 siblings, 0 replies; 32+ messages in thread
From: Ingo Molnar @ 2006-06-06 21:50 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Steven Rostedt, alan, arjan, alan, linux-kernel


* Andrew Morton <akpm@osdl.org> wrote:

> On Tue, 06 Jun 2006 06:50:19 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Tue, 2006-06-06 at 10:01 +0200, Ingo Molnar wrote:
> > > * Steven Rostedt <rostedt@goodmis.org> wrote:
> > > 
> > > > Hit the following BUG with irqpoll.  The below patch fixes it.
> > > 
> > > > -		if (work)
> > > > +		if (work && disc->chip && desc->chip->end)
> > > >  			desc->chip->end(i);
> > > 
> 
> Why is this change necessary?  2.6.17-rc6 doesn't have it, and it 
> doesn't oops.  So somebody changed something.  What?  Why?  Was it 
> intentional?  Was it correct?

that's due to the irqchips change on x86 and x86_64. So it's not 
something in -rc6 or some other unknown effect.

	Ingo

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2006-06-06 21:51 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-05-31 20:02 [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq() Ingo Molnar
2006-05-31 20:31 ` Arjan van de Ven
2006-05-31 21:27   ` Ingo Molnar
2006-05-31 21:41   ` Alan Cox
2006-05-31 21:43     ` Arjan van de Ven
2006-05-31 21:47       ` Ingo Molnar
2006-05-31 21:56         ` Arjan van de Ven
2006-05-31 22:00           ` Ingo Molnar
2006-05-31 22:02             ` Arjan van de Ven
2006-06-03 14:37           ` Steven Rostedt
2006-06-03 21:53             ` Alan Cox
2006-06-03 22:34               ` Steven Rostedt
2006-06-04  9:34                 ` Arjan van de Ven
2006-06-04 13:16                   ` Steven Rostedt
2006-06-04 15:38                     ` Alan Cox
2006-06-04 15:44                       ` Steven Rostedt
2006-06-04 16:10                     ` Alan Cox
2006-06-04 16:22                       ` Steven Rostedt
2006-06-04 21:26                         ` Alan Cox
2006-06-04 21:28                           ` Steven Rostedt
2006-06-04 21:44                             ` Ingo Molnar
2006-06-05  1:04                               ` Steven Rostedt
2006-06-06  3:33                               ` [PATCH -mm] misroute-irq: Don't call desc->chip->end because of edge interrupts Steven Rostedt
2006-06-06  4:20                                 ` Andrew Morton
2006-06-06 10:46                                   ` Steven Rostedt
2006-06-06  8:01                                 ` Ingo Molnar
2006-06-06 10:50                                   ` Steven Rostedt
2006-06-06 21:48                                     ` Andrew Morton
2006-06-06 21:50                                       ` Ingo Molnar
2006-06-06  3:49                               ` [RFC][PATCH -mm] postpone misrouted irqs when disabled Steven Rostedt
2006-06-01  9:46         ` [patch, -rc5-mm1] locking validator: special rule: 8390.c disable_irq() Alan Cox
2006-06-01 10:02           ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox