linux-rt-users.vger.kernel.org archive mirror
* Question on -rt synchronize_irq()
@ 2007-09-21  1:12 Paul E. McKenney
  2007-09-23 17:46 ` Paul E. McKenney
  0 siblings, 1 reply; 2+ messages in thread
From: Paul E. McKenney @ 2007-09-21  1:12 UTC
  To: linux-rt-users; +Cc: mingo, tglx, dvhltc, tytso

Hello!

Color me blind, but I don't see how the following race is avoided:

CPU 0:	A hardware interrupt is received for a threaded irq, which
	eventually results in do_hardirq() being invoked and the
	descriptor lock being acquired.  Because the IRQ_INPROGRESS
	status bit is set, execution continues.  Once the handler
	returns, having already cleared the IRQ_INPROGRESS status bit,
	the descriptor lock is released.

CPU 1:	A second hardware interrupt is received for the same threaded
	irq, which also wends its way to do_hardirq() with the
	IRQ_INPROGRESS status bit set.  It enters the handler (having
	released the descriptor lock) and accesses some data structure
	that CPU 2 now wants to get rid of.

CPU 2:	A synchronize_irq() is executed, again for this same irq.
	Because the descriptor status does not have the IRQ_NODELAY
	bit set, and because the IRQ_INPROGRESS status bit is set,
	this task blocks.

CPU 0:	Execution continues near the end of do_hardirq(), which notices
	that the descriptor wait_for_handler queue is non-empty,
	and therefore wakes up CPU 2's task.

CPU 2:	The task starts running, and proceeds to clean up the data
	structures that CPU 1 is still using.

CPU 1:	This second handler is suddenly and fatally disappointed by
	the disappearance of its data structures.


So what am I missing here?

							Thanx, Paul


* Re: Question on -rt synchronize_irq()
  2007-09-21  1:12 Question on -rt synchronize_irq() Paul E. McKenney
@ 2007-09-23 17:46 ` Paul E. McKenney
  0 siblings, 0 replies; 2+ messages in thread
From: Paul E. McKenney @ 2007-09-23 17:46 UTC
  To: linux-rt-users; +Cc: mingo, tglx, dvhltc, tytso

On Thu, Sep 20, 2007 at 06:12:22PM -0700, Paul E. McKenney wrote:
> Hello!
> 
> Color me blind, but I don't see how the following race is avoided:
> 
> CPU 0:	A hardware interrupt is received for a threaded irq, which
> 	eventually results in do_hardirq() being invoked and the
> 	descriptor lock being acquired.  Because the IRQ_INPROGRESS
> 	status bit is set, execution continues.  Once the handler
> 	returns, having already cleared the IRQ_INPROGRESS status bit,
> 	the descriptor lock is released.
> 
> CPU 1:	A second hardware interrupt is received for the same threaded
> 	irq, which also wends its way to do_hardirq() with the
> 	IRQ_INPROGRESS status bit set.  It enters the handler (having
> 	released the descriptor lock) and accesses some data structure
> 	that CPU 2 now wants to get rid of.
> 
> CPU 2:	A synchronize_irq() is executed, again for this same irq.
> 	Because the descriptor status does not have the IRQ_NODELAY
> 	bit set, and because the IRQ_INPROGRESS status bit is set,
> 	this task blocks.
> 
> CPU 0:	Execution continues near the end of do_hardirq(), which notices
> 	that the descriptor wait_for_handler queue is non-empty,
> 	and therefore wakes up CPU 2's task.
> 
> CPU 2:	The task starts running, and proceeds to clean up the data
> 	structures that CPU 1 is still using.

And the above was where I was in fact missing something.  When CPU 2
awakens, it will see that desc->status has the IRQ_INPROGRESS bit set,
and will therefore go back to sleep.  It will be awakened again when the
second interrupt finishes.
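
The re-check described above can be replayed in a small single-threaded
user-space model.  This is only a sketch: it assumes the -rt
synchronize_irq() waits in a loop that re-tests IRQ_INPROGRESS after
every wakeup, which is what the behavior above implies.  The model_*
names, struct model_desc, and the bit value are hypothetical and are
not taken from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

#define IRQ_INPROGRESS 0x1u	/* illustrative value, not the kernel's */

/* Hypothetical stand-in for the parts of the irq descriptor used here. */
struct model_desc {
	unsigned int status;
	bool waiter_asleep;	/* is the synchronize_irq() caller blocked? */
};

/* A hardware interrupt arrives: the irq is marked in progress. */
void model_irq_arrives(struct model_desc *d)
{
	d->status |= IRQ_INPROGRESS;
}

/* The handler finishes and the in-progress bit is cleared. */
void model_handler_done(struct model_desc *d)
{
	d->status &= ~IRQ_INPROGRESS;
}

/* synchronize_irq() entry: block only if a handler is in progress. */
void model_sync_begin(struct model_desc *d)
{
	d->waiter_asleep = (d->status & IRQ_INPROGRESS) != 0;
}

/* Wakeup: the waiter re-tests the bit and goes back to sleep if set. */
void model_wake(struct model_desc *d)
{
	if (d->waiter_asleep && !(d->status & IRQ_INPROGRESS))
		d->waiter_asleep = false;
}

/*
 * Replays the CPU 0 / CPU 1 / CPU 2 sequence from the original mail.
 * Returns true if the waiter stays asleep across the first wakeup and
 * proceeds only after the second handler has finished.
 */
bool model_replay_false_alarm(void)
{
	struct model_desc d = { 0, false };
	bool asleep_after_first_wake;

	model_irq_arrives(&d);		/* CPU 0: first interrupt */
	model_handler_done(&d);		/* CPU 0: handler returns, bit cleared */
	model_irq_arrives(&d);		/* CPU 1: second interrupt sets the bit */
	model_sync_begin(&d);		/* CPU 2: synchronize_irq() blocks */
	assert(d.waiter_asleep);
	model_wake(&d);			/* CPU 0: end of do_hardirq() wakes CPU 2 */
	asleep_after_first_wake = d.waiter_asleep;
	model_handler_done(&d);		/* CPU 1: second handler finishes */
	model_wake(&d);			/* CPU 1: wakes CPU 2 again */
	return asleep_after_first_wake && !d.waiter_asleep;
}
```

Calling model_replay_false_alarm() walks through the interleaving in
order; the point of interest is that the first wakeup alone does not
let the waiter proceed.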

Please accept my apologies for the false alarm!!!

So the only shortcoming I am aware of for synchronize_irq() in -rt is
that if there are two concurrent synchronize_irq() calls for the same
IRQ that happen during the last interrupt ever for this IRQ, then the
second of the two will sleep forever.  A similar effect could cause
a precisely timed sequence of interrupts to indefinitely delay a given
synchronize_irq().  I think, anyway.  (If this is the case, then adding
a generation number to the irq_desc structure would be one possible
way of fixing this.)
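
One possible reading of the generation-number idea, again as a
single-threaded user-space sketch rather than a kernel patch.  It
assumes that at most one handler invocation per irq is in flight at a
time and that a counter is bumped whenever a handler run completes.
The generation field, struct gen_desc, struct gen_waiter, and the gen_*
functions are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define IRQ_INPROGRESS 0x1u	/* illustrative value, not the kernel's */

/* Hypothetical descriptor with an added generation counter. */
struct gen_desc {
	unsigned int status;
	unsigned long generation;	/* completed handler runs so far */
};

/* What a synchronize_irq() caller would remember from its entry. */
struct gen_waiter {
	bool must_wait;			/* was a handler in flight at entry? */
	unsigned long snapshot;		/* generation observed at entry */
};

void gen_irq_arrives(struct gen_desc *d)
{
	d->status |= IRQ_INPROGRESS;
}

/* Handler completion clears the bit and advances the generation. */
void gen_handler_done(struct gen_desc *d)
{
	d->status &= ~IRQ_INPROGRESS;
	d->generation++;
}

struct gen_waiter gen_sync_begin(const struct gen_desc *d)
{
	struct gen_waiter w;

	w.must_wait = (d->status & IRQ_INPROGRESS) != 0;
	w.snapshot = d->generation;
	return w;
}

/*
 * The waiter may return once the handler that was in flight at entry
 * has completed, even if a later interrupt has set the bit again.
 */
bool gen_sync_may_return(const struct gen_desc *d, const struct gen_waiter *w)
{
	return !w->must_wait || d->generation != w->snapshot;
}

/*
 * Back-to-back interrupts: the bit is set again before the waiter runs.
 * Returns true if a bit-only test would keep waiting while the
 * generation test lets the waiter proceed.
 */
bool gen_replay_back_to_back(void)
{
	struct gen_desc d = { 0, 0 };
	struct gen_waiter w;
	bool bit_only_would_proceed;

	gen_irq_arrives(&d);		/* handler A in flight */
	w = gen_sync_begin(&d);		/* waiter must outlast handler A */
	assert(!gen_sync_may_return(&d, &w));
	gen_handler_done(&d);		/* handler A completes */
	gen_irq_arrives(&d);		/* handler B arrives before waiter runs */
	bit_only_would_proceed = !(d.status & IRQ_INPROGRESS);
	return !bit_only_would_proceed && gen_sync_may_return(&d, &w);
}
```

The design choice in this sketch is that the waiter only needs to
outlast the handler that was running when it entered, so a comparison
against the entry snapshot is enough; whether that matches the
ordering guarantees of the real descriptor lock is not checked here.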

							Thanx, Paul

> CPU 1:	This second handler is suddenly and fatally disappointed by
> 	the disappearance of its data structures.
> 
> 
> So what am I missing here?
> 
> 							Thanx, Paul
