linux-can.vger.kernel.org archive mirror
* Re: [PATCH] genirq: Sanitize spurious interrupt detection of threaded irqs
@ 2013-12-23 19:25 Austin Schuh
  2014-01-06 13:32 ` Oliver Hartkopp
  0 siblings, 1 reply; 9+ messages in thread
From: Austin Schuh @ 2013-12-23 19:25 UTC
  To: Thomas Gleixner, Wolfgang Grandegger, Pavel Pisa,
	Marc Kleine-Budde, Oliver Hartkopp, linux-can

Hi Thomas,

Did anything happen with your patch to note_interrupt, originally
posted on May 8th of 2013?  (https://lkml.org/lkml/2013/3/7/222)

I am currently seeing an issue on a machine running a
config-preempt-rt kernel with an SJA1000 CAN card from PEAK.  It works
for ~1 day, and then dies with a "Disabling IRQ #18"
message.  I posted on the Linux CAN mailing list, and Oliver Hartkopp
was able to reproduce the issue only on a realtime kernel.  A function
trace ending when the IRQ was disabled shows that note_interrupt is
being called regularly from the IRQ handler threads, and one of the
threads is doing work (and therefore calling note_interrupt with
IRQ_HANDLED).

Oliver Hartkopp and I ran tests over the weekend on numerous machines
and verified that the patch that you proposed fixes the problem.  We
think that the race condition that Till reported is causing the
problem here.

In reply to the comment about using the upper bit of
threads_handled_last for holding the SPURIOUS_DEFERRED flag, while
that may still be an over-optimization, the code should still work.
All comparisons are done with the bit set, which just makes it a
31-bit counter.  It will take 8 more days for the counter to overflow on
my machine, so I won't know for certain until then.
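
To make the 31-bit argument concrete, here is a minimal user-space
sketch of the comparison.  It is not the kernel code: struct irq_model
and thread_made_progress() are hypothetical names used only for
illustration, and the flag value 0x80000000 is assumed from "the upper
bit" of a 32-bit field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag value: the upper bit of a 32-bit word. */
#define SPURIOUS_DEFERRED 0x80000000u

/* Hypothetical, simplified stand-in for the per-IRQ bookkeeping. */
struct irq_model {
    uint32_t threads_handled;      /* bumped by the handler threads */
    uint32_t threads_handled_last; /* last snapshot, flag bit kept set */
};

/*
 * Returns true if the handler threads made progress since the last
 * snapshot.  Both sides of the comparison carry the flag bit, so only
 * the low 31 bits of the counter take part in it.
 */
static bool thread_made_progress(struct irq_model *m)
{
    /* Precondition of this sketch: the snapshot always has the flag set. */
    assert(m->threads_handled_last & SPURIOUS_DEFERRED);

    uint32_t handled = m->threads_handled | SPURIOUS_DEFERRED;

    if (handled == m->threads_handled_last)
        return false;

    m->threads_handled_last = handled;
    return true;
}
```

In this model the only way to miss progress is for the counter to
advance by an exact multiple of 2^31 between two comparisons.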

My only concern is that there may still be a small race condition with
this new code.  If the interrupt handler thread is running at a
realtime priority, but lower than another task, it may not get run
until a large number of IRQs get triggered, and then process them
quickly.  With your new handler code, this would be counted as one
single handled interrupt.  With the current constants, this is only a
problem if more than 1000 calls to the handler happen between IRQs.  I
starved my card's irq threads by running 4 tasks at a higher realtime
priority than the handler threads, and saw the number of unhandled
IRQs jump from 1/100000 to 3/100000, so that problem may not show up
in practice.
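
The batching concern can be illustrated with a toy model.  This is a
sketch under assumptions, not the kernel logic: simulate() and struct
tally are hypothetical, the handler thread is assumed to be starved for
exactly `burst` hard IRQs and then to drain them all at once, and the
one-interrupt deferral and any time-based reset are left out.

```c
#include <stdint.h>

/* Assumed flag value: the upper bit of a 32-bit word. */
#define SPURIOUS_DEFERRED 0x80000000u

/* Hypothetical result type: how each hard IRQ was classified. */
struct tally {
    unsigned long handled;
    unsigned long unhandled;
};

/*
 * Toy model: in each period, `burst` hard IRQs arrive while the handler
 * thread is starved, then the thread runs once and handles all of them.
 * Each hard IRQ is classified by checking whether the thread counter
 * changed since the last snapshot.
 */
static struct tally simulate(unsigned int periods, unsigned int burst)
{
    uint32_t threads_handled = 0;
    uint32_t last = SPURIOUS_DEFERRED;
    struct tally t = { 0, 0 };

    for (unsigned int p = 0; p < periods; p++) {
        for (unsigned int i = 0; i < burst; i++) {
            uint32_t now = threads_handled | SPURIOUS_DEFERRED;

            if (now != last) {
                last = now;      /* progress seen: counts as handled */
                t.handled++;
            } else {
                t.unhandled++;   /* no progress seen: counts as unhandled */
            }
        }
        /* The thread finally runs and drains the whole burst. */
        threads_handled += burst;
    }
    return t;
}
```

With burst = 1 nearly every IRQ is classified as handled; with burst = 4
only one IRQ per period is, even though the thread handled all four.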

Austin Schuh

Tested-by: Austin Schuh <austin@peloton-tech.com>


Thread overview: 9+ messages
2013-12-23 19:25 [PATCH] genirq: Sanitize spurious interrupt detection of threaded irqs Austin Schuh
2014-01-06 13:32 ` Oliver Hartkopp
2014-04-07 18:38   ` Austin Schuh
2014-04-07 18:41     ` Thomas Gleixner
2014-04-07 20:05       ` Austin Schuh
2014-04-07 20:07         ` Thomas Gleixner
2014-04-07 20:08           ` Austin Schuh
2014-04-28 20:20             ` Austin Schuh
2014-04-28 20:44               ` Thomas Gleixner
