From: Tejun Heo <tj@kernel.org>
To: mingo@elte.hu, tglx@linutronix.de, bphilips@suse.de,
yinghai@kernel.org, akpm@linux-foundation.org,
torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
jeff@garzik.orglinux
Subject: [PATCHSET] irq: better lost/spurious irq handling
Date: Sun, 13 Jun 2010 17:31:26 +0200
Message-ID: <1276443098-20653-1-git-send-email-tj@kernel.org>
Hello,
This is the first take of better-lost-spurious-irq-handling patchset.
IRQs can go wrong in two opposite directions. There can be too many
or too few. Currently, the former is handled by spurious IRQ
detection and polling (the "nobody cared" thing) and the latter by
the irqpoll kernel parameter, which is currently broken on many
configurations due to the tickless timer and missing IRQF_IRQPOLL.
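
As a rough illustration of the "too many" direction, here is a minimal
userspace sketch in Python of a "nobody cared" style detector. It is
not kernel code. The class name, the window size and the threshold are
assumptions chosen for illustration only: the idea is simply to count
how many delivered IRQs no handler claimed, and to fall back to polling
when almost all of them in a window were unclaimed.

```python
from dataclasses import dataclass


@dataclass
class SpuriousDetector:
    """Toy model of a 'nobody cared' detector for one IRQ line."""

    window: int = 1000      # assumed: IRQs per evaluation window
    threshold: int = 990    # assumed: unhandled count that trips the detector
    seen: int = 0           # IRQs delivered in the current window
    unhandled: int = 0      # IRQs in the current window that nobody claimed
    polling: bool = False   # True once the line is given up on and polled

    def note_interrupt(self, handled: bool) -> None:
        """Record one delivered IRQ and whether any handler claimed it."""
        if self.polling:
            return
        self.seen += 1
        if not handled:
            self.unhandled += 1
        if self.seen < self.window:
            return
        # End of window: decide, then start a fresh window.
        if self.unhandled >= self.threshold:
            self.polling = True
        self.seen = 0
        self.unhandled = 0
```

A line that delivers mostly claimed interrupts never trips the
detector, while a stuck line trips it after one window.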
Certain hardware classes are inherently prone to IRQ-related problems.
ATA is one very good example. When the traditional IDE interface is
used, regardless of PATA or SATA, IRQ handling is very fragile.
There is no reliable way to tell whether the controller is raising an
interrupt or not, so the driver has to expect IRQs according to HSM
transitions and hope that the host controller stays in sync and does
what it's expected to do. Occasionally but surely something goes
wrong and an IRQ storm or timeout follows. Furthermore, the IRQ is
ultimately under the control of the ATA device, not the ATA controller,
making things a whole lot more fragile and prone to permanent and
transient IRQ problems.
Even if the controller and hardware themselves are okay, IRQ sharing
means that any device can be a victim of rogue interrupts. For
example, there was an I2C device whose driver didn't use the IRQ,
but when the configuration was right (well, rather, wrong), its IRQ
line would assert and cause an IRQ storm, and there wasn't much the
driver could do to prevent that. There's also the BIOS and OS expecting
different things, especially during suspend and resume. On my x61s,
when resuming from STR, something funny happens and the IRQ line for a
USB host gets stuck once in a while.
Most of these problems can be worked around much more efficiently
without adding noticeable runtime overhead or driver complexity by
using polling carefully. This patchset improves the existing spurious
IRQ handling and implements two mechanisms to work around lost
interrupts.
Emphasis was put on making it easy to use for drivers. Drivers only
need to set IRQF_SHARED on the interrupt handler and add some function
calls here and there. Functions which can be used in hot paths are
efficient and can be called without worrying about performance
implications by virtually any driver which deals with actual
hardware. Except for the init functions, none of them care about
calling context, and they won't fail catastrophically even if used
incorrectly. Also, operational parameters are predetermined and/or
self-regulating.
After this patchset, the following three mechanisms are in place to
deal with IRQ problems.
* IRQ expecting: Tightly coupled with controller operation. Provides
strong protection against most lost IRQ problems. Applied to
libata.
* IRQ watching: Loosely coupled with controller operation. Provides
protection against common lost IRQ problems (misrouting). Applied
to usb.
* Spurious IRQ handling: More responsive and less expensive than the
existing implementation. Tries to disengage after some period so
that transient problems don't end up having prolonged effects.
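
To make the "IRQ expecting" idea above more concrete, here is a minimal
userspace sketch in Python. It is a toy model, not the API added by
this patchset: the class and method names (ExpectedIrq, expect,
unexpect, irq, tick) and the fixed timeout are all hypothetical. The
driver announces that it expects an interrupt before issuing a command.
If the interrupt doesn't show up by the deadline, a timer calls the
handler directly, so a lost IRQ costs one poll interval instead of a
full command timeout.

```python
from typing import Callable, Optional


class ExpectedIrq:
    """Toy model: poll the handler if an expected IRQ does not arrive in time."""

    def __init__(self, handler: Callable[[], bool], timeout: float) -> None:
        self.handler = handler   # returns True if the device had work pending
        self.timeout = timeout   # assumed: seconds to wait before polling
        self.deadline: Optional[float] = None
        self.recovered = 0       # completions found by polling, not by IRQ

    def expect(self, now: float) -> None:
        """Driver is about to issue a command that should raise an IRQ."""
        self.deadline = now + self.timeout

    def unexpect(self) -> None:
        """The expected event completed; stop polling."""
        self.deadline = None

    def irq(self) -> bool:
        """A real interrupt arrived; run the handler as usual."""
        handled = self.handler()
        if handled:
            self.unexpect()
        return handled

    def tick(self, now: float) -> None:
        """Timer callback: poll only while an IRQ is expected and overdue."""
        if self.deadline is None or now < self.deadline:
            return
        if self.handler():
            self.recovered += 1
            self.unexpect()
        else:
            self.deadline = now + self.timeout  # nothing yet, keep polling
```

When nothing is expected, tick() returns immediately, which is the
property that lets this kind of polling stay cheap in the common case.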
With the patchset applied, my test machine works fine with IRQ routing
messed up. By applying the mechanism to more drivers, things will
improve but, even in the current state, many systems with IRQ problems
will be able to cope with transient problems much better and install
and run the base system well enough to allow bug reporting and
debugging of persistent ones.
This patchset contains the following 12 patches.
0001-irq-cleanup-irqfixup.patch
0002-irq-make-spurious-poll-timer-per-desc.patch
0003-irq-use-desc-poll_timer-for-irqpoll.patch
0004-irq-kill-IRQF_IRQPOLL.patch
0005-irq-misc-preparations-for-further-changes.patch
0006-irq-implement-irq_schedule_poll.patch
0007-irq-improve-spurious-IRQ-handling.patch
0008-irq-implement-IRQ-watching.patch
0009-irq-implement-IRQ-expecting.patch
0010-irq-add-comment-about-overall-design-of-lost-spuriou.patch
0011-libata-use-IRQ-expecting.patch
0012-usb-use-IRQ-watching.patch
0001 is cleanup.
0002-0004 convert the existing polling mechanisms to use a per-desc
timer instead of IRQF_IRQPOLL. This is more reliable, cheaper, and
easier to maintain.
0005-0006 prepare for further changes.
0007-0010 implement better lost/spurious interrupt handling
mechanisms.
0011-0012 apply them to libata and usb.
This patchset is available in the following git branch.
git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git lost-spurious-irq
and contains the following changes.
arch/arm/mach-aaec2000/core.c | 2
arch/arm/mach-at91/at91rm9200_time.c | 2
arch/arm/mach-at91/at91sam926x_time.c | 2
arch/arm/mach-bcmring/core.c | 2
arch/arm/mach-clps711x/time.c | 2
arch/arm/mach-cns3xxx/core.c | 2
arch/arm/mach-ebsa110/core.c | 2
arch/arm/mach-ep93xx/core.c | 2
arch/arm/mach-footbridge/dc21285-timer.c | 2
arch/arm/mach-footbridge/isa-timer.c | 2
arch/arm/mach-h720x/cpu-h7201.c | 2
arch/arm/mach-h720x/cpu-h7202.c | 2
arch/arm/mach-integrator/integrator_ap.c | 2
arch/arm/mach-ixp2000/core.c | 2
arch/arm/mach-ixp23xx/core.c | 2
arch/arm/mach-ixp4xx/common.c | 2
arch/arm/mach-lh7a40x/time.c | 2
arch/arm/mach-mmp/time.c | 2
arch/arm/mach-netx/time.c | 2
arch/arm/mach-ns9xxx/irq.c | 3
arch/arm/mach-ns9xxx/time-ns9360.c | 2
arch/arm/mach-nuc93x/time.c | 2
arch/arm/mach-omap1/time.c | 2
arch/arm/mach-omap1/timer32k.c | 2
arch/arm/mach-omap2/timer-gp.c | 2
arch/arm/mach-pnx4008/time.c | 2
arch/arm/mach-pxa/time.c | 2
arch/arm/mach-sa1100/time.c | 2
arch/arm/mach-shark/core.c | 2
arch/arm/mach-u300/timer.c | 2
arch/arm/mach-w90x900/time.c | 2
arch/arm/plat-iop/time.c | 2
arch/arm/plat-mxc/time.c | 2
arch/arm/plat-samsung/time.c | 2
arch/arm/plat-versatile/timer-sp.c | 2
arch/blackfin/kernel/time-ts.c | 6
arch/ia64/kernel/time.c | 2
arch/parisc/kernel/irq.c | 2
arch/powerpc/platforms/cell/interrupt.c | 5
arch/x86/kernel/time.c | 2
drivers/ata/libata-core.c | 15
drivers/ata/libata-eh.c | 4
drivers/ata/libata-sff.c | 37 -
drivers/clocksource/sh_cmt.c | 3
drivers/clocksource/sh_mtu2.c | 3
drivers/clocksource/sh_tmu.c | 3
drivers/usb/core/hcd.c | 1
include/linux/interrupt.h | 43 -
include/linux/irq.h | 40 +
include/linux/libata.h | 2
kernel/irq/chip.c | 20
kernel/irq/handle.c | 7
kernel/irq/internals.h | 10
kernel/irq/manage.c | 18
kernel/irq/proc.c | 5
kernel/irq/spurious.c | 978 ++++++++++++++++++++++++++-----
56 files changed, 1008 insertions(+), 269 deletions(-)
Thanks.
--
tejun