From: Jeff Garzik <jeff@garzik.org>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Rene Herman <rene.herman@keyaccess.nl>,
Adrian Bunk <bunk@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
rmk@arm.linux.org.uk, Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@redhat.com>
Subject: MSI, fun for the whole family (was Re: [git patch] free_irq() fixes)
Date: Thu, 24 Apr 2008 23:33:02 -0400
Message-ID: <481150EE.3040103@garzik.org>
In-Reply-To: <m11w4u8x1c.fsf@frodo.ebiederm.org>
Eric W. Biederman wrote:
> Jeff Garzik <jeff@garzik.org> writes:
>
>> Eric W. Biederman wrote:
>>> And on x86 at least the hardware maps the MSI write into an interrupt.
>>> So there is not an opportunity to get any metadata/OOB data from the
>>> MSI message. Instead we just potentially get a boatload more irq
>>> sources. Which is one of the things making a static NR_IRQS painful.
>>>
>>> To be safe we have to make NR_IRQS 10x+ or so bigger than people use
>>> today. Just in case they decide to plug in some really irq hungry
>>> cards.
>>
>> Just to be clear, irq_chip/irq_desc and metadata/OOB data are two very different
>> beasts. irq_chip/irq_desc is more a system attribute as Linus notes. Also, it
>> doesn't change very often.
>>
>> metadata/OOB, on the other hand, is different -for each interrupt-, and is
>> highly relevant to drivers. Thus should be part of the driver API somehow.
>
> I'm not certain I follow so I will ask.
>
> Do you mean information that is different each time an interrupt fires?
>
> Or do you mean information that differs for each different interrupt?
> Something like the current dev_id?
>
> To my knowledge there is not any information that varies each time an
> interrupt fires.
Absolutely there is! This is why MSI is so cool.
You get a tiny chunk of data from the hardware, across the PCI bus in a
single PCI transaction, sent [well, basically...] straight to the driver
__for each MSI interrupt__. Rather than having a separate interrupt
line -- really an ugly OOB mechanism -- you get a bus transaction as God
intended, a bus transaction just like all the others going across the
PCI bus.
Let's illustrate with a real world example, with hardware you probably
already have in your hands today.
Download the AHCI 1.1 SATA controller specification from
http://www.intel.com/technology/serialata/ahci.htm
and check out Section 2.3 and MSI-related bits of Section 10.6.2 for the
usage of those PCI MSI registers on the PCI device.
An AHCI PCI device uses MSI messages to inform the driver which <mask>
of 32 SATA ports have asserted an activity indication.
This MSI message varies _for each interrupt_, and replaces the standard
driver idiom of reading a hardware Interrupt-Status register.
Thus you can see increased performance with MSI messages because the
hardware "pushes" useful information to the driver, using an in-band
mechanism (PCI bus transaction) rather than an out-of-band mechanism ($N
SATA ports sharing a single interrupt line).
This is the reverse of the standard model, where the driver receives the
knowledge "your interrupt line asserted... maybe" and it must deduce
activity from there by reading an Interrupt-Status register.
That is one fundamental of MSI messages: they carry data. To
illustrate with "kernel pseudocode", this equates to

	irqreturn_t irq_handler(int irq, void *dev_id,
				const void *metadata,
				size_t metadata_len)

You have a fundamentally new model for interrupt handling with MSI...
You are no longer managing an interrupt line that is asserted and
cleared. It is now an asynchronous flow of data blobs from hardware to
various per-driver "mailboxes".
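
To make the contrast between the two models concrete, here is a small
userspace sketch in C. It is not kernel code and not how any real AHCI
driver is written; struct fake_hba, legacy_handler(), message_handler()
and ports_pending() are hypothetical names invented for illustration,
and the metadata/metadata_len arguments mirror the pseudocode above
rather than any existing kernel API. It assumes the per-interrupt
payload is a single 32-bit port mask.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device model: up to 32 ports, one pending bit per port. */
struct fake_hba {
	uint32_t irq_status;     /* stands in for an Interrupt-Status register */
	unsigned int mmio_reads; /* counts simulated register reads */
};

/*
 * Line-based model: the handler only learns "the line asserted", so it
 * has to pull the port mask out of the device and acknowledge it.
 */
static uint32_t legacy_handler(struct fake_hba *hba)
{
	uint32_t mask = hba->irq_status; /* simulated register read */

	hba->mmio_reads++;
	hba->irq_status = 0;             /* simulated acknowledge */
	return mask;
}

/*
 * Message-based model (assumed, per the pseudocode): the port mask is
 * pushed to the handler as payload, so no device register is touched.
 * A missing or wrongly sized payload is treated as "no ports pending".
 */
static uint32_t message_handler(const void *metadata, size_t metadata_len)
{
	uint32_t mask = 0;

	if (metadata != NULL && metadata_len == sizeof(mask))
		memcpy(&mask, metadata, sizeof(mask));
	return mask;
}

/* Count how many ports a mask reports as pending. */
static unsigned int ports_pending(uint32_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}
```

In this sketch the only difference between the two handlers is where
the mask comes from: legacy_handler() costs one simulated register
read per interrupt, while message_handler() works purely from the blob
it was handed.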
Jeff