linux-kernel.vger.kernel.org archive mirror
* Multiple MSI messages support
@ 2007-10-30 21:13 Shawn Jin
  2007-10-30 22:23 ` Jeff Garzik
  0 siblings, 1 reply; 8+ messages in thread
From: Shawn Jin @ 2007-10-30 21:13 UTC
  To: linux-kernel

Hi,

I apologize in advance if this is off-topic here, but I cannot think
of a better place to ask this particular question.

I understand that the current PCI subsystem or linux kernel (x86)
supports only one message when MSI is enabled even for devices having
multiple MSI messages. But why? Is this a limitation solely due to the
OS or due to the x86 APIC?

I know the current linux kernel (2.6.23) changed MSI message data
format a little bit to support other architectures. But some older
version (e.g. 2.6.18) defined a specific format for the MSI msg data
in a way that 8 bits contain the irq number and the other 8 bits have
the interrupt attributes, which is x86 specific. Why does the msg data
need to contain the irq number? Here is my hypothetical explanation.
The device writes the MSI msg data to the specified MSI msg address,
and the APIC uses the irq number in the msg data to generate the
appropriate interrupt, which in turn invokes the appropriate ISR. A
device having multiple MSI messages typically encodes which message it
is sending in the msg data field. For example, if the system (or OS)
configures the MSI msg data as 0x5000, a device having 4 MSI messages
could write 0x5000, 0x5001, 0x5002, 0x5003 to differentiate the MSI
messages. However, this cannot work with the APIC because of the way
the APIC asserts interrupts, as I described above (if my understanding
is correct).
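
The multi-message numbering described above can be sketched in C. This
is a minimal illustration, not kernel code: the helper name
msi_message_data is hypothetical, and it assumes the PCI rule that a
function with N messages enabled (N a power of two, at most 32) may
modify only the low log2(N) bits of the message data that software
programmed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the data word a device would write for message
 * `index` when `count` messages are enabled and software programmed
 * `base` into the single Message Data register.
 */
static uint16_t msi_message_data(uint16_t base, unsigned count, unsigned index)
{
    /* count must be a power of two in the range 1..32 */
    assert(count >= 1 && count <= 32 && (count & (count - 1)) == 0);
    assert(index < count);

    uint16_t mask = (uint16_t)(count - 1);  /* low bits the device may change */

    /* Clear the device-owned low bits of the base, then place the index there. */
    return (uint16_t)((base & ~mask) | index);
}
```

With base 0x5000 and four messages enabled, indexes 0 to 3 give 0x5000,
0x5001, 0x5002 and 0x5003, matching the example in the text.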

Hence my answer to the question is that this is due to the x86 APIC. For
other architectures such as powerpc this is probably not a problem
since the interrupt controller is different. Am I correct?

TIA.
-Shawn.


* Re: Multiple MSI messages support
  2007-10-30 21:13 Multiple MSI messages support Shawn Jin
@ 2007-10-30 22:23 ` Jeff Garzik
  2007-10-30 22:51   ` Roland Dreier
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Garzik @ 2007-10-30 22:23 UTC
  To: Shawn Jin; +Cc: linux-kernel

Shawn Jin wrote:
> Hi,
> 
> If this is really off-topic here, I apologize first. But I cannot
> think a better place to ask this particular question.
> 
> I understand that the current PCI subsystem or linux kernel (x86)
> supports only one message when MSI is enabled even for devices having
> multiple MSI messages. But why? Is this a limitation solely due to the
> OS or due to the x86 APIC?
> 
> I know the current linux kernel (2.6.23) changed MSI message data
> format a little bit to support other architectures. But some older
> version (e.g. 2.6.18) defined a specific format for the MSI msg data
> in a way that 8 bits contain the irq number and the other 8 bits have
> the interrupt attributes, which is x86 specific. Why does the msg data
> need to contain the irq number? Here is my hypothetic explanation. The
> device writes the MSI msg data to the specified MSI msg address. And
> APIC uses the irq number in the msg data to generate appropriate
> interrupt, which of course results in an appropriate ISR invoked. A
> device having multiple MSI messages typically appends some information
> of which MSI message to the msg data field. For example, if the system
> (or OS) configures the MSI msg data as 0x5000, a device having 4 MSI
> messages could write 0x5000, 0x5001, 0x5002, 0x5003 to differentiate
> the MSI messages. However this cannot work with the APIC due to the
> way how APIC asserts interrupts as I described above (if my
> understanding is correct).
> 
> Hence my answer to the question is this is due to the x86 APIC. For
> other architectures such as powerpc this is probably not a problem
> since the interrupt controller is different. Am I correct?

IMO it's more like there has never been enough need for anybody to look 
into it, I bet...

The way drivers are written, you typically must touch a few key 
hardware registers _anyway_, so the multiple messages in practice are 
not much more useful than the simple fact that your MSI irq handler 
function was called (with all that indicates and implies).

	Jeff




* Re: Multiple MSI messages support
  2007-10-30 22:23 ` Jeff Garzik
@ 2007-10-30 22:51   ` Roland Dreier
  2007-10-30 23:02     ` David Miller
                       ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Roland Dreier @ 2007-10-30 22:51 UTC
  To: Jeff Garzik; +Cc: Shawn Jin, linux-kernel

 > > If this is really off-topic here, I apologize first. But I cannot
 > > think a better place to ask this particular question.
 > > I understand that the current PCI subsystem or linux kernel (x86)
 > > supports only one message when MSI is enabled even for devices having
 > > multiple MSI messages. But why? Is this a limitation solely due to the
 > > OS or due to the x86 APIC?
 > > I know the current linux kernel (2.6.23) changed MSI message data
 > > format a little bit to support other architectures. But some older
 > > version (e.g. 2.6.18) defined a specific format for the MSI msg data
 > > in a way that 8 bits contain the irq number and the other 8 bits have
 > > the interrupt attributes, which is x86 specific. Why does the msg data
 > > need to contain the irq number? Here is my hypothetic explanation. The
 > > device writes the MSI msg data to the specified MSI msg address. And
 > > APIC uses the irq number in the msg data to generate appropriate
 > > interrupt, which of course results in an appropriate ISR invoked. A
 > > device having multiple MSI messages typically appends some information
 > > of which MSI message to the msg data field. For example, if the system
 > > (or OS) configures the MSI msg data as 0x5000, a device having 4 MSI
 > > messages could write 0x5000, 0x5001, 0x5002, 0x5003 to differentiate
 > > the MSI messages. However this cannot work with the APIC due to the
 > > way how APIC asserts interrupts as I described above (if my
 > > understanding is correct).
 > > Hence my answer to the question is this is due to the x86 APIC. For
 > > other architectures such as powerpc this is probably not a problem
 > > since the interrupt controller is different. Am I correct?

 > IMO it's more like there has never been enough need for anybody to
 > look into it, I bet...

Actually the original poster's explanation is pretty accurate.  MSI
imposes a restriction on how a device generates multiple messages,
since there is only one message data register and the rest of the
messages must be based on that in a simple way.  And the format of
"Intel-style" APIC MSI messages does not match up with the way the PCI
spec generates multiple messages.
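
One way to see the mismatch, as a hedged sketch: on Intel-style APICs
the low 8 bits of the MSI data word carry the interrupt vector, so a
device that rewrites the low bits of the data is in effect choosing
among adjacent vectors. Under that assumption, N messages would need N
free vectors in a contiguous block aligned on N, all delivered to the
same destination. The helper names below are hypothetical and the
256-entry table is a simplification; this is not how any particular
kernel allocates vectors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS           256u
#define MSI_DATA_VECTOR_MASK 0x00ffu  /* bits 7:0 of the data word: vector */

/* Vector the APIC would raise for a given MSI data word. */
static unsigned msi_data_vector(uint16_t data)
{
    return data & MSI_DATA_VECTOR_MASK;
}

/*
 * Return the first vector of a free block of `count` vectors that is
 * aligned on `count` (a power of two, 1..32), or -1 if none exists.
 */
static int find_aligned_vector_block(const bool used[NR_VECTORS], unsigned count)
{
    assert(count >= 1 && count <= 32 && (count & (count - 1)) == 0);

    /* Step by `count` so every candidate base is naturally aligned. */
    for (unsigned base = 0; base + count <= NR_VECTORS; base += count) {
        bool all_free = true;

        for (unsigned i = 0; i < count; i++) {
            if (used[base + i]) {
                all_free = false;
                break;
            }
        }
        if (all_free)
            return (int)base;
    }
    return -1;
}
```

A single message only needs any one free vector, which is why the
one-message case is easy to support while the multi-message case
constrains the allocator.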

In fact at least IBM pSeries boxes seem to be able to use MSI to
generate multiple interrupts from the same device -- BenH can probably
give details.

Multiple interrupt messages are supported by Linux via MSI-X (which
allows each message to have completely different data).
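
For contrast, a rough sketch of why MSI-X does not have this
constraint: each MSI-X table entry holds its own address and data, so
software can give every message an unrelated vector. The struct below
mirrors the 16-byte table entry layout from the PCI specification, but
the field and function names are made up for illustration, and the
table lives in ordinary memory here rather than in a device BAR.

```c
#include <assert.h>
#include <stdint.h>

/* One MSI-X table entry: four 32-bit words, 16 bytes in total. */
struct msix_table_entry {
    uint32_t msg_addr_lo;  /* message address, low 32 bits */
    uint32_t msg_addr_hi;  /* message address, high 32 bits */
    uint32_t msg_data;     /* message data, independent per entry */
    uint32_t vector_ctrl;  /* bit 0 masks this entry */
};

#define MSIX_CTRL_MASK 0x1u  /* illustrative name for the per-vector mask bit */

/* Program one entry; the entry is masked while its fields change. */
static void msix_program_entry(struct msix_table_entry *e,
                               uint64_t addr, uint32_t data)
{
    assert(e);

    e->vector_ctrl |= MSIX_CTRL_MASK;
    e->msg_addr_lo = (uint32_t)(addr & 0xffffffffu);
    e->msg_addr_hi = (uint32_t)(addr >> 32);
    e->msg_data    = data;
    e->vector_ctrl &= ~MSIX_CTRL_MASK;
}
```

Because nothing ties one entry's data to another's, no aligned block of
vectors is required.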

 > The way drivers are written, you are typically must touch a few key
 > hardware registers _anyway_, so the multiple messages in practice are
 > not much more useful than the simple fact that your MSI irq handler
 > function was called (with all that indicates and implies).

For high-performance devices this is not really true.  The InfiniBand
mthca driver can use MSI to get a single interrupt, or MSI-X to get
different interrupts for different types of events.  MSI-X allows the
read of the "interrupt cause register" to be avoided, and this ends up
making a measurable performance difference for the fast path.
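
The saving can be illustrated with a toy model. This is not the mthca
driver's code: struct fake_dev, the event bits and the handler names
are all invented for the example, and a counter stands in for the cost
of an MMIO read across the bus.

```c
#include <assert.h>
#include <stdint.h>

#define EV_CMD_DONE 0x1u  /* hypothetical event bits */
#define EV_ASYNC    0x2u

struct fake_dev {
    uint32_t cause;        /* simulated interrupt cause register */
    unsigned mmio_reads;   /* how many times the register was read */
    unsigned cmd_events;
    unsigned async_events;
};

/* Simulated MMIO read of a read-to-clear cause register. */
static uint32_t read_cause_register(struct fake_dev *dev)
{
    assert(dev);

    uint32_t cause = dev->cause;
    dev->mmio_reads++;
    dev->cause = 0;
    return cause;
}

/* Single interrupt: the handler must ask the device what happened. */
static void single_irq_handler(struct fake_dev *dev)
{
    uint32_t cause = read_cause_register(dev);

    if (cause & EV_CMD_DONE)
        dev->cmd_events++;
    if (cause & EV_ASYNC)
        dev->async_events++;
}

/* One vector per event type: which handler ran already says what happened. */
static void cmd_irq_handler(struct fake_dev *dev)
{
    dev->cmd_events++;
}

static void async_irq_handler(struct fake_dev *dev)
{
    dev->async_events++;
}
```

In the per-vector case the event type is implied by the handler that
was invoked, so the read counter stays at zero.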

Also, obviously, having multiple interrupts means you can bind
different interrupts to different CPUs, which will become very
important if/when things like multi-queue NICs are supported.
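
As a small sketch of the binding idea, assuming a hypothetical device
with one interrupt per queue: spread the queues round-robin over the
CPUs and build the single-CPU bitmask for each. On Linux such a mask is
what one would write, in hex, to /proc/irq/<irq>/smp_affinity. The
function names here are illustrative only, and the sketch is limited to
64 CPUs.

```c
#include <assert.h>
#include <stdint.h>

/* Pick a CPU for a queue by simple round-robin. */
static unsigned queue_to_cpu(unsigned queue, unsigned nr_cpus)
{
    assert(nr_cpus >= 1 && nr_cpus <= 64);
    return queue % nr_cpus;
}

/* Affinity bitmask with only the chosen CPU's bit set. */
static uint64_t queue_affinity_mask(unsigned queue, unsigned nr_cpus)
{
    return UINT64_C(1) << queue_to_cpu(queue, nr_cpus);
}
```

Round-robin is only one possible policy; it is used here because it is
the simplest way to show distinct interrupts landing on distinct CPUs.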

 - R.


* Re: Multiple MSI messages support
  2007-10-30 22:51   ` Roland Dreier
@ 2007-10-30 23:02     ` David Miller
  2007-10-31  1:34     ` Shawn Jin
  2007-10-31  4:00     ` Jeff Garzik
  2 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2007-10-30 23:02 UTC
  To: rdreier; +Cc: jeff, shawnxjin, linux-kernel

From: Roland Dreier <rdreier@cisco.com>
Date: Tue, 30 Oct 2007 15:51:08 -0700

> In fact at least IBM pSeries boxes seem to be able to use MSI to
> generate multiple interrupts from the same device -- BenH can probably
> give details.

Sparc64 PCI-E controllers can do this as well.

> For high-performance devices this is not really true.  The InfiniBand
> mthca driver can use MSI to get a single interrupt, or MSI-X to get
> different interrupts for different types of events.  MSI-X allows the
> read of the "interrupt cause register" to be avoided, and this ends up
> making a measurable performance difference for the fast path.
> 
> Also, obviously, having multiple interrupts means you can bind
> different interrupts to different CPUs, which will become very
> important if/when things like multi-queue NICs are supported.

Right, and this is used heavily in the NIU driver for example.


* Re: Multiple MSI messages support
  2007-10-30 22:51   ` Roland Dreier
  2007-10-30 23:02     ` David Miller
@ 2007-10-31  1:34     ` Shawn Jin
  2007-10-31  4:00     ` Jeff Garzik
  2 siblings, 0 replies; 8+ messages in thread
From: Shawn Jin @ 2007-10-31  1:34 UTC
  To: Roland Dreier; +Cc: Jeff Garzik, linux-kernel

On Oct 30, 2007 2:51 PM, Roland Dreier <rdreier@cisco.com> wrote:
>
>  > > I understand that the current PCI subsystem or linux kernel (x86)
>  > > supports only one message when MSI is enabled even for devices having
>  > > multiple MSI messages. But why? Is this a limitation solely due to the
>  > > OS or due to the x86 APIC?
>  > > I know the current linux kernel (2.6.23) changed MSI message data
>  > > format a little bit to support other architectures. But some older
>  > > version (e.g. 2.6.18) defined a specific format for the MSI msg data
>  > > in a way that 8 bits contain the irq number and the other 8 bits have
>  > > the interrupt attributes, which is x86 specific. Why does the msg data
>  > > need to contain the irq number? Here is my hypothetic explanation. The
>  > > device writes the MSI msg data to the specified MSI msg address. And
>  > > APIC uses the irq number in the msg data to generate appropriate
>  > > interrupt, which of course results in an appropriate ISR invoked. A
>  > > device having multiple MSI messages typically appends some information
>  > > of which MSI message to the msg data field. For example, if the system
>  > > (or OS) configures the MSI msg data as 0x5000, a device having 4 MSI
>  > > messages could write 0x5000, 0x5001, 0x5002, 0x5003 to differentiate
>  > > the MSI messages. However this cannot work with the APIC due to the
>  > > way how APIC asserts interrupts as I described above (if my
>  > > understanding is correct).
>  > > Hence my answer to the question is this is due to the x86 APIC. For
>  > > other architectures such as powerpc this is probably not a problem
>  > > since the interrupt controller is different. Am I correct?
>
>  > IMO it's more like there has never been enough need for anybody to
>  > look into it, I bet...
>
> Actually the original poster's explanation is pretty accurate.  MSI
> imposes a restriction on how a device generates multiple messages,
> since there is only one message data register and the rest of the
> messages must be based on that in a simple way.  And the format of
> "Intel-style" APIC MSI messages does not match up with the way the PCI
> spec generates multiple messages.
>
> In fact at least IBM pSeries boxes seem to be able to use MSI to
> generate multiple interrupts from the same device -- BenH can probably
> give details.
>
> Multiple interrupt messages are supported by Linux via MSI-X (which
> allows each message to have completely different data).

Thanks for confirming this. I found that Windows also allocates only
one MSI message for such a device, though I know nobody here is
interested in Windows. :-P This also confirms from another perspective
that supporting only one MSI message is actually due to the APIC
limitation.

>  > The way drivers are written, you are typically must touch a few key
>  > hardware registers _anyway_, so the multiple messages in practice are
>  > not much more useful than the simple fact that your MSI irq handler
>  > function was called (with all that indicates and implies).
>
> For high-performance devices this is not really true.  The InfiniBand
> mthca driver can use MSI to get a single interrupt, or MSI-X to get
> different interrupts for different types of events.  MSI-X allows the
> read of the "interrupt cause register" to be avoided, and this ends up
> making a measurable performance difference for the fast path.

I agree. Accessing HW registers across PCIe is undesirable since it
takes a relatively long time, especially for reads. In some
situations, if multiple MSI messages were supported, even a single
interrupt could handle multiple events and avoid reading a HW status
register to determine the type of event, since the message data
already carries that information. Hence the performance should improve
quite a bit.

-Shawn.


* Re: Multiple MSI messages support
  2007-10-30 22:51   ` Roland Dreier
  2007-10-30 23:02     ` David Miller
  2007-10-31  1:34     ` Shawn Jin
@ 2007-10-31  4:00     ` Jeff Garzik
  2007-10-31  5:05       ` Shawn Jin
  2 siblings, 1 reply; 8+ messages in thread
From: Jeff Garzik @ 2007-10-31  4:00 UTC
  To: Roland Dreier; +Cc: Shawn Jin, linux-kernel

Roland Dreier wrote:
> Multiple interrupt messages are supported by Linux via MSI-X (which

Absolutely, but the poster seemed to be talking about MSI, not MSI-X.

	Jeff




* Re: Multiple MSI messages support
  2007-10-31  4:00     ` Jeff Garzik
@ 2007-10-31  5:05       ` Shawn Jin
  2007-10-31  5:17         ` Roland Dreier
  0 siblings, 1 reply; 8+ messages in thread
From: Shawn Jin @ 2007-10-31  5:05 UTC
  To: Jeff Garzik; +Cc: Roland Dreier, linux-kernel

On Oct 30, 2007 8:00 PM, Jeff Garzik <jeff@garzik.org> wrote:
> Roland Dreier wrote:
> > Multiple interrupt messages are supported by Linux via MSI-X (which
>
> Absolutely, but the poster seemed to be talking about MSI, not MSI-X.

I guess Roland understood my intention. :-P

My interpretation of this statement is that you have to use MSI-X if
multiple MSI messages are required on Linux.

-Shawn.


* Re: Multiple MSI messages support
  2007-10-31  5:05       ` Shawn Jin
@ 2007-10-31  5:17         ` Roland Dreier
  0 siblings, 0 replies; 8+ messages in thread
From: Roland Dreier @ 2007-10-31  5:17 UTC
  To: Shawn Jin; +Cc: Jeff Garzik, linux-kernel

 > My interpretation of this statement is that you have to use MSIX if
 > multiple MSI messages are required on Linux.

Yes, Linux only supports multiple messages with MSI-X, not MSI.  This
is an architectural limitation of "Intel-style" interrupt controllers
(x86 and ia64).  It is also a limitation of Linux's MSI API, since
there are platforms and devices for which an OS could allocate
multiple MSI messages, but Linux has no interface that allows that.

The historical reason for the Linux interface is simply that Intel
platforms were the first place that MSI/MSI-X was supported.

 - R.

