From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Matthew Wilcox <matthew@wil.cx>
Cc: linux-pci@vger.kernel.org,
	Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>,
	Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	David Miller <davem@davemloft.net>,
	Dan Williams <dan.j.williams@intel.com>,
	Martine.Silbermann@hp.com, linux-kernel@vger.kernel.org,
	Michael Ellerman <michaele@au1.ibm.com>
Subject: Re: Multiple MSI
Date: Thu, 03 Jul 2008 13:24:29 +1000
Message-ID: <1215055469.21182.70.camel@pasglop>
In-Reply-To: <20080703024445.GA14894@parisc-linux.org>

On Wed, 2008-07-02 at 20:44 -0600, Matthew Wilcox wrote:
> At the moment, devices with the MSI-X capability can request multiple
> interrupts, but devices with MSI can only request one.  This isn't an
> inherent limitation of MSI, it's just the way that Linux currently
> implements it.  I intend to lift that restriction, so I'm throwing out
> some ideas that I've had while looking into it.

Interesting. I've been thinking about that one for some time, but back
then the feedback I got left and right was that nobody cared :-)

I'm adding Michael Ellerman to the CC list, he's done a good part of the
PowerPC MSI stuff.

> First, architectures need to support MSI, and I'm ccing the people who
> seem to have done the work in the past to keep them in the loop.  I do
> intend to make supporting multiple MSIs optional (the midlayer code will
> fall back to supporting only a single MSI).

Ok.

> Next, MSI requires that you assign a block of interrupts that is a power
> of two in size (between 2^0 and 2^5), and aligned to at least that power
> of two.  I've looked at the x86 code and I think this is doable there
> [1]. I don't know how doable it is on other architectures.  If not, just
> ignore all this and continue to have MSI hardware use a single interrupt.

Well, the hardware requires that for HW interrupt numbers. But I don't
think the API should require it (i.e. for driver-visible IRQ numbers).
Some architectures can fully remap between HW sources and "Linux"-visible
IRQ numbers and thus wouldn't have that limitation from an API point of
view.
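
To illustrate the remapping point, here is a minimal userspace sketch, not
kernel code. The names virq_to_hw, virq_init, virq_map and NR_VIRQS are
hypothetical and only model the idea: HW sources can form an aligned
block while the driver-visible numbers are handed out from whatever
happens to be free.

```c
#include <assert.h>

#define NR_VIRQS 64	/* hypothetical size of the driver-visible IRQ space */

/* virq_to_hw[v] holds the HW source mapped to Linux IRQ v, or -1 if free. */
static int virq_to_hw[NR_VIRQS];

/* Mark every driver-visible IRQ number as unused. */
void virq_init(void)
{
	for (int v = 0; v < NR_VIRQS; v++)
		virq_to_hw[v] = -1;
}

/*
 * Map one HW source to the first free Linux IRQ number.
 * Returns the Linux IRQ number, or -1 when the table is full.
 * No alignment or contiguity is imposed on the returned numbers.
 */
int virq_map(int hwirq)
{
	assert(hwirq >= 0);
	for (int v = 0; v < NR_VIRQS; v++) {
		if (virq_to_hw[v] < 0) {
			virq_to_hw[v] = hwirq;
			return v;
		}
	}
	return -1;
}
```

In this model, HW sources 32 and 33 (an aligned block of two) can end up
behind Linux IRQ numbers 1 and 2 if number 0 is already taken, so the
alignment rule stays on the HW side only.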

> In a somewhat related topic, I really don't like the API for
> pci_enable_msix().  The all-or-nothing allocation and returning
> the number of vectors that could have been allocated is a bit kludgy,
> as is the existence of the msix_entry vector.  I'd like some advice on a
> couple of alternative schemes:
> 
> 1. pci_enable_msi_block(pdev, nr_irqs).  If successful, updates pdev->irq
> to be the base irq number; the allocated interrupts are from pdev->irq
> to pdev->irq + nr_irqs - 1.  If it fails, return the number of
> interrupts that could have been allocated.

That would constrain the Linux IRQ numbers to be a linear block just
like the HW numbers. That's better than requiring them to be
power-of-two aligned, but it's still a restriction on SW number
allocation, though probably not as bad as the underlying HW limitation.

> 2. pci_enable_msi_block(pdev, nr_irqs, min_irqs).  Will allocate at
> least min_irqs or return failure, otherwise same as above.

I prefer 2.
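
A rough userspace model of how option 2 could behave, under stated
assumptions. Nothing here is the real PCI API: struct fake_pdev,
model_enable_msi_block, and the extra 'base' and 'available' parameters
are hypothetical stand-ins for the device and for what the platform can
provide, and the return convention (count granted, or -1) is my guess at
one reasonable reading of the proposal.

```c
#include <assert.h>

/* Hypothetical stand-in for the one pci_dev field the proposal touches. */
struct fake_pdev {
	int irq;	/* base IRQ number, updated only on success */
};

/*
 * Model of "allocate at least min_irqs or return failure".
 * 'base' is the first IRQ number the platform would hand out and
 * 'available' is how many it can provide; both are assumptions of
 * this sketch. Returns the number of IRQs granted (between min_irqs
 * and nr_irqs), or -1 if fewer than min_irqs are available.
 */
int model_enable_msi_block(struct fake_pdev *pdev, int base, int available,
			   int nr_irqs, int min_irqs)
{
	int granted;

	assert(min_irqs >= 1 && min_irqs <= nr_irqs);

	granted = nr_irqs < available ? nr_irqs : available;
	if (granted < min_irqs)
		return -1;	/* cannot satisfy the minimum: leave pdev alone */

	pdev->irq = base;	/* IRQs are base .. base + granted - 1 */
	return granted;
}
```

With this shape a driver asking for 8 with a minimum of 2 gets 4 when
only 4 are available, and a clean failure when only 1 is, without having
to retry in a loop.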

> My design is largely influenced by the AHCI spec where the device can
> potentially cope with any number of MSI interrupts allocated and will
> use them as best it can.  I don't know how common that is.
> 
> One thing I do want to be clear in the API is that the driver can ask
> for any number of irqs, the pci layer will round up to the next power of
> two if necessary.

Well, that's where I'm not happy. The API shouldn't expose the
"power-of-two" thing. The numbers shown to drivers aren't in the same
space as the source numbers as seen by the HW on many architectures and
thus don't need to have the same constraints.
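
As a sketch of keeping the power-of-two rule below the API, the HW-side
arithmetic could look roughly like this. The function names
msi_hw_block_size and msi_hw_align_base are hypothetical, and this is
plain userspace C rather than kernel code. The driver would only ever
see the count it asked for.

```c
#include <assert.h>

#define MSI_MAX_VECTORS 32	/* MSI allows blocks of 2^0 .. 2^5 vectors */

/* Round a requested count up to the HW block size (next power of two). */
unsigned int msi_hw_block_size(unsigned int nr_irqs)
{
	unsigned int size = 1;

	assert(nr_irqs >= 1 && nr_irqs <= MSI_MAX_VECTORS);
	while (size < nr_irqs)
		size <<= 1;
	return size;
}

/*
 * Return the lowest HW number >= start that is aligned to 'size'.
 * 'size' must be a power of two.
 */
unsigned int msi_hw_align_base(unsigned int start, unsigned int size)
{
	assert(size != 0 && (size & (size - 1)) == 0);
	return (start + size - 1) & ~(size - 1);
}
```

A request for 3 interrupts would then consume a HW block of 4 starting
at a multiple of 4, while the driver is still told about 3.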


> I don't quite understand how IRQ affinity will work yet.  Is it feasible
> to redirect one interrupt from a block to a different CPU?  I don't even
> understand this on x86-64, let alone the other four architectures.  I'm
> OK with forcing all MSIs in the same block to move with the one that was
> assigned a new affinity if that's the way it has to be done.

It's very implementation specific. For example, on most powerpc
implementations, MSIs just route via a decoder to sources of the existing
interrupt controller, so we can control per-source affinity at that
level. Some x86 implementations seem to require different base addresses,
which I believe makes it mostly impossible to spread them (maybe that's
why people came up with MSI-X?)

> I'll leave it at that for now.  I do have some other thoughts and a
> half-baked implementation, but this should be enough to be going along
> with.
> 
> [1] The current scheme for assigning vectors on x86-64 will tend to
> fragment the space.  However, the number of interrupts actually requested
> on desktop-sized machines remains relatively small in comparison to the
> number of vectors available, and it is to be hoped that more and more
> devices will use MSI anyway.
> 

Cheers,
Ben.




Thread overview: 43+ messages
2008-07-03  2:44 Multiple MSI Matthew Wilcox
2008-07-03  3:24 ` Benjamin Herrenschmidt [this message]
2008-07-03  3:59   ` Matthew Wilcox
2008-07-03  4:41     ` Benjamin Herrenschmidt
2008-07-03  6:44       ` Michael Ellerman
2008-07-03  9:10         ` Arnd Bergmann
2008-07-03  9:17           ` Benjamin Herrenschmidt
2008-07-03 11:31             ` Matthew Wilcox
2008-07-03 11:41               ` Benjamin Herrenschmidt
2008-07-04  1:52                 ` Michael Ellerman
2008-07-04  8:08                   ` Alan Cox
2008-07-03 11:34       ` Matthew Wilcox
2008-07-07 16:17   ` Grant Grundler
2008-07-07 16:39     ` Matthew Wilcox
2008-07-07 16:51       ` Grant Grundler
2008-07-07 23:06     ` Benjamin Herrenschmidt
2008-07-10  0:55     ` Michael Ellerman
2008-07-05 13:27 ` Matthew Wilcox
2008-07-05 13:34   ` [PATCH 1/4] PCI MSI: Store the number of messages in the msi_desc Matthew Wilcox
2008-07-07  2:05     ` Michael Ellerman
2008-07-07  2:41       ` Matthew Wilcox
2008-07-07  3:26         ` Benjamin Herrenschmidt
2008-07-07  3:48         ` Michael Ellerman
2008-07-07 12:04           ` Matthew Wilcox
2008-07-07 16:02             ` Grant Grundler
2008-07-07 16:19               ` Matthew Wilcox
2008-07-10  1:32             ` Michael Ellerman
2008-07-10  1:35               ` Matthew Wilcox
2008-07-05 13:34   ` [PATCH 2/4] PCI: Support multiple MSI Matthew Wilcox
2008-07-07  2:05     ` Michael Ellerman
2008-07-07  2:45       ` Matthew Wilcox
2008-07-07  3:56         ` Michael Ellerman
2008-07-07 11:31           ` Matthew Wilcox
2008-07-10  1:32             ` Michael Ellerman
2008-07-10  1:43               ` Matthew Wilcox
2008-07-10  4:00                 ` Michael Ellerman
2008-07-05 13:34   ` [PATCH 3/4] AHCI: Request multiple MSIs Matthew Wilcox
2008-07-07 16:45     ` Grant Grundler
2008-07-07 17:48       ` Matthew Wilcox
2008-07-20  7:49         ` Grant Grundler
2008-07-05 13:34   ` [PATCH 4/4] x86-64: Support for " Matthew Wilcox
2008-07-05 13:43   ` Multiple MSI Matthew Wilcox
2008-07-05 22:38     ` Matthew Wilcox
