On 2013/3/6 11:54, Michael Ellerman wrote:
On Tue, Mar 05, 2013 at 03:19:57PM +0800, Mike Qiu wrote:
On 2013/3/5 10:23, Michael Ellerman wrote:
On Tue, Jan 15, 2013 at 03:38:55PM +0800, Mike Qiu wrote:
Add a function irq_create_mapping_many(), which can associate
multiple MSIs with a contiguous irq mapping.

This is needed to enable multiple MSI support for pSeries.

Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
---
 include/linux/irq.h       |    2 +
 include/linux/irqdomain.h |    3 ++
 kernel/irq/irqdomain.c    |   61 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 66 insertions(+), 0 deletions(-)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 60ef45b..e00a7ec 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -592,6 +592,8 @@ int __irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
 #define irq_alloc_desc_from(from, node)		\
 	irq_alloc_descs(-1, from, 1, node)
+#define irq_alloc_desc_n(nevc, node)		\
+	irq_alloc_descs(-1, 0, nevc, node)
This has been superseded by irq_alloc_descs_from(), which is the right
way to do it.

      
Yes, but irq_alloc_descs_from() is just for 1 irq.
No it's not, look again.

#define irq_alloc_descs_from(from, cnt, node)   \
	irq_alloc_descs(-1, from, cnt, node)
Sorry, I misread it as irq_alloc_desc_from(from, node).
You are right.


diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 96f3a1d..38648e6 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -636,6 +636,67 @@ int irq_create_strict_mappings(struct irq_domain *domain, unsigned int irq_base,
 }
 EXPORT_SYMBOL_GPL(irq_create_strict_mappings);
+/**
+ * irq_create_mapping_many - Map a range of hw IRQs to a range of virtual IRQs
+ * @domain: domain owning the interrupt range
+ * @hwirq_base: beginning of continuous hardware IRQ range
+ * @count: Number of interrupts to map

      
For multiple-MSI the allocated interrupt numbers must be a power-of-2,
and must be naturally aligned. I don't /think/ that's a requirement for
the virtual numbers, but it's probably best that we do it anyway.

So this API needs to specify that it will give you back a power-of-2
block that is naturally aligned - otherwise you can't use it for MSI.

      
rtas_call will return the number of hardware interrupts, and that
should be a power-of-2, so I think this does not need to be specified.
You're confusing hardware interrupt numbers and virtual interrupt
numbers. My comment is about irq_create_mapping_many(), which returns
virtual interrupt numbers.

As I said I don't think there is a requirement that the virtual
interrupt numbers are also a power-of-2 naturally aligned block, but we
should allocate them as one anyway, to avoid any issues in future.
But the virtual interrupt numbers should also be a power-of-2 naturally
aligned block, because the block must be contiguous, as MSI-HOWTO.txt says:

    4.2.2 pci_enable_msi_block
    int pci_enable_msi_block(struct pci_dev *dev, int count) 
    This variation on the above call allows a device driver to request
    multiple MSIs.  The MSI specification only allows interrupts to be
    allocated in powers of two, up to a maximum of 2^5 (32).
    If this function returns 0, it has succeeded in allocating at least
    as many interrupts as the driver requested
    (it may have allocated more in order to satisfy the power-of-two
    requirement). In this case, the function enables MSI on this device
    and updates dev->irq to be the lowest of the new interrupts
    assigned to it. The other interrupts assigned to the device are in
    the range dev->irq to dev->irq + count - 1.

See the last line: it means the virtual interrupts must be a
contiguous block.
And so this API, which returns virtual interrupt numbers, must satisfy
that specification.
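
For illustration only, the power-of-2 and natural-alignment constraint
discussed above can be sketched in plain user-space C. This is not kernel
code and not part of the patch; the helper names msi_block_size(),
msi_block_aligned() and msi_block_align_up() are hypothetical, and the
upper bound of 32 is taken from the MSI-HOWTO.txt text quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: round a requested vector count (1..32) up to the
 * next power of two, as the MSI specification requires.
 */
static unsigned int msi_block_size(unsigned int count)
{
	unsigned int size = 1;

	assert(count >= 1 && count <= 32);
	while (size < count)
		size <<= 1;
	return size;
}

/*
 * Hypothetical helper: a block is naturally aligned when its base is a
 * multiple of its (power-of-two) size.
 */
static bool msi_block_aligned(unsigned int base, unsigned int size)
{
	return (base & (size - 1)) == 0;
}

/*
 * Hypothetical helper: lowest naturally aligned base at or above 'from'
 * for a block of the given power-of-two size.
 */
static unsigned int msi_block_align_up(unsigned int from, unsigned int size)
{
	return (from + size - 1) & ~(size - 1);
}
```

With these helpers, a request for 3 vectors becomes a block of 4, and a
search starting at number 17 would begin at 20. An allocator that hands
back virtual interrupt numbers under this rule gives the driver the
range base to base + size - 1 as one contiguous block.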

+	/* Look for default domain if necessary */
+	if (!domain)
+		domain = irq_default_domain;
+	if (!domain) {
+		pr_warn("irq_create_mapping called for NULL domain, hwirq=%lx\n"
+			, hwirq_base);
+		WARN_ON(1);
+		return 0;
+	}
+	pr_debug("-> using domain @%p\n", domain);
+
+	/* For IRQ_DOMAIN_MAP_LEGACY, get the first virtual interrupt number */
+	if (domain->revmap_type == IRQ_DOMAIN_MAP_LEGACY)
+		return irq_domain_legacy_revmap(domain, hwirq_base);
The above doesn't work.
Why doesn't it work?
Because irq_domain_legacy_revmap() only allocates a single interrupt
number.
OK, you're right.

      
+	/* Check if mapping already exists */
+	for (i = 0; i < count; i++) {
+		virq = irq_find_mapping(domain, hwirq_base+i);
+		if (virq) {
+			pr_debug("existing mapping on virq %d,"
+					" now dispose it first\n", virq);
+			irq_dispose_mapping(virq);

      
You might have just disposed of someone else's mapping; we shouldn't do
that. It should be an error to the caller.

      
It's a good question. If the interrupt is in use by someone else, why
can I obtain it from the system?
I agree, that would be a bug. But disposing of someone else's mapping is
not OK.

So maybe someone else forgot to dispose of the mapping, and I think they
will never use it again, since I have been given the interrupt.
Perhaps, but that is a bug that needs to be fixed in the code that
forgets to dispose of the mapping.
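
As a rough sketch of the behaviour suggested above (fail instead of
disposing), here is a user-space toy model. It is not the kernel API:
toy_revmap, toy_find_mapping() and toy_map_range() are hypothetical
stand-ins, and the choice of -EBUSY as the error code is an assumption.

```c
#include <assert.h>
#include <errno.h>

#define TOY_NR_HWIRQS 64

/* Toy reverse map: toy_revmap[hwirq] holds the virq, 0 means unmapped. */
static unsigned int toy_revmap[TOY_NR_HWIRQS];

/* Hypothetical stand-in for a lookup such as irq_find_mapping(). */
static unsigned int toy_find_mapping(unsigned int hwirq)
{
	return hwirq < TOY_NR_HWIRQS ? toy_revmap[hwirq] : 0;
}

/*
 * Map hwirq_base..hwirq_base+count-1 to virq_base..virq_base+count-1.
 * If any hwirq in the range is already mapped, return -EBUSY and leave
 * every existing mapping untouched, so the caller sees the conflict.
 */
static int toy_map_range(unsigned int hwirq_base, unsigned int virq_base,
			 unsigned int count)
{
	unsigned int i;

	if (count == 0 || virq_base == 0 || hwirq_base >= TOY_NR_HWIRQS ||
	    count > TOY_NR_HWIRQS - hwirq_base)
		return -EINVAL;

	/* First pass: only check, never dispose of anything. */
	for (i = 0; i < count; i++)
		if (toy_find_mapping(hwirq_base + i))
			return -EBUSY;

	/* Second pass: the whole range is free, so install the mappings. */
	for (i = 0; i < count; i++)
		toy_revmap[hwirq_base + i] = virq_base + i;

	return 0;
}
```

Checking the whole range before touching anything means a conflict
leaves the table exactly as it was, and a stale mapping left behind by
other code shows up as an error where it can be tracked down.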

cheers