From: ebiederm@xmission.com (Eric W. Biederman)
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
"H. Peter Anvin" <hpa@zytor.com>,
Jesse Barnes <jbarnes@virtuousgeek.org>,
Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org,
Andrew Vasquez <andrew.vasquez@qlogic.com>
Subject: Re: [PATCH] pci: change msi-x vector to 32bit
Date: Mon, 18 Aug 2008 12:59:58 -0700 [thread overview]
Message-ID: <m14p5i2iwx.fsf@frodo.ebiederm.org> (raw)
In-Reply-To: <1218928162.3940.62.camel@localhost.localdomain> (James Bottomley's message of "Sat, 16 Aug 2008 18:09:22 -0500")
James Bottomley <James.Bottomley@HansenPartnership.com> writes:
> On Sat, 2008-08-16 at 15:17 -0700, Yinghai Lu wrote:
>> On Sat, Aug 16, 2008 at 1:45 PM, James Bottomley
>> <James.Bottomley@hansenpartnership.com> wrote:
>> >> > What I still don't quite get is the benefit of large IRQ spaces ...
>> >> > particularly if you encode things the system doesn't really need to know
>> >> > in them.
>> >>
>> >> then set nr_irqs = nr_cpu_ids * NR_VECTORS))
>> >> and count down for msi/msi-x?
>> >
>> > No, what I mean is that msis can trip directly to CPUs, so this is an
>> > affinity thing (that MSI is directly bound to that CPU now), so in the
>> > matrixed way we display this in show_interrupts() with the CPU along the
>> > top and the IRQ down the side, it doesn't make sense to me to encode IRQ
>> > affinity in the irq number again. So it makes more sense to assign the
>> > vectors based on both the irq number and the CPU affinity so that if the
>> > PCI MSI for qla is assigned to CPU4 you can reassign it to CPU5 and so
>> > on.
>>
>> msi-x entry index, cpu_vector, irq number...
>>
>> you want to different cpus have same vector?
>
> Obviously I'm not communicating very well. Your apparent assumption is
> that irq number == vector.
Careful. There are two entities termed vector in this conversation.
There is the MSI-X vector which can hold up to 4096 entries per device.
There is the idt vector which has 256 entries per cpu.
> What I'm saying is that's not what we've
> done for individually vectored CPU interrupts in other architectures.
> In those we did (cpu no, irq) == vector. i.e. the affinity and the irq
> number identify the vector. For non-numa systems, this is effectively
> what you're interested in doing anyway. For numa systems, it just
> becomes a sparse matrix.
I believe assign_irq_vector on x86_64 and soon on x86_32 does this already.
The number that was being changed was the irq number of for the
msi-x ``vectors'' from some random free irq number to roughly
bus(8 bits):device+function(8 bits):msix-vector(12 bits) so that we
could have a stable irq number for msi irqs.
Once pci domain is considered it is hard to claim we have enough bits.
I expect we need at least pci domains to have one per NUMA node, in
the general case.
The big motivation for killing NR_IRQS sized arrays comes from 2 directions.
msi-x which allows up to 4096 irqs per device and nic vendors starting
to produce cards with 256 queues, and from large SGI systems that don't do
I/O and want to be supported with the same kernel build as smaller systems.
A kernel built to handle 4096*32 irqs which is more or less reasonable if
the system was I/O heavy is a ridiculously sized array on smaller machines.
So a static irq_desc is out. And since with the combination of msi-x hotplug
we can not tell how many irq sources and thus irq numbers the machine is going
to have we can not reasonably even have a dynamic array at boot time. Further
we also want to allocate the irq_desc entries in node-local memory on NUMA
machines for better performance. Which means we need to dynamically allocate
irq_desc entries and have some lookup mechanism from irq# to irq_desc entry.
So once we have all of that. It becomes possible to look at assigning a static
irq number to each pci (bus:device:function:msi-x vector) pair so the system
is more reproducible.
Eric
next prev parent reply other threads:[~2008-08-18 20:04 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-08-16 3:26 [PATCH] pci: change msi-x vector to 32bit H. Peter Anvin
2008-08-16 6:42 ` Yinghai Lu
2008-08-16 14:50 ` James Bottomley
2008-08-16 15:39 ` Alan Cox
2008-08-16 16:13 ` James Bottomley
2008-08-16 18:56 ` Yinghai Lu
2008-08-16 20:10 ` Andrew Vasquez
2008-08-16 20:25 ` James Bottomley
2008-08-16 20:34 ` Yinghai Lu
2008-08-16 20:45 ` James Bottomley
2008-08-16 22:17 ` Yinghai Lu
2008-08-16 23:09 ` James Bottomley
2008-08-16 23:21 ` Yinghai Lu
2008-08-18 19:59 ` Eric W. Biederman [this message]
2008-08-18 20:59 ` James Bottomley
2008-08-18 21:45 ` Eric W. Biederman
2008-08-18 22:04 ` James Bottomley
2008-08-18 21:51 ` Alan Cox
2008-08-18 22:13 ` H. Peter Anvin
2008-08-18 22:27 ` James Bottomley
2008-08-18 21:24 ` H. Peter Anvin
2008-08-16 8:17 ` Eric W. Biederman
2008-08-16 9:00 ` Yinghai Lu
-- strict thread matches above, loose matches on Subject: below --
2008-08-16 2:36 Yinghai Lu
2008-08-21 20:33 ` Jesse Barnes
2008-08-21 20:47 ` Eric W. Biederman
2008-08-21 23:07 ` Jesse Barnes
2008-08-22 0:11 ` Eric W. Biederman
2008-08-22 0:35 ` Jesse Barnes
2008-08-27 23:34 ` Jesse Barnes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m14p5i2iwx.fsf@frodo.ebiederm.org \
--to=ebiederm@xmission.com \
--cc=James.Bottomley@HansenPartnership.com \
--cc=akpm@linux-foundation.org \
--cc=alan@lxorguk.ukuu.org.uk \
--cc=andrew.vasquez@qlogic.com \
--cc=hpa@zytor.com \
--cc=jbarnes@virtuousgeek.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=tglx@linutronix.de \
--cc=yhlu.kernel@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.