From: Alan Mayer <ajm@sgi.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Cliff Wickman <cpw@sgi.com>,
jeremy@goop.org, rusty@rustcorp.com.au,
suresh.b.siddha@intel.com, mingo@elte.hu,
torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
Dean Nelson <dcn@sgi.com>
Subject: Re: [PATCH] x86_64: Dynamically allocate arch specific system vectors
Date: Mon, 04 Aug 2008 14:37:25 -0500
Message-ID: <48975A75.4010609@sgi.com>
In-Reply-To: <m11w18tmb1.fsf@frodo.ebiederm.org>
Eric W. Biederman wrote:
> Alan Mayer <ajm@sgi.com> writes:
>
>> Okay, I think we have it now. assign_irq_vector *almost* does what we need.
>> One minor thing is that assign_irq_vector ANDs against cpu_online_map. We would
>> need cpu_possible_map, so we get the vector on offline cpus that may come
>> online. The other thing is that assign_irq_vector doesn't allow the
>> specification of interrupt priorities. It would need to be modified to handle
>> returning either a high priority vector or a low priority vector. Would
>> modifying the api for assign_irq_vector be the proper approach?
>
> I don't know if it makes sense to modify assign_irq_vector or to
> have a companion function that uses the same data structures.
>
> I think I would work on the companion function and if the code
> can be made sufficiently similar merge the two functions.
>
Okay, if I understand you, here's what we can do. We currently have
this function that does pretty much what the combination of create_irq()
and __assign_irq_vector() does. We can accomplish the same thing that our
routine does using create_irq() and __assign_irq_vector() if we make
the following changes:

__assign_irq_vector(int irq, cpumask_t mask) ==>
__assign_irq_vector(int irq, cpumask_t mask, int priority);

priority has three values: priority_none, priority_low, priority_high

priority_none means do everything the way it is done now.
priority_low means do everything the way it is done now, except use
cpu_possible_map rather than cpu_online_map.
priority_high means search the interrupt vectors from the top down,
rather than from the bottom up, and use cpu_possible_map rather than
cpu_online_map.

create_irq(void) ==> create_irq(int priority, cpumask_t *mask)

priority_none means do everything the way it is done now, passing in
TARGET_CPUS as the mask, but also passing the priority arg. into
__assign_irq_vector().
priority_low and priority_high mean use create_irq()'s mask arg. as the
mask passed to __assign_irq_vector().

We would add an additional small routine on top of create_irq() to do
any massaging of the irq_desc, etc. that we need for these system vectors.

Is that what you were thinking about?
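
To make the proposed priority semantics concrete, here is a rough
user-space model of the search, not kernel code. Everything in it
(the model_* names, the MODEL_* constants, the bitmask standing in for
cpumask_t, the owner[][] table) is an assumption made for illustration.
It also leaves out details of the real allocator, such as per-cpu
allocation domains and how the starting vector is chosen.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins only; none of these are the kernel's definitions. */
#define MODEL_NR_CPUS      8
#define MODEL_FIRST_VECTOR 0x20   /* lowest vector this model hands out */
#define MODEL_NR_VECTORS   256

enum vector_priority { PRIORITY_NONE, PRIORITY_LOW, PRIORITY_HIGH };

typedef uint32_t model_cpumask_t;  /* one bit per cpu */

struct vector_model {
	model_cpumask_t online;    /* stands in for cpu_online_map */
	model_cpumask_t possible;  /* stands in for cpu_possible_map */
	int owner[MODEL_NR_CPUS][MODEL_NR_VECTORS]; /* owning irq, -1 if free */
};

void model_init(struct vector_model *m, model_cpumask_t online,
		model_cpumask_t possible)
{
	m->online = online;
	m->possible = possible;
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		for (int v = 0; v < MODEL_NR_VECTORS; v++)
			m->owner[cpu][v] = -1;
}

/* True if 'vector' is unused on every cpu in 'cpus'. */
static bool vector_free_on(const struct vector_model *m,
			   model_cpumask_t cpus, int vector)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if ((cpus & (1u << cpu)) && m->owner[cpu][vector] != -1)
			return false;
	return true;
}

/*
 * Returns the vector claimed for 'irq', or -1 if no vector is free on all
 * target cpus.  PRIORITY_NONE masks against the online cpus; the other two
 * mask against the possible cpus.  PRIORITY_HIGH searches from the top down.
 */
int model_assign_irq_vector(struct vector_model *m, int irq,
			    model_cpumask_t mask, enum vector_priority prio)
{
	model_cpumask_t cpus =
		mask & (prio == PRIORITY_NONE ? m->online : m->possible);
	int step = (prio == PRIORITY_HIGH) ? -1 : 1;
	int v = (prio == PRIORITY_HIGH) ? MODEL_NR_VECTORS - 1
					: MODEL_FIRST_VECTOR;

	if (!cpus)
		return -1;

	for (; v >= MODEL_FIRST_VECTOR && v < MODEL_NR_VECTORS; v += step) {
		if (!vector_free_on(m, cpus, v))
			continue;
		for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
			if (cpus & (1u << cpu))
				m->owner[cpu][v] = irq;
		return v;
	}
	return -1;
}
```

In this model a priority_high request claims the same vector on every
possible cpu, so a cpu that comes online later already has it reserved.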
--ajm
>> The interrupts don't necessarily fire on all cpus, it's just that they *can*
>> fire on any cpu. For example, the GRU triggers an interrupt (it is very
>> IPI'ish) to a particular cpu in the event of a GRU TLB fault. That cpu handles
>> the fault and returns. But the fault can happen on any cpu, so all cpus need to
>> be registered for the same vector and irq. This is probably splitting hairs; it
>> is certainly no different in principle from timer interrupts or processor TLB
>> faults.
>
> Reasonable. As long as you don't need to read a status register to figure
> out what to do that sounds reasonable. This does sound very much like
> splitting hairs on a very platform specific capability.
>
> If we can generalize the mechanism to things like per cpu timer
> interrupts and such so that we reduce the total amount of code we
> have to maintain I would find it a very compelling mechanism.
>
>> As far as kernel_stat is concerned, I see your point. NR_CPUS on our
>> machines is going to be big (4K? 8K? something like that). NR_IRQS is also
>> going to be big because of that. It's unfortunate since the actual number of
>> interrupt sources is going to be an order of magnitude smaller, at least.
>
> The number of interrupt sources is going to be smaller only because
> SGI machines have or at least appear to have poor I/O compared to most
> of the rest of machines in existence. NR_CPUS*16 is a fairly
> reasonable estimate on most machines in existence. In the short term
> it is going to get worse in the presence of MSI-X. I was talking to a
> developer at Intel last week about 256 irqs for one card. I keep
> having dreams about finding a way to just keep stats for a few cpus
> but alas I don't think that is going to happen. Silly us.
>
> Eric
>
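
To put rough numbers on the kernel_stat concern above, here is a
back-of-the-envelope calculation. It assumes a flat table with one
counter per (cpu, irq) pair, 4-byte counters, and NR_IRQS taken as
NR_CPUS*16 per the estimate quoted above. The helper name and the
layout are assumptions for illustration, not the actual kernel_stat
definition.

```c
#include <stdint.h>

/*
 * Bytes needed for one counter per (cpu, irq) pair, with the irq count
 * assumed to scale as nr_cpus * irqs_per_cpu.  Hypothetical helper.
 */
uint64_t irq_stat_bytes(uint64_t nr_cpus, uint64_t irqs_per_cpu,
			uint64_t counter_size)
{
	uint64_t nr_irqs = nr_cpus * irqs_per_cpu;

	return nr_cpus * nr_irqs * counter_size;
}
```

Under those assumptions, irq_stat_bytes(4096, 16, 4) comes to 2^30
bytes (1 GiB) and irq_stat_bytes(8192, 16, 4) comes to 2^32 bytes
(4 GiB), since the table grows with the square of NR_CPUS.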
--
It's getting to the point
Where I'm no fun anymore.
--
Alan J. Mayer
SGI
ajm@sgi.com
WORK: 651-683-3131
HOME: 651-407-0134
--