From: ebiederm@xmission.com (Eric W. Biederman)
To: Arjan van de Ven <arjan@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Dimitri Sivanich <sivanich@sgi.com>, Ingo Molnar <mingo@elte.hu>,
Suresh Siddha <suresh.b.siddha@intel.com>,
Yinghai Lu <yinghai@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Jesse Barnes <jbarnes@virtuousgeek.org>,
David Miller <davem@davemloft.net>,
Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>,
"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH v6] x86/apic: limit irq affinity
Date: Tue, 24 Nov 2009 09:41:18 -0800
Message-ID: <m1ws1f6csh.fsf@fess.ebiederm.org>
In-Reply-To: <20091124065022.6933be1a@infradead.org> (Arjan van de Ven's message of "Tue\, 24 Nov 2009 06\:50\:22 -0800")
Arjan van de Ven <arjan@infradead.org> writes:
> On Tue, 24 Nov 2009 14:55:15 +0100 (CET)
>> > Furthermore, the /sysfs topology information should include IRQ
>> > routing data in this case.
>>
>> Hmm, not sure about that. You'd need to scan through all the nodes to
>> find the set of CPUs where an irq can be routed to. I prefer to have
>> the information exposed by the irq enumeration (which is currently in
>> /proc/irq though).
>
> yes please.
>
> one device can have multiple irqs
> one irq can be servicing multiple devices
>
> expressing that in sysfs is a nightmare, while
> sticking it in /proc/irq *where the rest of the info is* is
> much nicer for apps like irqbalance
Oii.
I don't think it is bad to export information to applications like irqbalance.
I think it is pretty horrible that one of the standard ways I have heard
to improve performance on 10G nics is to kill irqbalance.
Guys. Migrating an irq from one cpu to another while the device is
running without dropping interrupts is hard.
At the point we start talking about limiting what a process with
CAP_SYS_ADMIN can do because it makes bad decisions I think something
is really broken.
Currently the irq code treats /proc/irq/N/smp_affinity as a strong hint
on where we would like interrupts to be delivered, and we don't have good
feedback from there to architecture specific code that knows what we really
can do. It is going to take some effort and some work to make that happen.
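To make the mask format concrete, here is a minimal userspace sketch of
reading an smp_affinity value and testing whether a cpu is in it. It assumes
the usual hex format with zero-padded, comma-separated 32-bit groups and at
most 64 cpus; parse_affinity_mask and cpu_in_mask are hypothetical helper
names, not kernel or irqbalance functions.

```c
#include <stdint.h>

/*
 * Parse a mask as printed in /proc/irq/N/smp_affinity, e.g. "0000000f" or
 * "00000001,00000000". Assumes zero-padded 8-digit groups and <= 64 cpus;
 * parsing stops at the first non-hex character (such as a newline).
 */
uint64_t parse_affinity_mask(const char *s)
{
    uint64_t mask = 0;

    for (; *s; s++) {
        unsigned digit;
        int c = *s;

        if (c == ',')
            continue;               /* group separator, carries no bits */
        if (c >= '0' && c <= '9')
            digit = (unsigned)(c - '0');
        else if (c >= 'a' && c <= 'f')
            digit = (unsigned)(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F')
            digit = (unsigned)(c - 'A' + 10);
        else
            break;
        mask = (mask << 4) | digit;
    }
    return mask;
}

/* Return 1 if the given cpu number is set in the mask, 0 otherwise. */
int cpu_in_mask(uint64_t mask, unsigned cpu)
{
    return cpu < 64 && ((mask >> cpu) & 1);
}
```

The mask only says where delivery was requested; nothing in it tells the
writer whether the architecture code could actually honor the request.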
I think the irq scheduler is the only scheduler (except for batch jobs) that we
don't put in the kernel. It seems to me that if we are going to go to all of the
trouble to rewrite the generic code to better support irqbalance because we
are having serious irqbalance problems, it will be less effort to suck irqbalance
into the kernel along with everything else.
I really think irqbalancing belongs in the kernel. It is hard to
export all of the information we need to user space and the
information that we need to export keeps changing. Until we master
this new trend of exponentially increasing core counts that
information is going to keep changing. Today we barely know how to balance
flows across cpus. So, given the huge communication problem and
the fact that there appears to be no benefit in keeping irqbalance in
user space (there is no config file), if we are going to rework all of the
interfaces, let's pull irqbalance into the kernel.
As for the UV code, what we are looking at is a fundamental irq
routing property. Those irqs cannot be routed to some cpus. That is
something the code that sets up the routes needs to be aware of.
Dimitri, could you put the extra code in assign_irq_vector instead
of in the callers of assign_irq_vector? Since the problem is not
likely to stay unique, we probably want to put the information you base
things on in struct irq_desc, but the logic seems to live best
in assign_irq_vector.
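A rough userspace sketch of that shape, not the real x86 code: the per-irq
routing restriction lives in the descriptor, and the assignment routine
intersects it with the requested and online masks before choosing a cpu.
The struct irq_desc_sketch type, its routable field, the 64-bit toy mask and
assign_irq_vector_sketch are all hypothetical names for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_cpumask_t;     /* one bit per cpu, at most 64 cpus */

/* Hypothetical stand-in for struct irq_desc with only the field needed. */
struct irq_desc_sketch {
    toy_cpumask_t routable;         /* cpus this irq can be routed to */
};

/*
 * Pick a target cpu for an irq. The routing restriction is applied here,
 * in one place, so callers do not each need to know about it.
 * Returns the lowest usable cpu number, or -1 if none is usable.
 */
int assign_irq_vector_sketch(const struct irq_desc_sketch *desc,
                             toy_cpumask_t requested,
                             toy_cpumask_t online)
{
    toy_cpumask_t usable;
    int cpu;

    assert(desc != 0);

    /* Only cpus that are requested, online and reachable qualify. */
    usable = requested & online & desc->routable;

    for (cpu = 0; cpu < 64; cpu++)
        if (usable & ((toy_cpumask_t)1 << cpu))
            return cpu;

    return -1;                      /* request cannot be satisfied */
}
```

With the check inside the assignment routine, a request that names only
unreachable cpus fails there rather than being filtered separately by every
caller.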
Eric
Thread overview: 34+ messages
2009-11-20 21:11 [PATCH v6] x86/apic: limit irq affinity Dimitri Sivanich
2009-11-21 18:49 ` Eric W. Biederman
2009-11-22 1:14 ` Dimitri Sivanich
2009-11-24 13:20 ` Thomas Gleixner
2009-11-24 13:39 ` Peter Zijlstra
2009-11-24 13:55 ` Thomas Gleixner
2009-11-24 14:50 ` Arjan van de Ven
2009-11-24 17:41 ` Eric W. Biederman [this message]
2009-11-24 18:00 ` Peter P Waskiewicz Jr
2009-11-24 18:20 ` Ingo Molnar
2009-11-24 18:27 ` Yinghai Lu
2009-11-24 18:32 ` Peter Zijlstra
2009-11-24 18:59 ` Yinghai Lu
2009-11-24 21:41 ` Dimitri Sivanich
2009-11-24 21:51 ` Thomas Gleixner
2009-11-24 23:06 ` Eric W. Biederman
2009-11-25 1:23 ` Thomas Gleixner
2009-11-24 22:42 ` Eric W. Biederman
2009-11-25 15:40 ` Arjan van de Ven
2009-12-03 16:50 ` Dimitri Sivanich
2009-12-03 16:53 ` Waskiewicz Jr, Peter P
2009-12-03 17:01 ` Dimitri Sivanich
2009-12-03 17:07 ` Waskiewicz Jr, Peter P
2009-12-03 17:19 ` Dimitri Sivanich
2009-12-03 18:50 ` Waskiewicz Jr, Peter P
2009-12-04 16:42 ` Dimitri Sivanich
2009-12-04 21:17 ` Peter P Waskiewicz Jr
2009-12-04 23:12 ` Eric W. Biederman
2009-12-05 10:38 ` Peter P Waskiewicz Jr
2009-12-07 13:44 ` Dimitri Sivanich
2009-12-07 13:39 ` Dimitri Sivanich
2009-12-07 23:28 ` Peter P Waskiewicz Jr
2009-12-08 15:04 ` Dimitri Sivanich
2009-12-11 3:16 ` david