From: William Lee Irwin III <wli@holomorphy.com>
To: Andrea Arcangeli <andrea@suse.de>
Cc: "David S. Miller" <davem@redhat.com>,
akpm@digeo.com, davidsen@tmr.com, haveblue@us.ibm.com,
habanero@us.ibm.com, mbligh@aracnet.com,
linux-kernel@vger.kernel.org
Subject: Re: userspace irq balancer
Date: Mon, 26 May 2003 19:26:17 -0700
Message-ID: <20030527022617.GE8978@holomorphy.com>
In-Reply-To: <20030527021407.GM3767@dualathlon.random>
On Mon, May 26, 2003 at 06:53:07PM -0700, William Lee Irwin III wrote:
>> I should also point out that the cost of reprogramming the interrupt
>> controllers isn't taken into account by the kernel irq balancer. In
On Tue, May 27, 2003 at 04:14:07AM +0200, Andrea Arcangeli wrote:
> do you want to take that into account in userspace? if there's a place to
> take that into account that place is the kernel. You can even benchmark
> it at boot.
Userspace is preemptable and schedulable, so it's inherently rate
limited.
On Mon, May 26, 2003 at 06:53:07PM -0700, William Lee Irwin III wrote:
>> the userspace implementation the reprogramming is done infrequently
>> enough to make even significant cost negligible; in-kernel the cost
>> is entirely uncontrolled and the rate of reprogramming unlimited.
On Tue, May 27, 2003 at 04:14:07AM +0200, Andrea Arcangeli wrote:
> depends on the kernel algorithm.
> I feel like the in kernel algorithm is considered to be the one floating
> around that reprograms the apic even when it makes zero changes to the
> routing, like if nothing else was possible to do in kernel.
> start like this: put the userspace algorithm in kernel, then add a
> few bytes of info to keep an average of the idle cpus every second, then
> after 30 seconds a cpu is idle start to route the irqs to such idle cpu,
> slowly, after 60 seconds more aggressively. etc... For such an algorithm
> you don't care less about the reprogramming speed, just like with the
> current "userspace" algorithm, but thanks to the kernel info it will be
> able to do smarter decisions that would never be possible in userspace
> (w/o tlb flushing waste, and w/o kernel->user microkernel protocol
> implementation waste).
No, I'm not assuming that level of naivete. My primary interest is that
the amount of work be properly rate limited, and running at fixed
intervals isn't quite enough; it needs to be a bounded amount of work at
fixed intervals. I failed to point this out, but something more
incremental than an NR_IRQS-sized sweep across all IRQs every 60s is
needed for proper rate limiting.
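
To make "bounded amounts of work at fixed intervals" concrete, here is a
minimal sketch of an incremental sweep. It is not taken from any existing
balancer; sweep_state, sweep_tick, rebalance_fn and count_visit are
hypothetical names, and the budget is an assumed tuning knob. Each tick
examines at most `budget` IRQs and remembers where it stopped, so the
per-interval cost stays fixed however large the IRQ table is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-balancer state: where the previous tick stopped. */
struct sweep_state {
	size_t cursor;	/* next IRQ number to examine */
};

/* Hypothetical callback: returns nonzero if it changed the IRQ's routing. */
typedef int (*rebalance_fn)(size_t irq, void *ctx);

/*
 * Examine at most `budget` IRQs, starting at the saved cursor and
 * wrapping around the table. Returns how many IRQs were rerouted.
 * The work per call is bounded by `budget`, not by nr_irqs.
 */
static size_t sweep_tick(struct sweep_state *st, size_t nr_irqs,
			 size_t budget, rebalance_fn fn, void *ctx)
{
	size_t changed = 0;

	assert(nr_irqs > 0 && st->cursor < nr_irqs);
	if (budget > nr_irqs)
		budget = nr_irqs;	/* never visit an IRQ twice per tick */

	for (size_t i = 0; i < budget; i++) {
		if (fn(st->cursor, ctx))
			changed++;
		st->cursor = (st->cursor + 1) % nr_irqs;
	}
	return changed;
}

/* Illustration-only callback: counts visits per IRQ, never reroutes. */
static int count_visit(size_t irq, void *ctx)
{
	unsigned *visits = ctx;

	visits[irq]++;
	return 0;
}
```

Called from a periodic timer or a sleeping userspace loop, this covers the
whole table after roughly nr_irqs / budget ticks. A stricter variant could
also cap the number of reroutes per tick, since the reprogramming is the
expensive part.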
On Mon, May 26, 2003 at 06:53:07PM -0700, William Lee Irwin III wrote:
>> The story of APIC code tripping over itself is an even unfunnier comedy
>> of errors, as the lack of TPR adjustment means that within any APIC
[...]
On Tue, May 27, 2003 at 04:14:07AM +0200, Andrea Arcangeli wrote:
> again, reading this I feel like there's the idea that the only possible
> kernel algorithm is the one that bounces stuff and reprograms stuff as
> quickly as it can like the hardware one did.
This is actually a more general concern about correctness. Any
in-kernel algorithm must rely on the in-kernel IO-APIC RTE formation
code, which is highly problematic at best, as partially described by
all of the confusions and incorrect declarations mentioned above. Even
the "Wal-Mart" SMP subarch, used for the most common of i386 machines,
incorrectly declares its physical broadcast destination to be non-xAPIC
physical broadcast despite being used for Pentium IV and prior CPUs.
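
As a rough illustration of the broadcast issue, the sketch below picks the
physical broadcast destination from the local APIC version instead of
hardcoding one value per subarch. It is not the kernel's actual code: the
macro and function names are hypothetical, and it assumes the usual
convention that physical-mode broadcast is 0x0F with 4-bit APIC IDs and
0xFF with xAPIC's 8-bit IDs, with xAPIC reporting version 0x14 or later.

```c
/* Assumed values: physical-mode broadcast destination IDs. */
#define SKETCH_BROADCAST_ID_APIC	0x0F	/* 4-bit physical APIC IDs */
#define SKETCH_BROADCAST_ID_XAPIC	0xFF	/* 8-bit physical APIC IDs */

/* Assumption: local APIC versions 0x14 and up are xAPIC. */
static int sketch_is_xapic(unsigned int apic_version)
{
	return apic_version >= 0x14;
}

/* Choose the broadcast ID for the APIC actually present. */
static unsigned int sketch_phys_broadcast_id(unsigned int apic_version)
{
	return sketch_is_xapic(apic_version) ? SKETCH_BROADCAST_ID_XAPIC
					     : SKETCH_BROADCAST_ID_APIC;
}
```

A subarch that serves both kinds of cpu would need a runtime choice like
this one rather than a single compile-time constant.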
-- wli