From: Dave Hansen <haveblue@us.ibm.com>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: "Martin J. Bligh" <Martin.Bligh@us.ibm.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>,
linux-kernel <linux-kernel@vger.kernel.org>,
Andrew Theurer <habanero@us.ibm.com>
Subject: Re: [PATCH] NUMA-Q disable irqbalance
Date: Mon, 19 Aug 2002 17:49:48 -0700
Message-ID: <3D61922C.90607@us.ibm.com>
In-Reply-To: Pine.LNX.4.33.0208131421190.3110-100000@penguin.transmeta.com
[-- Attachment #1: Type: text/plain, Size: 2078 bytes --]
Linus Torvalds wrote:
> On Tue, 13 Aug 2002, Martin J. Bligh wrote:
>
>>Was that before or after you changed HZ to 1000? I *think* that increased
>>the frequency of IO-APIC reprogramming by a factor of 10, though I might
>>be misreading the code. If it does depend on HZ, I think that's bad.
>
> The 1000Hz thing came much later, and I never noticed any impact of that
> on my machines.
>
> (Note that this is all entirely subjective. I was very disappointed in the
> feel of the first HT P4 machine I had for the first few weeks, but apart
> from running lmbench - which looked ok even though it shows that P4's are
> bad at system calls - I've not actually put numbers on it. But my feeling
> was that the irq thing made a noticeable difference. Caveat emptor -
> subjective feelings are not good).
>
>>People in our benchmarking group (Andrew, cc'ed) have told me that
>>reducing the frequency of IO-APIC reprogramming by a factor of 20 or
>>so improves performance greatly - don't know what HZ that was at, but
>>the whole thing seems a little overenthusiastic to me.
>
> The rebalancing was certainly done with a 100Hz clock, so yes, it might
> have become much worse lately.
Here's a patch from Andrea's tree that uses IRQ_BALANCE_INTERVAL to
define how often interrupts are balanced, staying independent from HZ.
It also makes sure that there _is_ a change to the configuration
before it actually writes it. It reminds me of the mod_timer
optimization.
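The two ideas in the patch can be tried outside the kernel. What follows is a minimal userspace sketch, not the kernel code: tick_after() is modelled on time_after(), interval_ticks() mirrors the (HZ/50) definition, and struct irq_entry, its hw_writes counter, and balance() are hypothetical names used only for illustration. The new_cpu argument stands in for whatever move() would return.

```c
#include <assert.h>

/* Wraparound-safe "a is later than b" for a free-running tick counter.
 * Modelled on the kernel's time_after(); this copy is illustrative only. */
static int tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Convert a fixed 20 ms period to ticks, as (HZ/50) does in the patch. */
static unsigned long interval_ticks(unsigned long hz)
{
	assert(hz >= 50);	/* below 50 Hz the integer division would yield 0 */
	return hz / 50;
}

/* Hypothetical per-IRQ state for this sketch. */
struct irq_entry {
	unsigned long timestamp;	/* tick of the last balancing decision */
	unsigned int cpu;		/* CPU the interrupt is routed to */
	unsigned int hw_writes;		/* simulated IO-APIC writes */
};

/* Returns 1 if a balancing decision was taken on this call, 0 otherwise.
 * new_cpu stands in for the result of move(); the caller supplies it. */
static int balance(struct irq_entry *e, unsigned long now,
		   unsigned long interval, unsigned int new_cpu)
{
	if (!tick_after(now, e->timestamp + interval))
		return 0;		/* interval has not elapsed yet */

	e->timestamp = now;
	if (e->cpu != new_cpu) {	/* skip the write when nothing changes */
		e->cpu = new_cpu;
		e->hw_writes++;
	}
	return 1;
}
```

With this definition the interval is 2 ticks at HZ=100 and 20 ticks at HZ=1000, so the wall-clock period stays at 20 ms either way, and a decision that picks the same CPU again costs no simulated write.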
While observing the effect that this has on /proc/interrupts, I
noticed that timer interrupts aren't very balanced across my 8 CPUs.
stock 2.5.30 kernel:
         CPU0     CPU1     CPU2     CPU3     CPU4     CPU5     CPU6     CPU7
  0:   213935    63894    32373     1322     1900     1779     2895   181441   timer
2.5.31 w/attached patch:
         CPU0     CPU1     CPU2     CPU3     CPU4     CPU5     CPU6     CPU7
  0:    19322     8778    35805    36771     9219    14091  1091741  1092294   timer
So, my patch didn't cause this situation. I don't know if it is
normal, but the other IRQs were much more balanced than the timer one.
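To put a number on the imbalance, the timer counts quoted above can be reduced to the share taken by the busiest CPU. This is a small sketch using only those figures; the function and array names are made up for the example, and the share is reported in tenths of a percent to stay in integer arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of the per-CPU interrupt counts. */
static unsigned long total(const unsigned long *counts, size_t n)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += counts[i];
	return sum;
}

/* Share of all interrupts handled by the busiest CPU, in tenths of a
 * percent (rounded down). Returns 0 if no interrupts were counted. */
static unsigned long max_share_permille(const unsigned long *counts, size_t n)
{
	unsigned long max, sum;

	assert(n > 0);
	max = counts[0];
	for (size_t i = 1; i < n; i++)
		if (counts[i] > max)
			max = counts[i];

	sum = total(counts, n);
	return sum ? max * 1000 / sum : 0;
}

/* Timer (IRQ 0) counts for CPU0..CPU7, copied from the message above. */
static const unsigned long stock_2_5_30[8] = {
	213935, 63894, 32373, 1322, 1900, 1779, 2895, 181441
};
static const unsigned long patched_2_5_31[8] = {
	19322, 8778, 35805, 36771, 9219, 14091, 1091741, 1092294
};
```

An even spread over 8 CPUs would give each one 125 per mille. For these counts the busiest CPU comes out at 428 per mille on the stock kernel and 473 per mille with the patch, so the skew is present in both runs.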
--
Dave Hansen
haveblue@us.ibm.com
[-- Attachment #2: irq-balance-tune-2.5.31+bk-0.patch --]
[-- Type: text/plain, Size: 1730 bytes --]
# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
# ChangeSet 1.489 -> 1.490
# arch/i386/kernel/io_apic.c 1.26 -> 1.27
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/08/19 haveblue@elm3b96.(none) 1.490
# Reduce irqbalance's dependence on HZ. Only write to io_apic when there is actually
# a change to write into it.
# --------------------------------------------
#
diff -Nru a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
--- a/arch/i386/kernel/io_apic.c Mon Aug 19 14:49:03 2002
+++ b/arch/i386/kernel/io_apic.c Mon Aug 19 14:49:03 2002
@@ -220,6 +220,9 @@
 		((1 << cpu) & (allowed_mask))
 #if CONFIG_SMP
+
+#define IRQ_BALANCE_INTERVAL (HZ/50)
+
 static unsigned long move(int curr_cpu, unsigned long allowed_mask, unsigned long now, int direction)
 {
 	int search_idle = 1;
@@ -254,8 +257,9 @@
 	if (clustered_apic_mode)
 		return;
-	if (entry->timestamp != now) {
+	if (unlikely(time_after(now, entry->timestamp + IRQ_BALANCE_INTERVAL))) {
 		unsigned long allowed_mask;
+		unsigned int new_cpu;
 		int random_number;
 		rdtscl(random_number);
@@ -263,8 +267,11 @@
 		allowed_mask = cpu_online_map & irq_affinity[irq];
 		entry->timestamp = now;
-		entry->cpu = move(entry->cpu, allowed_mask, now, random_number);
-		set_ioapic_affinity(irq, 1 << entry->cpu);
+		new_cpu = move(entry->cpu, allowed_mask, now, random_number);
+		if (entry->cpu != new_cpu) {
+			entry->cpu = new_cpu;
+			set_ioapic_affinity(irq, 1 << new_cpu);
+		}
 	}
 }
 #else /* !SMP */