From: Dave Hansen <haveblue@us.ibm.com>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: "Martin J. Bligh" <Martin.Bligh@us.ibm.com>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Andrew Theurer <habanero@us.ibm.com>
Subject: Re: [PATCH] NUMA-Q disable irqbalance
Date: Mon, 19 Aug 2002 17:49:48 -0700
Message-ID: <3D61922C.90607@us.ibm.com>
In-Reply-To: Pine.LNX.4.33.0208131421190.3110-100000@penguin.transmeta.com

[-- Attachment #1: Type: text/plain, Size: 2078 bytes --]

Linus Torvalds wrote:
> On Tue, 13 Aug 2002, Martin J. Bligh wrote:
> 
>>Was that before or after you changed HZ to 1000? I *think* that increased
>>the frequency of IO-APIC reprogramming by a factor of 10, though I might
>>be misreading the code. If it does depend on HZ, I think that's bad.
> 
> The 1000Hz thing came much later, and I never noticed any impact of that 
> on my machines.
> 
> (Note that this is all entirely subjective. I was very disappointed in the
> feel of the first HT P4 machine I had for the first few weeks, but apart
> from running lmbench - which looked ok even though it shows that P4's are
> bad at system calls - I've not actually put numbers on it. But my feeling
> was that the irq thing made a noticeable difference. Caveat emptor -
> subjective feelings are not good).
> 
>>People in our benchmarking group (Andrew, cc'ed) have told me that
>>reducing the frequency of IO-APIC reprogramming by a factor of 20 or
>>so improves performance greatly - don't know what HZ that was at, but
>>the whole thing seems a little overenthusiastic to me.
> 
> The rebalancing was certainly done with a 100Hz clock, so yes, it might 
> have become much worse lately.

Here's a patch from Andrea's tree that uses IRQ_BALANCE_INTERVAL to 
define how often interrupts are balanced.  The interval is HZ/50 
ticks, i.e. 20ms of wall-clock time, so the balancing frequency stays 
independent of HZ.  It also makes sure that there _is_ a change to the 
configuration before it actually writes it to the IO-APIC.  It reminds 
me of the mod_timer optimization.
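
A minimal user-space sketch of those two ideas, for readers who want to 
see the effect in isolation.  Everything here is a stand-in and not the 
kernel's code: HZ is fixed at 1000, time_after() is modeled on the 
kernel macro of the same name, and pick_cpu(), balance(), simulate() 
and the two counters are hypothetical names invented for the example.

```c
#include <assert.h>

/* Assumed tick rate for the simulation; the real value is a kernel config. */
#define HZ 1000
/* Same definition as in the patch: 1/50 s worth of ticks, whatever HZ is. */
#define IRQ_BALANCE_INTERVAL (HZ / 50)

/* Wraparound-safe "a is later than b", modeled on the kernel's time_after(). */
#define time_after(a, b) ((long)((b) - (a)) < 0)

struct irq_entry {
	unsigned long timestamp;	/* tick of the last balancing decision */
	unsigned int cpu;		/* CPU the IRQ is currently routed to */
};

static int checks;	/* how often a balancing decision was made */
static int writes;	/* how often the (pretend) IO-APIC was written */

/* Hypothetical stand-in for move(): the preferred CPU changes every 250 ticks. */
static unsigned int pick_cpu(unsigned long now)
{
	return (unsigned int)((now / 250) % 4);
}

static void balance(struct irq_entry *e, unsigned long now)
{
	/* Only reconsider the routing once the interval has elapsed. */
	if (time_after(now, e->timestamp + IRQ_BALANCE_INTERVAL)) {
		unsigned int new_cpu = pick_cpu(now);

		checks++;
		e->timestamp = now;
		/* Skip the expensive write when nothing would change. */
		if (e->cpu != new_cpu) {
			e->cpu = new_cpu;
			writes++;
		}
	}
}

/* Call balance() once per tick for the given number of ticks. */
static int simulate(unsigned long ticks)
{
	struct irq_entry e = { 0, 0 };
	unsigned long now;

	checks = 0;
	writes = 0;
	for (now = 0; now < ticks; now++)
		balance(&e, now);
	return writes;
}
```

Calling simulate(1000) from a main() covers one simulated second at 
HZ=1000.  The old "entry->timestamp != now" test would have rewritten 
the routing on nearly every tick; here the routing is reconsidered 
only a few dozen times, and written only when the chosen CPU differs.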

While observing the effect that this has on /proc/interrupts, I 
noticed that timer interrupts aren't very balanced across my 8 CPUs.

stock 2.5.30 kernel:
        CPU0   CPU1   CPU2   CPU3   CPU4   CPU5     CPU6     CPU7
  0:  213935  63894  32373   1322   1900   1779     2895   181441  timer

2.5.31 w/attached patch:
        CPU0   CPU1   CPU2   CPU3   CPU4   CPU5     CPU6     CPU7
  0:   19322   8778  35805  36771   9219  14091  1091741  1092294  timer

The imbalance shows up with and without the patch, so my patch didn't 
cause it.  I don't know if it is normal, but the other IRQs were much 
more balanced than the timer one.
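
To put a number on "not very balanced", here is a small sketch that 
sums the two timer rows quoted above and reports how much of the total 
landed on the two busiest CPUs.  The helper names (total, top_two, 
top_two_percent) and the array names are made up for this example; 
only the counts come from the /proc/interrupts output.

```c
#include <stddef.h>

/* Timer interrupt counts per CPU, copied from the two rows above. */
static const unsigned long stock_2_5_30[8] = {
	213935, 63894, 32373, 1322, 1900, 1779, 2895, 181441
};
static const unsigned long patched_2_5_31[8] = {
	19322, 8778, 35805, 36771, 9219, 14091, 1091741, 1092294
};

/* Sum of all per-CPU counts. */
static unsigned long total(const unsigned long *counts, size_t n)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += counts[i];
	return sum;
}

/* Sum of the two largest per-CPU counts. */
static unsigned long top_two(const unsigned long *counts, size_t n)
{
	unsigned long first = 0, second = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (counts[i] > first) {
			second = first;
			first = counts[i];
		} else if (counts[i] > second) {
			second = counts[i];
		}
	}
	return first + second;
}

/* Share of interrupts handled by the two busiest CPUs, in whole percent. */
static unsigned long top_two_percent(const unsigned long *counts, size_t n)
{
	return top_two(counts, n) * 100 / total(counts, n);
}
```

With these counts, the two busiest CPUs take roughly 79% of the timer 
interrupts on stock 2.5.30 and roughly 94% on 2.5.31 with the patch; 
a perfectly even spread over 8 CPUs would give any two of them 25%.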
-- 
Dave Hansen
haveblue@us.ibm.com

[-- Attachment #2: irq-balance-tune-2.5.31+bk-0.patch --]
[-- Type: text/plain, Size: 1730 bytes --]

# This is a BitKeeper generated patch for the following project:
# Project Name: Linux kernel tree
# This patch format is intended for GNU patch command version 2.5 or higher.
# This patch includes the following deltas:
#	           ChangeSet	1.489   -> 1.490  
#	arch/i386/kernel/io_apic.c	1.26    -> 1.27   
#
# The following is the BitKeeper ChangeSet Log
# --------------------------------------------
# 02/08/19	haveblue@elm3b96.(none)	1.490
# Reduce irqbalance's dependence on HZ.  Only write to io_apic when there is actually
# a change to write into it.  
# --------------------------------------------
#
diff -Nru a/arch/i386/kernel/io_apic.c b/arch/i386/kernel/io_apic.c
--- a/arch/i386/kernel/io_apic.c	Mon Aug 19 14:49:03 2002
+++ b/arch/i386/kernel/io_apic.c	Mon Aug 19 14:49:03 2002
@@ -220,6 +220,9 @@
 		((1 << cpu) & (allowed_mask))
 
 #if CONFIG_SMP
+
+#define IRQ_BALANCE_INTERVAL (HZ/50)
+	
 static unsigned long move(int curr_cpu, unsigned long allowed_mask, unsigned long now, int direction)
 {
 	int search_idle = 1;
@@ -254,8 +257,9 @@
 	if (clustered_apic_mode)
 		return;
 
-	if (entry->timestamp != now) {
+	if (unlikely(time_after(now, entry->timestamp + IRQ_BALANCE_INTERVAL))) {
 		unsigned long allowed_mask;
+		unsigned int new_cpu;
 		int random_number;
 
 		rdtscl(random_number);
@@ -263,8 +267,11 @@
 
 		allowed_mask = cpu_online_map & irq_affinity[irq];
 		entry->timestamp = now;
-		entry->cpu = move(entry->cpu, allowed_mask, now, random_number);
-		set_ioapic_affinity(irq, 1 << entry->cpu);
+		new_cpu = move(entry->cpu, allowed_mask, now, random_number);
+		if (entry->cpu != new_cpu) {
+			entry->cpu = new_cpu;
+			set_ioapic_affinity(irq, 1 << new_cpu);
+		}
 	}
 }
 #else /* !SMP */
