public inbox for linux-kernel@vger.kernel.org
From: James Cleverdon <jamesclv@us.ibm.com>
To: Andrea Arcangeli <andrea@suse.de>, Andrew Theurer <habanero@us.ibm.com>
Cc: Linus Torvalds <torvalds@transmeta.com>,
	"Martin J. Bligh" <Martin.Bligh@us.ibm.com>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] NUMA-Q disable irqbalance
Date: Wed, 14 Aug 2002 14:16:59 -0700
Message-ID: <200208141416.59397.jamesclv@us.ibm.com>
In-Reply-To: <20020813233007.GV14394@dualathlon.random>

On Tuesday 13 August 2002 04:30 pm, Andrea Arcangeli wrote:
> On Tue, Aug 13, 2002 at 05:29:50PM -0500, Andrew Theurer wrote:
> > > 2.4.19-rc3aa3:
> > >
> > > No Balance    Ingo IRQ Balance        Andrea IRQ Balance
> > > 794 Mbps              787 Mbps                        792 Mbps
> > >
> > > With hyperthreading:
> > >
> > > No Balance    Ingo IRQ Balance        Andrea IRQ Balance
> > > 773 Mbps              798 Mbps                        809 Mbps
>
> thanks again for running the above benchmarks.
>
> > version is a little less aggressive and has less overhead, something I'd
> > prefer in 2.5.
>
> Second that of course. BTW, the detailed explanation of the changes I
> made while merging it can be found on lse-tech.
>
> It is also possible that HZ/50 is still too high a frequency; I didn't run
> any extensive tests on the reprogramming frequency. I would suggest trying
> HZ/10 too (so every 100 msec instead of every 20 msec).
>
> BTW, the very same algorithm should also be shared by alpha; alpha never
> had hardware irq balancing support, it's like a p4, and we do static
> routing distribution chosen by the kernel at boot, which has been pretty
> good so far (better than mainline 2.4 on a p4 smp), but the irqbalance
> algorithm should be better there too.
>
> Andrea

I've been thinking about doing away with balance_irq on P4 boxes by using the
TPR (the local APIC's Task Priority Register).  It might even help P3 and
other i386 CPUs by routing interrupts preferentially to idle CPUs.  And it
would do so in real time, not as a snapshot of the past stored in the I/O
APIC's dest masks.
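
To make the routing idea concrete, here is a toy user-space model, not
kernel code.  It assumes that lowest-priority delivery picks the CPU whose
TPR has the lowest priority class (the upper nibble of the low byte); real
hardware arbitration has more inputs, and its tie-breaking is not modelled
here.  The helper names tpr_class() and pick_lowest_priority_cpu() are
hypothetical; the TPR_* values are the ones from the patch below.

```c
#include <stddef.h>

/* Priority values as defined in the patch below. */
#define TPR_IDLE	0x00ul
#define TPR_TASK	0x10ul
#define TPR_IRQ		0x20ul

/* Priority class: upper nibble of the low byte of a TPR value. */
static unsigned int tpr_class(unsigned long tpr)
{
	return (unsigned int)((tpr >> 4) & 0xf);
}

/*
 * Hypothetical helper: return the index of the CPU with the lowest
 * priority class.  Ties go to the lowest-numbered CPU, which is an
 * arbitrary choice for this model.  Requires ncpus >= 1.
 */
static size_t pick_lowest_priority_cpu(const unsigned long *tpr, size_t ncpus)
{
	size_t best = 0;

	for (size_t i = 1; i < ncpus; i++)
		if (tpr_class(tpr[i]) < tpr_class(tpr[best]))
			best = i;
	return best;
}
```

With per-CPU values { TPR_TASK, TPR_TASK + TPR_IRQ, TPR_IDLE, TPR_TASK },
the model picks CPU 2, the idle one.  Once that CPU is itself inside an
interrupt handler, the next interrupt goes to a CPU that is merely running
a task.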

What do you think about this?  (Crudely ripped out of my 2.5 summit patch; may 
not apply correctly.)

The code in do_IRQ could be "if (clustered_apic_mode) apic_adj_tpr(TPR_IRQ)" 
or some such, if that kind of APIC foolery is not considered necessary for 
the non-P4 crowd.  (The automatic priority boost for serial APICs with 
unEOIed interrupts should do the job for us.)
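
A rough sketch of that conditional form, written so it compiles outside
the kernel.  Everything here is a stand-in: fake_taskpri models the
APIC_TASKPRI register, clustered_apic_mode is a plain variable rather than
the kernel's own definition, and irq_tpr_enter()/irq_tpr_exit() are
hypothetical wrapper names for the two call sites in do_IRQ.

```c
/* Stand-ins so the sketch builds in user space (assumptions, not kernel code). */
static int clustered_apic_mode;		/* nonzero on clustered-APIC boxes */
static unsigned long fake_taskpri;	/* models the APIC_TASKPRI register */

#define TPR_IRQ	0x20L	/* signed here so that -TPR_IRQ is a plain negative */

/* Same arithmetic as apic_adj_tpr() in the patch, minus the register I/O. */
static void apic_adj_tpr(long adj)
{
	fake_taskpri += (unsigned long)adj;
}

/* Hypothetical wrapper for the top of do_IRQ: raise priority only if asked. */
static void irq_tpr_enter(void)
{
	if (clustered_apic_mode)
		apic_adj_tpr(TPR_IRQ);
}

/* Hypothetical wrapper for the bottom of do_IRQ: undo the adjustment. */
static void irq_tpr_exit(void)
{
	if (clustered_apic_mode)
		apic_adj_tpr(-TPR_IRQ);
}
```

Both call sites have to test the same condition, otherwise the TPR drifts
up or down by TPR_IRQ on every interrupt.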


diff -ruN 2.5.24/arch/i386/kernel/irq.c d24/arch/i386/kernel/irq.c
--- 2.5.24/arch/i386/kernel/irq.c	Thu Jun 20 15:53:44 2002
+++ d24/arch/i386/kernel/irq.c	Wed Jul 10 13:34:14 2002
@@ -582,6 +582,7 @@
 	unsigned int status;
 
 	kstat.irqs[cpu][irq]++;
+	apic_adj_tpr(TPR_IRQ);
 	spin_lock(&desc->lock);
 	desc->handler->ack(irq);
 	/*
@@ -642,6 +643,7 @@
 
 	if (softirq_pending(cpu))
 		do_softirq();
+	apic_adj_tpr(-TPR_IRQ);
 	return 1;
 }
 
diff -ruN 2.5.24/arch/i386/kernel/process.c d24/arch/i386/kernel/process.c
--- 2.5.24/arch/i386/kernel/process.c	Thu Jun 20 15:53:40 2002
+++ d24/arch/i386/kernel/process.c	Wed Jul 10 14:12:57 2002
@@ -145,7 +145,9 @@
 		irq_stat[smp_processor_id()].idle_timestamp = jiffies;
 		while (!need_resched())
 			idle();
+		apic_set_tpr(TPR_TASK);
 		schedule();
+		apic_set_tpr(TPR_IDLE);
 	}
 }
 
diff -ruN 2.5.24/include/asm-i386/apic.h d24/include/asm-i386/apic.h
--- 2.5.24/include/asm-i386/apic.h	Thu Jun 20 15:53:57 2002
+++ d24/include/asm-i386/apic.h	Wed Jul 10 13:34:14 2002
@@ -64,6 +64,22 @@
 	apic_write_around(APIC_EOI, 0);
 }
 
+static inline void apic_set_tpr(unsigned long val)
+{
+	unsigned long value;
+
+	value = apic_read(APIC_TASKPRI);
+	apic_write_around(APIC_TASKPRI, (value & ~APIC_TPRI_MASK) + val);
+}
+
+static inline void apic_adj_tpr(long adj)
+{
+	unsigned long value;
+
+	value = apic_read(APIC_TASKPRI);
+	apic_write_around(APIC_TASKPRI, value + adj);
+}
+
 extern int get_maxlvt(void);
 extern void clear_local_APIC(void);
 extern void connect_bsp_APIC (void);
@@ -95,6 +118,15 @@
 #define NMI_LOCAL_APIC	2
 #define NMI_INVALID	3
 
+#else /* CONFIG_X86_LOCAL_APIC */
+#define apic_set_tpr(val)
+#define apic_adj_tpr(adj)
 #endif /* CONFIG_X86_LOCAL_APIC */
 
+/* Priority values for apic_adj_tpr() and apic_set_tpr() */
+/* xAPICs only do priority comparisons on the upper nibble. */
+#define TPR_IDLE	(0x00ul)
+#define TPR_TASK	(0x10ul)
+#define TPR_IRQ		(0x20ul)	/* Or maybe 0x10 ?? */
+
 #endif /* __ASM_APIC_H */
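
The TPR arithmetic in the patch above can be checked with a small
user-space model.  This is only a sketch: a plain variable stands in for
the APIC_TASKPRI register, APIC_TPRI_MASK is assumed to cover the low
byte, and the model_* names are hypothetical.

```c
/* User-space model: a variable replaces the APIC_TASKPRI register. */
#define APIC_TPRI_MASK	0xFFul	/* assumption: low byte holds the priority */
#define TPR_IDLE	0x00ul
#define TPR_TASK	0x10ul
#define TPR_IRQ		0x20L	/* signed in this model so it can be negated */

static unsigned long taskpri;

/* Mirrors apic_set_tpr(): replace the priority field with val. */
static void model_set_tpr(unsigned long val)
{
	taskpri = (taskpri & ~APIC_TPRI_MASK) + val;
}

/* Mirrors apic_adj_tpr(): add a signed delta, with no clamping. */
static void model_adj_tpr(long adj)
{
	taskpri += (unsigned long)adj;
}

/* Priority class: the upper nibble of the priority field. */
static unsigned int model_tpr_class(void)
{
	return (unsigned int)((taskpri & APIC_TPRI_MASK) >> 4);
}
```

In this model an idle CPU that takes one interrupt sits at class 2, above
a CPU running a task (class 1).  With TPR_IRQ at 0x10 the two would tie at
class 1, which is the trade-off behind the "Or maybe 0x10 ??" comment.
Like the patch, the model does not clamp, so deep enough nesting would
carry out of the upper nibble.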


-- 
James Cleverdon
IBM xSeries Linux Solutions
{jamesclv(Unix, preferred), cleverdj(Notes)} at us dot ibm dot com

