public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [patch] genapic: optimize & fix APIC mode setup
@ 2006-11-11 15:14 Ingo Molnar
  2006-11-11 15:20 ` Andi Kleen
  0 siblings, 1 reply; 23+ messages in thread
From: Ingo Molnar @ 2006-11-11 15:14 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Andi Kleen

From: Ingo Molnar <mingo@elte.hu>

this patch fixes a couple of inconsistencies/problems i found while 
reviewing the x86_64 genapic code (when i was chasing mysterious eth0 
timeouts that would only trigger if CPU_HOTPLUG is enabled):

 - AMD systems defaulted to the slower flat-physical mode instead
   of the flat-logical mode. The only restriction on AMD systems
   is that they should not use clustered APIC mode.

 - removed the CPU hotplug hacks, switching the default for small
   systems back from phys-flat to logical-flat. The switching to logical 
   flat mode on small systems fixed sporadic ethernet driver timeouts i 
   was getting on a dual-core Athlon64 system:

    NETDEV WATCHDOG: eth0: transmit timed out
    eth0: Transmit timeout, status 0c 0005 c07f media 80.
    eth0: Tx queue start entry 32  dirty entry 28.
    eth0:  Tx descriptor 0 is 0008a04a. (queue head)
    eth0:  Tx descriptor 1 is 0008a04a.
    eth0:  Tx descriptor 2 is 0008a04a.
    eth0:  Tx descriptor 3 is 0008a04a.
    eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1

 - The use of '<= 8' was a bug by itself (the valid APIC ids
   for logical flat mode go from 0 to 7, not 0 to 8). The new logic
   is to use logical flat mode on both AMD and Intel systems, and
   to only switch to physical mode when logical mode cannot be used.
   If CPU hotplug is racy wrt. APIC shutdown then CPU hotplug needs
   fixing, not the whole IRQ system be made inconsistent and slowed
   down.

 - minor cleanups: simplified some code constructs

build & booted on a couple of AMD and Intel SMP systems.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86_64/kernel/genapic.c |   54 ++++++++++++++++---------------------------
 1 file changed, 21 insertions(+), 33 deletions(-)

Index: linux/arch/x86_64/kernel/genapic.c
===================================================================
--- linux.orig/arch/x86_64/kernel/genapic.c
+++ linux/arch/x86_64/kernel/genapic.c
@@ -32,30 +32,26 @@ extern struct genapic apic_cluster;
 extern struct genapic apic_flat;
 extern struct genapic apic_physflat;
 
-struct genapic *genapic = &apic_flat;
-
+struct genapic __read_mostly *genapic = &apic_flat;
 
 /*
  * Check the APIC IDs in bios_cpu_apicid and choose the APIC mode.
  */
 void __init clustered_apic_check(void)
 {
-	long i;
-	u8 clusters, max_cluster;
-	u8 id;
-	u8 cluster_cnt[NUM_APIC_CLUSTERS];
-	int max_apic = 0;
+	u8 id, clusters, max_cluster, cluster_cnt[NUM_APIC_CLUSTERS];
+	int i, max_apic = 0;
 
-#if defined(CONFIG_ACPI)
+#ifdef CONFIG_ACPI
 	/*
 	 * Some x86_64 machines use physical APIC mode regardless of how many
 	 * procs/clusters are present (x86_64 ES7000 is an example).
 	 */
-	if (acpi_fadt.revision > FADT2_REVISION_ID)
-		if (acpi_fadt.force_apic_physical_destination_mode) {
-			genapic = &apic_cluster;
-			goto print;
-		}
+	if (acpi_fadt.revision > FADT2_REVISION_ID &&
+			acpi_fadt.force_apic_physical_destination_mode) {
+		genapic = &apic_cluster;
+		goto print;
+	}
 #endif
 
 	memset(cluster_cnt, 0, sizeof(cluster_cnt));
@@ -68,20 +64,17 @@ void __init clustered_apic_check(void)
 		cluster_cnt[APIC_CLUSTERID(id)]++;
 	}
 
-	/* Don't use clustered mode on AMD platforms. */
+	/*
+	 * Don't use clustered mode on AMD platforms, default
+	 * to flat logical mode.
+	 */
  	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
-		genapic = &apic_physflat;
-#ifndef CONFIG_HOTPLUG_CPU
-		/* In the CPU hotplug case we cannot use broadcast mode
-		   because that opens a race when a CPU is removed.
-		   Stay at physflat mode in this case.
-		   It is bad to do this unconditionally though. Once
-		   we have ACPI platform support for CPU hotplug
-		   we should detect hotplug capablity from ACPI tables and
-		   only do this when really needed. -AK */
-		if (max_apic <= 8)
-			genapic = &apic_flat;
-#endif
+		/*
+		 * Switch to physical flat mode if more than 8 APICs
+		 * (In the case of 8 CPUs APIC ID goes from 0 to 7):
+		 */
+		if (max_apic >= 8)
+			genapic = &apic_physflat;
  		goto print;
  	}
 
@@ -103,14 +96,9 @@ void __init clustered_apic_check(void)
 	 * (We don't use lowest priority delivery + HW APIC IRQ steering, so
 	 * can ignore the clustered logical case and go straight to physical.)
 	 */
-	if (clusters <= 1 && max_cluster <= 8 && cluster_cnt[0] == max_cluster) {
-#ifdef CONFIG_HOTPLUG_CPU
-		/* Don't use APIC shortcuts in CPU hotplug to avoid races */
-		genapic = &apic_physflat;
-#else
+	if (clusters <= 1 && max_cluster <= 8 && cluster_cnt[0] == max_cluster)
 		genapic = &apic_flat;
-#endif
-	} else
+	else
 		genapic = &apic_cluster;
 
 print:

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-11 15:14 [patch] genapic: optimize & fix APIC mode setup Ingo Molnar
@ 2006-11-11 15:20 ` Andi Kleen
  2006-11-11 15:27   ` Ingo Molnar
                     ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Andi Kleen @ 2006-11-11 15:20 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Andrew Morton, linux-kernel, ashok.raj

On Saturday 11 November 2006 16:14, Ingo Molnar wrote:
> From: Ingo Molnar <mingo@elte.hu>
> 
> this patch fixes a couple of inconsistencies/problems i found while 
> reviewing the x86_64 genapic code (when i was chasing mysterious eth0 
> timeouts that would only trigger if CPU_HOTPLUG is enabled):
> 
>  - AMD systems defaulted to the slower flat-physical mode instead
>    of the flat-logical mode. The only restriction on AMD systems
>    is that they should not use clustered APIC mode.

This will open a race on CPU hotunplug unfortunately
(common for multi core suspend) 

> 
>  - removed the CPU hotplug hacks, switching the default for small
>    systems back from phys-flat to logical-flat. The switching to logical 
>    flat mode on small systems fixed sporadic ethernet driver timeouts i 
>    was getting on a dual-core Athlon64 system:

That will break CPU hotplug on some Intel systems (Ashok can give details) 

That is caused ethernet timeouts is weird, probably needs to be root caused.

>     NETDEV WATCHDOG: eth0: transmit timed out
>     eth0: Transmit timeout, status 0c 0005 c07f media 80.
>     eth0: Tx queue start entry 32  dirty entry 28.
>     eth0:  Tx descriptor 0 is 0008a04a. (queue head)
>     eth0:  Tx descriptor 1 is 0008a04a.
>     eth0:  Tx descriptor 2 is 0008a04a.
>     eth0:  Tx descriptor 3 is 0008a04a.
>     eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1
> 
>  - The use of '<= 8' was a bug by itself (the valid APIC ids
>    for logical flat mode go from 0 to 7, not 0 to 8). The new logic
>    is to use logical flat mode on both AMD and Intel systems, and
>    to only switch to physical mode when logical mode cannot be used.
>    If CPU hotplug is racy wrt. APIC shutdown then CPU hotplug needs
>    fixing, not the whole IRQ system be made inconsistent and slowed
>    down.

Yes that needs to be fixed.

-Andi

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-11 15:20 ` Andi Kleen
@ 2006-11-11 15:27   ` Ingo Molnar
  2006-11-11 15:39   ` Ingo Molnar
  2006-11-13  1:50   ` Siddha, Suresh B
  2 siblings, 0 replies; 23+ messages in thread
From: Ingo Molnar @ 2006-11-11 15:27 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, linux-kernel, ashok.raj


* Andi Kleen <ak@suse.de> wrote:

> On Saturday 11 November 2006 16:14, Ingo Molnar wrote:
> > From: Ingo Molnar <mingo@elte.hu>
> > 
> > this patch fixes a couple of inconsistencies/problems i found while 
> > reviewing the x86_64 genapic code (when i was chasing mysterious eth0 
> > timeouts that would only trigger if CPU_HOTPLUG is enabled):
> > 
> >  - AMD systems defaulted to the slower flat-physical mode instead
> >    of the flat-logical mode. The only restriction on AMD systems
> >    is that they should not use clustered APIC mode.
> 
> This will open a race on CPU hotunplug unfortunately (common for multi 
> core suspend)

Note that i386 still defaults to logical flat mode, so whatever hotplug 
CPU races there are, they need to be fixed! Given how rare CPU hotplug 
systems are i have no problem with having these races in the kernel for 
a while until it's fixed.

Also, distro kernels enable CPU_HOTPLUG frequently. It is just ugly 
beyond recognition to switch the programming of the IRQ hardware on 
non-hotplug hardware just because a mostly-software feature (hotplug) is 
enabled ...

if hotplug breaks suspend then fix it, dont hack it around (on one 
platform) by slowing down the system [and causing other problems] ...

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-11 15:20 ` Andi Kleen
  2006-11-11 15:27   ` Ingo Molnar
@ 2006-11-11 15:39   ` Ingo Molnar
  2006-11-13  1:50   ` Siddha, Suresh B
  2 siblings, 0 replies; 23+ messages in thread
From: Ingo Molnar @ 2006-11-11 15:39 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, linux-kernel, ashok.raj


* Andi Kleen <ak@suse.de> wrote:

> This will open a race on CPU hotunplug unfortunately
> (common for multi core suspend) 

how can i reproduce this btw, any instructions/pointers for that?

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-11 15:20 ` Andi Kleen
  2006-11-11 15:27   ` Ingo Molnar
  2006-11-11 15:39   ` Ingo Molnar
@ 2006-11-13  1:50   ` Siddha, Suresh B
  2006-11-13  2:32     ` Andi Kleen
  2006-11-13  8:43     ` Ingo Molnar
  2 siblings, 2 replies; 23+ messages in thread
From: Siddha, Suresh B @ 2006-11-13  1:50 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ingo Molnar, Andrew Morton, linux-kernel, ashok.raj

On Sat, Nov 11, 2006 at 04:20:24PM +0100, Andi Kleen wrote:
> On Saturday 11 November 2006 16:14, Ingo Molnar wrote:
> > 
> >  - removed the CPU hotplug hacks, switching the default for small
> >    systems back from phys-flat to logical-flat. The switching to logical 
> >    flat mode on small systems fixed sporadic ethernet driver timeouts i 
> >    was getting on a dual-core Athlon64 system:
> 
> That will break CPU hotplug on some Intel systems (Ashok can give details) 

There is an issue of using clustered mode along with cpu hotplug. More details
are at the below link.

http://marc.theaimsgroup.com/?l=linux-kernel&m=113261865814107&w=2

thanks,
suresh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13  1:50   ` Siddha, Suresh B
@ 2006-11-13  2:32     ` Andi Kleen
  2006-11-13  8:16       ` Ingo Molnar
  2006-11-13  8:43     ` Ingo Molnar
  1 sibling, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2006-11-13  2:32 UTC (permalink / raw)
  To: Siddha, Suresh B; +Cc: Ingo Molnar, Andrew Morton, linux-kernel, ashok.raj

On Monday 13 November 2006 02:50, Siddha, Suresh B wrote:
> On Sat, Nov 11, 2006 at 04:20:24PM +0100, Andi Kleen wrote:
> > On Saturday 11 November 2006 16:14, Ingo Molnar wrote:
> > > 
> > >  - removed the CPU hotplug hacks, switching the default for small
> > >    systems back from phys-flat to logical-flat. The switching to logical 
> > >    flat mode on small systems fixed sporadic ethernet driver timeouts i 
> > >    was getting on a dual-core Athlon64 system:
> > 
> > That will break CPU hotplug on some Intel systems (Ashok can give details) 
> 
> There is an issue of using clustered mode along with cpu hotplug. More details
> are at the below link.

Thanks Suresh.

> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=113261865814107&w=2

I should add i have no definite proof this is an issue on AMD systems
too, but I changed it there too anyways to better be safe than sorry
(and there is not very much performance difference anyways)

Now if it causes device driver issues that's different of course. I wasn't
aware of this before.

-Andi


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13  2:32     ` Andi Kleen
@ 2006-11-13  8:16       ` Ingo Molnar
  2006-11-13  9:08         ` Andi Kleen
  0 siblings, 1 reply; 23+ messages in thread
From: Ingo Molnar @ 2006-11-13  8:16 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Siddha, Suresh B, Andrew Morton, linux-kernel, ashok.raj


* Andi Kleen <ak@suse.de> wrote:

> Now if it causes device driver issues that's different of course. I 
> wasn't aware of this before.

lets try my patch in -mm for a while.

Had i ever noticed this hack in the first place i would have NAK-ed it. 
There is a fundamental design friction of a high-level feature like 
HOTPLUG_CPU /requiring/ a fundamental change to the lowlevel IRQ 
delivery mode! Such a requirement is broken and just serves to hide a 
flaw in the hotplug design - which flaw would trigger on i386 /anyway/, 
because i386 still uses logical delivery mode for APIC IPIs. Also, i'd 
like to have a description of how to reproduce those CPU hotplug 
problems, so that i can try to fix it.

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13  1:50   ` Siddha, Suresh B
  2006-11-13  2:32     ` Andi Kleen
@ 2006-11-13  8:43     ` Ingo Molnar
  2006-11-13 17:34       ` Siddha, Suresh B
  1 sibling, 1 reply; 23+ messages in thread
From: Ingo Molnar @ 2006-11-13  8:43 UTC (permalink / raw)
  To: Siddha, Suresh B; +Cc: Andi Kleen, Andrew Morton, linux-kernel, ashok.raj


* Siddha, Suresh B <suresh.b.siddha@intel.com> wrote:

> There is an issue of using clustered mode along with cpu hotplug. More 
> details are at the below link.
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=113261865814107&w=2

ok, to make sure i understand this right: it is not safe to switch any 
local APIC in the system into clustered APIC mode on the E850x chipset 
/at all/, because if one of the CPUs gets an INIT/startup IPI message 
its local APIC will default to logical flat mode and might confuse the 
chipset?

on large systems that have their APIC IDs set up to group CPUs amongst 
different clusters and hence triggered cluster mode, the chipset does 
not get confused by this, correct?

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13  8:16       ` Ingo Molnar
@ 2006-11-13  9:08         ` Andi Kleen
  2006-11-13 14:05           ` Ingo Molnar
  2006-11-13 14:14           ` Ingo Molnar
  0 siblings, 2 replies; 23+ messages in thread
From: Andi Kleen @ 2006-11-13  9:08 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Siddha, Suresh B, Andrew Morton, linux-kernel, ashok.raj

On Monday 13 November 2006 09:16, Ingo Molnar wrote:
> 
> * Andi Kleen <ak@suse.de> wrote:
> 
> > Now if it causes device driver issues that's different of course. I 
> > wasn't aware of this before.
> 
> lets try my patch in -mm for a while.

I don't think that's a good idea.

> 
> Had i ever noticed this hack in the first place i would have NAK-ed it. 
> There is a fundamental design friction of a high-level feature like 
> HOTPLUG_CPU /requiring/ a fundamental change to the lowlevel IRQ 
> delivery mode! 

Well to be honest masked mode isn't that useful anyways. It's only
theoretical advantage would be a bit more performance for multicast IPIs, 
but Ashok did benchmarks and it didn't make any significant difference. With that
I prefer to use always the same mode for small and large systems.
Ok should probably drop the ifdef and just always use physical mode.


> Such a requirement is broken and just serves to hide a  
> flaw in the hotplug design - which flaw would trigger on i386 /anyway/, 
> because i386 still uses logical delivery mode for APIC IPIs. 

i386 cpu hotplug is somewhat broken anyways, but it should be fixed
there too eventually. But some very old chipsets don't seem to support
physical properly so it wasn't changed there.

> Also, i'd  
> like to have a description of how to reproduce those CPU hotplug 
> problems, so that i can try to fix it.

iirc they just did stress tests. Plug/unplug cpus in a tight loop and 
do some workloads and see what happens.

-Andi
 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13  9:08         ` Andi Kleen
@ 2006-11-13 14:05           ` Ingo Molnar
  2006-11-13 14:29             ` Andi Kleen
  2006-11-13 14:14           ` Ingo Molnar
  1 sibling, 1 reply; 23+ messages in thread
From: Ingo Molnar @ 2006-11-13 14:05 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Siddha, Suresh B, Andrew Morton, linux-kernel, ashok.raj


* Andi Kleen <ak@suse.de> wrote:

> > Had i ever noticed this hack in the first place i would have NAK-ed 
> > it. There is a fundamental design friction of a high-level feature 
> > like HOTPLUG_CPU /requiring/ a fundamental change to the lowlevel 
> > IRQ delivery mode!
> 
> Well to be honest masked mode isn't that useful anyways. It's only 
> theoretical advantage would be a bit more performance for multicast 
> IPIs, but Ashok did benchmarks and it didn't make any significant 
> difference. [...]

this argument is false for at least two reasons. Firstly, i can show you 
a 1000 small changes that wont be measurable individually but that as a 
summary effect can degrade the kernel substantially.

Secondly, in the physical case, /all/ IPI sending goes through this 
code:

        for_each_cpu_mask(query_cpu, mask) {

yes, even the single-IPI calls which give the overwhelming majority of 
the use of IPIs. Even on systems that have only 2 CPUs to begin with. 
This should be measurable.

> [...] With that I prefer to use always the same mode for small and 
> large systems. Ok should probably drop the ifdef and just always use 
> physical mode.

you are still not getting it i think. The IO-APIC is still in logical 
delivery mode on small systems, and we very much make use of this fact 
by using LowestPrio messages (that current CPUs started to support 
again). The switching to physical mode is dangerous because it creates 
'mixed' APIC messages (physical and logical targeted messages as well) - 
which 'mixed mode' is notorious for erratas both in the CPU and in the 
chipset. (I strongly suspect that my eth timeouts are due to that.) 

Combine this with the fact that /normally/ we default to logical mode, 
the basis of your position is quite puzzling to me.

the right solution is to use pure physical mode (both local APIC and 
IO-APIC) only on large systems, and to use pure logical mode on small 
systems - maybe with the combination of clustered mode as well.

and that's precisely what my patch achieves ...

> > Such a requirement is broken and just serves to hide a flaw in the 
> > hotplug design - which flaw would trigger on i386 /anyway/, because 
> > i386 still uses logical delivery mode for APIC IPIs.
> 
> i386 cpu hotplug is somewhat broken anyways, but it should be fixed 
> there too eventually. But some very old chipsets don't seem to support 
> physical properly so it wasn't changed there.

as i said it before, what you are suggesting is not a 'fix', it's a 
workaround for a design flaw in the hotplug code which flaw is hitting 
us in other places and architectures anyway, and which workaround makes 
us use an inferior IRQ delivery method on small systems ...

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13  9:08         ` Andi Kleen
  2006-11-13 14:05           ` Ingo Molnar
@ 2006-11-13 14:14           ` Ingo Molnar
  1 sibling, 0 replies; 23+ messages in thread
From: Ingo Molnar @ 2006-11-13 14:14 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Siddha, Suresh B, Andrew Morton, linux-kernel, ashok.raj


* Andi Kleen <ak@suse.de> wrote:

> > Also, i'd like to have a description of how to reproduce those CPU 
> > hotplug problems, so that i can try to fix it.
> 
> iirc they just did stress tests. Plug/unplug cpus in a tight loop and 
> do some workloads and see what happens.

i just tested it and it works fine for me.

Andrew, please apply my patch, which fixes a real regression on my box, 
until some detailed instruction / testcase is shown about how to 
reproduce whatever supposed CPU-hotplug bug is hidden by this kludge - 
at which point i offer to fix that CPU-hotplug bug too. (whatever it 
takes)

(I cannot believe that this kludge was allowed into the x86_64 tree and 
that its non-removal is now justified with 'it breaks something around 
the CPU hotplug code, go figure it out yourself'. That's a sure way to 
let the kernel bitrot...)

	Ingo

------------------------>
Subject: [patch] genapic: optimize & fix APIC mode setup
From: Ingo Molnar <mingo@elte.hu>

this patch fixes a couple of inconsistencies/problems i found while
reviewing the x86_64 genapic code (when i was chasing mysterious eth0
timeouts that would only trigger if CPU_HOTPLUG is enabled):

 - AMD systems defaulted to the slower flat-physical mode instead
   of the flat-logical mode. The only restriction on AMD systems
   is that they should not use clustered APIC mode.

 - removed the CPU hotplug hacks, switching the default for small
   systems back from phys-flat to logical-flat. The switching to logical
   flat mode on small systems fixed sporadic ethernet driver timeouts i
   was getting on a dual-core Athlon64 system:

    NETDEV WATCHDOG: eth0: transmit timed out
    eth0: Transmit timeout, status 0c 0005 c07f media 80.
    eth0: Tx queue start entry 32  dirty entry 28.
    eth0:  Tx descriptor 0 is 0008a04a. (queue head)
    eth0:  Tx descriptor 1 is 0008a04a.
    eth0:  Tx descriptor 2 is 0008a04a.
    eth0:  Tx descriptor 3 is 0008a04a.
    eth0: link up, 100Mbps, full-duplex, lpa 0xC5E1

 - The use of '<= 8' was a bug by itself (the valid APIC ids
   for logical flat mode go from 0 to 7, not 0 to 8). The new logic
   is to use logical flat mode on both AMD and Intel systems, and
   to only switch to physical mode when logical mode cannot be used.
   If CPU hotplug is racy wrt. APIC shutdown then CPU hotplug needs
   fixing, not the whole IRQ system be made inconsistent and slowed
   down.

 - minor cleanups: simplified some code constructs

build & booted on a couple of AMD and Intel SMP systems.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
 arch/x86_64/kernel/genapic.c |   54 ++++++++++++++++---------------------------
 1 file changed, 21 insertions(+), 33 deletions(-)

Index: linux/arch/x86_64/kernel/genapic.c
===================================================================
--- linux.orig/arch/x86_64/kernel/genapic.c
+++ linux/arch/x86_64/kernel/genapic.c
@@ -32,30 +32,26 @@ extern struct genapic apic_cluster;
 extern struct genapic apic_flat;
 extern struct genapic apic_physflat;
 
-struct genapic *genapic = &apic_flat;
-
+struct genapic __read_mostly *genapic = &apic_flat;
 
 /*
  * Check the APIC IDs in bios_cpu_apicid and choose the APIC mode.
  */
 void __init clustered_apic_check(void)
 {
-	long i;
-	u8 clusters, max_cluster;
-	u8 id;
-	u8 cluster_cnt[NUM_APIC_CLUSTERS];
-	int max_apic = 0;
+	u8 id, clusters, max_cluster, cluster_cnt[NUM_APIC_CLUSTERS];
+	int i, max_apic = 0;
 
-#if defined(CONFIG_ACPI)
+#ifdef CONFIG_ACPI
 	/*
 	 * Some x86_64 machines use physical APIC mode regardless of how many
 	 * procs/clusters are present (x86_64 ES7000 is an example).
 	 */
-	if (acpi_fadt.revision > FADT2_REVISION_ID)
-		if (acpi_fadt.force_apic_physical_destination_mode) {
-			genapic = &apic_cluster;
-			goto print;
-		}
+	if (acpi_fadt.revision > FADT2_REVISION_ID &&
+			acpi_fadt.force_apic_physical_destination_mode) {
+		genapic = &apic_cluster;
+		goto print;
+	}
 #endif
 
 	memset(cluster_cnt, 0, sizeof(cluster_cnt));
@@ -68,20 +64,17 @@ void __init clustered_apic_check(void)
 		cluster_cnt[APIC_CLUSTERID(id)]++;
 	}
 
-	/* Don't use clustered mode on AMD platforms. */
+	/*
+	 * Don't use clustered mode on AMD platforms, default
+	 * to flat logical mode.
+	 */
  	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
-		genapic = &apic_physflat;
-#ifndef CONFIG_HOTPLUG_CPU
-		/* In the CPU hotplug case we cannot use broadcast mode
-		   because that opens a race when a CPU is removed.
-		   Stay at physflat mode in this case.
-		   It is bad to do this unconditionally though. Once
-		   we have ACPI platform support for CPU hotplug
-		   we should detect hotplug capablity from ACPI tables and
-		   only do this when really needed. -AK */
-		if (max_apic <= 8)
-			genapic = &apic_flat;
-#endif
+		/*
+		 * Switch to physical flat mode if more than 8 APICs
+		 * (In the case of 8 CPUs APIC ID goes from 0 to 7):
+		 */
+		if (max_apic >= 8)
+			genapic = &apic_physflat;
  		goto print;
  	}
 
@@ -103,14 +96,9 @@ void __init clustered_apic_check(void)
 	 * (We don't use lowest priority delivery + HW APIC IRQ steering, so
 	 * can ignore the clustered logical case and go straight to physical.)
 	 */
-	if (clusters <= 1 && max_cluster <= 8 && cluster_cnt[0] == max_cluster) {
-#ifdef CONFIG_HOTPLUG_CPU
-		/* Don't use APIC shortcuts in CPU hotplug to avoid races */
-		genapic = &apic_physflat;
-#else
+	if (clusters <= 1 && max_cluster <= 8 && cluster_cnt[0] == max_cluster)
 		genapic = &apic_flat;
-#endif
-	} else
+	else
 		genapic = &apic_cluster;
 
 print:

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13 14:05           ` Ingo Molnar
@ 2006-11-13 14:29             ` Andi Kleen
  2006-11-13 15:04               ` Ingo Molnar
  0 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2006-11-13 14:29 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Siddha, Suresh B, Andrew Morton, linux-kernel, ashok.raj


> 
> Secondly, in the physical case, /all/ IPI sending goes through this 
> code:
> 
>         for_each_cpu_mask(query_cpu, mask) {
> 
> yes, even the single-IPI calls which give the overwhelming majority of 
> the use of IPIs. Even on systems that have only 2 CPUs to begin with. 
> This should be measurable.

I thought so too originally, but it wasn't. My original thinking
was that logical must be faster because it can in theory send less messages
on large systems.

On the two CPU case it is basically always the same anyways because
the loop is very cheap compared to the IPIs (IPIs tend to be thousands
of cycles). for_each_cpu_mask is essentially just BSF with some glue.

For the > 2 CPU case it is not that obvious -- in theory the hardware
could optimize it to be more efficient, but it doesn't seem to. Or at least
not in a good enough way to show significant differences.

> you are still not getting it i think. The IO-APIC is still in logical 
> delivery mode on small systems, 

The IO-APIC delivery mode that is configured comes from genapic, and should be 
physical for physflat unless I'm totally confused about the code.

> the right solution is to use pure physical mode (both local APIC and 
> IO-APIC) only on large systems, and to use pure logical mode on small 
> systems - maybe with the combination of clustered mode as well.

I disagree.  I think we should just use physical mode everywhere,
except on the old i386 systems.
 
> as i said it before, what you are suggesting is not a 'fix', it's a 
> workaround for a design flaw in the hotplug code which flaw is hitting 
> us in other places and architectures anyway, and which workaround makes 
> us use an inferior IRQ delivery method on small systems ...

It isn't inferior as far as I know.

-Andi

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13 14:29             ` Andi Kleen
@ 2006-11-13 15:04               ` Ingo Molnar
  2006-11-13 16:10                 ` Andi Kleen
  0 siblings, 1 reply; 23+ messages in thread
From: Ingo Molnar @ 2006-11-13 15:04 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Siddha, Suresh B, Andrew Morton, linux-kernel, ashok.raj


* Andi Kleen <ak@suse.de> wrote:

> On the two CPU case it is basically always the same anyways because 
> the loop is very cheap compared to the IPIs (IPIs tend to be thousands 
> of cycles). [...]

the basic thing that you are missing is that some of the most important 
IPI deliveries are /asynchronous/! For those it does not matter at all 
that the IPI delivery itself might be thousands of cycles ... 
Programming the IPI itself is cheap.

> for_each_cpu_mask is essentially just BSF with some glue.

yes - it's a minimum of 2 BSF scans, and that isnt exactly the cheapest 
of instructions. It should be tens of cycles or more, combined with the 
nonzero cost of passing a cpumask (which is 32 bytes) into the function 
by value ...

and your argument again, is to hide a hotplug bug via extra cost (and by 
causing additional regressions) and not fix the hotplug bug? Your 
position makes no sense to me.

> For the > 2 CPU case it is not that obvious -- in theory the hardware 
> could optimize it to be more efficient, but it doesn't seem to. Or at 
> least not in a good enough way to show significant differences.

how did you measure it?

> > you are still not getting it i think. The IO-APIC is still in 
> > logical delivery mode on small systems,
> 
> The IO-APIC delivery mode that is configured comes from genapic, and 
> should be physical for physflat unless I'm totally confused about the 
> code.

indeed, you are right. Btw., this is another change to io_apic.c that i 
would never have ACKed ;-) This whole thing is hidden via something that 
looks like a macro value:

                entry.delivery_mode = INT_DELIVERY_MODE;

but is defined behind the curtain as:

		#define INT_DELIVERY_MODE (genapic->int_delivery_mode)

same goes for TARGET_CPUS:

		#define TARGET_CPUS   (genapic->target_cpus())

ugh!

physical delivery is bad on the IO-APIC anyway - today LowestPrio 
delivery mode works again and the hardware can help us. In physical mode 
there's no LowestPrio delivery mode...

> > the right solution is to use pure physical mode (both local APIC and 
> > IO-APIC) only on large systems, and to use pure logical mode on 
> > small systems - maybe with the combination of clustered mode as 
> > well.
> 
> I disagree.  I think we should just use physical mode everywhere, 
> except on the old i386 systems.

you are risking the introduction of the kind of regressions that i'm 
seeing on my box: that physical delivery mystically doesnt work 
occasionally. AFAIK Windows defaults to logical APIC delivery too on 
small systems. (it in fact uses the TPR IIRC which can only work with 
logical delivery)

So your suggestion would put us up to use an uncommon (and more 
expensive ...) mode of IPI delivery - instead of limiting that rare 
delivery mode to large SMP systems only ...

and again, please remind me of what you are trying to argue: that we 
should keep the slower and less common delivery mode just to not have to 
fix the hotplug bug?

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13 15:04               ` Ingo Molnar
@ 2006-11-13 16:10                 ` Andi Kleen
  2006-11-13 16:32                   ` Ingo Molnar
  0 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2006-11-13 16:10 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Siddha, Suresh B, Andrew Morton, linux-kernel, ashok.raj


> Programming the IPI itself is cheap.

yep, that is why physical mode doesn't hurt much.
 
> > for_each_cpu_mask is essentially just BSF with some glue.
> 
> yes - it's a minimum of 2 BSF scans, and that isnt exactly the cheapest 
> of instructions. It should be tens of cycles or more, combined with the 
> nonzero cost of passing a cpumask (which is 32 bytes) into the function 
> by value ...

Sorry, but everytime IPIs and all the other overhead etc. are involved even 10 BSFs 
would be completely  in the noise.  The relative cost is so much smaller.
I think you're barking up the wrong tree here regarding that loop.

There are surely inefficiencies in the code, but I don't think they're
related to cpu loops.

[BTW the code for the cpu loop is much worse than it could be [1], but 
that's a different story. But even with the worse code it doesn't matter much]

[1] I had an optimization patch for find_next_bit for that case once but gcc
didn't like it and fixing it completely (= getting rid of the bogus 
additional test) would require auditing all architectures. I didn't push
it very hard because I didn't have much evidence it is really performance
critical -- compared to cache misses and locks it is all small fries.

> 
> and your argument again, is to hide a hotplug bug 

I don't think it's a bug -- it's inherent.

> indeed, you are right. Btw., this is another change to io_apic.c that i 
> would never have ACKed ;-) This whole thing is hidden via something that 
> looks like a macro value:

Yes agreed the magic macros are ugly. It has historical reasons from i386 (and I'm 
partly to blame). Essentially it comes out of the mess that i386 subarchs 
are and when moving genapic to x86-64 that particular ugliness wasn't cleaned up.

> physical delivery is bad on the IO-APIC anyway - today LowestPrio 
> delivery mode works again and the hardware can help us. 

Help us with what exactly?

> you are risking the introduction of the kind of regressions that i'm 
> seeing on my box: that physical delivery mystically doesnt work 
> occasionally. AFAIK Windows defaults to logical APIC delivery too on 
> small systems. (it in fact uses the TPR IIRC which can only work with 
> logical delivery)

Yes that's a real risk.

Probably need a solution for your chipset. But currently as proposed by you it 
would be robbing Peter to pay Paul.

> So your suggestion would put us up to use an uncommon (and more 
> expensive ...) mode of IPI delivery - instead of limiting that rare 
> delivery mode to large SMP systems only ...
> 
> and again, please remind me of what you are trying to argue: that we 
> should keep the slower and less common delivery mode just to not have to 
> fix the hotplug bug?

Ok assuming temporarily it's a bug, how would you want to fix it?

-Andi

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13 16:10                 ` Andi Kleen
@ 2006-11-13 16:32                   ` Ingo Molnar
  2006-11-13 18:03                     ` Siddha, Suresh B
  2006-11-13 19:08                     ` Ingo Molnar
  0 siblings, 2 replies; 23+ messages in thread
From: Ingo Molnar @ 2006-11-13 16:32 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Siddha, Suresh B, Andrew Morton, linux-kernel, ashok.raj


* Andi Kleen <ak@suse.de> wrote:

> > > (IPIs tend to be thousands of cycles). [...]
> >
> > Programming the IPI itself is cheap.
> 
> yep, that is why physical mode doesn't hurt much.

and that is why your argument of thousands of cycles per IPI 'cost', in 
comparison to which any other cost is miniscule, was misleading IMO.

> > > for_each_cpu_mask is essentially just BSF with some glue.
> > 
> > yes - it's a minimum of 2 BSF scans, and that isnt exactly the cheapest 
> > of instructions. It should be tens of cycles or more, combined with the 
> > nonzero cost of passing a cpumask (which is 32 bytes) into the function 
> > by value ...
> 
> Sorry, but everytime IPIs and all the other overhead etc. are involved 
> even 10 BSFs would be completely in the noise.  The relative cost is 
> so much smaller. I think you're barking up the wrong tree here 
> regarding that loop.

i'm puzzled, you just confirmed above what i said before, that 
programming the IPI is cheap... If something is cheap then other costs 
added to (such as cpu-mask scanning, and passing around the cpumask) 
definitely do matter.

> There are surely inefficiencies in the code, but I don't think they're 
> related to cpu loops.

it's the death by thousands of small cuts ...

> [BTW the code for the cpu loop is much worse than it could be [1], but 
> that's a different story. But even with the worse code it doesn't 
> matter much]

so by your argument nothing matters, as long as it's just a small cost?? 
My guesstimation is that the cost is a few dozen cycles per IPI.

> > and your argument again, is to hide a hotplug bug
> 
> I don't think it's a bug -- it's inherent.

please explain.

> > indeed, you are right. Btw., this is another change to io_apic.c that i 
> > would never have ACKed ;-) This whole thing is hidden via something that 
> > looks like a macro value:
> 
> Yes agreed the magic macros are ugly. It has historical reasons from 
> i386 (and I'm partly to blame). Essentially it comes out of the mess 
> that i386 subarchs are and when moving genapic to x86-64 that 
> particular ugliness wasn't cleaned up.

yes. It comes out of tacking a feature ontop of a messy codebase (which 
got messy partly because other, similar features like subarch ioapic 
code was not done in conjunction with the necessary cleanup) - this 
mirrors the mess that is in the time code currently. Mess like that is 
just like the overhead of small featurities - it only shows up in a 
noticeable way if piled together.

> > physical delivery is bad on the IO-APIC anyway - today LowestPrio 
> > delivery mode works again and the hardware can help us.
> 
> Help us with what exactly?

logical flat APIC mode defaults to dest_LowestPrio - this delivery mode 
allows the hardware (the chipset) to optimize IRQ routing. This is only 
available in logical delivery mode, because the 'target CPU range' there 
is an APIC ID mask, inherently.

the hardware might know details like 'is a target core in lowpower mode' 
and redirect IRQs accordingly. We could do the same from the kernel only 
via expensive irq-redirection of IO-APIC entries.

> > you are risking the introduction of the kind of regressions that i'm 
> > seeing on my box: that physical delivery mystically doesnt work 
> > occasionally. AFAIK Windows defaults to logical APIC delivery too on 
> > small systems. (it in fact uses the TPR IIRC which can only work 
> > with logical delivery)
> 
> Yes that's a real risk.

it's an unknown risk against the supposedly-known risk of the 
hotplug-CPU breakage. We should pick known risks, obviously.

> > So your suggestion would put us up to use an uncommon (and more 
> > expensive ...) mode of IPI delivery - instead of limiting that rare 
> > delivery mode to large SMP systems only ...
> > 
> > and again, please remind me of what you are trying to argue: that we 
> > should keep the slower and less common delivery mode just to not 
> > have to fix the hotplug bug?
> 
> Ok assuming temporarily it's a bug, how would you want to fix it?

for that i'd have to know the bug, and this is the third time i'm asking 
about specifics :-) (The URL that was given in the thread was about a 
chipset bug regarding incompatibility with clustered-APIC mode - my 
patch in fact - because it switches small systems to use logical flat 
mode always - solves that kind of regression too.)

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13  8:43     ` Ingo Molnar
@ 2006-11-13 17:34       ` Siddha, Suresh B
  0 siblings, 0 replies; 23+ messages in thread
From: Siddha, Suresh B @ 2006-11-13 17:34 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Siddha, Suresh B, Andi Kleen, Andrew Morton, linux-kernel,
	ashok.raj

On Mon, Nov 13, 2006 at 09:43:16AM +0100, Ingo Molnar wrote:
> 
> * Siddha, Suresh B <suresh.b.siddha@intel.com> wrote:
> 
> > There is an issue of using clustered mode along with cpu hotplug. More 
> > details are at the below link.
> > 
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=113261865814107&w=2
> 
> ok, to make sure i understand this right: it is not safe to switch any 
> local APIC in the system into clustered APIC mode on the E850x chipset 
> /at all/, because if one of the CPUs gets an INIT/startup IPI message 
> its local APIC will default to logical flat mode and might confuse the 
> chipset?

"at all" is not quite correct. We are fine as long as all the cpus are up
in clustered APIC mode before the IO-APIC RTE's are programmed.

Once the IO-APIC subsystem is up and running and later if the cpu comes online
then we have a window between the INIT/startup IPI message and the place
where we program the DFR and LDR, with in which the IO-APIC will interpret
the logical mode in RTE as 'logical flat' and will probably result in missing
the interrupt.

thanks,
suresh
> 
> on large systems that have their APIC IDs set up to group CPUs amongst 
> different clusters and hence triggered cluster mode, the chipset does 
> not get confused by this, correct?
> 
> 	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13 16:32                   ` Ingo Molnar
@ 2006-11-13 18:03                     ` Siddha, Suresh B
  2006-11-13 18:42                       ` Ingo Molnar
  2006-11-13 19:08                     ` Ingo Molnar
  1 sibling, 1 reply; 23+ messages in thread
From: Siddha, Suresh B @ 2006-11-13 18:03 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andi Kleen, Siddha, Suresh B, Andrew Morton, linux-kernel,
	ashok.raj

On Mon, Nov 13, 2006 at 05:32:16PM +0100, Ingo Molnar wrote:
> * Andi Kleen <ak@suse.de> wrote:
> > Ok assuming temporarily it's a bug, how would you want to fix it?
> 
> for that i'd have to know the bug, and this is the third time i'm asking 
> about specifics :-) (The URL that was given in the thread was about a 
> chipset bug regarding incompatibility with clustered-APIC mode - my 
> patch in fact - because it switches small systems to use logical flat 
> mode always - solves that kind of regression too.)

Not really. That chipset belongs to a MP platform and with your proposed patch,
we will endup using clustered APIC mode and will hit the issue(in the presence
of cpu hotplug) mentioned in that URL.

We will find out if this behavior is specific to this single chipset and getback
to you.

thanks,
suresh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13 18:42                       ` Ingo Molnar
@ 2006-11-13 18:30                         ` Siddha, Suresh B
  2006-11-13 19:04                           ` Ingo Molnar
  2006-11-13 19:31                         ` Ashok Raj
  1 sibling, 1 reply; 23+ messages in thread
From: Siddha, Suresh B @ 2006-11-13 18:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Siddha, Suresh B, Andi Kleen, Andrew Morton, linux-kernel,
	ashok.raj

On Mon, Nov 13, 2006 at 07:42:56PM +0100, Ingo Molnar wrote:
> * Siddha, Suresh B <suresh.b.siddha@intel.com> wrote:
> > Not really. That chipset belongs to a MP platform and with your 
> > proposed patch, we will endup using clustered APIC mode and will hit 
> > the issue(in the presence of cpu hotplug) mentioned in that URL.
> 
> hm, why does it end up in clustered mode? Cluster mode should only 
> trigger if the APIC IDs go beyond 16.

go beyond '8' not 16. With Dual-core+HT these MP platforms will have 16 logical
cpus.

> 
> but i'd be fine with never going into cluster mode, instead always using 
> physical flat mode when having more than 8 APICs (independent of the 
> presence of CPU hotplug). On small systems, logical flat mode is what is 
> the best-tested variant (it's also slightly faster).

Ok.

thanks,
suresh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13 18:03                     ` Siddha, Suresh B
@ 2006-11-13 18:42                       ` Ingo Molnar
  2006-11-13 18:30                         ` Siddha, Suresh B
  2006-11-13 19:31                         ` Ashok Raj
  0 siblings, 2 replies; 23+ messages in thread
From: Ingo Molnar @ 2006-11-13 18:42 UTC (permalink / raw)
  To: Siddha, Suresh B; +Cc: Andi Kleen, Andrew Morton, linux-kernel, ashok.raj


* Siddha, Suresh B <suresh.b.siddha@intel.com> wrote:

> > for that i'd have to know the bug, and this is the third time i'm 
> > asking about specifics :-) (The URL that was given in the thread was 
> > about a chipset bug regarding incompatibility with clustered-APIC 
> > mode - my patch in fact - because it switches small systems to use 
> > logical flat mode always - solves that kind of regression too.)
> 
> Not really. That chipset belongs to a MP platform and with your 
> proposed patch, we will endup using clustered APIC mode and will hit 
> the issue(in the presence of cpu hotplug) mentioned in that URL.

hm, why does it end up in clustered mode? Cluster mode should only 
trigger if the APIC IDs go beyond 16.

but i'd be fine with never going into cluster mode, instead always using 
physical flat mode when having more than 8 APICs (independent of the 
presence of CPU hotplug). On small systems, logical flat mode is what is 
the best-tested variant (it's also slightly faster).

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13 19:04                           ` Ingo Molnar
@ 2006-11-13 18:58                             ` Siddha, Suresh B
  0 siblings, 0 replies; 23+ messages in thread
From: Siddha, Suresh B @ 2006-11-13 18:58 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Siddha, Suresh B, Andi Kleen, Andrew Morton, linux-kernel,
	ashok.raj

On Mon, Nov 13, 2006 at 08:04:52PM +0100, Ingo Molnar wrote:
> * Siddha, Suresh B <suresh.b.siddha@intel.com> wrote:
> > On Mon, Nov 13, 2006 at 07:42:56PM +0100, Ingo Molnar wrote:
> > > but i'd be fine with never going into cluster mode, instead always 
> > > using physical flat mode when having more than 8 APICs (independent 
> > > of the presence of CPU hotplug). On small systems, logical flat mode 
> > > is what is the best-tested variant (it's also slightly faster).
> > 
> > Ok.
> 
> ok, that's really good. Is there any 'weird' platform that you are aware 
> of that absolutely needs clustered APIC mode (because it has no physical 
> delivery mode or something)?

None that I am aware of, atleast in the x86_64 world.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13 18:30                         ` Siddha, Suresh B
@ 2006-11-13 19:04                           ` Ingo Molnar
  2006-11-13 18:58                             ` Siddha, Suresh B
  0 siblings, 1 reply; 23+ messages in thread
From: Ingo Molnar @ 2006-11-13 19:04 UTC (permalink / raw)
  To: Siddha, Suresh B; +Cc: Andi Kleen, Andrew Morton, linux-kernel, ashok.raj


* Siddha, Suresh B <suresh.b.siddha@intel.com> wrote:

> On Mon, Nov 13, 2006 at 07:42:56PM +0100, Ingo Molnar wrote:
> > * Siddha, Suresh B <suresh.b.siddha@intel.com> wrote:
> > > Not really. That chipset belongs to a MP platform and with your 
> > > proposed patch, we will endup using clustered APIC mode and will hit 
> > > the issue(in the presence of cpu hotplug) mentioned in that URL.
> > 
> > hm, why does it end up in clustered mode? Cluster mode should only 
> > trigger if the APIC IDs go beyond 16.
> 
> go beyond '8' not 16. With Dual-core+HT these MP platforms will have 
> 16 logical cpus.

yes - then the APIC ids will go beyond 15.

> > but i'd be fine with never going into cluster mode, instead always 
> > using physical flat mode when having more than 8 APICs (independent 
> > of the presence of CPU hotplug). On small systems, logical flat mode 
> > is what is the best-tested variant (it's also slightly faster).
> 
> Ok.

ok, that's really good. Is there any 'weird' platform that you are aware 
of that absolutely needs clustered APIC mode (because it has no physical 
delivery mode or something)?

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13 16:32                   ` Ingo Molnar
  2006-11-13 18:03                     ` Siddha, Suresh B
@ 2006-11-13 19:08                     ` Ingo Molnar
  1 sibling, 0 replies; 23+ messages in thread
From: Ingo Molnar @ 2006-11-13 19:08 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Siddha, Suresh B, Andrew Morton, linux-kernel, ashok.raj


* Ingo Molnar <mingo@elte.hu> wrote:

> [...] If something is cheap then other costs added to (such as 
> cpu-mask scanning, and passing around the cpumask) definitely do 
> matter.

btw., i dont claim this is a big issue - we could solve this for the 
common case even in physical delivery mode by adding a 'single IPI 
delivery' callback to the genapic methods. That could be used by the 
reschedule IPI for example.

	Ingo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [patch] genapic: optimize & fix APIC mode setup
  2006-11-13 18:42                       ` Ingo Molnar
  2006-11-13 18:30                         ` Siddha, Suresh B
@ 2006-11-13 19:31                         ` Ashok Raj
  1 sibling, 0 replies; 23+ messages in thread
From: Ashok Raj @ 2006-11-13 19:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Siddha, Suresh B, Andi Kleen, Andrew Morton, linux-kernel,
	ashok.raj

On Mon, Nov 13, 2006 at 07:42:56PM +0100, Ingo Molnar wrote:
> 
> * Siddha, Suresh B <suresh.b.siddha@intel.com> wrote:
> 
> > > for that i'd have to know the bug, and this is the third time i'm 
> > > asking about specifics :-) (The URL that was given in the thread was 
> > > about a chipset bug regarding incompatibility with clustered-APIC 
> > > mode - my patch in fact - because it switches small systems to use 
> > > logical flat mode always - solves that kind of regression too.)
> > 
> > Not really. That chipset belongs to a MP platform and with your 
> > proposed patch, we will endup using clustered APIC mode and will hit 
> > the issue(in the presence of cpu hotplug) mentioned in that URL.
> 
> hm, why does it end up in clustered mode? Cluster mode should only 
> trigger if the APIC IDs go beyond 16.

number of cpus >8 we need to switch to either clustered or flat physical mode.

> 
> but i'd be fine with never going into cluster mode, instead always using 
> physical flat mode when having more than 8 APICs (independent of the 
> presence of CPU hotplug). On small systems, logical flat mode is what is 
> the best-tested variant (it's also slightly faster).
> 
> 	Ingo

I think we choose flat physical mode for two reasons, x86_64 was doing the same
so we were consistent with the approach. The other was when we have cpu hotplug
enabled its difficult to say what the max number of cpus will be. So
choosing flat logical was not right even if we had 8 or less cpus, and 
technically we could think of something that starts with less and go beyond 
max cluster limit of 60 as well.

Hence Andi introduces another way to limit it based on disabled cpu count in 
MADT, or user can specify at startup time to specify the limit of potentially
hot-pluggable cpus.

bigsmp was also a relatively new variant at the time, and all agreed to keep
things simple instead of having 3 different approaches, (logical flat,
logical cluster, and flat physical). Running some simple tests we didnt
see any appreciable differences at that time, maybe we were not that exhaustive.



-- 
Cheers,
Ashok Raj
- Open Source Technology Center

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2006-11-13 19:54 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-11-11 15:14 [patch] genapic: optimize & fix APIC mode setup Ingo Molnar
2006-11-11 15:20 ` Andi Kleen
2006-11-11 15:27   ` Ingo Molnar
2006-11-11 15:39   ` Ingo Molnar
2006-11-13  1:50   ` Siddha, Suresh B
2006-11-13  2:32     ` Andi Kleen
2006-11-13  8:16       ` Ingo Molnar
2006-11-13  9:08         ` Andi Kleen
2006-11-13 14:05           ` Ingo Molnar
2006-11-13 14:29             ` Andi Kleen
2006-11-13 15:04               ` Ingo Molnar
2006-11-13 16:10                 ` Andi Kleen
2006-11-13 16:32                   ` Ingo Molnar
2006-11-13 18:03                     ` Siddha, Suresh B
2006-11-13 18:42                       ` Ingo Molnar
2006-11-13 18:30                         ` Siddha, Suresh B
2006-11-13 19:04                           ` Ingo Molnar
2006-11-13 18:58                             ` Siddha, Suresh B
2006-11-13 19:31                         ` Ashok Raj
2006-11-13 19:08                     ` Ingo Molnar
2006-11-13 14:14           ` Ingo Molnar
2006-11-13  8:43     ` Ingo Molnar
2006-11-13 17:34       ` Siddha, Suresh B

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox