All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode
       [not found] ` <20050905231628.GA16476@muc.de>
@ 2005-09-06 23:12   ` Ashok Raj
  2005-09-09 17:07     ` Zwane Mwaikambo
  0 siblings, 1 reply; 11+ messages in thread
From: Ashok Raj @ 2005-09-06 23:12 UTC (permalink / raw)
  To: Andi Kleen; +Cc: akpm, ashok.raj, linux-kernel

On Tue, Sep 06, 2005 at 01:16:28AM +0200, Andi Kleen wrote:
> On Sat, Sep 03, 2005 at 02:33:30PM -0700, akpm@osdl.org wrote:
> > 
> > From: Ashok Raj <ashok.raj@intel.com>
> > 
> > Newly introduced physflat_* shares way too much with cluster with only a very
> > differences.  So we introduce some common functions in that can be reused in
> > both cases.
> > 
> > In addition the following are also fixed.
> > - Use of non-existent CONFIG_CPU_HOTPLUG option renamed to actual one in use.
> > - Removed comment that ACPI would provide a way to select this dynamically
> >   since ACPI_CONFIG_HOTPLUG_CPU already exists that indicates platform support
> >   for hotplug via ACPI. In addition CONFIG_HOTPLUG_CPU only indicates logical 
> >   offline/online which is even used by Suspend/Resume folks where the same 
> >   support (for no-broadcast) is required.
> 
> 
> (hmm did I reply to that? I think I did but my mailer seems to have
> lost the r flag. My apologies if it's a duplicate) 
> 
> I didn't like that one because it makes the code less readable than
> before imho. I did a separate patch for the CPU_HOTPLUG typo.

The code is less readable? Now iam confused. Attached the link to patch
below to refresh your memory.

http://marc.theaimsgroup.com/?l=linux-kernel&m=112293577309653&w=2

diffstat would show we have fewer lines ~40 less lines of code. physflat
basicaly copied/cloned some useful code in cluster and some from flat mode
genapic code. 

I would have consolidated the code in the first place when you put the physflat
mode. Again this was just my habit, cant step over code bloat and duplication.

Which part of the code is unreadable to you? If you are happy with just renamed
functions with copied body of the code which is what physflat did, thats fine.

I was just puzzeled at the convoluted and less readable part of the code. If 
there is something you like to point out, i would be happy to fix it.. or you 
can if you prefer it that way.


-- 
Cheers,
Ashok Raj
- Open Source Technology Center

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode
  2005-09-06 23:12   ` [patch 13/14] x86_64: Use common functions in cluster and physflat mode Ashok Raj
@ 2005-09-09 17:07     ` Zwane Mwaikambo
  2005-09-09 20:45       ` Ashok Raj
  2005-09-10  0:30       ` Andi Kleen
  0 siblings, 2 replies; 11+ messages in thread
From: Zwane Mwaikambo @ 2005-09-09 17:07 UTC (permalink / raw)
  To: Ashok Raj; +Cc: Andi Kleen, Andrew Morton, Linux Kernel

On Tue, 6 Sep 2005, Ashok Raj wrote:

> On Tue, Sep 06, 2005 at 01:16:28AM +0200, Andi Kleen wrote:
> > On Sat, Sep 03, 2005 at 02:33:30PM -0700, akpm@osdl.org wrote:
> > > 
> > > From: Ashok Raj <ashok.raj@intel.com>
> > > 
> > > Newly introduced physflat_* shares way too much with cluster with only a very
> > > differences.  So we introduce some common functions in that can be reused in
> > > both cases.

On a slightly different topic, how come we're using physflat for hotplug 
cpu?

-#ifndef CONFIG_CPU_HOTPLUG
 		/* In the CPU hotplug case we cannot use broadcast mode
 		   because that opens a race when a CPU is removed.
-		   Stay at physflat mode in this case.
-		   It is bad to do this unconditionally though. Once
-		   we have ACPI platform support for CPU hotplug
-		   we should detect hotplug capablity from ACPI tables and
-		   only do this when really needed. -AK */
+		   Stay at physflat mode in this case. - AK */
+#ifdef CONFIG_HOTPLUG_CPU
 		if (num_cpus <= 8)
 			genapic = &apic_flat;

Thanks,
	Zwane


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode
  2005-09-09 17:07     ` Zwane Mwaikambo
@ 2005-09-09 20:45       ` Ashok Raj
  2005-09-11 16:44         ` Zwane Mwaikambo
  2005-09-10  0:30       ` Andi Kleen
  1 sibling, 1 reply; 11+ messages in thread
From: Ashok Raj @ 2005-09-09 20:45 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Ashok Raj, Andi Kleen, Andrew Morton, Linux Kernel

On Fri, Sep 09, 2005 at 10:07:28AM -0700, Zwane Mwaikambo wrote:
> On a slightly different topic, how come we're using physflat for hotplug 
> cpu?
> 
> -#ifndef CONFIG_CPU_HOTPLUG
>  		/* In the CPU hotplug case we cannot use broadcast mode
>  		   because that opens a race when a CPU is removed.
> -		   Stay at physflat mode in this case.
> -		   It is bad to do this unconditionally though. Once
> -		   we have ACPI platform support for CPU hotplug
> -		   we should detect hotplug capablity from ACPI tables and
> -		   only do this when really needed. -AK */
> +		   Stay at physflat mode in this case. - AK */
> +#ifdef CONFIG_HOTPLUG_CPU
>  		if (num_cpus <= 8)
>  			genapic = &apic_flat;

What you say was true before this patch, (Although now that you point out i 
realize the ifdef CONFIG_HOTPLUG_CPU is not required). 

Think Andi is fixing this in his next drop to -mm*

When physflat was introduced, it also  switched to use physflat mode for 
#cpus <=8 when hotplug is enabled, since it doesnt use shortcuts and 
so is also safer (although slower). 

http://marc.theaimsgroup.com/?l=linux-kernel&m=112317686712929&w=2

The link above made using genapic_flat safer by using the
flat_send_IPI_mask(), and hence i switched back to using
logical flat mode when #cpus <=8, since that a little more efficient than
the send_IPI_mask_sequence() used in physflat mode.

In general we need

flat_mode - #cpus <= 8 (Hotplug defined or not, so we use mask version 
                       for safety)

physflat or cluster_mode when #cpus >8.

If we choose physflat as default for #cpus <=8 (with hotplug) would make
IPI performance worse, since it would do one cpu at a time, and requires 2 
writes per cpu for each IPI v.s just 2 for a flat mode mask version of the API.

-- 
Cheers,
Ashok Raj
- Open Source Technology Center

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode
  2005-09-09 17:07     ` Zwane Mwaikambo
  2005-09-09 20:45       ` Ashok Raj
@ 2005-09-10  0:30       ` Andi Kleen
  2005-09-10  1:51         ` Zwane Mwaikambo
  1 sibling, 1 reply; 11+ messages in thread
From: Andi Kleen @ 2005-09-10  0:30 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Ashok Raj, Andrew Morton, Linux Kernel

On Fri, Sep 09, 2005 at 10:07:28AM -0700, Zwane Mwaikambo wrote:
> On Tue, 6 Sep 2005, Ashok Raj wrote:
> 
> > On Tue, Sep 06, 2005 at 01:16:28AM +0200, Andi Kleen wrote:
> > > On Sat, Sep 03, 2005 at 02:33:30PM -0700, akpm@osdl.org wrote:
> > > > 
> > > > From: Ashok Raj <ashok.raj@intel.com>
> > > > 
> > > > Newly introduced physflat_* shares way too much with cluster with only a very
> > > > differences.  So we introduce some common functions in that can be reused in
> > > > both cases.
> 
> On a slightly different topic, how come we're using physflat for hotplug 
> cpu?

The original idea was to always use physflat mode for hotplug because
that does all the sequencing stuff and avoids the shortcut races.
But then Ashok decided it was better to add more ifdefs to flat mode
instead and I gave up protesting at some point.

-Andi

> 
> -#ifndef CONFIG_CPU_HOTPLUG
>  		/* In the CPU hotplug case we cannot use broadcast mode
>  		   because that opens a race when a CPU is removed.
> -		   Stay at physflat mode in this case.
> -		   It is bad to do this unconditionally though. Once
> -		   we have ACPI platform support for CPU hotplug
> -		   we should detect hotplug capablity from ACPI tables and
> -		   only do this when really needed. -AK */
> +		   Stay at physflat mode in this case. - AK */
> +#ifdef CONFIG_HOTPLUG_CPU
>  		if (num_cpus <= 8)
>  			genapic = &apic_flat;
> 
> Thanks,
> 	Zwane
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode
  2005-09-10  0:30       ` Andi Kleen
@ 2005-09-10  1:51         ` Zwane Mwaikambo
  0 siblings, 0 replies; 11+ messages in thread
From: Zwane Mwaikambo @ 2005-09-10  1:51 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ashok Raj, Andrew Morton, Linux Kernel

On Fri, 10 Sep 2005, Andi Kleen wrote:

> On Fri, Sep 09, 2005 at 10:07:28AM -0700, Zwane Mwaikambo wrote:
> > On Tue, 6 Sep 2005, Ashok Raj wrote:
> > 
> > > On Tue, Sep 06, 2005 at 01:16:28AM +0200, Andi Kleen wrote:
> > > > On Sat, Sep 03, 2005 at 02:33:30PM -0700, akpm@osdl.org wrote:
> > > > > 
> > > > > From: Ashok Raj <ashok.raj@intel.com>
> > > > > 
> > > > > Newly introduced physflat_* shares way too much with cluster with only a very
> > > > > differences.  So we introduce some common functions in that can be reused in
> > > > > both cases.
> > 
> > On a slightly different topic, how come we're using physflat for hotplug 
> > cpu?
> 
> The original idea was to always use physflat mode for hotplug because
> that does all the sequencing stuff and avoids the shortcut races.
> But then Ashok decided it was better to add more ifdefs to flat mode
> instead and I gave up protesting at some point.

Ok so you wanted to segragate them, i can understand that, but didn't we 
have a version which worked around the races by doing the same thing, 
hotplug or not? Is this the one where you weren't pleased with the 
supposed execution penalty?

Thanks,
	Zwane


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode
  2005-09-09 20:45       ` Ashok Raj
@ 2005-09-11 16:44         ` Zwane Mwaikambo
  2005-09-11 23:02           ` Andi Kleen
  0 siblings, 1 reply; 11+ messages in thread
From: Zwane Mwaikambo @ 2005-09-11 16:44 UTC (permalink / raw)
  To: Ashok Raj; +Cc: Andi Kleen, Andrew Morton, Linux Kernel

On Fri, 9 Sep 2005, Ashok Raj wrote:

> In general we need
> 
> flat_mode - #cpus <= 8 (Hotplug defined or not, so we use mask version 
>                        for safety)
> 
> physflat or cluster_mode when #cpus >8.

I agree there.

> If we choose physflat as default for #cpus <=8 (with hotplug) would make
> IPI performance worse, since it would do one cpu at a time, and requires 2 
> writes per cpu for each IPI v.s just 2 for a flat mode mask version of the API.

I don't see the benefit then :/ I certainly hope we don't go that route.

Thanks,
	Zwane


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode
  2005-09-11 16:44         ` Zwane Mwaikambo
@ 2005-09-11 23:02           ` Andi Kleen
  2005-09-12 22:23             ` Ashok Raj
  0 siblings, 1 reply; 11+ messages in thread
From: Andi Kleen @ 2005-09-11 23:02 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Ashok Raj, Andrew Morton, Linux Kernel

On Sun, Sep 11, 2005 at 09:44:16AM -0700, Zwane Mwaikambo wrote:
> On Fri, 9 Sep 2005, Ashok Raj wrote:
> 
> > In general we need
> > 
> > flat_mode - #cpus <= 8 (Hotplug defined or not, so we use mask version 
> >                        for safety)
> > 
> > physflat or cluster_mode when #cpus >8.
> 
> I agree there.
> 
> > If we choose physflat as default for #cpus <=8 (with hotplug) would make
> > IPI performance worse, since it would do one cpu at a time, and requires 2 
> > writes per cpu for each IPI v.s just 2 for a flat mode mask version of the API.
> 
> I don't see the benefit then :/ I certainly hope we don't go that route.

I originally made that objection, but Ashok then did some benchmarks
that showed essentially no difference. I can see the point - it's 
likely that an access to the local APIC (which is in the CPU) is fast
(we know it is) and all the time is dominated by sending the 
requests over the wires between the CPUs. So it shouldn't matter 
much if you use sequence mode or masks.

That is why I changed my mind and just made physflat default for the hotplug
case (which will be essentially everywhere because I expect most kernels
to have hotplug enabled in the future) 

It's a bit of a mess in mm right now because me and Ashok have been fixing 
similar problems in a different way (e.g. the patch in flat to
use the sequence sending path is also not needed anymore with that) 
Need to clean this up a bit.

Handling it properly for i386 is also still needed e.g. the older physflat
patch I did needs to be merged with bigsmp and cleaned up a bit.
But I don't have time to do it before monday so it'll miss the 2.6.14
window anyways.

-Andi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode
  2005-09-11 23:02           ` Andi Kleen
@ 2005-09-12 22:23             ` Ashok Raj
  2005-09-13  8:10               ` Andi Kleen
  0 siblings, 1 reply; 11+ messages in thread
From: Ashok Raj @ 2005-09-12 22:23 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Zwane Mwaikambo, Ashok Raj, Andrew Morton, Linux Kernel

On Mon, Sep 12, 2005 at 01:02:20AM +0200, Andi Kleen wrote:
> On Sun, Sep 11, 2005 at 09:44:16AM -0700, Zwane Mwaikambo wrote:
> > On Fri, 9 Sep 2005, Ashok Raj wrote:
> > 
> > > IPI performance worse, since it would do one cpu at a time, and requires 2 
> > > writes per cpu for each IPI v.s just 2 for a flat mode mask version of the API.
> > 
> > I don't see the benefit then :/ I certainly hope we don't go that route.
> 
> I originally made that objection, but Ashok then did some benchmarks
> that showed essentially no difference. I can see the point - it's 


Just a minor difference is that for a 8 CPU system

Using IPI shortcut uses just 1 write to local apic
Using mask version of broadcast uses 2 writes to local apic

The stat showed no significant difference when we do the 1 extra write. But
that is for the whole system.

When we use send_IPI_mask_sequence() we use 14 writes to (two writes
for each CPU that we need to target a IPI). 

Using the same stats i used earlier, i see about 0x200 - 0x1000 extra cycles 
when using mask_sequence() on certain long runs about 100K samples.

We should probably remove the !HOTPLUG case and just use the mask version
for all cases <=8 CPUS, use physflat or the cluster mode for >8cpus as 
the case may be, instead of defaulting to sequence_IPI which seems
a little overkill for the intended purpose.

> likely that an access to the local APIC (which is in the CPU) is fast
> (we know it is) and all the time is dominated by sending the 
> requests over the wires between the CPUs. So it shouldn't matter 
> much if you use sequence mode or masks.
> 
> That is why I changed my mind and just made physflat default for the hotplug
> case (which will be essentially everywhere because I expect most kernels
> to have hotplug enabled in the future) 
> 
> It's a bit of a mess in mm right now because me and Ashok have been fixing 
> similar problems in a different way (e.g. the patch in flat to
> use the sequence sending path is also not needed anymore with that) 
> Need to clean this up a bit.
> 
> Handling it properly for i386 is also still needed e.g. the older physflat
> patch I did needs to be merged with bigsmp and cleaned up a bit.
> But I don't have time to do it before monday so it'll miss the 2.6.14
> window anyways.
> 
> -Andi

-- 
Cheers,
Ashok Raj
- Open Source Technology Center

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode
  2005-09-12 22:23             ` Ashok Raj
@ 2005-09-13  8:10               ` Andi Kleen
  2005-09-13 14:53                 ` Zwane Mwaikambo
  0 siblings, 1 reply; 11+ messages in thread
From: Andi Kleen @ 2005-09-13  8:10 UTC (permalink / raw)
  To: Ashok Raj; +Cc: Zwane Mwaikambo, Andrew Morton, Linux Kernel

> We should probably remove the !HOTPLUG case and just use the mask version
> for all cases <=8 CPUS, use physflat or the cluster mode for >8cpus as 
> the case may be, instead of defaulting to sequence_IPI which seems
> a little overkill for the intended purpose.

Or just always use physflat and remove the logical flat case? 
That seems cleanest to me. Any objections? 

-Andi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode
  2005-09-13  8:10               ` Andi Kleen
@ 2005-09-13 14:53                 ` Zwane Mwaikambo
  2005-09-13 15:31                   ` Andi Kleen
  0 siblings, 1 reply; 11+ messages in thread
From: Zwane Mwaikambo @ 2005-09-13 14:53 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Ashok Raj, Andrew Morton, Linux Kernel

On Tue, 13 Sep 2005, Andi Kleen wrote:

> > We should probably remove the !HOTPLUG case and just use the mask version
> > for all cases <=8 CPUS, use physflat or the cluster mode for >8cpus as 
> > the case may be, instead of defaulting to sequence_IPI which seems
> > a little overkill for the intended purpose.
> 
> Or just always use physflat and remove the logical flat case? 
> That seems cleanest to me. Any objections? 

My objection is the number of APIC writes required to issue IPIs to a 
group of processors, however i do understand that it would help 
maintainability and testing coverage if we reduce the number of operating 
modes, are you proposing physflat for _everything_ ?

Thanks,
	Zwane


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode
  2005-09-13 14:53                 ` Zwane Mwaikambo
@ 2005-09-13 15:31                   ` Andi Kleen
  0 siblings, 0 replies; 11+ messages in thread
From: Andi Kleen @ 2005-09-13 15:31 UTC (permalink / raw)
  To: Zwane Mwaikambo; +Cc: Ashok Raj, Andrew Morton, Linux Kernel

On Tue, Sep 13, 2005 at 07:53:03AM -0700, Zwane Mwaikambo wrote:
> On Tue, 13 Sep 2005, Andi Kleen wrote:
> 
> > > We should probably remove the !HOTPLUG case and just use the mask version
> > > for all cases <=8 CPUS, use physflat or the cluster mode for >8cpus as 
> > > the case may be, instead of defaulting to sequence_IPI which seems
> > > a little overkill for the intended purpose.
> > 
> > Or just always use physflat and remove the logical flat case? 
> > That seems cleanest to me. Any objections? 
> 
> My objection is the number of APIC writes required to issue IPIs to a 
> group of processors, however i do understand that it would help 

local APIC accesses are cheap, they don't go over any external bus.
It's not like a PCI cycle.

> maintainability and testing coverage if we reduce the number of operating 
> modes, are you proposing physflat for _everything_ ?

On everything that's currently run by flat yes. Possibly the others
too when tested.

-Andi

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2005-09-13 15:31 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <200509032135.j83LZ8gX020554@shell0.pdx.osdl.net>
     [not found] ` <20050905231628.GA16476@muc.de>
2005-09-06 23:12   ` [patch 13/14] x86_64: Use common functions in cluster and physflat mode Ashok Raj
2005-09-09 17:07     ` Zwane Mwaikambo
2005-09-09 20:45       ` Ashok Raj
2005-09-11 16:44         ` Zwane Mwaikambo
2005-09-11 23:02           ` Andi Kleen
2005-09-12 22:23             ` Ashok Raj
2005-09-13  8:10               ` Andi Kleen
2005-09-13 14:53                 ` Zwane Mwaikambo
2005-09-13 15:31                   ` Andi Kleen
2005-09-10  0:30       ` Andi Kleen
2005-09-10  1:51         ` Zwane Mwaikambo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.