* change last level cache alignment on x86?
@ 2012-03-01 8:33 Alex,Shi
2012-03-02 7:30 ` Alex Shi
0 siblings, 1 reply; 6+ messages in thread
From: Alex,Shi @ 2012-03-01 8:33 UTC (permalink / raw)
To: tglx, hpa, mingo; +Cc: linux-kernel@vger.kernel.org, x86, asit.k.mallick
Currently the last level cache alignment defined in the kernel is still
128 bytes, but I checked Intel's Core2, NHM, SNB and Atom series
platforms, and all of them use 64 bytes.
I do not have detailed info on AMD platforms; perhaps someone could
provide it here. So, is it possible to make a change similar to the
following to use 64-byte cache alignment in the kernel?
===
diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index 3c57033..f342a5a 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -303,7 +303,7 @@ config X86_GENERIC
 config X86_INTERNODE_CACHE_SHIFT
 	int
 	default "12" if X86_VSMP
-	default "7" if NUMA
+	default "7" if NUMA && (MPENTIUM4)
 	default X86_L1_CACHE_SHIFT
 
 config X86_CMPXCHG
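
The Kconfig values in this diff are shifts rather than byte counts: the cscope listing later in the thread shows INTERNODE_CACHE_BYTES defined as (1 << INTERNODE_CACHE_SHIFT). A minimal sketch of that arithmetic follows; the helper name is hypothetical and is not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not kernel code): convert a cache shift from
 * Kconfig into a byte count, mirroring (1 << INTERNODE_CACHE_SHIFT).
 */
static inline size_t demo_cache_bytes(unsigned int shift)
{
	return (size_t)1 << shift;
}
```

Under this sketch, a shift of 7 corresponds to the 128 bytes mentioned above, a shift of 6 to 64 bytes, and the X86_VSMP value of 12 to 4096 bytes.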
* Re: change last level cache alignment on x86?
2012-03-01 8:33 change last level cache alignment on x86? Alex,Shi
@ 2012-03-02 7:30 ` Alex Shi
2012-03-02 8:12 ` Ingo Molnar
0 siblings, 1 reply; 6+ messages in thread
From: Alex Shi @ 2012-03-02 7:30 UTC (permalink / raw)
To: tglx; +Cc: hpa, mingo, linux-kernel@vger.kernel.org, x86, asit.k.mallick
On Thu, 2012-03-01 at 16:33 +0800, Alex,Shi wrote:
> Currently last level defined in kernel is still 128 bytes, but actually
> I checked intel's core2, NHM, SNB, atom, serial platforms, all of them
> are using 64 bytes.
> I did not get detailed info on AMD platforms. Guess someone like to give
> the info here. So, Is if it possible to do the similar following changes
> to use 64 byte cache alignment in kernel?
>
> ===
> diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
> index 3c57033..f342a5a 100644
> --- a/arch/x86/Kconfig.cpu
> +++ b/arch/x86/Kconfig.cpu
> @@ -303,7 +303,7 @@ config X86_GENERIC
> config X86_INTERNODE_CACHE_SHIFT
> int
> default "12" if X86_VSMP
> - default "7" if NUMA
> + default "7" if NUMA && (MPENTIUM4)
> default X86_L1_CACHE_SHIFT
>
> config X86_CMPXCHG
In arch/x86/include/asm/cache.h, the INTERNODE_CACHE_SHIFT value
ultimately feeds into '__cacheline_aligned_in_smp':
#ifdef CONFIG_X86_VSMP
#ifdef CONFIG_SMP
#define __cacheline_aligned_in_smp					\
	__attribute__((__aligned__(INTERNODE_CACHE_BYTES)))		\
	__page_aligned_data
#endif
#endif
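
To illustrate what an alignment attribute of this kind costs, here is a small sketch for GCC or Clang. The struct names are hypothetical, and the fixed values 64 and 128 stand in for INTERNODE_CACHE_BYTES; note that the macro quoted above is only defined under CONFIG_X86_VSMP and CONFIG_SMP, so this shows the general mechanism rather than that exact code path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo types: the same payload with two alignments. */
struct demo_counter64 {
	long value;
} __attribute__((__aligned__(64)));

struct demo_counter128 {
	long value;
} __attribute__((__aligned__(128)));

/* Bytes used by an array of n such objects, padding included. */
static inline size_t demo_array_bytes64(size_t n)
{
	return n * sizeof(struct demo_counter64);
}

static inline size_t demo_array_bytes128(size_t n)
{
	return n * sizeof(struct demo_counter128);
}
```

Because the compiler rounds the size of each object up to its alignment, every instance of the 128-byte-aligned type occupies twice the space of the 64-byte-aligned one, even though the payload is a single long.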
Looking at the Kconfig.cpu contents quoted below, I wonder whether it is
possible to remove the 'default "7" if NUMA' line. A tighter,
better-fitting cache alignment could then help performance.
Would anyone like to comment?
===
diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index 3c57033..6443c6f 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -303,7 +303,6 @@ config X86_GENERIC
 config X86_INTERNODE_CACHE_SHIFT
 	int
 	default "12" if X86_VSMP
-	default "7" if NUMA
 	default X86_L1_CACHE_SHIFT
 
 config X86_CMPXCHG
====
some contents in Kconfig.cpu:
config X86_INTERNODE_CACHE_SHIFT
	int
	default "12" if X86_VSMP
	default "7" if NUMA && (MPENTIUM4 || MPSC)
	default X86_L1_CACHE_SHIFT

config X86_CMPXCHG
	def_bool X86_64 || (X86_32 && !M386)

config X86_L1_CACHE_SHIFT
	int
	default "7" if MPENTIUM4 || MPSC
	default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
	default "4" if MELAN || M486 || M386 || MGEODEGX1
	default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
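
Kconfig takes the first `default` line whose condition holds, so the effect of dropping the NUMA line can be modelled with two small C functions. This is only a sketch: the function names are hypothetical, the "before" variant models the '-' line of the diff above (`default "7" if NUMA`), and `l1_shift` stands for whatever X86_L1_CACHE_SHIFT resolves to for the selected CPU.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the option before the patch: NUMA forces shift 7. */
static inline int demo_internode_shift_before(bool vsmp, bool numa,
					      int l1_shift)
{
	if (vsmp)
		return 12;
	if (numa)
		return 7;
	return l1_shift;
}

/* Model after the patch: fall through to the L1 shift unless VSMP. */
static inline int demo_internode_shift_after(bool vsmp, int l1_shift)
{
	if (vsmp)
		return 12;
	return l1_shift;
}
```

For a NUMA build with a CPU choice whose L1 shift is 6 (for example GENERIC_CPU in the list above), the model gives 7 before the change and 6 after it; VSMP builds and Pentium 4 builds (L1 shift 7) are unaffected.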
>
* Re: change last level cache alignment on x86?
2012-03-02 7:30 ` Alex Shi
@ 2012-03-02 8:12 ` Ingo Molnar
2012-03-02 14:42 ` Alex Shi
0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2012-03-02 8:12 UTC (permalink / raw)
To: Alex Shi
Cc: tglx, hpa, mingo, linux-kernel@vger.kernel.org, x86,
asit.k.mallick
* Alex Shi <alex.shi@intel.com> wrote:
> On Thu, 2012-03-01 at 16:33 +0800, Alex,Shi wrote:
> > Currently last level defined in kernel is still 128 bytes, but actually
> > I checked intel's core2, NHM, SNB, atom, serial platforms, all of them
> > are using 64 bytes.
> > I did not get detailed info on AMD platforms. Guess someone like to give
> > the info here. So, Is if it possible to do the similar following changes
> > to use 64 byte cache alignment in kernel?
> >
> > ===
> > diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
> > index 3c57033..f342a5a 100644
> > --- a/arch/x86/Kconfig.cpu
> > +++ b/arch/x86/Kconfig.cpu
> > @@ -303,7 +303,7 @@ config X86_GENERIC
> > config X86_INTERNODE_CACHE_SHIFT
> > int
> > default "12" if X86_VSMP
> > - default "7" if NUMA
> > + default "7" if NUMA && (MPENTIUM4)
> > default X86_L1_CACHE_SHIFT
> >
> > config X86_CMPXCHG
>
> In arch/x86/include/asm/cache.h, the INTERNODE_CACHE_SHIFT macro will
> transfer to '__cacheline_aligned_in_smp' finally.
>
> #ifdef CONFIG_X86_VSMP
> #ifdef CONFIG_SMP
> #define __cacheline_aligned_in_smp \
> __attribute__((__aligned__(INTERNODE_CACHE_BYTES))) \
> __page_aligned_data
> #endif
> #endif
Note the #ifdef CONFIG_X86_VSMP - so the 128 bytes does not
actually transform into __cacheline_aligned_in_smp.
> look at the following contents in Kconfig.cpu, I wondering if
> it is possible to remove 'default "7" if NUMA' line. Then a
> thin and fit cache alignment will be potential helpful on
> performance. Anyone like to give some comments?
> config X86_INTERNODE_CACHE_SHIFT
> int
> default "12" if X86_VSMP
> - default "7" if NUMA
> default X86_L1_CACHE_SHIFT
Yes, removing that line would be fine I think - I think it was
copied from the old L1 alignment of 128 bytes (which was a P4
artifact when that CPU was the dominant platform - that's not
been the case for a long time already).
Could you please also do a before/after build of an x86
defconfig with NUMA enabled and see what the alignments in the
before/after System.map are?
Thanks,
Ingo
* Re: change last level cache alignment on x86?
2012-03-02 8:12 ` Ingo Molnar
@ 2012-03-02 14:42 ` Alex Shi
2012-03-02 15:25 ` Ingo Molnar
0 siblings, 1 reply; 6+ messages in thread
From: Alex Shi @ 2012-03-02 14:42 UTC (permalink / raw)
To: Ingo Molnar
Cc: tglx, hpa, mingo, linux-kernel@vger.kernel.org, x86,
asit.k.mallick
>> #ifdef CONFIG_X86_VSMP
>> #ifdef CONFIG_SMP
>> #define __cacheline_aligned_in_smp \
>> __attribute__((__aligned__(INTERNODE_CACHE_BYTES))) \
>> __page_aligned_data
>> #endif
>> #endif
>
> Note the #ifdef CONFIG_X86_VSMP - so the 128 bytes does not
> actually transform into __cacheline_aligned_in_smp.
Oh, sorry, I used an inappropriate example here. Actually there are a
lot of places that reference this value; for example, cscope shows these
INTERNODE_CACHE_BYTES usages:
1 13 arch/x86/include/asm/cache.h <<GLOBAL>>
#define INTERNODE_CACHE_BYTES (1 << INTERNODE_CACHE_SHIFT)
2 148 arch/x86/kernel/vmlinux.lds.S <<GLOBAL>>
READ_MOSTLY_DATA(INTERNODE_CACHE_BYTES)
3 190 arch/x86/kernel/vmlinux.lds.S <<GLOBAL>>
PERCPU_VADDR(INTERNODE_CACHE_BYTES, 0, :percpu)
4 285 arch/x86/kernel/vmlinux.lds.S <<GLOBAL>>
PERCPU_SECTION(INTERNODE_CACHE_BYTES)
5 48 arch/x86/mm/tlb.c <<GLOBAL>>
char pad[INTERNODE_CACHE_BYTES];
6 18 arch/x86/include/asm/cache.h <<__cacheline_aligned_in_smp>>
__attribute__((__aligned__(INTERNODE_CACHE_BYTES))) \
There are also many references to INTERNODE_CACHE_SHIFT.
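
One of the usages listed above is a `pad` array sized by INTERNODE_CACHE_BYTES. A minimal sketch of that padding idiom follows, assuming GCC or Clang; the DEMO_ macros, the union name, and its fields are hypothetical and are not the actual definitions in arch/x86/mm/tlb.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel macros; try 7 to compare. */
#define DEMO_INTERNODE_CACHE_SHIFT 6
#define DEMO_INTERNODE_CACHE_BYTES (1 << DEMO_INTERNODE_CACHE_SHIFT)

/*
 * The pad member forces the union to fill at least one full
 * internode cache line, so adjacent array elements never share a line.
 */
union demo_flush_state {
	struct {
		void *mm;
		unsigned long va;
	} state;
	char pad[DEMO_INTERNODE_CACHE_BYTES];
} __attribute__((__aligned__(DEMO_INTERNODE_CACHE_BYTES)));
```

With the shift set to 6, each element occupies 64 bytes; with 7 it would occupy 128, which is the kind of per-object growth the proposed Kconfig change avoids.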
>
>> look at the following contents in Kconfig.cpu, I wondering if
>> it is possible to remove 'default "7" if NUMA' line. Then a
>> thin and fit cache alignment will be potential helpful on
>> performance. Anyone like to give some comments?
>
>> config X86_INTERNODE_CACHE_SHIFT
>> int
>> default "12" if X86_VSMP
>> - default "7" if NUMA
>> default X86_L1_CACHE_SHIFT
>
> Yes, removing that line would be fine I think - I think it was
> copied from the old L1 alignment of 128 bytes (which was a P4
> artifact when that CPU was the dominant platform - that's not
> been the case for a long time already).
Thanks! I will write a patch later.
>
> Could you please also do a before/after build of an x86
> defconfig with NUMA enabled and see what the alignments in the
> before/after System.map are?
So, with defconfig on x86_64, I saw many changes in System.map:
before patch                | after patch
...
000000000000b000 d tlb_vector_| 000000000000b000 d tlb_vector
000000000000b080 d cpu_loops_p| 000000000000b040 d cpu_loops_
...
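
The two System.map rows above can be read as a spacing check: the symbol following the one at 0xb000 moves from 0xb080 to 0xb040. A short sketch of that arithmetic, using hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Distance in bytes between two consecutive symbol addresses. */
static inline uint64_t demo_symbol_gap(uint64_t first, uint64_t next)
{
	return next - first;
}

/* True if addr is a multiple of align (align must be a power of two). */
static inline int demo_is_aligned(uint64_t addr, uint64_t align)
{
	return (addr & (align - 1)) == 0;
}
```

With the addresses from the listing, the gap shrinks from 0x80 (128 bytes) to 0x40 (64 bytes), and 0xb040 is 64-byte aligned but not 128-byte aligned, which is consistent with the internode alignment dropping from 128 to 64 bytes.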
>
> Thanks,
>
> Ingo
* Re: change last level cache alignment on x86?
2012-03-02 14:42 ` Alex Shi
@ 2012-03-02 15:25 ` Ingo Molnar
2012-03-03 11:30 ` Alex Shi
0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2012-03-02 15:25 UTC (permalink / raw)
To: Alex Shi
Cc: tglx, hpa, mingo, linux-kernel@vger.kernel.org, x86,
asit.k.mallick
* Alex Shi <alex.shi@intel.com> wrote:
> So, with defconfig on x86_64, I saw much changes in System.map:
> before patched after patched
> ...
> 000000000000b000 d tlb_vector_| 000000000000b000 d tlb_vector
> 000000000000b080 d cpu_loops_p| 000000000000b040 d cpu_loops_
> ...
Ok, mind sending a patch, changelogged, with a SOB?
Thanks,
Ingo
* Re: change last level cache alignment on x86?
2012-03-02 15:25 ` Ingo Molnar
@ 2012-03-03 11:30 ` Alex Shi
0 siblings, 0 replies; 6+ messages in thread
From: Alex Shi @ 2012-03-03 11:30 UTC (permalink / raw)
To: Ingo Molnar
Cc: tglx, hpa, mingo, linux-kernel@vger.kernel.org, x86,
asit.k.mallick
On 03/02/2012 11:25 PM, Ingo Molnar wrote:
> * Alex Shi <alex.shi@intel.com> wrote:
>
>> So, with defconfig on x86_64, I saw much changes in System.map:
>> before patched after patched
>> ...
>> 000000000000b000 d tlb_vector_| 000000000000b000 d tlb_vector
>> 000000000000b080 d cpu_loops_p| 000000000000b040 d cpu_loops_
>> ...
> Ok, mind sending a patch, changelogged, with a SOB?
Thanks a lot for the review! I have sent you a patch.
> Thanks,
>
> Ingo