xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86, amd: Disable way access filter on Piledriver CPUs
@ 2012-12-17 17:08 Jan Beulich
  2012-12-17 17:49 ` Mats Petersson
  2012-12-17 20:17 ` Keir Fraser
  0 siblings, 2 replies; 4+ messages in thread
From: Jan Beulich @ 2012-12-17 17:08 UTC (permalink / raw)
  To: xen-devel; +Cc: Andre Przywara

[-- Attachment #1: Type: text/plain, Size: 2348 bytes --]

The Way Access Filter in recent AMD CPUs may hurt the performance of
some workloads, caused by aliasing issues in the L1 cache.
This patch disables it on the affected CPUs.

The issue is similar to that one of last year:
http://lkml.indiana.edu/hypermail/linux/kernel/1107.3/00041.html 
This new patch does not replace the old one, we just need another
quirk for newer CPUs.

The performance penalty without the patch depends on the
circumstances, but is a bit less than the last year's 3%.

The workloads affected would be those that access code from the same
physical page under different virtual addresses, so different
processes using the same libraries with ASLR or multiple instances of
PIE-binaries. The code needs to be accessed simultaneously from both
cores of the same compute unit.

More details can be found here:
http://developer.amd.com/Assets/SharedL1InstructionCacheonAMD15hCPU.pdf 

CPUs affected are anything with the core known as Piledriver.
That includes the new parts of the AMD A-Series (aka Trinity) and the
just released new CPUs of the FX-Series (aka Vishera).
The model numbering is a bit odd here: FX CPUs have model 2,
A-Series has model 10h, with possible extensions to 1Fh. Hence the
range of model ids.

Signed-off-by: Andre Przywara <osp@andrep.de>

Add and use MSR_AMD64_IC_CFG. Update the value whenever it is found to
not have all bits set, rather than just when it's zero.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -448,6 +448,14 @@ static void __devinit init_amd(struct cp
 		}
 	}
 
+	/*
+	 * The way access filter has a performance penalty on some workloads.
+	 * Disable it on the affected CPUs.
+	 */
+	if (c->x86 == 0x15 && c->x86_model >= 0x02 && c->x86_model < 0x20 &&
+	    !rdmsr_safe(MSR_AMD64_IC_CFG, value) && (value & 0x1e) != 0x1e)
+		wrmsr_safe(MSR_AMD64_IC_CFG, value | 0x1e);
+
         amd_get_topology(c);
 
 	/* Pointless to use MWAIT on Family10 as it does not deep sleep. */
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -211,6 +211,7 @@
 
 /* AMD64 MSRs */
 #define MSR_AMD64_NB_CFG		0xc001001f
+#define MSR_AMD64_IC_CFG		0xc0011021
 #define MSR_AMD64_DC_CFG		0xc0011022
 #define AMD64_NB_CFG_CF8_EXT_ENABLE_BIT	46
 




[-- Attachment #2: x86-AMD-Fam15-way-access-filter.patch --]
[-- Type: text/plain, Size: 2398 bytes --]

x86, amd: Disable way access filter on Piledriver CPUs

The Way Access Filter in recent AMD CPUs may hurt the performance of
some workloads, caused by aliasing issues in the L1 cache.
This patch disables it on the affected CPUs.

The issue is similar to that one of last year:
http://lkml.indiana.edu/hypermail/linux/kernel/1107.3/00041.html
This new patch does not replace the old one, we just need another
quirk for newer CPUs.

The performance penalty without the patch depends on the
circumstances, but is a bit less than the last year's 3%.

The workloads affected would be those that access code from the same
physical page under different virtual addresses, so different
processes using the same libraries with ASLR or multiple instances of
PIE-binaries. The code needs to be accessed simultaneously from both
cores of the same compute unit.

More details can be found here:
http://developer.amd.com/Assets/SharedL1InstructionCacheonAMD15hCPU.pdf

CPUs affected are anything with the core known as Piledriver.
That includes the new parts of the AMD A-Series (aka Trinity) and the
just released new CPUs of the FX-Series (aka Vishera).
The model numbering is a bit odd here: FX CPUs have model 2,
A-Series has model 10h, with possible extensions to 1Fh. Hence the
range of model ids.

Signed-off-by: Andre Przywara <osp@andrep.de>

Add and use MSR_AMD64_IC_CFG. Update the value whenever it is found to
not have all bits set, rather than just when it's zero.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/cpu/amd.c
+++ b/xen/arch/x86/cpu/amd.c
@@ -448,6 +448,14 @@ static void __devinit init_amd(struct cp
 		}
 	}
 
+	/*
+	 * The way access filter has a performance penalty on some workloads.
+	 * Disable it on the affected CPUs.
+	 */
+	if (c->x86 == 0x15 && c->x86_model >= 0x02 && c->x86_model < 0x20 &&
+	    !rdmsr_safe(MSR_AMD64_IC_CFG, value) && (value & 0x1e) != 0x1e)
+		wrmsr_safe(MSR_AMD64_IC_CFG, value | 0x1e);
+
         amd_get_topology(c);
 
 	/* Pointless to use MWAIT on Family10 as it does not deep sleep. */
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -211,6 +211,7 @@
 
 /* AMD64 MSRs */
 #define MSR_AMD64_NB_CFG		0xc001001f
+#define MSR_AMD64_IC_CFG		0xc0011021
 #define MSR_AMD64_DC_CFG		0xc0011022
 #define AMD64_NB_CFG_CF8_EXT_ENABLE_BIT	46
 

[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86, amd: Disable way access filter on Piledriver CPUs
  2012-12-17 17:08 [PATCH] x86, amd: Disable way access filter on Piledriver CPUs Jan Beulich
@ 2012-12-17 17:49 ` Mats Petersson
  2012-12-18  9:05   ` Jan Beulich
  2012-12-17 20:17 ` Keir Fraser
  1 sibling, 1 reply; 4+ messages in thread
From: Mats Petersson @ 2012-12-17 17:49 UTC (permalink / raw)
  To: xen-devel

On 17/12/12 17:08, Jan Beulich wrote:
> The Way Access Filter in recent AMD CPUs may hurt the performance of
> some workloads, caused by aliasing issues in the L1 cache.
> This patch disables it on the affected CPUs.
>
> The issue is similar to that one of last year:
> http://lkml.indiana.edu/hypermail/linux/kernel/1107.3/00041.html
> This new patch does not replace the old one, we just need another
> quirk for newer CPUs.
>
> The performance penalty without the patch depends on the
> circumstances, but is a bit less than the last year's 3%.
>
> The workloads affected would be those that access code from the same
> physical page under different virtual addresses, so different
> processes using the same libraries with ASLR or multiple instances of
> PIE-binaries. The code needs to be accessed simultaneously from both
> cores of the same compute unit.
>
> More details can be found here:
> http://developer.amd.com/Assets/SharedL1InstructionCacheonAMD15hCPU.pdf
>
> CPUs affected are anything with the core known as Piledriver.
> That includes the new parts of the AMD A-Series (aka Trinity) and the
> just released new CPUs of the FX-Series (aka Vishera).
> The model numbering is a bit odd here: FX CPUs have model 2,
> A-Series has model 10h, with possible extensions to 1Fh. Hence the
> range of model ids.
>
> Signed-off-by: Andre Przywara <osp@andrep.de>
>
> Add and use MSR_AMD64_IC_CFG. Update the value whenever it is found to
> not have all bits set, rather than just when it's zero.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -448,6 +448,14 @@ static void __devinit init_amd(struct cp
>   		}
>   	}
>   
> +	/*
> +	 * The way access filter has a performance penalty on some workloads.
> +	 * Disable it on the affected CPUs.
> +	 */
> +	if (c->x86 == 0x15 && c->x86_model >= 0x02 && c->x86_model < 0x20 &&
> +	    !rdmsr_safe(MSR_AMD64_IC_CFG, value) && (value & 0x1e) != 0x1e)
> +		wrmsr_safe(MSR_AMD64_IC_CFG, value | 0x1e);
Would it not be better to simply write 0x1e, rather than only write it 
when it's not 0x1e? It's a one-off operation, but the extra readmsr 
seems unnecessary to me.

--
Mats
> +
>           amd_get_topology(c);
>   
>   	/* Pointless to use MWAIT on Family10 as it does not deep sleep. */
> --- a/xen/include/asm-x86/msr-index.h
> +++ b/xen/include/asm-x86/msr-index.h
> @@ -211,6 +211,7 @@
>   
>   /* AMD64 MSRs */
>   #define MSR_AMD64_NB_CFG		0xc001001f
> +#define MSR_AMD64_IC_CFG		0xc0011021
>   #define MSR_AMD64_DC_CFG		0xc0011022
>   #define AMD64_NB_CFG_CF8_EXT_ENABLE_BIT	46
>   
>
>
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86, amd: Disable way access filter on Piledriver CPUs
  2012-12-17 17:08 [PATCH] x86, amd: Disable way access filter on Piledriver CPUs Jan Beulich
  2012-12-17 17:49 ` Mats Petersson
@ 2012-12-17 20:17 ` Keir Fraser
  1 sibling, 0 replies; 4+ messages in thread
From: Keir Fraser @ 2012-12-17 20:17 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Mats Petersson, Andre Przywara

On 17/12/2012 17:08, "Jan Beulich" <JBeulich@suse.com> wrote:

> The Way Access Filter in recent AMD CPUs may hurt the performance of
> some workloads, caused by aliasing issues in the L1 cache.
> This patch disables it on the affected CPUs.
> 
> The issue is similar to that one of last year:
> http://lkml.indiana.edu/hypermail/linux/kernel/1107.3/00041.html
> This new patch does not replace the old one, we just need another
> quirk for newer CPUs.
> 
> The performance penalty without the patch depends on the
> circumstances, but is a bit less than the last year's 3%.
> 
> The workloads affected would be those that access code from the same
> physical page under different virtual addresses, so different
> processes using the same libraries with ASLR or multiple instances of
> PIE-binaries. The code needs to be accessed simultaneously from both
> cores of the same compute unit.
> 
> More details can be found here:
> http://developer.amd.com/Assets/SharedL1InstructionCacheonAMD15hCPU.pdf
> 
> CPUs affected are anything with the core known as Piledriver.
> That includes the new parts of the AMD A-Series (aka Trinity) and the
> just released new CPUs of the FX-Series (aka Vishera).
> The model numbering is a bit odd here: FX CPUs have model 2,
> A-Series has model 10h, with possible extensions to 1Fh. Hence the
> range of model ids.
> 
> Signed-off-by: Andre Przywara <osp@andrep.de>
> 
> Add and use MSR_AMD64_IC_CFG. Update the value whenever it is found to
> not have all bits set, rather than just when it's zero.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Keir Fraser <keir@xen.org>

I don't have a strong opinion on Mats' comment.

 -- Keir

> --- a/xen/arch/x86/cpu/amd.c
> +++ b/xen/arch/x86/cpu/amd.c
> @@ -448,6 +448,14 @@ static void __devinit init_amd(struct cp
> }
> }
>  
> + /*
> +  * The way access filter has a performance penalty on some workloads.
> +  * Disable it on the affected CPUs.
> +  */
> + if (c->x86 == 0x15 && c->x86_model >= 0x02 && c->x86_model < 0x20 &&
> +     !rdmsr_safe(MSR_AMD64_IC_CFG, value) && (value & 0x1e) != 0x1e)
> +  wrmsr_safe(MSR_AMD64_IC_CFG, value | 0x1e);
> +
>          amd_get_topology(c);
>  
> /* Pointless to use MWAIT on Family10 as it does not deep sleep. */
> --- a/xen/include/asm-x86/msr-index.h
> +++ b/xen/include/asm-x86/msr-index.h
> @@ -211,6 +211,7 @@
>  
>  /* AMD64 MSRs */
>  #define MSR_AMD64_NB_CFG  0xc001001f
> +#define MSR_AMD64_IC_CFG  0xc0011021
>  #define MSR_AMD64_DC_CFG  0xc0011022
>  #define AMD64_NB_CFG_CF8_EXT_ENABLE_BIT 46
>  
> 
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] x86, amd: Disable way access filter on Piledriver CPUs
  2012-12-17 17:49 ` Mats Petersson
@ 2012-12-18  9:05   ` Jan Beulich
  0 siblings, 0 replies; 4+ messages in thread
From: Jan Beulich @ 2012-12-18  9:05 UTC (permalink / raw)
  To: Mats Petersson; +Cc: xen-devel

>>> On 17.12.12 at 18:49, Mats Petersson <mats.petersson@citrix.com> wrote:
> On 17/12/12 17:08, Jan Beulich wrote:
>> The Way Access Filter in recent AMD CPUs may hurt the performance of
>> some workloads, caused by aliasing issues in the L1 cache.
>> This patch disables it on the affected CPUs.
>>
>> The issue is similar to that one of last year:
>> http://lkml.indiana.edu/hypermail/linux/kernel/1107.3/00041.html 
>> This new patch does not replace the old one, we just need another
>> quirk for newer CPUs.
>>
>> The performance penalty without the patch depends on the
>> circumstances, but is a bit less than the last year's 3%.
>>
>> The workloads affected would be those that access code from the same
>> physical page under different virtual addresses, so different
>> processes using the same libraries with ASLR or multiple instances of
>> PIE-binaries. The code needs to be accessed simultaneously from both
>> cores of the same compute unit.
>>
>> More details can be found here:
>> http://developer.amd.com/Assets/SharedL1InstructionCacheonAMD15hCPU.pdf 
>>
>> CPUs affected are anything with the core known as Piledriver.
>> That includes the new parts of the AMD A-Series (aka Trinity) and the
>> just released new CPUs of the FX-Series (aka Vishera).
>> The model numbering is a bit odd here: FX CPUs have model 2,
>> A-Series has model 10h, with possible extensions to 1Fh. Hence the
>> range of model ids.
>>
>> Signed-off-by: Andre Przywara <osp@andrep.de>
>>
>> Add and use MSR_AMD64_IC_CFG. Update the value whenever it is found to
>> not have all bits set, rather than just when it's zero.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>
>> --- a/xen/arch/x86/cpu/amd.c
>> +++ b/xen/arch/x86/cpu/amd.c
>> @@ -448,6 +448,14 @@ static void __devinit init_amd(struct cp
>>   		}
>>   	}
>>   
>> +	/*
>> +	 * The way access filter has a performance penalty on some workloads.
>> +	 * Disable it on the affected CPUs.
>> +	 */
>> +	if (c->x86 == 0x15 && c->x86_model >= 0x02 && c->x86_model < 0x20 &&
>> +	    !rdmsr_safe(MSR_AMD64_IC_CFG, value) && (value & 0x1e) != 0x1e)
>> +		wrmsr_safe(MSR_AMD64_IC_CFG, value | 0x1e);
> Would it not be better to simply write 0x1e, rather than only write it 
> when it's not 0x1e? It's a one-off operation, but the extra readmsr 
> seems unnecessary to me.

Possibly, but I guess you checked the original Linux commit that
this was derived from, and saw that they do it yet a little more
oddly (writing the disabling value only when the original value
had all 4 bits clear. Fact is that the way it is now, we avoid an
MSR write to a register we can't even read (and I think that is
desirable in any case; the checking of the value is pretty benign
then).

Jan

>> +
>>           amd_get_topology(c);
>>   
>>   	/* Pointless to use MWAIT on Family10 as it does not deep sleep. */
>> --- a/xen/include/asm-x86/msr-index.h
>> +++ b/xen/include/asm-x86/msr-index.h
>> @@ -211,6 +211,7 @@
>>   
>>   /* AMD64 MSRs */
>>   #define MSR_AMD64_NB_CFG		0xc001001f
>> +#define MSR_AMD64_IC_CFG		0xc0011021
>>   #define MSR_AMD64_DC_CFG		0xc0011022
>>   #define AMD64_NB_CFG_CF8_EXT_ENABLE_BIT	46
>>   
>>
>>
>>
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org 
> http://lists.xen.org/xen-devel 

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2012-12-18  9:05 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-12-17 17:08 [PATCH] x86, amd: Disable way access filter on Piledriver CPUs Jan Beulich
2012-12-17 17:49 ` Mats Petersson
2012-12-18  9:05   ` Jan Beulich
2012-12-17 20:17 ` Keir Fraser

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).