public inbox for linux-acpi@vger.kernel.org
* Wishlist: Disable C6 in intel_idle for Model 44 processors
@ 2013-06-11 21:30 Larry Baker
  2013-06-11 21:34 ` Larry Baker
  2013-06-12 13:40 ` Matthew Garrett
  0 siblings, 2 replies; 10+ messages in thread
From: Larry Baker @ 2013-06-11 21:30 UTC
  To: linux-acpi; +Cc: Rob E Russell

I have an IBM System x3650 M3 with an Intel Xeon L5630 processor.  My IBM support team alerted me to an issue with C6 states for those processors when running Linux and the intel_idle kernel module is used.  The IBM solutions page, http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5091901, recommends disabling intel_idle.  I propose intel_idle disable C6 for the affected processors.

Description:

Intel Xeon 5600 and Core i7-900 processors (Family 6 Model 44) have a flaw when the C6 state is used.  See Intel® Xeon® Processor 5600 Series Specification Update May 2012 (http://www.intel.eu/content/dam/www/public/us/en/documents/specification-updates/xeon-5600-specification-update.pdf) and Intel® Core™ i7-900 Desktop Processor Extreme Edition Series and Intel® Core™ i7-900 Desktop Processor Series on 32-nm Process Specification Update June 2013 (http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/core-i7-900-ee-and-desktop-processor-series-32nm-spec-update.pdf):

> Package C6 Transitions May Cause Memory Bit Errors to be Observed
> 
> Problem:
> During Package C6 transitions, internal signaling noise may cause the DDRx_CKE signal to become asserted during self-refresh. These assertions may result in memory bit errors upon exiting from the package C6 state. Due to this erratum the DDRx_CKE signals can be driven during times in which the DDR3 JEDEC specification requires that they are idle.
> 
> Implication:
> DDRx_CKE signals can be driven during package C6 memory self-refresh creating an invalid memory DRAM state.  A system hang, memory ECC errors or unpredictable system behavior may occur when exiting the package C6 state.
> 
> Workaround: 
> It is possible for the BIOS to contain a workaround for this erratum.
> 
> Status:
> For the steppings affected, see the Summary Table of Changes.

The intel_idle kernel module uses the standard Nehalem C states table for these processors (boot_cpu_data.x86_model==0x2c):

> static const struct x86_cpu_id intel_idle_ids[] = {
> <snip>
> 	ICPU(0x2c, idle_cpu_xeon5600),
> <snip>
> 	{}
> };
> MODULE_DEVICE_TABLE(x86cpu, intel_idle_ids);
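
For reference, "Family 6 Model 44" and x86_model == 0x2c are the same thing seen from two sides: the model number is assembled from two fields of the CPUID leaf 1 EAX signature. The following is a minimal sketch of that decoding, not kernel code. The function names are hypothetical, and the signature 0x000206c2 used in the note below is only an assumed example value for a Model 44 part.

```c
#include <stdint.h>

/*
 * Display family per the CPUID convention: the extended family field
 * (bits 27:20) is added only when the base family (bits 11:8) is 0xF.
 */
static unsigned int cpuid_display_family(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;

	if (family == 0xf)
		family += (eax >> 20) & 0xff;
	return family;
}

/*
 * Display model: the extended model field (bits 19:16) supplies the
 * high nibble when the base family is 6 or 0xF.
 */
static unsigned int cpuid_display_model(uint32_t eax)
{
	unsigned int family = (eax >> 8) & 0xf;
	unsigned int model = (eax >> 4) & 0xf;

	if (family == 0x6 || family == 0xf)
		model |= ((eax >> 16) & 0xf) << 4;
	return model;
}
```

With the assumed signature 0x000206c2, cpuid_display_family() gives 6 and cpuid_display_model() gives 0x2c (decimal 44), which is the value the ICPU() table entry matches on.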

I propose two alternatives for intel_idle.c to avoid the processor flaw.  (I didn't try to compile them -- I only modified intel_idle.c.)  The first limits max_cstate=3, and prints a console message if the limit was forced.  The second creates a Xeon 5600-specific C states table which leaves off the C6 state.  No message is printed in that case.  I like the idea of a message, and I like the idea of a proper C states table.  I did not look to see if acpi_idle can also benefit from a similar modification.

Thank you,

Larry Baker
US Geological Survey
650-329-5608
baker@usgs.gov

===== Version 1 Patch - Limit max_cstate=3 for Family 6 Model 44 Processors =====

--- intel_idle.c.orig	2013-06-11 11:41:49.000000000 -0700
+++ intel_idle-fix-v1.c	2013-06-11 13:47:15.000000000 -0700
@@ -534,2 +534,13 @@
 
+/*
+ * Disable C6-state for Xeon 5600 and Core i7-900.  Refer to Xeon 5600 errata
+ * BD104 and Core i7-900 errata BC82: Package C6 Transitions May Result in
+ * Single and Multi-Bit Memory Errors.
+ */
+	if (boot_cpu_data.x86_model == 0x2c &&
+	    max_cstate > 3) {
+		pr_debug(PREFIX "limiting model 0x2c to max_cstate=3\n");
+		max_cstate = 3;
+	}
+
 	pr_debug(PREFIX "lapic_timer_reliable_states 0x%x\n",

===== Version 2 Patch - Disable C6 for Family 6 Model 44 Processors =====

--- intel_idle.c.orig	2013-06-11 11:41:49.000000000 -0700
+++ intel_idle-fix-v2.c	2013-06-11 13:51:32.000000000 -0700
@@ -158,2 +158,33 @@
 
+/*
+ * Disable C6-state for Xeon 5600 and Core i7-900.  Refer to Xeon 5600 errata
+ * BD104 and Core i7-900 errata BC82: Package C6 Transitions May Result in
+ * Single and Multi-Bit Memory Errors.
+ */
+static struct cpuidle_state xeon5600_cstates[CPUIDLE_STATE_MAX] = {
+	{
+		.name = "C1-NHM",
+		.desc = "MWAIT 0x00",
+		.flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_TIME_VALID,
+		.exit_latency = 3,
+		.target_residency = 6,
+		.enter = &intel_idle },
+	{
+		.name = "C1E-NHM",
+		.desc = "MWAIT 0x01",
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_TIME_VALID,
+		.exit_latency = 10,
+		.target_residency = 20,
+		.enter = &intel_idle },
+	{
+		.name = "C3-NHM",
+		.desc = "MWAIT 0x10",
+		.flags = MWAIT2flg(0x10) | CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TLB_FLUSHED,
+		.exit_latency = 20,
+		.target_residency = 80,
+		.enter = &intel_idle },
+	{
+		.enter = NULL }
+};
+
 static struct cpuidle_state snb_cstates[CPUIDLE_STATE_MAX] = {
@@ -440,2 +471,8 @@
 
+static const struct idle_cpu idle_cpu_xeon5600 = {
+	.state_table = xeon5600_cstates,
+	.auto_demotion_disable_flags = NHM_C1_AUTO_DEMOTE | NHM_C3_AUTO_DEMOTE,
+	.disable_promotion_to_c1e = true,
+};
+
 static const struct idle_cpu idle_cpu_atom = {
@@ -472,3 +509,3 @@
 	ICPU(0x25, idle_cpu_nehalem),
-	ICPU(0x2c, idle_cpu_nehalem),
+	ICPU(0x2c, idle_cpu_xeon5600),
 	ICPU(0x2e, idle_cpu_nehalem),

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Wishlist: Disable C6 in intel_idle for Model 44 processors
  2013-06-11 21:30 Wishlist: Disable C6 in intel_idle for Model 44 processors Larry Baker
@ 2013-06-11 21:34 ` Larry Baker
  2013-06-12 13:40 ` Matthew Garrett
  1 sibling, 0 replies; 10+ messages in thread
From: Larry Baker @ 2013-06-11 21:34 UTC
  To: linux-acpi; +Cc: Rob E Russell

Sorry, I snipped the C states table from my modified intel_idle.c, not the original.  I should have shown

> static const struct x86_cpu_id intel_idle_ids[] = { 
> <snip>
> 	ICPU(0x2c, idle_cpu_nehalem),
> <snip>
> 	{}
> };
> MODULE_DEVICE_TABLE(x86cpu, intel_idle_ids);

Larry Baker
US Geological Survey
650-329-5608
baker@usgs.gov





* Re: Wishlist: Disable C6 in intel_idle for Model 44 processors
  2013-06-11 21:30 Wishlist: Disable C6 in intel_idle for Model 44 processors Larry Baker
  2013-06-11 21:34 ` Larry Baker
@ 2013-06-12 13:40 ` Matthew Garrett
  2013-06-12 18:11   ` Larry Baker
  2013-06-16  3:29   ` Henrique de Moraes Holschuh
  1 sibling, 2 replies; 10+ messages in thread
From: Matthew Garrett @ 2013-06-12 13:40 UTC
  To: Larry Baker; +Cc: linux-acpi, Rob E Russell

On Tue, Jun 11, 2013 at 02:30:30PM -0700, Larry Baker wrote:
> I have an IBM System x3650 M3 with an Intel Xeon L5630 processor.  My IBM support team alerted me to an issue with C6 states for those processors when running Linux and the intel_idle kernel module is used.  The IBM solutions page, http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5091901, recommends disabling intel_idle.  I propose intel_idle disable C6 for the affected processors.

As you note:

> > Workaround: 
> > It is possible for the BIOS to contain a workaround for this erratum.

Are you actually seeing this specific problem? If so, you should 
probably request a firmware update from IBM. If that's not a 
possibility, and if it is necessary to disable package C6 in this 
situation, it'd be nice to determine what the BIOS workaround actually 
does and limit the quirk to that case.

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: Wishlist: Disable C6 in intel_idle for Model 44 processors
  2013-06-12 13:40 ` Matthew Garrett
@ 2013-06-12 18:11   ` Larry Baker
  2013-06-12 18:32     ` Matthew Garrett
  2013-06-16  3:29   ` Henrique de Moraes Holschuh
  1 sibling, 1 reply; 10+ messages in thread
From: Larry Baker @ 2013-06-12 18:11 UTC
  To: Matthew Garrett; +Cc: linux-acpi, Rob E Russell

On 12 Jun 2013, at 6:40 AM, Matthew Garrett wrote:

> On Tue, Jun 11, 2013 at 02:30:30PM -0700, Larry Baker wrote:
>> I have an IBM System x3650 M3 with an Intel Xeon L5630 processor.  My IBM support team alerted me to an issue with C6 states for those processors when running Linux and the intel_idle kernel module is used.  The IBM solutions page, http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5091901, recommends disabling intel_idle.  I propose intel_idle disable C6 for the affected processors.
> 
> As you note:
> 
>>> Workaround: 
>>> It is possible for the BIOS to contain a workaround for this erratum.

Not my comment -- this comes from the Intel processor errata.  It is not really very clear from this exactly what the BIOS is supposed to do.  The IBM BIOS, for example, disables all C states.  However, intel_idle ignores the BIOS settings.

I did some Google'ing and found HP and Dell documents describing how to configure Linux to reduce latency, which, of course, requires that C states be disabled.

http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01804533/c01804533.pdf
http://en.community.dell.com/techcenter/extras/m/white_papers/20227764/download.aspx

The Dell document explains that the intel_idle driver overrides what the BIOS says and does its own thing.  It says the acpi_idle driver gets the ACPI information from the BIOS and does what it says to do:

> And, to reiterate -- there are a number of kernel parameters and BIOS settings that deal with C-states, but most of these will be ignored if intel_idle is in use. Disabling intel_idle with kernel parameter "intel_idle.max_cstate=0" will result in more intuitive control of C-states, and there should not be any disadvantage to disabling it on systems that provide correct C-state information to the operating system via ACPI.

Maybe disabling intel_idle is the best solution?  I don't know.  But, if that is so, why does Linux have it at all?  I suspect it is because intel_idle can use C states that are Intel processor-specific, while acpi_idle probably sticks to ACPI "standard" C states.

> Are you actually seeing this specific problem?

On my systems, not that I know of.  The processor errata only says this may happen.  I only have a few, and I suppose I have been lucky.  I am sure that Intel would have caught this prior to shipping the processor if it was a solid failure.  My IBM support contact has told me that it has caused grief at his accounts.

> If so, you should 
> probably request a firmware update from IBM. If that's not a 
> possibility, and if it is necessary to disable package C6 in this 
> situation, it'd be nice to determine what the BIOS workaround actually 
> does and limit the quirk to that case.

The processor errata is operating system agnostic.  I suspect the BIOS workaround they envision would be a feature in the BIOS to limit the choice of C states to be used.  The IBM BIOS has such a setting, and the default value is to completely disable C states.  However, intel_idle ignores the BIOS setting.

I think the intel_idle code came from Intel engineers.  I assume they have good reason to prefer it to acpi_idle.  I hope they read this list and will decide if my proposals have merit.

Thanks.

> -- 
> Matthew Garrett | mjg59@srcf.ucam.org

Larry Baker
US Geological Survey
650-329-5608
baker@usgs.gov


* Re: Wishlist: Disable C6 in intel_idle for Model 44 processors
  2013-06-12 18:11   ` Larry Baker
@ 2013-06-12 18:32     ` Matthew Garrett
  2013-06-14 19:32       ` Len Brown
  0 siblings, 1 reply; 10+ messages in thread
From: Matthew Garrett @ 2013-06-12 18:32 UTC
  To: Larry Baker; +Cc: linux-acpi, Rob E Russell

On Wed, Jun 12, 2013 at 11:11:21AM -0700, Larry Baker wrote:

> The processor errata is operating system agnostic.  I suspect the BIOS 
> workaround they envision would be a feature in the BIOS to limit the 
> choice of C states to be used.  The IBM BIOS has such a setting, and 
> the default value is to completely disable C states.  However, 
> intel_idle ignores the BIOS setting.

No, the BIOS workaround will typically be to program the memory 
controller such that the problem isn't triggered. If you're having 
latency-related issues then the appropriate fix is to use the pm_qos 
interface that's present in RHEL 6 and the upstream kernel, and you can 
do so in RHEL using the ktune command. 

-- 
Matthew Garrett | mjg59@srcf.ucam.org


* Re: Wishlist: Disable C6 in intel_idle for Model 44 processors
  2013-06-12 18:32     ` Matthew Garrett
@ 2013-06-14 19:32       ` Len Brown
  2013-06-14 20:23         ` Larry Baker
  2013-06-20 19:22         ` Larry Baker
  0 siblings, 2 replies; 10+ messages in thread
From: Len Brown @ 2013-06-14 19:32 UTC
  To: Matthew Garrett; +Cc: Larry Baker, linux acpi, Rob E Russell

Hi Larry,

Thanks for the note.

I use two Westmere systems:
An Extreme Edition X980 on an Intel DX58SO motherboard,
and a pair of Xeon X5680's on an Intel S5520SC motherboard.

Both processors are model 0x2c, and thus subject to this erratum.

Both systems are running the latest BIOS and firmware from Intel.
Both systems enable and use CC6 and PC6, by default.
This is true whether they are running ACPI idle
(such as Windows would do, or acpi_idle in Linux)
or Linux's intel_idle driver.

This suggests that the fix is not to disable PC6 on model 0x2c.
I would expect, as Matthew does, that the "BIOS workaround"
is likely something to do with how the BIOS initialization code sets
up the memory controller...  But in the event that the real fix
is to disable PC6 and Intel itself has not updated its own BIOS
to comply with its own errata, I'll contact the hardware designers
to see if I can get a more fact-based response.

So I concur with Matthew.
If you are concerned about configuration of your chip-set,
then you want to run the latest BIOS from the vendor.
A Linux workaround doesn't currently look warranted.

thanks,
-Len Brown, Intel Open Source Technology Center

ps.

Yes, we have an issue that intel_idle doesn't respect when
the BIOS "disables" C-states via ACPI tables.  Indeed,
part of the value proposition of intel_idle is that it is immune
to ACPI table bugs that crop up from system to system.
Also, intel_idle is not subject to some of the limitations of ACPI.
We believe this is one of the reasons that Linux on Intel
is better than some other operating systems on Intel.

The OEMs such as Dell, HP and IBM are accustomed to having
control in the BIOS and so they are unhappy about losing
that capability.  We do hear them, but unfortunately it will
likely be the Haswell Server generation before we can give their
BIOS programmers that absolute control back by
empowering them to modify CPUID.MWAIT.EDX --
which is how the HW enumerates C-states.
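
To make the enumeration concrete: CPUID leaf 5 (MONITOR/MWAIT) returns in EDX one 4-bit field per MWAIT C-state, giving the number of sub-states the hardware supports for it. The sketch below shows how a driver could test whether a given MWAIT hint is enumerated. It is an illustration, not a copy of intel_idle, and the function name is hypothetical. The hint values 0x00 and 0x10 are the ones used in the C states table in the patch earlier in this thread.

```c
#include <stdint.h>

#define MWAIT_SUBSTATE_MASK	0xf

/*
 * Return the number of sub-states CPUID.05H:EDX reports for the C-state
 * selected by an MWAIT hint.  Bits 7:4 of the hint index the C-state;
 * EDX bits 3:0 describe C0, so the field for hint index n starts at
 * bit (n + 1) * 4.  A result of 0 means the state is not enumerated.
 */
static unsigned int mwait_num_substates(uint32_t edx, unsigned int hint)
{
	unsigned int shift = (((hint >> 4) & 0xf) + 1) * 4;

	if (shift >= 32)	/* beyond the fields EDX can hold */
		return 0;
	return (edx >> shift) & MWAIT_SUBSTATE_MASK;
}
```

For an assumed EDX value of 0x00001120, hint 0x00 reports 2 sub-states, hints 0x10 and 0x20 report 1 each, and hint 0x30 reports 0. A BIOS that could rewrite EDX would therefore be able to hide a C-state from a driver that performs this kind of check.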

This issue comes up mostly when latency sensitive
customers want to disable the high latency C-states.
In the past, the OEM could configure their BIOS to
handle that situation.  But with modern Linux,
a cmdline param such as intel_idle.max_cstate=N
is necessary.  OEMs don't like Linux cmdline params,
they prefer BIOS control.

As Matthew pointed out, the Linux community believes
that the answer for latency-sensitive customers is
to use Linux PM-QOS to tell the machine how
the customer wants it to run.  From a Linux point
of view, this is a universal solution, it requires
no BIOS SETUP tweaks and no kernel cmdline parms.
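
One way a latency-sensitive application can use PM QoS from user space is the /dev/cpu_dma_latency device: the process writes its maximum tolerable wakeup latency in microseconds as a 32-bit integer and keeps the file descriptor open for as long as the constraint should apply. The following is a minimal sketch under that assumption. The function name is hypothetical, the path is passed as a parameter only to keep the helper easy to exercise, and opening the device normally requires root.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Ask the kernel to keep CPU wakeup latency at or below max_latency_us.
 * Returns an open file descriptor on success, or -1 on failure.  The
 * request stays in force only while the descriptor remains open, so the
 * caller must hold on to it and close() it to drop the constraint.
 */
static int pm_qos_request_latency(const char *path, int32_t max_latency_us)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	if (write(fd, &max_latency_us, sizeof(max_latency_us)) !=
	    (ssize_t)sizeof(max_latency_us)) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would invoke pm_qos_request_latency("/dev/cpu_dma_latency", 20) at startup, with 20 as an arbitrary example bound, and close the descriptor on exit. The idle governor then avoids C-states whose exit_latency exceeds the requested bound, with no BIOS or boot parameter change.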

BTW. If the workaround for the errata were actually
to disable C6, it would be (Package) PC6, not (Core) CC6.
The BIOS already has control over Package C-states,
and if the BIOS doesn't lock the MSR, Linux also
has that capability.

Get the latest turbostat from the kernel tree
and run turbostat -v
and look for a line like this:

cpu0: MSR_NHM_SNB_PKG_CST_CFG_CTL: 0x06008403 (demote-C3, demote-C1,
locked: pkg-cstate-limit=3: pc6)

cpu0: MSR_NHM_SNB_PKG_CST_CFG_CTL: 0x06000403 (demote-C3, demote-C1,
UNlocked: pkg-cstate-limit=3: pc6)

As described in the Intel Software Developer's Manual,
this MSR, MSR_PKG_CST_CONFIG_CONTROL has a package C-state limit field.
Above it limits the hardware to PC6, but could easily be set to PC3.

In one of the examples above, the register was locked by the BIOS,
preventing Linux from modifying it, in the 2nd example, it is unlocked.

If we limited the package to PC3 here, then Linux would still choose CC6,
but when all the cores entered CC6, the deepest the package would
go would be PC3.
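
The two turbostat lines above can be decoded by hand. The sketch below assumes the bit layout implied by that output: the package C-state limit in bits 2:0, the lock bit at bit 15, and the C3 and C1 auto-demotion enables at bits 25 and 26. The struct and function names are hypothetical, and the code only manipulates a value in memory; it does not read or write the MSR.

```c
#include <stdbool.h>
#include <stdint.h>

#define PKG_CST_LIMIT_MASK	0x7ULL		/* bits 2:0 */
#define PKG_CST_CFG_LOCK	(1ULL << 15)
#define PKG_CST_C3_DEMOTE	(1ULL << 25)
#define PKG_CST_C1_DEMOTE	(1ULL << 26)

struct pkg_cst_cfg {
	unsigned int limit;	/* raw package C-state limit field */
	bool locked;		/* BIOS set the lock bit */
	bool demote_c3;
	bool demote_c1;
};

/* Split a raw MSR value into the fields turbostat prints. */
static struct pkg_cst_cfg decode_pkg_cst_cfg(uint64_t msr)
{
	struct pkg_cst_cfg cfg = {
		.limit = (unsigned int)(msr & PKG_CST_LIMIT_MASK),
		.locked = (msr & PKG_CST_CFG_LOCK) != 0,
		.demote_c3 = (msr & PKG_CST_C3_DEMOTE) != 0,
		.demote_c1 = (msr & PKG_CST_C1_DEMOTE) != 0,
	};

	return cfg;
}

/*
 * Compute the value to write for a new package limit.  Returns false and
 * leaves *msr untouched when the register is locked.
 */
static bool set_pkg_cstate_limit(uint64_t *msr, unsigned int limit)
{
	if (*msr & PKG_CST_CFG_LOCK)
		return false;
	*msr = (*msr & ~PKG_CST_LIMIT_MASK) | (limit & PKG_CST_LIMIT_MASK);
	return true;
}
```

Decoding 0x06008403 gives limit 3, locked, with both demotion bits set; 0x06000403 differs only in being unlocked. The thread establishes only that a limit field of 3 reads as pc6 on these parts, so the encoding that selects PC3 should be taken from the Software Developer's Manual rather than guessed.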


* Re: Wishlist: Disable C6 in intel_idle for Model 44 processors
  2013-06-14 19:32       ` Len Brown
@ 2013-06-14 20:23         ` Larry Baker
       [not found]           ` <OF3D706B0A.D5540D6C-ON85257B8A.0076278F-85257B8A.0076E4B1@us.ibm.com>
  2013-06-20 19:22         ` Larry Baker
  1 sibling, 1 reply; 10+ messages in thread
From: Larry Baker @ 2013-06-14 20:23 UTC
  To: Len Brown; +Cc: Matthew Garrett, linux acpi, Rob E Russell

Len,

I like your proposal to run this by the hardware boys.  I'd like to hear what they say about 1) how likely the problem is (certain DIMM brands/models?), 2) what they recommend to avoid the system failure/lockup, and 3) whether Linux can help work around the errata.  I think my IBM support person was going to try to ask about this through his contacts at Intel as well.

I'm not a BIOS programmer at all.  In the old days, BIOS's could (had to?) set up memory controllers.  I remember mucking with CAS settings.  Then came SPD, and BIOS's would either use those settings or let you override them.  Those memory controllers were separate devices on the bus, and I am sure there were registers to set them up.  Now that memory controllers are built into the processors, I don't know if that is possible any more.  I glanced through the two volume Intel Xeon Processor 5500 Series Datasheet (http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-5500-vol-1-datasheet.pdf, http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-5500-vol-2-datasheet.pdf), also a Nehalem core, and I didn't see any registers to control DRAM voltages (V-DDQ, I think).  I did find where it said (7.5 Enhanced Intel SpeedStep® Technology) "The processor controls voltage ramp rates internally to ensure smooth transitions."  That's as much as I could find about DRAM voltage control.

I agree, and I do run the latest BIOS from IBM.  My mention of the two low-latency tips documents I found on the web was only to give a reference for the description of how intel_idle works.  I found those while I was trying to understand the issue -- before I looked for the source code.  I do not need such extreme measures.  In fact, I prefer to keep things as cool as possible (the reason I use an L series product) for reliability.

Thanks for your informative comments.

Larry Baker
US Geological Survey
650-329-5608
baker@usgs.gov





* Re: Wishlist: Disable C6 in intel_idle for Model 44 processors
       [not found]           ` <OF3D706B0A.D5540D6C-ON85257B8A.0076278F-85257B8A.0076E4B1@us.ibm.com>
@ 2013-06-14 22:40             ` Larry Baker
  0 siblings, 0 replies; 10+ messages in thread
From: Larry Baker @ 2013-06-14 22:40 UTC
  Cc: linux acpi

Rob,

I think I found the description of the "Driver Impedance" fix in the IBM uEFI firmware at http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5091950.  I'm puzzled by the Workaround section.  I still can't tell from this if the "Driver Impedance" setting is or is not ignored by intel_idle.  Anyway, I hope the hardware boys can say if intel_idle should force this "Driver Impedance" setting on these processors (assuming that is possible) when it wants to use C states.

Larry Baker
US Geological Survey
650-329-5608
baker@usgs.gov



On 14 Jun 2013, at 2:39 PM, Rob E Russell wrote:

> Larry, 
> 
> The fix from an IBM uEFI standpoint was to include an option in system settings called "Driver Impedance" in uEFI v1.16.  If the user wants C-states to work, then driver impedance should also be enabled to avoid this package C6 transition issue.   If the user does not care about C-states, then on Linux, the intel_idle driver should be disabled.   C-states are disabled by default in uEFI system settings.
> 
> Regards, 
> Rob Russell
> Technical Sales Support, IBM System x, US Federal Government
> Phone: (720) 396-2235, TL 938-2235, Cell: 919-389-4874
> Phone: 1-720-396-2235 | Phone: 1-919-651-1294 | Mobile: 1-919-389-4874
> E-mail: robr@us.ibm.com	
> 
> 3500 Blue Lake Dr
> Birmingham, AL 35243-1900
> United States
> 
> 
> 
> 
> 
> From:        Larry Baker <baker@usgs.gov> 
> To:        Len Brown <lenb@kernel.org>, 
> Cc:        Matthew Garrett <mjg59@srcf.ucam.org>, linux acpi <linux-acpi@vger.kernel.org>, Rob E Russell/Raleigh/IBM@IBMUS 
> Date:        06/14/2013 04:23 PM 
> Subject:        Re: Wishlist: Disable C6 in intel_idle for Model 44 processors 
> 
> 
> 
> Len,
> 
> I like your proposal to run this by the hardware boys.  I'd like to hear what they say about 1) how likely the problem is (certain DIMM brands/models?), 2) what they recommend to avoid the system failure/lockup, and 3) whether Linux can help work around the errata.  I think my IBM support person was going to try to ask about this through his contacts at Intel as well.
> 
> I'm not a BIOS programmer at all.  In the old days, BIOS's could (had to?) set up memory controllers.  I remember mucking with CAS settings.  Then came SPD, and BIOS's would either use those settings or let you override them.  Those memory controllers were separate devices on the bus, and I am sure there were registers to set them up.  Now that memory controllers are built into the processors, I don't know if that is possible any more.  I glanced through the two volume Intel Xeon Processor 5500 Series Datasheet (http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-5500-vol-1-datasheet.pdf, http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-5500-vol-2-datasheet.pdf), also a Nehalem core, and I didn't see any registers to control DRAM voltages (V-DDQ, I think).  I did find where it said (7.5 Enhanced Intel SpeedStep® Technology) "The processor controls voltage ramp rates internally to ensure smooth transitions."  That's as much as I could find about DRAM voltage control.
> 
> I agree, and I do run the latest BIOS from IBM.  My mention of the two low-latency tips documents I found on the web was only to give a reference for the description of how intel_idle works.  I found those while I was trying to understand the issue -- before I looked for the source code.  I do not need such extreme measures.  In fact, I prefer to keep things as cool as possible (the reason I use an L series product) for reliability.
> 
> Thanks for your informative comments.
> 
> Larry Baker
> US Geological Survey
> 650-329-5608
> baker@usgs.gov
> 
> 
> 
> On 14 Jun 2013, at 12:32 PM, Len Brown wrote:
> 
> > Hi Larry,
> > 
> > Thanks for the note.
> > 
> > I use two Westmere systems:
> > An Extreme Edition X980 on an Intel DX58SO motherboard,
> > and a pair of Xeon X5680's on an Intel S5520SC motherboard.
> > 
> > Both processors are model 0x2c, and thus subject to this erratum.
> > 
> > Both systems are running the latest BIOS and firmware from Intel.
> > Both systems enable and use CC6 and PC6, by default.
> > This is true whether they are running ACPI idle
> > (such as Windows would do, or acpi_idle in Linux)
> > or Linux's intel_idle driver.
> > 
> > This suggests that the fix is not to disable PC6 on model 0x2c.
> > I would expect, as Matthew does, that the "BIOS workaround"
> > is likely something to do with how the BIOS initialization code sets
> > up the memory controller...  But in the event that the real fix
> > is to disable PC6 and Intel itself has not updated its own BIOS
> > to comply with its own errata, I'll contact the hardware designers
> > to see if I can get a more fact-based response.
> > 
> > So I concur with Matthew.
> > If you are concerned about configuration of your chip-set,
> > then you want to run the latest BIOS from the vendor.
> > A Linux workaround doesn't currently look warranted.
> > 
> > thanks,
> > -Len Brown, Intel Open Source Technology Center
> > 
> > ps.
> > 
> > Yes, we have an issue that intel_idle doesn't respect when
> > the BIOS "disables" C-states via ACPI tables.  Indeed,
> > part of the value proposition of intel_idle is that it is immune
> > to ACPI table bugs that crop up from system to system.
> > Also, intel_idle is not subject to some of the limitations of ACPI.
> > We believe this is one of the reasons that Linux on Intel
> > is better than some other operating systems on Intel.
> > 
> > The OEMs such as Dell, HP and IBM are accustomed to having
> > control in the BIOS and so they are unhappy about losing
> > that capability.  We do hear them, but unfortunately it will
> > likely be the Haswell Server generation before we can give their
> > BIOS programmers that absolute control back by
> > empowering them to modify CPUID.MWAIT.EDX --
> > which is how the HW enumerates C-states.
> > 
> > This issue comes up mostly when latency sensitive
> > customers want to disable the high latency C-states.
> > In the past, the OEM could configure their BIOS to
> > handle that situation.  But with modern Linux,
> > a cmdline param such as intel_idle.max_cstate=N
> > is necessary.  OEM's don't like Linux cmdline params,
> > they prefer BIOS control.
> > 
> > As Matthew pointed out, the Linux community believes
> > that the answer for latency-sensitive customers is
> > to use Linux PM-QOS to tell the machine how
> > the customer wants it to run.  From a Linux point
> > of view, this is a universal solution, it requires
> > no BIOS SETUP tweaks and no kernel cmdline parms.
> > 
> > BTW. If the workaround for the erratum were actually
> > to disable C6, it would be (Package) PC6, not (Core) CC6.
> > The BIOS already has control over Package C-states,
> > and if the BIOS doesn't lock the MSR, Linux also
> > has that capability.
> > 
> > Get the latest turbostat from the kernel tree,
> > run turbostat -v,
> > and look for a line like one of these:
> > 
> > cpu0: MSR_NHM_SNB_PKG_CST_CFG_CTL: 0x06008403 (demote-C3, demote-C1, locked: pkg-cstate-limit=3: pc6)
> > 
> > cpu0: MSR_NHM_SNB_PKG_CST_CFG_CTL: 0x06000403 (demote-C3, demote-C1, UNlocked: pkg-cstate-limit=3: pc6)
> > 
> > As described in the Intel Software Developer's Manual,
> > this MSR, MSR_PKG_CST_CONFIG_CONTROL has a package C-state limit field.
> > Above it limits the hardware to PC6, but could easily be set to PC3.
> > 
> > In the first example above, the register was locked by the BIOS,
> > preventing Linux from modifying it; in the second example, it is unlocked.
> > 
> > If we limited the package to PC3 here, then Linux would still choose CC6,
> > but when all the cores entered CC6, the deepest the package would
> > go would be PC3.
> 
> 
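The two register values in the turbostat lines quoted above can be decoded by hand. What follows is a minimal sketch in C, not turbostat's own code. The bit positions (limit in bits 2:0, lock in bit 15, C3/C1 auto-demotion in bits 25 and 26) and the Nehalem/Westmere limit encoding (2 = PC3, 3 = PC6) are assumptions based on how turbostat prints this MSR; check them against the Software Developer's Manual for your model. The names struct pkg_cst_cfg, decode_pkg_cst_cfg(), nhm_pkg_limit_name() and with_pkg_limit() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed field layout of MSR_PKG_CST_CONFIG_CONTROL on Nehalem/Westmere. */
#define PKG_CST_LIMIT_MASK  0x7ULL        /* bits 2:0: package C-state limit */
#define PKG_CST_CFG_LOCK    (1ULL << 15)  /* set: BIOS locked the register */
#define PKG_CST_C3_DEMOTE   (1ULL << 25)  /* C3 auto-demotion enabled */
#define PKG_CST_C1_DEMOTE   (1ULL << 26)  /* C1 auto-demotion enabled */

struct pkg_cst_cfg {
	unsigned int limit;  /* raw encoding, not the C-state number */
	bool locked;
	bool demote_c3;
	bool demote_c1;
};

/* Split a raw MSR value into the fields turbostat reports. */
struct pkg_cst_cfg decode_pkg_cst_cfg(uint64_t msr)
{
	struct pkg_cst_cfg cfg;

	cfg.limit = (unsigned int)(msr & PKG_CST_LIMIT_MASK);
	cfg.locked = (msr & PKG_CST_CFG_LOCK) != 0;
	cfg.demote_c3 = (msr & PKG_CST_C3_DEMOTE) != 0;
	cfg.demote_c1 = (msr & PKG_CST_C1_DEMOTE) != 0;
	return cfg;
}

/* Assumed Nehalem/Westmere encoding: 0=pc0, 1=pc1, 2=pc3, 3=pc6, 4=pc7. */
const char *nhm_pkg_limit_name(unsigned int limit)
{
	static const char *const names[] = { "pc0", "pc1", "pc3", "pc6", "pc7" };

	return limit < sizeof(names) / sizeof(names[0]) ? names[limit] : "reserved";
}

/*
 * Compute the value that would cap the package at new_limit.
 * A locked register cannot be changed, so its value is returned as-is.
 */
uint64_t with_pkg_limit(uint64_t msr, unsigned int new_limit)
{
	assert(new_limit <= PKG_CST_LIMIT_MASK);
	if (msr & PKG_CST_CFG_LOCK)
		return msr;
	return (msr & ~PKG_CST_LIMIT_MASK) | new_limit;
}
```

Under these assumptions, 0x06008403 decodes as locked with limit 3 (pc6), and with_pkg_limit(0x06000403, 2) gives 0x06000402, the unlocked value capped at PC3. The sketch only computes values; it does not read or write the MSR.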

--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Wishlist: Disable C6 in intel_idle for Model 44 processors
  2013-06-12 13:40 ` Matthew Garrett
  2013-06-12 18:11   ` Larry Baker
@ 2013-06-16  3:29   ` Henrique de Moraes Holschuh
  1 sibling, 0 replies; 10+ messages in thread
From: Henrique de Moraes Holschuh @ 2013-06-16  3:29 UTC (permalink / raw)
  To: Matthew Garrett; +Cc: Larry Baker, linux-acpi, Rob E Russell

On Wed, 12 Jun 2013, Matthew Garrett wrote:
> On Tue, Jun 11, 2013 at 02:30:30PM -0700, Larry Baker wrote:
> > I have an IBM System x3650 M3 with an Intel Xeon L5630 processor.  My
> > IBM support team alerted me to an issue with C6 states for those
> > processors when running Linux and the intel_idle kernel module is used.
> > The IBM solutions page,
> > http://www-947.ibm.com/support/entry/portal/docdisplay?lndocid=migr-5091901,
> > recommends disabling intel_idle.  I propose intel_idle disable C6 for
> > the affected processors.
> 
> As you note:
> 
> > > Workaround: It is possible for the BIOS to contain a workaround for
> > > this erratum.
> 
> Are you actually seeing this specific problem? If so, you should probably
> request a firmware update from IBM. If that's not a possibility, and if it
> is necessary to disable package C6 in this situation, it'd be nice to
> determine what the BIOS workaround actually does and limit the quirk to
> that case.

"It is possible for the BIOS to contain a workaround for this erratum." is
Intel speak for either "there is a processor microcode update that fixes
this", or "there is an undocumented bit somewhere you can toggle that
disables some functionality, and also works around this bug".

Now, good luck finding out WHICH microcode update is required or which bit
to toggle on whichever MSR/PCIe register.  It is certainly something we
could easily detect and disable C6 only when required (i.e. workaround not
in place) or even fix it ourselves... IF we had enough information.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: Wishlist: Disable C6 in intel_idle for Model 44 processors
  2013-06-14 19:32       ` Len Brown
  2013-06-14 20:23         ` Larry Baker
@ 2013-06-20 19:22         ` Larry Baker
  1 sibling, 0 replies; 10+ messages in thread
From: Larry Baker @ 2013-06-20 19:22 UTC (permalink / raw)
  To: Len Brown; +Cc: Matthew Garrett, linux acpi, Rob E Russell

Len et al.,

I think my proposal can be improved using the information you provided: intel_idle could check whether the BIOS has protected the package from this flaw and, if not, apply a workaround where possible.  This would enhance the stability of these systems -- a good thing for users.

As I understand core and package C states, all cores must be in C6 state before the package can (will?) enter PC6 state.  (http://software.intel.com/en-us/blogs/2013/06/03/intel-xeon-phi-coprocessor-power-management-part-2a-core-c-states-the-details says "As you can guess, to drop the package into a PC-6 state, all the cores must also be in a C6 state.")  It is sufficient, then, to prevent a single core (e.g., core 0) in each package from entering C6 state to disable the package transition into PC6 state.  The existing logic could continue to be applied to all other cores.  Core 0 could have a separate max_c0_cstate, which, for processors other than boot_cpu_data.x86_model == 0x2c would be the same as max_cstate.
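The reasoning in the paragraph above can be written as a small model in C. This is a sketch under a simplifying assumption: the package goes no deeper than its shallowest core, and nothing else (package C-state limits, uncore activity) is considered. The function name deepest_package_cstate() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified model: return the deepest package C-state permitted by the
 * current core C-states, i.e. the minimum over all cores.
 */
int deepest_package_cstate(const int *core_cstates, size_t ncores)
{
	int deepest;
	size_t i;

	assert(core_cstates != NULL && ncores > 0);
	deepest = core_cstates[0];
	for (i = 1; i < ncores; i++) {
		if (core_cstates[i] < deepest)
			deepest = core_cstates[i];
	}
	return deepest;
}
```

In this model, four cores at { 6, 6, 6, 6 } allow PC6, while { 3, 6, 6, 6 } (core 0 held at C3) allows nothing deeper than PC3.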

Some logic like this would enable intel_idle to detect and work around the flaw in these processors:

/*
 * Disable package C6 state, if possible, else core C6 state for core 0
 * on Xeon 5600 and Core i7-900 to prevent the package from entering PC6
 * state.  Refer to Xeon 5600 errata BD104 and Core i7-900 errata BC82:
 * Package C6 Transitions May Result in Single and Multi-Bit Memory Errors.
 */
	max_c0_cstate = max_cstate;
	if (boot_cpu_data.x86_model == 0x2c &&
	    max_cstate > 3 &&
	    !< BIOS workaround can be verified, e.g.,
	       IBM's "Driver Impedance" setting is enabled >) {
	    if (< Package C-state limit can be set > &&
		< Package C-state limiting succeeds >) {
		pr_debug(PREFIX "limiting model 0x2c to package C-state ???\n");
	    } else if (< Package C-state limit is bad >) {
		pr_debug(PREFIX "limiting model 0x2c core 0 to max_cstate=3\n");
		max_c0_cstate = 3;
	    }
	}

A message appears telling the user exactly what, if any, workaround was applied.

Of course, there would have to be conditional logic in all the places where max_cstate is used to determine the physical core number and substitute max_c0_cstate when the core number is 0.

My original proposal would have prevented all cores from entering low-power C states.  This workaround allows greater power savings.
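To make the decision order concrete, here is a self-contained sketch in C of the logic outlined above. It is not intel_idle code. The struct c6_probe fields stand in for the < ... > placeholders in the pseudocode and are hypothetical, as are the names choose_c6_workaround() and effective_max_cstate(). The sketch also collapses "limit can be set" and "limiting succeeds" into one flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum c6_workaround { WA_NONE, WA_PKG_LIMIT, WA_CORE0_LIMIT };

/* Hypothetical inputs; a real driver would have to probe each of these. */
struct c6_probe {
	unsigned int model;             /* boot_cpu_data.x86_model */
	int max_cstate;                 /* current intel_idle limit */
	bool bios_workaround_verified;  /* BIOS fix known to be in place */
	bool pkg_limit_settable;        /* package limit can be lowered */
};

struct c6_decision {
	enum c6_workaround action;
	int max_c0_cstate;              /* limit to apply to core 0 */
};

/* Prefer a package C-state limit; fall back to holding core 0 at C3. */
struct c6_decision choose_c6_workaround(const struct c6_probe *p)
{
	struct c6_decision d;

	assert(p != NULL);
	d.action = WA_NONE;
	d.max_c0_cstate = p->max_cstate;

	if (p->model != 0x2c || p->max_cstate <= 3 || p->bios_workaround_verified)
		return d;
	if (p->pkg_limit_settable) {
		d.action = WA_PKG_LIMIT;
		return d;
	}
	d.action = WA_CORE0_LIMIT;
	d.max_c0_cstate = 3;
	return d;
}

/* Per-core substitution: core 0 uses max_c0_cstate, all others max_cstate. */
int effective_max_cstate(unsigned int core_id, int max_cstate, int max_c0_cstate)
{
	return core_id == 0 ? max_c0_cstate : max_cstate;
}
```

With model 0x2c, max_cstate 6, no verified BIOS workaround and no settable package limit, the sketch chooses WA_CORE0_LIMIT, so core 0 gets a limit of 3 while the other cores keep 6.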

Larry Baker
US Geological Survey
650-329-5608
baker@usgs.gov





^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-06-20 19:22 UTC | newest]

Thread overview: 10+ messages
2013-06-11 21:30 Wishlist: Disable C6 in intel_idle for Model 44 processors Larry Baker
2013-06-11 21:34 ` Larry Baker
2013-06-12 13:40 ` Matthew Garrett
2013-06-12 18:11   ` Larry Baker
2013-06-12 18:32     ` Matthew Garrett
2013-06-14 19:32       ` Len Brown
2013-06-14 20:23         ` Larry Baker
     [not found]           ` <OF3D706B0A.D5540D6C-ON85257B8A.0076278F-85257B8A.0076E4B1@us.ibm.com>
2013-06-14 22:40             ` Larry Baker
2013-06-20 19:22         ` Larry Baker
2013-06-16  3:29   ` Henrique de Moraes Holschuh
