From: Peter Hurley <peter@hurleysoftware.com>
To: Mike Galbraith <umgwanakikbuti@gmail.com>,
	Josh Boyer <jwboyer@fedoraproject.org>,
	Len Brown <len.brown@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@kernel.org>,
	Ian Malone <ibmalone@gmail.com>,
	"Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>
Subject: Re: Kernel bug 60770
Date: Fri, 19 Sep 2014 12:24:16 -0400
Message-ID: <541C58B0.4060704@hurleysoftware.com>
In-Reply-To: <1409234000.5212.7.camel@marge.simpson.net>

On 08/28/2014 09:53 AM, Mike Galbraith wrote:
> On Thu, 2014-08-28 at 09:23 -0400, Josh Boyer wrote: 
>> On Sat, Aug 16, 2014 at 10:33 AM, Josh Boyer <jwboyer@fedoraproject.org> wrote:
>>> Hi Len,
>>>
>>> Kernel bug https://bugzilla.kernel.org/show_bug.cgi?id=60770 is marked
>>> as closed, but there is a patch that at least one user seems to need
>>> to get things booting properly.  It was sent upstream a while ago:
>>>
>>> http://marc.info/?l=linux-kernel&m=138976439211647&w=2
>>>
>>> but has never made it into the kernel.  Do you know why this is or
>>> what happened to the patch?
>>
>> Adding Peter and Ingo.  Len seems to be MIA or otherwise occupied.
>>
>> Peter and Ingo, any thoughts on the bug/thread above?
> 
> That patch needs some bend adjustment, now looks like below here.
> 
> Patch also has a secondary benefit for core2 boxen, when booted
> processor.max_cstate=1, box can still use mwait.
> 
> Subject: [PATCH REGRESSION FIX] x86 idle: restore mwait_idle()
> From: Len Brown <lenb@kernel.org>
> Date: Wed, 15 Jan 2014 00:37:34 -0500
> 
> From: Len Brown <len.brown@intel.com>
> 
> In Linux-3.9 we removed the mwait_idle() loop:
> 'x86 idle: remove mwait_idle() and "idle=mwait" cmdline param'
> (69fb3676df3329a7142803bb3502fa59dc0db2e3)
> 
> The reasoning was that modern machines should be sufficiently
> happy during the boot process using the default_idle() HALT loop,
> until cpuidle loads and either acpi_idle or intel_idle
> invoke the newer MWAIT-with-hints idle loop.
> 
> But two machines reported problems:
> 1. Certain Core2-era machines support MWAIT-C1 and HALT only.
>    MWAIT-C1 is preferred for optimal power and performance.
>    But if they support just C1, cpuidle never loads and
>    so they use the boot-time default idle loop forever.
> 
> 2. Some laptops will boot-hang if HALT is used,
>    but will boot successfully if MWAIT is used.
>    This appears to be a hidden assumption in BIOS SMI,
>    that is presumably valid on the proprietary OS
>    where the BIOS was validated.
> 
>    https://bugzilla.kernel.org/show_bug.cgi?id=60770
> 
> So here we effectively revert the patch above, restoring
> the mwait_idle() loop.  However, we don't bother restoring
> the idle=mwait cmdline parameter, since it appears to add
> no value.
> 
> Maintainer notes:
> For 3.9, simply revert 69fb3676df
> For 3.10, patch -F3 applies; fuzz is needed due to __cpuinit use in context
> For 3.11, 3.12, 3.13, this patch applies cleanly
> 
> Mike: add clflush barriers and resched IPI avoidance.

Mike,

The changes for clflush don't build prior to 3.17-rcX;
X86_BUG_CLFLUSH_MONITOR was X86_FEATURE_CLFLUSH_MONITOR prior to
commit 9b13a93df267af681a66a6a738bf1af10102da7d,
'x86, cpufeature: Convert more "features" to bugs'.
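
For anyone backporting, here is a minimal sketch of a compile-time guard,
untested against any stable tree. The name mwait_needs_clflush() is
hypothetical and not part of the patch. The sketch assumes both flag names
are preprocessor macros in the tree's cpufeature.h, and that a tree defining
neither flag has no handling for this erratum at all.

```c
/*
 * Hypothetical backport helper, not part of the posted patch.
 * Pick whichever CLFLUSH_MONITOR flag the target tree defines.
 */
#if defined(X86_BUG_CLFLUSH_MONITOR)
/* 3.17-rcX and later: the erratum is tracked as a CPU bug flag. */
#define mwait_needs_clflush()	static_cpu_has_bug(X86_BUG_CLFLUSH_MONITOR)
#elif defined(X86_FEATURE_CLFLUSH_MONITOR)
/* Before commit 9b13a93df267: same erratum, spelled as a feature flag. */
#define mwait_needs_clflush()	static_cpu_has(X86_FEATURE_CLFLUSH_MONITOR)
#else
/* Tree defines neither flag: no quirk handling is available. */
#define mwait_needs_clflush()	0
#endif
```

With a guard like this, the test in mwait_idle() would read
"if (mwait_needs_clflush())" on every tree.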

Len,

FWIW, I tested this patch on a dual-socket Xeon E5420, and nothing died :)
The change in core temps was not statistically significant though,
and I don't have more accurate testing gear for monitoring cpu power
consumption.

Regards,
Peter Hurley


> Cc: Mike Galbraith <bitbucket@online.de>
> Cc: Ian Malone <ibmalone@gmail.com>
> Cc: Josh Boyer <jwboyer@redhat.com>
> Cc: <stable@vger.kernel.org> # 3.9, 3.10, 3.11, 3.12, 3.13
> Signed-off-by: Len Brown <len.brown@intel.com>
> Signed-off-by: Mike Galbraith <bitbucket@online.de>
> ---
>  arch/x86/include/asm/mwait.h |    8 ++++++
>  arch/x86/kernel/process.c    |   50 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 58 insertions(+)
> 
> --- a/arch/x86/include/asm/mwait.h
> +++ b/arch/x86/include/asm/mwait.h
> @@ -30,6 +30,14 @@ static inline void __mwait(unsigned long
>  		     :: "a" (eax), "c" (ecx));
>  }
>  
> +static inline void __sti_mwait(unsigned long eax, unsigned long ecx)
> +{
> +	trace_hardirqs_on();
> +	/* "mwait %eax, %ecx;" */
> +	asm volatile("sti; .byte 0x0f, 0x01, 0xc9;"
> +		     :: "a" (eax), "c" (ecx));
> +}
> +
>  /*
>   * This uses new MONITOR/MWAIT instructions on P4 processors with PNI,
>   * which can obviate IPI to trigger checking of need_resched.
> --- a/arch/x86/kernel/process.c
> +++ b/arch/x86/kernel/process.c
> @@ -28,6 +28,7 @@
>  #include <asm/fpu-internal.h>
>  #include <asm/debugreg.h>
>  #include <asm/nmi.h>
> +#include <asm/mwait.h>
>  
>  /*
>   * per-CPU TSS segments. Threads are completely 'soft' on Linux,
> @@ -396,6 +397,52 @@ static void amd_e400_idle(void)
>  		default_idle();
>  }
>  
> +/*
> + * Intel Core2 and older machines prefer MWAIT over HALT for C1.
> + * We can't rely on cpuidle installing MWAIT, because it will not load
> + * on systems that support only C1 -- so the boot default must be MWAIT.
> + *
> + * Some AMD machines are the opposite, they depend on using HALT.
> + *
> + * So for default C1, which is used during boot until cpuidle loads,
> + * use MWAIT-C1 on Intel HW that has it, else use HALT.
> + */
> +static int prefer_mwait_c1_over_halt(const struct cpuinfo_x86 *c)
> +{
> +	if (c->x86_vendor != X86_VENDOR_INTEL)
> +		return 0;
> +
> +	if (!cpu_has(c, X86_FEATURE_MWAIT))
> +		return 0;
> +
> +	return 1;
> +}
> +
> +/*
> + * MONITOR/MWAIT with no hints, used for default C1 state.
> + * This invokes MWAIT with interrupts enabled and no flags,
> + * which is backwards compatible with the original MWAIT implementation.
> + */
> +
> +static void mwait_idle(void)
> +{
> +	if (!current_set_polling_and_test()) {
> +		if (static_cpu_has_bug(X86_BUG_CLFLUSH_MONITOR)) {
> +			mb();
> +			clflush((void *)&current_thread_info()->flags);
> +			mb();
> +		}
> +
> +		__monitor((void *)&current_thread_info()->flags, 0, 0);
> +		if (!need_resched())
> +			__sti_mwait(0, 0);
> +		else
> +			local_irq_enable();
> +	} else
> +		local_irq_enable();
> +	current_clr_polling();
> +}
> +
>  void select_idle_routine(const struct cpuinfo_x86 *c)
>  {
>  #ifdef CONFIG_SMP
> @@ -409,6 +456,9 @@ void select_idle_routine(const struct cp
>  		/* E400: APIC timer interrupt does not wake up CPU from C1e */
>  		pr_info("using AMD E400 aware idle routine\n");
>  		x86_idle = amd_e400_idle;
> +	} else if (prefer_mwait_c1_over_halt(c)) {
> +		pr_info("using mwait in idle threads\n");
> +		x86_idle = mwait_idle;
>  	} else
>  		x86_idle = default_idle;
>  }
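
For readers following the last hunk, here is a user-space sketch of the
selection order the patch leaves in select_idle_routine(). It is a model
only: struct cpu_model, the enums, and pick_idle() are hypothetical
stand-ins rather than kernel API, and any checks outside the quoted hunk
are omitted.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel types; none of these are kernel API. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };
enum idle_routine { IDLE_HALT, IDLE_AMD_E400, IDLE_MWAIT };

struct cpu_model {
	enum cpu_vendor vendor;
	bool has_mwait;		/* models cpu_has(c, X86_FEATURE_MWAIT) */
	bool has_e400_bug;	/* models the existing AMD E400 check */
};

/* Mirrors prefer_mwait_c1_over_halt(): Intel with MWAIT support only. */
bool prefer_mwait_c1_over_halt(const struct cpu_model *c)
{
	return c->vendor == VENDOR_INTEL && c->has_mwait;
}

/* Mirrors the if/else chain visible in the hunk: E400, then MWAIT, then HALT. */
enum idle_routine pick_idle(const struct cpu_model *c)
{
	if (c->has_e400_bug)
		return IDLE_AMD_E400;
	if (prefer_mwait_c1_over_halt(c))
		return IDLE_MWAIT;
	return IDLE_HALT;
}
```

In this model an Intel CPU with MWAIT gets IDLE_MWAIT, while an AMD CPU
with MWAIT but without the E400 erratum still falls through to IDLE_HALT,
matching the comment above prefer_mwait_c1_over_halt().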